Trusted Research Environments
Secure computing for sensitive research data, aligned to the Five Safes framework.
What Is a TRE?
A Trusted Research Environment (TRE) is a secure computing environment designed to allow researchers to analyse sensitive data -- such as patient health records, genomic data, or confidential government datasets -- without the data ever leaving a controlled boundary.
Funding bodies (such as UKRI), data custodians (such as NHS Digital), and regulatory frameworks (such as GDPR) increasingly require a TRE as a condition of data access.
The Five Safes Framework
RosettaHub maps its platform capabilities to the Five Safes -- the widely adopted model for governing access to sensitive data:
| Safe | Principle | RosettaHub Capability |
|---|---|---|
| Safe People | Only authorized users access the environment | IAM integration, SSO (SAML 2.0, LDAP, OAuth), role-based access control |
| Safe Projects | Work is governed by approved project scope | Project isolation with dedicated accounts and budgets |
| Safe Settings | Compute runs in a secure, controlled environment | Private engines, VPC-isolated formations, encrypted networking |
| Safe Data | Data is protected at rest and in transit | Full encryption key lifecycle, encrypted storages, access-controlled mounts |
| Safe Outputs | Results are reviewed before leaving the environment | Egress controls, output review workflows |
RosettaHub VRE Capabilities
RosettaHub's Virtual Research Environment (VRE) provides the compute and collaboration layer within a TRE architecture:
- Formations -- reproducible, auditable environment templates
- Private Engines -- dedicated compute isolated from shared infrastructure
- Workbenches -- Jupyter, RStudio, VS Code running inside the secure boundary
- Docker Formations -- containerized analysis pipelines with pinned dependencies
- Kubernetes Clusters -- orchestrated multi-container workloads for large-scale analysis
- Encrypted Storages -- object, file, and block storage with managed encryption keys
Formations as TRE Workspace Delivery
RosettaHub Formations are a natural mechanism for TRE-centric workspace delivery. Formations deliver a variety of infrastructure patterns, allowing the workspace to control a container, a machine, a cluster, or any other infrastructure built using CloudFormation, Terraform, CDK, or Pulumi.
This means a TRE workspace is not limited to a single VM -- it can be a fully orchestrated environment with multiple components, all defined as code and deployed consistently.
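To make the "defined as code" idea concrete, here is a minimal Python sketch of a workspace described purely as data. It is not the RosettaHub Formation format: `Component`, `WorkspaceSpec`, and `fingerprint` are hypothetical names chosen for illustration. The point is only that a multi-component workspace can be captured in a declarative spec, and that two identical specs produce the same fingerprint, which is one simple way to support consistent, auditable deployment.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Component:
    """One piece of infrastructure in a workspace (hypothetical model)."""
    name: str
    kind: str    # e.g. "container", "machine", "cluster"
    engine: str  # e.g. "cloudformation", "terraform", "cdk", "pulumi"


@dataclass(frozen=True)
class WorkspaceSpec:
    """A TRE workspace described entirely as data (hypothetical model)."""
    project: str
    components: tuple = ()

    def fingerprint(self) -> str:
        # Canonical JSON (sorted keys) so equal specs always hash identically.
        payload = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def example_spec() -> WorkspaceSpec:
    # Illustrative values only; project and component names are made up.
    return WorkspaceSpec(
        project="study-001",
        components=(
            Component("notebook", "container", "terraform"),
            Component("analysis-cluster", "cluster", "pulumi"),
        ),
    )


first = example_spec()
second = example_spec()
print(first.fingerprint() == second.fingerprint())
```

Recording the fingerprint alongside each deployment would let a reviewer confirm later that two workspaces were built from the same definition.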
Machine Agents
RosettaHub machine agents provide fine-grained access management within workspaces and advanced container orchestration for delivering complex research environments. Agents enable:
- Session management -- manage researcher sessions centrally across all TRE workspaces
- Collaboration controls -- allow or restrict collaboration between researchers within the secure boundary
- Container orchestration -- deliver multi-container workspaces with precise resource and access controls
Architecture
A RosettaHub-based TRE is organized into three zones:
┌─────────────────────────────────────────────────┐
│                 Management Zone                 │
│  Organization admin, user onboarding, budgets,  │
│  compliance policies, audit logs                │
├─────────────────────────────────────────────────┤
│                Portal Zone (VRE)                │
│  Formations, workbenches, containers,           │
│  Kubernetes, encrypted storage                  │
├─────────────────────────────────────────────────┤
│              Airlock Zone (Egress)              │
│  Output review, data classification,            │
│  approved export workflows                      │
└─────────────────────────────────────────────────┘
- Management Zone -- administrators manage users, cloud accounts, budgets, and compliance via Cloud Operations
- Portal Zone -- researchers work within governed compute environments using the MetaCloud
- Airlock Zone -- results pass through review and classification before leaving the environment
RosettaHub-Operated Airlock¶
The Airlock Zone implements controlled egress using RosettaHub's platform components. The architecture is modelled on the DRTC (Data-Return Transfer Controller) pattern:
- Gitea-based output review -- researchers submit results via pull requests; data custodians review and approve before data leaves the boundary
- Amazon Macie integration -- automated classification scans output for sensitive data (PII, PHI, credentials) before release
- Audit trail -- every export request, review decision, and data transfer is logged
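The three airlock behaviours listed above (custodian review, automated classification, and audit logging) can be sketched as a single release gate. This is an illustrative Python model, not RosettaHub's implementation and not the Gitea or Amazon Macie APIs: `ExportRequest`, `Airlock`, and the event names are assumptions. Scan findings are passed in as plain strings, standing in for whatever a classification service would report.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ExportRequest:
    """A researcher's request to move output across the boundary (hypothetical)."""
    request_id: str
    researcher: str
    approved_by: Optional[str] = None             # set once a data custodian approves
    findings: list = field(default_factory=list)  # sensitive-data findings from a scan


@dataclass
class Airlock:
    audit_log: list = field(default_factory=list)

    def _log(self, request: ExportRequest, event: str) -> None:
        # Every request and decision is recorded with a UTC timestamp.
        self.audit_log.append({
            "request_id": request.request_id,
            "event": event,
            "at": datetime.now(timezone.utc).isoformat(),
        })

    def release(self, request: ExportRequest) -> bool:
        """Allow export only when the scan is clean and a custodian approved."""
        self._log(request, "export_requested")
        if request.findings:
            self._log(request, "blocked_sensitive_data")
            return False
        if request.approved_by is None:
            self._log(request, "blocked_awaiting_review")
            return False
        self._log(request, "released")
        return True


airlock = Airlock()
clean = ExportRequest("req-1", "alice", approved_by="custodian-1")
flagged = ExportRequest("req-2", "bob", approved_by="custodian-1",
                        findings=["possible PII in results.csv"])
pending = ExportRequest("req-3", "carol")
decisions = [airlock.release(r) for r in (clean, flagged, pending)]
print(decisions)
```

In this sketch the scan result takes precedence over approval, so output with findings is blocked even if a custodian has already signed off. That ordering is a design choice of the example, not a documented platform rule.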
Airlock Roadmap
- Gitea airlock workflows -- March 2026
- Amazon Macie data detection -- May 2026
The TRE Trilogy
RosettaHub's TRE approach combines three code-driven pillars:
| Pillar | What It Delivers |
|---|---|
| Compliance-as-Code | 207 Cloud Custodian policies that continuously enforce HIPAA (564 controls), ISO 27001 (138 controls), CIS, and NIST requirements |
| Infrastructure-as-Code | Formations define reproducible, auditable environments using CloudFormation, Terraform, CDK, or Pulumi |
| Frontend-as-Code | RosettaHub dashboard perspectives, views, and marketplace are configurable per institution |
Together, these ensure that security, infrastructure, and user experience are all version-controlled, auditable, and repeatable.
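As a rough illustration of the compliance-as-code pillar, the sketch below expresses policies as version-controllable data and evaluates them against resource descriptions. It is plain Python, not Cloud Custodian's policy language or API. The `Policy` class, the two example rules, and the resource fields (`type`, `id`, `encrypted`, `public`) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Policy:
    """A named rule applied to one resource type (hypothetical model)."""
    name: str
    resource_type: str
    is_compliant: Callable[[dict], bool]


# Illustrative rules only; real policy sets are far larger.
POLICIES = [
    Policy("storage-encrypted-at-rest", "storage",
           lambda r: r.get("encrypted") is True),
    Policy("storage-not-public", "storage",
           lambda r: r.get("public") is not True),
]


def evaluate(resources: list, policies: list) -> list:
    """Return (policy name, resource id) pairs for every violation."""
    violations = []
    for resource in resources:
        for policy in policies:
            applies = policy.resource_type == resource["type"]
            if applies and not policy.is_compliant(resource):
                violations.append((policy.name, resource["id"]))
    return violations


inventory = [
    {"type": "storage", "id": "bucket-a", "encrypted": True, "public": False},
    {"type": "storage", "id": "bucket-b", "encrypted": False, "public": True},
    {"type": "machine", "id": "vm-1"},
]
violations = evaluate(inventory, POLICIES)
print(violations)
```

Because the rules live in code, a change to a policy goes through the same review and version history as any other change, which is the property the trilogy relies on.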
Compliance and Data Protection
| Capability | Description |
|---|---|
| Cloud Custodian | Automated policy enforcement across all cloud accounts |
| GDPR Alignment | Anonymous cloud accounts decouple researcher identity from cloud-level billing |
| Encryption Key Lifecycle | Full key creation, rotation, and revocation managed through the platform |
| Audit Logging | All user actions, launches, and data access events are recorded |
| Budget Governance | Real-time enforcement prevents uncontrolled resource creation |
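The difference between real-time enforcement and after-the-fact billing reports can be shown with a small sketch. The assumption here is that each launch request carries an estimated cost that is checked against the remaining budget before the resource is created. `Budget` and `authorize` are hypothetical names, and the cost figures are made up.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Budget:
    """Tracks committed spend against a fixed limit (hypothetical model)."""
    limit: Decimal
    committed: Decimal = Decimal("0")

    def authorize(self, estimated_cost: Decimal) -> bool:
        """Approve a launch only if it fits in the remaining budget."""
        if estimated_cost < 0:
            raise ValueError("estimated_cost must not be negative")
        if self.committed + estimated_cost > self.limit:
            return False  # denied before any resource is created
        self.committed += estimated_cost
        return True


budget = Budget(limit=Decimal("100"))
results = [
    budget.authorize(Decimal("60")),  # fits
    budget.authorize(Decimal("50")),  # would exceed the limit, denied
    budget.authorize(Decimal("40")),  # exactly uses the remainder
]
print(results, budget.committed)
```

The check runs at request time, so an over-budget launch is refused immediately rather than discovered later in a billing report.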
Platform Strengths for TRE
Why RosettaHub
- Mature platform -- 7+ years in production with research institutions
- User onboarding in seconds -- SSO integration, no manual account setup
- Real-time cost governance -- event-driven budget enforcement rather than billing-lag reporting
- Flexible hierarchy -- organizations, sub-organizations, projects map to any institutional structure
- Multi-cloud -- deploy TREs on AWS, Azure, or GCP without changing the workflow
- Research institution experience -- 8+ years in production with research institutions
Competitive Advantages
| Differentiator | RosettaHub TRE | Typical TRE Providers |
|---|---|---|
| Multi-cloud support | AWS, Azure, GCP, Alibaba Cloud, OVH, OpenStack | Usually single-cloud |
| Governance + compute | Closed-loop in one platform | Separate tools, manual integration |
| Formation-based environments | Cloud-agnostic, reproducible, shareable | Provider-specific templates |
| Real-time cost control | Event-driven enforcement | Billing-lag reporting |
| User onboarding | Seconds via SSO | Days/weeks with manual provisioning |
| Organization hierarchy | Unlimited nesting, budget delegation | Flat or two-level structures |
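The nested hierarchy with budget delegation in the last row can be pictured as a tree in which each node hands part of its own budget to a child. The following Python sketch is an assumption about how such a model might look, not RosettaHub's data model. `OrgNode`, `delegate`, and `total` are hypothetical names, and amounts are whole currency units.

```python
from dataclasses import dataclass, field


@dataclass
class OrgNode:
    """An organization, sub-organization, or project (hypothetical model)."""
    name: str
    budget: int = 0  # budget still held directly by this node
    children: list = field(default_factory=list)

    def delegate(self, child_name: str, amount: int) -> "OrgNode":
        """Create a child and move part of this node's budget to it."""
        if amount <= 0 or amount > self.budget:
            raise ValueError("amount must be positive and within budget")
        self.budget -= amount
        child = OrgNode(child_name, amount)
        self.children.append(child)
        return child

    def total(self) -> int:
        """Budget held by this node plus everything below it."""
        return self.budget + sum(child.total() for child in self.children)


university = OrgNode("university", budget=1000)
department = university.delegate("department", 600)
project = department.delegate("project", 250)
print(university.budget, department.budget, project.budget, university.total())
```

Delegation only moves budget down the tree, so the total under the root stays constant however deep the nesting goes.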
Next Steps
- The RosettaOps Model -- understand tiered governance
- Formations -- learn how environments are defined
- Projects -- isolate work by study or grant
- Enterprise & SMB -- governance capabilities in depth