Trusted Research Environments

Secure computing for sensitive research data, aligned to the Five Safes framework.

What Is a TRE?

A Trusted Research Environment (TRE) is a secure computing environment designed to allow researchers to analyse sensitive data -- such as patient health records, genomic data, or confidential government datasets -- without the data ever leaving a controlled boundary.

TREs are increasingly required as a condition of data access by funding bodies (e.g. UKRI), data custodians (e.g. NHS Digital), and regulatory frameworks (e.g. GDPR).

The Five Safes Framework

RosettaHub maps its platform capabilities to the Five Safes -- the widely adopted model for governing access to sensitive data:

Safe | Principle | RosettaHub Capability
Safe People | Only authorized users access the environment | IAM integration, SSO (SAML 2.0, LDAP, OAuth), role-based access control
Safe Projects | Work is governed by approved project scope | Project isolation with dedicated accounts and budgets
Safe Settings | Compute runs in a secure, controlled environment | Private engines, VPC-isolated formations, encrypted networking
Safe Data | Data is protected at rest and in transit | Full encryption key lifecycle, encrypted storages, access-controlled mounts
Safe Output | Results are reviewed before leaving the environment | Egress controls, output review workflows

RosettaHub VRE Capabilities

RosettaHub's Virtual Research Environment (VRE) provides the compute and collaboration layer within a TRE architecture:

  • Formations -- reproducible, auditable environment templates
  • Private Engines -- dedicated compute isolated from shared infrastructure
  • Workbenches -- Jupyter, RStudio, VS Code running inside the secure boundary
  • Docker Formations -- containerized analysis pipelines with pinned dependencies
  • Kubernetes Clusters -- orchestrated multi-container workloads for large-scale analysis
  • Encrypted Storages -- object, file, and block storage with managed encryption keys

Formations as TRE Workspace Delivery

RosettaHub Formations are a natural mechanism for TRE-centric workspace delivery. Formations deliver a variety of infrastructure patterns, allowing the workspace to control a container, a machine, a cluster, or any other infrastructure built using CloudFormation, Terraform, CDK, or Pulumi.

This means a TRE workspace is not limited to a single VM -- it can be a fully orchestrated environment with multiple components, all defined as code and deployed consistently.
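
As a rough illustration of what "defined as code" can mean for a multi-component workspace, the sketch below models a workspace as plain data and validates it before deployment. It is not the RosettaHub Formation format: `WorkspaceSpec`, `Component`, the `kind` values, and the tool identifiers are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field

# Hypothetical vocabularies for this sketch; the real Formation schema may differ.
ALLOWED_KINDS = {"container", "machine", "cluster", "custom"}
ALLOWED_TOOLS = {"cloudformation", "terraform", "cdk", "pulumi"}


@dataclass(frozen=True)
class Component:
    name: str      # unique within the workspace
    kind: str      # one of ALLOWED_KINDS
    iac_tool: str  # one of ALLOWED_TOOLS


@dataclass
class WorkspaceSpec:
    name: str
    components: list[Component] = field(default_factory=list)

    def validate(self) -> list[str]:
        """Return a list of problems; an empty list means the spec is deployable."""
        errors = []
        if not self.components:
            errors.append("workspace has no components")
        names = [c.name for c in self.components]
        if len(names) != len(set(names)):
            errors.append("duplicate component names")
        for c in self.components:
            if c.kind not in ALLOWED_KINDS:
                errors.append(f"{c.name}: unknown kind {c.kind!r}")
            if c.iac_tool not in ALLOWED_TOOLS:
                errors.append(f"{c.name}: unknown IaC tool {c.iac_tool!r}")
        return errors


# A workspace made of a workbench container plus an analysis cluster.
spec = WorkspaceSpec(
    name="genomics-study",
    components=[
        Component("workbench", "container", "terraform"),
        Component("analysis", "cluster", "pulumi"),
    ],
)
print(spec.validate())
```

Because the specification is ordinary data, it can be version-controlled and reviewed like any other code, which is what makes repeated deployments consistent.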

Machine Agents

RosettaHub machine agents provide fine-grained access management within workspaces and advanced container orchestration for delivering complex research environments. Agents enable:

  • Session management -- manage researcher sessions centrally across all TRE workspaces
  • Collaboration controls -- allow or restrict collaboration between researchers within the secure boundary
  • Container orchestration -- deliver multi-container workspaces with precise resource and access controls

Architecture

A RosettaHub-based TRE is organized into three zones:

┌─────────────────────────────────────────────────┐
│                  Management Zone                 │
│  Organization admin, user onboarding, budgets,   │
│  compliance policies, audit logs                 │
├─────────────────────────────────────────────────┤
│                   Portal Zone (VRE)              │
│  Formations, workbenches, containers,            │
│  Kubernetes, encrypted storage                   │
├─────────────────────────────────────────────────┤
│                  Airlock Zone (Egress)           │
│  Output review, data classification,             │
│  approved export workflows                       │
└─────────────────────────────────────────────────┘
  • Management Zone -- administrators manage users, cloud accounts, budgets, and compliance via Cloud Operations
  • Portal Zone -- researchers work within governed compute environments using the MetaCloud
  • Airlock Zone -- results pass through review and classification before leaving the environment

RosettaHub-Operated Airlock

The Airlock Zone implements controlled egress using RosettaHub's platform components. The architecture is modelled on the DRTC (Data-Return Transfer Controller) pattern:

  • Gitea-based output review -- researchers submit results via pull requests; data custodians review and approve before data leaves the boundary
  • Amazon Macie integration -- automated classification scans output for sensitive data (PII, PHI, credentials) before release
  • Audit trail -- every export request, review decision, and data transfer is logged
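
The three airlock steps above can be sketched as a small state machine. This is a simplified model under stated assumptions, not RosettaHub's implementation: the regular expressions stand in for a real classifier such as Amazon Macie, the `Airlock` and `ExportRequest` classes are hypothetical, and the sketch assumes that any automated finding blocks the request before human review.

```python
import re
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Illustrative patterns only; a production classifier covers far more data types.
SENSITIVE_PATTERNS = {
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
}


@dataclass
class ExportRequest:
    researcher: str
    filename: str
    content: str
    status: str = "submitted"


@dataclass
class Airlock:
    audit_log: list[dict] = field(default_factory=list)

    def _log(self, event: str, request: ExportRequest, **details) -> None:
        # Every step appends an entry, so the log is a complete trail.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "event": event,
            "researcher": request.researcher,
            "file": request.filename,
            **details,
        })

    def submit(self, request: ExportRequest) -> list[str]:
        """Scan the output; findings reject it, otherwise it awaits review."""
        self._log("submitted", request)
        findings = sorted(
            name for name, pattern in SENSITIVE_PATTERNS.items()
            if pattern.search(request.content)
        )
        if findings:
            request.status = "rejected"
            self._log("scan_failed", request, findings=findings)
        else:
            request.status = "pending_review"
            self._log("scan_passed", request)
        return findings

    def review(self, request: ExportRequest, reviewer: str, approve: bool) -> None:
        """Record the data custodian's decision on a scanned request."""
        if request.status != "pending_review":
            raise ValueError(f"cannot review a request in state {request.status!r}")
        request.status = "approved" if approve else "rejected"
        self._log("review_decision", request, reviewer=reviewer, approved=approve)

    def release(self, request: ExportRequest) -> str:
        """Only approved output may leave the boundary."""
        if request.status != "approved":
            raise PermissionError("output has not been approved for export")
        request.status = "released"
        self._log("released", request)
        return request.content


airlock = Airlock()
request = ExportRequest("researcher-1", "summary.csv", "group,mean_age\nA,42.1\n")
airlock.submit(request)
airlock.review(request, reviewer="custodian-1", approve=True)
exported = airlock.release(request)
```

The key design point is that `release` checks the request state itself, so there is no code path by which unreviewed output can be exported.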

Airlock Roadmap

  • Gitea airlock workflows -- March 2026
  • Amazon Macie data detection -- May 2026

The TRE Trilogy

RosettaHub's TRE approach combines three code-driven pillars:

Pillar | What It Delivers
Compliance-as-Code | 207 Cloud Custodian policies that continuously enforce HIPAA (564 controls), ISO 27001 (138 controls), CIS, and NIST
Infrastructure-as-Code | Formations define reproducible, auditable environments using CloudFormation, Terraform, CDK, or Pulumi
Frontend-as-Code | RosettaHub dashboard perspectives, views, and marketplace are configurable per institution

Together, these ensure that security, infrastructure, and user experience are all version-controlled, auditable, and repeatable.
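
To make the compliance-as-code idea concrete, here is a minimal sketch in which policies are data and an evaluator reports violations. It deliberately does not use the Cloud Custodian API or policy syntax; the `Policy` class, the two example rules, and the resource fields (`type`, `encrypted`, `public_ip`) are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Policy:
    name: str
    resource_type: str                # which resources the policy applies to
    violates: Callable[[dict], bool]  # returns True when a resource breaks the rule


# Hypothetical rules; real policy sets are far larger and map to named controls.
POLICIES = [
    Policy("storage-must-be-encrypted", "storage",
           lambda r: not r.get("encrypted", False)),
    Policy("no-public-ip-on-engines", "engine",
           lambda r: r.get("public_ip") is not None),
]


def evaluate(resources: list[dict], policies: list[Policy] = POLICIES) -> list[tuple[str, str]]:
    """Return (policy name, resource id) for every violation found."""
    return [
        (p.name, r["id"])
        for r in resources
        for p in policies
        if r["type"] == p.resource_type and p.violates(r)
    ]


inventory = [
    {"id": "s1", "type": "storage", "encrypted": True},
    {"id": "s2", "type": "storage", "encrypted": False},
    {"id": "e1", "type": "engine", "public_ip": "203.0.113.7"},
]
violations = evaluate(inventory)
print(violations)
```

Running the evaluator on a schedule or in response to resource-change events is what turns a static rule set into continuous enforcement.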

Compliance and Data Protection

Capability | Description
Cloud Custodian | Automated policy enforcement across all cloud accounts
GDPR Alignment | Anonymous cloud accounts decouple researcher identity from cloud-level billing
Encryption Key Lifecycle | Full key creation, rotation, and revocation managed through the platform
Audit Logging | All user actions, launches, and data access events are recorded
Budget Governance | Real-time enforcement prevents uncontrolled resource creation
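
The difference between real-time budget enforcement and after-the-fact reporting can be sketched as a check that runs on the launch event itself. The source does not describe how RosettaHub implements this, so the `Budget` class, the `on_launch_requested` handler, and the use of an estimated cost are all assumptions for this example.

```python
from dataclasses import dataclass


class BudgetExceeded(Exception):
    """Raised when a launch would push spending past the budget limit."""


@dataclass
class Budget:
    limit: float
    committed: float = 0.0

    def remaining(self) -> float:
        return self.limit - self.committed


def on_launch_requested(budget: Budget, estimated_cost: float) -> None:
    """Handle a launch event: refuse it up front if it does not fit the budget."""
    if estimated_cost < 0:
        raise ValueError("estimated cost must be non-negative")
    if estimated_cost > budget.remaining():
        raise BudgetExceeded(
            f"launch needs {estimated_cost:.2f}, only {budget.remaining():.2f} left"
        )
    # Commit the cost immediately so later launch events see the reduced balance.
    budget.committed += estimated_cost


project_budget = Budget(limit=100.0)
on_launch_requested(project_budget, 60.0)      # accepted
try:
    on_launch_requested(project_budget, 50.0)  # would exceed the limit
except BudgetExceeded as exc:
    print(f"blocked: {exc}")
```

Because the check happens before the resource exists, an over-budget launch never starts, rather than being discovered once billing data arrives.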

Platform Strengths for TRE

Why RosettaHub

  • Mature platform -- 7+ years in production with research institutions
  • User onboarding in seconds -- SSO integration, no manual account setup
  • Real-time cost governance -- event-driven budget enforcement rather than reporting that lags behind billing
  • Flexible hierarchy -- organizations, sub-organizations, projects map to any institutional structure
  • Multi-cloud -- deploy TREs on AWS, Azure, or GCP without changing the workflow
  • Research institution experience -- 8+ years in production with research institutions

Competitive Advantages

Differentiator | RosettaHub TRE | Typical TRE Providers
Multi-cloud support | AWS, Azure, GCP, Alibaba Cloud, OVH, OpenStack | Usually single-cloud
Governance + compute | Closed-loop in one platform | Separate tools, manual integration
Formation-based environments | Cloud-agnostic, reproducible, shareable | Provider-specific templates
Real-time cost control | Event-driven enforcement | Billing-lag reporting
User onboarding | Seconds via SSO | Days/weeks with manual provisioning
Organization hierarchy | Unlimited nesting, budget delegation | Flat or two-level structures

Next Steps