The Data Science Workbench
Overview
The RosettaHub Data Science Workbench is a real-time collaborative IDE for data scientists, built into the Supercloud platform. Its hybrid engines keep state between interactions and connect Python, R, Scala/Spark, SQL, MATLAB, Mathematica, and other environments, so that different tools can share a workspace and in-memory variables. The workbench breaks down the silos between data science environments, letting teams work across languages and tools within a single interface.
The workbench runs on dedicated cloud compute provisioned through RosettaHub formations, giving you full control over instance type, GPU resources, region, and cloud provider -- all governed by your organization's budget controls via Cloud Operations.
Rapid Data Science Application Prototyping
Build interactive dashboards and data science web applications without leaving the workbench.
The workbench includes a designer for interactive widgets and collaborative applications that simplifies the path from prototype to production. Using a reactive programming model combined with hybrid engines, you can:
- Mix languages through macros -- call Python functions from R, pass Scala results to SQL, and chain computations across environments
- Bind variables to widgets -- create sliders, dropdowns, and input fields that are automatically connected to code variables
- Compose visualization components -- combine charts, tables, and maps into interactive layouts
- Deploy to the web with one click -- publish your prototype as a standalone web application accessible by stakeholders
This approach shortens the path from exploring data to delivering applications that end users can act on.
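The reactive model behind variable-to-widget binding can be illustrated with a minimal sketch in plain Python. This is not the workbench's API: the `Observable` class, the `threshold` variable, and the `recompute` function are hypothetical names used only to show how a change to a bound value can trigger dependent computations.

```python
from typing import Callable, Generic, List, TypeVar

T = TypeVar("T")


class Observable(Generic[T]):
    """A value that notifies subscribers whenever it changes (illustrative only)."""

    def __init__(self, value: T) -> None:
        self._value = value
        self._subscribers: List[Callable[[T], None]] = []

    @property
    def value(self) -> T:
        return self._value

    @value.setter
    def value(self, new_value: T) -> None:
        # Only notify when the value actually changes.
        if new_value != self._value:
            self._value = new_value
            for callback in self._subscribers:
                callback(new_value)

    def subscribe(self, callback: Callable[[T], None]) -> None:
        self._subscribers.append(callback)


# Hypothetical variable that a slider widget would be bound to.
threshold = Observable(0.5)
scores = [0.2, 0.55, 0.9]
selected: List[float] = []


def recompute(current_threshold: float) -> None:
    """Re-run the dependent computation, as a bound chart or table would."""
    selected[:] = [s for s in scores if s >= current_threshold]


threshold.subscribe(recompute)
recompute(threshold.value)  # initial computation
threshold.value = 0.8       # simulates the user moving the slider
```

In this sketch, assigning to `threshold.value` plays the role of a widget interaction, and every subscribed function reruns without any explicit refresh call.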
Scientific Collaborative Spreadsheets
A web-based collaborative spreadsheet that interacts directly with R, Python, and Scala -- bridging the gap between spreadsheet workflows and programmatic analysis.
Key capabilities:
- Variable import/export -- push spreadsheet ranges into Python, R, or Scala variables, and pull computed results back into cells
- Cell mirroring -- link spreadsheet cells to code outputs so they update automatically when computations run
- Automatic formula mapping -- generate spreadsheet formulas from Python, R, or Scala functions, making complex computations accessible through familiar spreadsheet syntax
- Code cells -- individual cells can contain executable code alongside their results, turning the spreadsheet into a multi-language notebook
This hybrid model is ideal for teams that need the accessibility of spreadsheets with the power of programmatic data science.
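To make the idea of automatic formula mapping concrete, here is a small sketch that derives a spreadsheet-style formula template from a Python function signature using the standard `inspect` module. The helper `formula_template` and the example function `compound_growth` are assumptions made for illustration; the formula syntax the workbench actually generates may differ.

```python
import inspect
from typing import Callable


def formula_template(func: Callable[..., object]) -> str:
    """Build a spreadsheet-style formula template from a function signature.

    Hypothetical helper: it only shows how a function's name and parameters
    could be turned into a formula string.
    """
    params = inspect.signature(func).parameters
    args = ", ".join(params)  # parameter names, in declaration order
    return f"={func.__name__.upper()}({args})"


def compound_growth(principal, rate, years):
    """Value of `principal` after `years` of growth at `rate` per year."""
    return principal * (1 + rate) ** years


template = formula_template(compound_growth)
print(template)
```

A spreadsheet user would then fill each parameter with a cell reference or a literal value, while the computation itself stays in the Python function.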
Notebooks at Scale
Serve production-ready notebook and application environments to multiple users simultaneously with real-time collaboration built in.
Supported tools include:
| Tool | Description |
|---|---|
| Jupyter Notebook / JupyterLab | Interactive Python, R, and Julia development |
| RStudio | Statistical analysis and R development |
| Apache Zeppelin | Multi-language notebook with built-in Spark support |
| Spark Notebook | Scala-native interactive Spark development |
| Shiny Apps | Interactive R web applications |
| ParaView | Scientific visualization and 3D rendering |
| VNC Desktops | Full graphical Linux desktops in the browser |
Each environment runs on isolated cloud infrastructure managed by RosettaHub, with support for GPU instances, spot/preemptible pricing, and automatic hibernation. Administrators can deploy environments to entire teams using formation batch actions.
Excel as Front-End
An Excel add-in that transforms Microsoft Excel into a front-end for the RosettaHub Supercloud -- turning spreadsheets into universal notebooks connected to cloud compute.
Features of the Excel add-in:
- Real-time connectivity -- connect Excel to running RosettaHub sessions and execute code on cloud instances directly from the spreadsheet
- Cell mirroring -- link Excel cells to server-side variables that update in real time as computations run
- Ranges-to-variables conversion -- select Excel ranges and push them into Python, R, or Scala as named variables for further processing
- Automatic Excel formula generation -- define functions in Python, R, or Scala and generate corresponding Excel formulas, making cloud-powered computations available to any Excel user
A Microsoft Word add-in is also provided, enabling report generation and document automation connected to the same cloud compute backends.
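The ranges-to-variables conversion described above can be pictured as a round trip between a rectangular range and named columns. The sketch below uses only the Python standard library and is not the add-in's interface: `range_to_columns`, `columns_to_range`, and the sample data are hypothetical, and the first row of the range is assumed to hold column headers.

```python
from typing import Dict, List, Sequence, Union

Cell = Union[str, float, int, None]


def range_to_columns(cells: Sequence[Sequence[Cell]]) -> Dict[str, List[Cell]]:
    """Treat the first row as headers and return one list per column."""
    header, *rows = cells
    return {str(name): [row[i] for row in rows] for i, name in enumerate(header)}


def columns_to_range(columns: Dict[str, List[Cell]]) -> List[List[Cell]]:
    """Inverse operation: rebuild a rectangular range with a header row."""
    names = list(columns)
    body = [list(values) for values in zip(*columns.values())]
    return [names] + body


# Hypothetical contents of a selected range such as A1:B4.
selected_range = [
    ["region", "sales"],
    ["north", 120.0],
    ["south", 95.5],
    ["west", 143.25],
]

# Push the range into named variables, compute, then pull the result back.
variables = range_to_columns(selected_range)
variables["sales_with_tax"] = [round(v * 1.2, 2) for v in variables["sales"]]
result_range = columns_to_range(variables)
```

Here `result_range` has the same shape as the original selection plus one computed column, which is the form a spreadsheet would need to write the results back into cells.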
Note
The Excel and Word add-ins connect to running RosettaHub sessions. You must have an active machine running a compatible workbench formation to use these features.
Getting Started
To start using the Data Science Workbench:
- Launch a notebook formation -- select a JupyterLab, RStudio, or Zeppelin formation from the Formations panel and launch it. See Launching Your First Formation for a step-by-step walkthrough.
- Connect to the running session -- click the machine in the Machines panel, or right-click it and select Get Connectivity Info, to open the web interface.
- Start working -- write code, visualize data, and collaborate with teammates in real time.
For GPU-accelerated workloads, choose an instance type with NVIDIA GPUs (e.g., AWS P4d, Azure NC series, GCP A2) when launching your formation.
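Once a GPU formation is running, it can be useful to confirm from a notebook cell that the GPUs are visible. The following is a minimal sketch, assuming the instance image ships the NVIDIA driver tools (`nvidia-smi`); the function name `visible_gpus` is illustrative, and the check returns an empty list when no GPU or driver is found.

```python
import shutil
import subprocess
from typing import List


def visible_gpus() -> List[str]:
    """Return the names of NVIDIA GPUs reported by nvidia-smi, or [] if none."""
    if shutil.which("nvidia-smi") is None:
        return []  # driver tools are not installed on this instance
    try:
        completed = subprocess.run(
            ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            check=True,
            timeout=30,
        )
    except (subprocess.SubprocessError, OSError):
        return []  # tool is present but no usable GPU or driver responded
    return [line.strip() for line in completed.stdout.splitlines() if line.strip()]


gpus = visible_gpus()
print(f"{len(gpus)} GPU(s) visible: {gpus}")
```

An empty result on an instance type that should have GPUs usually points to the machine image rather than the instance type, so checking the image used by the formation is a reasonable next step.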
Tip
If you are new to the platform, start with the Getting Started tutorial to learn dashboard navigation, then return here to set up your data science environment.
Related Topics
- For Data Science Teams - Industry solution overview with GPU instances, spot hibernation, and ML workflows
- Formations - Create and manage cloud-agnostic IaC recipes for workbench environments
- Sessions - Manage running sessions with real-time cost tracking
- Images - Machine images used by workbench formations
- Storages - Mount cloud storage for datasets and model artifacts