Amazon SageMaker Unified Studio is a single integrated development environment (IDE) that brings together your data tools for analytics and AI. As part of the next generation of Amazon SageMaker, it contains integrated tooling for building data pipelines, sharing datasets, monitoring data governance, running SQL analytics, building artificial intelligence and machine learning (AI/ML) models, and creating generative AI applications. Recently, AWS announced two additional options that enhance the development experience for analytics, ML, and generative AI teams: Code Editor and multiple spaces. These new IDE options can help developers and data scientists speed up delivery of ML workloads by offering familiar IDE layouts, using popular extensions to enhance development, and using critical debug and test options, all within a unified environment.
Code Editor, based on Code-OSS (Visual Studio Code – Open Source), provides a lightweight and powerful IDE with familiar shortcuts and terminal access, along with advanced debugging capabilities and refactoring tools. Visual Studio Code, together with Code-OSS variants such as Code Editor, has been among the most popular development tools in recent years. Teams can boost their productivity by accessing thousands of Code Editor-compatible extensions from the Open VSX extension gallery. The Code Editor IDE within SageMaker Unified Studio supports version control and cross-team collaboration through GitHub, GitLab, or Bitbucket repositories, and comes preconfigured with the Amazon SageMaker Distribution image for popular ML frameworks.
Within SageMaker Unified Studio, a space is a work environment that runs a particular IDE. To maximize the benefits of Code Editor alongside other coding interfaces in SageMaker Unified Studio, including JupyterLab, SageMaker now supports multiple spaces per user per project. With multiple spaces, users can manage parallel workstreams with different computational needs. Each space maintains a 1-to-1 relationship with an application instance, so users can efficiently organize their storage and resource requirements. This enhancement provides the flexibility to access multiple applications and instances simultaneously, improving workflow management and productivity.
In this post, we walk through how you can use the new Code Editor and multiple spaces support in SageMaker Unified Studio. The sample solution shows how to develop an ML pipeline that automates the typical end-to-end ML activities to build, train, evaluate, and (optionally) deploy an ML model.
Features of Code Editor in SageMaker Unified Studio
Code Editor offers a unique set of features to increase the productivity of your ML team:
Fully managed infrastructure – The Code Editor IDE runs on fully managed infrastructure. SageMaker takes care of keeping the instances up-to-date with the latest security patches and upgrades.
Dial resources up and down – With Code Editor, you can seamlessly change the underlying resources (such as instance type or EBS volume size) on which Code Editor is running. This is beneficial for developers who want to run workloads with changing compute, memory, and storage needs.
SageMaker provided images – Code Editor is preconfigured with Amazon SageMaker Distribution as the default image. This container image has the most popular ML frameworks supported by SageMaker, along with the SageMaker Studio SDK, SageMaker Python SDK, Boto3, and other AWS and data science specific libraries installed. This significantly reduces the time you spend setting up your environment and decreases the complexity of managing package dependencies in your ML project.
Amazon Q Developer – Code Editor also comes with generative AI capabilities powered by Amazon Q Developer. You can boost your productivity by generating inline code suggestions within the IDE. In addition, you can use Amazon Q chat to ask questions about building at AWS and for assistance with software development. Amazon Q can explain coding concepts and code snippets, generate code and unit tests, and improve code, including debugging or refactoring.
Extensions and configuration settings – Code Editor also includes persistence of installed extensions and configuration settings.
When you open Code Editor, you will notice that the space has been bootstrapped with the current state of your project’s repository. Navigate to the file explorer, and you will find a getting_started.ipynb Jupyter notebook, as shown in the following screenshot.
You can choose Run All to execute this notebook. Select Python Environments when prompted to select the kernel and then choose the recommended Python environment named base. Now the getting_started notebook will be executed, and you can explore the output of the various cells.
Architecture of Code Editor in SageMaker Unified Studio
When you open Code Editor in SageMaker Unified Studio, it creates an application container that runs on an Amazon Elastic Compute Cloud (Amazon EC2) instance. This instance type matches your selection during Code Editor space configuration. The underlying infrastructure management happens automatically in a service-managed account controlled by SageMaker Unified Studio. The following diagram shows the infrastructure as it relates to end-users and how instances are provisioned. User A has configured two spaces, and User B is using a single space. Both users have the option to create additional spaces as needed. Currently, these spaces are isolated private environments, with shared space functionality planned for a future release.
SageMaker Unified Studio lets you create multiple spaces with Code Editor or JupyterLab as the IDE, each configurable with different ML instance types, including those with accelerated computing capabilities. For each space, you must specify three core elements: the EBS volume size, your chosen instance type, and the application type you want to run (such as Code Editor or JupyterLab). When you initiate a space, SageMaker Unified Studio automatically provisions a compute instance and launches a SageMaker Unified Studio Code Editor application using your specified container image. The storage system is designed for continuity: your EBS volume persists across sessions, even when you stop and restart the IDE. This means that when you stop the Code Editor application to save on computing costs, although the compute resources shut down, your EBS volume is preserved. Upon restart, the system automatically reattaches this volume, so your work remains intact.
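The three core elements of a space, and the way the EBS volume outlives a change of compute, can be sketched in a few lines of Python. This is a conceptual model only: SpaceConfig, resize, the space names, and the volume sizes are hypothetical and are not part of any AWS API. In SageMaker Unified Studio, you make these choices on the Compute tab of the project.

```python
from dataclasses import dataclass, replace

# Hypothetical model for illustration only; these names are not an AWS API.
ALLOWED_APP_TYPES = ("CodeEditor", "JupyterLab")


@dataclass(frozen=True)
class SpaceConfig:
    """A space name plus the three core elements chosen for it."""
    name: str
    app_type: str        # which IDE the space runs
    instance_type: str   # for example "ml.t3.medium"
    ebs_volume_gb: int   # persistent storage attached to the space

    def __post_init__(self):
        if self.app_type not in ALLOWED_APP_TYPES:
            raise ValueError(f"unsupported app type: {self.app_type}")
        if self.ebs_volume_gb <= 0:
            raise ValueError("EBS volume size must be positive")


def resize(space: SpaceConfig, instance_type: str) -> SpaceConfig:
    """Return a copy of the space on a different instance type.

    The volume size is carried over unchanged, mirroring how the EBS
    volume is preserved when the compute is stopped and restarted.
    """
    return replace(space, instance_type=instance_type)


# One user, one project, two parallel workstreams with different compute needs.
spaces = [
    SpaceConfig("etl-dev", "JupyterLab", "ml.t3.medium", 16),
    SpaceConfig("train-debug", "CodeEditor", "ml.g6.xlarge", 100),
]
bigger = resize(spaces[0], "ml.m5.xlarge")
print(bigger)
```

Each SpaceConfig stands for exactly one application on one instance, which is the 1-to-1 relationship between a space and its application instance.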
Solution overview
In the following sections, we show how to develop an ML project with Code Editor on SageMaker Unified Studio. For this example, we run through a Jupyter notebook that creates an ML pipeline using Amazon SageMaker Pipelines, which automates the usual tasks of building, training, and (optionally) deploying a model.
In this scenario, Code Editor can be used by an ML engineering team that needs advanced IDE features to test and debug its code, create and execute a pipeline, and monitor the pipeline status in SageMaker Unified Studio.
Prerequisites
To prepare your organization to use the new Code Editor IDE and multiple spaces support in SageMaker Unified Studio, complete the following prerequisite steps:
Create an AWS account.
Configure AWS IAM Identity Center accordingly.
By default, authentication and authorization for a SageMaker Unified Studio domain are controlled through IAM Identity Center, which can be configured in only one AWS Region. That Region must be the same Region as your SageMaker domain. See Setting up Amazon SageMaker Unified Studio for additional information.
Create a SageMaker Unified Studio domain using the quick setup. A virtual private cloud (VPC) is required; one will be created for you (if needed) during setup.
After you create the domain, you can enable access to SageMaker Unified Studio for users with single sign-on (SSO) credentials through IAM Identity Center by choosing Configure next to Configure SSO user access in the Next steps for your domain section.
After you configure user access for your newly created domain, navigate to the SageMaker Unified Studio URL and log in using SSO.
You can find the URL on the SageMaker console, as shown in the following screenshot.
By default, IAM Identity Center requires multi-factor authentication on user accounts, and you might be prompted to configure this upon first login to SageMaker Unified Studio, as shown in the following screenshot. For more details about this requirement, refer to Registering your device for MFA.
After you log in, choose Create Project and follow the prompts to create your first SageMaker Unified Studio project. We choose the All Capabilities project profile during setup.
We abstract away some of the concepts around project profiles in this post for simplicity. For more information, refer to Project profiles in Amazon SageMaker Unified Studio.
After you create a project, you can create your space (an IDE) in which Code Editor will be provisioned.
On the Compute tab of the project, choose Create Space, then enter a name and choose Code Editor.
When the Status column indicates the space is Running, open the space to be redirected to Code Editor.
Interacting with AWS services directly from your IDE
Out of the box, Code Editor comes with the AWS Toolkit for Visual Studio Code, which gives you an integrated experience with other AWS services during your project, such as viewing data in your Amazon Simple Storage Service (Amazon S3) buckets, finding container images in Amazon Elastic Container Registry (Amazon ECR), or viewing Amazon CloudWatch logs for your SageMaker environment.
The AWS Toolkit for Visual Studio Code uses the permissions of the AWS Identity and Access Management (IAM) role assigned to the project. You can find the Amazon Resource Name (ARN) of the project role on the project details page, as shown in the following screenshot.
Use Code Editor to create and execute an ML pipeline in SageMaker
In this section, we upload and execute a Jupyter notebook that creates and starts a machine learning operations (MLOps) pipeline orchestrated with SageMaker Pipelines. The pipeline we create follows a typical ML application pattern of data preprocessing, training, evaluation, model creation, transformation, and model registration, as illustrated in the following diagram.
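SageMaker Pipelines runs the steps of a pipeline as a directed acyclic graph, so a step starts only after the steps it depends on have finished. The following library-free sketch illustrates that ordering for the six stages named above, using Python's standard graphlib module rather than the SageMaker Python SDK. The dependency edges are an assumption for illustration; the sample notebook defines the actual steps and their inputs.

```python
from graphlib import TopologicalSorter

# Step names follow the pattern described in this section. The edges below
# are assumed for illustration and may differ from the sample notebook.
dependencies = {
    "preprocess": set(),
    "train": {"preprocess"},
    "evaluate": {"preprocess", "train"},      # scores the trained model on held-out data
    "create_model": {"train"},
    "transform": {"create_model"},            # batch inference with the created model
    "register_model": {"train", "evaluate"},
}


def execution_order(deps: dict) -> list:
    """Return one valid order in which every step follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(" -> ".join(order))
```

Several orders can satisfy the same graph. Steps with no dependency between them, such as evaluation and model creation in this sketch, are not ordered relative to each other.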
Begin by uploading the sample notebook directly into Code Editor. You can drag and drop the notebook, or right-click and choose Upload in the file explorer pane.
You can also download the sample notebooks with a standard git clone command from the GitHub repository where they are located. Running the Full Pipeline notebook sample requires a few IAM role permissions beyond the defaults assigned when the SageMaker Unified Studio project is created. The Quick Pipeline can be run as-is with the default IAM permissions.
Region availability, cost, and limitations
Code Editor and multiple spaces support are available in supported SageMaker Unified Studio domains. For more information about Regions where these features are available, see Regions where Amazon SageMaker Unified Studio is supported. Code Editor will be provisioned within a SageMaker space and run on a user-selectable instance type, anywhere from ultra low-cost instances (ml.t3.medium) up to highly performant GPU-based instances (G6 instance family).
The primary cost associated with running a Code Editor space is tied directly to the underlying compute instance type. The hourly costs for ML instance types can be found on the Amazon SageMaker AI pricing page on the Instance details tab. To prevent unnecessary charges, the space is automatically shut down after a configurable timeout when it is idle (see SpaceIdleSettings). There are also minimal storage charges for the EBS volume that is attached to the Code Editor space.
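The two cost components described above, instance hours and EBS storage, can be combined into a rough monthly estimate. The following sketch is back-of-the-envelope arithmetic only. The rates in the example are placeholders rather than actual AWS prices, and the model assumes the instance keeps billing after each session until the idle timeout elapses. Look up real rates on the pricing page before relying on the numbers.

```python
def estimate_monthly_cost(
    sessions_per_month: int,
    active_hours_per_session: float,
    idle_timeout_minutes: int,
    instance_hourly_usd: float,
    ebs_volume_gb: int,
    ebs_usd_per_gb_month: float,
) -> dict:
    """Rough monthly cost for one space, split into compute and storage."""
    # Assumption: after each session the instance stays up, and is billed,
    # until the idle timeout shuts the space down.
    billed_hours = sessions_per_month * (
        active_hours_per_session + idle_timeout_minutes / 60
    )
    compute = billed_hours * instance_hourly_usd
    # The EBS volume persists while the space is stopped, so storage is
    # charged for the whole month regardless of usage.
    storage = ebs_volume_gb * ebs_usd_per_gb_month
    return {
        "billed_hours": billed_hours,
        "compute_usd": round(compute, 2),
        "storage_usd": round(storage, 2),
        "total_usd": round(compute + storage, 2),
    }


# Placeholder rates for illustration only; these are not actual AWS prices.
estimate = estimate_monthly_cost(
    sessions_per_month=20,
    active_hours_per_session=4,
    idle_timeout_minutes=60,
    instance_hourly_usd=0.05,
    ebs_volume_gb=16,
    ebs_usd_per_gb_month=0.10,
)
print(estimate)
```

Under this model, a shorter idle timeout reduces billed hours per session, while the storage charge depends only on the volume size.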
At launch, Code Editor spaces can be configured to use a particular SageMaker Distribution image, either version 2.6 or 3.1. Additional major and minor releases of the SageMaker Distribution will be added over time.
Clean up
To avoid incurring additional charges, delete the resources created from following this post. This includes any development environments created, such as Code Editor or JupyterLab spaces, which you can delete by navigating to the Project Compute navigation pane, choosing the Spaces tab, choosing the options menu (three vertical dots) aligned with the space, and choosing Delete. You can remove project resources by deleting the project, which can be done from the SageMaker Unified Studio console. There is no charge for a SageMaker Unified Studio domain, but you can optionally delete this from the SageMaker AI console. If you created IAM Identity Center users that you no longer need, delete the users from the IAM Identity Center console.
Conclusion
The addition of the new Code Editor IDE to SageMaker Unified Studio provides a familiar working environment to thousands of data scientists and developers. With this IDE, data scientists can more quickly build, train, tune, and deploy their ML models and move them into production, where they can deliver measurable ROI. With thousands of extensions available through the Open VSX extension gallery, developers gain usability and productivity as they build and deploy their generative AI applications.
In addition, SageMaker Unified Studio now supports multiple spaces per user per project. These new environment options can help MLOps personas segregate workloads, isolate compute resources, and increase productivity through parallelized workstreams. Together, these enhancements help data science teams work more efficiently in bringing ML and generative AI solutions into production, where they can begin to reap the benefits of their work.
To get started using SageMaker Unified Studio, refer to the Amazon SageMaker Workshop. This workshop provides complete step-by-step instructions, plus sample datasets, source code, and Jupyter notebooks for gaining hands-on experience with the tooling.
To learn more about Code Editor, see Using the Code Editor IDE in Amazon SageMaker Unified Studio.
About the authors
Paul Hargis has focused his efforts on machine learning at several companies, including AWS, Amazon, and Hortonworks. He enjoys building technology solutions and teaching people how to leverage them. Paul likes to help customers expand their machine learning initiatives to solve real-world problems. Prior to his role at AWS, he was lead architect for Amazon Exports and Expansions, helping amazon.com improve the experience for international shoppers.
Hazim Qudah is an AI/ML Specialist Solutions Architect at Amazon Web Services. He enjoys helping customers build and adopt AI/ML solutions using AWS technologies and best practices. Prior to his role at AWS, he spent many years in technology consulting with customers across many industries and geographies. In his free time, he enjoys running and playing with his dogs!
Jayan Kuttagupthan is a Senior Software Engineer at Amazon with over 15 years of experience in backend development and design. He is currently working on improving Seller Partner Support Experience at Amazon. As a technical leader, Jayan has successfully built and mentored engineering teams across organizations, while also contributing to the broader tech community through speaking engagements such as SRECon Asia.
Majisha Namath Parambath is a Senior Software Engineer at Amazon SageMaker with 9+ years at Amazon. She’s provided technical leadership on SageMaker Studio (Classic and V2) and Studio Lab, and now leads key initiatives for the next-generation Amazon SageMaker Unified Studio, delivering an end-to-end data analytics and interactive machine learning experience. Her work spans system design and architecture, and cross-team execution, with a focus on security, performance, and reliability at scale. Outside of work, she enjoys reading, cooking, and skiing.