Using Automated Builds in ModelOps
In this installment of the ModelOps Blog Series, we transition from what it takes to build AI models to the process of deploying them into production. Think of this as the on-ramp for extracting value from your AI investments: moving your model out of the lab and into an environment where it can provide new insights for your organization or add value for customers.
Front and center is the concept of continuous integration (CI) and continuous deployment (CD). This methodology can be applied to automate the process of releasing AI models in a reproducible and reliable manner. Get ready to walk away with everything you need to know in order to leverage containers to formalize and manage AI models within your organization.
The starting point for the deployment process is a source-controlled, versioned AI model. Need a refresher on how to get to that starting point? Review the previous blogs in this series, which cover how to produce a model with responsibly sourced data and software development best practices around model training and versioning.
Data Preparation: Putting the Right Process in Place for AI
Model Training: Our Favorite Tools in the Shed
Model Versioning: Reduce Friction. Create Stability. Automate.
Living in a containerized world
For ModelOps, containers are a standard way to package AI models for use in production. In essence, a container is a running software application packaged with the minimum it needs to run, including an operating system, application source code, system dependencies, programming language libraries, and runtime. Containers are created from static container images, which specify each resource and instruction required to bring the application to life within the container.
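To make the idea of an image "outlining each resource and instruction" more concrete, here is a minimal sketch that renders a Dockerfile from a short description of what a model needs. The `ImageSpec` class, the `render_dockerfile` function, the `/opt/model` path, and the `app.py` entry point are all hypothetical names chosen for illustration. The sketch also assumes a Debian-based base image and a requirements file at the top level of the build context.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class ImageSpec:
    """Hypothetical description of what a model image must contain."""
    base_image: str                                   # operating system + language runtime
    system_packages: List[str] = field(default_factory=list)  # system dependencies
    requirements_file: str = "requirements.txt"       # programming language libraries
    source_dir: str = "."                             # application source code
    entrypoint: List[str] = field(default_factory=lambda: ["python", "app.py"])


def render_dockerfile(spec: ImageSpec) -> str:
    """Turn the spec into Dockerfile text, one instruction per requirement."""
    lines = [f"FROM {spec.base_image}"]
    if spec.system_packages:
        # Assumption: a Debian-based base image, so apt-get is available.
        packages = " ".join(spec.system_packages)
        lines.append(
            "RUN apt-get update && apt-get install -y --no-install-recommends "
            f"{packages} && rm -rf /var/lib/apt/lists/*"
        )
    lines.append("WORKDIR /opt/model")
    # Copy the dependency list first so this layer is cached between code changes.
    lines.append(f"COPY {spec.requirements_file} ./")
    lines.append(f"RUN pip install --no-cache-dir -r {spec.requirements_file}")
    lines.append(f"COPY {spec.source_dir} ./")
    # The exec (JSON array) form of ENTRYPOINT runs the process without a shell.
    lines.append("ENTRYPOINT " + json.dumps(spec.entrypoint))
    return "\n".join(lines) + "\n"
```

Calling `render_dockerfile(ImageSpec(base_image="python:3.11-slim"))` returns text that can be saved as a Dockerfile. Each line of that file maps back to one of the ingredients listed above: runtime, system dependencies, libraries, and source code.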
Your organization might already embrace containers or microservices in more traditional software and DevOps settings. But did you know containers can also be applied to the packaging and distribution of AI models for data science teams? That’s good news for leaders investing in the development of AI models, because it means that models, along with their difficult-to-install dependencies, can be packaged into containers that run anywhere. Upskilling your data science team and familiarizing them with container technology will empower them to package their own AI models and participate in a robust CI/CD process, which can shorten the time it takes to realize a return on your AI investments.
Extending the notion of an AI model
Modzy extends the container concept to power AI models running in production. AI models are deployed through an open, standardized template that encourages developers to expose the functionality of their AI model while ensuring it can run anywhere (see example). With the focus kept on production deployment, a single set of best practices can be put in place. Without standardization, model developers often work in disparate development environments, which creates challenges when reproducing models or handing them off from the research team to the production team.
Standardizing how models are packaged means data scientists don’t need deep expertise in either software engineering or DevOps, yet they can still reap the benefits of these disciplines. Data scientists can focus on developing new models to solve important problems instead of hacking together patchwork solutions every time a model is ready for deployment.
Ideally, you want a suite of standard templates for popular machine learning frameworks such as TensorFlow and PyTorch, giving data scientists the flexibility to use their tools of choice. This is a capstone to the process of model training described in Model Training: Our Favorite Tools in the Shed. Developers can make individualized decisions during the development of each model without compromising a streamlined process for model development and release.
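One way to picture a standardized template is as a small contract that every model implements, whatever framework sits behind it. The sketch below is an illustration only and is not Modzy's actual template. The names `ModelTemplate`, `load`, `predict`, `KeywordSentimentModel`, and `serve_once` are hypothetical, and the toy keyword model simply stands in for a real TensorFlow or PyTorch model.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class ModelTemplate(ABC):
    """Hypothetical packaging contract that every model would implement."""

    @abstractmethod
    def load(self) -> None:
        """Load weights and other resources once, when the container starts."""

    @abstractmethod
    def predict(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
        """Run inference on one input and return a JSON-serializable result."""


class KeywordSentimentModel(ModelTemplate):
    """Toy stand-in for a model built with any framework."""

    def load(self) -> None:
        self.positive = {"good", "great", "excellent"}

    def predict(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
        words = inputs["text"].lower().split()
        score = sum(word in self.positive for word in words)
        return {"label": "positive" if score > 0 else "neutral", "score": score}


def serve_once(model: ModelTemplate, inputs: Dict[str, Any]) -> Dict[str, Any]:
    """Production-side code depends only on the template, not on the framework."""
    model.load()
    return model.predict(inputs)
```

Because `serve_once` only knows about `ModelTemplate`, the production team can run any model that follows the contract, and the research team remains free to choose its own tools inside `load` and `predict`.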
Leveraging CI for automated builds
A CI/CD process that takes the source code for a freshly developed AI model and automatically produces a containerized version of that model is the gold standard for build automation. Establishing such a process makes deployment fully reproducible, with no manual steps that could introduce errors or consume valuable developer time. Modern CI tools such as Jenkins, CircleCI, or GitHub Actions are essential parts of the CI/CD pipeline. They keep your team’s development velocity high by letting data scientists focus on developing models instead of working through deployment details, which translates directly into faster delivery to production.
Modzy’s approach combines continuous integration best practices with containerization to build container images for models. Automating the build process applies model versioning best practices to every model, ensuring that each one is traceable to a specific version of secure, tested code. (Check out where this was highlighted in the Model Versioning: Reduce Friction. Create Stability. Automate. blog.) Once a model developer checks their code in to version control, the AI model image is built, scanned, and tested, making it ready for any hand-off or deployment. This simple, convenient process makes automated builds something developers will seek out rather than a burdensome business practice.
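The build, scan, and test sequence described above can be sketched as a small pipeline that runs its stages in order and stops at the first failure. This is a simplified illustration under stated assumptions, not Modzy's implementation or the API of any CI tool. `BuildResult`, `image_tag_for`, and `run_pipeline` are hypothetical names, and each stage is a plain function standing in for a real build, vulnerability scan, or test run.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# A stage receives the image tag and reports whether it succeeded.
Stage = Tuple[str, Callable[[str], bool]]


@dataclass(frozen=True)
class BuildResult:
    image_tag: str
    passed: bool
    completed_stages: Tuple[str, ...]


def image_tag_for(model_name: str, commit_sha: str) -> str:
    """Tie the image to the exact commit it was built from (traceability)."""
    return f"{model_name}:{commit_sha[:12]}"


def run_pipeline(model_name: str, commit_sha: str, stages: List[Stage]) -> BuildResult:
    """Run the stages in order; a failing stage stops the pipeline."""
    tag = image_tag_for(model_name, commit_sha)
    completed: List[str] = []
    for name, stage in stages:
        if not stage(tag):
            return BuildResult(tag, False, tuple(completed))
        completed.append(name)
    return BuildResult(tag, True, tuple(completed))


if __name__ == "__main__":
    # Placeholder stages; a real pipeline would invoke the build tool,
    # an image scanner, and the model's test suite here.
    demo_stages: List[Stage] = [
        ("build", lambda tag: True),
        ("scan", lambda tag: True),
        ("test", lambda tag: True),
    ]
    print(run_pipeline("sentiment", "9fceb02d0ae598e95dc970b74767f19372d61af8", demo_stages))
```

Deriving the image tag from the commit hash is one simple way to keep every image traceable to a specific version of the code. An image is considered ready for hand-off only when `passed` is true, meaning every stage has completed.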
Empowering teams of data scientists and machine learning engineers through robust practices of CI and containerization will serve to bridge the gap between AI development and deployment at scale.
Visit modzy.com to learn more.