Lecture by Josh Tobin. Notes by James Le and Vishnu Rachakonda.

Introduction

https://s3-us-west-2.amazonaws.com/secure.notion-static.com/d7a8a640-40ea-4abb-a628-91082746eafa/image21.png

Deploying a model is a critical part of making it good in the first place. When you only evaluate a model offline, it is easy to miss subtle flaws where it does not actually solve the problem your users need solved. Often, it is only when we deploy a model for the first time that we see whether it is doing a good job. Unfortunately, for many data scientists and ML engineers, model deployment is an afterthought relative to the other techniques we have covered.

As with other parts of the ML lifecycle, we'll focus on deploying a minimum viable model as early as possible, which means keeping it simple at first and adding complexity later. This lecture covers the following process:

1. Build a prototype
2. Separate your model and UI
3. Learn the tricks to scale
4. Consider moving your model to the edge when you really need to go fast

1 - Build a Prototype To Interact With

There are many great tools for building model prototypes. HuggingFace has some tools built into its playground. They have also recently acquired a startup called Gradio, which makes it easy to wrap a small UI around the model. Streamlit is another good option with a bit more flexibility.

https://s3-us-west-2.amazonaws.com/secure.notion-static.com/de3bb65c-951e-4add-a1de-65c6d766ea69/image19.png

Here are some best practices for prototype deployment:

- **Have a basic UI**: The goal at this stage is to play around with the model and collect feedback from other folks. Gradio and Streamlit are your friends here: creating a simple interface for the model often takes only a couple of lines of code.
- **Put it behind a web URL**: A URL is easy to share, and putting the model online gets you thinking early about the tradeoffs you will face with more complex deployment schemes. There are cloud versions of [Streamlit](<https://streamlit.io/cloud>) and [HuggingFace](<https://huggingface.co/>) for this.
- **Do not stress about it too much**: Building a prototype should not take more than a day.
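
To make the "couple of lines of code" concrete, here is a minimal sketch of wrapping a model in a Gradio UI. The `predict` function is a hypothetical stand-in (a toy keyword-based sentiment scorer, not a model from the lecture), and the `build_demo` helper name is also an assumption chosen for illustration; in practice you would call your own trained model inside `predict`.

```python
from typing import Dict

# Hypothetical keyword lists standing in for a real trained model.
POSITIVE = {"good", "great", "love", "excellent"}
NEGATIVE = {"bad", "awful", "hate", "terrible"}


def predict(text: str) -> Dict[str, float]:
    """Toy sentiment scorer: returns a confidence per class."""
    words = text.lower().split()
    pos = sum(word in POSITIVE for word in words)
    neg = sum(word in NEGATIVE for word in words)
    total = pos + neg
    if total == 0:
        # No known keywords: report an even split.
        return {"positive": 0.5, "negative": 0.5}
    return {"positive": pos / total, "negative": neg / total}


def build_demo():
    """Wrap `predict` in a small web UI (requires `pip install gradio`)."""
    import gradio as gr  # imported lazily so `predict` works without Gradio

    # "text" and "label" are Gradio's shortcut names for a textbox input
    # and a class-confidence output.
    return gr.Interface(fn=predict, inputs="text", outputs="label")
```

Calling `build_demo().launch()` starts a local web server for the interface, and `launch(share=True)` additionally asks Gradio for a temporary public link, which is one quick way to put the prototype behind a URL.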

A model prototype won't be your final deployment solution. First, a prototype has limited frontend flexibility, so eventually you will want to build a fully custom UI for the model. Second, a prototype does not scale to many concurrent requests: once you start having users, you will hit its scaling limits quickly.

https://s3-us-west-2.amazonaws.com/secure.notion-static.com/4f394c6d-d36a-409a-a2c0-17115172873b/image18.png