Deploying Pre-trained Keras Models Using TensorFlow 2 on Amazon SageMaker

How does one go about deploying a model on Amazon SageMaker from Keras, TensorFlow, or TensorFlow Hub without first doing training?

There are countless articles and blog posts discussing how to train a machine learning model using TensorFlow or Keras and then deploy that model right away to Amazon SageMaker. But what if you're already starting from a SavedModel, or just want to serve a model trained on plain vanilla ImageNet from within your own AWS account? You might end up wading through piles of confusing, outdated information that sends you down rabbit holes and makes things seem unnecessarily complex. For instance, you might be inclined to use compatibility libraries to dig low-level attributes out of modern classes so you can call older, deprecated functions, or to build totally unnecessary infrastructure on the side that makes a Docker container for your model, which will inevitably be broken because you don't know how to invoke it. And swimming through the AWS documentation yields few if any practical examples of the many classes you can instantiate to (presumably) achieve the same thing, even though they don't work the same at all. When it's all said and done, you might wonder why it works in the first place and, upon learning why, discover that something you used isn't what you intended to use after all.

Nevertheless, I had quite an adventure doing this, and have distilled it into a simple Jupyter notebook you can leverage with your own models for your own use. Check out the notebook now:

https://github.com/mrcity/mlworkshop/blob/master/sagemaker/deploy-model.ipynb

This notebook illustrates the following operations:

Instantiating a ResNet50 model from Keras, pre-loaded with ImageNet weights
Sanity-checking the model

Examining the model's layers without using TensorBoard
Setting up inference on multiple images at once

Creating a SavedModel with modern TensorFlow 2.0 APIs without requiring use of a Session
Tarring & GZipping the model
Re-importing the model

Using SageMaker 1.x library instead of SageMaker 2.x
Why can't you just make a Predictor from the Model you instantiated?

Deploying the model onto SageMaker
Invoking the model from scratch (should you restart your kernel and lose your Predictor)
Testing the hosted model using the same data you passed into the local model instance
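As a rough illustration of the "invoking the model from scratch" step, here is a minimal sketch that talks to an existing endpoint through boto3 rather than through a Predictor object, so it does not depend on which SageMaker SDK version you have. This is not necessarily how the notebook does it. The function names and the endpoint name you pass in are placeholders of my own, and the sketch assumes the endpoint is served by the TensorFlow Serving container, which accepts an "instances" JSON payload and answers with "predictions".

```python
import json


def build_payload(batch):
    """Wrap a batch of inputs in the TensorFlow Serving REST "instances" format."""
    # batch must be plain nested lists, e.g. numpy_array.tolist()
    return json.dumps({"instances": batch})


def invoke_endpoint(endpoint_name, batch):
    """Send a batch to an already-deployed endpoint and return its predictions."""
    import boto3  # imported lazily so build_payload works without AWS libraries

    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,  # placeholder: the name of your endpoint
        ContentType="application/json",
        Body=build_payload(batch),
    )
    # The response body is a stream containing the JSON reply from TF Serving
    return json.loads(response["Body"].read())["predictions"]
```

Since this only needs the endpoint name, it keeps working after a kernel restart has wiped out your Predictor.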

Here are some key takeaways I had from this experience that should help you if you're trying to do something slightly different from my recipe in the notebook linked to above.

Estimating My Confusion Around Estimators

In TensorFlow, the term "estimator" is reserved for a high-level abstraction of a model in TensorFlow's API. There are pre-defined Estimators for several different kinds of deep learning algorithms, and they are designed to be trained, tested, inferred from, and deployed using very simple, sleek function calls. They are also easy to deploy virtually anywhere because you don't have to change much code (if any) to go from CPU all the way to TPU deployments. That said, Estimators aren't quite as flexible as the Keras API offered by TensorFlow these days, and the use of custom Estimators is no longer recommended for modern code.

Estimators in Amazon SageMaker are more generic, and allow you to encapsulate any sort of model inside them whatsoever. You can define a SageMaker Estimator to be based on various libraries such as TensorFlow, Keras, or PyTorch, or even specify them to be loaded from a previously saved model. They also have the same convenience functions for invoking training, testing, and so forth as the TensorFlow estimators do. As such, if you are faced with instantiating an Estimator in SageMaker, just know it will not bind you to using a TensorFlow Estimator.

Check Your Bucket

Originally, I went down the path of zipping my SavedModel resources manually using tar commands in a Jupyter notebook cell, and then uploading them in the S3 console to a bucket of my choice. I had to tackle a long chain of errors in order to get to the realization that I don't need to upload the package to my own S3 bucket. There was a ton of trial and error to do before I could come to this realization:
(Note the things in the bulleted list below are things not to do, but I did them in search of the answer!)

Got rid of a bunch of warnings about old versions of TensorFlow, Python, SageMaker, or what have you. Of course it would tell you about them one at a time.
Set the role SageMaker was using to try to load the S3 resource into the Model instance.

This entails adding some AWS managed policies to the role, including AmazonSageMakerFullAccess (or using this as a template for your own policy, but limiting it to specific resources you wish to grant permission to). Originally I thought I needed such access to S3 as well, but the SageMaker role covers what you really need.
This also entails adding the identity provider sagemaker.amazonaws.com so that SageMaker services can assume the role you created.

Found out that arguments the documentation said would be removed/made optional were still required. (Gee, you're not still deploying notebook instances with old versions of SageMaker, are you?)
Specified what should have been a private variable to my model class instance (model.bucket) in order to have it use the bucket I wanted.
Set up public access to my bucket. Full public access. Barf. I was really not sure why this was required to get rid of errors when I was specifying a role for SageMaker to use that had appropriate S3 permissions.
Set my entry_point (which shouldn't have been required, but was) to a file local to the Jupyter notebook, and not something from within the archive I was uploading to S3.
Attempted to use my own credentials from ~/.aws/config and ~/.aws/credentials so it would understand my user account, roles, and permissions. (This led to the error "AWS CodePipeline error: Cross-account pass role is not allowed".)
Found the role it was trying to use, and added appropriate permissions to it.
Busted my head over why it couldn't find the model data in the archive, now that it had access to the archive I uploaded to S3. (This involved lots of experimentation with directory structures under which the SavedModel resources would live.)
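For the role setup mentioned in the list above, the trust relationship and managed policy can be sketched roughly as follows. This is a sketch under assumptions, not what I actually ran: I set things up in the console, the function name is hypothetical, and in practice you would probably scope the permissions down from AmazonSageMakerFullAccess.

```python
import json

# Trust relationship that lets the SageMaker service assume the role
SAGEMAKER_TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "sagemaker.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}


def create_sagemaker_role(role_name):
    """Create a role SageMaker can assume and attach the AWS managed policy."""
    import boto3  # imported lazily; requires credentials allowed to manage IAM

    iam = boto3.client("iam")
    role = iam.create_role(
        RoleName=role_name,  # hypothetical name of your choosing
        AssumeRolePolicyDocument=json.dumps(SAGEMAKER_TRUST_POLICY),
    )
    iam.attach_role_policy(
        RoleName=role_name,
        PolicyArn="arn:aws:iam::aws:policy/AmazonSageMakerFullAccess",
    )
    return role["Role"]["Arn"]
```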

Still coming up short after going through all these steps, I found myself searching for answers once again when a promising nugget darted before my eyes. It turns out that if you open a SageMaker session instance and use it to upload_data (as shown in the notebook), your Model object will actually be able to pick up the contents of your model when you create it.*

(*) A very important nuance of the output of your SavedModel is the use of the correct export path. You'll see the path export/Servo/0000000001/ used as the set of folders under which the model resources live. export/Servo/ is required by the Amazon SageMaker framework to locate the model in the archive (although if you serve multiple models from the same endpoint, you can evidently get rid of it). The 10-digit number is for TensorFlow Serving to help pick the correct version of the model to actually serve, in case you wish to update it later or even revert to a previous version. You can make it sequential, a Unix timestamp, or whatever you feel like.
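To make that directory layout concrete, here is a minimal sketch of exporting a model under export/Servo/<version>/ and packing it into an archive. The function names and default arguments are my own for illustration, and the export half assumes TensorFlow 2.x with its bundled Keras; the notebook remains the reference for what I actually ran.

```python
import os
import tarfile


def export_saved_model(model, version=1, base_dir="export/Servo"):
    """Save a Keras model under export/Servo/<10-digit version>/ and return the path."""
    import tensorflow as tf  # lazy import; assumes TensorFlow 2.x

    # Zero-pad the version to 10 digits, e.g. 1 -> "0000000001"
    export_path = os.path.join(base_dir, f"{version:010d}")
    tf.saved_model.save(model, export_path)  # no Session required
    return export_path


def package_model(archive_path="model.tar.gz", top_dir="export"):
    """Create a gzipped tarball whose member paths begin with export/Servo/..."""
    with tarfile.open(archive_path, "w:gz") as archive:
        # arcname pins the path inside the archive regardless of where top_dir lives
        archive.add(top_dir, arcname="export")
    return archive_path
```

For example, export_saved_model(tf.keras.applications.ResNet50(weights="imagenet")) followed by package_model() should leave a model.tar.gz in the working directory with the SavedModel files under export/Servo/0000000001/.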

Incidentally, in many of my original trials, I was using the incorrect path /output/Servo rather than /export/Servo. As such, this could explain the many failures to actually find the model's data. Oh well; through all this trial and error, I think I found the superior way anyway.
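The upload-then-deploy flow described above can be sketched like this, assuming SageMaker Python SDK 1.x, a model.tar.gz packaged with the export/Servo layout, and a notebook running inside SageMaker so that get_execution_role() resolves. The function name, the key prefix, the framework version, and the instance type are assumptions for illustration; pick whatever fits your setup.

```python
def deploy_saved_model(archive_path="model.tar.gz", instance_type="ml.c5.xlarge"):
    """Upload the archive through a SageMaker session, then deploy it (SDK 1.x names)."""
    import sagemaker
    from sagemaker.tensorflow.serving import Model  # SDK 1.x import path

    session = sagemaker.Session()
    role = sagemaker.get_execution_role()  # resolves inside a SageMaker notebook

    # upload_data puts the file in the session's default bucket and returns its S3 URI,
    # so there is no need to create or configure a bucket by hand
    model_data = session.upload_data(path=archive_path, key_prefix="model")

    model = Model(
        model_data=model_data,
        role=role,
        framework_version="2.1.0",  # assumption: match your TensorFlow version
        sagemaker_session=session,
    )
    # deploy() creates the endpoint and returns a Predictor bound to it
    return model.deploy(initial_instance_count=1, instance_type=instance_type)
```

Keep in mind that calling this creates a real endpoint that bills until you delete it.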

SageMaker Library Version

Earlier, I hinted at all the confusion around different versions of libraries, and how you may build something unexpected. As it turns out, I thought I was using SageMaker Python SDK 2.x when in reality I was using 1.x. This became apparent when I looked up the distinction between the Model and TensorFlowModel classes and realized that TensorFlowModel is the name of the class in SageMaker 2.x. Another hint was that the syntax for instantiating the Predictor fell in line with SageMaker SDK v1.x.

You would think that creating brand new Jupyter notebooks inside SageMaker with TensorFlow 2 and Python 3 would come with the latest version of the SageMaker SDK, but apparently not. It's up to you to update SageMaker using pip. Once you do, you can unlock the 2.x versions of all these commands. You may find some of the changes regressive, though, forcing you back into doing annoying things. For instance, TensorFlowModel may require entry_point as an input argument, which is very scantily documented across the Internet. Then again, TensorFlowModel could have referred to different code back in SageMaker SDK 1.x, so it's possible I was just running into a way more annoying version of it. One thing that definitely seems to have changed (and would have required me to write a bit more code) is the instantiation of serializer and deserializer objects. With SageMaker Python SDK 1.x, the default ser/des classes are JSON-based. They change in SDK 2.x, so you would have to specify JSON ser/des classes if you wish to continue using them.
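If you do move to SDK 2.x, specifying the JSON ser/des classes yourself might look roughly like the sketch below. I have not run the notebook under 2.x, so treat this as an assumption-laden illustration; the function name is hypothetical, and the endpoint name is whatever your deployment produced.

```python
def make_json_predictor(endpoint_name):
    """Attach to an existing endpoint with explicit JSON (de)serialization (SDK 2.x)."""
    from sagemaker.predictor import Predictor
    from sagemaker.serializers import JSONSerializer
    from sagemaker.deserializers import JSONDeserializer

    return Predictor(
        endpoint_name=endpoint_name,
        serializer=JSONSerializer(),      # request bodies are sent as JSON
        deserializer=JSONDeserializer(),  # responses are parsed from JSON
    )
```

With that in hand, something like make_json_predictor("my-endpoint").predict({"instances": batch.tolist()}) should behave much like the JSON-by-default Predictor from SDK 1.x.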

Eyes Glazed Over Yet?

I've used a great many eye drops during the course of this research, but hopefully you found it helpful in sorting out your SageMaker journey. Leave me a comment if you found anything particularly helpful, if you ran into yet more trouble, or when this blog post becomes out of date and it has overstayed its welcome on the Internet -- because unlike writing C++ code for the Amiga vs. PC, where you might run into hard copies of compilers for such old systems, an SDK published by a cloud company is likely to vanish without a trace, leaving only a tangled web of old unmaintained legacy code in its wake.

GOSHtastic - Game shows, Options, Software, & Hardware!