Creating a TensorFlow model creation API with Lambda and APIGEE
- Derek Ferguson
- Aug 5, 2018
- 3 min read
I recently assembled a presentation on using Java with TensorFlow. In general, TensorFlow has a strong preference for Python as its driving language, but by jumping through the hoops I explain in my presentation, one is able to do just about everything from Java.
However, there is one glaring gap: the definition of an initial model and optimization method. So, to address this, I have started the process of exposing some APIs via Lambda and APIGEE to serve up some models matching passed-in parameters. Along the way, I encountered several "gotchas" that I'd like to share here.
First, the full TensorFlow system is entirely too large to fit into a single Lambda deployment package. The main approach to resolving this is outlined in this blog post. I will say that, whether because of changes since that article's publication or because of my exact use case, I had to make a few adjustments. Specifically, I found it necessary to run my script locally and delete entire directories from TensorFlow - restoring them whenever things broke - until I got below the 50 MB limit. The "exclude" patterns only helped with file extensions, which bought some additional headroom within the directories that remained.
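For what it's worth, here is a minimal sketch of the trial-and-error pruning I mean, run against a local staging folder before zipping. The directory names are only examples of things you might try removing, not a definitive list - test your function after each deletion and restore anything that breaks.

```python
import os
import shutil

# Hypothetical staging folder containing the unpacked deployment package.
PACKAGE_ROOT = "package"

# Candidate TensorFlow subdirectories to try stripping out. These names
# are illustrative; verify your function still imports and runs after
# each removal, and restore anything that breaks.
CANDIDATES = [
    "tensorflow/contrib",
    "tensorflow/examples",
    "tensorflow/include",
]

for rel_path in CANDIDATES:
    full_path = os.path.join(PACKAGE_ROOT, rel_path)
    if os.path.isdir(full_path):
        shutil.rmtree(full_path)
        print("removed", full_path)
```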
TensorFlow's "reset_default_graph" method becomes profoundly important when you make the move to TensorFlow. In almost all cases, you'll find your TF graph kept in global memory in Python. In Lambda, this translates to state that is kept around for a period of time between method executions. So, in order to make sure you don't get flawed results (in my case, a model that grows in size with every run), you have to be sure to run "reset_default_graph" at the start of each method execution.
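Here is a minimal sketch of what that looks like in the handler, assuming TensorFlow 1.x (current as of this writing); the tiny placeholder model is purely illustrative.

```python
import tensorflow as tf

def lambda_handler(event, context):
    # Lambda can reuse the same Python process across invocations, so any
    # nodes added to the default graph on a previous run are still there.
    # Clearing the graph first keeps the model from growing with every call.
    tf.reset_default_graph()

    # Build whatever model the passed-in parameters describe; this tiny
    # linear model is just a stand-in.
    x = tf.placeholder(tf.float32, shape=[None, 1], name="x")
    w = tf.Variable(tf.zeros([1, 1]), name="w")
    y = tf.matmul(x, w, name="y")

    return {"statusCode": 200, "body": "graph rebuilt from a clean slate"}
```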
The AWS API Gateway is a monumental pain. This isn't specific to TensorFlow - it's something I've fought with for a long time. To me, it is actually better - both more powerful and easier to use - to hook up APIGEE as your gateway, even though it comes from a different vendor. There is an excellent article here on the best way to do that. However, I would advise skipping to the comments from Kurt Kanaskie at the bottom when you are ready to implement this. The overall approach of putting your credentials in a KVM and creating a small Node.js wrapper to do the actual invocation makes a lot of sense. The only reason to follow the comments instead of the main article is that the AWS API has changed a bit since the original article was written, so Kurt's comments give the updated pattern.
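To make the shape of that wrapper concrete: the article's version is Node.js, but the equivalent invocation looks like the Python sketch below, using boto3. Every name here - the function name, region, and credential placeholders - is a hypothetical stand-in for whatever you keep in your KVM, not something taken from the original article.

```python
import json

import boto3

# Hypothetical stand-ins for the values the gateway proxy reads out of
# its KVM; none of these names come from the original article.
AWS_ACCESS_KEY_ID = "***access-key***"
AWS_SECRET_ACCESS_KEY = "***secret-key***"
REGION = "us-east-1"
FUNCTION_NAME = "create-tf-model"

client = boto3.client(
    "lambda",
    region_name=REGION,
    aws_access_key_id=AWS_ACCESS_KEY_ID,
    aws_secret_access_key=AWS_SECRET_ACCESS_KEY,
)

# Forward the model parameters from the incoming API call to Lambda.
response = client.invoke(
    FunctionName=FUNCTION_NAME,
    Payload=json.dumps({"layers": 2, "optimizer": "adam"}),
)
print(response["Payload"].read().decode("utf-8"))
```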
Working with the APIGEE gateway is *much* easier if you start with a sample project - so you get the right folder structure - and then upload your changes using the "apigeetool" utility. I put the source code for my project here and the command I run to promote it (with my username and password cleared) is...
apigeetool deployproxy -u ***username*** -p ***password*** derekferguson-6c9ab059-eval -e test -n hello -d .
In order to get your Python code running properly on Lambda, you need to bundle up all the dependent libraries along with it. I found the instructions here most helpful in figuring out how to do that. In a nutshell, you start from a clean location on your file system, use pip3 to install all the dependent libraries into that folder (pip3's "-t" flag installs into a target directory), and then zip them up along with your own source for deployment to Lambda. For example, the command I use is...
zip -r9 ../MyZip.zip . --exclude "*.DS_Store" "*.pyc"
Note that the exclusions above come from the BBC iPlayer article on TF under Lambda mentioned earlier - they are part of slimming the package down to fit under the 50 MB limit.