Creating an ML-based Python API hosted on Heroku

For a long time I’ve wanted to build an API that fellow developers could use over the cloud. But I didn’t want it to be the traditional Hello World API, or a simple SQL-backed Flask API supporting the classic GET, PUT, POST and DELETE REST requests on user names and email IDs. Since AI and ML are so pervasive now, I thought of giving ML a try, and it is easy : )

Image Classification using PyTorch

Since everything runs on the cloud, and on free tiers at that, I needed an AI/ML project with small, lightweight dependencies. I forked a project named Img2vec, which uses PyTorch and pretrained models to generate feature vectors for a dataset of images, and then matches a test image against the others using cosine similarity (from sklearn). The README provides enough snippets to get the demo working on your system.
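To make the matching step concrete, here is a minimal sketch of ranking stored vectors by cosine similarity. It is written in plain Python as a stand-in for sklearn's cosine similarity, and the names `cosine_similarity`, `rank_matches` and `top_k` are my own illustrations, not Img2vec's API.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction, so treat it as "no similarity".
        return 0.0
    return dot / (norm_a * norm_b)


def rank_matches(query, stored, top_k=5):
    """Return up to top_k (score, name) pairs, best match first.

    `stored` is a hypothetical {image_name: vector} dictionary.
    """
    scored = [(cosine_similarity(query, vec), name) for name, vec in stored.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

Real feature vectors from a pretrained model have hundreds of dimensions, but the ranking logic stays the same: a score near 1 means the two images point in almost the same direction in feature space.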

Training the image data set

For the sake of simplicity, I’m skipping the data set training part; there are a lot of docs on the internet. For our API, I generate the image vectors ahead of time and store them as CSV files. Since the computation happens on the cloud, generating every vector at runtime would always make the API time out. At request time, the API generates a vector only for the image you send and reads the other vectors from the CSV files.
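As a rough illustration of that precompute-then-read approach, the sketch below writes vectors to a CSV file and loads them back. The row layout (image name first, then the vector values) and the function names are assumptions of mine; the actual project may lay out its files differently.

```python
import csv


def save_vectors(path, vectors):
    """Write one row per image: the image name followed by its vector values."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        for name, vec in vectors.items():
            writer.writerow([name, *vec])


def load_vectors(path):
    """Read the rows back into a {name: [float, ...]} dictionary."""
    vectors = {}
    with open(path, newline="") as f:
        for row in csv.reader(f):
            if row:  # skip blank lines
                vectors[row[0]] = [float(value) for value in row[1:]]
    return vectors
```

The expensive step (running images through the model) happens once, offline, via something like `save_vectors`; the deployed API only pays for `load_vectors` plus one vector computation per request.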

Setting up the Flask app

For a simple image match, I needed a POST or PUT request. Many frameworks support REST features; I used Flask on Python 3.6. As the official site reads:

Flask is a lightweight WSGI web application framework. It is designed to make getting started quick and easy

So here’s the idea:

Here’s the gist of the API:

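from flask import Flask, jsonify, request

app = Flask(__name__)

# Endpoint to find images similar to the uploaded one
@app.route("/image_clustering", methods=["PUT"])
def image_clustering():
    # Request body: the image as a Base64-encoded byte stream
    payload = request.get_data()
    results = search(payload)  # search() lives in the project source; returns a list of image matches
    return jsonify(results)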

Note: The test image is not sent as a raw byte stream; it’s a Base64-encoded image byte stream, a standard way to ship binary data across networks.
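For anyone calling the endpoint, here is a minimal client-side sketch of the Base64 step using only the standard library. The helper names, the URL and the file name are placeholders of mine, and I'm assuming the server decodes the body with a plain Base64 decode; check the source code for the exact handling.

```python
import base64
import urllib.request


def encode_image(raw_bytes):
    """Base64-encode raw image bytes so they travel safely as text."""
    return base64.b64encode(raw_bytes)


def build_request(url, raw_bytes):
    """Build (but do not send) a PUT request carrying the encoded image."""
    return urllib.request.Request(url, data=encode_image(raw_bytes), method="PUT")


def decode_image(payload):
    """Server side (assumed): recover the original image bytes from the body."""
    return base64.b64decode(payload)


# Hypothetical usage; the URL and file name are placeholders.
# with open("cat.jpg", "rb") as f:
#     req = build_request("https://example.herokuapp.com/image_clustering", f.read())
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode())
```

Base64 grows the payload by roughly a third, which is worth keeping in mind when you send large images to a free-tier server.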

Note: Get the source code here.

Setting up the Heroku server

Setting up the server is pretty straightforward, but there are a few issues I encountered:

Demo Time

You can try the demo in several ways

For the sample cat image shown, the results are:

[
  [
    0.8155140106062471, 
    "https://www.googleapis.com/download/storage/v1/b/python-clustering-api.appspot.com/o/images%2FFace%2F124.jpg?generation=1522585329188523&alt=media"
  ], 
  [
    0.8145577585207011, 
    "https://www.googleapis.com/download/storage/v1/b/python-clustering-api.appspot.com/o/images%2FFace%2F242.jpg?generation=1522585299229997&alt=media"
  ], 
  [
    0.7914929138145477, 
    "https://www.googleapis.com/download/storage/v1/b/python-clustering-api.appspot.com/o/images%2FFace%2F212.jpg?generation=1522584727478100&alt=media"
  ], 
  [
    0.7806927914191767, 
    "https://www.googleapis.com/download/storage/v1/b/python-clustering-api.appspot.com/o/images%2FFace%2F099.jpg?generation=1522585251917855&alt=media"
  ], 
  [
    0.6948463995381056, 
    "https://www.googleapis.com/download/storage/v1/b/python-clustering-api.appspot.com/o/images%2FFace%2F119.jpg?generation=1522584693369035&alt=media"
  ]
]

If the article helped you, please comment below. Feel free to ask if you ran into any problems while reading the article.