In this blog post I will be taking a lap around IBM Watson's Vision API. IBM Watson is a set of services provided under the SaaS (Software as a Service) model, readily available for you to consume in your applications. All the services provided under IBM Watson can be seen here: https://www.ibm.com/watson/products-services/.

About Vision API:

The IBM Watson Vision API service uses deep learning algorithms to analyze images for scenes, objects, faces, and other content. When we use the Vision API service, the response from the API includes keywords that provide information about the content.

Using the IBM Watson Vision API, you can find meaning in visual content. The Vision API lets you analyze images and detect the scenes, objects, faces, and other content within them. Out of the box, Watson provides a set of default visual recognition models to start with. These default models are built-in classes that provide accurate results without any training. You can also create your own custom visual classifier and let Watson use that to classify your images. Some of the things you can do with the Vision API are finding similar images in a collection and analyzing the visual content of images or video frames to understand what is happening in a scene.

Features of Vision API:

  • General Classification
    Generate class keywords that describe the image. Use your own images, or extract relevant image URLs from publicly accessible webpages for analysis.
  • Face Detection
    Given an image, you can detect the human faces in it and get a general indication of the age range and gender of each face.
  • Visual Training
    Create custom, unique visual classifiers. Use the service to recognize custom visual concepts that are not available with general classification.
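
To give a flavour of visual training, below is a minimal sketch of how a custom classifier might be created with the watson-developer-cloud Node SDK (the API key setup is covered later in the post). The classifier name, the zip file names and the exact parameter names (createClassifier, the *_positive_examples and negative_examples fields) are assumptions for illustration; check the SDK documentation for the version you install.

var watson = require("watson-developer-cloud");
var fs = require("fs");

var visualRecognition = new watson.VisualRecognitionV3({
  api_key: "<Your API Key>",
  version_date: "2016-05-20"
});

// Hypothetical example: train a "dogs" classifier from zip files of sample
// images. The class name and file names are placeholders, not from this post.
visualRecognition.createClassifier({
  name: "dogs",
  dogs_positive_examples: fs.createReadStream("./dogs.zip"),
  negative_examples: fs.createReadStream("./not-dogs.zip")
}, (err, classifier) => {
  if (err) {
    console.log("Error:", err);
  } else {
    console.log(JSON.stringify(classifier, null, 2));
  }
});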

Use cases for using Vision API:

Some of the use cases for Vision API are as follows:

  • Manufacturing
    Use images from a manufacturing setting to make sure products are being positioned correctly on an assembly line
  • Visual Auditing
    Look for visual compliance or deterioration in a fleet of trucks, planes, or windmills out in the field, and train custom classifiers to understand what defects look like
  • Insurance
    Rapidly process claims by using images to classify claims into different categories.
  • Social listening
    Use images from your product line or your logo to track buzz about your company on social media
  • Social commerce
    Use an image of a plated dish to find out which restaurant serves it and find reviews, use a travel photo to find vacation suggestions based on similar experiences
  • Retail
    Take a photo of a favorite outfit to find stores with those clothes in stock or on sale, use a travel image to find retail suggestions in that area
  • Education
    Create image-based applications to educate about taxonomies

The above use cases are taken from the Vision API documentation.

Getting Started with Watson Vision API:

Before we can use the Watson Vision API, we first need to create an instance of the service on Bluemix. Bluemix provides resources for your applications through a service instance. To use the Vision API, we will need to create a "Watson Visual Recognition" service instance. For the rest of the blog post, I assume that you already have a Bluemix account. If not, do head over to IBM Bluemix and create a new account. Let's see the steps to create a Visual Recognition service below:

  1. Log in to your Bluemix account and click Catalog.
  2. In the left menu, under Platform section click Watson.
  3. Select Visual Recognition from the options available.
  4. Next, we will be presented with the Create screen. Change the service and credential names if you want to, or accept the default values. Make sure you have selected the free pricing plan and click Create.
  5. After the service is created, you will be presented with the Manage screen. Select the tab named Service Credentials. Then select View Credentials. We will need the API key to work with the service.
  6. Copy the API Key.

Image Classification & Face Detection:

In this blog post, I will be making use of the Watson Vision API's out-of-the-box image classification and face detection capabilities. We will use a Node JS environment to perform the following tasks:

  1. Classify an image using a pre-trained classifier and perform general classification
  2. Detect faces in an image

Node JS Application & Watson Developer Cloud:

For this blog post, I will be using Node JS as the environment. If you are following along with me, make sure that you have Node JS installed on your machine. The rest of the post assumes that you have Node JS installed. Create a new Node JS application by following the steps below:

  1. Create a directory for the application. Open a command prompt and navigate to the newly created folder.
  2. Run the command npm init on the command line. Accept the default values or provide your own custom values. Once npm finishes the task, we will have a package.json file in the folder.
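
For reference, the package.json produced by npm init will look roughly like the minimal sketch below; the name and version are simply whatever values you accepted during the prompts.

{
  "name": "watson-vision-demo",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "author": "",
  "license": "ISC"
}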

Now that we have a Node app created, it's time to install the Watson Developer Cloud Node JS SDK into our project. From the command line, execute the following command:

npm install --save watson-developer-cloud

When the above command finishes, our project is ready to work with the Watson Vision API. Next, let's see how to classify a sample image for generic information.

Classifying Images:

In order to classify images and to retrieve general information about them, we will use the Visual Recognition API. Follow the steps below to get things ready for image classification:

  1. Add a new file to the root of the project. Name it index.js
  2. Reference the watson-developer-cloud library using require.
  3. Create a parameters object which contains the URL of an image and the API key we copied when creating the Watson Visual Recognition service. We will pass this object to the Watson API.
  4. Create an instance of the VisualRecognitionV3 object found in the watson-developer-cloud library. We need to provide the API key during instantiation.
  5. Once we have the VisualRecognitionV3 instance, use its classify() method to classify the image, passing in the parameters object we created earlier.
  6. Handle the response that comes back from Watson. The Vision API analyzes the image at the URL we passed, classifies the things it can find in it, and returns the result as a JSON response.

Below is the complete code for classifying an image using the Watson Visual Recognition API:

var watson = require("watson-developer-cloud");

// Parameters for the classify call: the API key copied from the Visual
// Recognition service credentials and the URL of the image to analyze.
var parameters = {
  apikey: "<Your API Key>",
  url: "https://cdn.pixabay.com/photo/2015/10/27/16/39/taj-mahal-1009262_960_720.jpg"
};

// Create the Visual Recognition client. version_date pins the API version.
var visualRecognition = new watson.VisualRecognitionV3({
  api_key: parameters.apikey,
  version_date: "2016-05-20"
});

// classify() analyzes the image at the given URL and returns the detected
// classes as a JSON response.
visualRecognition.classify(parameters, (err, response) => {
  if (err) {
    console.log("Error:", err);
  } else {
    console.log(JSON.stringify(response, null, 2));
  }
});

For this blog post I used the image URL shown in the code above as input.


The image is of the Taj Mahal, one of the wonders of the world.

From the command line, just execute the command "node index.js". This will run our code and print the JSON response from Watson.

Here is the JSON response I got back from Watson for this image:

[Screenshot: JSON response from Watson]

The Watson Vision API has been able to identify the image as that of the Taj Mahal. We are using the default classifier available with Watson, and it is spot on in identifying the image. Now all that is left for you is to make use of the data returned by Watson in your application logic.
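
To actually use the data in application logic, you need to dig the classes out of the response. The snippet below is a minimal sketch, assuming the v3 response shape of images -> classifiers -> classes; verify it against the JSON you actually receive.

// Minimal sketch: extract class names and scores from a classify() response.
// Assumes the v3 response shape (images -> classifiers -> classes).
function printClasses(response) {
  response.images[0].classifiers.forEach((classifier) => {
    classifier.classes.forEach((item) => {
      console.log(item.class + " (score: " + item.score + ")");
    });
  });
}

You could call printClasses(response) in the success branch of the classify() callback instead of just dumping the whole JSON.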

What's nice is the simplicity of using the API and the ease with which you can get up and running with the Watson Vision API. This was a simple example, but I hope it gave you an overview of the Vision API's image classification feature.
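
It is also worth mentioning that the SDK can classify a local image file instead of a URL. The sketch below assumes the classify() method accepts an images_file stream; the file name is a placeholder.

// Sketch: classify a local image file instead of a URL. The images_file
// parameter name is an assumption; the file name is a placeholder.
var fs = require("fs");

visualRecognition.classify({
  images_file: fs.createReadStream("./my-photo.jpg")
}, (err, response) => {
  if (err) {
    console.log("Error:", err);
  } else {
    console.log(JSON.stringify(response, null, 2));
  }
});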

Detecting Faces:

While we saw how the classify() method is used to classify images for generic information, let us now see what it takes to detect faces in any given image. The Visual Recognition API provides us with a method called detectFaces(). You can pass it any image containing human faces and it will detect the faces along with age, gender and, where possible, the names of celebrities. Let's see what code we need to write to detect faces. Follow the steps below:

  1. Add a new file called faces.js
  2. Reference the watson-developer-cloud library using require.
  3. Create a parameters object.
  4. Add apikey and url properties to the object, setting apikey to your Watson Visual Recognition service API key.
  5. For url, provide the URL of any image which contains human faces.
  6. Next, instantiate the VisualRecognitionV3 object found in the watson-developer-cloud library.
  7. Call the detectFaces() method to detect faces, passing in the parameters object.
  8. Handle the response from detectFaces(). The response is a JSON payload with information about the identified faces.

Let's see the code snippet for detecting faces:

var watson = require("watson-developer-cloud");

// Parameters for the detectFaces call: the API key copied from the service
// credentials and the URL of an image containing human faces.
var parameters = {
  apikey: "<YOUR API KEY>",
  url: "https://upload.wikimedia.org/wikipedia/commons/a/a1/Rahul_Dravid.jpg"
};

// Create the Visual Recognition client. version_date pins the API version.
var visualRecognition = new watson.VisualRecognitionV3({
  api_key: parameters.apikey,
  version_date: "2016-05-20"
});

// detectFaces() locates human faces in the image and returns age, gender,
// face location and, where possible, the identity of each face as JSON.
visualRecognition.detectFaces(parameters, (err, response) => {
  if (err) {
    console.log("Error:", err);
  } else {
    console.log(JSON.stringify(response, null, 2));
  }
});

For this exercise of detecting faces, I used an image of one of the greatest cricketing legends India has produced, Rahul Dravid, a.k.a. The Wall. You can find the image here: https://upload.wikimedia.org/wikipedia/commons/a/a1/Rahul_Dravid.jpg

And when I execute the code from the command line, below is the JSON response from the Watson Vision API:

[Screenshot: JSON response from Watson]

The response from the Watson Vision API provides information related to the age, gender, face location and identity of each face. In my case it was spot on in recognizing Rahul Dravid. Well, that's all it takes to detect human faces in any given image using the Watson Vision API.
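
If you want to consume these details in code rather than just printing the raw JSON, below is a minimal sketch for reading them out. It assumes the v3 response shape of images -> faces with age, gender, face_location and identity fields; verify it against the response you actually get.

// Minimal sketch: read face details from a detectFaces() response.
// Assumes the v3 response shape (images -> faces).
function printFaces(response) {
  response.images[0].faces.forEach((face) => {
    console.log("Age range:", face.age.min, "-", face.age.max);
    console.log("Gender:", face.gender.gender);
    console.log("Face location:", JSON.stringify(face.face_location));
    if (face.identity) {
      console.log("Identity:", face.identity.name);
    }
  });
}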

Summary:

In this blog post, I took a lap around the IBM Bluemix Watson Visual Recognition/Vision API. As we saw in the sections above, it is very easy to get started with the Watson APIs: you just create a service on Bluemix, get your API key, and you are all set. The Watson Vision API provides the classify() method, which can be used to get general information from any given image, and the detectFaces() method, which can be used to get information such as age, gender, face location and identity for any human face in a given image. If you have followed along with the blog post, you will agree with me that the whole process of creating the service on Bluemix, installing the Watson SDK for the Node environment and consuming the service was effortless.

I hope this blog post gave you an overview of what the Watson Vision API can do out of the box. Do give it a try and let me know what you are building.

Till next time – Happy Coding. Code with Passion, Decode with Patience