7 Best Image Recognition APIs

At Facebook’s annual developer conference in 2016, Mark Zuckerberg laid out details of the company’s quest to build AI that recognizes images better than people do. Such image processing algorithms could be used for everything from narrating images for the visually impaired to avoiding car accidents to automated image tagging. These are just a few of the near-infinite applications of image processing APIs, which fall under the umbrella term computer vision.

Below we delve into some of the best image recognition APIs out there, covering a wide range of different applications and features.

Image recognition APIs are part of the larger computer vision ecosystem. Computer vision covers everything from facial recognition to semantic segmentation, which labels each pixel of an image according to the kind of object it belongs to.

Working with a large volume of images ceases to be productive, or even possible, without some sort of image recognition in place. Certain tasks, like detecting similar images or identifying landmarks, are next to impossible without advanced AI tools.

For example, consider GrubHub’s use of image recognition APIs to automate the process of adding images to its platform. The simple task of posting images of food to an app is surprisingly fraught. GrubHub developers have expressed a need for image recognition APIs for everything from detecting explicit content to finding similar images.

To keep the scope of this article manageable, we’ll focus on image processing APIs, as there are a lot of computer vision tools out there. Some of these APIs can also be used for other computer vision applications, so they’re still worth a look if you’re developing a different kind of computer vision tool.

1. Cloud Vision API

Google’s Cloud Vision API is about as close to a plug-and-play image recognition API as you can get. It comes pre-configured for the most common image recognition tasks, like object recognition and explicit content detection.

The Cloud Vision API can also draw on Google’s extensive data and machine learning libraries. That makes it well suited to detecting landmarks and identifying objects in images, which are some of its most common uses.

It can also return image information in a variety of forms, including image descriptions, identified entities, and matching images. It can even identify the predominant color of an image.

The Cloud Vision API’s most exciting feature is its optical character recognition (OCR). The API can detect printed and handwritten text in an image, PDF, or TIFF file. You can use it to generate documentation straight from graphics and handwritten notes. This alone makes it worthy of investigation.
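
To make those features a little more concrete, here is a minimal Python sketch of how a request body for the Cloud Vision REST method images:annotate might be assembled. The feature type names follow Google’s REST reference as we understand it; the helper name build_annotate_request, the short keys in FEATURES, and the placeholder image bytes are our own assumptions for illustration. Nothing is sent over the network here.

```python
import base64
import json

# Feature type names as used by the Cloud Vision REST method images:annotate.
# The short keys on the left are hypothetical labels chosen for this sketch.
FEATURES = {
    "labels": "LABEL_DETECTION",
    "landmarks": "LANDMARK_DETECTION",
    "explicit": "SAFE_SEARCH_DETECTION",
    "text": "TEXT_DETECTION",
    "dense_text": "DOCUMENT_TEXT_DETECTION",
    "colors": "IMAGE_PROPERTIES",
    "web": "WEB_DETECTION",
}


def build_annotate_request(image_bytes, wanted, max_results=5):
    """Build the JSON body for one image and the requested feature kinds."""
    features = [{"type": FEATURES[name], "maxResults": max_results} for name in wanted]
    return {
        "requests": [
            {
                # Image bytes are sent base64-encoded in the "content" field.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": features,
            }
        ]
    }


# Placeholder bytes stand in for the contents of a real image file.
body = build_annotate_request(b"fake-image-bytes", ["labels", "text"])
print(json.dumps(body, indent=2))
```

Posting a body like this to the images:annotate endpoint with valid credentials should return one set of annotations per requested feature. Check the current reference before relying on any particular field.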

The only real downside to Google’s Cloud Vision API is that it’s a bit expensive. Prepare to pay if you’re going to use it extensively.

Google Cloud Vision API correctly identifies a cassette tape, listing most probable web entities. Try the demo here.

2. Amazon Rekognition

Amazon’s Rekognition API is another nearly plug-and-play API. It also handles common image recognition tasks like object recognition and explicit content detection. It has some additional features, however, that make it useful for video processing. Its Celebrity Recognition feature also makes it useful for apps or websites that display pop culture content.

The Capture Movement feature is one of Rekognition’s standout features. It tracks an object’s movement through a frame. Although it is mostly useful for video processing, it’s worth having in your API toolkit.

The Detect Text In Image feature is also worthy of mention and likely to be more useful for static image processing. The Rekognition API analyzes images for text, assessing everything from license plate numbers to street names to product names.
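
As a rough illustration of what working with detected text can look like, the Python sketch below filters a DetectText-style response down to whole lines above a confidence threshold. The detect_text call and the TextDetections, DetectedText, Type, and Confidence fields reflect the boto3 Rekognition client as we understand it; the helper names, the 80% threshold, and the sample response are our own assumptions, and the sample is hand-written rather than real API output.

```python
def extract_lines(response, min_confidence=80.0):
    """Return LINE-level text strings at or above min_confidence."""
    return [
        d["DetectedText"]
        for d in response.get("TextDetections", [])
        if d.get("Type") == "LINE" and d.get("Confidence", 0.0) >= min_confidence
    ]


def detect_text_in_file(path):
    """Call Rekognition on a local image; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so extract_lines works without boto3 installed

    client = boto3.client("rekognition")
    with open(path, "rb") as f:
        return client.detect_text(Image={"Bytes": f.read()})


# Hand-written sample shaped like a DetectText response, for illustration only.
sample = {
    "TextDetections": [
        {"DetectedText": "MAIN ST", "Type": "LINE", "Confidence": 98.2},
        {"DetectedText": "MAIN", "Type": "WORD", "Confidence": 98.9},
        {"DetectedText": "ST", "Type": "WORD", "Confidence": 97.5},
        {"DetectedText": "7ABC123", "Type": "LINE", "Confidence": 61.0},
    ]
}
print(extract_lines(sample))
```

Keeping the parsing separate from the network call makes it easy to tune the threshold against saved responses before spending API calls on it.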

Rekognition has a number of payment levels. It does offer a free tier, which makes it noteworthy. Each month for the first year, Rekognition users can analyze up to 1,000 minutes of video and 5,000 images, and store up to 1,000 faces.

Amazon Rekognition’s pricing also varies by region. If you’re going to use more than their free service, you can request a quote via the pricing page.

Amazon Rekognition being used to detect text within images.

3. IBM Watson Visual Recognition

IBM’s Watson Visual Recognition API combines an image recognition API with the power of machine learning. Users can build, train, and test custom machine learning models, either inside or outside of Watson Studio.

It comes with several pre-trained object detection models. These include the General Model, which provides a classification for thousands of predefined objects. The Explicit Model detects inappropriate content. The Food Model recognizes food objects in images. The Text Model recognizes text, similar to Amazon Rekognition.

IBM Watson recognizes some elements of the banana, as seen in this JSON response. Try the demo here.

4. Microsoft Image Processing API

Microsoft Azure Cloud offers a number of tools as part of its Cognitive Services. It’s nearly a one-stop shop for any kind of computer vision processing you might need.

Microsoft Azure Cloud’s Computer Vision API offers many of the same image recognition tools as the other APIs on our list. It also offers some innovative features of its own that earn it a place among the best image recognition APIs. Image Properties Description can assess the dominant hue of an image and whether or not it’s black-and-white. Image Content Description and Categorization describes an image in a complete sentence and also categorizes its content.
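
Here is a hedged Python sketch of how an analysis result with those properties might be summarized. The field names (description.captions, categories, color.dominantColors, and a black-and-white flag) are based on our reading of the Computer Vision analyze response and should be verified against the current documentation. The function name summarize_analysis and the sample data are assumptions made up for this example.

```python
def summarize_analysis(result):
    """Pull the best caption, top category, and color info out of a result."""
    captions = result.get("description", {}).get("captions", [])
    best_caption = max(captions, key=lambda c: c["confidence"], default=None)
    categories = result.get("categories", [])
    top_category = max(categories, key=lambda c: c["score"], default=None)
    color = result.get("color", {})
    # Look the flag up case-insensitively instead of assuming exact capitalization.
    is_bw = next((v for k, v in color.items() if k.lower() == "isbwimg"), None)
    return {
        "caption": best_caption["text"] if best_caption else None,
        "category": top_category["name"] if top_category else None,
        "dominant_colors": color.get("dominantColors", []),
        "black_and_white": is_bw,
    }


# Hand-written sample shaped like an analysis response, for illustration only.
sample = {
    "categories": [
        {"name": "abstract_", "score": 0.2},
        {"name": "others_", "score": 0.7},
    ],
    "description": {
        "tags": ["headphones", "indoor"],
        "captions": [{"text": "a pair of headphones on a table", "confidence": 0.93}],
    },
    "color": {
        "dominantColorForeground": "Black",
        "dominantColorBackground": "White",
        "dominantColors": ["Black", "White"],
        "isBWImg": False,
    },
}
summary = summarize_analysis(sample)
print(summary)
```

The helper returns None or an empty list for anything missing, so it can be used even when only some visual features were requested.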

Microsoft Azure Cloud’s image recognition API is priced according to the region as well as by the number of transactions.

Microsoft Azure Image Processing API correctly identifies “headphones” with a 93% degree of confidence. Try the demo here.

5. Clarifai

Clarifai is another image recognition API that takes advantage of machine learning. Clarifai features 14 pre-built computer vision models for analyzing visual data. It’s also simple to use: upload your media and Clarifai returns predictions based on the model you’re running.
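
That upload-and-predict flow can be sketched in Python roughly as follows. The request and response shapes (an inputs list carrying an image URL, and outputs containing concepts with a name and a value) follow Clarifai’s v2 REST API as we understand it; the helper names, the example URL, and the sample response are hypothetical and exist only for this illustration. No request is actually sent.

```python
import json


def build_predict_body(image_url):
    """Build the JSON body asking a model to predict on one image URL."""
    return {"inputs": [{"data": {"image": {"url": image_url}}}]}


def top_concepts(response, limit=3):
    """Return the highest-scoring (name, value) pairs from the first output."""
    concepts = response["outputs"][0]["data"].get("concepts", [])
    ranked = sorted(concepts, key=lambda c: c["value"], reverse=True)
    return [(c["name"], c["value"]) for c in ranked[:limit]]


# Hypothetical image URL; any publicly reachable image would do.
print(json.dumps(build_predict_body("https://example.com/cactus.jpg")))

# Hand-written sample shaped like a prediction response, for illustration only.
sample_response = {
    "outputs": [
        {
            "data": {
                "concepts": [
                    {"name": "plant", "value": 0.91},
                    {"name": "cactus", "value": 0.97},
                    {"name": "desert", "value": 0.85},
                    {"name": "flower", "value": 0.40},
                ]
            }
        }
    ]
}
print(top_concepts(sample_response, limit=2))
```

Which concepts come back depends entirely on the model you run, so the same helper works whether you point it at a general, food, or fashion model.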

Clarifai has a number of noteworthy features. Its fashion identification system is one of the most in-depth out there: the Fashion model can identify thousands of fashion items and accessories. It also features an extensive food algorithm that can analyze over 1,000 food items down to the ingredient level.

Clarifai is also capable of most of the basic computer vision functions mentioned on our list. It can detect explicit content, identify celebrities, and recognize faces. Clarifai can also determine the dominant color of an image.

What working with the Clarifai API looks like in curl.

6. Imagga

Companies using visual recognition and processing APIs often deal in huge volumes of visual media. Imagga API is an automated image tagging and categorization API to help you deal with that quantity of media.

Imagga is categorized as a Digital Asset Management API. It features an asset library, allowing for asset categorization and metadata management. Finding assets in the library is simple thanks to a Search/Filter function.

It also allows for reporting and analytics. It’s comparable to other digital asset management APIs like Box, Airtable, or Canto Digital Asset Management. Imagga’s the new digital asset management API on the block, though, making it more affordable than a number of the other options out there.

Imagga identifies a cactus… sort of. Try the Imagga auto-tagging demo here.

7. Filestack Processing API

If you’re processing large numbers of photos, Filestack Processing API is a good tool to have in your toolkit.

Filestack Processing API can be used to store, compress, and convert files. It can also integrate automatically with file-sharing platforms like Google Drive, Dropbox, and Facebook. In addition, it performs many of the same tasks as the other image processing APIs on our list, such as detecting inappropriate content and recognizing characters.

Filestack Processing has a few other distinctive features that are worth noting. It can be used to tag videos and detect copyrighted images. It can also be used to resize, crop, compress, or rotate images.
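
Image transformations like these are typically expressed as a chain of tasks in a URL. The Python sketch below builds such a URL, assuming the task=key:value path-segment convention and the cdn.filestackcontent.com host described in Filestack’s processing documentation as we understand it. The function name build_transform_url and the file handle are hypothetical placeholders.

```python
def build_transform_url(handle, tasks):
    """Chain processing tasks into a Filestack-style transformation URL."""
    parts = []
    for name, params in tasks:
        if params:
            # Parameters are written as key:value pairs separated by commas.
            args = ",".join(f"{key}:{value}" for key, value in params.items())
            parts.append(f"{name}={args}")
        else:
            # Tasks without parameters appear as a bare name.
            parts.append(name)
    return "/".join(["https://cdn.filestackcontent.com", *parts, handle])


# HYPOTHETICAL_HANDLE stands in for the handle of an uploaded file.
url = build_transform_url(
    "HYPOTHETICAL_HANDLE",
    [("resize", {"width": 300, "height": 300}), ("rotate", {"deg": 90})],
)
print(url)
```

Because the whole pipeline lives in the URL, a transformed image can be referenced directly from a page or another service without extra client code.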

Filestack Processing API is 96% sure this is a cactus, and we have to agree. Try the Filter Content demo here.

Image Recognition APIs: Final Thoughts

As you can see, there are a lot of different image recognition APIs to choose from. A number of them perform many of the same basic image recognition functions. Each one has its own unique capabilities as well, though.

To help you decide which image recognition API is right for you, here’s a short synopsis of the features of the APIs we’ve covered in this article.

  • For an extensive library of pre-configured recognition models, and quality handwriting recognition, consider Google Cloud Vision API.
  • For image recognition with celebrity recognition or movement capture, consider Amazon Rekognition.
  • For powerful machine learning from IBM Watson, and a dedicated Food Model, consider IBM Watson Visual Recognition.
  • For similar features plus dominant hue and human-readable content description and categorization, consider Microsoft Image Processing API.
  • For image recognition that includes fashion and food identification, consider Clarifai.
  • For a more affordable API that focuses on a large quantity of media and digital asset management, and NSFW filters, consider Imagga.
  • For OCR & NSFW filtering, plus additional file management features like social upload and image transformation, consider Filestack Processing API.

Considering how visual humans are, and how much visual data we’re surrounded by on any given day, it’s safe to say that image recognition APIs aren’t going anywhere anytime soon. It’s technology’s job to make our jobs more efficient, not to create an endless array of new tasks that fill our days with busywork.

Image recognition APIs automate a lot of the tasks around working with visual data and media, so we can focus on building our apps, developing our businesses, and finding outstanding visual content without becoming glorified file clerks.

API and Features

Cloud Vision API
  1. Object Recognition
  2. Explicit Content Detection
  3. Landmark Detection
  4. Return Image Descriptions
  5. Entity Identification
  6. Image Matching
  7. OCR

Amazon Rekognition
  1. Object Recognition
  2. Explicit Content Detection
  3. Celebrity Recognition
  4. Motion Capture
  5. Detect Text In Image

IBM Watson Visual Recognition
  1. Compatible with Machine Learning
  2. Several Pre-loaded Object Identification Machine Learning Models

Microsoft Image Processing API
  1. Face Detection
  2. Landmark Detection
  3. Celebrity Detection
  4. Text Recognition
  5. Information Extraction From Documents
  6. Image Properties Description
  7. Image Content Description and Categorization

Imagga
  1. Automated Image Tagging
  2. Automated Image Categorization
  3. Creates Analytics

Filestack Processing API
  1. Stores Files
  2. Compresses Files
  3. Converts Files
  4. Integrates With File Sharing Services
  5. Explicit Content Detection
  6. Video Tagging
  7. Image Editing

Clarifai
  1. Automated Image Tagging
  2. Face Detection
  3. Celebrity Detection
  4. Demographic Analysis
  5. Moderation

J Simpson

About J. Simpson

J. Simpson lives at the crossroads of logic and creativity. He writes and researches tech-related topics extensively for a wide variety of publications, including Forbes Finds. He is also a graphic designer, journalist, and academic writer, writing on the ways that technology is shaping our society while using the most cutting-edge tools and techniques to aid his path. He lives in Portland, Or.