Several data annotation tools are available on the market. In this article, we focus specifically on image labeling tools, which support computer vision: the technology that enables computers to understand digital images and videos. Since the image labeling tool landscape is wide and varied, we will compare six image labeling tools on six different dimensions.
Why do we need data labeling tools?
Data labeling is an important step in creating machine learning models for computer vision applications. High-quality labels are an important driver of an accurate computer vision model, since the labels are used to teach your model the features you want it to recognize.
Supervised learning is a subcategory of machine learning in which algorithms are designed to classify data or predict outcomes. For the algorithm to recognize the outcome on its own, it first has to be trained on labeled data. The trained algorithm can then be used to predict the outcome for unlabeled data. The process of labeling data for training purposes is called data annotation.
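As a minimal sketch of this train-then-predict loop, the toy example below fits a nearest-centroid classifier on a handful of labeled points and then predicts the label of an unlabeled one. The feature vectors, the label names and the helper functions `train` and `predict` are made up for illustration. A real computer vision model would learn from annotated images rather than from two-number feature vectors.

```python
from collections import defaultdict
from math import dist


def train(labeled_data):
    """Compute one centroid (mean feature vector) per label."""
    grouped = defaultdict(list)
    for features, label in labeled_data:
        grouped[label].append(features)
    return {
        label: tuple(sum(axis) / len(points) for axis in zip(*points))
        for label, points in grouped.items()
    }


def predict(centroids, features):
    """Return the label whose centroid is closest to the given features."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical annotated examples: (feature vector, label assigned by a human).
labeled_data = [
    ((1.0, 1.2), "cat"),
    ((0.8, 1.0), "cat"),
    ((4.0, 3.8), "dog"),
    ((4.2, 4.1), "dog"),
]

centroids = train(labeled_data)
prediction = predict(centroids, (3.9, 4.0))  # an unlabeled instance
print(prediction)
```

The quality of `labeled_data` directly determines the centroids, which is why wrong or inconsistent labels lead to wrong predictions later on.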
Six dimensions to score image annotation tools on
You might think an image annotation tool only serves to add labels to an image, but many tools on the market offer an extensive list of functionalities that can support your labeling process end-to-end.
We have grouped these functionalities into dimensions, which we will later use to compare a selection of tools:
- Supported data types
The type of data you want to label affects which annotation tools you can use. Some tools only offer labeling options for image and video data. Others also support text or even time series. This is important to keep in mind if your requirements for a data annotation tool could expand in the future.
- Supported annotation methods
For each data type, several different annotation methods are available. Since we focus on image data, annotation methods like image classification, object detection and image segmentation are the most relevant.
- Data quality control
Managing quality control can be an essential part of your labeling process. A tool that offers a process for reaching a labeling consensus, or that allows for benchmarking, can have a significant impact on the quality of the labels. Some tools also allow already annotated tasks to be reviewed.
- Monitoring of data labeling tasks
Some tools offer monitoring of the labeling tasks, with an overview of the tasks, their progress, their quality and other KPIs.
- Integration with cloud storage services
Checking how well an annotation tool integrates with cloud services can be important, since you will most likely need to import data from, and store it in, cloud storage.
- Integration with other services
Depending on how and what you want to automate in the labeling process, integration with other services can be an important feature. A tool can have an API or SDK available or even have a direct integration in a machine learning environment.
- AI-assisted labeling
Human labelers can be assisted by models that provide automated annotations such that they only need to review the predictions. This can reduce labeling time significantly.
- Integration of an external workforce
Some data labeling services offer the integration of external labelers that can do the labeling for you. This can be a useful feature if your company does not have the labeling resources itself.
Prioritization of labeling tasks
- Ability to prioritize tasks
Some tools allow you to create new datasets for labeling that consist of data instances similar to the ones your model previously could not correctly predict. Based on model diagnostics, an iterative process could be designed to improve your training data. This will enable your model to train on very specific instances and improve its accuracy.
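To make the prioritization idea more concrete, the sketch below ranks unlabeled images by how similar they are to images the model previously got wrong, and picks the most similar ones as the next labeling batch. It assumes every image is represented by an embedding vector and uses cosine similarity as the similarity measure. Both are assumptions made for illustration, as are the function names and the data, and they do not describe how any particular tool implements this feature.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def prioritize(unlabeled, failures, top_k):
    """Rank unlabeled items by similarity to the closest known model failure."""
    scored = [
        (max(cosine_similarity(vec, f) for f in failures), item_id)
        for item_id, vec in unlabeled.items()
    ]
    scored.sort(reverse=True)  # most similar first
    return [item_id for _, item_id in scored[:top_k]]


# Hypothetical embeddings of images the model previously predicted incorrectly.
failures = [(0.9, 0.1, 0.0), (0.8, 0.2, 0.1)]

# Hypothetical embeddings of images that have not been labeled yet.
unlabeled = {
    "img_001": (0.85, 0.15, 0.05),
    "img_002": (0.0, 0.1, 0.9),
    "img_003": (0.7, 0.3, 0.1),
    "img_004": (0.1, 0.9, 0.2),
}

next_batch = prioritize(unlabeled, failures, top_k=2)
print(next_batch)
```

Repeating this after every training round gives the iterative process described above: diagnose the model, label the most relevant new instances, retrain.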
Comparison of image annotation tools on the market
Based on the identified features in the section above, we will make a comparison of a selection of tools that can be used for image annotation. We selected image annotation tools based on their popularity, availability of a web-based interface and ability to connect to the cloud.
The following tools are included in the selection:
- Open-source annotation tools:
  - LabelStudio: open-source tool that supports a wide variety of data types and annotation methods for these data types.
  - VoTT: open-source tool that focuses specifically on image annotation and is developed by Microsoft.
  - CVAT: free and open-source tool that focuses on image and video annotation and is developed by Intel.
- Commercial annotation tools:
  - LabelBox: online platform with a user-friendly and intuitive interface that focuses on an iterative and integrated labeling process. Pricing: 3 tiers (free, pro and enterprise).
  - SuperAnnotate: online platform with a focus on high-quality annotations and quality management. Pricing: 2 tiers (pro and enterprise).
  - Azure ML: Azure cloud service from Microsoft that helps manage machine learning projects. It now also has a data labeling component that can be used on its own, although an Azure subscription is required. Pricing: low but not transparent; AI-assisted labeling needs a cluster and can therefore incur additional costs.
An extensive comparison of the different tools on each feature category can be found in the schema below.
As we can conclude from the overview, the commercial tools offer the most complete set of functionalities. LabelBox appears to be best-in-class, performing well on all feature categories. The main advantage of this tool is the ability to analyse your model results and adapt the labeling tasks based on them.
SuperAnnotate also scores well on all dimensions and focuses on high-quality annotations. However, it does not allow for prioritization of annotation tasks.
Azure ML is a feasible alternative for companies that already have an Azure subscription, do not have advanced requirements for labeling options and are looking for a fast and low-cost labeling process.
An alternative for companies that do not yet have an Azure subscription, or that have more advanced labeling requirements, would be LabelStudio. LabelStudio is a good open-source alternative to its commercial counterparts, but not all of its features are available in the free version. The tool offers a wide variety of labeling options and has an API available.
VoTT and CVAT both focus on annotating image and video data. They offer advanced annotation functionality, but fall short in other areas such as project management, quality control and integration with other (cloud) services.
Which image annotation tool to choose for your project?
We recommend that a company looking for an image annotation tool first clearly and critically list its requirements (and possible future requirements) for such a tool. Some features may be nice to have, but assessing whether they are truly necessary and worth the cost is an important step.
The most all-round tool, providing everything one could wish for in image annotation, is LabelBox. As expected, this comes at a price. If the benefit of the additional features LabelBox offers does not outweigh the cost, companies can choose one of the alternatives depending on their needs.
Lastly, we would like to note that this article only compares a small selection of annotation tools. There are many more tools out there that could be worth considering for your organization. Scoring each of them on the dimensions we listed in this article can help you find the best fit for your project.
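Such a scoring exercise can be as simple as a weighted sum. The sketch below is one possible way to set it up: the weights, the 1-to-5 scores and the placeholder names "Tool A" and "Tool B" are all hypothetical and do not reflect any of the tools discussed in this article. You would replace them with your own requirements and your own assessment.

```python
# Hypothetical weights: how much each dimension matters for your project (sum to 1).
weights = {
    "annotation_methods": 0.3,
    "quality_control": 0.3,
    "cloud_integration": 0.2,
    "ai_assisted_labeling": 0.2,
}

# Hypothetical scores from 1 (poor) to 5 (excellent) for two placeholder tools.
scores = {
    "Tool A": {
        "annotation_methods": 5,
        "quality_control": 2,
        "cloud_integration": 3,
        "ai_assisted_labeling": 4,
    },
    "Tool B": {
        "annotation_methods": 3,
        "quality_control": 5,
        "cloud_integration": 4,
        "ai_assisted_labeling": 3,
    },
}


def weighted_score(tool_scores, weights):
    """Sum of each dimension's score multiplied by its weight."""
    return sum(weights[dim] * tool_scores[dim] for dim in weights)


# Tools ordered from best to worst fit under the chosen weights.
ranking = sorted(
    scores,
    key=lambda tool: weighted_score(scores[tool], weights),
    reverse=True,
)

for tool in ranking:
    print(tool, round(weighted_score(scores[tool], weights), 2))
```

Changing the weights changes the ranking, which is the point: the best tool depends on which dimensions matter most for your project.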