Maximize the Human Capital in Your AI


Imagine you are taking a picture of a cat in your neighborhood, and your camera has an AI-based filter that you use to detect animals. Unfortunately, it identifies the cat as a dog. So you try again on another cat, this time focusing more closely. Now the filter does not make the mistake and correctly identifies the cat as a cat. The natural question is why the same detection algorithm failed in the first case but passed with flying colors in the second. The answer is not straightforward and often involves considerable complexity. However, such misclassifications often arise from mislabeled data or from a non-uniform distribution of the data.

Current AI technologies rely on powerful, robust deep learning models that perform well when they are fed large amounts of data. One of the key factors in these models is therefore the quality of the input data used for training, validation, and testing. The model is trained, tuned, and evaluated on this labeled data. Labeling determines the quality and correctness of the data fed to the model, which inevitably introduces a human element into the AI development pipeline. Humans not only control the accuracy and precision with which the input data is labeled but also make the pragmatic decisions needed to solve the problems that stem from mislabeling. Taking this into account, it is quite natural to feel that AI technologies are more human than automated, is it not?


To overcome the problem of mislabeling, human intervention is needed to refine the existing labels. Some aspects of the solution are as follows:

  • Error analysis: If we detect that misclassifications are associated with the existing dataset, the first step is a rigorous error analysis. This involves determining the overall error rate of the deep learning model and then determining how much of that error comes from misclassifications caused by mislabeled data. This shows which samples need relabeling, so that correcting them can contribute to lower error rates when the model is run again.

  • Manual labeling: After error analysis, we can correct the labels ourselves by manually going through the samples in the dataset. An important rule here is to consider both the samples whose labels are incorrect and the samples the model predicted incorrectly.

  • Automated labeling: Manual labeling can be tedious and time-consuming. Moreover, it is an inefficient use of the available data scientists if they have to take part in labeling work. A more modern and sophisticated approach is to change the labels automatically according to a specified criterion. Some companies also have dedicated in-house labeling staff, outsource the labeling work, or offer it as gig work.
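
The error analysis described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a definitive implementation: the function name `analyze_errors` and the record fields `given_label` (the label stored in the dataset), `predicted` (the model output), and `reviewed_label` (the label a human reviewer assigns during the analysis) are hypothetical names chosen for this example.

```python
def analyze_errors(records):
    """Summarize how much of the model's error traces back to mislabeled data.

    Each record is a dict with three hypothetical fields:
      given_label    - the label currently stored in the dataset
      predicted      - the label predicted by the model
      reviewed_label - the label a human reviewer assigned during error analysis
    """
    total = len(records)
    if total == 0:
        raise ValueError("records must not be empty")

    # Samples the model got "wrong" according to the stored labels.
    misclassified = [
        i for i, r in enumerate(records) if r["predicted"] != r["given_label"]
    ]

    # Of those, the samples where the reviewer disagrees with the stored label,
    # i.e. the stored label itself is the problem.
    mislabeled = [
        i for i in misclassified
        if records[i]["given_label"] != records[i]["reviewed_label"]
    ]

    return {
        "error_rate": len(misclassified) / total,
        "error_from_mislabeling": len(mislabeled) / total,
        "relabel_indices": mislabeled,
    }


if __name__ == "__main__":
    sample = [
        {"given_label": "cat", "predicted": "cat", "reviewed_label": "cat"},
        {"given_label": "dog", "predicted": "cat", "reviewed_label": "cat"},
        {"given_label": "cat", "predicted": "dog", "reviewed_label": "cat"},
        {"given_label": "dog", "predicted": "dog", "reviewed_label": "dog"},
    ]
    print(analyze_errors(sample))
```

If the share of error coming from mislabeling is small compared with the overall error rate, relabeling is unlikely to be the most valuable next step; if it is large, the indices returned under `relabel_indices` give the reviewers a place to start.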

The other problem is the non-uniform distribution of the validation and test sets. More precisely, when the training set and the validation (dev)/test sets of a deep learning model come from different distributions, the distributions are said to be non-uniform. When this happens, the model is unable to generalize well to the dev and test sets, because they come from a different distribution, and so it makes a good number of misclassifications. As with mislabeling, human intervention is indispensable in tackling non-uniform distributions. This is another area in which the human element in the AI pipeline calls the shots.


The following human interventions can be made to tackle the problem of non-uniform distribution and reduce the error rate of the models.

  • Training on different distributions: If the available dev/test data comes from a distribution different from that of the training set, we can mix a portion of the dev/test-distribution data into the training set. The remainder of the dev/test data is used for validation and evaluation. This enables the model to learn from both distributions and hence perform better on the target dataset.

  • Bias-variance trade-off: A good, robust deep learning model requires a well-tuned optimization algorithm with a good bias-variance trade-off. When we mix datasets from different distributions, we can set aside part of the newly formed training set as a train-dev set, which the model is not trained on. We then run the model again to determine the training error, dev error, and test error, along with the additional train-dev error. The training error itself, compared with the best achievable error, reflects the bias problem, while the difference between the training error and the train-dev error reflects the variance problem, since both sets come from the same distribution. The difference between the dev error and the test error indicates how much the model has been overfitted to the dev set. A good balance between bias and variance paves the way for an effective deep learning model.

  • Data mismatch: The difference between the train-dev and dev errors reflects what is known as data mismatch. This is unique to the case of non-uniform distributions. When the data mismatch is high, the model generalizes well to unseen data from the training distribution but not to data from the dev/test distribution. This is not a bias or variance problem in the usual sense, so further tuning alone is unlikely to fix it; the training data needs to be made more representative of the dev/test distribution.
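
The splitting and error comparison described above can be sketched as follows. This is a minimal sketch under stated assumptions rather than a definitive recipe: the function names `make_splits` and `error_gaps`, the parameter names, the default fractions, and the equal dev/test split are all hypothetical choices made for this example. The error values passed to `error_gaps` are assumed to be fractions measured separately on each split.

```python
import random


def make_splits(source_data, target_data,
                target_train_fraction=0.5, train_dev_fraction=0.1, seed=0):
    """Mix part of the target (dev/test) distribution into the training data.

    source_data: samples from the original training distribution
    target_data: samples from the distribution the model must perform on
    """
    rng = random.Random(seed)

    target = list(target_data)
    rng.shuffle(target)
    n_target_train = int(len(target) * target_train_fraction)

    # The training pool contains both distributions.
    train_pool = list(source_data) + target[:n_target_train]
    rng.shuffle(train_pool)

    # Hold out a slice of the pool as the train-dev set (never trained on).
    n_train_dev = int(len(train_pool) * train_dev_fraction)
    train_dev = train_pool[:n_train_dev]
    train = train_pool[n_train_dev:]

    # The remaining target data is used only for validation and evaluation.
    remainder = target[n_target_train:]
    half = len(remainder) // 2
    return {
        "train": train,
        "train_dev": train_dev,
        "dev": remainder[:half],
        "test": remainder[half:],
    }


def error_gaps(train_error, train_dev_error, dev_error, test_error):
    """Return the gaps between consecutive error measurements."""
    return {
        # Same distribution, unseen samples: commonly read as a variance signal.
        "train_vs_train_dev": train_dev_error - train_error,
        # Different distribution: the data mismatch described above.
        "data_mismatch": dev_error - train_dev_error,
        # Same distribution as dev: a large gap suggests overfitting to dev.
        "dev_vs_test": test_error - dev_error,
    }


if __name__ == "__main__":
    splits = make_splits(range(100), range(1000, 1020))
    print({name: len(part) for name, part in splits.items()})
    print(error_gaps(0.01, 0.015, 0.10, 0.11))
```

Keeping the dev and test sets purely from the target distribution is the design choice that matters here: the model is then evaluated on the data it is actually meant to handle, while the train-dev slice makes it possible to separate data mismatch from ordinary generalization error.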

AI can reshape human lives and our outlook on the world of technology. Its main impact lies in the fact that it can replace, reduce, and remove human effort to achieve maximum levels of automation. However, quality problems in datasets, and their solutions, require human intervention. Consequently, there is more human effort in the AI technology pipeline than is generally apparent. Hence, AI is more human than we let on, and the human element is still the key to improving the performance of AI technology in many cases.

Abelling is a data enrichment service provider. We understand the impact of quality input for a better future with AI. Tell us about your project; we would love to help out in any way possible. We provide consultancy and fully managed services for all approaches to labeling.






