Intro to Image Annotations

What is image annotation? In the context of machine learning, image annotation means labeling the images you want to train on so that a model can learn from them. In the simplest case, the annotation is just a label, also called a class. Here is a simple example:

Taco or Cowboy Hat?

If you want to train an image classification model to recognize a taco vs. a cowboy_hat in an image, you could label the images like so:

data/
    cowboy_hat/
        0.png
        1.png
        ...
    taco/
        0.png
        1.png
        ...

In this simple example, an engineer writes a script that associates each image with the name of the directory that contains it. Those directory names become the labels used to train the model, which can then recognize whether an image contains a taco or a cowboy_hat. This is a very basic example, and there are many variations on how image annotations are created and consumed.
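As a rough illustration of the kind of script described above, here is a minimal Python sketch. It assumes the data/ layout shown earlier and that every image is a .png file. The function names collect_labeled_images and build_class_index are made up for this post and do not come from any particular library.

```python
from pathlib import Path


def collect_labeled_images(data_dir):
    """Return (image_path, label) pairs, using each subdirectory name as the label.

    Assumption: data_dir contains one subdirectory per class, each holding .png files.
    """
    samples = []
    for class_dir in sorted(Path(data_dir).iterdir()):
        if not class_dir.is_dir():
            continue  # skip stray files sitting next to the class directories
        for image_path in sorted(class_dir.glob("*.png")):
            samples.append((image_path, class_dir.name))
    return samples


def build_class_index(samples):
    """Map each label name to an integer id, since training code usually expects numbers."""
    labels = sorted({label for _, label in samples})
    return {label: index for index, label in enumerate(labels)}
```

Calling collect_labeled_images("data") on the layout above would pair every image under taco/ with the label taco and every image under cowboy_hat/ with the label cowboy_hat. Actually loading the pixels and training a model is left out on purpose, because that part depends on the framework you choose.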

When it comes to image annotation, there aren't any strict rules about how data is saved and organized, but there are common techniques that can help make it much easier for you to consume the data for training models later. We will dive deep into these techniques in future posts.

But Why Though?

Now that you know what image annotation is, why is it useful? Or better yet, when specifically is it useful? There are a lot of computer vision techniques that don't require any image annotation. Annotations are mainly useful when you train machine learning models using supervised learning.

What is supervised learning? It means training a model on examples that already have the correct answer attached. Anytime you need to tell a model what a specific thing is, like a taco or a cowboy hat, you are supervising the machine learning process.
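To make "supervising" a bit more concrete, here is a toy Python sketch. Everything in it is an assumption made for illustration: the two-number feature lists are invented stand-ins for real images, and the nearest-average approach is just one very simple way to learn from labeled examples, not the method a real image classifier would use. The point is only that each training example arrives with its label.

```python
def train(examples):
    """Average the feature lists of each label to get one centroid per class."""
    sums, counts = {}, {}
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def predict(centroids, features):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def distance(label):
        return sum((c - f) ** 2 for c, f in zip(centroids[label], features))
    return min(centroids, key=distance)


# Invented toy data: each example is (features, label). The label is the annotation.
labeled_examples = [
    ([0.9, 0.1], "taco"),
    ([0.8, 0.2], "taco"),
    ([0.1, 0.9], "cowboy_hat"),
    ([0.2, 0.8], "cowboy_hat"),
]
model = train(labeled_examples)
```

After training, predict(model, some_features) returns whichever label the new example sits closest to. Without the labels in labeled_examples there would be nothing for train to group by, which is exactly why annotations are needed.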

Conclusion

To wrap it up, image annotation is when you label images so that they can be used to train computer vision models with supervised learning.

If there are a lot of terms you don't recognize above, that's OK. This is meant to be a very high-level introduction to image annotation, and deeper dives and clearer definitions are coming in future posts.

The way we organize our data often depends on the type of model we intend to train. In my next post, I will enumerate the most popular machine learning models in computer vision that require supervised learning.