Food Image Dataset

A food image dataset with labeled ingredient information

TAGS
Bounding Box
Image
Food

This project collected and processed food photos with the aim of developing an AI that can automatically generate a variety of images for a given object name. The data consists of food images, each labeled with multiple attributes. By learning the hierarchical structure of these labels, a model is expected to be able to edit an image by attribute without requiring a source image to be selected for that attribute.

About

Collecting images of food from around the world

Datumo provides high-quality training data for smarter artificial intelligence. This dataset was built free of charge as part of the “AI Dataset Sponsorship Program” organized by Datumo, in collaboration with Computer Vision Lab.

Computer Vision Lab is a research team equipped with artificial intelligence-based technologies for recognizing the 3D structure of objects and generating images. They participated in SelectStar's "AI Dataset Sponsorship Program" for research in 2021, and their paper was accepted and presented at the international conference CVPR.

The research team has not only re-implemented existing image generation technologies but also introduced a new open-source platform called StudioGAN. The platform addresses a recurring problem with existing image generation techniques: the performance reported in papers was difficult to reproduce. By providing an integrated image generation platform with a modular design, the team proposed a practical benchmark for image generation. StudioGAN is publicly available on GitHub and has received 2.3k GitHub stars.

Testimonial

"Despite the difficulties in collecting food pictures from various countries, it was fascinating that through multi-attribute labeling, we could potentially develop an AI that understands food at the ingredient level."

Datumo / PM Park Ye-jin

Dataset specification

  • Image resolution of 1024 × 1024 pixels or higher

  • 100 food classes, 1000 photos per class

  • Hierarchically structured ingredient labeling data (json file)

Data Collection and Processing Method

  • Collection: crowd-sourcing platform 'Cash Mission' and web crawling

  • Processing: Cash Mission and in-house operators

Data Collection

Datumo's crowd-sourcing platform, Cash Mission (Web), was used to collect and process part of the data.

Sample Data

{
    "instance_num": 2,
    "country": "hawaii",
    "food_class": "poke",
    "ingredients": [
        {
            "subtype": "sea_products",
            "ingredient": "seaweed"
        },
        {
            "subtype": "vegetable",
            "ingredient": "other_green_leaf_vegetable"
        },
        {
            "subtype": "bean_wasabi",
            "ingredient": "corn_kernel"
        },
        {
            "subtype": "grain_nuts",
            "ingredient": "rice(steamed_or_fried)"
        }
    ]
}
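
The record above can be read with any JSON parser. The following is a minimal Python sketch, not the dataset's official tooling, that loads the sample and groups ingredients under their subtype to expose the two-level hierarchy. The function name `group_by_subtype` is hypothetical, and the sketch assumes every record has the same keys as the sample. The meaning of `instance_num` is not documented here, so it is left untouched.

```python
import json
from collections import defaultdict

# Record copied verbatim from the Sample Data section.
SAMPLE = """
{
    "instance_num": 2,
    "country": "hawaii",
    "food_class": "poke",
    "ingredients": [
        {"subtype": "sea_products", "ingredient": "seaweed"},
        {"subtype": "vegetable", "ingredient": "other_green_leaf_vegetable"},
        {"subtype": "bean_wasabi", "ingredient": "corn_kernel"},
        {"subtype": "grain_nuts", "ingredient": "rice(steamed_or_fried)"}
    ]
}
"""


def group_by_subtype(record):
    """Map each subtype to the list of ingredients labeled under it.

    Hypothetical helper: assumes each entry in "ingredients" carries
    both a "subtype" and an "ingredient" key, as in the sample.
    """
    grouped = defaultdict(list)
    for item in record["ingredients"]:
        grouped[item["subtype"]].append(item["ingredient"])
    return dict(grouped)


record = json.loads(SAMPLE)
hierarchy = group_by_subtype(record)

# Print the food class, then each subtype with its ingredients.
print(record["country"], record["food_class"])
for subtype, ingredients in hierarchy.items():
    print(f"  {subtype}: {', '.join(ingredients)}")
```

Grouping by `subtype` keeps the label hierarchy explicit (food class, then subtype, then ingredient), which is the structure a model would need in order to learn attributes at the ingredient level.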

Applications

  • Development of an AI system that automatically generates diverse images based on given object names.
  • Development of an AI system that understands objects at the ingredient level based on visual data.
  • Development of an AI system that generates images of new types of objects not present in the classification system.

CC BY-SA

Reusers are allowed to distribute, remix, adapt, and build upon the material in any medium or format, even commercially, so long as attribution is given to the creator. If you remix, adapt, or build upon the material, you must license the modified material under identical terms.

https://creativecommons.org/licenses/by-sa/3.0/deed.en
