Real-world Dataset

A real-world dataset for object material recognition

TAGS
Bounding Box
Image
Object
Real-world Dataset for Object Material Recognition

"This project involved filming a variety of everyday objects through a full 360 degrees, then drawing polygon regions that distinguish each material from 50 different viewpoints.

Through this project, we have constructed a dataset that visualizes the distinct materials that make up an object. It is expected that this will enable more accurate implementation of the material when creating 3D object models based on videos."

About

Research Bridging Design and Machine Learning

Datumo provides high-quality training data for smarter artificial intelligence. This dataset was built for free as part of the “AI Dataset Sponsorship Program” organized by Datumo, in collaboration with RebuilderAI.

RebuilderAI provides AI-based 3D automation solutions and a 3D virtual space creation platform. Their mission is to allow anyone to easily create reality-based 3D content and to give people realistic virtual experiences online. Their technologies include VRIN, an AI model that creates objects and spaces in 3D in real time based on video data.

Testimonial

"Throughout the project, as a Project Manager, I constantly asked myself, 'How can we provide high-quality data to our clients within the given constraints?' The answer, I believe, always lies with the crowd workers, in other words, the users of Cash Mission (Datumo’s crowdsourcing platform). In that sense, most of our energy in this project was spent on the question, 'How can we enable our users to produce high-quality data more easily?'

Instead of focusing solely on reducing costs directly, we took the user's perspective and looked for ways to lower the difficulty of the tasks and to improve convenience and satisfaction with the tasking process. As a result, I felt we were able to produce higher-quality data within the given constraints. This proved to be a meaningful experience that set a direction for future projects."

Datumo / PM LEE Jong-ho

Dataset specification

  • Source video data: 1,000 360-degree videos, each containing one object

  • Source image data: 44,500 images extracted from 890 of the source videos (50 images per object)

  • Label (polygon) data for the source image data

  • 44,500 mask files for the entire object

  • 42,078 mask files for the opaque parts of the object

  • 19,893 mask files for the transparent parts of the object

  • 11,188 mask files for the empty parts of the object
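The figures above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the counts are copied from the specification, while the variable and function names are hypothetical, and the percentages assume that each mask file corresponds to exactly one source image, which the specification does not state explicitly.

```python
# Counts copied from the dataset specification.
LABELED_VIDEOS = 890        # source videos that images were extracted from
IMAGES_PER_OBJECT = 50      # images (viewpoints) per object

MASK_COUNTS = {
    "entire object": 44_500,
    "opaque parts": 42_078,
    "transparent parts": 19_893,
    "empty parts": 11_188,
}


def expected_image_count(videos: int, per_object: int) -> int:
    """Number of images implied by the video count and images per object."""
    return videos * per_object


def mask_share(count: int, total_images: int) -> float:
    """Fraction of images that have a mask of a given type.

    Assumption: one mask file per image per mask type.
    """
    return count / total_images


total_images = expected_image_count(LABELED_VIDEOS, IMAGES_PER_OBJECT)

for name, count in MASK_COUNTS.items():
    share = mask_share(count, total_images)
    print(f"{name}: {count} masks ({share:.1%} of {total_images} images)")
```

Under that assumption, every image has a whole-object mask, while fewer than half of the images contain a transparent part.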

Data Collection and Processing Method

All data collection and processing for this project was done through Datumo's crowdsourcing platform, Cash Mission. In the collection phase, we were able to gather the necessary raw data quickly thanks to Cash Mission's wide user pool. During the processing phase, we maximized the accuracy and consistency of the dataset by selecting skilled workers as reviewers with an algorithm developed in-house and by conducting a full inspection of the data.

In addition to these technical advantages, we were able to produce high-quality data efficiently through systematic project management and an accuracy management process built on experience from previous projects.

Data Collection

In this project, Datumo's crowdsourcing platform, Cash Mission (Web), was used for the collection and processing of part of the data.

A guide written by the specialist guide team on 'Cash Mission (Web)' to help crowd workers understand their missions

Sample Data

Source image data

Full object mask files

Mask files for opaque parts of objects

Mask files for transparent parts of objects

Mask files for background portions within the object areas
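For readers who want to use the part masks together, one possible approach is to merge them into a single per-pixel label map. The following is a minimal sketch, not the dataset's own tooling: the file format and pixel encoding of the mask files are not described here, so the code assumes each mask has already been loaded as a 2D grid of 0/1 values, and the label IDs and function name are hypothetical.

```python
from typing import List

# Assumption: a mask is a 2D grid where 1 marks pixels inside the region.
Mask = List[List[int]]

# Hypothetical label IDs chosen for this illustration.
BACKGROUND, OPAQUE, TRANSPARENT, EMPTY = 0, 1, 2, 3


def combine_masks(opaque: Mask, transparent: Mask, empty: Mask) -> Mask:
    """Merge three binary part masks into one label map.

    If regions overlap, the mask applied later wins
    (order: opaque, transparent, empty).
    """
    height, width = len(opaque), len(opaque[0])
    labels = [[BACKGROUND] * width for _ in range(height)]
    for part_id, mask in ((OPAQUE, opaque), (TRANSPARENT, transparent), (EMPTY, empty)):
        for y in range(height):
            for x in range(width):
                if mask[y][x]:
                    labels[y][x] = part_id
    return labels


# Toy 2x3 example, not real data from the dataset.
opaque = [[1, 1, 0], [0, 0, 0]]
transparent = [[0, 0, 1], [0, 0, 0]]
empty = [[0, 0, 0], [0, 1, 0]]

label_map = combine_masks(opaque, transparent, empty)
print(label_map)  # [[1, 1, 2], [0, 3, 0]]
```

A single label map like this is convenient as a training target for segmentation models, whereas keeping the masks separate preserves any overlap between regions.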

Applications

Enhancement of material implementation in the creation of 3D models based on real-world imagery

CC BY-SA

Reusers are allowed to distribute, remix, adapt, and build upon the material in any medium or format, even commercially, so long as attribution is given to the creator. If you remix, adapt, or build upon the material, you must license the modified material under identical terms.

https://creativecommons.org/licenses/by-sa/3.0/deed.en
