Datasets
date: 2024-12-05
title: Datasets for Robotics
The Datasets section compiles various open-source datasets and guides for creating custom datasets, providing essential resources for robotics research and applications. These datasets cover a wide range of domains, such as object detection, traffic modeling, and robotic perception, enabling researchers and developers to train, validate, and deploy machine learning models effectively.
This section is actively maintained and welcomes contributions to expand the dataset collection and usage guides.
Key Subsections and Highlights
- Open Source Datasets: A comprehensive guide to open-source datasets across diverse applications, such as general-purpose datasets (e.g., ImageNet, COCO) and domain-specific datasets (e.g., Lego Bricks, Thermal Cheetahs). The subsection also provides a tutorial on creating custom datasets using tools like Yolo_mark or Innotescus.
- Traffic Modelling Datasets: A curated list of datasets specifically designed for traffic modeling, including data captured from UAVs, traffic cameras, and autonomous vehicles. Examples include Argoverse, NuScenes, and High-D, with detailed descriptions of their attributes like lane boundaries, pedestrian data, and traffic light information.
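For readers building a custom dataset, it can help to see what a YOLO-style annotation looks like once it is on disk. The sketch below assumes the common YOLO text label layout, in which each line holds `class_id x_center y_center width height` with coordinates normalized to the image size; this is the layout that tools such as Yolo_mark are generally used to produce, but check the output of your own tool before relying on it. The names `PixelBox` and `yolo_line_to_pixel_box` are hypothetical helpers introduced for illustration, not part of any tool's API.

```python
from typing import NamedTuple


class PixelBox(NamedTuple):
    """Axis-aligned box in pixel coordinates (hypothetical helper type)."""
    class_id: int
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def yolo_line_to_pixel_box(line: str, image_width: int, image_height: int) -> PixelBox:
    """Convert one YOLO-style label line to a pixel-space box.

    Assumes the line is 'class_id x_center y_center width height',
    with the four coordinates normalized to the range [0, 1].
    """
    fields = line.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}: {line!r}")

    class_id = int(fields[0])
    x_center, y_center, width, height = (float(v) for v in fields[1:])

    # Scale the normalized values back to pixels.
    cx = x_center * image_width
    cy = y_center * image_height
    half_w = width * image_width / 2.0
    half_h = height * image_height / 2.0

    return PixelBox(class_id, cx - half_w, cy - half_h, cx + half_w, cy + half_h)


if __name__ == "__main__":
    # A box centered in a 640x480 image, a quarter as wide and half as tall.
    print(yolo_line_to_pixel_box("0 0.5 0.5 0.25 0.5", 640, 480))
```

Converting labels back to pixel coordinates like this is a quick way to overlay annotations on images and spot labeling mistakes before training.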
Dataset Highlights
Open Source Datasets
- General Datasets: OpenImages, MS COCO, ImageNet, CIFAR-10
- Specific Applications: Chess Pieces, Fruit, Masks, Thermal Dogs and People
Traffic Modelling Datasets
- Argoverse: Includes 3D tracking and motion forecasting data with HD maps.
- Interaction: Contains lane merging and roundabout scenarios; no video data.
- High-D: Focuses on highways, with vehicle trajectories and HD map data.
- NuScenes: Provides detailed 3D annotations, global coordinates, and a variety of object types.
- Apollo: Features trajectories with object metadata like length, width, and heading.
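Trajectory datasets of this kind typically describe each object at each timestamp by a position plus metadata such as length, width, and heading. The following is a minimal sketch of how such a record might be represented and turned into a ground-plane footprint. The class name, field names, units, and the heading convention (radians, counter-clockwise from the +x axis) are assumptions made for illustration; they are not the schema of Apollo or any other dataset listed here, so consult each dataset's documentation for its actual format.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class TrackedObjectState:
    """One object at one timestamp (hypothetical schema, not a dataset's own)."""
    timestamp: float
    object_id: int
    x: float        # center position, meters (assumed)
    y: float
    length: float   # extent along the heading direction, meters (assumed)
    width: float    # extent across the heading direction, meters (assumed)
    heading: float  # radians, counter-clockwise from +x (assumed convention)

    def footprint(self) -> List[Tuple[float, float]]:
        """Return the four corners of the oriented bounding rectangle.

        Order: front-left, front-right, rear-right, rear-left.
        """
        cos_h, sin_h = math.cos(self.heading), math.sin(self.heading)
        half_l, half_w = self.length / 2.0, self.width / 2.0
        corners = []
        for dx, dy in ((half_l, half_w), (half_l, -half_w),
                       (-half_l, -half_w), (-half_l, half_w)):
            # Rotate the body-frame offset by the heading, then translate.
            corners.append((self.x + dx * cos_h - dy * sin_h,
                            self.y + dx * sin_h + dy * cos_h))
        return corners


if __name__ == "__main__":
    car = TrackedObjectState(timestamp=0.0, object_id=1, x=10.0, y=5.0,
                             length=4.0, width=2.0, heading=0.0)
    for corner in car.footprint():
        print(corner)
```

A footprint like this is useful for drawing objects on a map view or for simple overlap checks between vehicles.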
Development Needs
We aim to expand this section by:
- Adding more application-specific datasets (e.g., for healthcare robotics, marine robotics)
- Developing tutorials on dataset preprocessing and augmentation
- Enhancing visualization tools and API usage guides for existing datasets
- Including benchmark datasets for emerging fields like robot learning and human-robot interaction
If you have recommendations for datasets or wish to contribute tutorials and guides, please reach out or submit your content.
Summary
The Datasets section is a centralized hub for dataset resources in robotics. It supports diverse research areas by providing access to curated datasets, tools for creating custom datasets, and guides for effective usage. Contributions from the community will play a crucial role in enhancing this repository.