Preface
-
Object detection datasets -
Semantic segmentation datasets -
Lane detection datasets -
Optical flow datasets -
Panoramic datasets -
Localization and mapping datasets -
Driving behavior datasets -
Simulation datasets
Object detection datasets
-
Issued by: Waymo -
Download address: https://waymo.com/open/ -
Release time: perception dataset released in 2019, motion dataset released in 2021 -
Size: 1.82TB -
Introduction: The Waymo Open Dataset is the largest and most diverse autonomous driving dataset released to date. Compared with previous datasets, Waymo greatly improves on sensor quality and dataset size, with three times as many scenes as nuScenes -
Perception Dataset -
1950 autonomous driving segments, each containing 20 s of continuous driving footage -
Four annotation classes: vehicles, pedestrians, cyclists and traffic signs -
12.6 million 3D boxes, 11.8 million 2D boxes -
Sensors: 1 mid-range lidar, 4 short-range lidars, 5 cameras (see the frame-parsing sketch after this list) -
Collection covers downtown and suburban areas of Phoenix (Arizona), Kirkland (Washington), Mountain View and San Francisco (California), and spans varied driving conditions: day, night, dawn, dusk, rain and sun
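The perception data is distributed as TFRecord files of serialized Frame protos, one frame per timestamp bundling all sensors. Below is a minimal iteration sketch, assuming the official waymo-open-dataset pip package and TensorFlow are installed; the segment filename is a placeholder.

```python
# Minimal sketch: iterate the frames of one Waymo perception segment.
# Assumes `pip install waymo-open-dataset-tf-2-11-0` (package name varies by
# TF version) and a downloaded segment file; the path below is a placeholder.
import tensorflow as tf
from waymo_open_dataset import dataset_pb2

SEGMENT = "segment-XXXX_with_camera_labels.tfrecord"  # placeholder path

for raw in tf.data.TFRecordDataset(SEGMENT, compression_type=""):
    frame = dataset_pb2.Frame()
    frame.ParseFromString(raw.numpy())
    # One frame bundles all sensors at one timestamp:
    # frame.images (5 cameras), frame.lasers (5 lidars), frame.laser_labels (3D boxes)
    print(frame.context.name, len(frame.images), len(frame.laser_labels))
    break
```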
-
Motion Dataset -
574 hours of data and 103,354 map-paired segments -
Three annotation classes (vehicles, pedestrians, cyclists), each object labeled with a 2D box -
Behaviors and scenes mined for behavior-prediction research, including turns, merges, lane changes and intersections -
Locations: San Francisco, Phoenix, Mountain View, Los Angeles, Detroit and Seattle
-
-
PandaSet -
Publisher: Hesai & Scale AI -
Download address: https://scale.com/resources/download/pandaset -
Published: 2019 -
Size: 16.0 GB -
Introduction: PandaSet is open for both academic research and commercial use. It is the first dataset to combine a mechanically spinning lidar with an image-grade, forward-facing solid-state lidar during collection, and it also provides point cloud segmentation results -
Features: -
48,000+ camera images -
16,000 lidar point cloud sweeps (more than 100 scenes of 8 s each) -
28 annotation classes per scene -
37 semantic segmentation labels for most scenes -
Sensors: 1 mechanical lidar, 1 solid-state lidar, 5 wide-angle cameras, 1 telephoto camera, onboard GPS/IMU
-
-
nuScenes -
Released by: Motional (formerly nuTonomy) -
Download address: https://scale.com/open-datasets/nuscenes/tutorial -
Paper: https://arxiv.org/abs/1903.11027 -
Published: 2019 -
Size: 547.98GB -
Introduction: nuScenes is one of the most widely used public datasets in autonomous driving, and currently the most authoritative benchmark for camera-only 3D object detection. Inspired by KITTI, it is the first dataset to carry a full sensor suite, and it comprises 1000 complex driving scenes from Boston and Singapore. Commercial use is prohibited -
Features: -
Full sensor suite: 1 lidar, 5 radars, 6 cameras, GPS, IMU (see the devkit sketch after this list) -
1000 scenes of 20 s each (850 for training, 150 for testing) -
40,000 keyframes, 1.4 million camera images, 390,000 lidar sweeps, 1.4 million radar sweeps -
1.4 million 3D annotations covering 23 object classes
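Since the download address above points to the official tutorial, here is a minimal sketch of browsing the dataset with the nuscenes-devkit (pip install nuscenes-devkit); the dataroot is a placeholder for a local copy, and v1.0-mini is the small teaser split.

```python
# Minimal sketch: walk from a scene to its first keyframe's lidar sweep
# using the official nuscenes-devkit. The dataroot is a placeholder.
from nuscenes.nuscenes import NuScenes

nusc = NuScenes(version="v1.0-mini", dataroot="/data/sets/nuscenes", verbose=True)

scene = nusc.scene[0]                                     # one 20 s scene
sample = nusc.get("sample", scene["first_sample_token"])  # first keyframe (2 Hz)
lidar = nusc.get("sample_data", sample["data"]["LIDAR_TOP"])
print(scene["name"], lidar["filename"])                   # lidar file on disk
```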
-
-
Lyft Level 5 -
Publisher: Lyft -
Download address: https://level-5.global/register/ -
Paper: https://arxiv.org/pdf/2006.14480v2.pdf -
Release time: Lyft perception dataset in 2019, Lyft prediction dataset in 2020 -
Lyft-perception -
Introduction: Lyft's autonomous vehicles carry an in-house sensor suite that collects raw sensor data on surrounding cars, pedestrians, traffic lights and more -
Features: -
More than 55,000 manually labeled 3D frames -
1.3 million 3D annotations -
30,000 lidar point clouds -
350 scenes of 60-90 minutes
-
-
Lyft-prediction -
Introduction: This dataset records the movements of cars, cyclists, pedestrians and other traffic agents encountered by the self-driving fleet. The records are derived from raw lidar, camera and radar data and are well suited for training motion-prediction models -
Features: -
1000 hours of driving records -
170,000 scenes, each lasting about 25 s, with traffic lights, aerial maps, sidewalks, etc. -
2575 km of data from public roads -
15,242 labeled images, including high-definition semantic segmentation maps and a high-definition aerial view of the area
-
-
-
H3D -
Issued by: Honda Research Institute -
Download address: https://usa.honda-ri.com//H3D -
Paper: https://arxiv.org/abs/1903.01568 -
Published: 2019 -
Introduction: A large-scale, full-surround 3D multi-object detection and tracking dataset collected with a 3D lidar scanner, available to university researchers only -
Features: -
360 degree LiDAR dataset -
160 crowded and complex traffic scenes -
27,721 frames, 1,071,302 3D annotations -
Manual annotation of 8 common object classes in autonomous driving scenes -
Sensors: 3 HD cameras, 1 lidar, GPS/IMU
-
-
Boxy -
Publisher: Bosch -
Download address: https://boxy-dataset.com/boxy/ -
Paper: https://openaccess.thecvf.com/content_ICCVW_2019/papers/CVRSUAD/Behrendt_Boxy_Vehicle_Detection_in_Large_Images_ICCVW_2019_paper.pdf -
Published: 2019 -
Size: 1.1TB -
Introduction: A large vehicle detection dataset whose highlight is its 5-megapixel resolution; it provides neither 3D point clouds nor urban road traffic data -
Features: -
2.2 million high-resolution images totaling 1.1TB -
5-megapixel resolution -
1,990,806 vehicle labels, including 2D boxes and 2.5D labels -
Scenes include sunshine, rain, dawn, daytime and evening -
Covers both congested and free-flowing highway traffic
-
-
BLVD -
Issued by: Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University -
Download address: https://github.com/VCCIV/BLVD/ -
Paper: https://arxiv.org/pdf/1903.06405.pdf -
Published: 2019 -
Introduction: The world's first 5D driving scene understanding dataset. BLVD aims to provide a unified evaluation platform for dynamic 4D tracking (speed, distance, horizontal angle and vertical angle), 5D interactive event recognition (4D + interactive behavior) and intention prediction. Collected by Xi'an Jiaotong University's Kuafu autonomous vehicle -
Features: -
654 labeled sequences containing 120,000 frames, with 5D semantic annotation over the full sequences -
249,129 3D object boxes, 4,902 valid trackable individuals -
About 214,900 tracking points in total -
6,004 valid fragments for 5D interactive event recognition, 4,900 targets for 5D intention prediction -
Rich scenes: urban and highway, day and night -
Multiple object types: pedestrians, vehicles, riders (cyclists and motorcyclists) -
Sensors: 1 Velodyne HDL-64E 3D lidar, GPS/IMU, 2 high-resolution multi-view cameras
-
-
SODA10M -
Released by: Huawei Noah's Ark Lab & Sun Yat-sen University -
Download address: https://soda-2d.github.io/download.html -
Paper: https://arxiv.org/pdf/2106.11118.pdf -
Published: 2021 -
Size: 5.6GB (labeled data), 2TB (unlabeled data) -
Introduction: A semi-/self-supervised 2D benchmark dataset consisting of 10 million unlabeled images of diverse road scenes and 20,000 labeled images collected from 32 cities -
Features: -
10 million unlabeled images and 20,000 labeled images, captured every 10 seconds by mobile phone or dash cam (1080p+) -
Six main pedestrian and vehicle categories: pedestrian, cyclist, car, truck, tram, tricycle -
Covers 32 cities in China -
Diverse scenes: sunny/cloudy/rainy; urban streets/highways/rural roads/residential areas; day/night/dawn/dusk -
The horizon is kept near the image center, and in-vehicle occlusion covers no more than 15% of the image
-
-
D²-City -
Publisher: DiDi -
Download address: https://www.scidb.cn/en/detail?dataSetId=804399692560465920 -
Published in 2019 -
Size: 131.21 GB -
Introduction: D²-City is a large-scale driving video dataset. Compared with existing datasets, it stands out for its diversity: the data was collected from DiDi fleet vehicles operating in five Chinese cities and covers varied weather, road and traffic conditions -
Features: -
More than 10,000 videos, all recorded at HD (720p) or full-HD (1080p) resolution; the raw data is stored as short clips at 25 fps, 30 s each -
About 1,000 videos carry 2D bounding-box and tracking annotations for 12 object classes, including cars, trucks, buses, pedestrians, motorcycles, bicycles, open and closed tricycles, forklifts and obstacles -
Rich scenes: varied weather, road and traffic conditions, including extremely complex and diverse situations such as poor lighting, rain and fog, congestion and low image clarity
-
-
ApolloScape -
Publisher: Baidu -
Download address: http://apolloscape.auto/scene.html -
Release time: 2018-2020 -
Introduction: Baidu's ApolloScape includes datasets for trajectory prediction, 3D lidar object detection and tracking, scene parsing, lane segmentation, 3D car instance segmentation, stereo and inpainting -
Features: -
Scene parsing data: hundreds of thousands of frames of 3384 x 2710 high-resolution images with per-pixel semantic segmentation annotations -
Lane semantic segmentation: 110,000 frames of high-quality pixel-level semantic segmentation data -
3D object detection and tracking: collected in Beijing, China under various lighting conditions and traffic densities
-
-
BDD100K -
Issued by: Berkeley Artificial Intelligence Research (BAIR), UC Berkeley -
Download address: https://bdd-data.berkeley.edu/ -
Paper: https://arxiv.org/pdf/1805.04687.pdf -
Release time: 2018 -
Size: 57.45GB -
Introduction: BDD100K has attracted wide attention for the diversity of its data, crowdsourced from tens of thousands of drivers and covering New York, the San Francisco Bay Area and other regions. BAIR researchers sample keyframes from the videos and provide annotations for them -
Features: -
100,000 HD videos totaling more than 1,100 hours of driving, each about 40 s long at 720p and 30 fps -
Videos also carry GPS location, IMU data and timestamps -
Covers sunny, overcast, rainy, snowy, foggy and cloudy weather; day and night; and driving scenes such as urban roads, tunnels, highways, residential areas, parking lots and gas stations -
A keyframe is sampled at the 10th second of each video -
Annotation types include image tagging, lane marking, drivable area, road object detection, semantic segmentation, instance segmentation, and multi-object detection and tracking (a label-parsing sketch follows this list)
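As a quick illustration of working with the detection labels, the sketch below tallies object categories. It assumes the official JSON layout (a list of per-image records whose labels entries carry a category and a box2d); the filename is a placeholder.

```python
# Minimal sketch: count object categories in a BDD100K detection label file.
# Assumption: the official JSON layout (per-image records with a "labels" list).
import json
from collections import Counter

with open("bdd100k_labels_images_val.json") as f:   # placeholder filename
    records = json.load(f)

counts = Counter(obj["category"]
                 for rec in records
                 for obj in rec.get("labels", []))
print(counts.most_common(5))
```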
-
-
DAIR-V2X -
Released by: Institute for AI Industry Research (AIR), Tsinghua University; Beijing High-level Autonomous Driving Demonstration Zone; Beijing Chewang Technology Development Co., Ltd.; Baidu Apollo; Beijing Academy of Artificial Intelligence (BAAI) -
Download address: https://thudair.baai.ac.cn/cooptest -
Published in 2022 -
Introduction: DAIR-V2X is the world's first large-scale, multi-modal, multi-view dataset for research on vehicle-infrastructure cooperative autonomous driving. All data is collected from real scenes and includes 2D and 3D annotations -
Features: -
71,254 frames of image data and 71,254 frames of point cloud data in total -
DAIR-V2X cooperative dataset (DAIR-V2X-C): 38,845 image frames and 38,845 point cloud frames -
DAIR-V2X infrastructure-side dataset (DAIR-V2X-I): 10,084 image frames and 10,084 point cloud frames -
DAIR-V2X vehicle-side dataset (DAIR-V2X-V): 22,325 image frames and 22,325 point cloud frames -
First to provide spatio-temporally synchronized annotations for vehicle-infrastructure cooperation -
Rich sensor types: vehicle-side camera, vehicle-side lidar, roadside camera, roadside lidar, and more -
Comprehensive 3D obstacle annotations covering 10 common road obstacle classes -
Collected from 10 km of urban roads, 10 km of expressways and 28 intersections in the Beijing High-level Autonomous Driving Demonstration Zone -
Data covers sunny/rainy/foggy days, day/night, and urban roads/highways, among other scenes -
Complete data: desensitized raw images and point clouds, annotations, timestamps, calibration files, etc. -
The training and validation sets have been released; the test set will be released with subsequent challenge events
-
-
Argoverse -
Released by: Argo AI, Carnegie Mellon University, Georgia Institute of Technology -
Download address: https://www.argoverse.org/av1.html -
Paper: https://arxiv.org/pdf/1911.02620.pdf -
Published in 2019 -
Introduction: Argoverse comprises 3D Tracking and Motion Forecasting datasets. Unlike Waymo, although it also contains lidar and camera data, it covers only 113 scenes recorded in Miami and Pittsburgh. It is the first dataset to include HD map data -
Features: -
First dataset with HD maps: 290 km of lane maps for Pittsburgh and Miami, with lane location, connectivity, traffic signals, elevation and more (see the map-query sketch after this section) -
Sensors: 2 lidars, 7 high-resolution ring cameras (1920 x 1200), 2 stereo cameras (2056 x 2464) -
Argoverse 3D Tracking -
113 scenes of 15-30 s each, containing 11,052 tracked objects in total -
Objects within 5 meters are annotated, with 15 label classes -
70% of annotated objects are vehicles; the rest are pedestrians, bicycles, motorcycles, etc.
-
Argoverse Motion Forecasting -
Mined from 1,006 hours of driving in Miami and Pittsburgh, yielding 320 hours of data -
324,557 scenes of 5 s each, with a 2D bird's-eye-view trajectory of each tracked object sampled at 10 Hz
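The HD maps are the distinctive part of Argoverse, and the official argoverse-api exposes them programmatically. Below is a minimal sketch of a map query; the query point is illustrative only and may need adjusting to land on mapped lanes.

```python
# Minimal sketch: query the Argoverse 1 HD map via the official argoverse-api
# (https://github.com/argoai/argoverse-api). The query point is illustrative.
from argoverse.map_representation.map_api import ArgoverseMap

am = ArgoverseMap()
lane_ids = am.get_lane_ids_in_xy_bbox(
    query_x=2600.0, query_y=1200.0, city_name="PIT",
    query_search_range_manhattan=50.0,
)
if lane_ids:
    centerline = am.get_lane_segment_centerline(lane_ids[0], "PIT")  # Nx3 polyline
    print(len(lane_ids), "lanes nearby; first centerline:", centerline.shape)
```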
-
-
Publisher: The Robotics and Stereoscopic Vision Group (RoViT), University of Alicante -
Download address: http://www.rovit.ua.es/dataset/traffic/#explore (available by email) -
Paper: https://www.mdpi.com/2079-9292/7/11/301# -
Release time: 2018 -
Introduction: Most of the data comes from existing datasets such as PASCAL VOC, Udacity and the Swedish traffic sign dataset, while a small portion (about 1%) was collected with an HD camera mounted on a vehicle; label categories were added on top of the public datasets. Part of the data is weakly labeled and can be used to test weakly supervised learning techniques -
Features: -
The dataset is divided into two parts: traffic objects and traffic signs -
The traffic objects part carries 2D annotations for cars, motorcycles, people, traffic lights, buses, bicycles and traffic signs -
The traffic signs part covers 43 traffic signs common on European streets, with data from GTSRB and the Swedish dataset -
12,000 traffic signs
-
-
Road Damage Dataset -
Issued by: University of Tokyo -
Download address: https://github.com/sekilab/RoadDamageDetector/ -
Papers: https://arxiv.org/abs/1801.09454 and https://www.sciencedirect.com/science/article/pii/S2352340921004170 -
Release time: 2018-2020 -
Introduction: -
Road Damage Dataset 2018: the first large-scale road damage dataset, with more than 40 hours of data collected in 7 Japanese cities. It consists of 9,053 road damage images taken by smartphones mounted on cars, containing 15,435 damage instances across 8 damage types; each image is annotated with the location and type of damage -
Road Damage Dataset 2020: captured with vehicle-mounted smartphones, containing 26,336 road images from India, Japan and the Czech Republic with more than 31,000 damage instances, covering four damage types: longitudinal cracks, transverse cracks, alligator cracks and potholes
-
-
Mapillary Traffic Sign Dataset -
Publisher: Mapillary -
Download address: https://www.mapillary.com/dataset/trafficsign -
Paper: https://arxiv.org/abs/1909.04422 -
Published in 2020 -
Introduction: The Mapillary Traffic Sign Dataset is the largest and most diverse public traffic sign dataset in the world, built for detecting and classifying traffic signs worldwide -
Features: -
100,000 high-resolution images, 52,000 fully annotated and 48,000 partially annotated -
313 traffic sign classes with bounding boxes, 320,000 traffic signs in total -
Diversity: all kinds of weather, seasons and times of day, plus broad geographic coverage of urban and rural roads across 6 continents
-
-
Issued by: Beijing Jiaotong University -
Download address: http://www.nlpr.ia.ac.cn/pal/trafficdata/recognition.html -
Features: -
6,164 traffic sign images covering 58 sign categories -
Split into a training set of 4,170 images and a test set of 1,994 images -
Collected under different weather and lighting conditions, and includes cases such as partial occlusion
-
Semantic segmentation datasets
-
SemanticKITTI -
Published by: University of Bonn -
Download address: http://www.semantic-kitti.org/dataset.html#overview -
Paper: https://www.researchgate.net/profile/MartinGarbade/publication/332168840_A_Dataset_for_Semantic_Segmentation_of_Point_Cloud_Sequences/links/5cac76d0299bf184605517a1/A-Dataset-for-Semantic-Segmentation-of-Point-Cloud-Sequences.pdf -
Published in 2019 -
Introduction: SemanticKITTI is the semantic segmentation subset of KITTI, one of the key benchmarks for lidar semantic segmentation and the largest sequential dataset to date. It labels all sequences of the KITTI Vision Odometry Benchmark, providing dense point-wise labels for the full 360° field of view of the automotive lidar. On top of this dataset the team proposed three benchmark tasks: (i) semantic segmentation of point clouds from a single scan, (ii) semantic segmentation from multiple past scans, and (iii) semantic scene completion -
Features: -
23,201 full 3D scans for training and 20,351 for testing (a label-reading sketch follows this list) -
28 annotation classes, split into static and dynamic objects, covering traffic participants such as pedestrians and vehicles as well as ground facilities such as parking lots and sidewalks -
518 tiles, representing more than 1,700 hours of labeling work -
The team also open-sourced the point cloud labeling tool used during data collection
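The scans are the KITTI velodyne .bin files (N x 4 float32: x, y, z, remission), and each .label file stores one uint32 per point, with the semantic class in the lower 16 bits and the instance id in the upper 16. A minimal reading sketch with placeholder paths:

```python
# Minimal sketch: read one SemanticKITTI scan and its point-wise labels.
import numpy as np

points = np.fromfile("sequences/00/velodyne/000000.bin",
                     dtype=np.float32).reshape(-1, 4)      # x, y, z, remission
labels = np.fromfile("sequences/00/labels/000000.label", dtype=np.uint32)

semantic = labels & 0xFFFF   # class id per point
instance = labels >> 16      # instance id per point (0 for non-instance classes)
assert points.shape[0] == semantic.shape[0]
print(np.unique(semantic))
```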
-
-
Highway Driving -
Publisher: Korea Advanced Institute of Science and Technology (KAIST) -
Download address: https://sites.google.com/site/highwaydrivingdataset/ -
Paper: https://arxiv.org/pdf/2011.00674.pdf -
Published in 2019 -
Introduction: The Highway Driving dataset is a densely annotated benchmark for semantic video segmentation. Its annotations are denser in both space and time than other existing datasets, and the annotation of each frame takes the correlation between adjacent frames into account -
Features: -
20 sequences of 60 frames each, at a frame rate of 30 Hz -
Split into a training set of 15 sequences and a test set of the remaining 5 -
Short clips shot at 30 Hz while driving on the highway -
10 label classes: road, lane, sky, fence, building, traffic sign, car, truck, vegetation and unknown; the unknown class covers undefined objects, the hood of the recording vehicle, and blurry edges
-
-
WildDash -
Issued by: Austrian Institute of Technology -
Download address: https://wilddash.cc/accounts/login?next=/download -
Paper: https://openaccess.thecvf.com/content_ECCV_2018/papers/Oliver_Zendel_WildDash_-_Creating_ECCV_2018_paper.pdf -
Release time: 2018 -
Size: 10.8GB -
Introduction: A test dataset for semantic and instance segmentation in the automotive domain. Its advantages: (i) failed tests can be traced back to the visual risk factors that caused them; (ii) negative test cases are included to guard against false positives; (iii) it has low regional bias and low camera-setup bias -
Features: -
Covers the global diversity of traffic situations, with test cases from all over the world -
Dataset bias is reduced through a large variety of road scenes, road layouts, weather and lighting conditions from different countries -
Scenes with visual hazards carry improved meta-information clarifying which hazards each test image covers -
Labels: road, sidewalk, parking, rail track, person, rider, car, truck, bus, on-rails vehicle (train/tram), motorcycle, bicycle, caravan, building, wall, fence, guardrail, bridge, tunnel, pole, traffic sign, traffic light, vegetation, terrain, sky, ground, dynamic and static -
Includes negative test cases on which algorithms are expected to fail
-
-
IDD (India Driving Dataset) -
Issued by: IIIT Hyderabad -
Download address: http://idd.insaan.iiit.ac.in/accounts/login/?next=/dataset/download/ -
Paper: https://sci-hub.se/10.1109/wacv.2019.00190 -
Release time: 2018 -
Introduction: A dataset for road scene understanding in unstructured environments, with a label distribution significantly different from existing datasets -
Features: -
10,004 images finely annotated with 34 classes, collected from 182 driving sequences on Indian roads -
Resolution is mostly 1080p, with some 720p and other resolutions -
Collected with vehicle-mounted cameras in Hyderabad, Bangalore and other cities and their suburbs
-
Lane detection datasets
-
LLAMAS -
Publisher: Bosch North America Research -
Download address: https://unsupervised-llamas.com/llamas/ -
Paper: https://openaccess.thecvf.com/content_ICCVW_2019/papers/CVRSUAD/Behrendt_Unsupervised_Labeled_Lane_Markers_Using_Maps_ICCVW_2019_paper.pdf -
Published in 2019 -
Introduction: One of the largest high-quality lane marker datasets; the publisher provides a benchmark and baselines along with it -
Features: -
100,042 labeled lane marker images from roughly 350 km of driving -
The labeling pipeline projects markers from automatically created maps into the camera images and relies on an optimization procedure to improve label accuracy -
Contains pixel-level dashed-marker annotations, 2D and 3D endpoints of each marker, and lane associations connecting markers
-
-
BDD100K -
Published by: University of California, Berkeley -
Download address: https://bdd-data.berkeley.edu/ -
Paper: https://arxiv.org/pdf/1805.04687.pdf -
Release time: 2018 -
Introduction: At release this was the largest and most diverse open driving video dataset for computer vision research. It is also well suited for pedestrian recognition, since it contains more pedestrian instances than previous dedicated datasets -
Features: -
A keyframe is sampled from the 10th second of each video and annotated -
Labels are provided at multiple levels: image tags, road object bounding boxes, drivable areas, lane markings and full-frame instance segmentation -
These annotations help analyze data diversity and object statistics across different scene types
-
-
ApolloScape -
Publisher: Baidu -
Download address: http://apolloscape.auto/scene.html -
Paper: https://arxiv.org/pdf/1803.06184.pdf -
Release time: 2018 -
Introduction: Baidu's ApolloScape includes datasets for trajectory prediction, 3D lidar object detection and tracking, scene parsing, lane segmentation, 3D car instance segmentation, stereo and inpainting -
Features: -
Lane semantic segmentation: 110,000 frames of high-quality pixel-level semantic segmentation data -
3D object detection and tracking: collected in Beijing, China under various lighting conditions and traffic densities
-
-
CULane -
Issued by: The Chinese University of Hong Kong -
Download address: https://pan.baidu.com/s/1KUtzC24cH20n6BtU5D0oyw#list/path=%2F -
Paper: https://ojs.aaai.org/index.php/AAAI/article/view/12301 -
Release time: 2018 -
Introduction: A large-scale, challenging dataset for academic research on traffic lane detection -
Features: -
Collected by cameras mounted on six different vehicles driven by different drivers in Beijing -
More than 55 hours of video captured, from which 133,235 frames were extracted -
Split into 88,880 training images, 9,675 validation images and 34,680 test images; the test set is divided into a normal category and 8 challenging categories (an annotation-parsing sketch follows this list)
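Each CULane image is paired with a .lines.txt file in which every text line describes one lane as alternating x y image coordinates. A minimal parser under that assumption, with a placeholder path:

```python
# Minimal sketch: parse one CULane per-image lane annotation (.lines.txt),
# where each text line lists a lane as alternating x y image coordinates.
def read_culane_lanes(path: str):
    lanes = []
    with open(path) as f:
        for line in f:
            vals = [float(v) for v in line.split()]
            lanes.append(list(zip(vals[0::2], vals[1::2])))  # [(x, y), ...]
    return lanes

lanes = read_culane_lanes("driver_XX/XXXXX.lines.txt")  # placeholder path
print(len(lanes), "lanes;", len(lanes[0]) if lanes else 0, "points on the first")
```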
-
Optical flow datasets
-
CrowdFlow -
Published by: Technische Universität Berlin -
Download address: https://github.com/tsenst/CrowdFlow -
Paper: http://elvera.nue.tu-berlin.de/files/1548Schr%C3%B6der2018.pdf -
Release time: 2018 -
Introduction: The CrowdFlow dataset provides an optical flow benchmark focused on sequences for crowd behavior analysis -
Features: -
Sequences contain 371 to 1,451 independently moving individuals -
10 sequences of 300 to 450 frames each, all rendered at 25 Hz in HD resolution -
Compared with previous optical flow datasets, it not only increases resolution and frame count but also groups frames into continuous sequences rather than single frame pairs, which allows temporal consistency to be evaluated, e.g. in the form of tracks (a .flo-reading sketch follows this list)
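Optical flow ground truth is commonly shipped in the Middlebury .flo format; assuming CrowdFlow follows this convention (worth verifying against the dataset documentation), a minimal reader looks like this:

```python
# Minimal reader for Middlebury-style .flo optical flow files
# (assumption: CrowdFlow ground truth uses this common format).
import numpy as np

def read_flo(path: str) -> np.ndarray:
    with open(path, "rb") as f:
        magic = np.fromfile(f, np.float32, count=1)[0]
        assert magic == 202021.25, "not a valid .flo file"
        w = int(np.fromfile(f, np.int32, count=1)[0])
        h = int(np.fromfile(f, np.int32, count=1)[0])
        data = np.fromfile(f, np.float32, count=2 * w * h)
    return data.reshape(h, w, 2)  # per-pixel (dx, dy)

flow = read_flo("frame_0001.flo")  # placeholder filename
print(flow.shape)
```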
-
Panoramic datasets
-
Complex Urban Dataset -
Issued by: KAIST (Korea Advanced Institute of Science and Technology) -
Download address: http://irap.kaist.ac.kr/dataset -
Paper: https://ieeexplore.ieee.org/document/8460834 -
Release time: 2018 -
Introduction: This dataset provides lidar data and stereo images with various position-sensor targets in highly complex urban environments. It captures characteristics of urban settings such as metropolitan areas, complex buildings and residential areas, providing data from both 2D and 3D lidars, the typical lidar sensor types. Raw sensor data for vehicle navigation is provided in file format, and development tools are offered in the Robot Operating System (ROS) environment -
Features: -
Data from diverse environments, such as complex metropolitan areas, residential areas and apartment complexes -
Sensor data at two accuracy levels (ordinary low-precision sensors and expensive high-precision sensors) -
Baselines provided via SLAM with high-precision navigation sensors and manual iterative closest point (ICP) -
Development tools for the general robotics community through ROS -
Raw data and 3D previews for different robot applications via WebGL
-
-
ApolloScape Stereo -
Publisher: Baidu -
Download address: http://apolloscape.auto/stereo.html#to_data_href -
Official website address: http://apolloscape.auto/index.html -
Release time: 2018 -
Introduction: ApolloScape is part of the Apollo autonomous driving project, a research-oriented effort to promote innovation across all aspects of autonomous driving, from perception to navigation and control. It provides public access to semantically annotated (pixel-level) street view images and simulation tools supporting user-defined policies. It is an ongoing project; new datasets and capabilities are added regularly -
Features: -
5,165 image pairs with corresponding disparity maps, of which 4,156 are for training and 1,009 for testing -
Ground truth is obtained by accumulating 3D point clouds from lidar and fitting 3D CAD models to independently moving cars -
Includes varied traffic conditions and severe occlusion
-
-
ONCE -
Publisher: Huawei -
Download address: https://once-for-auto-driving.github.io/download.html#downloads -
Paper: https://arxiv.org/abs/2106.11037 -
Published in 2021 -
Introduction: To address the shortage of data, the ONCE (One millioN sCenEs) dataset contains 1 million 3D scenes and 7 million corresponding 2D images, 5 times more scenes than the largest existing set (Waymo Open); its 144 driving hours of 3D recordings are 20 times longer than existing datasets, covering a wider range of weather, traffic conditions, times of day and regions -
Features: -
200 square kilometers of driving area, 144 hours of driving time -
15,000 fully annotated scenes with 5 classes (car, bus, truck, pedestrian, cyclist) -
Diverse environment (day/night, sunny/rainy, urban/suburban)
-
Localization and mapping datasets
-
StreetLearn -
Publisher: DeepMind -
Download address: http://streetlearn.cc -
Paper: https://xueshu.baidu.com/usercenter/paper/show?paperid=1r4304w0c47q0xx0sb3m06c0fc659763 -
Published in 2019 -
Introduction: To support research on learning navigation policies directly through exploration and interaction with an environment (e.g. end-to-end deep reinforcement learning), DeepMind built StreetLearn, an interactive, first-person, partially observable visual environment. It uses the photo content and broad coverage of Google Street View, including Pittsburgh and New York City, and provides performance baselines for challenging goal-driven navigation tasks -
Features: -
High photo resolution -
Diverse urban scenes -
City-scale regions with real street connectivity graphs -
Several traversal tasks developed by the authors that require the agent to navigate between goals over long distances
-
-
UTBM RoboCar Dataset -
Published by: University of Technology of Belfort-Montbéliard (UTBM) -
Download address: https://epan-utbm.github.io/utbm_robocar_dataset/ -
Paper: https://arxiv.org/abs/1909.03330 -
Published in 2019 -
Introduction: The dataset was collected with a multi-sensor platform integrating 11 heterogeneous sensors, including various cameras, lidars, radars, an IMU (Inertial Measurement Unit) and GPS-RTK (Global Positioning System / Real-Time Kinematic), with sensory data processed by software based on ROS (Robot Operating System). It targets many open research challenges in autonomous driving (such as highly dynamic environments), especially long-term autonomy (such as creating and maintaining maps) -
Features: -
Fully ROS-based -
Records urban and suburban map data with many distinctive driving features, such as highly dynamic environments (many moving objects around the vehicle), roundabouts, ramps, construction detours and aggressive driving -
Ground-truth trajectories recorded by GPS-RTK are provided for vehicle localization -
Captures daily and seasonal changes, making it especially suitable for long-term autonomy research -
A lidar odometry benchmark with loam_velodyne and LeGO-LOAM as baselines -
Various detour recordings support research on vehicle behavior prediction and help reduce accidents in such circumstances
-
-
MVSEC -
Published by: University of Pennsylvania -
Download address: https://daniilidis-group.github.io/mvsec -
Paper: https://xueshu.baidu.com/usercenter/paper/show?paperid=db1a3beb87eca6e87b799b1c1d87111d -
Release time: 2018 -
Introduction: The data was collected by a synchronized stereo pair of event-based cameras under various lighting levels and environments, carried handheld, flown on a hexacopter, driven on top of a car and mounted on a motorcycle. For each camera, the event stream, grayscale images and IMU readings are provided. In addition, accurate pose and depth images are provided for each camera at rates up to 100 Hz, using a combination of IMU, a rigidly mounted lidar system, indoor and outdoor motion capture, and GPS -
Features: -
Event-based cameras sense the world by detecting logarithmic intensity changes, recording changes with tens-of-microseconds accuracy; their asynchronous, near-instant feedback allows extremely low-latency responses -
Event streams from two synchronized and calibrated dynamic vision and active pixel sensors, with long indoor and outdoor sequences under varied lighting and speeds (an event-loading sketch follows this list) -
Accurate depth images at rates up to 100 Hz
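MVSEC sequences are distributed as ROS bags and HDF5 files. A minimal loading sketch follows, assuming the HDF5 layout in which events sit under davis/left/events as rows of (x, y, timestamp, polarity); this layout is an assumption worth checking against the dataset documentation.

```python
# Minimal sketch: load events from an MVSEC HDF5 sequence with h5py.
# Assumption: events stored under davis/left/events as (x, y, t, polarity)
# rows; the filename is a placeholder for a downloaded sequence.
import h5py

with h5py.File("outdoor_day1_data.hdf5", "r") as f:
    events = f["davis"]["left"]["events"][:10_000]  # first 10k events

x, y, t, p = events.T
print(f"{len(events)} events spanning {t[-1] - t[0]:.3f} s")
```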
-
-
comma2k19 -
Publisher: comma.ai -
Download address: https://github.com/commaai/comma2k19 -
Paper: https://arxiv.org/abs/1812.05752 -
Release time: 2018 -
Introduction: comma2k19 is a dataset of more than 33 hours of commuting on California's Highway 280. It splits recordings from a 20 km stretch of highway between San Jose and San Francisco into 2,019 segments of one minute each, and is fully reproducible and extensible. The data was collected with comma EONs, whose sensors resemble those of any modern smartphone: a road-facing camera, phone GPS, thermometer and 9-axis IMU. In addition, the EON uses a comma grey panda to capture raw GNSS measurements and all CAN data sent by the car -
Features: -
A road-facing camera, a 9-axis IMU, the CAN messages transmitted by the vehicle, and logs of raw GNSS measurements -
All data is concentrated in a very small area; repeated observation of the same places under varied conditions, combined with the raw GNSS logs, makes this dataset particularly suitable for developing high-performance localization and mapping algorithms
-
Driving behavior datasets
-
DBNet -
Issued by: Shanghai Jiao Tong University -
Download address: http://www.dbehavior.net -
Paper: http://www.dbehavior.net/data/egpaper_release.pdf -
Release time: 2018 -
Introduction: The dataset provides large-scale, high-quality point clouds scanned by Velodyne lidar, videos recorded by a dashboard camera, and standard driver behaviors -
Features: -
Large scale: more than 10k frames of real street scenes, totaling over 1TB -
Diversity: continuous, varied real traffic scenes such as seaside roads, school zones and even mountain roads, containing many intersections, pedestrians and traffic signs -
High quality: the point clouds, videos and driver behaviors were captured with high-resolution sensors, better reflecting real driving conditions
-
-
HDD -
Issued by: Honda Research Institute -
Download address: https://usa.honda-ri.com/hdd -
Paper: https://usa.honda-ri.com/documents/248678/249773/CVPR_18_HDD_Yi_Ting-v2.pdf/bb391444-9687-7b3b-0b34-c3534f15904f -
Release time: 2018 -
Introduction: The explicit goal of this dataset is to learn how humans perform actions and interact with traffic participants. The authors collected 104 hours of real human driving in the San Francisco Bay Area using an instrumented vehicle. The recordings consist of 137 sessions, each representing a navigation task performed by a driver -
Features: -
104 hours of real human driving recordings -
High-resolution camera: 1920 x 1200 pixels at 30 Hz -
Collection covers downtown areas, suburbs and highways in the San Francisco Bay Area
-
-
DADA-2000 -
Issued by: Xi'an Jiaotong University, Chang'an University -
Download address: https://github.com/JWFangit/LOTVS-DADA -
Paper: https://arxiv.org/abs/1912.12148v1 -
Published in 2019 -
Introduction: This dataset was built to study driver attention prediction. The authors searched nearly all public datasets and mainstream video sites, gathering about 3 million video frames; after cleaning, they obtained 2,000 videos at 1584 x 660 resolution (6.1 hours at 30 fps). The videos are not trimmed in any way, which makes the collected attention more natural -
Features: -
Videos are divided into 54 categories according to the participants in the accident -
Scenes include expressways, urban roads, rural roads, tunnels, etc.
-
Simulation datasets
-
SHIFT -
Publisher: Visual Intelligence and Systems Group, ETH Zurich -
Download address: www.vis.xyz/shift -
Paper: https://arxiv.org/abs/2206.08367 -
Published in 2022 -
Introduction: The largest multi-task synthetic dataset for autonomous driving, featuring discrete and continuous shifts in cloud cover, rain and fog intensity, time of day, and vehicle and pedestrian density. SHIFT has a comprehensive sensor suite and annotations for several mainstream perception tasks, enabling study of how perception performance degrades as the level of domain shift increases, fostering the development of continual adaptation strategies to mitigate this problem, and supporting evaluation of model robustness and generality -
Features: -
The largest synthetic driving dataset -
A multi-task driving dataset -
Covers the most important perception tasks under a wide range of conditions, with a comprehensive sensor setup -
Provides the most comprehensive set of annotations and conditions
-
-
Livox Simu-dataset -
Publisher: Livox -
Download address: https://livox-wiki-cn.readthedocs.io/zh_CN/latest/data_summary/dataset.html -
Official website address: https://www.livoxtech.com/cn/simu-dataset -
Published in 2021 -
Introduction: The Livox simulation dataset consists of point clouds and corresponding annotations generated on an autonomous driving simulation platform, supporting 3D object detection and semantic segmentation tasks. The sensor configuration comprises 5 lidars plus 1 ultra-long-range lidar. The full dataset contains 14,445 frames of 360° lidar point clouds, 3D bounding box labels for 6 object classes and 14 point cloud semantic labels. The scenes are mainly wide urban roads, including two-way 12-lane and two-way 8-lane layouts. The simulation also includes a variety of vehicle and pedestrian models and traffic flow simulation close to real scenes; abundant traffic lights, traffic signs, barriers (fences, green belts, isolation piers, etc.), trees and buildings bring the simulated scenes closer to actual driving conditions -
Features: -
Rich scene modeling and traffic flow simulation close to real scenes -
14,445 frames of 360° point cloud data -
3D bounding boxes and tracking IDs for six object classes -
14 point cloud semantic annotation classes
-
-
51Sim-One -
Issued by: 51WORLD -
Download address: https://gitee.com/OpenSimOne -
Official website address: https://www.51aes.com/ -
Published in 2020 -
Introduction: 51WORLD's virtual annotation data is generated and labeled on its self-developed autonomous driving simulation platform, 51Sim-One. The platform integrates static and dynamic data import, sensor simulation, dynamics simulation, visualization, test and replay, virtual dataset generation and in-the-loop testing; its modules cover the entire autonomous driving simulation workflow with scale, high precision and high realism, and it can automatically output synchronized raw multi-sensor data (images and point clouds) together with ground truth -
Features: -
Covers common camera simulation datasets as well as lidar simulation datasets -
Includes extreme situations, complex roads and weather conditions that challenge autonomous driving systems -
Fully automatic annotation
-
-
OPV2V -
Publisher: UCLA -
Download address: https://mobility-lab.seas.ucla.edu/opv2v/ -
Paper: https://arxiv.org/pdf/2109.07644.pdf -
Published in 2022 -
Introduction: The first large-scale open simulated dataset for vehicle-to-vehicle cooperative perception. It contains over 70 interesting scenes, 11,464 frames and 232,913 annotated 3D vehicle bounding boxes, collected from 8 towns in CARLA and a digital twin of Culver City, Los Angeles. The authors built a comprehensive benchmark with 16 models to evaluate several information fusion strategies (early, late and intermediate fusion) combined with state-of-the-art lidar detection algorithms, and also proposed a new attentive intermediate fusion pipeline to aggregate information from multiple connected vehicles -
Features: -
73 diverse scenes, each with multiple connected autonomous vehicles and near-real traffic flow -
6 road types and 9 different cities -
12,000 lidar point cloud frames, 48,000 RGB images and 230,000 annotated 3D detection boxes
-
References
-
https://github.com/daohu527/awesome-self-driving-car -
https://zhuanlan.zhihu.com/p/54377777 -
https://www.zhihu.com/column/c_1449072889304076288