Traffic Modelling Datasets

Traffic modelling is a hot topic in the field of autonomous cars currently. Here you will find a list of open datasets which can be used as source for building a traffic model. The list will include data captured from a variety of sources as listed below:

UAV/Drones
Traffic Camera
Autonomous Cars

This is not a list of datasets for learning-to-drive. This list is more focused towards dataset which provide global perspective of the traffic scenarios, rather than ego-vehicle perspective. Though few ego-vehicle datasets can be used for traffic modelling as well.

Datasets

Argoverse
- 3D tracking data: 113 segments of 15-30secs each.
- Motion Forecasting: 324,557 segments of 5secs each (Total 320hrs)
- Python APIs available for visualization on HD maps
- Data not split into signalized and non-signalized intersections.
Interaction
- Contains roundabout, lane merging - which do not require Traffic Light info.
- “DR_” recorded via drones. “TC_” are track files recorded via fixed cameras.
- HD map has drive-able area and lane markings
- No video or image available
- Position, velocity, orientation, bbox dimension in csv
- Python APIs available for visualization on HD maps
In-D
- Python script to visualize data available.
- bbox dimension, center (x,y), velocity (x,y and lat-long), acceleration (x,y and lat-long), heading in csv.
- xx_background.png contains image of the road. No other images/ videos.
- Lane markings, drive-able area part of HD maps (.osm file).
- Entire data at signalized intersection.
High-D
- Python script to visualize data available.
- bbox dimension, center (x,y), velocity (x,y and lat-long), acceleration (x,y and lat-long), heading in csv.
- xx_background.png contains image of the road. No other images/ videos.
- Lane markings, drive-able area part of HD maps (.osm file).
- Entire data at highway.
NGSIM
- Vehicle ID, local x,y and global x,y, vehicle width and length.
- Video available for road and lane data. Video quality very bad!
- CAD files of area available.
Stanford
- Contains data at 8 (~ * 5) scenes.
- 100% view not available for all intersection. In-campus roads not a good representation of normal traffic scenarios. More pedestrian and bike data.
- More info available here.
NuScenes
- 1000 scenes.
- All location data is given with respect to the global coordinate system.
  - Global x, y, z
  - Bbox l, b, h
  - Rotation in Quaternion
  - Contains Pedestrian Data
  - Publication
Apollo
- The trajectory dataset consists of 53min training sequences and 50min testing sequences captured at 2 FPS
  - Global x, y, z positions
  - object length, width, height, heading
  - frame_id, object_id, object type
- Data not split into signalized and non-signalized intersections.
Round-D
- Python script to visualize data available.
- bbox dimension, center (x,y), velocity (x,y and lat-long), acceleration (x,y and lat-long), heading in csv.
- xx_background.png contains image of the road. No other images/ videos.
- Lane markings, drive-able area part of HD maps (.osm file).
- Full data on roundabouts.

Below we mention several parameters crucial for learning behavior and interactions between vehicles in a recorded scenario against each dataset.

S.No.	Data Source	Road	Lane Boundary	Vehicle	Pedestrian	Traffic Light
1	Argoverse	✔️	✔️	✔️	✔️	❌

2	Interaction	✔️	✔️	✔️	❌	❌

3	In-D	✔️	✔️	✔️	✔️	❌

4	High-D	✔️	✔️	✔️	✔️	❌

5	NGSIM	✔️	✔️	✔️	❌	✔️

6	Stanford	❌	❌	✔️	✔️	❌

7	NuScenes	✔️	✔️	✔️	✔️	❌

8	Apollo	❌	❌	✔️	✔️	❌

9	Round-D	✔️	✔️	✔️	✔️	❌