Welcome to MIKKI!

Hi there! We are releasing an open dataset for the computer vision community and hope to work together to solve real-world challenges in detection and recognition for autonomous driving. We partnered with taxi fleets and car rental companies to install dash cams and collect real road data. The first batch of 1000 images was selected to show some of the less commonly seen objects on roads, as well as objects that pose detection challenges for driving decisions. In addition, we included another 200 screenshots taken when our researchers played GTA (for work).

We understand that releasing this dataset is only the beginning. In the future, we will keep adding more labelled images for a wider variety of tasks and publishing evaluation results for the community. If you are interested in seeing how your algorithms perform on this dataset, you can submit your results to mikki@momenta.ai. Please see below for how we evaluate car and person detections.



Currently we have ground-truth 2D bounding boxes labelled in two categories: car and person. A detector's predictions are evaluated with the following metric:

AP @ IoU=0.5:0.05:0.95

The 10 IoU thresholds determine whether a predicted bounding box is a true positive or a false positive. A high IoU threshold rewards detectors with more precise localization. The final AP is averaged across all 10 IoU thresholds (0.5:0.05:0.95) and all images.

For each IoU threshold, we select the prediction-ground truth pairs with IoU > threshold. We then take the pair with the highest IoU, mark its prediction as a true positive, and remove all other pairs that share the same prediction or the same ground truth. This is repeated until the list is empty. All remaining predictions are marked as false positives. Finally, all predictions are sorted in decreasing order of confidence and a precision-recall curve is computed. AP is the area under this curve, that is, the area bounded by the curve, the recall axis, and the precision axis.
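The matching and AP steps described above can be sketched roughly as follows. This is an illustrative reading, not our official evaluation code: the corner box format `(x1, y1, x2, y2)`, the function names, pooling predictions from all images before building the curve, and the non-interpolated area computation are all assumptions made for the example.

```python
from typing import List, Optional, Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match(preds: Sequence[Box], gts: Sequence[Box], thr: float) -> List[bool]:
    """Greedy matching, highest IoU first; returns one true-positive flag per prediction."""
    pairs = [(iou(p, g), i, j) for i, p in enumerate(preds) for j, g in enumerate(gts)]
    pairs = sorted((x for x in pairs if x[0] > thr), reverse=True)
    tp = [False] * len(preds)
    used_gt = set()
    for _, i, j in pairs:
        # Skip pairs whose prediction or ground truth was already consumed.
        if not tp[i] and j not in used_gt:
            tp[i] = True
            used_gt.add(j)
    return tp


def average_precision(scores: Sequence[float], tp: Sequence[bool], num_gt: int) -> float:
    """Area under the precision-recall curve (no interpolation; an assumption)."""
    if num_gt == 0:
        return 0.0
    order = sorted(range(len(scores)), key=lambda k: -scores[k])
    hits, ap = 0, 0.0
    for rank, k in enumerate(order, start=1):
        if tp[k]:
            hits += 1
            # Precision at this rank times the recall step of 1 / num_gt.
            ap += (hits / rank) / num_gt
    return ap


def evaluate(images, thresholds: Optional[List[float]] = None) -> float:
    """images: list of (pred_boxes, pred_scores, gt_boxes), one tuple per image,
    all for a single category. Returns AP averaged over the IoU thresholds."""
    if thresholds is None:
        thresholds = [round(0.5 + 0.05 * t, 2) for t in range(10)]  # 0.50 ... 0.95
    aps = []
    for thr in thresholds:
        scores, flags, num_gt = [], [], 0
        for preds, confs, gts in images:
            flags.extend(match(preds, gts, thr))
            scores.extend(confs)
            num_gt += len(gts)
        aps.append(average_precision(scores, flags, num_gt))
    return sum(aps) / len(aps)
```

In this sketch, a prediction that coincides exactly with its ground-truth box is a true positive at every threshold, so `evaluate([([(0, 0, 10, 10)], [0.9], [(0, 0, 10, 10)])])` returns 1.0. A loosely localized box only counts at the lower thresholds and therefore scores lower.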


  • Each object category (car, person) is evaluated independently; that is, predictions of one category are ignored when computing AP for the other category.
  • In some cases, a car or person is too blurred or too small to identify, even for a human. For such cases, we mark ignored regions in the ground truth. When a prediction falls in an ignored region (IoU > 0.5), it is counted as neither a true positive nor a false positive.
  • Detection results should be saved in a JSON file for submission. A sample submission is attached here: Sample JSON. For now, please email your results to us. We will enable online submission later.
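The first two rules above can be sketched as a preprocessing step applied before matching. This is only an illustration under stated assumptions: the prediction dictionaries with `category`, `box`, and `score` keys are hypothetical and are not the submission schema (please refer to the sample JSON for that), boxes are assumed to be `(x1, y1, x2, y2)`, and dropping ignored predictions before matching is one possible ordering.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def split_by_category(preds: Sequence[dict]) -> Dict[str, List[dict]]:
    """Group predictions so each category can be scored on its own."""
    groups: Dict[str, List[dict]] = defaultdict(list)
    for p in preds:
        groups[p["category"]].append(p)  # "category" is a hypothetical key
    return dict(groups)


def drop_ignored(preds: Sequence[dict], ignored: Sequence[Box], thr: float = 0.5) -> List[dict]:
    """Remove predictions overlapping any ignored region with IoU > thr,
    so they count as neither true positives nor false positives."""
    return [p for p in preds if all(iou(p["box"], r) <= thr for r in ignored)]
```

A typical use would be to call `split_by_category` once per image, then `drop_ignored` on each category's predictions, and pass what remains to the matching step.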


Click to download: (813.7M)

Here we present some typical challenging cases as preview.

Challenge: You might not want your algorithm to believe there are seven real people right in front of the car, so that it does not brake unnecessarily.

Challenge: Difficult to localize a vehicle with an irregular shape.

Related Datasets:


The MIKKI dataset is released under the Creative Commons Attribution-ShareAlike 4.0 International License. This means you can create and distribute derivative works, but only under the same or a compatible license.


  • 4/1/2017: The MIKKI Dataset is publicly released with car and person object detection tasks.
  • 4/5/2017: Link to the Precarious Pedestrian dataset.


If you have any privacy concerns about this dataset, please contact us, and we will act on your request immediately.


Momenta MIKKI Group (Sophie Chu, Hannan Xiao, Yuli Bai, Sibo Jia, Jinwei Wang, Chen Wang, Yan Xia, Huan Sun, and Xudong Cao)


We’d love to hear from you! Please feel free to email us at mikki@momenta.ai.