New VOC Dataset for Anno-Robot, Step by Step

Jun 24, 2022

How to create a new dataset for Anno-Robot

Clone the TensorFlow Datasets source code. Don't worry, we are not going to modify it; we only need it to copy voc.py.

$ git clone https://github.com/tensorflow/datasets.git

Use tfds to create a new, empty dataset template.

$ tfds new anno_dataset

Copy voc.py over the generated anno_dataset.py:

$ cd anno_dataset
$ cp ../datasets/tensorflow_datasets/object_detection/voc.py ./anno_dataset.py

In anno_dataset.py, modify the builder configs as follows:

Comment out the first two of the three VocConfig entries and keep the last one, changing it to … year="2022". In practice, only the filenames={} entry matters.
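As a rough illustration of that edit, the sketch below uses a small stand-in dataclass instead of the real class from voc.py, so it runs without TensorFlow Datasets installed. The field names, the "trainval"/"test" keys, and the tar file names are assumptions for illustration only; check them against your own copy of voc.py.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class VocConfig:
    """Stand-in for the VocConfig class in voc.py (hypothetical, simplified)."""
    name: str
    year: str
    filenames: Dict[str, str] = field(default_factory=dict)


BUILDER_CONFIGS: List[VocConfig] = [
    # The first two configs are commented out, as described above.
    # VocConfig(name="2007", year="2007", filenames={...}),
    # VocConfig(name="2012", year="2012", filenames={...}),
    VocConfig(
        name="2022",
        year="2022",
        # Assumed keys and file names; per the note above, this mapping
        # is the part that matters.
        filenames={
            "trainval": "VOCOtrain.tar",
            "test": "VOCOtest.tar",
        },
    ),
]

for config in BUILDER_CONFIGS:
    print(config.name, config.filenames)
```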

Inside the Docker container, start an HTTP server (default port 8000):

$ cd /
$ python3 -m http.server
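Before running the build, it can help to confirm that the server really serves the archive. The following is a minimal sketch using only the Python standard library. The helper name is_reachable and the URL in the usage line are hypothetical, not taken from this setup.

```python
import urllib.error
import urllib.request


def is_reachable(url: str, timeout: float = 5.0) -> bool:
    """Return True if a HEAD request to `url` answers with HTTP 200."""
    request = urllib.request.Request(url, method="HEAD")
    # Bypass any configured proxy, since the server is expected to be local.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(request, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Covers connection refused, timeouts, and HTTP errors such as 404.
        return False


if __name__ == "__main__":
    # Hypothetical path; replace it with wherever your tar file lives under /.
    print(is_reachable("http://localhost:8000/path/to/VOCdevkit.tar"))
```

Because http.server is started from /, the URL path mirrors the absolute file path inside the container.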

To test the dataset:

The first time you build the dataset (it has never been built successfully before):

# cd anno_dataset
# tfds build
# tfds build --register_checksums

For the second and any later build:

# tfds build --overwrite
# tfds build --register_checksums

Please note: if you run "tfds build" without --overwrite, it will appear to pass even though nothing was rebuilt, because the previously built dataset is reused.

How to make a smaller dataset in Pascal VOC format

`-- VOCdevkit
    `-- VOC2012
        |-- Annotations
        |-- Annotations1
        |-- ImageSets
        |   |-- Action
        |   |-- Layout
        |   |-- Main
        |   `-- Segmentation
        |-- JPEGImages
        |-- JPEGImages1
        |-- SegmentationClass
        `-- SegmentationObject

When you run tfds build --register_checksums, the data can be split into three possible sets, as defined by the following files:

Main
|--train.txt, test.txt, val.txt

Make train.txt and val.txt from the images in ‘/mnt/anno_dataset/data/tmp_test/train/VOCdevkit/VOC2012/JPEGImages’:

ImageSets/Main# python3 run_make_traintxt.py
# cp train.txt val.txt
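The script run_make_traintxt.py is not shown here. As a hedged sketch of what such a script might do, the code below lists the images in JPEGImages and writes one image ID (the file name without its extension) per line, which is the layout Pascal VOC uses for the lists in ImageSets/Main. The function name, the limit parameter, and the .jpg filter are assumptions.

```python
from pathlib import Path
from typing import List, Optional


def write_image_ids(jpeg_dir: str, out_file: str,
                    limit: Optional[int] = None) -> List[str]:
    """Write the stems of all .jpg files in `jpeg_dir` to `out_file`, one per line."""
    ids = sorted(
        p.stem for p in Path(jpeg_dir).iterdir()
        if p.is_file() and p.suffix.lower() == ".jpg"
    )
    if limit is not None:
        ids = ids[:limit]  # keep only the first `limit` images for a smaller set
    Path(out_file).write_text("".join(f"{image_id}\n" for image_id in ids))
    return ids


if __name__ == "__main__":
    jpeg_dir = Path(
        "/mnt/anno_dataset/data/tmp_test/train/VOCdevkit/VOC2012/JPEGImages"
    )
    if jpeg_dir.is_dir():
        # Run from ImageSets/Main so that train.txt is written in place.
        written = write_image_ids(str(jpeg_dir), "train.txt")
        print(f"wrote {len(written)} image IDs")
```

Passing a small limit is one way to get the smaller dataset this section is about; copying train.txt to val.txt, as above, then reuses the same images for validation.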

Make a new tar:

$ cd /mnt/anno_dataset/data/tmp_test/train
$ tar -cvf VOCdevkit.tar ./VOCdevkit
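A similar archive can be produced from Python with the standard tarfile module, which is handy when packing is one step of a larger script. This is a sketch and not part of the original workflow; the function name make_tar is made up for illustration.

```python
import tarfile
from pathlib import Path
from typing import List


def make_tar(source_dir: str, tar_path: str) -> List[str]:
    """Pack `source_dir` into an uncompressed tar and return the member names."""
    source = Path(source_dir)
    with tarfile.open(tar_path, "w") as archive:
        # arcname keeps the top-level folder name (for example VOCdevkit)
        # without the absolute path in front of it.
        archive.add(str(source), arcname=source.name)
    with tarfile.open(tar_path, "r") as archive:
        return archive.getnames()
```

Checking the returned names for VOCdevkit/VOC2012/ImageSets/Main/train.txt is a quick way to confirm the split files made it into the archive.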

How to modify the code to point to a new dataset for Anno-Robot

What files are needed:

VOCOtrain.tar
VOCOtest.tar
config.json

Which file to modify: anno_dataset.py

Register the above files with TFDS:

# tfds build --overwrite
# tfds build --register_checksums