<!-- Autogenerated by `scripts/make_examples.py` -->
<table align="left">
    <td>
        <a target="_blank" href="https://colab.research.google.com/github/voxel51/fiftyone-examples/blob/master/examples/wrangling_datasets.ipynb">
            <img src="https://user-images.githubusercontent.com/25985824/104791629-6e618700-5769-11eb-857f-d176b37d2496.png" height="32" width="32">
            Try in Google Colab
        </a>
    </td>
    <td>
        <a target="_blank" href="https://nbviewer.jupyter.org/github/voxel51/fiftyone-examples/blob/master/examples/wrangling_datasets.ipynb">
            <img src="https://user-images.githubusercontent.com/25985824/104791634-6efa1d80-5769-11eb-8a4c-71d6cb53ccf0.png" height="32" width="32">
            Share via nbviewer
        </a>
    </td>
    <td>
        <a target="_blank" href="https://github.com/voxel51/fiftyone-examples/blob/master/examples/wrangling_datasets.ipynb">
            <img src="https://user-images.githubusercontent.com/25985824/104791633-6efa1d80-5769-11eb-8ee3-4b2123fe4b66.png" height="32" width="32">
            View on GitHub
        </a>
    </td>
    <td>
        <a href="https://github.com/voxel51/fiftyone-examples/raw/master/examples/wrangling_datasets.ipynb" download>
            <img src="https://user-images.githubusercontent.com/25985824/104792428-60f9cc00-576c-11eb-95a4-5709d803023a.png" height="32" width="32">
            Download notebook
        </a>
    </td>
</table>


# Wrangling Datasets

This example provides a brief overview of loading datasets in common formats
into FiftyOne, manipulating them, and then exporting them (or subsets of them)
to disk (in possibly different formats).

For more details, check out the resources below:

-   [Loading data into FiftyOne](https://voxel51.com/docs/fiftyone/user_guide/dataset_creation/index.html)
-   [Dataset basics](https://voxel51.com/docs/fiftyone/user_guide/basics.html)
-   [Using dataset views](https://voxel51.com/docs/fiftyone/user_guide/using_views.html)
-   [Exporting FiftyOne datasets](https://voxel51.com/docs/fiftyone/user_guide/export_datasets.html)

## Setup

If you haven't already, install FiftyOne:


In [None]:
!pip install fiftyone

Let's prepare some datasets to work with. Don't worry about the details for now.

In [3]:
import fiftyone as fo
import fiftyone.zoo as foz

# ImageClassificationDirectoryTree
dataset = foz.load_zoo_dataset("cifar10", split="test")
dataset.take(250).export(
    "/tmp/fiftyone-examples/image-classification-directory-tree",
    fo.types.ImageClassificationDirectoryTree,
)

# CVATImageDataset
dataset = foz.load_zoo_dataset("coco-2017", split="validation")
dataset.take(250).export(
    "/tmp/fiftyone-examples/cvat-image-dataset",
    fo.types.CVATImageDataset,
)

Split 'test' already downloaded
Loading existing dataset 'cifar10-test'. To reload from disk, either delete the existing dataset or provide a custom `dataset_name` to use
 100% |████████████████████████████████████████████████████████████████████████████████████████████████████████████| 250/250 [494.2ms elapsed, 0s remaining, 505.9 samples/s]      
Split 'validation' already downloaded
Loading existing dataset 'coco-2017-validation'. To reload from disk, either delete the existing dataset or provide a custom `dataset_name` to use
 100% |████████████████████████████████████████████████████████████████████████████████████████████████████████████| 250/250 [3.0s elapsed, 0s remaining, 90.6 samples/s]      


## Loading data into FiftyOne

FiftyOne provides support for loading [many common dataset formats](https://voxel51.com/docs/fiftyone/user_guide/dataset_creation/datasets.html#supported-formats) out-of-the-box.

### Image classification directory tree

You can load a classification dataset stored as a directory tree whose subfolders define the classes of the images.

The relevant dataset type is [ImageClassificationDirectoryTree](https://voxel51.com/docs/fiftyone/user_guide/dataset_creation/datasets.html#imageclassificationdirectorytree):

In [4]:
import fiftyone as fo

DATASET_DIR = "/tmp/fiftyone-examples/image-classification-directory-tree"

classification_dataset = fo.Dataset.from_dir(
    DATASET_DIR, fo.types.ImageClassificationDirectoryTree
)

print(classification_dataset)

 100% |████████████████████████████████████████████████████████████████████████████████████████████████████████████| 250/250 [354.0ms elapsed, 0s remaining, 706.2 samples/s]      
Name:           2020.10.25.13.23.47
Media type:     None
Num samples:    250
Persistent:     False
Info:           {'classes': ['airplane', 'automobile', 'bird', ...]}
Tags:           []
Sample fields:
    media_type:   fiftyone.core.fields.StringField
    filepath:     fiftyone.core.fields.StringField
    tags:         fiftyone.core.fields.ListField(fiftyone.core.fields.StringField)
    metadata:     fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.metadata.Metadata)
    ground_truth: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Classification)
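
To make the directory layout concrete, here is a minimal standard-library sketch that builds a tiny tree in a temporary folder and recovers `(filepath, label)` pairs from the subfolder names. The helper `list_labeled_images` and the file names are made up for illustration; this only mirrors the on-disk convention and is not how FiftyOne's importer is implemented.

```python
import tempfile
from pathlib import Path


def list_labeled_images(root):
    """Return sorted (relative_path, label) pairs; the label is the subfolder name."""
    root = Path(root)
    pairs = []
    for path in sorted(root.glob("*/*")):
        if path.is_file():
            pairs.append((path.relative_to(root).as_posix(), path.parent.name))
    return pairs


# Build a toy tree of the form <root>/<class>/<image>
with tempfile.TemporaryDirectory() as tmp:
    for label, name in [("airplane", "000045.jpg"), ("horse", "005678.jpg")]:
        class_dir = Path(tmp) / label
        class_dir.mkdir()
        (class_dir / name).touch()  # empty placeholder file, not a real image

    labeled = list_labeled_images(tmp)

print(labeled)
```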


### CVAT image dataset

You can load a set of object detections stored in [CVAT image format](https://github.com/openvinotoolkit/cvat).

The relevant dataset type is [CVATImageDataset](https://voxel51.com/docs/fiftyone/user_guide/dataset_creation/datasets.html#cvatimagedataset):

In [5]:
import fiftyone as fo

DATASET_DIR = "/tmp/fiftyone-examples/cvat-image-dataset"

detection_dataset = fo.Dataset.from_dir(
    DATASET_DIR, fo.types.CVATImageDataset
)

print(detection_dataset)

 100% |████████████████████████████████████████████████████████████████████████████████████████████████████████████| 250/250 [3.5s elapsed, 0s remaining, 72.3 samples/s]      
Name:           2020.10.25.13.23.54
Media type:     image
Num samples:    250
Persistent:     False
Info:           {'created': '2020-10-25T13:23:43.618971', 'dumped': '2020-10-25T13:23:43.618971', 'task_labels': [{...}, {...}, {...}, ...], ...}
Tags:           []
Sample fields:
    media_type:              fiftyone.core.fields.StringField
    filepath:                fiftyone.core.fields.StringField
    tags:                    fiftyone.core.fields.ListField(fiftyone.core.fields.StringField)
    metadata:                fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.metadata.Metadata)
    ground_truth_detections: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Detections)


## Adding samples to datasets

Adding new samples to datasets is easy.

You can [create new samples](https://voxel51.com/docs/fiftyone/user_guide/basics.html#samples):

In [6]:
import fiftyone as fo

sample = fo.Sample(filepath="/path/to/image.jpg")

print(sample)

<Sample: {
    'id': None,
    'media_type': 'image',
    'filepath': '/path/to/image.jpg',
    'tags': [],
    'metadata': None,
}>


... [add fields dynamically](https://voxel51.com/docs/fiftyone/user_guide/basics.html#fields) to them:

In [7]:
sample["quality"] = 89.7
sample["keypoints"] = [[31, 27], [63, 72]]
sample["geo_json"] = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [125.6, 10.1]},
    "properties": {"name": "camera"},
}

print(sample)

<Sample: {
    'id': None,
    'media_type': 'image',
    'filepath': '/path/to/image.jpg',
    'tags': [],
    'metadata': None,
    'quality': 89.7,
    'keypoints': [[31, 27], [63, 72]],
    'geo_json': {
        'type': 'Feature',
        'geometry': {'type': 'Point', 'coordinates': [125.6, 10.1]},
        'properties': {'name': 'camera'},
    },
}>


... [add labels](https://voxel51.com/docs/fiftyone/user_guide/basics.html#labels) that can be rendered on the media in the App:

In [8]:
sample["weather"] = fo.Classification(label="sunny", confidence=0.95)
sample["animals"] = fo.Detections(
    detections=[
        fo.Detection(
            label="cat", bounding_box=[0.5, 0.5, 0.4, 0.3], confidence=0.75
        ),
        fo.Detection(
            label="dog", bounding_box=[0.2, 0.2, 0.2, 0.4], confidence=0.51
        )
    ]
)

print(sample)

<Sample: {
    'id': None,
    'media_type': 'image',
    'filepath': '/path/to/image.jpg',
    'tags': [],
    'metadata': None,
    'quality': 89.7,
    'keypoints': [[31, 27], [63, 72]],
    'geo_json': {
        'type': 'Feature',
        'geometry': {'type': 'Point', 'coordinates': [125.6, 10.1]},
        'properties': {'name': 'camera'},
    },
    'weather': <Classification: {
        'id': '5f95b732ad3e1adb146596d8',
        'label': 'sunny',
        'confidence': 0.95,
        'logits': None,
    }>,
    'animals': <Detections: {
        'detections': BaseList([
            <Detection: {
                'id': '5f95b732ad3e1adb146596d9',
                'attributes': BaseDict({}),
                'label': 'cat',
                'bounding_box': BaseList([0.5, 0.5, 0.4, 0.3]),
                'mask': None,
                'confidence': 0.75,
                'index': None,
            }>,
            <Detection: {
                'id': '5f95b732ad3e1adb146596da',
                '

...and add them to [datasets](https://voxel51.com/docs/fiftyone/user_guide/using_datasets.html):

In [9]:
dataset = fo.Dataset()
print(dataset)

Name:           2020.10.25.13.34.44
Media type:     None
Num samples:    0
Persistent:     False
Info:           {}
Tags:           []
Sample fields:
    media_type: fiftyone.core.fields.StringField
    filepath:   fiftyone.core.fields.StringField
    tags:       fiftyone.core.fields.ListField(fiftyone.core.fields.StringField)
    metadata:   fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.metadata.Metadata)


In [10]:
dataset.add_sample(sample)
print(dataset)

Name:           2020.10.25.13.34.44
Media type:     image
Num samples:    1
Persistent:     False
Info:           {}
Tags:           []
Sample fields:
    media_type: fiftyone.core.fields.StringField
    filepath:   fiftyone.core.fields.StringField
    tags:       fiftyone.core.fields.ListField(fiftyone.core.fields.StringField)
    metadata:   fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.metadata.Metadata)
    quality:    fiftyone.core.fields.FloatField
    keypoints:  fiftyone.core.fields.ListField
    geo_json:   fiftyone.core.fields.DictField
    weather:    fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Classification)
    animals:    fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Detections)


In [11]:
print(dataset.first())

<Sample: {
    'id': '5f95b738ad3e1adb146596dc',
    'media_type': 'image',
    'filepath': '/path/to/image.jpg',
    'tags': BaseList([]),
    'metadata': None,
    'quality': 89.7,
    'keypoints': BaseList([BaseList([31, 27]), BaseList([63, 72])]),
    'geo_json': BaseDict({
        'type': 'Feature',
        'geometry': BaseDict({
            'type': 'Point',
            'coordinates': BaseList([125.6, 10.1]),
        }),
        'properties': BaseDict({'name': 'camera'}),
    }),
    'weather': <Classification: {
        'id': '5f95b732ad3e1adb146596d8',
        'label': 'sunny',
        'confidence': 0.95,
        'logits': None,
    }>,
    'animals': <Detections: {
        'detections': BaseList([
            <Detection: {
                'id': '5f95b732ad3e1adb146596d9',
                'attributes': BaseDict({}),
                'label': 'cat',
                'bounding_box': BaseList([0.5, 0.5, 0.4, 0.3]),
                'mask': None,
                'confidence': 0.75,
   

## Working with datasets

You can access samples in datasets by iterating over them:

In [12]:
dataset = classification_dataset.clone()
dataset.compute_metadata()

for sample in dataset:
    # Do something with the sample here

    sample.tags.append("processed")
    sample.save()

 100% |████████████████████████████████████████████████████████████████████████████████████████████████████████████| 250/250 [831.6ms elapsed, 0s remaining, 300.4 samples/s]      


In [13]:
print(dataset.first())

<Sample: {
    'id': '5f95b4a3ad3e1adb14658378',
    'media_type': 'image',
    'filepath': '/tmp/fiftyone-examples/image-classification-directory-tree/airplane/000045.jpg',
    'tags': BaseList(['processed']),
    'metadata': <ImageMetadata: {
        'size_bytes': 1239,
        'mime_type': 'image/jpeg',
        'width': 32,
        'height': 32,
        'num_channels': 3,
    }>,
    'ground_truth': <Classification: {
        'id': '5f95b4a3ad3e1adb14658377',
        'label': 'airplane',
        'confidence': None,
        'logits': None,
    }>,
}>


...or access them directly by ID:

In [14]:
sample = dataset.first()

same_sample = dataset[sample.id]

same_sample is sample  # True: samples are singletons!

True
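
If the singleton idea is unfamiliar, the toy sketch below shows one common way to get that behavior: a registry that hands out a single shared object per ID. The classes `Sample` and `SampleRegistry` here are hypothetical stand-ins written for this example, and the use of weak references is an assumption, not a description of FiftyOne's internals.

```python
import weakref


class Sample:
    """Toy stand-in for a sample; not FiftyOne's class."""

    def __init__(self, sample_id):
        self.id = sample_id
        self.tags = []


class SampleRegistry:
    """Hands out one shared in-memory object per ID."""

    def __init__(self):
        # Entries disappear automatically once nothing else references them
        self._live = weakref.WeakValueDictionary()

    def get(self, sample_id):
        sample = self._live.get(sample_id)
        if sample is None:
            sample = Sample(sample_id)
            self._live[sample_id] = sample
        return sample


registry = SampleRegistry()
first = registry.get("5f95b4a3ad3e1adb14658378")
again = registry.get("5f95b4a3ad3e1adb14658378")

# An edit made through one reference is visible through the other
first.tags.append("processed")
print(again is first, again.tags)
```

The practical consequence is the same as in the cell above: two lookups of the same ID give you the same object, so edits never get out of sync between references.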

You can also [create views](https://voxel51.com/docs/fiftyone/user_guide/using_views.html) into your datasets that slice and dice your data in interesting ways:

In [15]:
# Sort by filepath
view1 = dataset.sort_by("filepath")
print(view1)

Dataset:        2020.10.25.13.34.56
Media type:     None
Num samples:    250
Tags:           ['processed']
Sample fields:
    media_type:   fiftyone.core.fields.StringField
    filepath:     fiftyone.core.fields.StringField
    tags:         fiftyone.core.fields.ListField(fiftyone.core.fields.StringField)
    metadata:     fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.metadata.Metadata)
    ground_truth: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Classification)
Pipeline stages:
    1. SortBy(field_or_expr='filepath', reverse=False)


In [16]:
print(view1.first().filepath)

/tmp/fiftyone-examples/image-classification-directory-tree/airplane/000045.jpg


In [17]:
# Random sample from a dataset
view2 = dataset.take(100)
print(view2)

Dataset:        2020.10.25.13.34.56
Media type:     None
Num samples:    100
Tags:           ['processed']
Sample fields:
    media_type:   fiftyone.core.fields.StringField
    filepath:     fiftyone.core.fields.StringField
    tags:         fiftyone.core.fields.ListField(fiftyone.core.fields.StringField)
    metadata:     fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.metadata.Metadata)
    ground_truth: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Classification)
Pipeline stages:
    1. Take(size=100, seed=None)
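
The `seed=None` in the stage above means the selection is not pinned, so different runs can return different samples. As a rough plain-Python analogy (the helper `take` below is written for this example and only imitates the idea), a fixed seed makes a random draw repeatable:

```python
import random


def take(items, size, seed=None):
    """Return `size` randomly chosen items; a fixed seed makes the choice repeatable."""
    rng = random.Random(seed)
    items = list(items)
    return rng.sample(items, min(size, len(items)))


filepaths = [f"{i:06d}.jpg" for i in range(250)]

run1 = take(filepaths, 100, seed=51)
run2 = take(filepaths, 100, seed=51)

# Same seed, same draw
print(len(run1), run1 == run2)
```

With FiftyOne itself, the equivalent is to pass a seed to the stage, e.g. `dataset.take(100, seed=51)`.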


In [18]:
print(view2.first())

<SampleView: {
    'id': '5f95b4a3ad3e1adb146585d0',
    'media_type': 'image',
    'filepath': '/tmp/fiftyone-examples/image-classification-directory-tree/horse/005678.jpg',
    'tags': BaseList(['processed']),
    'metadata': <ImageMetadata: {
        'size_bytes': 1324,
        'mime_type': 'image/jpeg',
        'width': 32,
        'height': 32,
        'num_channels': 3,
    }>,
    'ground_truth': <Classification: {
        'id': '5f95b4a3ad3e1adb146585cf',
        'label': 'horse',
        'confidence': None,
        'logits': None,
    }>,
}>


In [19]:
# Extract slice of a dataset
view3 = dataset[10:30]
print(view3)

Dataset:        2020.10.25.13.34.56
Media type:     None
Num samples:    20
Tags:           ['processed']
Sample fields:
    media_type:   fiftyone.core.fields.StringField
    filepath:     fiftyone.core.fields.StringField
    tags:         fiftyone.core.fields.ListField(fiftyone.core.fields.StringField)
    metadata:     fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.metadata.Metadata)
    ground_truth: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Classification)
Pipeline stages:
    1. Skip(skip=10)
    2. Limit(limit=20)
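
Note how the slice `[10:30]` shows up as `Skip(skip=10)` followed by `Limit(limit=20)`. The arithmetic behind that translation can be sketched in a few lines; `slice_to_stages` is a hypothetical helper that assumes non-negative bounds and no step:

```python
def slice_to_stages(start, stop):
    """Translate a [start:stop] slice into (stage_name, count) pairs."""
    start = start or 0
    stages = []
    if start > 0:
        stages.append(("Skip", start))
    if stop is not None:
        # The limit is the length of the slice, not its end index
        stages.append(("Limit", max(stop - start, 0)))
    return stages


stages = slice_to_stages(10, 30)
print(stages)

# Skipping 10 and then keeping 20 is the same as slicing [10:30]
items = list(range(250))
same = items[10:][:20] == items[10:30]
print(same)
```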


In [20]:
print(view3.first())

<SampleView: {
    'id': '5f95b4a3ad3e1adb14658396',
    'media_type': 'image',
    'filepath': '/tmp/fiftyone-examples/image-classification-directory-tree/airplane/004375.jpg',
    'tags': BaseList(['processed']),
    'metadata': <ImageMetadata: {
        'size_bytes': 1267,
        'mime_type': 'image/jpeg',
        'width': 32,
        'height': 32,
        'num_channels': 3,
    }>,
    'ground_truth': <Classification: {
        'id': '5f95b4a3ad3e1adb14658395',
        'label': 'airplane',
        'confidence': None,
        'logits': None,
    }>,
}>


View operations can be chained together:

In [21]:
from fiftyone import ViewField as F

complex_view = (
    dataset
    .match_tag("processed")
    .exists("metadata")
    .match(F("metadata.size_bytes") >= 1024)  # >= 1 kB
    .sort_by("filepath")
    .limit(5)
)

print(complex_view)

Dataset:        2020.10.25.13.34.56
Media type:     None
Num samples:    5
Tags:           ['processed']
Sample fields:
    media_type:   fiftyone.core.fields.StringField
    filepath:     fiftyone.core.fields.StringField
    tags:         fiftyone.core.fields.ListField(fiftyone.core.fields.StringField)
    metadata:     fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.metadata.Metadata)
    ground_truth: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Classification)
Pipeline stages:
    1. MatchTag(tag='processed')
    2. Exists(field='metadata', bool=True)
    3. Match(filter={'$expr': {'$gte': [...]}})
    4. SortBy(field_or_expr='filepath', reverse=False)
    5. Limit(limit=5)
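
To see what each stage in the chain contributes, here is the same logic applied to a handful of plain dictionaries. The records are made up for illustration, and list comprehensions are only an analogy: the `$expr` and `$gte` operators in the printed stage suggest the real filtering happens in a database query rather than in Python.

```python
# Toy records standing in for samples (values are invented)
samples = [
    {"filepath": "b.jpg", "tags": ["processed"], "metadata": {"size_bytes": 1324}},
    {"filepath": "a.jpg", "tags": ["processed"], "metadata": {"size_bytes": 1239}},
    {"filepath": "c.jpg", "tags": ["processed"], "metadata": {"size_bytes": 898}},
    {"filepath": "d.jpg", "tags": [], "metadata": {"size_bytes": 2048}},
    {"filepath": "e.jpg", "tags": ["processed"], "metadata": None},
]

# match_tag("processed")
matched = [s for s in samples if "processed" in s["tags"]]
# exists("metadata")
matched = [s for s in matched if s["metadata"] is not None]
# match(F("metadata.size_bytes") >= 1024)
matched = [s for s in matched if s["metadata"]["size_bytes"] >= 1024]
# sort_by("filepath")
matched = sorted(matched, key=lambda s: s["filepath"])
# limit(5)
matched = matched[:5]

result = [s["filepath"] for s in matched]
print(result)
```

Stage order matters: moving the limit before the sort would sort only the first five matches rather than return the first five of the sorted results.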


In [22]:
print(complex_view.first())

<SampleView: {
    'id': '5f95b4a3ad3e1adb14658378',
    'media_type': 'image',
    'filepath': '/tmp/fiftyone-examples/image-classification-directory-tree/airplane/000045.jpg',
    'tags': BaseList(['processed']),
    'metadata': <ImageMetadata: {
        'size_bytes': 1239,
        'mime_type': 'image/jpeg',
        'width': 32,
        'height': 32,
        'num_channels': 3,
    }>,
    'ground_truth': <Classification: {
        'id': '5f95b4a3ad3e1adb14658377',
        'label': 'airplane',
        'confidence': None,
        'logits': None,
    }>,
}>


See the other examples in this folder for more sophisticated view operations!

## Exporting datasets

You can easily [export samples](https://voxel51.com/docs/fiftyone/user_guide/export_datasets.html) in whatever format suits your fancy:

### Exporting a classification dataset

FiftyOne natively supports exporting classification datasets as [directory trees](https://voxel51.com/docs/fiftyone/user_guide/export_datasets.html#imageclassificationdirectorytree) whose subfolders encode the class labels:

In [23]:
# Create a view
view = classification_dataset.take(100)

# Export as a classification directory tree using the labels in the
# `ground_truth` field as classes
view.export(
    "/tmp/fiftyone-examples/export-classification-directory-tree",
    fo.types.ImageClassificationDirectoryTree,
    label_field="ground_truth"
)

 100% |████████████████████████████████████████████████████████████████████████████████████████████████████████████| 100/100 [135.0ms elapsed, 0s remaining, 741.0 samples/s]     


In [24]:
!ls -lah /tmp/fiftyone-examples/export-classification-directory-tree

total 0
drwxr-xr-x  12 Brian  wheel   384B Oct 25 13:36 .
drwxr-xr-x   5 Brian  wheel   160B Oct 25 13:36 ..
drwxr-xr-x  19 Brian  wheel   608B Oct 25 13:36 airplane
drwxr-xr-x  11 Brian  wheel   352B Oct 25 13:36 automobile
drwxr-xr-x  11 Brian  wheel   352B Oct 25 13:36 bird
drwxr-xr-x  18 Brian  wheel   576B Oct 25 13:36 cat
drwxr-xr-x  10 Brian  wheel   320B Oct 25 13:36 deer
drwxr-xr-x  14 Brian  wheel   448B Oct 25 13:36 dog
drwxr-xr-x  13 Brian  wheel   416B Oct 25 13:36 frog
drwxr-xr-x   7 Brian  wheel   224B Oct 25 13:36 horse
drwxr-xr-x  10 Brian  wheel   320B Oct 25 13:36 ship
drwxr-xr-x   7 Brian  wheel   224B Oct 25 13:36 truck


In [26]:
!ls -lah /tmp/fiftyone-examples/export-classification-directory-tree/airplane | head

total 136
drwxr-xr-x  19 Brian  wheel   608B Oct 25 13:36 .
drwxr-xr-x  12 Brian  wheel   384B Oct 25 13:36 ..
-rw-r--r--   1 Brian  wheel   1.1K Oct 25 13:36 002718.jpg
-rw-r--r--   1 Brian  wheel   1.3K Oct 25 13:36 003006.jpg
-rw-r--r--   1 Brian  wheel   898B Oct 25 13:36 003279.jpg
-rw-r--r--   1 Brian  wheel   1.3K Oct 25 13:36 003343.jpg
-rw-r--r--   1 Brian  wheel   1.3K Oct 25 13:36 003498.jpg
-rw-r--r--   1 Brian  wheel   1.2K Oct 25 13:36 005828.jpg
-rw-r--r--   1 Brian  wheel   1.1K Oct 25 13:36 006222.jpg


### Exporting a detection dataset

FiftyOne natively supports exporting object detection datasets in [COCO format](https://voxel51.com/docs/fiftyone/user_guide/export_datasets.html#cocodetectiondataset):

In [27]:
# Create a view
view = detection_dataset.take(100)

# Export in COCO format with detections from the `ground_truth_detections` field of
# the samples
view.export(
    "/tmp/fiftyone-examples/export-coco",
    fo.types.COCODetectionDataset,
    label_field="ground_truth_detections"
)

 100% |████████████████████████████████████████████████████████████████████████████████████████████████████████████| 100/100 [738.2ms elapsed, 0s remaining, 135.5 samples/s]      


In [29]:
!ls -lah /tmp/fiftyone-examples/export-coco

total 224
drwxr-xr-x    4 Brian  wheel   128B Oct 25 13:39 .
drwxr-xr-x    6 Brian  wheel   192B Oct 25 13:39 ..
drwxr-xr-x  102 Brian  wheel   3.2K Oct 25 13:39 data
-rw-r--r--    1 Brian  wheel   112K Oct 25 13:39 labels.json
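
A COCO-style `labels.json` is ordinary JSON with top-level `images`, `categories`, and `annotations` lists, so it is easy to inspect with the standard library. The sketch below counts annotations per category; it writes a small made-up file to a temporary folder so that it runs on its own, and the values in `toy_labels` are invented rather than taken from the export above.

```python
import json
import tempfile
from collections import Counter
from pathlib import Path

# Toy stand-in for an exported labels.json (all values are made up)
toy_labels = {
    "images": [{"id": 1, "file_name": "000014.jpg", "width": 640, "height": 480}],
    "categories": [{"id": 1, "name": "person"}, {"id": 2, "name": "dog"}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1, "bbox": [10.0, 20.0, 30.0, 40.0]},
        {"id": 2, "image_id": 1, "category_id": 1, "bbox": [50.0, 60.0, 70.0, 80.0]},
        {"id": 3, "image_id": 1, "category_id": 2, "bbox": [5.0, 5.0, 15.0, 25.0]},
    ],
}

with tempfile.TemporaryDirectory() as tmp:
    labels_path = Path(tmp) / "labels.json"
    labels_path.write_text(json.dumps(toy_labels))

    with open(labels_path) as f:
        labels = json.load(f)

# Map category IDs to names, then count annotations per category name
names = {c["id"]: c["name"] for c in labels["categories"]}
counts = Counter(names[a["category_id"]] for a in labels["annotations"])
print(dict(counts))
```

To try it on a real export, point `labels_path` at `/tmp/fiftyone-examples/export-coco/labels.json` instead of the temporary file.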


In [30]:
!ls -lah /tmp/fiftyone-examples/export-coco/data | head

total 25760
drwxr-xr-x  102 Brian  wheel   3.2K Oct 25 13:39 .
drwxr-xr-x    4 Brian  wheel   128B Oct 25 13:39 ..
-rw-r--r--    1 Brian  wheel    84K Oct 25 13:39 000014.jpg
-rw-r--r--    1 Brian  wheel   128K Oct 25 13:39 000111.jpg
-rw-r--r--    1 Brian  wheel   101K Oct 25 13:39 000172.jpg
-rw-r--r--    1 Brian  wheel   130K Oct 25 13:39 000173.jpg
-rw-r--r--    1 Brian  wheel   107K Oct 25 13:39 000270.jpg
-rw-r--r--    1 Brian  wheel   185K Oct 25 13:39 000272.jpg
-rw-r--r--    1 Brian  wheel   102K Oct 25 13:39 000316.jpg


In [37]:
!python -m json.tool /tmp/fiftyone-examples/export-coco/labels.json | tail -n 16

        {
            "id": 745,
            "image_id": 99,
            "category_id": 52,
            "bbox": [
                0.0,
                56.0,
                427.0,
                479.0
            ],
            "segmentation": null,
            "area": 76225.0,
            "iscrowd": 1
        }
    ]
}
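
Notice that the `bbox` values here are in pixels, whereas the `bounding_box` of a FiftyOne `Detection` is `[x, y, width, height]` expressed as fractions of the image size. Converting between the two needs only the image dimensions. The helper names below and the 640 x 480 image size are assumptions made for this sketch:

```python
def relative_to_coco(box, img_width, img_height):
    """[x, y, w, h] as fractions in [0, 1]  ->  [x, y, w, h] in pixels."""
    x, y, w, h = box
    return [x * img_width, y * img_height, w * img_width, h * img_height]


def coco_to_relative(box, img_width, img_height):
    """[x, y, w, h] in pixels  ->  [x, y, w, h] as fractions in [0, 1]."""
    x, y, w, h = box
    return [x / img_width, y / img_height, w / img_width, h / img_height]


# The "cat" box from earlier, placed on a hypothetical 640 x 480 image
cat_box = [0.5, 0.5, 0.4, 0.3]
coco_box = relative_to_coco(cat_box, 640, 480)
print([round(v, 1) for v in coco_box])

# Converting back recovers the original fractions (up to float rounding)
round_trip = coco_to_relative(coco_box, 640, 480)
print([round(v, 4) for v in round_trip])
```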


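### Exporting entire samples

You can also export entire samples, including all of their fields, in FiftyOne's own `FiftyOneDataset` format: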

In [38]:
# Create a view
view = detection_dataset.take(100)

# Export entire samples
view.export(
    "/tmp/fiftyone-examples/export-fiftyone-dataset",
    fo.types.FiftyOneDataset
)

 100% |████████████████████████████████████████████████████████████████████████████████████████████████████████████| 100/100 [835.9ms elapsed, 0s remaining, 119.6 samples/s]      


In [39]:
!ls -lah /tmp/fiftyone-examples/export-fiftyone-dataset

total 496
drwxr-xr-x    5 Brian  wheel   160B Oct 25 13:43 .
drwxr-xr-x    7 Brian  wheel   224B Oct 25 13:43 ..
drwxr-xr-x  102 Brian  wheel   3.2K Oct 25 13:43 data
-rw-r--r--    1 Brian  wheel   3.2K Oct 25 13:43 metadata.json
-rw-r--r--    1 Brian  wheel   242K Oct 25 13:43 samples.json


In [40]:
!ls -lah /tmp/fiftyone-examples/export-fiftyone-dataset/data | head

total 24944
drwxr-xr-x  102 Brian  wheel   3.2K Oct 25 13:43 .
drwxr-xr-x    5 Brian  wheel   160B Oct 25 13:43 ..
-rw-r--r--    1 Brian  wheel   127K Oct 25 13:43 000026.jpg
-rw-r--r--    1 Brian  wheel    90K Oct 25 13:43 000067.jpg
-rw-r--r--    1 Brian  wheel    41K Oct 25 13:43 000107.jpg
-rw-r--r--    1 Brian  wheel   128K Oct 25 13:43 000111.jpg
-rw-r--r--    1 Brian  wheel    96K Oct 25 13:43 000154.jpg
-rw-r--r--    1 Brian  wheel   101K Oct 25 13:43 000172.jpg
-rw-r--r--    1 Brian  wheel   185K Oct 25 13:43 000272.jpg


In [45]:
!python -m json.tool /tmp/fiftyone-examples/export-fiftyone-dataset/metadata.json | head -n 16

{
    "name": "2020.10.25.13.23.54-view",
    "media_type": "image",
    "sample_fields": {
        "media_type": "fiftyone.core.fields.StringField",
        "filepath": "fiftyone.core.fields.StringField",
        "tags": "fiftyone.core.fields.ListField(fiftyone.core.fields.StringField)",
        "metadata": "fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.metadata.Metadata)",
        "ground_truth_detections": "fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Detections)"
    },
    "info": {
        "task_labels": [
            {
                "name": "airplane",
                "attributes": []
            },


In [51]:
!python -m json.tool /tmp/fiftyone-examples/export-fiftyone-dataset/samples.json | tail -n 28

                    {
                        "_id": {
                            "$oid": "5f95b4abad3e1adb14658d5f"
                        },
                        "_cls": "Detection",
                        "attributes": {
                            "area": {
                                "_cls": "NumericAttribute",
                                "value": 534.3859500000002
                            },
                            "iscrowd": {
                                "_cls": "NumericAttribute",
                                "value": 0.0
                            }
                        },
                        "label": "person",
                        "bounding_box": [
                            0.5515625,
                            0.42516268980477223,
                            0.04375,
                            0.06073752711496746
                        ]
                    }
                ]
            }
        }
    ]

## Cleanup

In [52]:
!rm -rf /tmp/fiftyone-examples