{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", " \n", " \n", " \n", " \n", "
\n", " \n", " \n", " Try in Google Colab\n", " \n", " \n", " \n", " \n", " Share via nbviewer\n", " \n", " \n", " \n", " \n", " View on GitHub\n", " \n", " \n", " \n", " \n", " Download notebook\n", " \n", "
\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "

Elderly Action Recognition Challenge

\n", "

Dataset Preparation with FiftyOne

\n", "
\n", "\n", "This notebook walks you through the process of preparing a dataset for the [Elderly Action Recognition Challenge](https://voxel51.com/computer-vision-events/elderly-action-recognition-challenge-wacv-2025). It covers essential steps such as importing data, parsing actions, assigning categories, splitting videos into clips, and exporting the dataset using [FiftyOne](https://docs.voxel51.com/).\n", "\n", "---\n", "\n", "**Useful Links:**\n", "- [Challenge Overview](https://voxel51.com/computer-vision-events/elderly-action-recognition-challenge-wacv-2025/)\n", "- [Submission Page](https://eval.ai/web/challenges/challenge-page/2427/overview)\n", "- [Join the Discussion](https://discord.com/channels/1266527359511564372/1319053378843836448)\n", "\n", "---\n", "\n", "
\n", " \"challenge-logo\"\n", " \"fiftyone-logo\"\n", "
\n", "\n", "---\n", "\n", "**Goal**: Enable participants to work with the dataset efficiently and submit meaningful solutions to the challenge, ultimately advancing the field of action recognition for the elderly.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Requirements and FiftyOne Installation\n", "\n", "First, create a Python environment on your system. If you are not familiar with this step, take a look at this [README file](https://github.com/voxel51/fiftyone-examples?tab=readme-ov-file#-prerequisites-for-beginners-), which explains how to create one. Then activate the environment and install FiftyOne in it." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#!pip install fiftyone" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Imports\n", "In this section, we import all the necessary libraries and modules to work with the dataset, including FiftyOne, pandas, and re for regular expressions. These libraries provide the foundation for loading, processing, and interacting with the dataset." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import os\n", "import fiftyone as fo\n", "import fiftyone.types as fot\n", "import pandas as pd\n", "import re" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Defining Path for Dataset and Checking if Dataset Exists\n", "Here, we define the path to the dataset and ensure we are working with a clean dataset by checking whether a dataset with the same name already exists. If it does, it is deleted to prevent conflicts.\n", "\n", "For educational purposes, we use the [GMDCSA24 Dataset](https://doi.org/10.5281/zenodo.12921216), a dataset specifically designed for elderly fall detection and Activities of Daily Living (ADLs). Additional information can be found on the dataset’s [GitHub Project Page](https://github.com/ekramalam/GMDCSA24-A-Dataset-for-Human-Fall-Detection-in-Videos) and in the associated [Scientific Paper](https://www.sciencedirect.com/science/article/pii/S2352340924008552)." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "# Define the path to your dataset\n", "dataset_path = \"/path/to/the/GMDCSA24/folder\" # Replace with the actual path\n", "dataset_name = \"ADL_Fall_Videos\"\n", "\n", "# Check if the dataset already exists\n", "if fo.dataset_exists(dataset_name):\n", " # Delete the existing dataset\n", " fo.delete_dataset(dataset_name)\n", "\n", "# Create a FiftyOne dataset\n", "fo_dataset = fo.Dataset(dataset_name)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Helper Functions\n", "This section defines two helper functions that are essential for processing the dataset:\n", "\n", "1. Function to Parse the Classes:\n", "\n", "- Extracts action names and their corresponding time ranges from the dataset.\n", "- Splits each video into shorter clips based on these time ranges to prepare a new dataset focused on individual actions.\n", "\n", "2. Function to Get Category per Action:\n", "\n", "- Maps actions to predefined categories based on their type.\n", "- Categorization is essential to meet one of the challenge goals: grouping actions into higher-level classifications.\n", "\n", "\n", "\n", "
\n", "\n", "| **Category** | **Actions** |\n", "|------------------------------------|-----------------------------------------------------------------------------------------------|\n", "| **Locomotion and Posture Transitions** | Walking, Sitting down / Standing up, Getting up / Lying down, Exercising, Looking for something |\n", "| **Object Manipulation** | Spreading bedding / Folding bedding, Wiping table, Cleaning dishes, Cooking, Vacuuming the floor |\n", "| **Hygiene and Personal Care** | Washing hands, Brushing teeth, Taking medicine |\n", "| **Eating and Drinking** | Eating, Drinking |\n", "| **Communication and Gestures** | Talking, Phone call, Waving a hand, Shaking hands, Hugging |\n", "| **Leisure and Stationary Actions** | Reading, Watching TV |\n", "\n", "
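The table above is implemented in the next cell as an `if/elif` chain inside `get_category()`. Purely as an illustrative alternative (not part of the original notebook), the same mapping can also be written as a dictionary lookup; the category and action names below come straight from the table, everything else is a sketch you could adapt:

```python
# Optional, dictionary-based sketch of the same category mapping shown in the table
CATEGORY_MAP = {
    "Locomotion and Posture Transitions": [
        "Walking", "Sitting down / Standing up", "Getting up / Lying down",
        "Exercising", "Looking for something",
    ],
    "Object Manipulation": [
        "Spreading bedding / Folding bedding", "Wiping table",
        "Cleaning dishes", "Cooking", "Vacuuming the floor",
    ],
    "Hygiene and Personal Care": ["Washing hands", "Brushing teeth", "Taking medicine"],
    "Eating and Drinking": ["Eating", "Drinking"],
    "Communication and Gestures": ["Talking", "Phone call", "Waving a hand", "Shaking hands", "Hugging"],
    "Leisure and Stationary Actions": ["Reading", "Watching TV"],
}

# Invert the mapping once so each lookup is a simple dict access
ACTION_TO_CATEGORY = {
    action: category
    for category, actions in CATEGORY_MAP.items()
    for action in actions
}

def get_category_from_map(action):
    """Return the category for an action, or 'Unknown' if it is not listed in the table."""
    return ACTION_TO_CATEGORY.get(action, "Unknown")
```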
" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "# Function to parse the Classes column\n", "def parse_classes(classes_str):\n", " actions = []\n", " if pd.isna(classes_str):\n", " return actions\n", "\n", " # Split by ';' to handle multiple actions\n", " class_entries = classes_str.split(';')\n", " for entry in class_entries:\n", " match = re.match(r\"(.+?)\\[(.+?)\\]\", entry.strip())\n", " if match:\n", " action = match.group(1).strip() # Extract action name\n", " time_ranges = match.group(2).strip() # Extract time ranges within brackets\n", "\n", " #print(\"Action=\", action)\n", " #print(\"Time_Group=\", time_ranges)\n", "\n", " # Split time ranges by ';' and process each range\n", " ranges = time_ranges.split(';')\n", " #print(ranges)\n", " for time_range in ranges:\n", " time_match = re.match(r\"(\\d+(\\.\\d+)?) to (\\d+(\\.\\d+)?)\", time_range.strip())\n", " if time_match:\n", " start_time = float(time_match.group(1))\n", " #print(\"Starttime=\", start_time)\n", " end_time = float(time_match.group(3))\n", " #print(\"Endtime=\", end_time)\n", "\n", " # Ensure start_time is less than or equal to end_time\n", " if start_time > end_time:\n", " continue # Skip invalid ranges\n", "\n", " actions.append({\"action\": action, \"start_time\": start_time, \"end_time\": end_time})\n", "\n", " return actions\n", "\n", "# Function to assign categories based on actions\n", "def get_category(action):\n", " locomotion = [\"Walking\", \"Sitting down / Standing up\", \"Getting up / Lying down\", \"Exercising\", \"Looking for something\"]\n", " manipulation = [\"Spreading bedding / Folding bedding\", \"Wiping table\", \"Cleaning dishes\", \"Cooking\", \"Vacuuming the floor\"]\n", " hygiene = [\"Washing hands\", \"Brushing teeth\", \"Taking medicine\"]\n", " eating_drinking = [\"Eating\", \"Drinking\"]\n", " communication = [\"Talking\", \"Phone call\", \"Waving a hand\", \"Shaking hands\", \"Hugging\"]\n", " leisure = [\"Reading\", \"Watching TV\"]\n", "\n", " if action in locomotion:\n", " return \"Locomotion and Posture Transitions\"\n", " elif action in manipulation:\n", " return \"Object Manipulation\"\n", " elif action in hygiene:\n", " return \"Hygiene and Personal Care\"\n", " elif action in eating_drinking:\n", " return \"Eating and Drinking\"\n", " elif action in communication:\n", " return \"Communication and Gestures\"\n", " elif action in leisure:\n", " return \"Leisure and Stationary Actions\"\n", " else:\n", " return \"Unknown\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Iteration in the Main Folders, Per Subject, and Splitting Video by Actions Using FiftyOne\n", "This section iterates through the dataset folder structure to:\n", "\n", "- Process each subject’s actions and assign relevant metadata, including categories.\n", "- Split videos into clips using FiftyOne’s advanced capabilities.\n", "\n", "
\n", "Note: The implementation of this section may vary depending on the dataset structure you are working with.\n", "
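Since the annotation format of the `Classes` column can vary between datasets, it can also help to sanity-check the `parse_classes()` helper defined above on a small, made-up label string before running the full loop below. The example string here is only an assumption based on the regex in `parse_classes()`; your CSV files may use a different format:

```python
# Hypothetical Classes string, used only to sanity-check the parser defined above
example_classes = "Walking[0.0 to 3.5];Drinking[4.0 to 9.2]"

print(parse_classes(example_classes))
# Expected output, given the regex in parse_classes():
# [{'action': 'Walking', 'start_time': 0.0, 'end_time': 3.5},
#  {'action': 'Drinking', 'start_time': 4.0, 'end_time': 9.2}]
```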
" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/Users/paularamos/Downloads/EAR_Datasets_Temp/GMDCSA24/Subject 2/ADL.csv\n", "/Users/paularamos/Downloads/EAR_Datasets_Temp/GMDCSA24/Subject 2/Fall.csv\n", "/Users/paularamos/Downloads/EAR_Datasets_Temp/GMDCSA24/Subject 3/ADL.csv\n", "/Users/paularamos/Downloads/EAR_Datasets_Temp/GMDCSA24/Subject 3/Fall.csv\n", "/Users/paularamos/Downloads/EAR_Datasets_Temp/GMDCSA24/Subject 4/ADL.csv\n", "/Users/paularamos/Downloads/EAR_Datasets_Temp/GMDCSA24/Subject 4/Fall.csv\n", "/Users/paularamos/Downloads/EAR_Datasets_Temp/GMDCSA24/Subject 1/ADL.csv\n", "/Users/paularamos/Downloads/EAR_Datasets_Temp/GMDCSA24/Subject 1/Fall.csv\n" ] } ], "source": [ "# Iterate through the main folders (one per subject)\n", "for subject_folder in os.listdir(dataset_path):\n", " subject_path = os.path.join(dataset_path, subject_folder)\n", "\n", " if not os.path.isdir(subject_path):\n", " continue\n", "\n", " # Extract the subject number from the folder name\n", " subject_number = subject_folder.split(\"_\")[-1] # Adjust the split logic if needed\n", "\n", " # Look for ADL and Fall folders and CSV files\n", " adl_folder = os.path.join(subject_path, \"ADL\")\n", " fall_folder = os.path.join(subject_path, \"Fall\")\n", "\n", " label_files = [f for f in os.listdir(subject_path) if f.endswith(\".csv\")]\n", "\n", " # Load metadata from CSV files\n", " for label_file in label_files:\n", " label_path = os.path.join(subject_path, label_file)\n", " metadata = pd.read_csv(label_path)\n", " print(label_path)\n", "\n", " for _, row in metadata.iterrows():\n", " file_name = row[\"File Name\"]\n", " length = row[\"Length (seconds)\"]\n", " time_of_recording = row[\"Time of Recording\"]\n", " attire = row[\"Attire\"]\n", " description = row[\"Description\"]\n", " classes = row[\" Classes\"]\n", "\n", " # Parse the Classes column\n", " parsed_classes = parse_classes(classes)\n", "\n", " # Determine the file's path\n", " if \"ADL\" in label_path:\n", " video_path = os.path.join(adl_folder, file_name)\n", " subset = \"ADL\"\n", " elif \"Fall\" in label_path:\n", " video_path = os.path.join(fall_folder, file_name)\n", " subset = \"Fall\"\n", " else:\n", " continue\n", "\n", " if not os.path.exists(video_path):\n", " print(f\"Video file not found: {video_path}\")\n", " continue\n", " \n", " # Create a FiftyOne sample\n", " metadata = fo.VideoMetadata.build_for(video_path)\n", " sample = fo.Sample(filepath=video_path, metadata=metadata)\n", " \n", " #temporaldetection using actions detections on labeled dataset\n", " temp_detections = []\n", " \n", " for action in parsed_classes:\n", " start_time = float(action[\"start_time\"])\n", " end_time = float(action[\"end_time\"])\n", "\n", " # Check if end_time exceeds video duration\n", " if end_time > metadata.duration:\n", " end_time = metadata.duration\n", "\n", " event = fo.TemporalDetection.from_timestamps(\n", " [start_time, end_time],\n", " label=action[\"action\"],\n", " sample=sample,\n", " )\n", " temp_detections.append(event)\n", " \n", " sample[\"events\"] = fo.TemporalDetections(detections=temp_detections)\n", " \n", " # Add metadata to the sample\n", " sample[\"subset\"] = subset\n", " sample[\"subject_number\"] = subject_number\n", " sample[\"length\"] = length\n", " sample[\"time_of_recording\"] = time_of_recording\n", " sample[\"attire\"] = attire\n", " sample[\"description\"] = description\n", " sample[\"classes\"] = classes\n", " 
#sample[\"events\"] = events\n", "\n", " # Assign category based on actions\n", " categories = [get_category(action[\"action\"]) for action in parsed_classes]\n", " sample[\"category\"] = list(set(categories)) # Deduplicate categories\n", "\n", " # Add the sample to the dataset\n", " fo_dataset.add_sample(sample)\n", " fo_dataset.compute_metadata()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Check the dataset in the FiftyOne APP\n", "Launching FiftyOne in the browser allows you to visually explore the dataset and its metadata. You can:\n", "\n", "- Modify fields in the metadata as needed.\n", "- Split videos into clips or adjust metadata using the FiftyOne API.\n", "- Use FiftyOne's metadata documentation [here](https://docs.voxel51.com/user_guide/basics.html#metadata) and its guide on creating [clips](https://docs.voxel51.com/user_guide/using_views.html#clip-views) for additional details." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "
\n", "\n", "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "session = fo.launch_app(fo_dataset)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Creating Clips Based on Individual Actions\n", "\n", "Using the ```\"events\"``` field in each sample, you can split videos into clips based on their specific actions. The ```to_clips()``` function in FiftyOne creates a view with one sample per clip, as defined by the given field or expression in the video collection.\n", "\n", "More documentation can be found [here](https://docs.voxel51.com/api/fiftyone.core.clips.html?highlight=to_clip#fiftyone.core.clips.ClipsView.to_clips).\n", "\n", "
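If you are only interested in certain actions, the clips view created in the next cell can be filtered by label before you visualize or export it. A minimal sketch using FiftyOne's `ViewField` and `match()`; the `"Walking"` label is just an example taken from the category table above, and the snippet should be run after the `to_clips()` cell below:

```python
from fiftyone import ViewField as F

# Keep only the clips whose event label is "Walking" (any action label works here)
walking_clips = view.match(F("events.label") == "Walking")

print(len(walking_clips))     # number of matching clips
session.view = walking_clips  # optionally browse just these clips in the App
```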
\n", "Note: After running this section, check the \"events\" labels in the metadata menu. Clicking on each event will display the specific section of the video where the action occurs.\n", "
\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Dataset: ADL_Fall_Videos\n", "Media type: video\n", "Num clips: 335\n", "Clip fields:\n", " id: fiftyone.core.fields.ObjectIdField\n", " sample_id: fiftyone.core.fields.ObjectIdField\n", " filepath: fiftyone.core.fields.StringField\n", " support: fiftyone.core.fields.FrameSupportField\n", " tags: fiftyone.core.fields.ListField(fiftyone.core.fields.StringField)\n", " metadata: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.metadata.VideoMetadata)\n", " created_at: fiftyone.core.fields.DateTimeField\n", " last_modified_at: fiftyone.core.fields.DateTimeField\n", " events: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Classification)\n", "Frame fields:\n", " id: fiftyone.core.fields.ObjectIdField\n", " frame_number: fiftyone.core.fields.FrameNumberField\n", " created_at: fiftyone.core.fields.DateTimeField\n", " last_modified_at: fiftyone.core.fields.DateTimeField\n", "View stages:\n", " 1. ToClips(field_or_expr='events', config=None)\n" ] } ], "source": [ "view = fo_dataset.to_clips(\"events\")\n", "session.view = view\n", "print(view)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Exporting Dataset into a Video Classification Directory Tree\n", "\n", "To simplify the dataset structure, we export the GMDCSA24 Dataset as a classification dataset. The directory tree will reflect the individual labels, making it easier to train models.\n", "\n", "Using the view created from the ```\"events\"``` field, we export the dataset in the ```fo.types.VideoClassificationDirectoryTree``` format. This structure is ideal for machine learning workflows." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 100% |█████████████████| 335/335 [2.5m elapsed, 0s remaining, 3.5 samples/s] \n" ] } ], "source": [ "view.export(\n", " export_dir=\"/path/to/the/GMDCSA24/new_folder\",\n", " dataset_type=fo.types.VideoClassificationDirectoryTree,\n", ")\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## New dataset\n", "\n", "Cloning the view creates a new dataset containing a copy of its contents."
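One note before cloning: cloned datasets are not persistent by default (the summary printed by the next cell shows `Persistent: False`), so they are deleted when the FiftyOne database shuts down. If you want to keep the clone around for later sessions, you can mark it persistent after creating it; a minimal sketch (the dataset name used here is arbitrary):

```python
# Run after the clone() cell below if you want the dataset to survive future sessions
new_dataset.persistent = True
new_dataset.name = "GMDCSA24_action_clips"  # optional: a more descriptive, arbitrary name
```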
] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Name: 2025.01.03.09.09.28\n", "Media type: video\n", "Num samples: 335\n", "Persistent: False\n", "Tags: []\n", "Sample fields:\n", " id: fiftyone.core.fields.ObjectIdField\n", " filepath: fiftyone.core.fields.StringField\n", " tags: fiftyone.core.fields.ListField(fiftyone.core.fields.StringField)\n", " metadata: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.metadata.VideoMetadata)\n", " created_at: fiftyone.core.fields.DateTimeField\n", " last_modified_at: fiftyone.core.fields.DateTimeField\n", " sample_id: fiftyone.core.fields.ObjectIdField\n", " support: fiftyone.core.fields.FrameSupportField\n", " events: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Classification)\n", "Frame fields:\n", " id: fiftyone.core.fields.ObjectIdField\n", " frame_number: fiftyone.core.fields.FrameNumberField\n", " created_at: fiftyone.core.fields.DateTimeField\n", " last_modified_at: fiftyone.core.fields.DateTimeField\n" ] } ], "source": [ "new_dataset = view.clone()\n", "print(new_dataset)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Exporting Dataset to FiftyOneDataset\n", "\n", "FiftyOne supports various dataset formats. In this notebook, we’ve worked with a custom dataset and added each sample manually. Now, we export it into a FiftyOne-compatible dataset to leverage additional capabilities.\n", "\n", "For more details on the dataset types supported by FiftyOne, refer to this [documentation](https://docs.voxel51.com/api/fiftyone.types.dataset_types.html?highlight=dataset%20type#module-fiftyone.types.dataset_types)." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Exporting samples...\n", " 100% |████████████████████| 335/335 [830.3ms elapsed, 0s remaining, 403.5 docs/s] \n", "Exporting frames...\n", " 100% |████████████████████████| 0/0 [185.3us elapsed, ? remaining, ? docs/s] \n" ] } ], "source": [ "export_dir = \"/path/to/the/GMDCSA24/new_folder_FO_Dataset\"\n", "new_dataset.export(\n", " export_dir=export_dir,\n", " dataset_type=fo.types.FiftyOneDataset,\n", ")" ] } ], "metadata": { "kernelspec": { "display_name": "fo_oss_env", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.11" } }, "nbformat": 4, "nbformat_minor": 2 }