Introduction
In response to the growing demand for detailed human pose estimation (HPE) datasets, we introduce the BlendMimic3D dataset. BlendMimic3D is a synthetic dataset built to narrow the realism gap left by current HPE benchmarks: it supports both the training and the testing of pose estimation algorithms while covering occlusion-heavy scenarios of a complexity that existing datasets rarely reach.
Motivation
Traditional HPE datasets, while invaluable, often fall short of replicating the complex, occlusion-heavy environments encountered in real-world applications. The Human3.6M dataset, despite its contributions to the field, exemplifies these limitations. Using Blender, a leading open-source 3D computer graphics suite, we built BlendMimic3D specifically to address these gaps, offering scenarios that range from simple, controlled settings to intricate, occlusion-rich environments.
Integration with GCN Pose Refinement
The development of BlendMimic3D was strongly motivated by advances in pose refinement techniques, particularly Graph Convolutional Networks (GCNs). The GCN Pose Refinement Block is a cornerstone of our approach: it propagates information along the skeleton graph to correct joints that detectors handle poorly under occlusion. Training this GCN on BlendMimic3D yields clear accuracy improvements, especially in heavily occluded scenes.
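To make the refinement step concrete, the sketch below shows a minimal residual GCN refinement block in PyTorch. The joint layout (a 17-joint, Human3.6M-style parent list), the layer sizes, and the class names are illustrative assumptions, not the exact architecture used with BlendMimic3D.

```python
import torch
import torch.nn as nn

# Hypothetical 17-joint skeleton (Human3.6M-style parent indices); the
# actual BlendMimic3D joint layout may differ.
PARENTS = [-1, 0, 1, 2, 0, 4, 5, 0, 7, 8, 9, 8, 11, 12, 8, 14, 15]

def skeleton_adjacency(parents):
    """Build a row-normalized adjacency A_hat = D^-1 (A + I) over the skeleton."""
    n = len(parents)
    a = torch.eye(n)
    for child, parent in enumerate(parents):
        if parent >= 0:
            a[child, parent] = a[parent, child] = 1.0
    return a / a.sum(dim=1, keepdim=True)

class GraphConv(nn.Module):
    """One graph convolution: mix features along skeleton edges, then project."""
    def __init__(self, in_dim, out_dim, adj):
        super().__init__()
        self.register_buffer("adj", adj)
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x):            # x: (batch, joints, in_dim)
        return self.linear(self.adj @ x)

class GCNRefinementBlock(nn.Module):
    """Residual GCN block that refines an initial 3D pose estimate."""
    def __init__(self, adj, hidden=128):
        super().__init__()
        self.gc1 = GraphConv(3, hidden, adj)
        self.gc2 = GraphConv(hidden, 3, adj)
        self.act = nn.ReLU()

    def forward(self, pose3d):       # pose3d: (batch, joints, 3)
        residual = self.gc2(self.act(self.gc1(pose3d)))
        return pose3d + residual     # refined pose

adj = skeleton_adjacency(PARENTS)
block = GCNRefinementBlock(adj)
refined = block(torch.randn(2, 17, 3))   # e.g. two noisy pose estimates
```

The residual connection means the block only has to learn a correction to the input pose, which keeps refinement stable when the initial estimate is already close.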
Dataset Composition
BlendMimic3D is organized into the following components:
- Scenarios: From simple to complex, including extensive occlusion challenges.
- Subjects: Three distinct subjects.
- Actions: 14 unique actions per subject, captured in both single-person and multi-person setups.
- Videos: The dataset encompasses 128 videos, each with a duration of approximately 20 seconds (600 frames).
Technological Backbone
The dataset was created by positioning four cameras within a virtual Blender environment to capture the full range of each movement. Characters animated with Mixamo assets perform actions ranging from arguing to greeting, and their 3D poses are extracted and analyzed directly from the scene.
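As a rough illustration of that pipeline, the Blender (bpy) sketch below places a camera, reads world-space 3D keypoints from an armature's pose bones, and projects them into each camera's view with bpy_extras' world_to_camera_view. The object name "Armature" and the camera coordinates are placeholders; the actual scene setup in BlendMimic3D is more involved.

```python
import bpy
from bpy_extras.object_utils import world_to_camera_view

scene = bpy.context.scene

# Add one camera at an assumed position; BlendMimic3D uses four such cameras.
cam_data = bpy.data.cameras.new("Cam_0")
cam_obj = bpy.data.objects.new("Cam_0", cam_data)
cam_obj.location = (4.0, -4.0, 1.6)     # placeholder coordinates
scene.collection.objects.link(cam_obj)

# "Armature" is a placeholder name for the Mixamo-rigged character.
armature = bpy.data.objects["Armature"]

def keypoints_3d(armature):
    """World-space 3D position of every pose-bone head."""
    return {b.name: armature.matrix_world @ b.head for b in armature.pose.bones}

def keypoints_2d(scene, camera, points3d):
    """Project 3D keypoints to normalized view coordinates (x, y in [0, 1])."""
    return {name: world_to_camera_view(scene, camera, co)
            for name, co in points3d.items()}

pts3d = keypoints_3d(armature)
for cam in (obj for obj in scene.objects if obj.type == 'CAMERA'):
    pts2d = keypoints_2d(scene, cam, pts3d)
```

Running this per frame over an animated action yields the paired 2D/3D keypoint tracks that the dataset annotations are built from.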
Key Features
- Adaptability: BlendMimic3D is designed to remain applicable across occlusion scenarios of varying difficulty and across different HPE models.
- Comprehensive Metadata: Each video is accompanied by detailed metadata, including camera calibration parameters and the 2D and 3D positions of keypoints; a projection sketch using these parameters follows below.
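For instance, the calibration parameters let the 3D keypoints be projected onto each view and checked against the stored 2D annotations. The NumPy sketch below shows the standard pinhole projection; the intrinsic and extrinsic values here are made-up placeholders, and no assumption is made about the metadata's actual file format.

```python
import numpy as np

# Placeholder calibration; real values come from the per-video metadata.
K = np.array([[1145.0,    0.0, 512.0],
              [   0.0, 1145.0, 512.0],
              [   0.0,    0.0,   1.0]])   # intrinsics (assumed values)
R = np.eye(3)                             # world-to-camera rotation
t = np.array([0.0, 0.0, 5.0])             # world-to-camera translation

def project(points3d_world, K, R, t):
    """Pinhole projection: x = K (R X + t), then divide by depth."""
    cam = points3d_world @ R.T + t        # (N, 3) camera-space coordinates
    uv = cam @ K.T                        # homogeneous image coordinates
    return uv[:, :2] / uv[:, 2:3]         # (N, 2) pixel coordinates

joints3d = np.random.rand(17, 3)          # placeholder 3D keypoints
joints2d = project(joints3d, K, R, t)
```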