
Overview

We introduce a Graph Convolutional Network (GCN) as a plug-in block that refines the 3D poses estimated by an existing model. Our GCN is trained on the BlendMimic3D dataset, which provides a diverse range of occlusion scenarios. This allows the network to learn and adapt to various occlusion types, refining the pose estimates for occluded joints.

Our graph dynamics model groups the neighboring nodes of each keypoint into six classes:

  1. Center node (red)
  2. Physically-connected node closer to the spine (blue)
  3. Physically-connected node farther from the spine (green)
  4. Symmetric node (pink)
  5. Time-forward node (orange)
  6. Time-backward node (yellow)

Figure: Illustration of the graph dynamics.
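
The six neighbor classes listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the six-joint skeleton, the `PARENT` and `SYMMETRIC` tables, and the `neighbor_classes` helper are all hypothetical, and "closer to the spine" is read here as "the parent joint in a tree rooted at the spine".

```python
# Hypothetical toy skeleton, used only for illustration (not the authors' joint layout).
# PARENT[j] is the joint one step closer to the root (the spine); -1 marks the root.
JOINTS = ["spine", "head", "l_shoulder", "l_elbow", "r_shoulder", "r_elbow"]
PARENT = [-1, 0, 0, 2, 0, 4]
# Assumed left/right counterparts; joints on the midline have no symmetric node.
SYMMETRIC = {2: 4, 4: 2, 3: 5, 5: 3}

CLASSES = [
    "center",
    "closer_to_spine",
    "farther_from_spine",
    "symmetric",
    "time_forward",
    "time_backward",
]


def neighbor_classes(joint, frame, num_frames):
    """Return {class_name: [(frame, joint), ...]} for one spatio-temporal node."""
    groups = {name: [] for name in CLASSES}
    groups["center"].append((frame, joint))
    # Spatial neighbors in the same frame.
    if PARENT[joint] >= 0:
        groups["closer_to_spine"].append((frame, PARENT[joint]))
    for child, parent in enumerate(PARENT):
        if parent == joint:
            groups["farther_from_spine"].append((frame, child))
    if joint in SYMMETRIC:
        groups["symmetric"].append((frame, SYMMETRIC[joint]))
    # Temporal neighbors: the same joint in the adjacent frames, when they exist.
    if frame + 1 < num_frames:
        groups["time_forward"].append((frame + 1, joint))
    if frame - 1 >= 0:
        groups["time_backward"].append((frame - 1, joint))
    return groups


if __name__ == "__main__":
    for name, nodes in neighbor_classes(joint=2, frame=1, num_frames=3).items():
        print(name, [(t, JOINTS[j]) for t, j in nodes])
```

In a partitioned graph convolution, each class would typically get its own weight matrix, so the grouping above decides which weights are applied to which neighbor. Nodes at the sequence boundaries or on the body midline simply have empty classes in this sketch.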

Results

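On the BlendMimic3D test set, which targets occlusion scenarios, adding the GCN block lowers both MPJPE and P-MPJPE for every model and 2D detector combination we evaluated; for example, MPJPE for VideoPose3D with CPN detections drops from 175.0 mm to 112.7 mm. On Human3.6M, errors increase in most configurations, with PoseFormerV2 on Detectron2 detections as the exception. Below, we provide an overview of both CPN-based and Detectron2-based detections on the Human3.6M and BlendMimic3D test sets.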

All values are Avg ± σ [mm].

| Model | 2D HPE | Human3.6M MPJPE | Human3.6M P-MPJPE | BlendMimic3D MPJPE | BlendMimic3D P-MPJPE |
|---|---|---|---|---|---|
| VideoPose3D | CPN | 47.8 ± 9.29 | 37.4 ± 7.10 | 175.0 ± 7.20 | 112.0 ± 8.42 |
| + GCN | CPN | 56.3 ± 9.33 | 42.4 ± 7.06 | 112.7 ± 6.76 | 87.2 ± 5.29 |
| VideoPose3D | Detectron2 | 57.3 ± 9.96 | 43.6 ± 8.14 | 198.0 ± 7.88 | 122.5 ± 3.67 |
| + GCN | Detectron2 | 59.7 ± 10.35 | 44.1 ± 8.08 | 127.7 ± 11.42 | 95.8 ± 6.90 |
| PoseFormerV2 | CPN | 46.0 ± 9.08 | 36.4 ± 7.05 | 148.6 ± 8.00 | 107.7 ± 5.78 |
| + GCN | CPN | 49.6 ± 9.85 | 37.3 ± 7.16 | 107.5 ± 2.03 | 81.6 ± 4.76 |
| PoseFormerV2 | Detectron2 | 76.5 ± 19.97 | 55.6 ± 11.97 | 155.0 ± 9.35 | 112.2 ± 7.90 |
| + GCN | Detectron2 | 60.9 ± 11.55 | 44.6 ± 9.29 | 106.9 ± 8.13 | 76.5 ± 7.04 |
| D3DP | CPN | 41.4 ± 8.19 | 33.2 ± 6.42 | 100.7 ± 7.94 | 79.0 ± 5.88 |
| + GCN | CPN | 56.2 ± 7.20 | 40.9 ± 6.22 | 95.3 ± 3.58 | 72.1 ± 4.09 |
| D3DP | Detectron2 | 51.9 ± 7.79 | 40.3 ± 7.01 | 99.9 ± 11.19 | 79.6 ± 8.08 |
| + GCN | Detectron2 | 58.7 ± 7.75 | 42.1 ± 6.89 | 95.3 ± 4.86 | 74.3 ± 4.50 |
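
For readers unfamiliar with the two metrics in the table, the following is a minimal NumPy sketch of how they are commonly computed for a single pose. MPJPE is the mean Euclidean distance between predicted and ground-truth joints; P-MPJPE is the same error after a similarity (Procrustes) alignment of the prediction to the ground truth. The function names and the single-pose `(J, 3)` input shape are assumptions made for illustration, and this is not necessarily the exact evaluation code behind the numbers above.

```python
import numpy as np


def mpjpe(pred, gt):
    """Mean per-joint position error for one pose; pred and gt are (J, 3) arrays."""
    return float(np.linalg.norm(pred - gt, axis=-1).mean())


def p_mpjpe(pred, gt):
    """MPJPE after aligning pred to gt with the best scale, rotation and translation."""
    mu_p, mu_g = pred.mean(axis=0), gt.mean(axis=0)
    p0, g0 = pred - mu_p, gt - mu_g
    # Optimal rotation from the SVD of the 3x3 cross-covariance matrix.
    u, s, vt = np.linalg.svd(p0.T @ g0)
    # Flip the last axis if needed so the result is a rotation, not a reflection.
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    signs = np.array([1.0, 1.0, d])
    rot = vt.T @ np.diag(signs) @ u.T
    # Least-squares scale for the centered, rotated prediction.
    scale = (s * signs).sum() / (p0 ** 2).sum()
    aligned = scale * p0 @ rot.T + mu_g
    return mpjpe(aligned, gt)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    gt = rng.normal(size=(17, 3))
    pred = gt + rng.normal(scale=0.05, size=gt.shape)
    print("MPJPE:", mpjpe(pred, gt))
    print("P-MPJPE:", p_mpjpe(pred, gt))
```

Because the alignment removes global scale, rotation, and translation, P-MPJPE is never larger than MPJPE for the same pose, which matches the pattern in every row of the table.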

Qualitative Results
