Efficient 2.5D Hand Pose Estimation via Auxiliary Multi-Task Training for Embedded Devices

Chidananda, Prajwal; Sinha, Ayan; Rao, Adithya; Lee, Douglas; Rabinovich, Andrew


arXiv:1909.05897 (cs)

[Submitted on 12 Sep 2019]

Title: Efficient 2.5D Hand Pose Estimation via Auxiliary Multi-Task Training for Embedded Devices

Authors: Prajwal Chidananda, Ayan Sinha, Adithya Rao, Douglas Lee, Andrew Rabinovich (Magic Leap, Inc)


Abstract: 2D key-point estimation is an important precursor to 3D pose estimation problems for the human body and hands. In this work, we discuss the data, architecture, and training procedure necessary to deploy extremely efficient 2.5D hand pose estimation on embedded devices with a highly constrained memory and compute envelope, such as AR/VR wearables. Our 2.5D hand pose estimation consists of 2D key-point estimation of joint positions on an egocentric image captured by a depth sensor, which are then lifted to 2.5D using the corresponding depth values. Our contributions are twofold: (a) We discuss data labeling and augmentation strategies, and the modules in the network architecture, that collectively lead to $3\%$ of the FLOP count and $2\%$ of the number of parameters of the state-of-the-art MobileNetV2 architecture. (b) We propose an auxiliary multi-task training strategy needed to compensate for the small capacity of the network while achieving performance comparable to MobileNetV2. Our 32-bit trained model has a memory footprint of less than 300 kilobytes and operates at more than 50 Hz with less than 35 MFLOPs.
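
The lifting step mentioned in the abstract (attaching a depth value to each estimated 2D key-point) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function name `lift_to_2p5d`, the nearest-pixel lookup, and the clamping of out-of-bounds key-points are all hypothetical choices, and the abstract does not say how depth is sampled.

```python
import numpy as np


def lift_to_2p5d(keypoints_uv, depth_map):
    """Attach a depth value to each 2D key-point: (u, v) -> (u, v, d).

    keypoints_uv: array-like of shape (K, 2), pixel coordinates (u=column, v=row).
    depth_map:    array of shape (H, W) aligned with the image the
                  key-points were estimated on.
    """
    keypoints_uv = np.asarray(keypoints_uv, dtype=np.float32)
    height, width = depth_map.shape

    # Assumption: sample the nearest pixel and clamp to the image bounds.
    cols = np.clip(np.rint(keypoints_uv[:, 0]).astype(int), 0, width - 1)
    rows = np.clip(np.rint(keypoints_uv[:, 1]).astype(int), 0, height - 1)

    depths = depth_map[rows, cols].astype(np.float32)
    return np.column_stack([keypoints_uv, depths])


if __name__ == "__main__":
    # Toy 3x4 depth map and two made-up key-points.
    depth = np.arange(12, dtype=np.float32).reshape(3, 4)
    print(lift_to_2p5d([[1.2, 0.9], [3.6, 2.0]], depth))
```

A real pipeline would likely also need to handle invalid or missing depth readings at a key-point location; that is left out here because the abstract gives no detail on it.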

Comments:	CVPR Workshop on Computer Vision for Augmented and Virtual Reality, Long Beach, CA, 2019
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
Cite as:	arXiv:1909.05897 [cs.CV]
	(or arXiv:1909.05897v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1909.05897

Submission history

From: Prajwal Chidananda
[v1] Thu, 12 Sep 2019 18:33:05 UTC (628 KB)
