by Li Sun, Ulrich Klank and Michael Beetz
Abstract:
This paper investigates the inside-out recognition of everyday manipulation tasks using a gaze-directed camera, which is a camera that is actively directed at the visual attention focus of the person wearing the camera. We present EYEWATCHME, an integrated vision and state estimation system that simultaneously tracks the positions and the poses of the acting hands, the pose of the manipulated object, and the pose of the observing camera. Taken together, EYEWATCHME provides comprehensive data for learning predictive models of vision-guided manipulation that include the objects people are attending to, the interaction of attention and reaching/grasping, and the segmentation of reaching and grasping using visual attention as evidence. Key technical contributions of this paper include an ego-view hand tracking system that estimates 27 DOF hand poses. The hand tracking system is capable of detecting hands and estimating their poses despite substantial self-occlusion caused by the hand and occlusions caused by the manipulated object. EYEWATCHME can also cope with blurred images that are caused by rapid eye movements. The second key contribution is the integrated activity recognition system that simultaneously tracks the attention of the person, the hand poses, and the poses of the manipulated objects in global scene coordinates. We demonstrate the operation of EYEWATCHME in the context of kitchen tasks including filling a cup with water.
Reference:
Li Sun, Ulrich Klank and Michael Beetz, "EYEWATCHME - 3D Hand and object tracking for inside out activity analysis", In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2009. CVPR 2009., pp. 9-16, 2009.
Bibtex Entry:
@InProceedings{sunli-2009-cvprws,
title={EYEWATCHME - 3D Hand and object tracking for inside out activity analysis},
author={Li Sun and Ulrich Klank and Michael Beetz},
booktitle={IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2009. CVPR 2009.},
year={2009},
month={June},
pages={9--16},
abstract={This paper investigates the inside-out recognition of everyday manipulation tasks using a gaze-directed camera, which is a camera that is actively directed at the visual attention focus of the person wearing the camera. We present EYEWATCHME, an integrated vision and state estimation system that simultaneously tracks the positions and the poses of the acting hands, the pose of the manipulated object, and the pose of the observing camera. Taken together, EYEWATCHME provides comprehensive data for learning predictive models of vision-guided manipulation that include the objects people are attending to, the interaction of attention and reaching/grasping, and the segmentation of reaching and grasping using visual attention as evidence. Key technical contributions of this paper include an ego-view hand tracking system that estimates 27 DOF hand poses. The hand tracking system is capable of detecting hands and estimating their poses despite substantial self-occlusion caused by the hand and occlusions caused by the manipulated object. EYEWATCHME can also cope with blurred images that are caused by rapid eye movements. The second key contribution is the integrated activity recognition system that simultaneously tracks the attention of the person, the hand poses, and the poses of the manipulated objects in global scene coordinates. We demonstrate the operation of EYEWATCHME in the context of kitchen tasks including filling a cup with water.},
keywords={computer graphics, human computer interaction, image restoration, image segmentation, image sensors, object recognition, tracking, 3D hand tracking, 3D object tracking, EYEWATCHME, blurred images, gaze-directed camera, grasping segmentation, inside out activity analysis, integrated activity recognition system, reaching segmentation, state estimation system, substantial self-occlusion, vision-guided manipulation},
doi={10.1109/CVPR.2009.5204358},
ISSN={1063-6919},
bib2html_pubtype = {Workshop Paper},
bib2html_groups = {Cop},
bib2html_rescat = {Perception},
bib2html_domain = {Assistive Household}
}