This is a past event
Hands are the primary tools with which humans interact with their environment. Recognizing hand poses and actions (gestures) from images and videos is therefore essential to comprehensively understanding everyday human intentions. The problem becomes challenging when more than one hand appears in an image, because of severe occlusions and the higher complexity of the task. We will present our recent works published in top computer vision conferences (ICCV'21, CVPR'23 and CVPR'24) on 3D pose estimation and action recognition involving multiple hands and an interacting object. In ICCV'21, we presented a pipeline that recognizes the two hands in an image. In CVPR'23, we presented a Transformer-based unified framework that recognizes the 3D poses of two hands and an object, and their interaction classes, from videos. In CVPR'24, we presented a diffusion-based pipeline that generates appropriate 3D hand-object interaction motions from a given interaction description.
- Speaker: Seungryul Baek
- Venue: Meston G05