Training an object class detector typically requires a large set of images annotated with bounding-boxes, which is expensive and time consuming to create.
Training an object class detector typically requires a large set of images annotated with bounding-boxes, which is expensive and time consuming to create. We propose novel approach to annotate object locations which can substantially reduce annotation time. We first track the eye movements of annotators instructed to find the object and then propose a technique for deriving object bounding-boxes from these fixations. To validate our idea, we collected eye tracking data for the trainval part of 10 object classes of Pascal VOC 2012 (6,270 images, 5 observers). Our technique correctly produces bounding-boxes in 50%of the images, while reducing the total annotation time by factor 6.8× compared to drawing bounding-boxes. Any standard object class detector can be trained on the bounding-boxes predicted by our model. Our large scale eye tracking dataset is available at groups.inf.ed.ac.uk/calvin/eyetrackdataset/ .
http://link.springer.com/chapter/10.1007%2F978-3-319-10602-1_24