Understanding Images of People with Social Context

Andrew Gallagher


Date: January 7, 2015


When we see other people, we can quickly judge their demographic attributes and, if they are familiar to us, their identity. We can also answer questions about the activities of, and relationships between, the people in an image. We draw these conclusions not just from what we see, but also from a lifetime of experience living and interacting with other people. Even simple common-sense knowledge, such as the fact that children are smaller than adults, helps us better understand the roles of the people we see. In this work, we propose contextual features for modelling social context, drawn from a variety of public sources, along with models for understanding images of people. The objective is to give computers access to the same contextual information that humans use.

Computer vision and data-driven image analysis can play a role in helping us learn about people. We are now able to see millions of candid and posed images of people on the Internet. We can describe people with a vector of possible first names, and automatically produce descriptions of particular people in an image. From a broad perspective, this work presents a loop: our knowledge about people can help computer vision algorithms, and computer vision can in turn help us learn more about people.

Further Information:

Andy is a Senior Software Engineer with Google, working with geo-referenced imagery. Previously, he was a Visiting Research Scientist at Cornell University’s School of Electrical and Computer Engineering, and part of a computer vision start-up, TaggPic, that identified landmarks in images. He earned his Ph.D. in electrical and computer engineering from Carnegie Mellon University in 2009, advised by Prof. Tsuhan Chen. Andy worked for the Eastman Kodak Company from 1996 to 2012, initially developing computational photography and computer vision algorithms for digital photofinishing, such as dynamic range compression, red-eye correction and face recognition.
