Inference of sensitive data from location information
Location information (e.g., GPS trajectories) is among the most valuable kinds of
data that companies wish to monetize by sharing it with other entities that have the expertise to analyze it.
Location information is also highly sensitive.
The list of places a person visits can reveal, among other things, their religion,
political beliefs, or information about their health.
While such information is often easy to infer (e.g., did the person stop near a
hospital?), it can be less apparent in other cases, where it emerges only from subtle correlations in the data.
For example, when the New York City Taxi and Limousine Commission published a
dataset of every yellow cab ride in New York in 2013, Muslim taxi drivers
were easy to identify as they stopped five times per day for more than 20 minutes to pray [1].
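The stop pattern described above can be expressed as a simple rule over the gaps between consecutive rides. The following is a minimal sketch, not the method used in [1]: the function names, the input format (a list of pickup/drop-off timestamps for one driver), and the choice of "at least five" breaks per day are assumptions made for illustration. A rule this crude would also flag meal breaks, shift changes, and slow periods, which is part of why a learned model is of interest.

```python
from collections import defaultdict
from datetime import datetime, timedelta

MIN_BREAK = timedelta(minutes=20)  # threshold taken from the text
BREAKS_PER_DAY = 5                 # count taken from the text


def long_breaks_per_day(trips):
    """Count long idle gaps per calendar day for one driver.

    trips: list of (pickup, dropoff) datetime pairs (hypothetical format).
    Returns {date: number of gaps between consecutive trips > MIN_BREAK}.
    Gaps that span midnight are ignored.
    """
    trips = sorted(trips)
    counts = defaultdict(int)
    for (_, prev_dropoff), (next_pickup, _) in zip(trips, trips[1:]):
        same_day = prev_dropoff.date() == next_pickup.date()
        if same_day and next_pickup - prev_dropoff > MIN_BREAK:
            counts[prev_dropoff.date()] += 1
    return dict(counts)


def matches_pattern(trips, min_days=1):
    """True if at least min_days days show BREAKS_PER_DAY or more long gaps."""
    counts = long_breaks_per_day(trips)
    days = sum(1 for c in counts.values() if c >= BREAKS_PER_DAY)
    return days >= min_days
```

In practice, min_days would presumably be set well above 1, so that a single unusual day does not trigger the rule.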
The task is to automate such privacy attacks, i.e., to develop a tool
that uses machine learning to infer sensitive information, such as religion or health/sex life,
from location information.
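One plausible first step for such a tool is to turn raw stops into features, for instance by counting how often a person stops near a place of a given category (the "did the person stop near a hospital?" question). The sketch below is only an illustration under stated assumptions: the function names, the 100 m radius, and the input formats are hypothetical, and the list of labeled places would have to come from an external source. The resulting counts could then serve as input to a classifier.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def visits_near(stops, places, radius_m=100.0):
    """Count stops that fall within radius_m of each labeled place.

    stops:  list of (lat, lon) pairs (hypothetical format).
    places: list of (label, lat, lon) tuples (hypothetical format).
    Returns {label: number of stops within radius_m of a place with that label}.
    """
    counts = {}
    for lat, lon in stops:
        for label, plat, plon in places:
            if haversine_m(lat, lon, plat, plon) <= radius_m:
                counts[label] = counts.get(label, 0) + 1
    return counts
```

For example, with a made-up place ("hospital", 40.0, -74.0), a stop at (40.0005, -74.0) lies roughly 56 m away and is counted, while a stop at (40.01, -74.0) lies more than 1 km away and is not.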
Required skills: programming (preferably Python)
Preferred (but not required) skills: machine learning
[1] http://www.theiii.org/index.php/997/using-nyc-taxi-data-to-identify-muslim-taxi-drivers/