Privacy Risk in Machine Learning Pipelines

Principal Investigator(s): 
Michael Tschantz

ICSI researchers are working with researchers at Carnegie Mellon University on tracking private data through machine learning pipelines. They will develop stronger notions of proxy, ones that account for why a classifier uses a piece of information, by:

  1. Enumerating the weaknesses of notions of proxy that treat a proxy as merely a feature correlated with sensitive features
  2. Developing methods for examining training data to determine what impact the correlations in it have on the selection, as proxies, of features correlated with sensitive features
  3. Developing the causal models needed to make the properties of the methods developed precise

Funded by DARPA.