Just as Big Tobacco spent decades denying that smoking causes lung cancer, and Big Oil spent decades denying climate change, so Big Data has spent decades pretending that sensitive personal data can easily be ‘anonymised’ so it can be used as an industrial raw material.
Inference control research has washed through the security
community in four waves. The first wave, around 1980, was driven by
the needs of census agencies and led to the classical theory of
statistical disclosure control. The second, from the mid-1990s,
tackled the richer data in applications such as medical records. The
third, from the mid-2000s, was driven by global-scale applications
such as search and preference aggregation. The fourth combines the
complexity of the second wave with the scale of the third: it deals
with social networks, with location data, and with what can be done
with them using tools such as machine learning. I'll also discuss
the policy responses.