Whether you realized it or not, Data Privacy Day 2019 (yes, it exists!) has already come and gone. But this year, it was perhaps more significant than most not only because topics of bias, interpretability, and transparency in AI have moved to the forefront, but because the Council of Europe amended their Convention for the Protection of Individuals with regard to Automatic Processing of Personal Data.
Why Should I Care?
A valid question, dear reader. The Council of Europe, of course, cannot make binding laws, which means this amendment has garnered much less attention than the European Union’s General Data Protection Regulation of 2016 (enforced beginning in 2018). However, given the lengths to which it goes to protect the rights of humans against the injustices of AI — my words, not theirs — it is certainly a significant development in the trajectory of AI transparency.
In other words, this could be a small sign of what’s to come in terms of government regulations surrounding AI. Here’s just a small sample from the accompanying guidelines:
“In all phases of the processing, including data collection, AI developers, manufacturers and service providers should adopt a human rights by-design approach and avoid any potential biases, including unintentional or hidden, and the risk of discrimination or other adverse impacts on the human rights and fundamental freedoms of data subjects.”
GUIDELINES ON ARTIFICIAL INTELLIGENCE AND DATA PROTECTION | PART II
Preempting the Inevitable
Though Europe has certainly been the leader in this arena, many other governments are also considering their own data privacy regulations, including several states in the U.S.
In mid-2018, the White House said it would be working with Congress to draft data privacy legislation and that it “began holding stakeholder meetings to identify common ground and formulate core, high-level principles on data privacy.”
So what can enterprises do to prepare? At the core of the AI ethics debate are the concepts of transparency and interpretability, two things any company can start improving in its data systems and processes right now.
Data science tools and platforms can certainly help, specifically because they centralize control of all of the following:
- Personal data identification, documentation, and clear data lineage — that is, they allow data teams and leaders to trace (and often see at a glance) which data source is used in each project.
- Access restriction and control — including separation by team, by role, by purpose of analysis and data use, and so on.
- Data minimization — given clear separation in projects as well as some built-in help for anonymization and pseudonymization, only data relevant to the specific purpose will be processed, minimizing risk (a minimal pseudonymization sketch follows this list).
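To make that last point concrete, here is a minimal sketch of keyed pseudonymization in Python. The column names, the key handling, and the pandas workflow are illustrative assumptions rather than any particular platform's API; the idea is simply to replace a direct identifier with a deterministic token before data leaves a restricted project.

```python
import hashlib
import hmac

import pandas as pd

# Hypothetical secret key; in practice this would live in a secrets
# manager, never in source code.
PSEUDONYMIZATION_KEY = b"replace-with-a-real-secret"

def pseudonymize(value: str, key: bytes = PSEUDONYMIZATION_KEY) -> str:
    """Replace a direct identifier with a keyed, deterministic token.

    Using HMAC-SHA256 (rather than a bare hash) means the mapping cannot
    be reversed by brute-forcing common values without the key, while
    determinism preserves joins across tables.
    """
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()

# A hypothetical customer table containing a direct identifier.
df = pd.DataFrame({
    "email": ["ada@example.com", "alan@example.com"],
    "purchase_amount": [42.0, 17.5],
})

# Pseudonymize the identifier and drop the raw value, so only the data
# relevant to the analysis is processed downstream.
df["customer_id"] = df["email"].map(pseudonymize)
df = df.drop(columns=["email"])
print(df)
```

Note that pseudonymized data is still personal data under the GDPR, so this is a risk-minimization step, not full anonymization; stricter techniques like aggregation would go further.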
But as with classic science, there are also some more fundamental principles of good data science that should be infused throughout any organization to ensure a baseline level of preparation if (or when) regulations do come. Read the O'Reilly ebook An Introduction to Machine Learning Interpretability to get started with an expert perspective on how interpretability applies to machine learning, including fairness, accountability, transparency, and explainable AI.