If you’ve read the news in the past few weeks, you know that Facebook is in the middle of quite the PR crisis. And while it’s tempting in the wake of this news to shy away from using data for fear of a similar fate, that is a mistake. Organizations should still aim to be at the cutting edge in terms of using data in creative ways; however, given the lack of strict laws governing big data (for now), it’s up to businesses to ensure they are moving forward in a conscious way that leaves room both for innovation and ethics.
Of course, the EU’s General Data Protection Regulation (GDPR) is beginning to change the big data regulatory landscape. Enforcement is yet to begin, so it remains to be seen how it might actually affect the ways businesses work with data and deal with data breaches.
But what makes this situation so embarrassing for Facebook is that this was not a leak — in fact, at the time, Facebook’s API allowed third-party apps to collect profiles not only of their users, but also of their users’ friends. And as the details specific to Facebook’s situation have yet to emerge, it’s difficult to comment on what could have been done to prevent this particular disaster.
In general, every organization that works with data needs processes and policies that determine how data is used and, to mitigate risk, needs to embrace data transparency. The idea of data transparency might seem counterintuitive based on the name (we need fewer people seeing data, not more)... but stick with me.
The Value of Transparency
When it comes to data transparency, there are two components:
- The more obvious is external transparency — that is, with customers, partners, users, etc.
- The less-often-talked-about version is internal transparency. How easy is it within the organization to understand globally what datasets are being used where and to what end?
To build a conscious data strategy, one must consider both types of transparency. It’s about building trust not only with the people using your product or service (which is certainly paramount), but also internally, in data and how it’s being used, so that solid data governance practices take root from the ground up.
Building a conscious data strategy is about trust — lose it, and you put your business at risk.
So bringing data transparency to an organization doesn’t mean letting everyone see all the data (which will almost certainly bring trouble). But it does mean that each dataset or data source should have a clear owner, and that owner should have proper visibility into who is using that data, where, and for what purposes.
This allows data teams to step into more of a consultancy role, advising other teams on how data should (and shouldn’t) be used, so that both parties bring their expertise: one on the data side, the other on the business side.
With this sort of setup, it’s possible to offer more transparency, not into the specific data itself, but into how datasets are generally being used, both internally and, if you really want to build trust, externally with users.
In Uncharted Waters, Everyone’s a Navigator
Facebook isn’t even 15 years old, and most of the rest of the world of big data is even younger than that. The data and analytics ecosystem is simply moving too quickly for regulators to keep up, and like it or not, anyone working on the cutting edge of this industry is in uncharted waters. That means it’s up to these people and teams to navigate the ethics and pitfalls of how and where data is used on their own.
Ultimately, it’s up to individual organizations to be responsible. It’s also up to users to be responsible when it comes to choosing to engage with businesses that have good data practices — this is the only way that companies with a conscious data strategy will rise to the top.
Practically, what does this mean for a business? What can you do today to start?
First, do everything you can to make it clear to customers, partners, users, etc., that transparency matters and that you respect and are responsible for the information they are entrusting to your company. Don’t just put words on a page; talk about the real efforts you’re making to take care of their data.
Second, commit to being more open and transparent about how and where data is being used (both internally and externally). Internally, that means inviting more people to be part of the design and review of data and analytics products (from the data team, but also from business, IT, etc.). Have kickoff meetings for new analytics projects with all stakeholders present: data scientists, database administrators, subject matter experts, and business line managers.
There’s often a lot of nodding and general agreement on the importance of collaboration, but the difficult part is actually getting people to collaborate. Describe all pieces of data used, all transformations, models, and outputs, as well as rules and policies. Take reservations and objections seriously — the team member with the least familiarity with the project is the best representative of the general public.
Third, consider implementing tools that make it easier for people in the organization to search through datasets, workflows, and outputs to understand how everything is working. This transparency can be done without compromising at all on security and governance. For example, I don’t need to be able to see the contents of a certain database to see that that database is being used in four different projects. Hint: data science platforms can help.
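To make the "four different projects" example a little more concrete, here is a minimal sketch of what such a lookup could look like. It is an illustration only, not how any particular data science platform works: the `Catalog` class, its methods, and all dataset, team, and project names are hypothetical. The point it tries to show is that the catalog holds metadata (owner, projects, purposes) and never the contents of the dataset itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DatasetEntry:
    """Metadata about one dataset. The data itself is never stored here."""
    name: str
    owner: str
    # Maps a project name to the stated purpose of that usage.
    usages: Dict[str, str] = field(default_factory=dict)


class Catalog:
    """Hypothetical in-memory catalog of dataset ownership and usage."""

    def __init__(self) -> None:
        self._entries: Dict[str, DatasetEntry] = {}

    def register(self, name: str, owner: str) -> None:
        """Record that a dataset exists and who owns it."""
        self._entries[name] = DatasetEntry(name=name, owner=owner)

    def record_usage(self, dataset: str, project: str, purpose: str) -> None:
        """Note that a project uses a dataset, and for what purpose."""
        self._entries[dataset].usages[project] = purpose

    def owner_of(self, dataset: str) -> str:
        return self._entries[dataset].owner

    def projects_using(self, dataset: str) -> List[str]:
        """Answer 'where is this dataset used?' from metadata alone."""
        return sorted(self._entries[dataset].usages)


# All names below are made up for the example.
catalog = Catalog()
catalog.register("customer_profiles", owner="data-platform-team")
catalog.record_usage("customer_profiles", "churn_model", "predict cancellations")
catalog.record_usage("customer_profiles", "marketing_dashboard", "campaign reporting")
catalog.record_usage("customer_profiles", "fraud_detection", "flag suspicious accounts")
catalog.record_usage("customer_profiles", "support_routing", "prioritize tickets")

print(catalog.owner_of("customer_profiles"))
print(catalog.projects_using("customer_profiles"))
```

Anyone in the organization could run a query like `projects_using` without being granted read access to the underlying database, which is the kind of transparency that does not compromise security or governance.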
The people who created Facebook’s API policy weren’t necessarily bad or ethically challenged people. But chances are, if more people at Facebook were aware of that policy, someone would have voiced an objection to it. Had they done so, Zuckerberg might not be worried about a mid-April trip to Washington, D.C., right now.