I've just come back from an awesome/exhausting two days at Big Data World 2017. It was a whirlwind of interesting talks with interesting people, new product discoveries, awesomely weird goodies, and plenty of free food. I thought I would share what people were talking about at the biggest data conference in the United Kingdom, including this year's key focus areas for growing data-driven organizations.
Data Is (and Should Be) Everywhere
I'm going to call this the biggest trend at the conference. Data is no longer a siloed resource accessible only by a few treasure keepers: it has to be everywhere. To become a data-driven company, everyone in your organization should have access to all of your data in an open manner to make day-to-day decisions.
However, 88% of enterprise companies don’t share their customer data between their own sales and marketing teams, according to a talk by Jeremy Waite of IBM. So we're not quite there yet!
Data Consumption Is Changing
Not just in business but in everyday life, data is being presented in cool and innovative ways. Think of this ad campaign by Spotify, or this one by Ikea. Or just think of the fun dashboards you see everywhere, from your favorite apps to your favorite news articles. If you want people in your business to use data, it should come in a format that they're used to.
Data Is the Fuel of Agile Companies
Build > Measure > Learn is pretty much what Agile development is all about. But Agile is more than just a buzzword: taking it seriously means that everything anyone does in a company has to be backed by metrics.
What do you want to avoid at all costs? Build > Measure > DENIAL. The only way to do that is to make it as acceptable for people to fail as to succeed! I digress a little bit, but this was the subject of a great talk by Mike Hyde of Facebook. Culture is the hardest thing to change, not technology.
This ties into the fact that data science is about iteration first and foremost. It's not about finding the one method that works, but about using all the candidate methods as features in a model and letting the model decide what works. Trust the data!
Getting Started with Big Data
This was of course the focus of a lot of talks; everyone has to start somewhere. The consensus is this: big data is where data, people, and technology intersect. Using big data is about taking a business need, getting data to answer it, building new insights, and implementing them to bring real operational value. And then iterating on this whole process.
Also, there’s no magic in data science. It's a challenging process of exploring, cleaning, creating features, training models, and then iterating.
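To make that loop a bit more concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration and nothing comes from the talks: the toy records, the two candidate features ("raw" and "squared"), and the helper names are all hypothetical, and a one-variable least-squares line stands in for "training a model".

```python
# Hypothetical toy records: (hours_active, purchases). None marks a missing value.
RAW = [(1.0, 0), (2.0, 1), (None, 1), (4.0, 3), (5.0, 4), (6.0, 6)]

# Hypothetical candidate features to iterate over.
FEATURES = {"raw": lambda x: x, "squared": lambda x: x * x}


def explore(rows):
    """Explore: count the rows and the missing input values."""
    missing = sum(1 for x, _ in rows if x is None)
    return {"rows": len(rows), "missing": missing}


def clean(rows):
    """Clean: drop rows whose input value is missing."""
    return [(x, y) for x, y in rows if x is not None]


def fit_line(xs, ys):
    """Train: ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mse(xs, ys, slope, intercept):
    """Measure: mean squared error of the fitted line on the same data."""
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def run_pipeline(rows):
    """Explore, clean, then try each candidate feature and keep the best one."""
    summary = explore(rows)
    cleaned = clean(rows)
    ys = [y for _, y in cleaned]
    errors = {}
    for name, transform in FEATURES.items():
        xs = [transform(x) for x, _ in cleaned]
        slope, intercept = fit_line(xs, ys)
        errors[name] = mse(xs, ys, slope, intercept)
    best = min(errors, key=errors.get)
    return summary, errors, best


if __name__ == "__main__":
    summary, errors, best = run_pipeline(RAW)
    print(summary)
    print(errors)
    print("best feature:", best)
```

In a real project each step would be far messier, and you would score the model on held-out data rather than on the data it was fitted to. The shape of the loop is the point: every pass through it is one iteration.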
The Case with Data Hoarders
The most common issue when companies get started with data science seems to be... data. Whether they have it or not, too many companies focus on getting more data before looking at what they already have. More often than not, they don't even know what is available in their different departments.
However, it's a misconception that the more data you have, the better. Accumulating data is only worthwhile if you have a strategy to make sense of it, link it to other data you have, and keep it accessible once it's stored.
Insights Are Not Enough: Implementation Matters
This was another big topic about how to make data science efficient: don't just explore your data to uncover hidden insights, but build projects that are actually implemented in production, and bring daily measurable value to the business. The way to master that? Data engineers. Get yourself some good data engineers.
How Do You Grow a Data Science Team?
If you consider the speakers as representative of the data science world, then this is really the question on people's minds. There are two models out there for setting up your data science efforts. You can have either:
- One centralized data team offering services to other departments
- One or more data people embedded in operational teams, working alongside the end users of the data product
Don't Overlook Soft Skills
Teams made up of people with very strong technical skills often tend to communicate in code. But more often than not, it's better to use your words. This isn't just a question of being able to communicate your work and your results to nontechnical people (even though that is important).
This is also a question of framing a problem or a goal using words and therefore escaping the constraints of code. This is a great way of building a project from the ground up and being able to share it and communicate on it without having to make an extra effort.