We first introduced the idea of Technoslavia - the heterogeneous and fragmented picture of the technological world of data infrastructure - nearly two years ago (I know, we can’t believe it either!). So what are the major changes in the big data ecosystem since then?
Previously, we spoke about eight technical areas, the options (and tradeoffs) within them and the connections (or lack thereof) between them. We posed the question of whether multiple republics of data would give birth to a new empire and the victory of one major player. And we wondered whether companies would continue to grapple with the long journey across Technoslavia even as the ecosystem matures.
Trends by Region
This looks eerily familiar, you might say. And you’re right - as we predicted, even after two years, no single player has emerged victorious across all of Technoslavia (though open source and cloud are most definitely common themes). But there are also some notable trends that have developed in specific regions. In particular:
- Statistician Yard: The dominance of open source, particularly Python and R, is becoming clear as they largely outgrow any commercial offerings in this area.
- Scalability District: The big trend is, of course, the move toward cloud storage, in particular S3, Microsoft Azure, and Google Cloud Storage.
- SQL Columnar County: Similarly, the trend here is a shift toward cloud data warehouses and platforms such as Snowflake, MapR, Cloudera, and BigQuery.
- Machine Learning National Park: TensorFlow, another example of open-source dominance, is quickly emerging as the leading machine learning framework.
- NoSQL Hills: The key trend here is that NoSQL is becoming less relevant for analytics use cases. It still plays a small role, though, so we decided to keep it in Technoslavia.
- Data Cleaning Swamp: No, data cleaning hasn’t gone away (sorry!), though it’s increasingly becoming a more integrated part of other processes and included in other tools from across Technoslavia to help ease the burden.
- Real-Time Island: The most in-flux of all the regions and a hot topic in data science today, Real-Time Island is seeing an influx of new technologies and frameworks, but there is no clear winner just yet. We like to think of it as the Australia of Technoslavia - it’s a bit exotic, it’s hard to get to, but it has new and interesting players (think: the technological equivalent of Australian megafauna, but without the extinction part). And the most adventurous and cutting-edge businesses today are either already exploring it or planning to soon.
Though (as predicted) Technoslavia has not been consolidated in the past few years, a larger trend has emerged that does bring the distinctly separate regions closer together in another way. That trend is overarching data platforms that touch several regions at the same time, simplifying and smoothing the overall experience.
Metaphorically, data platforms are the bridges, mountain passes, and transit systems that ease the long journey across Technoslavia. Practically speaking, they allow teams to do several (or all) data-related tasks in one place. This ultimately makes it easier for teams to not only work together, but also to keep up with rapidly evolving technologies, trying the latest and greatest the data world has to offer while staying within one overarching tool for proper governance.
Following years of Technoslavia fatigue, and perhaps of lying in wait to see which dominant technologies would emerge, it seems businesses are finally coming to terms with the fact that fragmented is the new normal. And perhaps they are also beginning to see that specialization within each region and the dominance of open source are what keep the field innovative, and they’re embracing it.
More and more, companies are starting to invest in platforms (here’s just one example) that bridge, rather than fight, the openness and fragmentation of Technoslavia, turning it into an asset that allows for agility and collaboration. They’re giving up on the idea of a unified experience built on a single closed tool and instead moving toward a more flexible data platform that still gives that unified experience, but based on the combination of technologies on the market that best suits the particular needs of the business.
What’s to Come?
Who’s to say what this Technoslavia graphic will look like, and which technologies it will feature, two years from now? The industry is changing so quickly that we could discover entirely new lands, and perhaps old lands will be abandoned. But whatever does happen, it’s the companies that are prepared for the twists and turns ahead that will come out on top. Maintaining a unified experience on top of the chaos happening below is critical to success.
If you want to give a unified experience a try, you can download the free version of Dataiku Data Science Studio (DSS) here. Or, read more about the additional benefits of data science platforms beyond Technoslavia.