Technoslavia: The (Fragmented) World of Data Infrastructure in 2020

Scaling AI Nicolas Omont

We first introduced the idea of Technoslavia - the heterogeneous and fragmented picture of the technological world of data infrastructure - back in 2015. Since then, we’ve updated and re-examined Technoslavia twice (in 2017 and in 2018). Where are we now, at the beginning of a new decade?

In the original Technoslavia, there were eight areas or regions. In 2018 that grew to nine, and now in 2020, there are 11 (additions include newly defined areas for infrastructure and orchestrator technologies). That means - as predicted - that the world of data infrastructure isn’t becoming more unified as it matures; in fact, it’s becoming even more fragmented. At this point, it’s safe to say that the trend is unlikely to stop, which makes having one layer on top of these technologies to orchestrate them in an enterprise setting more critical than ever.

[Image: Dataiku Technoslavia 2020]

Technoslavia: Key 2020 Trends

  • Various logos have been removed or added as technologies have come into (and out of) vogue, which again points to the constant movement in the space. Key additions include players like SageMaker, Kubernetes, PyTorch, MLflow, Kubeflow, and many more.
  • One of the key additions to Technoslavia 2020 is Infrastructure Territory, which includes the major cloud vendors plus Kubernetes and OpenShift. This reflects the growing importance of the cloud to enterprises’ data strategy, as many major companies move to a hybrid model.

[Image: survey results]

Source: In Q3 of 2019, Dataiku enlisted Gerson Lehrman Group (GLG) to administer an anonymous survey to 200 IT professionals across a range of industries with the goal of uncovering data architecture trends in the enterprise.

  • Notebooks have their own small quarter within the IDE valley as their functionality expands and they become an even more important part of enterprise data strategy. However, note that notebooks are still a small part of a larger picture and are not the be-all and end-all tool for data scientists.
  • Real-Time Island continues to grow but remains disconnected from the rest of Technoslavia, as it is still (metaphorically) hard to get to. However, cutting-edge businesses and industries with specific real-time use cases are certainly exploring it more and more. We predict that by the next version of Technoslavia, Real-Time Island will be reunited with the mainland - stay tuned!

How Dataiku Fits In

Metaphorically, data platforms (like Dataiku) are the bridges, mountain passes, and transit systems that ease the long journey across Technoslavia. Practically speaking, they allow teams to do several (or all) data-related tasks in one place. This ultimately makes it easier for teams not only to work together but also to keep up with rapidly evolving technologies, trying the latest and greatest the data world has to offer while staying within one overarching tool for proper governance.

The list of plugins and integrations with technologies continues to grow, and the Dataiku team is committed to making sure companies around the world can continue to use the infrastructure that suits them best. 
