Breaking Down the Barriers of Data Modeling and Predictive Analytics

Product | Corporate | Business | Brian

Dataiku DSS is breaking down the barriers of data modeling and predictive analytics one step at a time. 

Heads Up!

This blog post is about an older version of Dataiku. See the release notes for the latest version.


Dataiku Data Science Studio (DSS) is an intuitive software solution that enables data professionals to harness the power of big data to create predictions that make a difference. To make predictive analytics more collaborative and easier to use, we have released a new version: Data Science Studio 2.0.


For too long there has been a disconnect between capability and reality: what’s possible and what really happens. Data modeling and predictive analytics technology has advanced at breathtaking speed, far outpacing humanity’s capability to manage and convey the information in a meaningful manner. Big data is accurately named — it’s assuredly big! — but the challenge lies in making it understandable… breaking it down into cognitive portions that can be readily swallowed by the human brain.

An even bigger challenge is making this process accessible to the masses. After all, not all of us are Data Scientists and Coders Extraordinaire. The road of predictive analytics should be inclusive, open, and engaging — not a secret path accessible to only a select few. We needed to break down the barriers… for us, that started with Data Science Studio 1.0 and has accelerated with Data Science Studio 2.0.

Big Data, the Big Umbrella

We realized that data professionals needed a solution that helps them turn raw data into business-impacting predictions quickly. Building it meant recognizing that data teams are diverse: data scientists and software engineers share projects with marketers and salespeople. Big Data is actually a Big Umbrella that covers a wide array of specialists, all of whom need to create awe-inspiring visualizations and models that can be put to work in the business. So, with open arms, we sought to re-imagine Data Science Studio and make it an inclusive solution that engages people from all walks of life.

Starting from Square One

We realize that data science can be intimidating to those unfamiliar with the basic tenets of how raw data is transformed into useful information. To ease that initial hesitation, we redesigned the interface of Data Science Studio. The platform is now more interactive: every dataset has its own contextual menu that lets users manipulate and transform data as needed. No need to look all over for the tools you need… they’re now right at your fingertips.

In addition, we enhanced our recipes to include more visual manipulation options for your data. Recipes, by the way, offer an easy way to run data manipulations such as cleansing and aggregation. DSS now features the Join recipe, which merges two or more datasets, and the Stack recipe, which stacks datasets vertically. The Group recipe has also been redesigned with a more intuitive interface, in line with the new recipes.
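For readers who have not met these operations before, here is a minimal sketch in plain Python of what joining, stacking, and grouping mean conceptually. It is not how DSS implements its recipes, and it does not use any DSS API: the function names (`join`, `stack`, `group_sum`) and the sample data are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

def join(left, right, key):
    """Inner join: pair up rows from two datasets that share the same key value."""
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)
    # Merge each left row with every matching right row.
    return [{**l, **r} for l in left for r in index.get(l[key], [])]

def stack(*datasets):
    """Vertical stack: append the rows of each dataset one after another."""
    return [row for dataset in datasets for row in dataset]

def group_sum(rows, key, value):
    """Group rows by `key` and sum one numeric column per group."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += row[value]
    return dict(totals)

# Hypothetical sample data: a customer list and two quarters of orders.
customers = [{"customer_id": 1, "name": "Ada"}, {"customer_id": 2, "name": "Bo"}]
orders_q1 = [{"customer_id": 1, "amount": 40}, {"customer_id": 2, "amount": 15}]
orders_q2 = [{"customer_id": 1, "amount": 25}]

all_orders = stack(orders_q1, orders_q2)               # 3 rows in total
enriched = join(all_orders, customers, "customer_id")  # adds "name" to each order
totals = group_sum(enriched, "name", "amount")
print(totals)  # {'Ada': 65, 'Bo': 15}
```

In a visual tool, each of these three steps would correspond to one recipe configured through the interface rather than written by hand.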

To support a wider variety of data sources, DSS 2.0 adds connectors for Oracle and Microsoft SQL Server. In total, DSS now includes twenty different connectors, ranging from Excel to Hadoop.

Let’s work together

Continuing with the theme of diversity among data professionals, we decided to expand DSS's collaborative features. A project needs multiple skill sets to produce relevant data-driven predictions. For example, a marketer who is intimately familiar with a client’s vision needs to work with data scientist colleagues to make that vision come to life.

We addressed the need for a collaborative environment by implementing the following features:

  • More effective management of editing conflicts: users are automatically warned when several people edit the same item at once;
  • Support for simultaneous work via the Web, which is particularly useful for remote workers;
  • A new LDAP connector that supports better management of project rights;
  • A comment notification system organized by project;
  • A personal view that enables users to easily access their in-progress work.

Together, these features let users take control of their own work, view status information, and perform project-level searches.

Putting you in the driver’s seat

Transforming raw data into useful information can be a time-intensive, multi-step process in which every stage depends on the others. Moving from one phase to another can be a pain, especially when you need to iterate quickly between steps (e.g., preparation ↔ visualization ↔ modeling). DSS addresses this issue with a new unified module called Analysis. Optimized for fast iteration cycles, the Analysis module lets users transform their data and immediately view the results of their predictive models. These visualizations, in turn, allow for rapid evaluation of model quality. By unifying machine learning and data preparation, the Analysis module increases user productivity and efficiency.

Ready to try?

Enough of our rambling… why not give DSS a try yourself? We offer a free version, great support, and a new collaborative environment designed to engage with all of your team members.

