Your 2021 New Year’s Resolution: Excelling at Dataiku DSS

Dataiku Company, Dataiku Product Marie Merveilleux du Vignaux

Are you one of those people who never know what to say when it is their turn to share a New Year’s resolution? Fear not! We have you covered this year. You can amaze, and even encourage, your team by announcing a New Year’s resolution that will not only sharpen your personal data science and machine learning skills, but also boost the collaboration, efficiency, and scalability of your enterprise as a whole: excelling at Dataiku DSS (now that you’ve got the basics down).

A Responsible Resolution


As the new year begins and the digital and technological revolutions continue, companies, as well as governments and regulators worldwide, are focusing on improving their strategies for creating and implementing sustainable and smart AI solutions. You can check this box by further developing your Dataiku DSS abilities. Dataiku DSS supports the building of a Responsible AI strategy that is sustainable for the future.


Tips and Tricks 

Inspired by its core value of collaboration, the 2020 Dataiku DSS Community worked together to prepare some tips and tricks on how best to level up your Dataiku DSS know-how. This list can come in handy as you dive even deeper into the product and advance in your 2021 New Year’s resolution!

Top 5 User Shortcuts

Get the full list of Dataiku DSS keyboard shortcuts. 

  1. Drag and drop multiple files into the upload box to stack them at import.
  2. Use the select tool (shift + click) to select multiple recipes or copy a part of your flow.
  3. Use the view button in the bottom-left corner of the flow to cycle between flow views (tags, connections, recipe engines...).
  4. Hit space in the flow to cycle between right panel tabs.
  5. Select some characters in a cell within a prepare recipe to add a step that splits a column or replaces some strings.

Top 5 Best Practices 

Get more details on how to execute these actions in Dataiku DSS.

  1. Create a dataset metric that counts rows with duplicate keys in a SQL table by creating a SQL Probe (at the bottom of the metric editing page) like the following:
    1. SELECT COUNT(*) AS "Duplicate Key Value Count" FROM (SELECT REC_KEY FROM ${DKU_DATASET_TABLE_NAME} GROUP BY 1 HAVING COUNT(*) > 1) T
    2. You can then create a check on this metric that fails (or warns on) a dataset rebuild when the value is greater than 0.
  2. Execute SQL data modification statements from a Python recipe or notebook to clear a table prior to inserting rows, delete specific rows, create or drop a table, etc.
  3. After training models in the Lab, export the code to Jupyter notebook.
  4. When dealing with multiple data sources and needing to merge or stack ADS coming from all the source datasets, make sure the storage types of the corresponding columns are the same across all the datasets.
  5. You can update the project description either manually on each instance once a bundle is activated, or before creating the bundle so that this information is embedded in it.
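
To see what the duplicate-key probe from the first best practice returns, here is a minimal sketch that runs the same query outside Dataiku DSS, against an in-memory SQLite table. The table name `orders`, the `amount` column, and the sample rows are assumptions made up for illustration; in DSS the table name would come from `${DKU_DATASET_TABLE_NAME}`, and the exact SQL dialect depends on your database.

```python
import sqlite3

# Hypothetical table name standing in for ${DKU_DATASET_TABLE_NAME}.
TABLE_NAME = "orders"

# Same logic as the SQL probe above: count the key values that occur more than once.
# GROUP BY REC_KEY is written out explicitly; it plays the role of GROUP BY 1 here.
DUPLICATE_KEY_QUERY = """
SELECT COUNT(*) AS "Duplicate Key Value Count"
FROM (
    SELECT REC_KEY
    FROM {table}
    GROUP BY REC_KEY
    HAVING COUNT(*) > 1
) T
"""


def build_sample_connection():
    """Create an in-memory database with illustrative (made-up) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE {TABLE_NAME} (REC_KEY INTEGER, amount REAL)")
    rows = [(1, 10.0), (2, 5.0), (2, 7.5), (3, 1.0), (3, 2.0), (3, 3.0)]
    conn.executemany(f"INSERT INTO {TABLE_NAME} VALUES (?, ?)", rows)
    conn.commit()
    return conn


def count_duplicate_keys(conn, table):
    """Return the number of distinct REC_KEY values that appear more than once."""
    cursor = conn.execute(DUPLICATE_KEY_QUERY.format(table=table))
    return cursor.fetchone()[0]


if __name__ == "__main__":
    connection = build_sample_connection()
    # Keys 2 and 3 are repeated in the sample rows, so this prints 2.
    print(count_duplicate_keys(connection, TABLE_NAME))
```

Note that the metric counts duplicated key values, not duplicated rows: key 3 appears three times in the sample but contributes only 1 to the result. A check that fails when the value is greater than 0 would therefore flag this table.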

The Dataiku team is always here to provide additional tips, tricks, and support when needed. The Dataiku community wishes you a very happy new year and looks forward to accompanying you on your journey to excelling at Dataiku DSS. 
