The Secrets of Successful Data Science Education

By Eva Neuner

As data-related roles continue to be some of the most in-demand across enterprises worldwide, education for budding data scientists, analysts, engineers, etc., becomes increasingly important.

If you had not heard of the Dataiku programs for academic and non-profit institutions before, our last blog post featuring the WCHRI will have brought you up to speed. This time, we want to give a glimpse into how some academic institutions are putting Dataiku to good use for teaching purposes. So why not just ask one of those very teachers?

Feature: A. Wodecki of the Warsaw University of Technology

Andrzej Wodecki has a background in Physics, AI, and Business Administration. He worked at the Maria Curie-Skłodowska University in Lublin and recently moved to the Warsaw University of Technology, one of the leading institutes of technology in Poland and one of the largest in Central Europe.

As a member of the Dataiku Academic and Non-Profit Program, he has been using Dataiku in class for several years. This is why we asked him to reveal some of his best-kept secrets in data science education for you.

Q: How did you find out about Dataiku?

Since I’ve done data science for a few years already, I read different market reports. At one point I noticed that Dataiku was positioned really well in something like the Gartner Magic Quadrant. So I gave it a try.

Q: Why did you choose Dataiku over other solutions?

I chose Dataiku for various reasons. There’s the perfect UX and the great manuals, especially the 101, 102, and 103 tutorials, which give a great sense of a full data science process such as CRISP. There is also the use case library based on real-life problems. Altogether, that makes the solution a perfect environment for educational purposes.

Q: What were you doing before Dataiku?

I used Python libraries like NumPy, Pandas, scikit-learn, Matplotlib, and Keras.

Q: Who are your students?

My students either come from the Faculty of Management and are between 21 and 25 years old, or they come from MBA courses and are older than 30. So they are all non-technical analysts. I use Dataiku in classes such as Data Preparation and Analysis, Business Analytics Lab, Business Information Systems, and Big Data in Business.

Q: Can you describe the projects they were working on?

First, we go through Tutorials 101 to 103 together. Then, in groups, they choose a real business problem (mostly from their own companies, but sometimes using publicly available data sources, including Dataiku use cases) and solve it in the CRISP framework. Finally, they present their results. The whole process takes around 30 hours of lab time, mixed with lectures and presentations, with around 15 students in a group split into four subgroups.

Q: What are the future projects you would like to work on?

Since I specialize in (deep) reinforcement learning at the moment, I would like to explore Dataiku’s potential in that area.

Q: Is there any advice you want to give to other professors in the data space?

Yes! My five pieces of advice:

1. Carefully plan students’ account logistics: login names, groups, permission rights, etc. For more than 50 user accounts, use the Dataiku Python API; that really simplifies life.

2. Start with a careful, interactive presentation (students working on their own accounts) and discussion of Tutorials 101-103 in the CRISP framework. That gives a great hands-on feeling and is a good starting point for workshops.

3. In students’ projects, recommend that they strictly follow the CRISP methodology.

4. As for practical exercises:

- For beginners: ask them to reproduce simple Excel tasks in Dataiku (to familiarize them with the environment).

- For more advanced students: ask them to reimplement data science/machine learning Jupyter notebooks (or Python code) in Dataiku. That’s a really powerful task.

5. Focus not only on data preparation and modeling, but also (or mostly) on algorithm parameters, evaluation metrics, and results interpretation.
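To make the first piece of advice more concrete, here is a minimal sketch of scripted account setup. It is an illustration, not Prof. Wodecki's own script. The helper names (`plan_accounts`, `random_password`, `provision`), the login and group naming scheme, and the group size are all hypothetical. The `provision` function assumes a client object that offers `create_group(name)` and `create_user(login, password, display_name=..., groups=...)`, which is how the public `dataikuapi.DSSClient` is understood to work; check the signatures against the documentation for your Dataiku version before relying on them.

```python
import secrets
import string


def plan_accounts(students, group_size=4, login_prefix="student", group_prefix="team"):
    """Build one account spec per student: login, display name, and group.

    The naming scheme is a hypothetical convention chosen for this sketch.
    """
    plan = []
    for index, name in enumerate(students):
        plan.append({
            "login": f"{login_prefix}{index + 1:03d}",
            "display_name": name,
            # Students are assigned to groups in list order, group_size per group.
            "group": f"{group_prefix}{index // group_size + 1:02d}",
        })
    return plan


def random_password(length=12):
    """Generate a random initial password from letters and digits."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))


def provision(client, plan):
    """Create the groups, then the users, and return {login: initial password}.

    Assumption: `client` exposes create_group(name) and
    create_user(login, password, display_name=..., groups=...).
    """
    for group in sorted({spec["group"] for spec in plan}):
        client.create_group(group)

    credentials = {}
    for spec in plan:
        password = random_password()
        client.create_user(
            spec["login"],
            password,
            display_name=spec["display_name"],
            groups=[spec["group"]],
        )
        credentials[spec["login"]] = password
    return credentials
```

With a real instance, `client` would typically be created with something like `dataikuapi.DSSClient(host, api_key)` using an administrator API key, and the returned credentials would be handed out to students. Keeping the planning step separate from the provisioning step makes it possible to review the login and group list before any account is created.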
