Hello everyone! Below you’ll find an interview with Dataiku’s very first and very talented data scientist, Matthieu Scordia. Matthieu, who was recruited in March 2013, explains who he is, what his job involves, and how he got where he is today. Plus, he gives advice on how to compete successfully in a Kaggle competition.
Hello Matthieu… Let’s start from the beginning: how about introducing yourself first?
Hello, my name is Matthieu Scordia, I’m 27 years old and I have been working as a data scientist at Dataiku for 2 years now.
Tell me about your background.
I graduated with a Master's degree in AI from Pierre & Marie Curie University in 2012 (the program is now called DAC).
This Master's program covers the essential concepts I deal with in my job: statistical learning (or machine learning), search engines, recommendation systems... I then joined the startup Dataiku for my end-of-studies internship, on the subject of machine learning for e-commerce services.
I have now been at Dataiku for two years. I’ve certainly learned a lot and can now use other technologies that I had not seen during my studies.
Matthieu, the data scientist
How would you define your work as a Data Scientist?
My job is to analyze customer data. We help customers understand and format their data, and then build an application on top of it, such as a predictive model or a data visualization.
What aspects of your work do you prefer?
I prefer the scientific approach to data. Customers strive to maximize a variable or a factor, and I love performing a series of experiments on their data to reach this goal. We're lucky to have customers working in various fields, which enables us to work on very different data sets. The method is the same but the issues are always changing.
Which tools do you use?
I use Data Science Studio (DSS). It literally has everything I need: Hadoop to aggregate large volumes of data, scikit-learn for machine learning, D3.js for visualizations.
How much time do you spend attending data science-oriented events?
I try to go to one meetup per month; this allows me to meet other data scientists or startups and exchange ideas about new techniques, new technologies, etc.
Is studying statistics important for becoming a data scientist?
No, there are many paths to becoming a data scientist: university or engineering school, with a background in mathematics, computer science, physics or biology. Data is present in all of these disciplines, and each of them teaches you how to analyze it properly. I even have a data scientist friend who went to Sciences Po (an elite Parisian school specialized in political science). Since last year, many schools and universities have developed specialized curricula in data science or big data.
Matthieu, the Kaggle competitor
Kaggle.com and Datascience.net: why do they matter?
Kaggle and Datascience.net are very good ways to progress while having fun, because you are challenged by other data scientists, experts and beginners alike. The Kaggle forum is a true gold mine of ideas and methods shared by all participants.
Apart from statistics and programming, what other skills do you need in your work?
A good data scientist must have the ability to remain skeptical. It is very easy to be misled by statistics. The blog "Spurious Correlations" shows, for instance, that you can correlate the number of people who drowned by falling into a swimming pool with the number of movies in which Nicolas Cage appeared! ;)
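To make that point concrete, here is a minimal Python sketch of how two unrelated series can look strongly correlated. It is only an illustration under stated assumptions: the data is synthetic (not the drowning or Nicolas Cage figures), the names `pearson` and `first_differences` are hypothetical helpers written for this example, and comparing step-to-step changes is just one common sanity check, not a method described in the interview.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def first_differences(xs):
    """Step-to-step changes, which removes a linear trend."""
    return [b - a for a, b in zip(xs, xs[1:])]


rng = random.Random(0)  # fixed seed so the run is repeatable

# Two synthetic series built independently of each other:
# each is an upward trend plus its own random noise.
series_a = [0.5 * t + rng.gauss(0, 1) for t in range(200)]
series_b = [2.0 * t + rng.gauss(0, 3) for t in range(200)]

# Correlation of the raw levels is dominated by the shared upward trend.
level_r = pearson(series_a, series_b)

# Correlation of the changes looks only at the (independent) noise.
diff_r = pearson(first_differences(series_a), first_differences(series_b))

print(f"correlation of levels:  {level_r:.3f}")
print(f"correlation of changes: {diff_r:.3f}")
```

The two series share nothing except that both grow over time, yet the correlation of their levels comes out close to 1. Once the trend is removed, the correlation should fall toward zero, which is the kind of check a skeptical reading of a surprising correlation calls for.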
As a Kaggle master, do you have advice for a beginner? In a few words, how would you suggest increasing one's chances of winning a Kaggle competition?
I compare Kaggle competitions to sport: the more you practice, the better you get. It’s a game of optimization: I constantly try to improve myself. At the conclusion of a competition, the forum is a massive learning opportunity, because competitors share their best work on solving the problem. You compete with the best data scientists in the world, so it brings out the best in you.
My advice would be: never give up and keep learning!
It’s like tennis: you can’t expect to hold the top ATP ranking after your first tournament.
If you want to know more about Matthieu and his tips for winning Kaggle competitions, read this French article.
Matthieu Scordia working on DSS