Hello everyone! Below you’ll find an interview with Dataiku’s very first and very talented data scientist, Matthieu Scordia. Matthieu, who was recruited in March 2013, explains who he is, what his job involves, and how he got where he is today. He also gives advice on how to compete successfully in a Kaggle competition.
Introducing Matthieu
Matthieu Scordia working on Dataiku DSS
Hello Matthieu… Let’s start from the beginning: how about introducing yourself first?
Hello, my name is Matthieu Scordia, I’m 27 years old and I have been working as a data scientist at Dataiku for two years now.
Can you tell me about your background?
I graduated with a Master's degree in AI from Pierre & Marie Curie University in 2012 (the program is now called DAC).
This degree covers the essential concepts I deal with in my job: statistical learning (or machine learning), search engines, recommendation systems, etc. I then joined the startup Dataiku for my end-of-studies internship, on the subject of machine learning for e-commerce.
I have now been at Dataiku for two years. I’ve certainly learned a lot and can now use other technologies that I had not seen during my studies.
Matthieu, the Data Scientist
How would you define your work as a data scientist?
My job is to analyze customer data. We help customers understand and format their data, and then build an application on top of it, such as a predictive model or a data visualization.
What aspects of your work do you prefer?
I prefer the scientific approach to data. Customers strive to maximize a variable or a factor, and I love performing a series of experiments on their data to reach this goal. We’re lucky to have customers working in various fields, which lets us work on very different data sets. The method is the same, but the issues are always changing.
Which tools do you use?
I use Dataiku DSS. It literally has everything I need: Hadoop to aggregate large volumes of data, scikit-learn for machine learning, and d3js for visualizations.
How much time do you spend attending data science-oriented events?
I try to go to one meetup per month. This allows me to meet other data scientists and startups and to exchange ideas on new techniques, new technologies, etc.
Do you need to have studied statistics to become a data scientist?
No, many paths lead to data science: university or engineering schools, mathematics, computer science, physics, or biology. Data is present in all of these disciplines, and in each of them you learn how to analyze it properly. I even have a data scientist friend who went to Sciences Po (a Parisian elite school specialized in political science). Since last year, many schools and universities have developed specialized curricula in data science or big data.
Matthieu, the Kaggle Competitor
Kaggle.com and Datascience.net: why do they matter?
Kaggle and Datascience.net are very good ways to progress while having fun, because you are challenged by other data scientists, experts and beginners alike. The Kaggle forum is a true gold mine of ideas and methods shared by all participants.
Apart from statistics and programming, what other skills do you need in your work?
A good data scientist must be able to remain skeptical. It is very easy to be misled by statistics: the blog "Spurious Correlations" shows, for instance, that the number of people who drowned by falling into a swimming pool correlates with the number of movies in which Nicolas Cage appeared! ;)
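To see how easily such coincidences arise, here is a minimal Python sketch. It is not taken from the interview or from the blog mentioned above: it uses purely synthetic random walks, and the function names, the number of series, their length, and the seed are all illustrative assumptions. It generates series that are unrelated by construction, then looks for the pair with the strongest Pearson correlation.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def random_walk(rng, length):
    """A synthetic series whose steps are independent coin flips (+1 or -1)."""
    position, walk = 0, []
    for _ in range(length):
        position += rng.choice((-1, 1))
        walk.append(position)
    return walk


def strongest_spurious_pair(n_series=50, length=20, seed=0):
    """Return (|r|, i, j) for the most correlated pair among unrelated series.

    All parameters are illustrative assumptions, not values from the interview.
    """
    rng = random.Random(seed)
    series = [random_walk(rng, length) for _ in range(n_series)]
    return max(
        (abs(pearson(series[i], series[j])), i, j)
        for i, j in combinations(range(n_series), 2)
    )


if __name__ == "__main__":
    r, i, j = strongest_spurious_pair()
    print(f"Series {i} and {j} are unrelated by construction, yet |r| = {r:.2f}")
```

With enough candidate series to compare, some pair will typically look strongly correlated by chance alone, which is why a high correlation found by searching through many variables deserves skepticism.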
As a Kaggle master, do you have any advice for a beginner? In a few words, how would you suggest increasing the chances of winning a Kaggle competition?
I compare Kaggle competitions to sports: the more you practice, the better you get. It’s a game of optimization, and I constantly try to improve myself. At the end of a competition there is a massive learning opportunity on the forum, where competitors share their best work on the problem. You compete with the best data scientists in the world, so it brings out the best in you.
My advice is to never give up and to keep learning! It’s like tennis: you can’t expect to hold the top ATP ranking after your first tournament.
If you want to know more about Matthieu and his tips to win a Kaggle competition, read this French article.