Today is International Unicorn Day, and we’re joining Pivigo in their campaign to show the breadth, diversity, and power of data scientists and the data science network. Because they’re #NotMythicalAnymore! We thought we’d focus on how to hire great ones, since it’s something that we’ve found many companies still struggle with today.
We have a pretty amazing team of data scientists here at Dataiku who solve all kinds of problems for enterprises large and small in a vast array of industries. They come from a wide range of backgrounds and experience. Here’s a short blurb about just a few members of our data science team:
After graduating from the University of Manchester, Silviu started his career at 1stTouch as the company's sole data scientist, building analytical products for the housing sector. He is now a member of the Dataiku team in the UK.
Aimee has a PhD from the University of Oxford at CERN in experimental particle physics, and her latest project takes a Bayesian approach to predicting football player performance. In her free time, she loves to travel and try new food.
Harizo finished a PhD in computational statistics at CEA, the national agency created after WWII to design the French atomic bomb. The agency now deals with a broader scope of subjects; for instance, Harizo designed methods to locate pollution sources based on various types of data.
Guilherme helps customers at Dataiku build and deploy predictive applications. Before joining the team here, he was a fellow at the Insight Data Science Fellowship program, and prior to that he worked in quantitative finance. He holds a PhD in applied mathematics.
A very tall data scientist who, after ending a collegiate basketball career, explored and pursued the magical world of computer science, data, and machine learning, Alex W loves tech startups, data science, and being a part of the Dataiku family.
So You Want to Hire a Data Scientist …
The term “data scientist” was coined in 2008 by the analytics leads at LinkedIn and Facebook to describe their work deriving business value from the masses of data being generated by their websites. Since then, data scientists have become much more mainstream, but there are definitely still people, and perhaps organizations, out there who wonder exactly what it is they do and what to look for when hiring.
If you are seeking an intrepid data scientist to lead your company into the wide blue yonder of big data, what magic skill set should you be looking for? Well, to start with, we recommend reading James Kobielus's admittedly old but still hilarious Data Scientists: Myths and Mathemagical Superpowers, in which he will disabuse you of any preconceived notions: the data scientist is neither unicorn nor trumped-up BI analyst, and he came down from the ivory tower a while ago.
And if you drill down more on the technical side, you can get an entertaining, but stressful, map of the many tasks and tools to master, as shown in Becoming a Data Scientist — Curriculum via Metromap.
But at the core, it really boils down to these...
Six Skills to Look For in a Data Scientist Hire:
What to look for: Can create a PowerPoint presentation as powerful as your Marketing VP's.
- A Good Data Scientist Knows Your Business: A data scientist needs to have an overall understanding of the key challenges in your industry, and consequently, your business. She must be familiar with the industry's financial ratios to rapidly assess whether there is a potential gain, its order of magnitude, and then find inspiration before taking her next breath. Another characteristic of a true data scientist is that she's fascinated by the subjects that will have the greatest impact, not the problems in themselves. A data scientist is not a scientist in the traditional sense; it’s not the quest for truth that drives her, but the process to uncover it.
What to look for: Goes for the highest stakes rather than the most complete answer.
- A Good Data Scientist Understands Statistical Phenomena: Data scientists must be able to correctly interpret statistics: is a result representative or not? This takes an understanding of statistics that allows the data scientist to assert, with authority, why 3% is statistically significant for certain cases, but means nothing for others. This skill is key, since the majority of stats we analyze contain statistical bias that needs correcting.
What to look for: Can understand what is statistically significant.
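To make the 3% example concrete, here is a minimal sketch in Python. It assumes a simple A/B comparison of conversion rates and uses a pooled two-proportion z-test, which is one common choice and not necessarily the right test for every dataset. The function name and all the figures are made up for illustration. The same 3-point lift (10% to 13%) is tested at two different sample sizes.

```python
import math


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for the difference between two conversion rates.

    Hypothetical helper for illustration: a pooled two-proportion z-test.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both groups are the same.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


# Same lift, 100 visitors per group versus 10,000 per group (made-up data).
small = two_proportion_p_value(10, 100, 13, 100)
large = two_proportion_p_value(1000, 10000, 1300, 10000)

print(f"n=100 per group:    p-value = {small:.3g}")
print(f"n=10000 per group:  p-value = {large:.3g}")
```

With 100 visitors per group, the p-value sits far above the usual 0.05 threshold, so the lift could easily be noise. With 10,000 per group, it falls far below it. The 3 points are the same; only the amount of evidence behind them has changed.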
- A Good Data Scientist Makes Efficient Predictions: The data scientist must have a broad knowledge of algorithms to select the right one, and moreover, know which features to adjust to best feed the model. There is often a certain degree of creativity involved here; as a painter uses color to convey depth, a data scientist must know how to combine different data so they complement each other.
What to look for: Instinctively knows which features to add to the model.
- A Good Data Scientist Provides Production-Ready Solutions: Today's data scientists need to provide services that can run daily, on live data. This is a change: historically, back-office models built by BI or data mining teams were often rewritten by technical teams for real-time production environments. Nowadays, a recommender system cannot withstand a rewrite before being put online.
What to look for: Knows how to deliver production-ready solutions.
- A Good Data Scientist Can Work On A Mass Scale: A data scientist must know how to handle multi-terabyte datasets to build a robust model that holds up in production. He must not be afraid of datasets with a 12-digit file size. In practice, this means that he needs to have a good idea of computation time, of what can be done in memory, and of what, on the other hand, requires Hadoop and MapReduce.
What to look for: Someone not afraid of big datasets.
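As a back-of-the-envelope illustration of the in-memory question, the sketch below compares a file's size with the RAM available. Everything in it is an assumption made for illustration: the function name, the 16 GB machine, and especially the overhead factor of 3, which is a rough guess at how much a file expands once parsed into memory and varies widely with format and tooling.

```python
GB = 10**9  # decimal gigabyte, in bytes


def fits_in_memory(file_size_bytes, ram_bytes, overhead_factor=3.0):
    """Rough check: is the parsed dataset likely to fit in RAM?

    overhead_factor is an assumed multiplier for how much larger the data
    becomes once loaded; measure it on a sample rather than trusting 3.0.
    """
    return file_size_bytes * overhead_factor <= ram_bytes


laptop_ram = 16 * GB   # assumed machine
small_file = 2 * GB    # 10-digit byte count
big_file = 250 * GB    # 12-digit byte count

for name, size in [("small_file", small_file), ("big_file", big_file)]:
    verdict = "in memory" if fits_in_memory(size, laptop_ram) else "distributed"
    print(f"{name}: {size} bytes ({len(str(size))} digits) -> {verdict}")
```

A check like this is no substitute for profiling, but it captures the habit described above: estimate first, then decide whether a single machine will do or whether the job belongs on a cluster.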