How to Hire Great Data Scientists

Scaling AI Romain Doutriaux

Today is International Unicorn Day, and we’re joining Pivigo in their campaign to show the breadth, diversity, and power of data scientists and the data science network. Because they’re #NotMythicalAnymore! We thought we’d focus on how to hire great ones, since it’s something that we’ve found many companies still struggle with today.


We have a pretty amazing team of data scientists here at Dataiku who solve all kinds of problems for enterprises large and small in a vast array of industries. They come from a wide range of backgrounds and experience. Here’s a small blurb about just a few members of our data science team:

After graduating from Manchester University, Silviu started his career at 1stTouch as the company's sole data scientist, building analytical products for the housing sector. He is now a member of the Dataiku team in the UK.

Aimee has a PhD from the University of Oxford at CERN in experimental particle physics, and her latest project takes a Bayesian approach to predicting football player performance. In her free time, she loves to travel and try new food.

Harizo finished a PhD in computational statistics at CEA, the national agency created after WWII to design the French atomic bomb. Now, the agency deals with a broader scope of subjects. For instance, Harizo designed methods to locate pollution sources based on various sources of data.

Guilherme helps customers at Dataiku build and deploy predictive applications. Before joining the team here, he was a fellow at the Insight Data Science Fellowship program, and prior to that he worked in quantitative finance. He holds a PhD in applied mathematics.

A very tall data scientist who, after ending a collegiate basketball career, explored and pursued the magical world of computer science, data, and machine learning, Alex W loves tech startups, data science, and being a part of the Dataiku family.

So You Want to Hire a Data Scientist …

The term “data scientist” is generally credited to DJ Patil and Jeff Hammerbacher, who in 2008 were leading data teams at LinkedIn and Facebook, to describe their work deriving business value from the masses of data being generated by their websites. Since then, data scientists have become much more mainstream, but there are definitely still people, and perhaps organizations, out there that wonder exactly what it is they do and what to look for when hiring.

If you are seeking an intrepid data scientist to lead your company into the wide blue yonder of big data, what magic skill set should you be looking for? Well, to start with, we recommend reading James Kobielus' admittedly old but still hilarious Data Scientists: Myths and Mathemagical Superpowers, where he will disabuse you of any preconceived notions: the data scientist is neither unicorn nor trumped-up BI analyst, and he came down from the ivory tower a while ago.


And if you drill down more on the technical side, you can get an entertaining, but stressful, map of the many tasks and tools to master, as shown in Becoming a Data Scientist — Curriculum via Metromap.

But at the core, it really boils down to these...

Six Skills to Look For in a Data Scientist Hire:

  1. A Good Data Scientist Communicates Effectively to Business Users: The harsh reality is that statistics are complex. A data scientist has no hope of enlightening the average business user with an Excel file. To let the data tell a story, a data scientist needs a veritable Swiss Army knife of presentation skills to convey results persuasively, to anyone. This can range from the most mundane (a PowerPoint presentation) to the most exotic (multimedia storytelling using interactive JavaScript visualizations built with D3).
    What to look for: Can create a PowerPoint presentation as powerful as your Marketing VP's.
    How to test: When presenting his results, does your data scientist remember to highlight the hypothesis to explore in green, and the one to reject in red?
  2. A Good Data Scientist Knows Your Business: A data scientist needs to have an overall understanding of the key challenges in your industry, and consequently, your business. She must be familiar with the industry's financial ratios to rapidly assess whether there is a potential gain, its order of magnitude, and then find inspiration before taking her next breath. Another characteristic of a true data scientist is that she's fascinated by the subjects that will have the greatest impact, not the problems in themselves. A data scientist is not a scientist in the traditional sense; it’s not the quest for truth that drives her, but the process to uncover it.
    What to look for: Goes for the highest stakes rather than the most complete solution.
    How to test: Does your data scientist prefer to deal with a safe bet of $100k or a risky endeavor worth $1M?
  3. A Good Data Scientist Understands Statistical Phenomena: Data scientists must be able to correctly interpret statistics: is a result representative or not? This takes an understanding of statistics that allows the data scientist to assert, with authority, why 3% is statistically significant for certain cases, but means nothing for others. This skill is key, since the majority of stats we analyze contain statistical bias that needs correcting.
    What to look for: Can understand what is statistically significant.
    How to test: Does your data scientist go bug-eyed when you ask him to calculate a confidence interval for his assessment?
  4. A Good Data Scientist Makes Efficient Predictions: The data scientist must have a broad knowledge of algorithms to select the right one, and moreover, know which features to adjust to best feed the model. There is often a certain degree of creativity involved here; as a painter uses color to convey depth, a data scientist must know how to combine different data so they complement each other.
    What to look for: Instinctively knows which features to add to the model.
    How to test: Has your data scientist already tried all the variables or derivatives that you can possibly think of?
  5. A Good Data Scientist Provides Production-Ready Solutions: Today's data scientists need to provide services that can run daily, on live data. What's new here is that historically, back-office models built by BI or data mining teams were often rewritten by technical teams for real-time production environments. Nowadays, a recommender system cannot withstand a rewrite before being put online.
    What to look for: Knows how to deliver production-ready solutions.
    How to test: Does your data scientist run the other way when asked to code his algorithm in Java?
  6. A Good Data Scientist Can Work On A Mass Scale: A data scientist must know how to handle multi-terabyte datasets to build a robust model that holds up in production. He must not be afraid of datasets with a 12-digit file size. In practice, this means that he needs to have a good idea of computation time, what can be done in memory and what, on the other hand, requires Hadoop and MapReduce.
    What to look for: Someone not afraid of big datasets.
    How to test: Does the prospect of reconciling several customer datasets of a few million lines apiece make your data scientist break into a cold sweat?
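
To make skill 3 a little more concrete, here is a minimal Python sketch of why a 3% difference can be statistically significant in one case and mean nothing in another. Everything in it is an assumption made for illustration: the conversion counts are invented, the function name two_proportion_z_test is hypothetical, the test relies on a normal approximation, and the 0.05 cutoff is only the conventional choice. A candidate who is comfortable with statistics should be able to explain, and critique, something along these lines.

```python
import math

def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for the difference between two rates.

    Uses the pooled normal approximation, which is only reasonable
    when both groups have a decent number of successes and failures.
    Returns (difference, z statistic, p-value).
    """
    rate_a = successes_a / n_a
    rate_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_b - rate_a, z, p_value

# Hypothetical data: the same 10% vs 13% rates at two sample sizes.
small = two_proportion_z_test(20, 200, 26, 200)
large = two_proportion_z_test(2000, 20000, 2600, 20000)

for label, (diff, z, p) in (("200 per group", small), ("20,000 per group", large)):
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"{label}: diff={diff:.3f}, z={z:.2f}, p={p:.3g} -> {verdict} at 0.05")
```

With these made-up numbers, the three-point gap is not significant at 200 users per group but is overwhelmingly significant at 20,000 per group, even though the observed rates are identical.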
