The Dataiku Blog

The New Search : Fuzzy, Instantaneous, and Local

At Dataiku, we make extensive use of search logs and associated navigation information for user behaviour analytics and relevance optimization. Most of our customers today use SOLR or ElasticSearch....

data science, Technology | May 03, 2013 | Florian Douetteau

A Complete Guide to Writing Hive UDF

Note that this guide is quite old (it was written when Hive was at version 0.10) and might not apply as-is to recent Hive releases. Use at your own risk :)

Dataiku DSS provides deep integration...

Hadoop, data science, Technology | April 30, 2013 | Clement

Kaggle Contest: Blue Book For Bulldozers

Perhaps you know Kaggle and its slogan “making data science a sport”?

Kaggle is a cool platform for predictive modeling competitions where the best data scientists face each other, all trying to...

data science, machine learning, python | April 25, 2013 | Matt Scordia

Thomas at Strata - Part 2

The previous post on my trip to Strata describes my first day there. You may want to read it here.

The next two days were focused on keynotes and presentations, as well as exhibitors' products...

Corporate, Events, strata | March 20, 2013 | Thomas Cabrol

Thomas at Strata - Part 1

I've been lucky enough to travel to Santa Clara, California, and attend the Strata Conf event. I was there for two days and have plenty of insight and feedback on all the sessions over the course...

Corporate, Events | March 12, 2013 | Thomas Cabrol

Visualizing Your LinkedIn Graph Using Gephi - Part 1

Graph analysis is becoming a key component of data science. A lot of things can be modeled as graphs, but social networks are one of the most obvious examples.

In this post, I am going to show how...

Technology | December 17, 2012 | Thomas Cabrol

Visualizing your LinkedIn Graph Using Gephi - Part 2

In the previous post, we learnt how to get data out of LinkedIn via its API. This task is quite technical, but it is an integral part of every data science project: accessing and manipulating data from...

Data Visualization, Technology | December 07, 2012 | thomas

Setting Up A Cool Data Science Platform Cheaply

Current technologies allow us to build a data science stack for very little, and it will perform as well as, or even better than, tools that cost a lot just a few years ago.

stack, data science, tutorial | October 03, 2012 | Thomas Cabrol

A Simple Recommendation Engine Implemented in Different Languages

Ever wondered why you get a recommendation to watch Raiders of the Lost Ark after giving a good rating to Star Wars on your favorite movie rental service? (yes, back to the '80s...). That's the...

recommendation, data science, Technology | September 10, 2012 | Thomas Cabrol

Visualizing French Income Tax Data

What was supposed to be a simple data visualization side project with some French open data and Tableau Public ended up as something quite complex.

Data Visualization, Data Preparation, Data analysis | July 01, 2012 | Thomas Cabrol