Are you interested in participating in a data challenge? Find out how Dataiku Data Science Studio can help you with this example from AXA and datascience.net's latest challenge.
To start off 2015 with a fun data challenge, AXA and datascience.net launched "Building a cross-selling affinity score for an insurance product during a telemarketing campaign." The contest, which was composed of two distinct phases, challenged participants to score insurance products for cross-sales. The prizes ranged from €5000 (first place) to €500 (6th place).
- At the end of the first phase, a quantitative metric (minimum of 10% lift) determined the 6 best contributors.
- In the second phase (March 9, 2015 - March 27, 2015), the 6 participants replicated their project in Dataiku's Data Science Studio. Based on these results, a jury composed of datascience.net experts and an AXA judge ranked the finalists and awarded prizes accordingly.
In this blog post, you'll find a few tips and tricks on using Dataiku Data Science Studio for the challenge. Even though it's over now, you can still play around with the project! (Sorry, these are old screenshots of Dataiku DSS; find out what the new version looks like here.)
Let's get started!
First, download and install the DSS Community Edition.
Second, download the AXA project by clicking here. When you've downloaded the file, please import it as follows:
Now, enter the Data Science Studio flow:
And here is a little recap of the data you see above in "datasets source":
The challenge is scored according to lift, computed only on the 10% of records to which your model assigns the highest probabilities (see below):
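To make the metric concrete, here is a minimal sketch of lift on the top 10%, using the common definition: the positive rate among the highest-scored records divided by the positive rate overall. The post does not give the challenge's exact formula, so the function name `lift_at_top`, the rounding of the cutoff, and the handling of ties are assumptions for illustration only.

```python
def lift_at_top(y_true, y_score, fraction=0.10):
    """Lift among the top `fraction` of records ranked by predicted score.

    Hypothetical helper: uses the common definition of lift, which may
    differ in detail from the official challenge scorer.
    """
    if len(y_true) != len(y_score):
        raise ValueError("y_true and y_score must have the same length")
    n = len(y_true)
    if n == 0:
        raise ValueError("inputs must not be empty")

    # Number of records in the top slice (at least one).
    k = max(1, round(n * fraction))

    # Indices sorted from highest to lowest predicted probability.
    order = sorted(range(n), key=lambda i: y_score[i], reverse=True)

    top_rate = sum(y_true[i] for i in order[:k]) / k
    base_rate = sum(y_true) / n
    if base_rate == 0:
        raise ValueError("lift is undefined when there are no positives")
    return top_rate / base_rate


# Toy data: 20 records, 4 positives, the two best-scored ones are positive.
labels = [1, 1, 0, 0, 1, 0, 0, 0, 0, 1] + [0] * 10
scores = [1.0 - 0.05 * i for i in range(20)]
print(lift_at_top(labels, scores))
```

A lift of 1 means the top slice is no better than random selection; the higher the value, the more the model concentrates buyers among its highest-scored customers.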
Project Example in DSS
In the DSS project, here is an example of how you can build your own model:
In the model bench, you can try different algorithms and compare them to each other. In this example, we are testing Logistic Regression (L1 penalty, C=0.15), Logistic Regression (L2 penalty, C=0.15), and a Random Forest:
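If you prefer to reproduce this kind of comparison in a notebook, the sketch below trains the same three model types with scikit-learn. It is only an illustration under stated assumptions: the data is synthetic (generated with `make_classification`, not the AXA dataset), the random forest settings, the 3-fold cross-validation, and the ROC AUC metric are arbitrary choices, and nothing here is claimed to match what the DSS model bench runs internally.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data (assumption: NOT the AXA challenge data).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# The three candidates named above; C=0.15 comes from the post,
# the remaining hyperparameters are illustrative choices.
models = {
    "logreg_l1": LogisticRegression(penalty="l1", C=0.15, solver="liblinear"),
    "logreg_l2": LogisticRegression(penalty="l2", C=0.15, solver="liblinear"),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

# Mean cross-validated ROC AUC per model.
scores = {
    name: cross_val_score(model, X, y, cv=3, scoring="roc_auc").mean()
    for name, model in models.items()
}

# Print the candidates from best to worst.
for name, auc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: mean ROC AUC = {auc:.3f}")
```

Since the challenge is scored on lift rather than AUC, a natural next step is to rank the candidates on top-10% lift computed from held-out predictions.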
Great! When your model is done and you're sure your project is first-prize material, you can submit it on datascience.net. How? Simply put your model in the flow or customize it in a Python notebook:
Now, go to the Export Center in your Data Science Studio and submit online:
If you have any questions, please get in touch with us! Otherwise, good luck!