Data Projects WILL Fail - Learn to Fail Quickly & Efficiently

Scaling AI | Claire Carroll

In case you haven’t heard, a whopping 85% of big data projects fail (we’ve already talked about why). But data science is an inherently risky endeavor, so the challenge today is not how to avoid failure completely, but how to fail in an efficient and productive way.


Building a team and support system to get business value from data is a huge investment, and unless your team has the agility and organization to spread its insights across the company, that energy can be wasted. The problem usually isn't that the technology itself is flawed, but that it's hard for users and businesses to adapt to and integrate data-driven insights.

Daniel Carroll has pinpointed an easy way to develop data-driven solutions that overcome user resistance and early failures, and it's one every entrepreneur has been through: the startup phase. It all boils down to rapid iteration, clear goal-setting, and, of course, our friend operationalization.

Start Small & Iterate

Risk is a key commonality between data science projects and startups. Data science projects can be labor-intensive black boxes that are difficult to scale or incorporate into existing business practices. But luckily, data projects - whether simple predictive models or complex AI systems - don’t have to be perfect on the first (or second) iteration. This is an innate part of startup culture: launching again and again.

Leveraging lean models and learning from them is an ideal approach to data science and machine learning. Getting from nothing to an operationalized model should be as streamlined as possible.

This is one reason collaboration is so critical. In the startup model, everyone works together and learns from each other, with each person tackling a smaller part of the whole (in their area of expertise) to release a final product. The same must be true of data projects; data scientists cannot simply work in a vacuum and present a body of work to executives once a quarter - by the time the team presents a model and gets approval, the data is likely already stale, and it's back to the drawing board.

Set Realistic Goals

George Doran created the S.M.A.R.T. tenets of goal setting as a way of evaluating the productivity and potential success of individual goals. The S.M.A.R.T. system works well with Agile methodology and traditional business hierarchies alike, making it the ideal bridge between the startup and enterprise models.

S.M.A.R.T. goals are Specific, Measurable, Achievable, Realistic, and Timely. Early data goals may not look anything like the end goal, but to stay true to the lean startup model, we need to produce something to start with and build from there.

Following the S.M.A.R.T. principles, a compelling goal model for data integration for an e-commerce business might be:

General Goal: Decrease product waste through customer trend prediction.

Specific: Compare order data with local weather patterns.

Measurable: Yes! We'll explore the correlation between weather and sales (a minimal analysis sketch follows this list).

Achievable: Yes! There are public weather databases, and we have the right data on the customer side. Our data team has the skills needed to complete this goal.

Realistic: Weather is likely a contributing factor to the sales of specific items.

Timely: Once we’ve operationalized the model, we’ll see how it’s performing on real-time data and adjust as needed.
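To make the "Measurable" step concrete, here is a minimal sketch in Python with pandas of what exploring that correlation might look like. The product, column names, and values are purely illustrative assumptions, not real data sources or a prescribed implementation.

```python
import pandas as pd

# Minimal sketch of the "Measurable" step, assuming daily order counts and
# public weather observations have already been pulled into two tables.
# All names and values below are illustrative, not real data.
orders = pd.DataFrame({
    "date": pd.to_datetime(["2024-06-01", "2024-06-02", "2024-06-03", "2024-06-04"]),
    "product": ["iced_tea"] * 4,
    "units_sold": [120, 95, 180, 60],
})
weather = pd.DataFrame({
    "date": pd.to_datetime(["2024-06-01", "2024-06-02", "2024-06-03", "2024-06-04"]),
    "avg_temp_c": [24.0, 21.5, 29.0, 17.0],
})

# Join each day's sales with that day's weather, then measure the correlation
# per product so weather-sensitive items stand out.
merged = orders.merge(weather, on="date", how="inner")
for product, group in merged.groupby("product"):
    corr = group["units_sold"].corr(group["avg_temp_c"])
    print(f"{product}: correlation with average temperature = {corr:.2f}")
```

Even a rough correlation like this is enough to decide whether the weather hypothesis deserves a proper model - exactly the kind of small, fast result the lean approach calls for.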

With this system, you can make sure that your data projects are targeted enough to be implementable and produce concrete results that will improve efficiency and generate trust in more elaborate future models. The beauty of data and AI is that the technology has the potential to produce results you can’t even dream of.

However, you can't start with an astronomical goal; you need to build trust in your data system and integrate it into the business before it can help take things to the next level.

Fail Gracefully

Yet even a prudent system like S.M.A.R.T. cannot guarantee your data project will be successful. Iterative trial-and-error is par for the course with any startup or data venture. Leveraging data is aspirational and risky; if you’re not failing, then you’re not pushing the boundaries enough.

However, failure in and of itself is not the goal; the goal is moving quickly enough that failing is not catastrophic. Ideally, you will run multiple small projects in parallel so that one failure doesn't derail your entire endeavor.

The key is to ensure that you learn from and document your failures so that you don’t repeat mistakes. If you set precise goals, and adjust accordingly if they fail, then you set yourself up for successful data operationalization.


“I have not failed, I’ve just found 10,000 ways that won’t work.” —Thomas Edison

Precise Steps To A Data-Driven Future

The key to successfully integrating data science technology is to start small. Data science is not a magic wand capable of transforming your company overnight. A lean, lower-risk model, supported by considered goal-setting, will lay the strongest foundation for a more robust data strategy. The fact that failure and iteration are a critical part of the data science journey doesn't mean anyone should abandon hope; it just means running targeted, fast projects in order to take advantage of what data can do.
