Incorporating Social Media Data into AI Strategy: Use Cases & Challenges

Use Cases & Projects, Scaling AI Lynn Heidmann

Data from social networks, particularly Twitter, Facebook, LinkedIn, Instagram, Foursquare, and Meetup, is a trove of valuable insight into the mentality, behaviors, and preferences of consumers. And since much of the data is public, there appears to be an easy path to transforming social media data into predictive analytics.

Gone are the days when manual or small-scale analysis of social media data was feasible, so when it comes to AI, companies often don’t know where to start. In general, there are two ways businesses today use social media data as a part of their AI strategy:

  • Social listening: This is the most common way companies today use social data, and it involves analyzing a specific set of data (usually text) from one or more platforms in order to derive some insight and take action based on that insight.
  • Predictive modeling: Good predictive use cases for social media data are rare, but those that exist are powerful and employ advanced machine learning techniques (like deep learning, especially for image recognition on visual networks like Instagram and Pinterest). As opposed to social listening, where the end goal is simply making sense of the data that exists, the goal here is to take social data and use it to predict future unknowns. To date, a wide range of models has been used successfully for predictive analytics with social data, including regression, neural networks, SVMs, decision trees, ARIMA, dynamic systems, Bayesian networks, and combined models. If you’re interested in getting into deeper detail and some specific use cases, this is an excellent read.
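To make the social listening idea concrete, here is a minimal Python sketch. It assumes the posts have already been collected as plain strings; the sample posts, the topic keywords, and the sentiment word lists are all hypothetical, and a real project would most likely swap the word lists for a trained sentiment model.

```python
from collections import Counter, defaultdict
import re

# Hypothetical keyword lists for illustration only.
TOPICS = {
    "luggage": {"luggage", "bag", "bags", "suitcase"},
    "delays": {"delay", "delayed", "late"},
}
POSITIVE = {"great", "love", "thanks", "fast"}
NEGATIVE = {"lost", "terrible", "angry", "worst"}


def tokenize(text):
    """Lowercase a post and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def listen(posts):
    """Count positive, negative, and neutral posts per topic."""
    summary = defaultdict(Counter)
    for post in posts:
        words = set(tokenize(post))
        # Crude sentiment score: positive words minus negative words.
        score = len(words & POSITIVE) - len(words & NEGATIVE)
        label = "positive" if score > 0 else "negative" if score < 0 else "neutral"
        for topic, keywords in TOPICS.items():
            if words & keywords:
                summary[topic][label] += 1
    return {topic: dict(counts) for topic, counts in summary.items()}


sample_posts = [
    "They lost my luggage again, worst airline ever",
    "Flight delayed two hours, terrible",
    "Love the new app, my bag arrived fast",
    "Still waiting on my suitcase",
]
report = listen(sample_posts)
print(report)
```

The output is a per-topic tally that someone can act on, which is the "derive some insight and take action" step described above. Predictive modeling would go further and use tallies like these as input features for a model.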

But the first step in incorporating social media data into a big data strategy is to have a firm grasp on the end goal.


Social data is abundant and can be insightful, but how can businesses use it effectively?

Ultimately, social media data can’t exist in a vacuum - analyzing it isn’t usually the goal per se; it’s a step along the way to a larger business goal. Here are a few examples of business goals that can be a good basis for transforming social media data into predictive analytics:

Improved Customer Service

Delta Air Lines learned by analyzing social data that its customers’ biggest frustration was lost luggage, and it launched an innovative new product to address the problem.

Research and Development for New Products or Offerings

One car manufacturer redesigned the seats in one of its most popular models based on sentiment in the social sphere that the seats were uncomfortable.

Better Engagement with Key Influencers

L’Oréal used social data to find influencers in the beauty sphere for product trials and promotion, and they also used internal social network data to find company influencers to champion employee initiatives.

More Robust Recommendation Systems (Especially via Deep Learning)

A simple example would be an e-commerce business that recommends products a shopper’s friends have already purchased, on the theory that these recommendations are more likely to convert. A deeper and more complex (but also more powerful) use case involves image recognition and deep learning. For example, a travel site could use images from Instagram or Pinterest and apply deep learning to understand a user’s preferences.
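The simple friend-based case can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `friends` and `purchases` dictionaries are hypothetical stand-ins for a social graph and an order history, and ranking by the number of friends who bought a product is one possible scoring rule among many.

```python
from collections import Counter

# Hypothetical data: who is connected to whom, and who bought what.
friends = {
    "ana": {"ben", "cho", "dev"},
    "ben": {"ana"},
}
purchases = {
    "ana": {"tent"},
    "ben": {"tent", "stove", "lantern"},
    "cho": {"stove", "boots"},
    "dev": {"stove", "lantern"},
}


def recommend(shopper, top_n=3):
    """Rank products by how many of the shopper's friends bought them."""
    owned = purchases.get(shopper, set())
    counts = Counter()
    for friend in friends.get(shopper, set()):
        # Skip anything the shopper already owns.
        for product in purchases.get(friend, set()) - owned:
            counts[product] += 1
    # Sort by friend count (descending), then by name for a stable order.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [product for product, _ in ranked[:top_n]]


print(recommend("ana"))
```

Here the stove ranks first because all three of Ana's friends bought one. The image-based use case would replace these purchase counts with preferences inferred by a deep learning model, which is beyond a short sketch.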

Predicting Upcoming Trends Before They Happen

News outlets might predict trending topics to prepare coverage, clothing retailers might predict styles to buy or design for a new season, and grocery stores might anticipate demand for certain items and stock more of them ahead of big events.

Better Business-to-Business Relations

A business might use social sentiment analysis to identify news outlets (or other businesses) consistently disseminating negative press and address it directly.

More Targeted Marketing

Businesses across industries can use social data to create more specific customer segments, drilling down deeper into their interests to deliver hyper-relevant marketing.

More Efficient Recruiting and Hiring

Recruiting departments can leverage professional networks to quickly understand who the top employers for the desired skillset are, what current employees have in common, and where lost talent goes.

Predictive Analytics for Business Decisions or Intelligence

Businesses with many locations or branches can use data from a service like Foursquare to predict where a new location would perform best. Or cities can similarly use geo-based social media data to predict crime, determine the best locations for gas or sewer lines, etc.


Even with an end goal in mind, proceed with caution; incorporating social media data into a data project seems like a worthwhile endeavor in this day and age, partially for the reasons mentioned above - it’s abundant, and (seemingly) free. But successfully incorporating social media into a data project can be more challenging than meets the eye because:

  1. Social media data is unstructured (as opposed to traditional structured data - here’s a good explanation of the difference) and can be more difficult to work with partially because unstructured data sets tend to be so large. But also, traditional systems - like relational database management systems - were not built for unstructured data, so it’s harder to analyze without proper tools that help parse, organize, manage, and make sense of it. Cutting-edge companies working with social data (particularly images) are turning more toward deep learning to bring meaning to large-scale unlabeled data. So today, it is actually the preprocessing of social data that is widely considered to be the most challenging component from a computational, big data analytics perspective.



2. Depending on exactly what type of social media data you want to use, there can be an issue of too much noise to be valuable (e.g., looking at a popular hashtag on Twitter) or not enough data to be valuable (e.g., analyzing company social media mentions for a brand that doesn’t see very much social traffic). Both factors can limit how advanced a predictive model based on social data can be.

3. Often, analyzing social media data in and of itself doesn’t turn out to be very useful, so many companies fail to see value from these projects. For example, let’s say your analysis finds a spike in buzz around your product on social media. That’s really only half the story - does this also correlate with a spike in sales? How many of those talking about the product on social media actually made a purchase? This problem is compounded if you have massive amounts of real-time social data but, for example, sales numbers are only available monthly (or even quarterly) - with such a large lag between explanatory factors and outcome, the relationship between the two often proves difficult to identify and use.

4. To get the other half of the story, some businesses try to combine social data with other data to get more value out of the analysis - this is a great idea (when it works, it’s best practice). But when combining social media data with other sources (like transactional data), it’s often difficult to tie a customer’s identity online to their customer ID, or whatever method is used to identify them in your company’s systems. If tying these two identities together proves to be impossible, then it won’t be possible to build effective predictive models based on social data; you would never be able to test whether an observed behavior in social data led to a desired behavior.

5. When consumers post about businesses or services, there’s generally a bias toward extremes - that is, posts are dominated by the very satisfied and the very unsatisfied. Also, certain populations (such as one gender, specific age ranges, or even specific geographies) may be overrepresented. Questioning social data and analyzing it for biases like these before using it as part of a larger strategy is critical to preventing errors due to poor context.
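The lag problem from point 3 can be illustrated with a small Python sketch: roll the fine-grained social data up to the period at which sales are reported, then check how well buzz in one period lines up with sales in a later one. All figures below are made up for illustration, the function names are hypothetical, and the sketch assumes the months are consecutive with no gaps.

```python
from collections import defaultdict
from math import sqrt


def monthly_totals(daily_mentions):
    """Sum (YYYY-MM-DD, count) pairs into per-month totals."""
    totals = defaultdict(int)
    for day, count in daily_mentions:
        totals[day[:7]] += count  # "2023-01-05" -> "2023-01"
    return dict(totals)


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def lagged_correlation(mentions, sales, lag):
    """Correlate mentions in month t with sales in month t + lag."""
    months = sorted(set(mentions) & set(sales))
    xs = [mentions[m] for m in months[: len(months) - lag]]
    ys = [sales[m] for m in months[lag:]]
    return pearson(xs, ys)


# Made-up data: daily mention counts and monthly sales figures.
daily_mentions = [
    ("2023-01-05", 4), ("2023-01-20", 6), ("2023-02-03", 30),
    ("2023-03-10", 12), ("2023-03-25", 8), ("2023-04-14", 50),
    ("2023-05-02", 25), ("2023-05-19", 15), ("2023-06-07", 60),
]
sales = {
    "2023-01": 150, "2023-02": 120, "2023-03": 160,
    "2023-04": 140, "2023-05": 200, "2023-06": 180,
}

mentions = monthly_totals(daily_mentions)
best_lag = max(range(3), key=lambda lag: lagged_correlation(mentions, sales, lag))
print("lag with the strongest correlation:", best_lag)
```

With only a handful of reporting periods, any correlation like this is noisy, which is the difficulty point 3 describes. A strong lagged correlation is a prompt for further testing rather than proof that social buzz drives sales.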

Get Started

If you’ve decided to make the leap despite its challenges, a data science platform can speed up time spent on data prep for social data as well as make the project accessible to someone without coding or machine learning experience. Check out this post on predicting ISIS association based on Tweet content or this one looking at sentiment toward public transit for examples.

Or watch the video to see how one of our partners, Hewlett Packard Enterprise, along with Luciad analyzed millions of Tweets using Dataiku and built a visualization of the extracted, classified, and clustered Tweets:
