Advanced Analytics for Customer Insights

Marcello Garritano
December 15, 2022

Customer insights teams, more than most functions, have struggled to take advantage of the past decade of innovation in data analytics technology. In large part this is because those innovations were built for generic, common-denominator use cases: basic querying and dashboarding of a single data source with a stable structure. Customer insights work is significantly more complex, and it requires flexible, multi-source analysis that goes beyond the rigid, single-purpose tools in the generic analytics toolkit. An emerging class of more versatile analytical platforms is finally giving customer insights teams the ‘Swiss army knife’ they need to survive and thrive in the untamed wilderness in which they operate. Here’s a look at some of the key challenges and the emerging solutions every customer insights team should be using:

Manual Data Collection and Prep 

Collecting and preparing data from an array of research data tools is a ‘necessary evil’ that unfortunately accounts for the majority of the effort needed to get to actual insights. This lower-value work often consumes so much of an insights team’s bandwidth that there isn’t enough time left for higher-value interpretation and insight discovery. That is a poor outcome for organizations that rely on high-quality insights to shape the direction of the business: work done under time pressure tends to produce a disjointed array of charts or tables rather than clear insights. Digging deeper, the root causes behind this problem can be addressed with a new set of automation approaches.

The challenge of collecting multiple data sources into one place

Insights data sources are not the best fit for consolidation in a central data warehouse. The schema of many insights datasets is constantly in flux because of new customer behaviors and data variables. For example, survey data often changes with every new iteration of the survey (new questions, new response values, etc.). Other data sources, like social listening, vary widely as new topics emerge within organic conversations.

Many of the core data tools in the research & insights space are legacy tools without a formal API for programmatic extraction, transformation, and loading into a warehouse. As a result, insights analysts resort to manually logging into these tools, configuring parameters, and downloading raw CSV data files. This process may need to be repeated for different markets, segments, or other cuts of the data, which means tens or hundreds of CSV files need to be manually stitched together in Excel. If the data volume is significant, analysts run into the dreaded Excel row limit (1,048,576 rows per worksheet), or Excel simply crashes.
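To make the stitching step concrete, here is a minimal sketch in Python of combining a folder of CSV exports without Excel. The function names and the `source_file` column are illustrative assumptions, not part of any vendor tool. Columns are unioned across files, so an export that gained a new survey question still merges cleanly.

```python
import csv
from pathlib import Path


def stitch_csv_exports(folder, pattern="*.csv"):
    """Combine every CSV export in `folder` into one list of row dicts.

    Columns are unioned across files, so an export that gained or lost
    a column (e.g. a new survey question) does not break the merge.
    """
    rows, columns = [], []
    for path in sorted(Path(folder).glob(pattern)):
        with open(path, newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                row["source_file"] = path.name  # remember which cut this came from
                rows.append(row)
                for name in row:
                    if name not in columns:
                        columns.append(name)
    # Fill columns that a given file did not have with an empty string.
    combined = [{name: row.get(name, "") for name in columns} for row in rows]
    return columns, combined


def write_csv(path, columns, rows):
    """Write the combined rows back out as a single CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=columns)
        writer.writeheader()
        writer.writerows(rows)
```

Calling `stitch_csv_exports("exports/")` and then `write_csv("combined.csv", columns, rows)` would produce one file. Writing the output outside the input folder keeps it from being picked up on the next run.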

Beyond structured data:  The challenge of working with semi-structured and unstructured data

Survey data often comes with an accompanying data map in a semi-structured format (e.g. JSON or XML) that describes the logical relationships and mappings needed to convert a raw responses file into a schema humans can analyze. Applying these data maps typically requires someone with coding knowledge (Python or R). More importantly, the survey structure often produces variable groupings that can’t be captured and analyzed in a traditional data warehouse schema (e.g. multi-code survey questions, where multiple columns represent answers to a single question ‘group’ and need to be analyzed with crosstabs).
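As an illustration of what applying a data map involves, the sketch below decodes coded responses and tabulates a multi-code question against a single-code one. The JSON layout, question IDs, and labels are all hypothetical; real survey platforms each define their own data map format.

```python
import json
from collections import Counter

# Hypothetical data map; real vendors each use their own layout.
DATA_MAP_JSON = """
{
  "questions": {
    "Q1": {"label": "Age group", "type": "single",
           "columns": ["Q1"],
           "values": {"1": "18-34", "2": "35-54", "3": "55+"}},
    "Q2": {"label": "Brands purchased", "type": "multi",
           "columns": ["Q2_1", "Q2_2", "Q2_3"],
           "options": {"Q2_1": "Brand A", "Q2_2": "Brand B", "Q2_3": "Brand C"}}
  }
}
"""


def decode_responses(raw_rows, data_map):
    """Turn coded rows into labelled rows; multi-code groups become lists."""
    decoded = []
    for raw in raw_rows:
        row = {}
        for spec in data_map["questions"].values():
            if spec["type"] == "single":
                code = raw.get(spec["columns"][0], "")
                row[spec["label"]] = spec["values"].get(code)
            else:  # multi-code: one 0/1 column per answer option
                row[spec["label"]] = [
                    spec["options"][col]
                    for col in spec["columns"]
                    if raw.get(col) == "1"
                ]
        decoded.append(row)
    return decoded


def crosstab_multi(decoded_rows, row_var, multi_var):
    """Count respondents per (row_var value, selected option) pair."""
    counts = Counter()
    for row in decoded_rows:
        for option in row[multi_var]:
            counts[(row[row_var], option)] += 1
    return counts


data_map = json.loads(DATA_MAP_JSON)
raw_rows = [
    {"Q1": "1", "Q2_1": "1", "Q2_2": "0", "Q2_3": "1"},
    {"Q1": "1", "Q2_1": "1", "Q2_2": "1", "Q2_3": "0"},
    {"Q1": "3", "Q2_1": "0", "Q2_2": "1", "Q2_3": "0"},
]
decoded = decode_responses(raw_rows, data_map)
table = crosstab_multi(decoded, "Age group", "Brands purchased")
```

Because a respondent can select several options, the cells of `table` sum to more than the number of respondents. This is why multi-code groups do not fit a one-value-per-column warehouse schema.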

In other cases, the challenge is making sense of unstructured data from sources like social media, review sites, or open-ended survey responses. Individual data vendors may offer home-grown text analytics tools for their own data source. However, these tools differ in methodology, so text analytics is applied inconsistently across sources and the results can’t be compared.

No-code automation tools like Redbird give insights teams flexible ingestion tooling that handles these challenges in a fraction of the time, freeing up capacity for higher-value insights work.

Uncovering Insights Requires Deeper Analytical Approaches

Uncovering meaningful customer insights requires intensive effort and multiple analysis iterations.

Analysis without a hypothesis

Insights teams often don’t have a hypothesis for what they want to analyze. Currently they use manual approaches to organically discover customer segments or identify themes. Data science-driven approaches can do the same discovery at a larger scale, more consistently, and with less exposure to researcher bias. For example, unsupervised learning algorithms can automatically identify previously unknown segments based on different input variables. Clustering algorithms can also detect themes in unstructured text and save hours of human review.
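As one example of unsupervised segment discovery, here is a minimal k-means sketch in plain Python. K-means is only one of many clustering algorithms, and the choice of inputs, the number of clusters `k`, and the sample data are assumptions for illustration. In practice the input variables would usually be standardized first so that no single variable dominates the distance.

```python
import math
import random


def kmeans(points, k, iterations=50, seed=0):
    """Plain k-means: returns (labels, centroids) for a list of numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k observations picked at random
    labels = None
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # keep the old centroid if a cluster emptied out
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
    return labels, centroids


# Hypothetical respondents described by two already-scaled variables.
respondents = [
    (1.0, 1.2), (0.8, 1.0), (1.1, 0.9),
    (5.0, 5.2), (5.1, 4.8), (4.9, 5.1),
]
labels, centroids = kmeans(respondents, k=2)
```

Here `labels` assigns each respondent to a discovered segment and `centroids` describes the average profile of each one. Interpreting and naming the segments remains the analyst's job.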

Flexible exploration and validation

Another typical part of the insights process involves exploring different cuts of the data using crosstabs or pivot tables, which can take many manual iterations to uncover an insights story. Statistical methods like significance testing also need to be applied to confirm that a pattern is robust enough to constitute an insight. Automated approaches can rapidly test all potential combinations and quickly home in on significant patterns.
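A minimal sketch of what automated testing across combinations could look like: every pairing of a yes/no segment variable with a yes/no outcome variable is turned into a 2x2 crosstab and checked with a Pearson chi-square test. The function names and the use of a Bonferroni correction are assumptions for illustration. Testing many combinations at once inflates false positives, so some form of multiple-comparison correction is advisable.

```python
import math
from itertools import product


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns (statistic, p_value). With one degree of freedom the
    p-value has the closed form erfc(sqrt(statistic / 2)).
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        return 0.0, 1.0  # a constant variable carries no association
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    return statistic, math.erfc(math.sqrt(statistic / 2))


def scan_pairs(rows, segment_vars, outcome_vars, alpha=0.05):
    """Test every segment/outcome pair of 0/1 variables; keep significant ones."""
    pairs = list(product(segment_vars, outcome_vars))
    threshold = alpha / len(pairs)  # Bonferroni correction for multiple tests
    findings = []
    for seg, out in pairs:
        a = sum(1 for r in rows if r[seg] == 1 and r[out] == 1)
        b = sum(1 for r in rows if r[seg] == 1 and r[out] == 0)
        c = sum(1 for r in rows if r[seg] == 0 and r[out] == 1)
        d = sum(1 for r in rows if r[seg] == 0 and r[out] == 0)
        statistic, p_value = chi_square_2x2(a, b, c, d)
        if p_value < threshold:
            findings.append((seg, out, round(statistic, 2), p_value))
    return sorted(findings, key=lambda f: f[3])  # strongest evidence first
```

Given a list of respondent dicts with 0/1 values, `scan_pairs(rows, ["under_35", "urban"], ["buys_online"])` would return only the pairs whose association survives the corrected threshold. The variable names here are hypothetical.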

Linking Datasets in the Absence of Panel Overlap and Respondent Level Data

Panel overlap between data sources is rare and expensive. Even for large research providers like Nielsen, which manage multiple panels under one roof, connecting all of their data assets is a challenge.

Some companies try to solve this problem by using a first-party customer database as the unifying core, with multiple insights data sources built on top of it. This is expensive to maintain and difficult for many organizations that don’t have the necessary specialization in-house. More importantly, it restricts the research to current customers rather than the broader population or market.

Data science is changing all of this, giving brands the power to create unified views of segments across multiple panels or data sources without the need for respondent-level data. Attributes can be mapped across different data sources (e.g. individual demographic or psychographic variables are mapped across two or more datasets). Once mappings are in place, data can be statistically normalized and aggregated to a segment level, then fed into multivariate proximity modeling to link segments across the data sources algorithmically. This allows the creation of unified insights for a given segment using not just one data source, but many. With this approach, for example, brands can uncover their core audience segments and understand with confidence their purchasing patterns, online browsing behaviors, and media consumption, enabling a holistic understanding that feeds into marketing strategy across the customer journey.
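The article does not specify a particular proximity model, so the following is only a sketch of one simple interpretation: standardize the shared attributes within each source (z-scores), then link each segment to its nearest neighbor in the other source by Euclidean distance. The segment names, attribute names, and values are all hypothetical.

```python
import math
from statistics import mean, pstdev


def zscore_columns(segments, attributes):
    """Standardize each attribute across the segments of ONE data source."""
    scaled = {name: {} for name in segments}
    for attr in attributes:
        values = [segments[name][attr] for name in segments]
        mu, sigma = mean(values), pstdev(values)
        for name in segments:
            scaled[name][attr] = (segments[name][attr] - mu) / sigma if sigma else 0.0
    return scaled


def link_segments(source_a, source_b, attributes):
    """Pair each segment in source_a with its closest segment in source_b."""
    a = zscore_columns(source_a, attributes)
    b = zscore_columns(source_b, attributes)
    links = {}
    for name_a, profile_a in a.items():
        def distance(name_b):
            return math.dist([profile_a[x] for x in attributes],
                             [b[name_b][x] for x in attributes])
        best = min(b, key=distance)
        links[name_a] = (best, round(distance(best), 3))
    return links


# Hypothetical segment-level aggregates from two panels with no shared respondents.
SHARED = ["pct_under_35", "pct_urban", "income_index"]
survey_panel = {
    "Trend seekers": {"pct_under_35": 0.70, "pct_urban": 0.80, "income_index": 95},
    "Value hunters": {"pct_under_35": 0.35, "pct_urban": 0.40, "income_index": 80},
    "Established": {"pct_under_35": 0.15, "pct_urban": 0.55, "income_index": 130},
}
media_panel = {
    "Streamers": {"pct_under_35": 0.65, "pct_urban": 0.75, "income_index": 100},
    "Linear TV loyalists": {"pct_under_35": 0.10, "pct_urban": 0.50, "income_index": 125},
    "Bargain browsers": {"pct_under_35": 0.40, "pct_urban": 0.45, "income_index": 85},
}
links = link_segments(survey_panel, media_panel, SHARED)
```

Nearest-neighbor matching can map several segments onto the same target. If a strict one-to-one linkage is needed, an assignment method would be a better fit. The returned distance gives a rough sense of how confident each link is.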


With the right analytics operating system, all of the advanced analytics work described above can be recorded into automated workflows that allow for rapid analysis refreshes.  This has a huge impact for insights teams by empowering them to run analysis more frequently, with less manual effort, and with a timeliness that enables taking action from a marketing or product development perspective before competitors are aware of the same trends.