In the data exploration phase of a data analysis project, patterns and relationships in a dataset can be hard to spot, especially when there are many columns or when you do not yet know where to begin your analysis.
I recently spoke with a data analyst who was examining manufacturing data for process improvements. The raw data was a log file that needed a great deal of preparation before analysis. While there were some ideas about what to look at, there was no definitive variable to start with.
Ultimately, the analyst needed to determine which variables were correlated with, or relevant to, efficiency improvements in the production line. Visual data tools for data preparation and analysis enabled the discovery of a potential time-savings opportunity in the queuing time on the production line. This insight was unexpected, as the station-level process had previously been determined to be maximally efficient. Further data collection targeted at this system and the queuing process confirmed a clear opportunity to improve operational efficiency.
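For readers who prefer code to a visual UI, the "which variables are correlated" question can be sketched in a few lines. This is a minimal illustration, not the analyst's actual method: it assumes the prepared log data is already in a pandas DataFrame, and the column names (`queue_time_s`, `station_cycle_s`, `units_per_hour`) and all values are hypothetical, made up for the example.

```python
import pandas as pd


def rank_correlations(df: pd.DataFrame, target: str) -> pd.Series:
    """Rank numeric columns by absolute Pearson correlation with `target`."""
    numeric = df.select_dtypes(include="number")
    corr = numeric.corr()[target].drop(target)
    # Order by strength of the relationship, keeping the original sign.
    order = corr.abs().sort_values(ascending=False).index
    return corr.reindex(order)


# Hypothetical, made-up measurements for illustration only.
data = pd.DataFrame({
    "queue_time_s": [5, 12, 7, 20, 15, 9],
    "station_cycle_s": [30, 31, 30, 29, 30, 31],
    "units_per_hour": [110, 96, 107, 80, 90, 102],
})

ranked = rank_correlations(data, "units_per_hour")
print(ranked)
```

In this toy data the queuing time is the variable most strongly (and negatively) related to throughput. A correlation ranking like this only suggests where to look first; it does not establish cause.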
Why Use Visual Data Tools?
You may be thinking, “What are visual data exploration tools? How are they different from any other typical method for data exploration?” Let's dive into that briefly. Anyone engaged in data analysis will be familiar with tabular data views, cell-level highlighting, and graphs. Visual data tools go beyond these by allowing users to quickly explore data through a visual UI, code, or both, depending on their preference.
Visual interfaces for data exploration can substantially reduce the time spent on the data exploration phase and improve its results. Data scientists and analysts receive data in a variety of disparate formats, each typically requiring a different approach to exploration. Visual data tools allow us to explore data relationships visually, regardless of data format, to find both quality issues and interesting aspects of the data during initial data preparation. Visual interfaces also offer greater flexibility by enabling multiple branches of exploration and analysis without losing prior work.
Visual data tools make statistical information accessible throughout the data preparation and analysis process, in some cases using AI to identify patterns and make them visible to the analyst for investigation. Transparency of the data exploration process enables reusability, compliance, and governance for organizations. Good visual data tools can reduce days of work to hours.
How Visual Data Tools Help You Dig Deeper to Improve Processes
As mentioned above, the analysis revealed potential improvements in a system thought to be optimized. Prior analyses and optimization efforts had focused on each automated workstation independently. The new analysis used an assisted data exploration tool and identified a pattern of station downtime across larger segments of the line spanning multiple stations.
By exploring the data from many stations on the line simultaneously as a group, the analyst discovered an efficiency improvement that had not been identified when optimizing each station in isolation. Making use of the queuing downtime across workstations reduced individual station downtime and improved overall line throughput. What had previously seemed like a small, unavoidable downtime produced a notable improvement in line throughput once it was capitalized upon.
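The difference between a station-by-station view and a grouped view can be illustrated with a short sketch. It assumes, hypothetically, that the prepared log records idle seconds per station per time window; the station names, the `THRESHOLD_S` cutoff, and every value below are made up for the example and do not come from the analyst's data.

```python
import pandas as pd

# Hypothetical idle seconds per station per one-minute window (made-up values).
log = pd.DataFrame({
    "window": [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3],
    "station": ["A", "B", "C"] * 4,
    "idle_s": [4, 5, 3, 0, 1, 0, 5, 4, 4, 1, 0, 0],
})

# Station-by-station view: each station's average idle time looks small.
per_station = log.groupby("station")["idle_s"].mean()

# Grouped view: one row per window, one column per station.
wide = log.pivot(index="window", columns="station", values="idle_s")

# Assumed cutoff for counting a station as "idle" within a window.
THRESHOLD_S = 3

# Windows in which every station in the segment logged at least
# THRESHOLD_S seconds of idle time.
all_idle = (wide >= THRESHOLD_S).all(axis=1)
shared_windows = wide.index[all_idle]

print(per_station)
print(list(shared_windows))
```

Looking at `per_station` alone, each station appears to lose only a couple of seconds per window. The pivoted view shows that the idle time clusters in the same windows across the whole segment, which is the kind of line-level pattern a per-station analysis would not reveal.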
Visual data tools accelerate the data exploration process and surface insights that are otherwise not easily discovered. A Dataiku project offers data analysts visual tools throughout the process, providing useful insights at each step of the way.