Sep 01, 2015

Generating new forms of business value with text analytics

Big data analytics tools and programs, now available to anyone with an internet connection, have brought semi-structured and unstructured text data into the healthcare analytics ecosystem and enabled institutional informaticians to explore frontiers that were previously out of reach.

What lies beneath the surface?
IDC forecasts show that 90% of the data generated over the next five years will be unstructured, and that the volume of unstructured data will grow at an astonishing rate of 800% over the same period. Of course, this statistic does not account for the volume of text data that sits unused in databases today. The idea behind combining unstructured and structured data sources is that, by adding important contextual dimensions, you can do more than merely scratch the surface of your data. Indeed, context is everything when it comes to reconstructing business narratives, and text analytics provides the techniques necessary to draw a more complete picture.

Demystifying text analytics – test, learn, and compare
Before getting started with a text analytics project, there are a few fundamental points you should be aware of as you sketch out your project plan.

  • Commit yourself to exploring the potential – Don't attempt to learn and understand the intricacies of text analytics and text mining during off hours. Give yourself time to understand the subject, the business problem you are looking to address, and the trade-off of committing time to a new competency your organization can draw on in the future. Without the freedom to explore the potential of this new analytics approach, the idea will never make it off the shelf.
  • Analyze a domain-specific, syntactically shallow data source – Identify a free text data source that is not only linguistically well defined, but also confined to a particular customer, use case, or business function. This helps you reach results quickly and evaluate accurately what could turn into a longer-term business case. Live nurse notes from a core chronic-condition-management program, for instance, would be an ideal candidate.
  • Identify a discrete and well-defined business problem to tackle – Text analytics relies on human discernment rather than on traditional programming alone, so a narrowly scoped problem keeps that judgment focused. Once you have discovered a meaningful set of practical insights in the text data you are mining, don't get paralyzed by the intermediate results; move on!
  • Start by analyzing word frequencies and co-occurrence – Utilize basic precision and recall measures to understand the language subtleties in your data set. Begin the process of classifying your results into topics and concepts in anticipation of future standardization into structured form.
  • Measure residual value gained through text analytics – Did you identify unforeseen factors of interest related to improving customer experience? Did you identify patient safety issues not captured by current reporting and analytics? Were particular judgments confirmed, or otherwise clarified, by having greater depth around the data set? Answering these and other questions will help you and your leadership team determine if further investment is necessary to drive new forms of value for your business.
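
As a rough illustration of the word-frequency and co-occurrence step above, the following Python sketch counts terms and same-note term pairs, then scores a simple keyword rule against hand labels using precision and recall. The sample notes, the stopword list, the keyword rule, and the reviewer labels are all invented for illustration; real clinical notes would need de-identification and a richer tokenizer.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical free-text notes, invented for illustration only.
NOTES = [
    "Patient reports shortness of breath and missed insulin dose.",
    "Reviewed insulin dose schedule; patient denies shortness of breath.",
    "Patient missed follow-up call, reports dizziness after insulin dose.",
]

# Tiny illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"a", "and", "of", "the", "after"}


def tokenize(text):
    """Lowercase the text, keep alphabetic (optionally hyphenated) words, drop stopwords."""
    words = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
    return [w for w in words if w not in STOPWORDS]


def word_frequencies(docs):
    """Count how often each term appears across all documents."""
    counts = Counter()
    for doc in docs:
        counts.update(tokenize(doc))
    return counts


def cooccurrence(docs):
    """Count pairs of distinct terms that appear in the same document."""
    pairs = Counter()
    for doc in docs:
        terms = sorted(set(tokenize(doc)))  # sort so each pair has one canonical order
        pairs.update(combinations(terms, 2))
    return pairs


def precision_recall(predicted, relevant):
    """Return (precision, recall) for a set of predicted items against a labeled set."""
    predicted, relevant = set(predicted), set(relevant)
    true_positives = len(predicted & relevant)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


freqs = word_frequencies(NOTES)
pairs = cooccurrence(NOTES)
print(freqs.most_common(3))
print(pairs.most_common(3))

# Hypothetical rule: flag any note that mentions "dose".
flagged = {i for i, note in enumerate(NOTES) if "dose" in tokenize(note)}
# Hypothetical reviewer labels: notes 0 and 2 describe a missed-care event.
labeled = {0, 2}
precision, recall = precision_recall(flagged, labeled)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy example the "dose" rule flags all three notes, so it finds both labeled notes (recall of 1.0) but only two of its three hits are correct (precision of about 0.67). That kind of gap is a prompt to refine the rule or the topic definitions before standardizing anything into structured form.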

The role of technology in scaling text analytics for a growing list of use cases
Genpact Analytics has designed a comprehensive Unstructured Data Exchange (UDX) framework that shortens the time it takes organizations to transform text data into analyzable form. By focusing on Ingestion, Mapping, and Transformation requirements, UDX provides the machinery to extract and organize concepts. It also enables corpus-wide exploration, so you avoid transporting data between analytical systems, a step that could cause you to miss high-return business opportunities. Using technical resources designed to handle the nuances of text analytics helps ensure that you build a generalized approach that can be easily repurposed for any use case, regardless of complexity or dimensionality.

Conclusion
Text analytics is a novel approach to understanding not just data, but the context of that data. Using macro analytic techniques and models to grasp details that are currently excluded or truncated from analysis supports a more sophisticated understanding of healthcare business operations and, in turn, generates new forms of business value. Take the time to select your data source, identify a business problem, analyze word frequencies, and ponder the results, and you'll discover the value of adding text analytics to your business.

Author: Jorge Fuentes - Assistant Vice President, Healthcare & Life Science Analytics