Artificial Intelligence
Nov 14, 2018

The content factory of machine learning

Machine learning takes more than good AI: you need people to work the data

In any artificial intelligence (AI) or machine learning project, 80 to 90% of the work centers on preparing the data used to train the machine. The staff doing that work operate something like a “content factory.”

In the content factory, workers are not breaking a sweat operating heavy machinery, but rather using data engineering skills and domain knowledge of industry and business to give shape and context to datasets. You need both data engineering and domain experts to realize machine learning's full potential. Here is what they do:

Gather the data and map it out

Sometimes called “data wrangling,” data engineering is all about pulling data together and molding it into something digestible. After all, data can be structured, unstructured, or ambiguous. The first step is to get the information out of multiple systems so that you can then build models that map the data's behavior.

Find gaps in the information and fill them

You have to explore the data to find out whether it is complete and can serve the project's goals, because there can be gaps in the information. For example, TV production companies need to figure out when to air ads based on time and audience. They typically have a data sheet and tools that say which TV show an ad should run on and how long it should air, but demographic and regional information may be missing. With that information, a company could use machine learning to determine which ads are most effective by location and better target future placements.
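A gap check like this can be automated before any model is trained. The sketch below is only an illustration: the placement records, the field names (`show`, `duration_sec`, `region`, `age_group`), and the helper `missing_field_rates` are hypothetical and do not come from any real ad-scheduling system. It reports, for each field, the share of records in which that field is empty.

```python
from typing import Dict, List, Optional

# A record is one row of the (hypothetical) ad-placement sheet.
Record = Dict[str, Optional[str]]


def missing_field_rates(records: List[Record], fields: List[str]) -> Dict[str, float]:
    """Return, for each field, the fraction of records where it is absent or empty."""
    if not records:
        return {field: 0.0 for field in fields}
    rates = {}
    for field in fields:
        # Treat a missing key, None, and an empty string all as "missing".
        missing = sum(1 for record in records if record.get(field) in (None, ""))
        rates[field] = missing / len(records)
    return rates


# Hypothetical sample data: show and duration are always present,
# while regional and demographic fields are often blank.
placements: List[Record] = [
    {"show": "Evening News", "duration_sec": "30", "region": None, "age_group": None},
    {"show": "Cooking Hour", "duration_sec": "15", "region": "Midwest", "age_group": None},
    {"show": "Late Movie", "duration_sec": "30", "region": None, "age_group": ""},
    {"show": "Morning Show", "duration_sec": "30", "region": "Northeast", "age_group": "25-34"},
]

rates = missing_field_rates(placements, ["show", "duration_sec", "region", "age_group"])
for field, rate in rates.items():
    print(f"{field}: {rate:.0%} missing")
```

Fields with a high missing rate, here the regional and demographic ones, are the gaps that would have to be filled before location-based targeting becomes possible.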

Show the machine the meaning

Data enrichment is possibly the most complex and important piece of machine learning. It takes domain experts who understand how the business works to label the data with context and give it meaning.

For instance, an amusement park can use an online chatbot to interact with prospective visitors looking for quick information such as admission prices. While today's bots can deliver simple, scripted answers, most lack the ability to hold heartfelt conversations. To make empathy possible, chatbots have to be able to interpret messages and have a clear goal, such as converting a prospect into a customer. A domain expert can identify the types of sentiment that humanize the process and turn interactions into revenue.

Don't let the machine run amok

A machine can now string together business rules and recommendations. But how do you know it did the job right? You can't unless you check it. You need governance, meaning “human-in-the-loop” supervision by domain experts, to see that the machines are connecting the dots correctly. Experts can look over the rules and manually validate results.

For example, automotive insurance companies can train a system on millions of images of previous accidents to simplify their payout process. The trained model can then assess the severity of damage in future accidents and recommend a payout amount. But someone should sit down and review the machine's recommendation. Otherwise, it might determine that a car is totaled when it is actually repairable.
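One way to organize that review is to let only routine, high-confidence recommendations through automatically and send everything else to a person. The sketch below rests on several assumptions: the `Recommendation` fields, the 0.9 confidence threshold, and the 5,000 payout limit are hypothetical values chosen for illustration, not figures from any insurer or model.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Recommendation:
    """A hypothetical model output for one claim."""
    claim_id: str
    totaled: bool       # model believes the car is a total loss
    payout: float       # recommended payout amount
    confidence: float   # model's confidence in its assessment, 0.0 to 1.0


def needs_review(rec: Recommendation,
                 min_confidence: float = 0.9,
                 max_auto_payout: float = 5000.0) -> bool:
    """Flag a recommendation for human review.

    Assumed policy: every "totaled" verdict, every low-confidence
    assessment, and every large payout goes to a domain expert.
    """
    return (rec.totaled
            or rec.confidence < min_confidence
            or rec.payout > max_auto_payout)


def split_for_review(recs: List[Recommendation]) -> Tuple[List[Recommendation], List[Recommendation]]:
    """Partition recommendations into (auto-approved, needs human review)."""
    auto: List[Recommendation] = []
    review: List[Recommendation] = []
    for rec in recs:
        (review if needs_review(rec) else auto).append(rec)
    return auto, review


# Hypothetical recommendations from a trained damage-assessment model.
recommendations = [
    Recommendation("A-1", totaled=False, payout=1200.0, confidence=0.97),
    Recommendation("A-2", totaled=True, payout=18000.0, confidence=0.95),
    Recommendation("A-3", totaled=False, payout=3000.0, confidence=0.62),
    Recommendation("A-4", totaled=False, payout=7500.0, confidence=0.93),
]

auto_approved, for_review = split_for_review(recommendations)
print("Auto-approved:", [rec.claim_id for rec in auto_approved])
print("Needs review:", [rec.claim_id for rec in for_review])
```

Routing every “totaled” verdict to an expert is a deliberately conservative choice here, since that is the costly mistake the example warns about. A real policy would be set by the business.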

About the author

Vikram Mahidhar


Business leader of AI solutions
