Data science has increased in popularity over recent years as organizations realize that the challenges they face can be addressed in whole or in part by understanding the data available to them. The field of data science is still relatively new and is founded on several skill areas.
Core skill areas
Data science draws on three core skill areas:
- Mathematics and statistics help provide a rigorous analytical framework that can be invaluable when making decisions to address inherently amorphous business challenges
- Computer science underpins programming by providing the theory needed to formalize approaches to real-life data challenges
- Domain knowledge builds expertise by providing reference points for hypothesis generation, whether data-driven or based on expert judgment
Combining these three core skill areas with the right technology and processes enables data scientists to help organizations gain value from data (see figure 1).
Moving across skill boundaries
People become more capable as they learn to combine these skill areas. For example, an analyst with domain knowledge and coding skills can write scripts to work with more data and automate key tasks, becoming more self-sufficient. As another example, the difference between a good model and an excellent one can be the insight of a domain expert (for example, a marketing professional, physician, or insurance specialist).
Machine learning is attracting increasing interest because of applications ranging from identifying objects in images to translating human language. It sits squarely between mathematics and computer science and requires strong knowledge of both fields to produce the best results. Given the wide range of talents associated with data science, it is vital to build teams with complementary skills.
We are passionate about analytics education at Genpact. We also have an interest in fostering a data-driven culture with our partners, because it enables outcomes that range from functional improvements to industry-leading work. As more teammates use internal data repositories, data veracity increases through applied data testing. More trustworthy data can then support increasingly advanced projects, providing differentiation that has an impact on an industry and on society.
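As a rough illustration of what applied data testing can look like in practice, the sketch below runs a few simple checks over a small set of records and reports which rows fail each one. The records, field names, age range, and check names are all assumptions made up for this example, not a description of any particular repository or tool.

```python
# A minimal sketch of applied data testing using only the standard library.
# The sample records and field names below are hypothetical.
records = [
    {"customer_id": 1, "age": 34, "country": "US"},
    {"customer_id": 2, "age": -5, "country": "IN"},
    {"customer_id": 2, "age": 51, "country": ""},
]


def missing_country(rows):
    """Return the indexes of rows whose country field is empty."""
    return [i for i, row in enumerate(rows) if not row["country"]]


def age_out_of_range(rows, low=0, high=120):
    """Return the indexes of rows whose age falls outside an assumed valid range."""
    return [i for i, row in enumerate(rows) if not low <= row["age"] <= high]


def duplicate_customer_id(rows):
    """Return the indexes of rows that repeat an earlier customer_id."""
    seen = set()
    duplicates = []
    for i, row in enumerate(rows):
        if row["customer_id"] in seen:
            duplicates.append(i)
        else:
            seen.add(row["customer_id"])
    return duplicates


def run_checks(rows):
    """Run every check and map each check name to the failing row indexes."""
    checks = {
        "missing_country": missing_country,
        "age_out_of_range": age_out_of_range,
        "duplicate_customer_id": duplicate_customer_id,
    }
    return {name: check(rows) for name, check in checks.items()}


failures = run_checks(records)
for name, indexes in failures.items():
    status = "ok" if not indexes else f"failed at rows {indexes}"
    print(f"{name}: {status}")
```

Checks like these are small enough for an analyst to write and extend, and each time someone adds one, the shared data becomes a little more trustworthy for the next project.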