In InsideBigData, Within3 CTO Jason Smith explains why some AI models fail to deliver on their full potential for life science: poor-quality data.
“The data deluge has swamped all industries, but none more so than life science,” explains Smith. “Life science teams face a challenge keeping pace with the number of online channels where opinions are shared and information can be mined.” However, quantity does not always equate to quality. To combat this, companies must adopt a data-centric approach, shifting the emphasis away from sheer volume of information toward smaller, higher-quality data sets for training artificial intelligence models.
“To make matters worse, a significant proportion of life science data is ‘dirty’ – inaccurate, incomplete, or inconsistent – and not immediately usable.”
Life science data is often unstructured, arriving as typed MSL (medical science liaison) reports and field team observations that can vary drastically in length, format, and even language. Many healthcare organizations have fully migrated to electronic medical records (EMRs), others have only partially migrated, and some have yet to begin the transition. These disparate, often-inconsistent data streams mean that life science data sets must typically be cleaned before they can train effective models.
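The cleaning step described above can be illustrated with a minimal Python sketch. The `clean_report` function, field names, and date formats here are hypothetical assumptions for illustration, not an actual life science schema; the point is simply how inconsistent records (mixed date formats, stray whitespace, missing fields) get normalized before reaching a model.

```python
# Illustrative sketch only: a hypothetical "field report" record with
# assumed fields (author, date, notes) and assumed formatting rules.
from datetime import datetime

def clean_report(report: dict) -> dict:
    """Normalize one raw field report into a consistent record."""
    cleaned = {}
    # Standardize free-text names: strip whitespace, normalize case.
    cleaned["author"] = report.get("author", "").strip().title()
    # Accept several common date formats and emit ISO 8601.
    raw_date = report.get("date", "").strip()
    for fmt in ("%Y-%m-%d", "%d/%m/%Y", "%B %d, %Y"):
        try:
            cleaned["date"] = datetime.strptime(raw_date, fmt).date().isoformat()
            break
        except ValueError:
            continue
    else:
        cleaned["date"] = None  # flag unparseable dates rather than guess
    # Collapse blank notes to None so incomplete records stay visible.
    notes = report.get("notes", "").strip()
    cleaned["notes"] = notes or None
    return cleaned

raw = {"author": "  jane doe", "date": "03/11/2023", "notes": " "}
print(clean_report(raw))
# → {'author': 'Jane Doe', 'date': '2023-11-03', 'notes': None}
```

In practice this kind of normalization is usually done with dedicated tooling rather than hand-written rules, but the principle is the same: make inconsistencies explicit (here, `None` values) instead of silently passing dirty records into training.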
“Thus far, AI adoption in life science has been a mixed bag,” says Smith. “In many cases, projects have gone awry not because the technology is immature but because the data it’s based on is unclean, unstructured, or ringfenced by regulations.”
“As AI moves from a ‘nice to have’ to a ‘must have,’ companies and their leaders should build a vision and strategy to leverage AI, then put in place the building blocks needed to scale its use.” – Deloitte
Read the full article at InsideBigData.