New technologies are generating significant amounts of data in the healthcare and life sciences sectors, and this is viewed as a positive development for these markets.
Data is no longer produced only by lab equipment and corporate research projects; it now also comes from consumer devices made by companies that are new to the game, such as Apple and Google.
The growing popularity of wearable devices and mobile health/digital health applications, coupled with growth in the use of social media and analytics, means that more and more data streams are now available to medical researchers looking to extract meaningful information.
Technology companies such as Apple and Google, which historically have never been in the healthcare market, are very much part of it now.
And because of their reach, and the public’s widespread acceptance of the technology they use to gather data (as evidenced by the success of the Apple Watch and of ‘fitness wearables’ such as the Fitbit), huge numbers of people are generating huge amounts of data that traditional institutions in these markets would very much like to get their hands on.
One of the critical factors in using this new data is that, irrespective of how it is sourced, be it from a scientific journal, an Electronic Medical Record (EMR), a social media post or a wearable device, it can only be analysed effectively if it has been semantically organised.
In other words, you need to be able to make use of it. Another major factor in the successful use of this data is that it must be completely trustworthy and authentic, and must remain usable and accessible for years to come.
A clear data management strategy needs to underpin the use of such data, irrespective of its source (but especially if it’s coming from new, varied and largely untried sources such as consumer wearables).
Data harmonisation and discovery are key to extracting meaningful information from the data.
This also needs to be underpinned by data integrity: the more data there is, the greater the risk that a tiny percentage of it is unusable, and the impact of this on a healthcare organisation or research project could be huge.
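One common way to put data integrity into practice is fixity checking: record a checksum for every file when it is first stored, then recompute the checksums later to spot anything that has gone missing or changed. The Python sketch below is purely illustrative and is not a description of how any particular laboratory or service works. The function names (sha256_of, build_manifest, find_damaged) and the choice of SHA-256 are assumptions made for the example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks
    so that very large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Hypothetical ingest step: map each file's relative path to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_damaged(root: Path, manifest: dict[str, str]) -> list[str]:
    """Hypothetical audit step: list files that are missing or whose
    content no longer matches the checksum recorded at ingest."""
    damaged = []
    for name, expected in manifest.items():
        target = root / name
        if not target.is_file() or sha256_of(target) != expected:
            damaged.append(name)
    return damaged
```

In a scheme like this, the manifest would be built once when the data is stored and kept separately from the data itself. Running find_damaged on a schedule would then flag the small fraction of files that have become unusable, before a researcher or clinician needs them.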
The Bristol Genetics Laboratory, which is based at Bristol Southmead NHS hospital, delivers routine genetic testing services to the South West region, a population of approximately five million, as well as providing highly specialised services to the rest of the UK and internationally.
With the help of Arkivum, the laboratory implemented a cloud-based service to store the gene-sequencing data and digital imaging from the 30,000 genetic investigations it conducts annually.
These investigations cover a wide range of sample types and use a range of molecular and cytogenetic techniques. Many of them, including those conducted using its Illumina NextSeq and MiSeq next-generation sequencing (NGS) platforms, are resulting in a massive increase in the size of data sets (exacerbated by the fact that as the testing gets cheaper, clinicians are requesting more tests and thus increasing diagnostic yield).
So right now, big data is becoming a very real challenge for all genetics laboratories, not just Bristol. Over the next two years, Bristol Genetics Laboratory will generate more than 30TB of data from NGS services alone.
The true value of that data is not yet known, so it must be stored for the very long term so that it can be accessed and interrogated over time as new data analysis techniques are developed.
Sourced from Nik Stanbridge, VP marketing, Arkivum