Sharing Genome Data

The enormous progress in genetics and genomics over the last few decades has been underpinned by a spirit of openness and collaboration. Many scientists have been fully committed to the concept of data sharing, without which it would be impossible to even contemplate the application of genetics and genomics to medicine.


Summary data vs individual data

Sharing anonymised summary data has become routine and widely accepted. For example, the Exome Aggregation Consortium (ExAC) makes available anonymised genetic variant summary data from more than 60,000 people. This summary data has been exceptionally useful for research and medicine.

It is not possible to know which genetic variants any given individual in ExAC has, which allays the concerns many have about individual genetic data potentially being used inappropriately or illegally. However, not being able to access the data at an individual level limits the use of this formidable dataset in certain contexts. The general principles for how one might share individual-level genome data are being widely discussed.
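To make the distinction concrete, here is a minimal sketch of how individual-level genotypes collapse into summary data. The participant labels and genotype values are invented for illustration, and the function name `summarise` is hypothetical; it is not how ExAC or any other resource actually computes its figures.

```python
# Hypothetical genotypes at a single variant site: the number of copies of
# the alternate allele (0, 1 or 2) carried by each person. Invented data.
individual_genotypes = {
    "person_A": 0,
    "person_B": 1,
    "person_C": 2,
    "person_D": 0,
}


def summarise(genotypes):
    """Collapse per-person genotypes into site-level summary counts."""
    alt_alleles = sum(genotypes.values())
    # Assumes every person has two copies of this site (an autosomal variant).
    total_alleles = 2 * len(genotypes)
    return {
        "allele_count": alt_alleles,
        "allele_number": total_alleles,
        "allele_frequency": alt_alleles / total_alleles,
    }


summary = summarise(individual_genotypes)

# A different assignment of genotypes to people gives the same summary,
# so the summary alone does not say who carries the variant.
reassigned = {"person_A": 2, "person_B": 0, "person_C": 0, "person_D": 1}
same_summary = summarise(reassigned) == summary
```

Only the counts in `summary` would be shared. The per-person dictionary, which is what some analyses need, stays behind.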


Sharing individual data with research participants

A very important consideration is the wishes of research participants themselves. As the number of people who have their exome or genome sequenced increases, it is likely that some of them will want access to the full individual-level data, not simply the headlines. This could be for all sorts of different reasons – a desire to analyse the data themselves, or to seek out an analysis, focusing on specific diseases; an interest in their ancestry; or simply because it ‘belongs’ to them.

Should researchers provide genomic data to these individuals if they request it? We recently addressed this question in a paper published in Wellcome Open Research.


Logistical challenges

There are a number of practical implications of returning genome data to individuals. In particular, it’s absolutely critical to confirm that the data is derived from the individual requesting it, to prevent inadvertently and inappropriately sharing someone else’s data!

In addition, it is likely that bespoke informatics support will be needed to provide data to someone outside of the research team. Genome data can come in many forms, from the original raw files, which are huge, to various types of processed information. Whatever the data, it needs to be provided in a format that can be easily accessed and used, checked meticulously for any errors and transferred in a secure way. All of these activities require additional resources and time.
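One small part of that checking can be illustrated with a checksum comparison, which confirms that a file arrived unchanged. This is a minimal sketch using Python's standard `hashlib` module; the function names and file names are hypothetical, and the tiny temporary files stand in for what would really be very large sequence files. It addresses only file integrity, not confirming the requester's identity or securing the transfer itself.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def transfer_is_intact(original_path, received_path):
    """True if both files have identical contents according to their digests."""
    return sha256_of_file(original_path) == sha256_of_file(received_path)


# Demonstration with two small temporary files (invented contents).
with tempfile.TemporaryDirectory() as workdir:
    sent = os.path.join(workdir, "sent.bin")
    received = os.path.join(workdir, "received.bin")
    for path in (sent, received):
        with open(path, "wb") as handle:
            handle.write(b"ACGT" * 1000)
    intact = transfer_is_intact(sent, received)

    # Simulate corruption in transit by appending one extra byte.
    with open(received, "ab") as handle:
        handle.write(b"N")
    corruption_detected = not transfer_is_intact(sent, received)
```

In practice the sender would publish the digest alongside the file so the recipient can run the same check on their own copy.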


Ethical and legal challenges

There are also some ethical and legal implications of returning genome data to individuals. Although some individuals will benefit from mining their own data, there are likely to be unintended, and potentially negative, consequences for others. For example, over-interpretation or misinterpretation of the data is likely to occur, as we described in a previous post.

Returning family data is also tricky, particularly if it includes individuals who lack the capacity to give consent, such as children.

Any research study that wishes to return DNA sequence data to individual participants will need to consider these ethical and legal issues carefully. A central consideration is the extent to which the research team is responsible for mitigating any harms to the participants themselves, their family members, the healthcare system, or society at large.


It’s not free!

Returning individual-level genomic data to research participants potentially has both positive and negative consequences. It also takes time and resources, which could mean that the research team have to delay or compromise their main work.

Researchers have a duty to society and their funders to make the best use of finite resources, and must balance their responsibilities to individual participants with broader responsibilities to the whole group of participants and the creation of generalizable knowledge.

So unless there are appropriate resources available to deliver and manage the process, the return of genomic data to individual research participants could do more harm than good.


Sharing clinical genetic data

Many of the issues discussed above also apply to clinical testing. There are already excellent databases such as DECIPHER that have been instrumental in improving the accuracy and safety of genetic diagnosis by sharing individual genetic variants and connecting clinicians caring for patients with similar genetic conditions.

However, questions still remain about how much clinical genetic data we should be sharing, and with whom. For example, sharing one potentially interesting genetic variant between clinical labs is very different from putting a whole genome sequence online!

To find an appropriate balance between the potential benefits and risks of data sharing, we need to consider both the depth and breadth of the data, as well as who will be able to access it. And we need to keep closely aligned with what the people whose data is being shared want, ensuring that they are aware of the personal and broader impacts of those choices.
