HostSeq: Enabling data sharing to tackle COVID-19 and future health challenges

Friday, October 1, 2021

Q&A with Dr. Ma’n H. Zawati, Assistant Professor, McGill University’s Faculty of Medicine and Health Sciences; Executive Director, Centre of Genomics and Policy in the Department of Human Genetics; Lead, HostSeq Data Access Compliance Office.

Dr. Ma’n H. Zawati

The CGEn HostSeq Databank is building a vitally important new infrastructure for research collaboration to tackle COVID-19 and other major health challenges facing communities across Canada. Powered by genomic data—generated through the sequencing of up to 10,000 human genomes as part of the Canadian COVID-19 Genomics Network (CanCOGeN) HostSeq initiative—the databank is the first of its kind in Canada.

Qualified researchers can apply to access the databank through HostSeq’s Data Access Compliance Office (DACO), which oversees the overall process for data requests, including a review by an independent Data Access Committee (DAC). Access requests are reviewed against a set of criteria that includes, but is not limited to, compatibility of the proposal with HostSeq’s objectives and feasibility of the proposed project.

APPLICATION (1): Principal investigator (PI) submits application for access to HostSeq Databank.

DACO* REVIEW (2): Application is reviewed by the HostSeq DACO for compliance with HostSeq policies.

AGREEMENT (3): Upon approval, Data Access Agreement signed between PI and CGEn.

Result: access to the CGEn HostSeq Databank.

* HostSeq’s Data Access Compliance Office

The work of HostSeq, comprising a series of cohort-based projects, aims to better characterize the role of human genetics in COVID-19 disease, deliver new biomarkers that predict who is at highest risk, and implement appropriate healthcare strategies, including potential precision therapies.

We asked Dr. Ma’n Zawati, lead of the HostSeq Data Access Compliance Office, to tell us more about data access through HostSeq and how it’s helping Canada tackle COVID-19. Dr. Zawati’s research concentrates on the legal, ethical, and policy dimensions of genomics and health research, with a special focus on biobanking, data sharing, professional liability and the use of novel technologies in both the clinical and research settings.

Read our Q&A with Dr. Zawati below. 


“The idea is not just to enable data sharing for COVID-19, but also for other health outcomes and to strategically plan for future public health challenges.” — Dr. Ma’n H. Zawati


What excites you most about your work with CanCOGeN?

We are all still living through a very difficult time, and there continue to be many heartbreaking stories. At the same time, the pandemic has prompted many of us to forge new connections—as countries, as organizations and as researchers. It is a privilege to be able to work with an interdisciplinary group of experts on fighting COVID-19 by understanding the virus that causes it and its impact. The policy work we, at the Centre of Genomics and Policy, are doing through HostSeq aims to facilitate this understanding and provide researchers with useful tools to lay the groundwork for a stronger response to the next major public health challenge.  

How is the CGEn HostSeq Databank organized?

The Databank is divided into two datasets. The first contains open access data, which includes aggregated genomic information as well as aggregated personal health information (e.g. age groups, ancestry, pre-existing conditions and smoking status, to name but a few). The second dataset contains controlled access data with individual genomic information, such as whole genome sequencing data, as well as individual personal and health information (e.g. clinical information regarding a person’s disease symptoms).

What are some key questions HostSeq data could help us address?

Data generated through HostSeq is helping us identify genetic risk factors for viral infection. It is also helping inform our public health response to COVID-19 as well as our response to future pandemics and public health challenges. For example, researchers are using the data to identify potential targets for therapeutics and to support the development of targeted vaccines or other treatments. This will be very helpful for the research community.

What role does HostSeq’s Data Access Compliance Office (DACO) play in reviewing data requests?

DACO works to facilitate and streamline the process of accessing the controlled data that’s available—ensuring data requests are evaluated in an efficient and equitable manner. We ensure administrative review of each application in a timely fashion before it is sent to an independent Data Access Committee (DAC) for final review. The DAC brings together Canadian experts who are knowledgeable about different fields of work related to COVID-19, whether it’s infectious diseases, IT, genomics or ethical issues, to review each application. The Committee also includes a patient representative.

Who can access the data collected through HostSeq? What is required to access the data in terms of qualifications?

The host data is available to Canadian and international researchers from different disciplines, whether they’re in academia, private industry, or hospitals. Our analysis of each data request is based on the research question the team is seeking to answer. We look at the qualifications of the research team members, and the scientific merit and feasibility of the project. Researchers have typically had a research ethics board approve the project or provide a waiver before it comes to us. Once the DAC approves the application, an agreement is signed and access is given for one year. HostSeq is built for Canadian contribution and accessibility, but also with the goal of international participation.

How is patient privacy protected?

HostSeq has strict data security and privacy safeguards in place, which we take very seriously. The database does not collect or store information that directly identifies participants. As mentioned earlier, when controlled data is provided to researchers, they must also sign an agreement, which binds them and their institution(s) to very clear conditions. For example, researchers cannot share data from HostSeq with unauthorized parties. For us, this is a matter of reciprocity. When participants share their data with us, they expect, among other things, that we protect their confidentiality and respect their contributions. Doing so will only sustain the trust that is at the core of these endeavors.


The Canadian COVID-19 Genomics Network (CanCOGeN) is on a mission to respond to COVID-19 by generating accessible and usable data from viral and host genomes to inform public health and policy decisions, and guide treatment and vaccine development. This pan-Canadian consortium is led by Genome Canada, in partnership with six regional Genome Centres, the National Microbiology Lab and provincial public health labs, genome sequencing centres (through CGEn), hospitals, academia and industry across the country.