Session Details: GA4GH April Connect 2026

Name

Provenance metadata for genomic datasets

Date & Time

Thursday, April 16, 2026, 4:00 PM - 5:30 PM

Description

Understanding the context and origin of genomic datasets is critical for discovery, appropriate reuse, and giving credit to data generators. Yet, basic provenance information (who created a dataset, under what project, with what funding, where it has been published) is often incomplete or inconsistent across repositories.

This working session will:

map the current landscape of provenance standards (W3C PROV, FAIR principles, domain-specific approaches);
identify concrete gaps where GA4GH could add value;
examine how Beacon summaries and other Discovery products could expose provenance information;
develop GA4GH recommendations for applying existing standards to genomic data sharing.

We are seeking input from data producers, repository managers, and researchers who need to answer questions such as "What project generated this cohort?", "Where was this dataset collected?", "What publications describe or use this data?", "Who funded this work?", and "How large is this resource?" The goal is not to create new standards but to provide clear guidance on using existing provenance frameworks effectively in genomic contexts.

Location Name

Joyce (Level A)

Session topic(s)

Discovery

Session format(s)

Working session: collaborative work toward a specific goal or consensus
Group discussion: particular topic to gather feedback, perspectives, or ideas

Suggested level of familiarity

Level 2: Have a basic understanding to follow the conversation and contribute thoughtfully