A series of bioinformatics workshops on the analysis of target capture sequence data is being offered to conference participants as part of the Bioplatforms Australia Genomics for Australian Plants (GAP) initiative. The aims of the GAP initiative include building capacity in the management and application of genomic data, and providing tools that enable genomic data to be used to identify and classify biodiversity at a range of scales.
The GAP phylogenomics-bioinformatics working group has combined newly developed and existing scripts into an integrated workflow for the assembly of target capture data. The group is now offering both theoretical webinars and hands-on training workshops in these pipelines. This series of bioinformatics training events will be delivered virtually in collaboration with the Australian BioCommons in conjunction with the ASBS conference.
Two webinars will provide an overview of the challenges of conflict within target capture datasets and strategies to employ during analysis. These lunchtime talks are freely available to the public on the 20th of May and 10th of June. Further information, including registration details, is now available on the Australian BioCommons website for both webinars: "Conflict in multi-gene datasets: why it happens and what to do about it - deep coalescence, paralogy and reticulation" and "Detection of and phasing of hybrid accessions in a target capture dataset".
For participants who would like to take the next step and get hands-on, a series of three interactive in-depth workshops will be delivered from 5th to 8th July, prior to the ASBS conference. The workshops are suited to researchers analysing target capture datasets and will provide hands-on training in the use of workflows covering the processing of raw sequence reads, as well as strategies for resolving paralogy and hybridisation.
Workshop 1 - GAP phylogenomics bioinformatic pipeline – part 1: Assembly of raw reads using HybPiper
Workshop 2 - GAP phylogenomics bioinformatic pipeline – part 2: Yang and Smith paralogy resolution
Workshop 3 - HybPhaser – Detection and phasing of hybrid accessions in a target capture dataset
Costs and registration
These hands-on training sessions will cost $20 per workshop and places will be capped at 20 people per workshop, so be sure to register your interest early through the link to the expression of interest form below. Places are open to ASBS conference delegates and priority will be given to participants in the Genomics for Australian Plants consortium.
TO REGISTER YOUR INTEREST IN PARTICIPATING IN THE WORKSHOPS ON ANALYSING TARGET CAPTURE DATASETS PLEASE FILL OUT THE EXPRESSION OF INTEREST FORM HERE: https://docs.google.com/forms/d/e/1FAIpQLSfh-x67c8P4LNpB-OYvJ3TrZAsajQL4OFgcpfTEzH8CpCEctg/viewform
Expressions of interest to participate in the workshop series will close on June 13th!
Workshop participants will require a computer as well as a reliable internet connection. The workshops will require the installation of the freely available software VirtualBox and Vagrant, which provide a virtual Linux environment containing all required software, independent of the participant's operating system. All participants will be required to attend an onboarding session where support is available to install these applications and the virtual environments.
Day 1: Onboarding Session
Monday, 5th July 2021
Assistance with the installation of software required to participate in the workshops.
Day 2: Workshop 1 - GAP phylogenomics bioinformatic pipeline – part 1: HybPiper and tree reconstruction in IQ-TREE
Tuesday, 6th July 2021
The workshop will demonstrate a user-friendly Nextflow container based on the HybPiper software for the assembly of raw reads of target capture / target enrichment data and the interpretation of its results. The outputs will be used to demonstrate tree inference with IQ-TREE, but the workshop is not dedicated to exploring phylogenetic analysis in depth.
Participants must have attended the onboarding session and become familiar with reference materials that will be provided prior to the workshop.
Day 3: Workshop 2 - GAP phylogenomics bioinformatic pipeline – part 2: Yang and Smith paralogy resolution
Wednesday, 7th July 2021
Paralogy is the existence of several variants of the same gene in one species that are derived from gene or genome duplication events. If paralogs from different species are analysed together in the same alignment, they may mislead phylogenetic inference. The workshop will demonstrate a user-friendly Nextflow container implementing the paralogy resolution approaches of Yang & Smith (2014). They comprise four different ways of using gene tree topologies to identify ortholog groups for each gene, with varying degrees of stringency.
Participants must have attended or watched the recording of Webinar 1: Conflict in multi-gene datasets: why it happens and what to do about it - deep coalescence, paralogy and reticulation, and must have attended the onboarding session and Workshop 1 (or have a good knowledge of the HybPiper pipeline).
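To make the idea of paralogy filtering concrete, here is a minimal Python sketch of the strictest kind of check one could apply: a gene is treated as free of paralogs only if every species contributes a single sequence. This is a toy illustration, not the workshop's Nextflow pipeline or the Yang & Smith code. The function names, the gene names and the `taxon@sequence_id` tip-label format are assumptions made for this example.

```python
from collections import Counter


def taxon_of(tip_label):
    """Return the taxon part of a tip label written as 'taxon@sequence_id' (assumed format)."""
    return tip_label.split("@", 1)[0]


def duplicated_taxa(tip_labels):
    """List the taxa that are represented by more than one sequence in a gene."""
    counts = Counter(taxon_of(label) for label in tip_labels)
    return sorted(taxon for taxon, n in counts.items() if n > 1)


def is_one_to_one(tip_labels):
    """True if every taxon appears at most once, i.e. no putative paralogs."""
    return not duplicated_taxa(tip_labels)


# Hypothetical tip labels for two genes; real labels would come from gene trees.
gene_tips = {
    "gene001": ["Acacia@c1", "Banksia@c1", "Corymbia@c1"],
    "gene002": ["Acacia@c1", "Acacia@c2", "Banksia@c1", "Corymbia@c1"],
}

for gene, tips in gene_tips.items():
    if is_one_to_one(tips):
        print(gene, "has one sequence per taxon")
    else:
        print(gene, "has multiple sequences for:", duplicated_taxa(tips))
```

A check like this discards every gene in which any species has more than one copy. Less stringent approaches instead use the gene tree topology to extract ortholog groups from such genes, which is the subject of the workshop.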
Day 4: Workshop 3 - HybPhaser – Detection and phasing of hybrid accessions in a target capture dataset
Thursday, 8th July 2021
Hybrids (and polyploids originating from a hybridisation event) contain divergent alleles that can introduce conflicting signals in phylogenetic analyses. Separating the divergent reads according to their haplotypes can reduce the phylogenetic conflict and provide insights into past reticulation events. This workshop will explore the workflow HybPhaser, which can detect hybrids in target capture datasets and phase hybrid accessions to reveal their parental lineages. We will demonstrate the application of the workflow on a test dataset and compare phylogenies with and without hybrid phasing. This workshop is an extension of workshop 1, as HybPhaser requires an assembly using HybPiper. Participants are expected to be familiar with HybPiper or have taken part in Workshop 1 on 6th July.
Participants must have attended or watched the recording of Webinar 2: Detection of and phasing of hybrid accessions in a target capture dataset, and must have attended the onboarding session and Workshop 1 (or have a good knowledge of the HybPiper pipeline).
Chris Jackson – Royal Botanic Gardens Victoria
Chris Jackson is a Research Scientist (Bioinformatician) with a background in genomics, phylogenomics and endosymbiosis. He collaborates with other researchers on diverse projects including plant and fungal genome evolution, phylogenomics, and population genetics. Chris provides bioinformatics support for the Genomics for Australian Plants initiative and has developed the bioinformatic pipeline utilised in the GAP phylogenomics project, which will be presented in workshops 1 and 2.
Alexander Schmidt-Lebuhn – CSIRO
Alexander is a CSIRO scientist at the Centre for Australian National Biodiversity Research in Canberra. His research interests include the systematics and evolution of flowering plants, in particular of Asteraceae (daisy family), biogeography, user-friendly species identification tools including through the application of computer vision, and polyploidy. Alexander leads a Future Science Platform Environomics project on high-throughput sequence capture and the Phylogenomics Bioinformatics Working Group for the Genomics for Australian Plants Initiative’s Australian Angiosperm Tree of Life project.
Lars Nauheimer – Australian Tropical Herbarium
Lars is a post-doctoral research fellow at the Australian Tropical Herbarium. His research interests include resolving phylogenetic relationships to reconstruct divergence and dispersal in time and space. His research is focused on resolving evolutionary relationships among Australian orchids, specifically Thelymitra and Diuris, and he also provides bioinformatics support to systematics research undertaken at the Australian Tropical Herbarium. Lars is experienced in the bioinformatic analysis of short-read next-generation sequence data and has developed a novel workflow for the detection and phasing of hybrids in target capture datasets that will be presented in workshop 3.
Theodore Allnutt – Royal Botanic Gardens Victoria
Theo is a bioinformatician at the Royal Botanic Gardens Victoria providing support to the Genomics for Australian Plants initiative. Theo has over 25 years’ experience in plant molecular biology research and biological statistics, and ten years’ bioinformatics experience. Theo has worked for the UK Government, the European Commission, universities and research institutes on a wide range of molecular biology projects.
Todd McLay – Royal Botanic Gardens Victoria
Todd McLay is a Postdoctoral Fellow (Pauline Ladiges Plant Systematics Research Fellow) with a background in genomics, phylogenomics and taxonomy. His interests include the use of next-generation sequencing methods to explore the systematics, biogeography, and taxonomy of the Australian and New Zealand flora. His work is currently focused on the Australian Hibisceae (Malvaceae), Corymbia and Eucalyptus (Myrtaceae), Xanthorrhoea (Asphodelaceae) grass trees, Orchidaceae, Monotoca (Ericaceae) and native citrus (Rutaceae). Todd has recently led the development of a novel pipeline for improving locus recovery in target capture datasets generated from the Angiosperms353 probe set.
Lalita Simpson – Australian Tropical Herbarium
Lalita’s research interests include investigating the evolution of plant groups through a phylogenetic framework with an aim to understand the processes that drive the spatial distribution of biodiversity at both deep and shallow phylogenetic scales. Her research utilises genomic data to reconstruct and resolve the evolution, historical biogeography and classification of tropical orchids including Bulbophyllum, Dendrobium and Cymbidium. Lalita is engaged with the Genomics for Australian Plants Initiative as the Research Community Project Manager for the phylogenomics project.