Peter Breslin

Dec 16, 2017

Analyzing data

Part of the challenge of the population genetics piece of this project is that using genomic data to capture population-level variation is relatively new. The high-throughput next-generation sequencing method called RADseq (restriction site associated DNA sequencing) is also relatively new. It has been deployed in several studies to clarify species-level, and even infraspecific, differentiation among taxa.

Here is a link to a good overview from 6 years ago:

https://academic.oup.com/bfg/article/9/5-6/416/182576

More traditional population genetics approaches involve microsatellites or other short sequence markers: short stretches of the genome, assumed to be largely free of selection pressure, that vary within a species (in microsatellites, the variation is in the number of times a short motif is repeated). The challenge with these methods is that it is time-consuming and painstaking work to locate the microsatellites, develop primers for them, amplify them, and then analyze the resulting data. The approach is sometimes very powerful, but it can also lack fine resolution and show no genetic structure even when morphology and biogeography make it fairly clear that there must be some.

Below is an outline of the RADseq process, from the paper linked above:

The process of RADSeq. (A) Genomic DNA is sheared with a restriction enzyme of choice (SbfI in this example). (B) P1 adapter is ligated to SbfI-cut fragments. The P1 adapter is adapted from the Illumina sequencing adapter (full sequence not shown here), with a molecular identifier (MID; CGATA in this example) and a cut site overhang at the end (TGCA in this example). (C) Samples from multiple individuals are pooled together and all fragments are randomly sheared. Only a subset of the resulting fragments contains restriction sites and P1 adapters. (D) P2 adapter is ligated to all fragments. The P2 adapter has a divergent end. (E) PCR amplification with P1 and P2 primers. The P2 adapter will be completed only in the fragments ligated with P1 adapter, and so only these fragments will be fully amplified. (F) Pooled samples with different MIDs are separated bioinformatically and SNPs called (C/G SNP underlined). (G) As fragments are sheared randomly, paired end sequences from each sequenced fragment will cover a 300–400 bp region downstream of the restriction site.

Put simply, the goal in analyzing the resulting sequences is to detect polymorphisms in the genomes between individuals, or, ultimately, between separated populations.
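To make step (F) a little more concrete, here is a toy Python sketch of what "separated bioinformatically and SNPs called" means. It is only an illustration under heavy assumptions: the reads, the second MID (ATCGA), and the sample names are invented, every read is assumed to come from the same locus, and a simple majority-vote consensus stands in for the statistical genotype calling that real RADseq pipelines perform. The MID CGATA and the SbfI site remnant come from the figure caption above.

```python
from collections import Counter, defaultdict

# After SbfI cuts CCTGCA^GG, a read starts with the MID followed by TGCAGG.
CUT_SITE_REMNANT = "TGCAGG"


def demultiplex(reads, mid_to_sample):
    """Group reads by their leading MID and strip the MID off.

    Assumes all MIDs have the same length. Reads with an unknown MID,
    or without the cut-site remnant right after it, are discarded.
    """
    mid_len = len(next(iter(mid_to_sample)))
    by_sample = defaultdict(list)
    for read in reads:
        mid, rest = read[:mid_len], read[mid_len:]
        sample = mid_to_sample.get(mid)
        if sample is not None and rest.startswith(CUT_SITE_REMNANT):
            by_sample[sample].append(rest)
    return dict(by_sample)


def variable_sites(by_sample):
    """Return {position: {sample: base}} where sample consensus bases differ.

    Toy stand-in for SNP calling: one locus, majority-vote consensus per
    sample, no handling of heterozygotes or sequencing error models.
    """
    consensus = {}
    for sample, seqs in by_sample.items():
        length = min(len(s) for s in seqs)
        consensus[sample] = "".join(
            Counter(s[i] for s in seqs).most_common(1)[0][0]
            for i in range(length)
        )
    shared = min(len(c) for c in consensus.values())
    sites = {}
    for i in range(shared):
        bases = {sample: c[i] for sample, c in consensus.items()}
        if len(set(bases.values())) > 1:
            sites[i] = bases
    return sites


# Hypothetical pooled reads: two plants plus one read with an unknown MID.
mids = {"CGATA": "plant_A", "ATCGA": "plant_B"}
reads = [
    "CGATA" + "TGCAGGAACTCGT",
    "CGATA" + "TGCAGGAACTCGT",
    "ATCGA" + "TGCAGGAAGTCGT",
    "ATCGA" + "TGCAGGAAGTCGT",
    "GGGGG" + "TGCAGGAACTCGT",
]
by_sample = demultiplex(reads, mids)
sites = variable_sites(by_sample)
```

With these invented reads, `sites` holds a single C/G difference at position 8 of the trimmed read, the same kind of polymorphism the caption describes. Real data need quality filtering, many loci, and diploid genotype calls, none of which this sketch attempts.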

This data is useful in understanding the impact of the landscape on gene flow, a crucial consideration that was not within reach for conservation biologists until the advent of these methods.
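One common way to summarize how different separated populations are at a given SNP is Wright's fixation index, F_ST. It is not named in this post, so treat the sketch below as my illustrative assumption rather than the method used in this project. It computes F_ST = (H_T - H_S) / H_T for a single biallelic SNP, where H_S is the mean expected heterozygosity within populations and H_T is the expected heterozygosity of the pooled population, and it assumes equally weighted populations. The function names and allele frequencies are hypothetical.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity 2p(1-p) for a biallelic locus with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst(allele_freqs):
    """F_ST for one biallelic SNP from per-population allele frequencies.

    Assumes equal weighting of populations (no sample-size correction).
    Returns 0.0 when the SNP is monomorphic overall, where F_ST is undefined.
    """
    p_bar = sum(allele_freqs) / len(allele_freqs)
    h_t = expected_heterozygosity(p_bar)
    if h_t == 0.0:
        return 0.0
    h_s = sum(expected_heterozygosity(p) for p in allele_freqs) / len(allele_freqs)
    return (h_t - h_s) / h_t


# Hypothetical frequencies of one allele in two separated populations.
no_structure = fst([0.5, 0.5])    # identical frequencies
some_structure = fst([0.2, 0.8])  # diverged frequencies
fixed_difference = fst([1.0, 0.0])  # each population fixed for a different allele
```

Values near 0 suggest the populations are exchanging genes freely at that site, while values near 1 suggest they are isolated. In practice an estimate would be averaged over many RADseq loci and corrected for sample size.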


About This Project

I am developing and applying innovative population genetics data analysis with habitat suitability modeling to solve longstanding challenges in the effort to save endangered plant species. Combining high throughput RADseq data analysis with species distribution models, I am exploring relatively inexpensive, feasible methods to generate powerful population viability assessments, estimates of threshold population size and the constraints on the habitats of rare plants.
