Herman Mays

Aug 21, 2017

Putting together a genome. Part I.

Putting together a genome is like assembling a puzzle. Genomes are typically sequenced in short fragments, perhaps only 250 letters (or bases) of sequence data per fragment. We then use computer algorithms to put millions of these fragments together into a genome sequence. For well-known organisms like humans, fruit flies or laboratory mice we have a reference to guide the genome assembly. It's a bit like a puzzle where the picture on the box serves as a guide for arranging the pieces in the right order.

However, for organisms with no reference sequence we have to piece together the genome from scratch, or de novo (Latin for 'anew'). This is a much more challenging task, akin to assembling a puzzle without the picture on the box as a guide.
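To give a rough feel for what de novo assembly involves, here is a minimal Python sketch of a greedy overlap merge: it repeatedly joins the two fragments that share the longest end-to-start overlap. It is a toy under strong assumptions (error-free reads, a single DNA strand, no repeats, no read contained inside another), and the function names and the example sequence are invented for illustration. Real assemblers are far more sophisticated.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if the best overlap is shorter than `min_len`.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Merge reads greedily by longest overlap; return the remaining contigs."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # nothing left that overlaps: stop with several contigs
        merged = reads[best_i] + reads[best_j][best_n:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Made-up 17-base "genome" cut into overlapping 8-base reads.
toy_genome = "ATGGCGTGCAATCCGTA"
toy_reads = [toy_genome[i:i + 8] for i in range(0, 10, 3)]
print(greedy_assemble(toy_reads))
```

With no picture on the box, the only guide is how the pieces fit each other. That is why repeated sequences, which fit in more than one place, make this approach unreliable on its own.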

One way to help overcome this challenge is to have two sorts of sequencing reads: the relatively short fragments described above, and longer fragments. Sequence data from the longer fragments can serve as a scaffold, a frame to which the shorter reads can be matched. This is especially useful for assembling regions of the genome that contain repetitive sequences.
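The scaffold idea can be sketched in a few lines of Python. This is only an illustration, assuming error-free reads and exact matching; the function name and the sequences are made up. In practice long reads contain errors, so real tools rely on approximate alignment rather than exact string search.

```python
def order_on_scaffold(long_read, short_reads):
    """Place each short read at its first exact match on the long read.

    Returns (position, read) pairs sorted by position. Reads that do not
    match the long read anywhere are left out.
    """
    placed = []
    for read in short_reads:
        pos = long_read.find(read)  # -1 means no exact match
        if pos != -1:
            placed.append((pos, read))
    return sorted(placed)


# Invented example: the long read spans a short repetitive stretch (ACACACAC).
long_read = "TTGACACACACGGATC"
short_reads = ["CGGATC", "TTGACA", "ACACACG"]
for pos, read in order_on_scaffold(long_read, short_reads):
    print(pos, read)
```

Because the long read covers the whole repetitive stretch along with the unique sequence on either side, it fixes the order of the short reads even where they would be ambiguous by themselves.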

The first step in our project is creating a reference genome for the Narcissus Flycatcher. Once we have a reliable reference we can use it to assemble genomes from other individuals, just as is done in well-known model organisms in biology, like mice and fruit flies. This step is more than halfway complete, as we have good short-read sequence data for one Narcissus Flycatcher sample, collected from an individual of the migratory subspecies in Japan.

MinION sequencer from Oxford Nanopore Technologies. Tiny but powerful.

MinION sequencing is an ideal part of this approach as it can provide very long sequencing reads, thousands or even tens of thousands of DNA bases long. MinION sequencing has already been used to sequence the genome of the European Eel, a genome similar in size to that of the Narcissus Flycatcher. The MinION device uses nanopore sequencing, which biochemically feeds DNA molecules through a pore embedded in a membrane. There is an electrical potential across the membrane, and an open pore completes the circuit; if the pore is blocked by a molecule of DNA, the current changes. Different bases in a molecule of DNA being rapidly fed through the pore alter the current in slight but detectable ways, and machine learning algorithms are used to translate these changes in current into base reads.
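As a very loose illustration of turning current measurements into bases, the Python sketch below assigns each measurement to the base whose reference level is closest. Everything in it is an assumption made for illustration: the current levels are invented numbers in arbitrary units, and the function name is hypothetical. Real nanopore signal depends on several neighboring bases at once and is noisy, which is why actual base calling relies on machine learning rather than a simple lookup like this.

```python
# Hypothetical current levels (arbitrary units), invented for illustration only.
TOY_LEVELS = {"A": 10.0, "C": 20.0, "G": 30.0, "T": 40.0}


def toy_basecall(currents):
    """Map each current measurement to the base with the nearest toy level."""
    return "".join(
        min(TOY_LEVELS, key=lambda base: abs(TOY_LEVELS[base] - x))
        for x in currents
    )


# Four made-up, slightly noisy measurements.
print(toy_basecall([11.2, 38.9, 29.5, 21.0]))
```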

Click HERE for link to nanopore sequencing video.

We currently have the short-read genome sequencing part of this process complete. Compared to the MinION sequencing, this is a much more laborious and expensive process, done on the Illumina HiSeq sequencing platform. We are now conducting some quality control on the short-read Illumina data and so far things look good, but more on that in the next lab note.

Support from you will help us move on to the next steps. Thanks for your interest in our project. Keep following for more updates.


3 comments

  • Cindy Wu (Backer)
    Ah, I really want to see this project succeed now. And, reading your lab note makes me want to test out this MinION sequencer!
    Aug 23, 2017
  • Cindy Wu (Backer)
    "MinION sequencing has already been used to sequence the genome of the European Eel"

    I did not know about this. This is very cool. Looks like the paper was published online on August 3rd, 2017: https://www.nature.com/articles/s41598-017-07650-6.epdf
    Aug 23, 2017
  • Cindy Wu (Backer)
    "reference to guide the genome assembly"

    Is there a resource online that you know of that keeps track of all organisms that have been sequenced?
    Aug 23, 2017
  • Herman Mays (Researcher)
    Yes. Usually when any organism's genome is sequenced it is made available online through various data repositories. Here are two of the main public repositories for genome data: http://www.ensembl.org/index.html and https://www.ncbi.nlm.nih.gov/. Plus, if you search on Google Scholar for 'animal genomes' you'll see papers on lots of species pop up, everything from ctenophores and octopus to giraffes and pandas.
    Aug 24, 2017

About This Project

The Narcissus Flycatcher (Ficedula narcissina) is an East Asian songbird with three subspecies. Two subspecies are long-distance migrants breeding at high latitudes and wintering in the tropics. The third is a year-round resident of the subtropical Ryukyu Islands. Using a new, low-cost, high-throughput DNA sequencing technology we will sequence 11 Narcissus Flycatcher genomes. Comparing the genomes of these very different populations will provide insight into the evolution of migration.
