The Ideal Molecular Barcode for Identifying Freshwater Green Algae (Chlorophyceae)

Mira Loma High School
Sacramento, California
DOI: 10.18258/9257
$3,204 raised of $3,200 goal (100%)
Successfully funded on 6/17/17



The major aim of this project is to investigate a few selected short molecular barcode regions in both the chloroplast and nuclear genomes of various species of algae. The best-known and still widely applied method for this process, which will be employed in this project, is genomic PCR followed by Sanger (dye-termination) sequencing. In the analogy of a grocery-store barcode, PCR is equivalent to printing out a barcode from a template, and Sanger sequencing is equivalent to scanning this barcode to identify the target product (in this case, the species of algae).

The first steps of the project will require starting and maintaining pure cultures of a few dozen wild-type strains and harvesting this biomass for use in extracting DNA for PCR. For this project, the most practical and successful method of isolating algae and starting clonal cultures has been successive dilution on agar plates. The plates are made with a freshwater algal medium solidified with ~1% agar, and a drop of centrifugally concentrated water sample is then streaked on each plate. After a few weeks, colonies of green algae are randomly selected, collected using micropipettes made from Pasteur pipettes, and streaked on new media. A light microscope can be used to assist in harvesting cells of just one particular species for a quicker isolation. This process is repeated until light microscopy reveals only one species of algae growing in a particular colony, at which point the colony can be used to inoculate agar slant tubes or plates for growing and maintaining larger biomasses of algae.

For the PCR in this experiment (copying the molecular barcode region out of the raw genomic DNA of the algae), direct or robust methods will be used, which involve minimal or no preparation of template DNA (that is, extracting and purifying the genomic DNA from the algae). The main reason is that the equipment required for homogenizing the algal tissue (liquid nitrogen or bead-beaters) is very expensive and difficult to operate in a limited lab setting; in addition, DNA extraction and preparation have proved to be time-consuming and not very successful. On the other hand, special polymerases, buffer systems and modified PCR protocols originally intended for use with plants and crude samples have worked very well for algae in preliminary experiments. The sample preparation for these can be as simple as a Proteinase K digestion and lysis with a detergent solution, with this rough mix added directly to the PCR as a template. A variety of products for this process exist, including polymerases that combine extreme resistance to plant-based PCR inhibitors (principally polysaccharides and polyphenols) with high fidelity. This means that the traditional step of creating a dedicated PCR template consisting of only pure genomic DNA can be bypassed, speeding up the experimentation process and the replication of the intended barcode regions. The project is currently in this phase, with preliminary experiments underway to identify the most effective polymerase systems for PCR. Additional funding will be required (as described in the project budget) to purchase a sufficient quantity of this polymerase to complete these experiments.

Standard agarose gel electrophoresis will be used to confirm the success of each reaction: whether or not the intended product was copied, and in a sufficient quantity. The PCR mixtures will then be purified to remove salts, enzymes, and small DNA fragments, leaving only synthesized DNA behind. In cases where DNA was not amplified specifically, nested PCR will be performed to copy the appropriate DNA fragment out of the purified mixture. Where the amplified DNA is free of spurious products and present in sufficient quantity for downstream use, it will be purified and prepared for sequencing. Sequencing is by far the most important portion of this project's intended research and methods. Currently, it appears that the best way to conduct this step is to work with a company or a university sequencing core that can provide the sequencing service, although the use of personal sequencing technologies will also be investigated. Sequencing will most likely be run using the Sanger dye-termination method, although Nanopore sequencing technology could potentially be used as well. The plan for phylogenetic analysis of the sequenced barcodes has not yet been finalized, but it will be elaborated on as the project nears this phase.
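The gel check described above can be made a little more systematic by estimating each band's size from a DNA ladder run on the same gel. The sketch below is an illustration only, not the project's protocol: it assumes band positions have been measured by hand in millimetres, relies on the common approximation that log10(fragment size) is roughly linear in migration distance between neighbouring ladder bands, and uses hypothetical function names and a hypothetical 10% tolerance.

```python
import math


def estimate_band_size(ladder, band_distance_mm):
    """Estimate a band's size in bp from ladder bands on the same gel.

    ladder: list of (migration_distance_mm, size_bp) pairs with distinct
    distances. The size is interpolated linearly in log10(size) between
    the two ladder bands that bracket the measured distance.
    """
    points = sorted(ladder)
    for (d1, s1), (d2, s2) in zip(points, points[1:]):
        if d1 <= band_distance_mm <= d2:
            frac = (band_distance_mm - d1) / (d2 - d1)
            log_size = math.log10(s1) + frac * (math.log10(s2) - math.log10(s1))
            return 10 ** log_size
    raise ValueError("band lies outside the ladder range")


def matches_expected(ladder, band_distance_mm, expected_bp, tolerance=0.10):
    """True if the estimated size is within `tolerance` (as a fraction)
    of the expected amplicon length. The 10% default is an assumption."""
    size = estimate_band_size(ladder, band_distance_mm)
    return abs(size - expected_bp) / expected_bp <= tolerance
```

A band that falls outside the ladder range raises an error rather than being extrapolated, since the log-linear approximation is least reliable there.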


The most difficult part of this project will likely be the PCR phase. Because green algae vary tremendously in their ecological niches and growth forms (single cells, connected cell colonies, or cells embedded in an extracellular matrix of some sort), effective PCR may prove difficult if only one polymerase system is used. After the preliminary round of experimentation, three polymerase systems have been identified that perform as necessary in robust and direct PCR, although which system works best under which specific conditions has yet to be determined. Most likely, several PCRs will be run using slightly different cycling protocols or methods and compared by electrophoresis before the reaction whose DNA will be used for sequencing is selected and purified.

Another likely challenge is the difficulty of phylogenetic analysis. This work involves constructing several phylogenetic trees, comparing the sequences obtained in this experiment with those already known and described (e.g. on GenBank), and verifying the accuracy of the molecular barcodes by comparing these data to the most recent and widely accepted taxonomic organization of green algae. It will likely prove quite intensive for a relative newcomer to this field. However, once data collection is complete, the project can be shared with professors or other scientists and experts at a local university or elsewhere who may be able to assist or provide advice on this process.

Pre Analysis Plan

There are several ways by which the effectiveness of various molecular barcodes to identify green algae can be compared, all of which are expected to be used in this project. These different methods each investigate a different aspect by which the barcodes can be judged: universality/convenience of use, ability to identify algae with accuracy and precision (preferably down to the species level), and ability to correctly demonstrate algal phylogenies and interrelationships. 

The first and simplest of these is electrophoresis of PCR products. This process doesn't offer any data regarding the actual sequence of the barcode or other DNA region, but it is useful because it readily shows whether or not the barcode can be amplified from a certain strain, and with what ease. For example, electrophoresis alone makes clear that a barcode which amplified successfully from 50 strains using a single protocol may be easier and more practical to use for the identification of algae than a barcode which can only be amplified from 10 strains, and only after much optimization of reaction conditions.
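This comparison amounts to a per-barcode amplification success rate. A minimal sketch follows, assuming each gel lane has been scored by eye as a success or failure; the function names and the record layout are hypothetical and not part of the project's methods.

```python
from collections import defaultdict


def amplification_success(gel_results):
    """Return {barcode: fraction of attempts that gave the expected band}.

    gel_results: iterable of (barcode, strain, amplified) tuples, where
    `amplified` is True if the gel showed a band of the expected size.
    """
    attempted = defaultdict(int)
    succeeded = defaultdict(int)
    for barcode, _strain, amplified in gel_results:
        attempted[barcode] += 1
        if amplified:
            succeeded[barcode] += 1
    return {b: succeeded[b] / attempted[b] for b in attempted}


def rank_barcodes(gel_results):
    """List (barcode, success_rate) pairs, most universal barcode first."""
    rates = amplification_success(gel_results)
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)
```

A success rate captures only universality. The amount of protocol optimization each barcode needed would have to be recorded separately, for instance as the number of protocols tried per strain.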

After being sequenced, the barcodes can then be used directly as they would be in an applied study: for identifying the algae with no microscopic assistance. Based solely on BLAST or a similar sequence comparison program, the barcodes can be used to assign each strain a tentative identification, preferably one for each barcode region tested. Afterwards, microscopy can be used to identify the strains in a traditional manner, and with that as a control, the identities proposed for the algae by each molecular barcode can be compared.
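One way to summarize this comparison is to score, for each barcode, how often its sequence-based identification agrees with the microscopy control at the species and genus levels. The sketch below is an assumption about how such a tally could be kept, not the project's analysis: the data layout is hypothetical, names are assumed to be binomials with the genus first, and any taxon names passed in are the user's own.

```python
def genus(taxon):
    """Return the genus (first word) of a binomial name."""
    return taxon.split()[0]


def agreement_by_barcode(microscopy_ids, barcode_ids):
    """Score each barcode's sequence-based IDs against the microscopy control.

    microscopy_ids: {strain: taxon name assigned by microscopy}
    barcode_ids: {barcode: {strain: taxon name of the best sequence match}}
    Returns {barcode: (species_agreement, genus_agreement)} as fractions of
    all control strains; a strain with no barcode ID counts as a miss.
    """
    if not microscopy_ids:
        raise ValueError("the microscopy control is empty")
    total = len(microscopy_ids)
    scores = {}
    for barcode, calls in barcode_ids.items():
        species_hits = 0
        genus_hits = 0
        for strain, control in microscopy_ids.items():
            call = calls.get(strain)
            if call is None:
                continue  # barcode failed to amplify or sequence for this strain
            if call == control:
                species_hits += 1
            if genus(call) == genus(control):
                genus_hits += 1
        scores[barcode] = (species_hits / total, genus_hits / total)
    return scores
```

Reporting genus-level agreement alongside species-level agreement separates barcodes that fail outright from those that simply lack species-level resolution.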

Finally, with the aid of phylogenetic analysis software, all of the sequences of one particular type of barcode can be used to construct a phylogenetic tree. Combined with the assigned identities of each species of algae, these trees can be compared to the most recent or comprehensive algal phylogeny model to check for similarity. Ideally, the molecular barcode should not only be able to identify the algae, but also correctly resolve its lineage and relation to other species, genera or families. This final use of the molecular barcode may enable it to be applied far more broadly to taxonomic re-evaluations or new studies of recently discovered species or poorly understood algal taxa.
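A simple way to quantify how similar a barcode tree is to a reference phylogeny is to count the clades that appear in one tree but not the other, a rooted variant of the Robinson-Foulds distance. The sketch below is illustrative only and is not the analysis the project has committed to: it assumes both trees are rooted, contain the same taxa, and are written as nested tuples of leaf names. Dedicated phylogenetics software would normally handle this step.

```python
def clades(tree):
    """Return (all leaves, set of non-trivial clades) for a nested-tuple tree.

    A leaf is a string; an internal node is a tuple of child subtrees.
    Each clade is the frozenset of leaf names below one internal node.
    """
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves

    all_leaves = walk(tree)
    found.discard(all_leaves)  # the root clade is shared by every tree
    return all_leaves, found


def clade_distance(tree_a, tree_b):
    """Count clades present in exactly one of the two rooted trees."""
    leaves_a, clades_a = clades(tree_a)
    leaves_b, clades_b = clades(tree_b)
    if leaves_a != leaves_b:
        raise ValueError("trees must contain the same taxa")
    return len(clades_a ^ clades_b)
```

A distance of 0 means the barcode tree recovers every grouping in the reference. Larger values indicate more disagreement, though the count says nothing about branch lengths or statistical support.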


This project has not yet shared any protocols.