Sequence Data Finally In! The Good, the Bad, and the Ugly
After all these months of buildup, unbelievably tedious PCRs, and meticulous culturing, some sequence data have finally arrived - currently 192 individual sequences across all six barcodes and 27 strains of algae!
So first, the good news:
In some initial tests, I've already been able to get identifications for some of my previously unidentified strains. The large spheres characteristic of JIAC15, 21, and 24 have turned out to be (most likely) Chlorococcum strains, the very strange morphology of JIAC7 matches the genus Geminella, and JIAC23, initially suspected to be a terrestrial member of the Chaetopeltidales and later revised to a guess of Klebsormidium, was indeed found to be Klebsormidium.
The bad:
Not every marker performed equally well, and not every strain had sequence data to identify it with. And closely associated with this idea,
The ugly:
Some bacterial sequences that got picked up unintentionally include those for some really gnarly genera, like Pseudomonas or Stenotrophomonas, which can cause serious infections in humans with weak immune systems and are very difficult to treat with antibiotics. Yuck!
With that being said, let's look at the basic success of each barcode region, so we can start answering (at least in part) the original research question of this project, although of course there is a lot more work to be done (and many sequencing and, ugh, PCR repeats as well). Success is defined here as whether BLAST, a giant government-run tool for matching a sequence against a database of known sequences, was able to pair my sample sequence with a reasonable guess for the true species identity. Sure, it's subjective to some degree and not very quantifiable, but it's a good test to see if I really have algal DNA, bacterial DNA, or a sequence so mushy and low-quality that it doesn't look like anything recognizable.
With this in mind, the UPA and tufA markers performed absolutely terribly: both amplified mostly bacterial rather than algal DNA. That kind of surprised me, because even though these markers were designed to amplify only from photosynthetic organisms (green algae and maybe cyanobacteria), they picked up the sequences of some definitely... regular bacteria. The UPA marker returned 3 algal sequences, and the tufA marker returned 7 good sequences (technically 9 that could identify algae, although two of these were of low quality), even though both yielded far more PCR products (they just happened to be bacterial in origin).
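The three-way sorting described above (algal, bacterial, or unrecognizable) can be sketched in a few lines of Python. This is only an illustration of the idea, not the procedure actually used for this project: the function name `classify_top_hit` is hypothetical, the genus lists contain only the names mentioned in this post, and the genus of the top BLAST hit is assumed to have been read off by hand (no BLAST query is run here).

```python
from typing import Optional

# Illustrative lists only: just the genera named in this post.
# A real check would need a far more complete taxonomy lookup.
ALGAL_GENERA = {"Chlorococcum", "Geminella", "Klebsormidium"}
BACTERIAL_GENERA = {"Pseudomonas", "Stenotrophomonas"}


def classify_top_hit(genus: Optional[str]) -> str:
    """Sort one sequence by the genus of its top BLAST hit.

    `genus` is None when the read was too low-quality for BLAST
    to return any recognizable match.
    """
    if genus is None:
        return "unrecognized"
    if genus in ALGAL_GENERA:
        return "algal"
    if genus in BACTERIAL_GENERA:
        return "bacterial"
    # A hit outside both lists still needs a human look.
    return "needs review"


for hit in ["Klebsormidium", "Stenotrophomonas", None]:
    print(hit, "->", classify_top_hit(hit))
```

The "needs review" bucket is there because the call is admittedly subjective; anything the lists don't cover still gets checked by eye.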
The ITS, 18S, and 26S regions, on the other hand, are all components found ONLY in eukaryotic organisms, basically meaning that bacterial contamination would not be an issue. None of these sequences were bacterial in origin, although the success rate was still not 100% due to some low-quality sequences that could not be recognized by the BLAST program. Thirteen ITS sequences passed the first inspection and yielded algal sequence matches, compared to 31 sequences for the 18S marker (in both the forward and reverse directions) and 38 for the 26S marker.
Finally, the rbcL gene performed the best for me (which I'm definitely happy about, seeing as I designed the PCR primers to pick out this gene region myself!). I think it worked so well because the primers were designed specifically for the green algae, excluding not only all bacteria but also almost all other culture contaminants like molds, yeasts, diatoms, or any protozoa. None of the working sequences had any evident bacterial contamination, and although a few of these sequences failed due to low-quality sequencing, 42 other sequences successfully matched green algal sequences in the BLAST database.
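To put the six markers side by side, here is a small Python tally of the algal-match counts reported above. The counts come straight from this post (tufA uses the 7 good sequences, not the 9 that include two low-quality ones); the variable names are just for illustration.

```python
# Number of sequences per marker that BLAST matched to algae,
# as reported in this post.
algal_matches = {
    "UPA": 3,
    "tufA": 7,   # 9 if the two low-quality matches are counted
    "ITS": 13,
    "18S": 31,   # forward and reverse reads together
    "26S": 38,
    "rbcL": 42,
}

# Rank markers from most to fewest algal matches.
ranked = sorted(algal_matches.items(), key=lambda kv: kv[1], reverse=True)
total = sum(algal_matches.values())

for marker, count in ranked:
    print(f"{marker:>5}: {count}")
print(f"Total algal matches: {total}")
```

These are raw counts rather than success rates, since the number of sequences attempted per marker isn't broken down here.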
Now, I have some more tedious work to do, including editing my sequences (essentially it's like proofreading them - cutting out unnecessary parts and "correcting" a few misspellings), redoing the necessary PCR, and most importantly, finding out how to get rid of the bacterial contamination so that I can give the UPA and tufA markers a fair appraisal. Stay tuned...stuff is getting exciting now!