Premise of the study: Plann automates the process of annotating a plastome sequence in GenBank format for either downstream processing or for GenBank submission by annotating a new plastome based on a similar, well-annotated plastome. Methods and Results: Plann is a Perl script to be executed on the command line. Plann compares a new plastome sequence to the features annotated in a reference plastome and then shifts the intervals of any matching features to the locations in the new plastome. Plann's output can be used in the National Center for Biotechnology Information's tbl2asn to create a Sequin file for GenBank submission. Conclusions: Unlike Web-based annotation packages, Plann is a locally executable script that will accurately annotate a plastome sequence to a locally specified reference plastome. Because it executes from the command line, it is ready to use in other software pipelines and can be easily rerun as a draft plastome is improved.
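The interval-shifting idea in the Plann abstract can be illustrated with a minimal sketch. This is not Plann's code (Plann is a Perl script, and the abstract does not state how matches are found); the function name `shift_features`, the 0-based end-exclusive coordinates, and the use of exact substring search in place of a real sequence comparison are all assumptions made for illustration.

```python
def shift_features(reference_seq, new_seq, features):
    """Relocate reference features onto new_seq (hypothetical helper).

    features: list of (name, start, end) tuples with 0-based, end-exclusive
    coordinates on reference_seq. Matching is by exact substring search,
    a simplifying assumption; features with no exact match are skipped.
    """
    shifted = []
    for name, start, end in features:
        feature_seq = reference_seq[start:end]
        pos = new_seq.find(feature_seq)  # first exact occurrence, or -1
        if pos == -1:
            continue
        shifted.append((name, pos, pos + len(feature_seq)))
    return shifted


# Toy data: the new sequence has two extra bases at the start.
reference = "AAACCCGGGTTTACGT"
new = "TTAAACCCGGGTTTACGT"
features = [("geneA", 3, 9), ("geneB", 12, 16)]
print(shift_features(reference, new, features))
```

With real plastomes, exact matching would miss features that differ by even one base, so an alignment-based comparison would be needed in practice.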
• Premise of the study: Hyb‐Seq, the combination of target enrichment and genome skimming, allows simultaneous data collection for low‐copy nuclear genes and high‐copy genomic targets for plant systematics and evolution studies. • Methods and Results: Genome and transcriptome assemblies for milkweed (Asclepias syriaca) were used to design enrichment probes for 3385 exons from 768 genes (>1.6 Mbp) followed by Illumina sequencing of enriched libraries. Hyb‐Seq of 12 individuals (10 Asclepias species and two related genera) resulted in at least partial assembly of 92.6% of exons and 99.7% of genes and an average assembly length >2 Mbp. Importantly, complete plastomes and nuclear ribosomal DNA cistrons were assembled using off‐target reads. Phylogenomic analyses demonstrated signal conflict between genomes. • Conclusions: The Hyb‐Seq approach enables targeted sequencing of thousands of low‐copy nuclear exons and flanking regions, as well as genome skimming of high‐copy repeats and organellar genomes, to efficiently produce genome‐scale data sets for phylogenomics.
• Premise of the study: Melissopalynology, the identification of bee‐collected pollen, provides insight into the flowers exploited by foraging bees. Information provided by melissopalynology could guide floral enrichment efforts aimed at supporting pollinators, but it has rarely been used because traditional methods of pollen identification are laborious and require expert knowledge. We approach melissopalynology in a novel way, employing a molecular method to study the pollen foraging of honey bees (Apis mellifera) in a landscape dominated by field crops, and compare these results to those obtained by microscopic melissopalynology. • Methods: Pollen was collected from honey bee colonies in Madison County, Ohio, USA, during a two‐week period in midspring and identified using microscopic methods and ITS2 metabarcoding. • Results: Metabarcoding identified 19 plant families and exhibited greater sensitivity for identifying the taxa present in large and diverse pollen samples than microscopy, which identified eight families. The bulk of pollen collected by honey bees was from trees (Sapindaceae, Oleaceae, and Rosaceae), although dandelion (Taraxacum officinale) and mustard (Brassicaceae) pollen were also abundant. • Discussion: For quantitative analysis of pollen, using both metabarcoding and microscopic identification is superior to either individual method. For qualitative analysis, ITS2 metabarcoding is superior, providing heightened sensitivity and genus‐level resolution.
The measurement of fitness is critical to biological research. Although the determination of fitness for some organisms may be relatively straightforward under controlled conditions, it is often a difficult or nearly impossible task in nature. Plants are no exception. The potential for long‐distance pollen dispersal, likelihood of multiple reproductive events per inflorescence, varying degrees of reproductive growth in perennials, and asexual reproduction all confound accurate fitness measurements. For these reasons, biomass is frequently used as a proxy for plant fitness. However, the suitability of indirect fitness measurements such as plant size is rarely evaluated. This review outlines the important associations between plant performance, fecundity, and fitness. We make a case for the reliability of biomass as an estimate of fitness when comparing conspecifics of the same age class. We reviewed 170 studies on plant fitness and discuss the metrics commonly employed for fitness estimations. We find that biomass or growth rate are frequently used and often positively associated with fecundity, which in turn suggests greater overall fitness. Our results support the utility of biomass as an appropriate surrogate for fitness under many circumstances, and suggest that additional fitness measures should be reported along with biomass or growth rate whenever possible.
Using sequence data generated via target enrichment for phylogenetics requires reassembly of high-throughput sequence reads into loci, presenting a number of bioinformatics challenges. We developed HybPiper as a user-friendly platform for assembly of gene regions, extraction of exon and intron sequences, and identification of paralogous gene copies. We test HybPiper using baits designed to target 333 phylogenetic markers and 125 genes of functional significance in Artocarpus (Moraceae). HybPiper implements parallel execution of sequence assembly in three phases: read mapping, contig assembly, and target sequence extraction. The pipeline was able to recover nearly complete gene sequences for all genes in 22 species of Artocarpus. HybPiper also recovered more than 500 bp of nontargeted intron sequence in over half of the phylogenetic markers and identified paralogous gene copies in Artocarpus. HybPiper was designed for Linux and Mac OS X and is freely available at https://github.com/mossmatters/HybPiper.
Premise of the study: Targeted sequencing using next-generation sequencing (NGS) platforms offers enormous potential for plant systematics by enabling economical acquisition of multilocus data sets that can resolve difficult phylogenetic problems. However, because discovery of single-copy nuclear (SCN) loci from NGS data requires both bioinformatics skills and access to high-performance computing resources, the application of NGS data has been limited. Methods and Results: We developed MarkerMiner 1.0, a fully automated, open-access bioinformatic workflow and application for discovery of SCN loci in angiosperms. Our new tool identified as many as 1993 SCN loci from transcriptomic data sampled as part of four independent test cases representing marker development projects at different phylogenetic scales. Conclusions: MarkerMiner is an easy-to-use and effective tool for discovery of putative SCN loci. It can be run locally or via the Web, and its tabular and alignment outputs facilitate efficient downstream assessments of phylogenetic utility, locus selection, intron-exon boundary prediction, and primer or probe development.
• Premise of the study: We explored a targeted enrichment strategy to facilitate rapid and low‐cost next‐generation sequencing (NGS) of numerous complete plastid genomes from across the phylogenetic breadth of angiosperms. • Methods and Results: A custom RNA probe set including the complete sequences of 22 previously sequenced eudicot plastomes was designed to facilitate hybridization‐based targeted enrichment of eudicot plastid genomes. Using this probe set and an Agilent SureSelect targeted enrichment kit, we conducted an enrichment experiment including 24 angiosperms (22 eudicots, two monocots), which were subsequently sequenced on a single lane of the Illumina GAIIx with single‐end, 100‐bp reads. This approach yielded nearly complete to complete plastid genomes with exceptionally high coverage (mean coverage: 717×), even for the two monocots. • Conclusions: Our enrichment experiment was highly successful even though many aspects of the capture process employed were suboptimal. Hence, significant improvements to this methodology are feasible. With this general approach and probe set, it should be possible to sequence more than 300 essentially complete plastid genomes in a single Illumina GAIIx lane (achieving ∼50× mean coverage). However, given the complications of pooling numerous samples for multiplex sequencing and the limited number of barcodes (e.g., 96) available in commercial kits, we recommend 96 samples as a current practical maximum for multiplex plastome sequencing. This high‐throughput approach should facilitate large‐scale plastid genome sequencing at any level of phylogenetic diversity in angiosperms.
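The multiplexing estimate in the plastome enrichment abstract can be checked with simple arithmetic. The sketch below assumes that the total coverage obtained from one lane stays constant and is divided evenly among pooled samples of similar plastome size; the function name `max_samples_per_lane` is hypothetical.

```python
def max_samples_per_lane(samples_run, observed_mean_coverage, target_mean_coverage):
    """Estimate how many samples one lane could hold at a lower target coverage.

    Assumes the lane's total coverage is fixed and shared evenly among samples.
    """
    total_coverage = samples_run * observed_mean_coverage
    return int(total_coverage // target_mean_coverage)


# 24 samples at a mean of 717x, rescaled to a 50x mean target.
print(max_samples_per_lane(24, 717, 50))  # 344
```

Under these assumptions the result is consistent with the abstract's figure of more than 300 plastomes per lane at roughly 50× mean coverage.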
Premise of the study: Difficulties inherent in microscopic pollen identification have resulted in limited implementation for large-scale studies. Metabarcoding, a relatively novel approach, could make pollen analysis less onerous; however, improved understanding of the quantitative capacity of various plant metabarcode regions and primer sets is needed to ensure that such applications are accurate and precise. Methods and Results: We applied metabarcoding, targeting the ITS2, matK, and rbcL loci, to characterize six samples of pollen collected by honey bees, Apis mellifera. Additionally, samples were analyzed by light microscopy. We found significant rank-based associations between the relative abundance of pollen types within our samples as inferred by the two methods. Conclusions: Our findings suggest metabarcoding data from plastid loci, as opposed to the ribosomal locus, are more reliable for quantitative characterization of pollen assemblages. Furthermore, multilocus metabarcoding of pollen may be more reliable than single-locus analyses, underscoring the need for discovering novel barcodes and barcode combinations optimized for molecular palynology.
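A rank-based association between two sets of relative abundances can be computed as in the sketch below. The abstract does not name the statistic used, so Spearman's rank correlation is an assumed choice here, and the abundance values are invented for illustration. No significance test is included.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical relative abundances of four pollen types in one sample.
microscopy = [0.50, 0.30, 0.15, 0.05]
metabarcoding = [0.40, 0.35, 0.05, 0.20]
print(spearman_rho(microscopy, metabarcoding))
```

Because only ranks enter the calculation, the statistic asks whether the two methods order the pollen types similarly, not whether their proportions agree.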
Premise of the study: To study pollination networks in a changing environment, we need accurate, high-throughput methods. Previous studies have shown that more highly resolved networks can be constructed by studying pollen loads taken from bees, relative to field observations. DNA metabarcoding potentially allows for faster and finer-scale taxonomic resolution of pollen compared to traditional approaches (e.g., light microscopy), but has not been applied to pollination networks. Methods: We sampled pollen from 38 bee species collected in Florida from sites differing in forest management. We isolated DNA from pollen mixtures and sequenced rbcL and ITS2 gene regions from all mixtures in a single run on the Illumina MiSeq platform. We identified species from sequence data using comprehensive rbcL and ITS2 databases. Results: We successfully built a proof-of-concept quantitative pollination network using pollen metabarcoding. Discussion: Our work underscores that pollen metabarcoding is not quantitative but that quantitative networks can be constructed based on the number of interacting individuals. Due to the frequency of contamination and false positive reads, isolation and PCR negative controls should be used in every reaction. DNA metabarcoding has advantages in efficiency and resolution over microscopic identification of pollen, and we expect that it will have broad utility for future studies of plant-pollinator interactions.
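The idea of weighting network links by the number of interacting individuals, not by read counts, can be sketched as follows. The record layout, the function name `build_network`, and the example taxa are assumptions for illustration only and are not taken from the study.

```python
from collections import defaultdict


def build_network(detections):
    """Build a weighted bee-plant network from presence/absence detections.

    detections: iterable of (bee_species, bee_individual_id, plant_taxon).
    The weight of each (bee_species, plant_taxon) link is the number of
    distinct bee individuals on which that plant taxon was detected.
    """
    carriers = defaultdict(set)
    for bee_species, individual_id, plant_taxon in detections:
        carriers[(bee_species, plant_taxon)].add(individual_id)
    return {edge: len(ids) for edge, ids in carriers.items()}


# Hypothetical detections; a repeated record does not inflate the weight.
detections = [
    ("Bombus impatiens", "b1", "Ilex"),
    ("Bombus impatiens", "b2", "Ilex"),
    ("Bombus impatiens", "b2", "Ilex"),
    ("Bombus impatiens", "b2", "Rubus"),
    ("Apis mellifera", "a1", "Rubus"),
]
print(build_network(detections))
```

Counting individuals in this way keeps the link weights quantitative even if the read counts themselves are treated only as evidence of presence.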
PREMISE OF THE STUDY: Herbarium specimens provide a robust record of historical plant phenology (the timing of seasonal events such as flowering or fruiting). However, the difficulty of aggregating phenological data from specimens arises from a lack of standardized scoring methods and definitions for phenological states across the collections community. METHODS AND RESULTS: To address this problem, we report on a consensus reached by an iDigBio working group of curators, researchers, and data standards experts regarding an efficient scoring protocol and a data-sharing protocol for reproductive traits available from herbarium specimens of seed plants. The phenological data sets generated can be shared via Darwin Core Archives using the Extended MeasurementOrFact extension. CONCLUSIONS: Our hope is that curators and others interested in collecting phenological trait data from specimens will use the recommendations presented here in current and future scoring efforts. New tools for scoring specimens are reviewed.
Premise of the Study: Predicting the flowering times of angiosperm taxa is a goal of mounting importance in the face of future climate change, with applications not only in plant biology and ecology, but also horticulture, agriculture, and invasive species management. To date, no tool is available to facilitate predictions of flowering phenology using multivariate phenoclimatic models. Such a tool is needed by researchers and other stakeholders who need to predict phenological activity, but are unfamiliar with phenoclimate modeling techniques. PhenoForecaster allows users of any background to conduct species-specific phenological predictions using an intuitive graphical interface and provides an estimate of each prediction's accuracy. Methods and Results: Elastic net regression techniques were used to develop species-specific models capable of predicting the flowering dates of 2320 angiosperm species. Conclusions: PhenoForecaster is the first stand-alone package to make phenological modeling directly accessible to users without the need for in-depth phenological observations.
Premise of the study: The One Thousand Plant Transcriptomes Project (1KP, 1000+ assembled plant transcriptomes) provides an enormous resource for developing microsatellite loci across the plant tree of life. We developed loci from these transcriptomes and tested their utility. Methods and Results: Using software packages and custom scripts, we identified microsatellite loci in 1KP transcriptomes. We assessed the potential for cross-amplification and whether loci were biased toward exons, as compared to markers derived from genomic DNA. We characterized over 5.7 million simple sequence repeat (SSR) loci from 1334 plant transcriptomes. Eighteen percent of loci substantially overlapped with open reading frames (ORFs), and electronic PCR revealed that over half the loci would amplify successfully in conspecific taxa. Transcriptomic SSRs were approximately three times more likely to map to translated regions than genomic SSRs. Conclusions: We believe microsatellites still have a place in the genomic age; they remain effective and cost-efficient markers. The loci presented here are a valuable resource for researchers.
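Detecting simple sequence repeats in a transcript can be sketched with a regular expression. The abstract does not specify which software or thresholds were used, so the function name `find_ssrs`, the motif lengths (2 to 6 bases), and the minimum of four tandem copies are illustrative assumptions, not the authors' settings.

```python
import re

# A motif of 2-6 bases followed by at least three more copies of itself,
# i.e. four or more tandem copies in total (an assumed threshold).
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{3,}")


def find_ssrs(sequence):
    """Return (start, motif, repeat_count) for each SSR found (hypothetical helper).

    Positions are 0-based. Runs of a single base are skipped, because they
    are homopolymers and not di- to hexanucleotide repeats.
    """
    hits = []
    for match in SSR_PATTERN.finditer(sequence.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue
        repeats = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeats))
    return hits


print(find_ssrs("GGCATATATATATGGC"))  # [(3, 'AT', 5)]
```

A production tool would also handle ambiguous bases, compound repeats, and flanking-sequence extraction for primer design, none of which this sketch attempts.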
Premise of the Study: Phenological annotation models computed on large-scale herbarium data sets were developed and tested in this study. Methods: Herbarium specimens represent a significant resource with which to study plant phenology. Nevertheless, phenological annotation of herbarium specimens is time-consuming, requires substantial human investment, and is difficult to mobilize at large taxonomic scales. We created and evaluated new methods based on deep learning techniques to automate annotation of phenological stages and tested these methods on four herbarium data sets representing temperate, tropical, and equatorial American floras. Results: Deep learning allowed correct detection of fertile material with an accuracy of 96.3%. Accuracy was slightly decreased for finer-scale information (84.3% for flower and 80.5% for fruit detection). Discussion: The method described has the potential to allow fine-grained phenological annotation of herbarium specimens at large ecological scales. Deeper investigation regarding the taxonomic scalability of this approach is needed.
Premise of the Study: The Plant Phenology Ontology (PPO) was originally developed to integrate phenology observations of whole plants across different global observation networks. Here we describe a new release of the PPO and associated data pipelines that supports integration of phenology observations from herbarium specimens, which provide historical and modern phenology data. Methods and Results: Critical changes to the PPO include key terms that describe how measurements from parts of plants, which are captured in most imaged herbarium specimens, relate to whole plants. We provide proof of concept for ingesting annotations from imaged herbarium sheets of Prunus serotina, the common black cherry. We then provide an example analysis of changes in flowering timing over the past 125 years, demonstrating the value of integrating herbarium and observational phenology data sets. Conclusions: These conceptual and technical advances will support the addition of phenology data from herbaria, but also could be expanded upon to facilitate the inclusion of data from photograph-based citizen science platforms. With the incorporation of herbarium phenology data, new historical baseline data will strengthen the capability to monitor, model, and forecast plant phenology changes.
Premise of the Study: Herbarium specimens are increasingly used to study reproductive phenology. Here, we ask whether classifying reproduction into progressively finer-scale stages improves our understanding of the relationship between climate and reproductive phenology. Methods: We evaluated Acer rubrum herbarium specimens across eastern North America, classifying them into eight reproductive phenophases and four stages of leaf development. We fit models with different reproductive phenology categorization schemes (from detailed to broad) and compared model fits and coefficients describing temperature, elevation, and year effects. We fit similar models to leaf phenology data to compare reproductive to leafing phenology. Results: Finer-scale reproductive phenophases improved model fits and provided more precise estimates of reproductive phenology. However, models with fewer reproductive phenophases led to similar qualitative conclusions, demonstrating that A. rubrum reproduces earlier in warmer locations, lower elevations, and in recent years, as well as that leafing phenology is less strongly influenced by temperature than is reproductive phenology. Discussion: Our study suggests that detailed information on reproductive phenology provides a fuller understanding of potential climate change effects on flowering, fruiting, and leaf-out. However, classification schemes with fewer reproductive phenophases provided many similar insights and may be preferable in cases where resources are limited.
Premise of the Study: Herbarium specimens are increasingly used in phenological studies. However, natural history collections can have biases that influence the analysis of phenological events. Arctic environments, where remoteness and cold climate govern collection logistics, may give rise to unique or pronounced biases. Methods: We assessed the presence of biases in time, space, phenological events, collectors, taxonomy, and plant traits across Nunavut using herbarium specimens accessioned at the National Herbarium of Canada (CAN). Results: We found periods of high and low collection that corresponded to societal and institutional events; greater collection density close to common points of air and sea access; and preferences to collect plants at the flowering phase and in peak flower, and to collect particular taxa, flower colours, growth forms, and plant heights. One-quarter of collectors contributed 90% of the collection. Discussion: Collections influenced by temporal and spatial biases have the potential to misrepresent phenology across space and time, whereas those shaped by the interests of collectors or the tendency to favour particular phenological stages, taxa, and plant traits could give rise to imbalanced phenological comparisons. Underlying collection patterns may vary among regions and institutions. To guide phenological analyses, we recommend routine assessment of any herbarium data set prior to its use.
Premise of the Study: Fungal diversity (richness) trends at large scales are in urgent need of investigation, especially through novel situations that combine long-term observational with environmental and remotely sensed open-source data. Methods: We modeled fungal richness, with collections-based records of saprotrophic (decaying) and ectomycorrhizal (plant mutualistic) fungi, using an array of environmental variables across geographical gradients from northern to central Europe. Temporal differences in covariables granted insight into the impacts of the shorter- versus longer-term environment on fungal richness. Results: Fungal richness varied significantly across different land-use types, with highest richness in forests and lowest in urban areas. Latitudinal trends supported a unimodal pattern in diversity across Europe. Temperature, both annual mean and range, was positively correlated with richness, indicating the importance of seasonality in increasing richness amounts. Precipitation seasonality notably affected saprotrophic fungal diversity (a unimodal relationship), as did daily precipitation of the collection day (negatively correlated). Ectomycorrhizal fungal richness differed from that of saprotrophs by being positively associated with tree species richness. Discussion: Our results demonstrate that fungal richness is strongly correlated with land use and climate conditions, especially concerning seasonality, and that ongoing global change processes will affect fungal richness patterns at large scales.