Signal, bias, and the role of transcriptome assembly quality in phylogenomic inference

Academic Article


  • The empirical details of whole transcriptome sequencing and assembly have been thoroughly evaluated, but few studies have addressed how user-defined aspects of the assembly process may influence performance in phylogenomic analyses. Errors in transcriptome assembly could affect ortholog prediction, alignment quality, and phylogenetic signal. Here we investigate the impacts of transcriptome assembly quality in phylogenomic studies by constructing phylogenomic data matrices from alternative transcriptome assemblies representing high-quality and intentionally low-quality assembly outcomes. We leveraged a well-resolved topology for craniates to apply a topological constraint to our analyses, providing a way to quantify phylogenetic signal. Craniates are amply represented in publicly available raw RNA-seq repositories, allowing us to control for transcriptome tissue type as well. By studying the performance of phylogenomic datasets derived from these alternative high- and low-quality inputs in a controlled experiment, we show that high-quality transcriptomes produce richer phylogenomic datasets with partitions that have lower alignment ambiguity, less compositional bias, and stronger phylogenetic signal than low-quality transcriptome assemblies. Our findings demonstrate the importance of transcriptome assembly in phylogenomic analyses and suggest that a portion of the uncertainty observed in phylogenomic studies could be alleviated at the assembly stage.
Authors

  • Spillane, Jennifer
  • LaPolice, Troy
  • MacManes, Matthew
  • Plachetzki, David

Status

    Publication Date

  • 2020

Digital Object Identifier (doi)