Introducing gold-standard essential gene datasets for Pseudomonas aeruginosa to enhance Tn-Seq analyses
Cléophée Van Maele,
Ségolène Caboche,
Nathan Nicolau-Guillaumet,
Anaëlle Muggeo and
Thomas Guillard
PLOS Computational Biology, 2026, vol. 22, issue 2, 1-19
Abstract:
Transposon Sequencing (Tn-Seq) is a high-throughput technique that utilizes transposon mutant libraries to assess gene fitness or essentiality under specific conditions, potentially identifying novel therapeutic targets. However, the diversity of statistical methods, bioinformatics tools, and parameters complicates the selection of the most appropriate and reliable analysis pipeline for a given dataset. A significant limitation of existing studies is the absence of a gold-standard set of essential genes (EGs) for evaluating the analysis process. Relying on the original study as a gold standard is suboptimal, as these results may have been obtained using non-optimal tools. Here, we introduce reliable EG datasets for Pseudomonas aeruginosa to enhance Tn-Seq analyses. By utilizing literature data and sequencing of six samples from PA14 Wild-Type (WT) and PA14 OprD-deficient (ΔoprD), grown in LB medium, we compared EG lists generated by several statistical methods of TRANSIT2 and by the FiTnEss tools. We established a reference dataset of 84 genes found in P. aeruginosa and another gold-standard set composed of 115 genes specific to PA14 grown in LB. Our findings revealed that, depending on the analysis method used, retrieval rates of gold-standard genes ranged from 0% to 100%. The Hidden Markov Model (HMM) method available in TRANSIT2 identified approximately 90% of gold-standard EGs, while FiTnEss identified up to 100%. This study addressed a critical gap in the field by providing gold-standard sets of EGs, enabling comparative evaluation of Tn-Seq analysis methods to help researchers select the most suitable bioinformatics pipeline for a given Tn-Seq dataset. We anticipate that our results will facilitate Tn-Seq analysis comparisons, harmonize P. aeruginosa-related studies, promote standardization and enhance reproducibility. Ultimately, this will lead to more reliable identification of EGs and potential therapeutic targets in P. aeruginosa, advancing our understanding of this important pathogen.
Author summary:
Tn-Seq analyses are a crucial resource for understanding gene function and identifying potential new therapeutic targets. However, during our analyses, we encountered a vast array of available tools and disparities in the results reported in the literature. Therefore, we conducted a comparative evaluation of bioinformatics tools based on two gold-standard datasets of essential genes for Pseudomonas aeruginosa that we established. This approach enables researchers performing Tn-Seq analysis to assess the quality of their results, thereby promoting consistency and harmonization across studies.
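The retrieval rate mentioned in the abstract can be read as the fraction of gold-standard essential genes that a given analysis method also calls essential. The following is a minimal sketch of that comparison under this assumption, not the authors' pipeline: the function name `retrieval_rate`, the method labels, and the gene identifiers are all hypothetical placeholders, and no real TRANSIT2 or FiTnEss output format is assumed.

```python
def retrieval_rate(predicted, gold_standard):
    """Return the fraction of gold-standard genes found in a predicted list.

    Assumption: both arguments are iterables of gene identifiers that use
    the same naming scheme (for example, locus tags).
    """
    gold = set(gold_standard)
    if not gold:
        raise ValueError("gold-standard set is empty")
    return len(gold & set(predicted)) / len(gold)


# Placeholder gold-standard set (hypothetical identifiers, not real data).
gold = {"geneA", "geneB", "geneC", "geneD"}

# Placeholder essential-gene calls from three hypothetical analysis methods.
calls = {
    "method_A": {"geneA", "geneB", "geneC", "geneD", "geneX"},
    "method_B": {"geneA", "geneB"},
    "method_C": set(),
}

# Compute one retrieval rate per method and print them, highest first.
rates = {name: retrieval_rate(genes, gold) for name, genes in calls.items()}
for name, rate in sorted(rates.items(), key=lambda item: -item[1]):
    print(f"{name}: {rate:.0%}")
```

Note that this measures only recovery of the gold-standard genes; extra genes called essential by a method (such as `geneX` here) do not lower the rate, so a separate check would be needed to assess over-calling.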
Date: 2026
Downloads:
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1013945 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 13945&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1013945
DOI: 10.1371/journal.pcbi.1013945