E-probe Diagnostic Nucleic acid Analysis (EDNA): a theoretical approach for handling of next generation sequencing data for diagnostics

Anthony H Stobbe, Jon Daniels, Andres S Espindola, Ruchi Verma, Ulrich Melcher, Francisco Ochoa Corona, Carla Garzon, Jacqueline Fletcher, William Schneider

September 2013

Abstract

Plant biosecurity requires rapid identification of pathogenic organisms. While there are many pathogen-specific diagnostic assays, the ability to test for large numbers of pathogens simultaneously is lacking. Next generation sequencing (NGS) allows one to detect all organisms within a given sample, but assembly and similarity searching of the sequence data are computationally demanding, which extends the time needed to make a diagnostic decision. To minimize the bioinformatic processing time needed, unique pathogen-specific sequences (termed e-probes) were designed for use in searches of unassembled sequence data that have not been quality checked. E-probes were designed and tested for several selected phytopathogens, including an RNA virus, a DNA virus, bacteria, fungi, and an oomycete, illustrating the ability to detect diverse plant pathogens. E-probes of 80 or more nucleotides in length provided satisfactory levels of precision (75%). The number of e-probes designed for each pathogen varied with the genome size of the pathogen. To lend confidence to diagnostic calls, a statistical method for determining the presence of a given pathogen was developed, in which signals from target e-probes (detection signal) are compared to signals generated by a decoy set of e-probes (background signal). The E-probe Diagnostic Nucleic acid Analysis (EDNA) process provides the framework for a new sequence-based detection system that eliminates the need for assembly of NGS data.
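The target-versus-decoy comparison described in the abstract can be illustrated with a toy sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: exact substring matching stands in for the similarity search, a decoy is taken to be the reversed target e-probe, the per-probe hit count serves as the signal, and an exact paired sign-flip test stands in for the statistical method, which the abstract does not specify. The function names (`count_hits`, `make_decoy`, `sign_flip_p_value`, `edna_call`) are hypothetical.

```python
from itertools import product


def count_hits(probe, reads):
    """Count reads that contain the probe as an exact substring.

    Assumption: exact matching is a simplified stand-in for a similarity search.
    """
    return sum(probe in read for read in reads)


def make_decoy(probe):
    """Assumption: a decoy e-probe is the reversed (not complemented) target."""
    return probe[::-1]


def sign_flip_p_value(target_counts, decoy_counts):
    """Exact one-sided paired sign-flip test on per-probe count differences.

    Enumerates all 2**n sign assignments, so it is only practical for a
    small number of probes. This test is an illustrative choice.
    """
    diffs = [t - d for t, d in zip(target_counts, decoy_counts)]
    observed = sum(diffs)
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        if sum(s * d for s, d in zip(signs, diffs)) >= observed:
            extreme += 1
    return extreme / total


def edna_call(probes, reads, alpha=0.05):
    """Compare target signal with decoy background on raw, unassembled reads."""
    target = [count_hits(p, reads) for p in probes]
    decoy = [count_hits(make_decoy(p), reads) for p in probes]
    p_value = sign_flip_p_value(target, decoy)
    return target, decoy, p_value, p_value < alpha
```

Calling `edna_call(probes, reads)` with a list of probe strings and a list of read strings returns the per-probe target counts, the per-probe decoy counts, the p-value, and a boolean call. The toy probes used to exercise this sketch can be far shorter than the 80 nucleotides reported in the abstract; real e-probes and reads would need a proper similarity search rather than substring matching.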

Type

Journal article

Publication

J. Microbiol. Methods

Bioinformatics; Next-generation sequencing; Pathogen detection