SPADA Draft Documents

genomes so that later testing for coverage can be performed with a combined database of full- 290 length and partial genomes using an algorithm such as Primer-BLAST or ThermoBLAST , but 291 such partial genomes should ideally be avoided for purposes of design. In some instances, there 292 is an abundance of sequencing for a particular gene (e.g., 16S ribosomal RNA, a particular 293 conserved virulence factor, or a toxin gene) from an organism and in such cases, it is important 294 to include in the inclusivity only sequences that contain that one gene of interest. The trend in 295 next generation sequencing is toward production of a large number of draft microbial genomes 296 with multiple contigs due to the low cost of generating short read sequences. Adding long read 297 sequences using sequencers such as PacBio to generate a single contig finished genome will only 298 add additional costs. However, there is also a definite increase in complete finished genomes 299 (Figure 6) over time. Thus, we anticipate that with time the relative contributions of complete 300 genome sequences will increase. Along with the expected exponential increase in the size of 301 databases will be a growing demand on the computational resources to handle such larger 302 databases.

303 304

16

Made with FlippingBook flipbook maker