SPADA Meeting Book

It is recommended to include only high-quality sequences in the inclusivity database, since including poorly determined sequences can effectively reduce the number of conserved signature regions in a set of target genomes. Use of partial sequences in the inclusivity database can cause assay design algorithms to ignore otherwise promising regions and introduce artificial design constraints, thereby compromising design quality by introducing bias into the signature regions (e.g., due to the number of times partial sequences are present, rather than focusing on regions that are actually most conserved). Use of poor-quality sequences that contain deletions or inserted sequences can result in assays that detect “phantom” sequences that do not exist in nature.

The ideal case occurs when the inclusivity database fully represents the diversity of extant natural (or engineered) viral pathogens with high-quality full-length genomes (e.g., Ebola, HIV, and Influenza A viruses). The availability of low-cost sequencing methods has made such high-quality genomes more common, though often such a ready-made, up-to-date collection does not exist. In that case, it is incumbent on the assay developer to gather all available sequences into a curated inclusivity database, taking the sequence quality into consideration (see above). Some viruses have highly variable genomes (e.g., human rhinoviruses (HRV types A and B), human papillomaviruses (HPV), LCMV, Lassa virus, and CCHFV). For such highly variable viruses, utilizing full-length genomes (and removing partial sequences) is of paramount importance for high-quality PCR design.

Alternatively, there are some viruses (e.g., Marburg virus subtypes Ci67, Musoke, and RAVN) where only a few examples have been fully sequenced to date. Such cases occur with newly emerging infectious diseases or diseases that have sparked little research interest.
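As an illustration only, the curation step described above (retain high-quality, full-length sequences and set aside partial or poorly determined ones) might be sketched as follows. The `Record` type, the function names, and both thresholds are hypothetical; this document does not specify numeric quality criteria. The sketch checks only length and ambiguous-base content, and it does not detect deletions or inserted sequences, which would require alignment against a reference.

```python
from dataclasses import dataclass

# Hypothetical thresholds, chosen for illustration; not taken from this document.
MIN_LENGTH_FRACTION = 0.95     # share of the reference length treated as "full-length"
MAX_AMBIGUOUS_FRACTION = 0.01  # tolerated share of characters other than A, C, G, T


@dataclass
class Record:
    """A minimal stand-in for a sequence database entry (hypothetical type)."""
    accession: str
    sequence: str


def ambiguous_fraction(sequence: str) -> float:
    """Return the share of characters that are not unambiguous A/C/G/T calls."""
    if not sequence:
        return 1.0
    upper = sequence.upper()
    unambiguous = sum(upper.count(base) for base in "ACGT")
    return 1.0 - unambiguous / len(upper)


def is_high_quality(record: Record, reference_length: int) -> bool:
    """Apply the two illustrative checks: near full length and few ambiguous bases."""
    long_enough = len(record.sequence) >= MIN_LENGTH_FRACTION * reference_length
    clean = ambiguous_fraction(record.sequence) <= MAX_AMBIGUOUS_FRACTION
    return long_enough and clean


def curate(records, reference_length):
    """Split records into (kept, set_aside) without discarding anything."""
    kept, set_aside = [], []
    for record in records:
        target = kept if is_high_quality(record, reference_length) else set_aside
        target.append(record)
    return kept, set_aside
```

Calling `curate(records, reference_length)` returns the retained genomes and the set-aside sequences as two lists. Keeping the set-aside list, rather than deleting those entries, lets the developer see how many full-length genomes remain for a given virus before deciding how to proceed.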
For these cases, utilizing only the few full-length genomes would result in “over-fitting” wherein
