AOAC SPADA VNGS - Final

This is the final version of the AOAC SPADA draft standard, with comments received and their reconciliation. All interested stakeholders may demonstrate their consensus by voting on this document.

Standard Requirements for Nucleotide Sequences used in Biothreat Agent Detection, Identification, and Quantification: Verified Next Generation Sequences (VNGS)

Intended use: This document provides requirements for biothreat agent reference nucleotide sequences that can be used for consequential (e.g., military context) diagnosis and surveillance, including reference sequences that will be used as part of method development and validation (1, 2).

1 Applicability

This document applies to all biothreat nucleotide sequences determined by next generation sequencing technology that are accessible on the semantic web and included in a genetic database (public or private) (3, 4, 5, 6).

2 Analytical Technique

These requirements are not dependent upon a particular next generation sequencing platform. Nucleotide sequence and motif identification techniques can vary depending upon the diagnostic or surveillance methods that are used (7).

3 Definitions

Alignment.—A way of arranging nucleotide sequences so that regions of similarity are shown

Alignments.—Nucleotide sequences arranged according to similarity

ASCII character set.—Character encoding standard for electronic communication

Assemblies.—A set of DNA segments or sequences that overlap in a way that provides a contiguous representation of a genomic region

AOAC Draft Standard – Version 09282022; Public Comment Revisions


Base calling.—Computational process in massively parallel sequencing for translating raw electrical signals to nucleotide sequence (8)

Base composition.—Percentage of GC base pairs in the genome

Biothreat agent (biological agent).—Any microorganism (including, but not limited to, bacteria, viruses, fungi, or protozoa), or infectious substance, or any naturally occurring, bioengineered, or synthesized component of any such microorganism or infectious substance, capable of causing: (1) Death, disease, or other biological malfunction in a human, an animal, a plant, or another living organism; (2) Deterioration of food, water, equipment, supplies, or material of any kind; or (3) Deleterious alteration of the environment; biological, virological, or toxic threat select agents (BSAT) are identified in a list provided by the US Federal Select Agent Program (https://www.selectagents.gov/sat/list.htm) (9)

Breadth of coverage.—The percentage of genome bases sequenced at a given coverage (sequencing depth) (10)

Cluster density.—Number of templates present on an NGS sequencing cell

Context.—Circumstance, purpose, and perspective under which an object is defined or used

Contig.—A contiguous stretch of DNA sequence that results from the assembly of smaller, overlapping DNA sequence reads

Coverage.—Number of times that a given base position is read in a sequencing run (11, 12)

Detection.—Recognition of the presence of the target nucleic acid

Extensible markup language (XML).—Markup language that encodes information in a way that is machine-processable as well as human-readable

FAST5 format.—Standard sequencing output for Oxford Nanopore sequencers

FASTQ format.—Text-based format for storing both a biological sequence (usually nucleotide sequence) and its corresponding quality scores. The sequence letter and the quality score are each encoded with a single ASCII character for brevity.


Forward and backward compatibility.—Design that is compatible with previous and future versions

Identification.—Establishment of the identity of a biothreat agent by NGS analysis

Insert size.—Length of the sequence between the adapters (13)

JavaScript Object Notation (JSON).—Open and text-based exchange format

Knowledge representation.—Process or result of encoding and storing knowledge in a knowledge base

Length of longest contig.—The size of the longest consensus region of DNA produced from a set of overlapping DNA segment reads

Machine readable.—Data in a form that can be automatically input to a computer

Metadata.—Data that describe other data

N50.—Weighted median statistic such that 50% of the entire NGS assembly is contained in contigs or scaffolds equal to or larger than this value.

Note: Individual sequencing reads are processed to remove linkers and barcodes and then assembled into overlapping contigs. The next step is to assemble the contigs into progressively larger scaffolds by bridging the gaps between contigs with additional sequence reads until the entire genome is assembled.

Next generation sequence (NGS).—Nucleotide sequence produced using a massively parallel sequencing methodology

NG50.—Resembles N50 except the metric relates to the genome size rather than the assembly size (14)

Nucleotide sequence.—String of DNA or RNA subunits of purine or pyrimidine nucleosides linked by a phosphodiester backbone and hydrogen bonding: an essential component of every living organism

Number of contigs.—Total number of contigs after assembly of the whole genome sequence

Number of reads.—Collected number of fragmented nucleotide sequences that were used to reconstruct the original sequence for next generation sequencing technologies (15)

OBO Foundry.—Open Biological and Biomedical Ontology


Ontology.—Logical structure of the terms used to describe a domain of knowledge, including both the definitions of the applicable terms and their relationships

OWL.—Web ontology language is a family of knowledge representation or ontology languages for authoring ontologies or knowledge bases

pod5.—High performance sequencing file format for nanopore reads

Quantification.—Determination of the amount of a biothreat agent in a sample

Raw read.—Raw output of an NGS run

Reifiable.—Capable of being made more concrete or real

Resource Description Framework (RDF).—XML syntax for describing metadata

Responsible party.—Person or persons responsible for the provision of the standard requirements

Semantic interoperability.—Ability of data shared by systems to be understood at the level of fully defined domain concepts

Sequence length distribution.—Spread of sequence fragment read sizes in an NGS run (16)

Surveillance.—Close observation through microbiological sampling and analysis

UCS character set.—Character set encoding standard for international electronic communication

Universal resource identifier (URI).—Sequence of characters, capable of uniquely identifying the thing with which it is associated, within a specified context. URI is an internet protocol standard that builds on the uniform resource indicator protocol by greatly expanding the set of permitted characters

Variant.—Single nucleotide polymorphism, insertion, or deletion occurring in one sequence but not in another

Verified.—Provision of objective evidence that a given item fulfils specified requirements

Verified next generation sequence (VNGS).—Next generation nucleotide sequence that conforms with this standard
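The N50 and NG50 definitions in this section can be illustrated with a short calculation. The following Python sketch is not part of the standard; the function names and the example contig lengths are hypothetical and chosen only to show how the two statistics differ.

```python
def n50(contig_lengths, total=None):
    """Return the length L such that contigs of length >= L together
    contain at least half of `total` bases.

    With total=None the assembly size is used (N50). Passing a known or
    estimated genome size instead gives NG50.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if total is None:
        total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        if 2 * running >= total:
            return length
    return None  # the assembly covers less than half of `total`


def ng50(contig_lengths, genome_size):
    """NG50: the same statistic, taken relative to the genome size."""
    return n50(contig_lengths, total=genome_size)


contigs = [80, 70, 50, 40, 30, 20]  # hypothetical contig lengths in bases
print(n50(contigs))        # 70 (assembly size is 290 bases)
print(ng50(contigs, 400))  # 50 (assumed genome size of 400 bases)
```

When the assembly is smaller than the genome, NG50 is never larger than N50, which is why the two values are reported separately.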


4 VNGS Requirements

(a) VNGS Identifier Scheme

(1) VNGS Universal Resource Identifier (URI).—The VNGS shall be considered fit for purpose if it possesses a specific namespace and context. The VNGS URI shall be in the form:

scheme://hierarchical namespace qualifier(s)/name

where the ASCII character set is used; where ASCII characters are not used, UCS characters shall be used.

The VNGS URI can be represented in any scheme, e.g., http or urn (www.iana.org).

All metadata connected to a VNGS URI shall be capturable and shall be kept over the whole lifetime of the data. A VNGS URI shall be persistent and remain independent of its mapping on a server, and its notation. The VNGS URI should not attempt to infer from properties of the biothreat next generation sequence data and metadata including raw data, base quality metadata, metadata, alignments, variants, features, etc.

A VNGS URI shall identify a single biothreat agent VNGS and each VNGS shall be assigned to only one URI (17).

a. Using the same VNGS URI to identify more than one biothreat agent VNGS is discouraged: existing server conventions for NGS and VNGS should be considered to avoid URI collision.

b. The responsible party shall avoid the assignment of equivalent VNGS URIs to multiple biothreat agent VNGS.

c. It shall be the responsibility of reference sequence providers to manage the assignment of VNGS URIs.

The VNGS URI shall be opaque and shall not contain: the author’s name, the status, the access, the file name extension, the software mechanism, the disk name, or the domain name.

(2) All aspects of data and metadata for the VNGS shall be version controlled.


(3) Formats shall permit machine readability and can permit human readability (18).

a. JSON, XML and RDF are permitted.

(4) VNGS data and metadata that are not open and do not protect semantic interoperability during processing or transferring shall be made machine readable subject to: security considerations; cost(s) and benefit(s); legal liabilities; intellectual property right(s); confidential business information; contract restriction(s); or other binding written agreement(s).

(5) Knowledge representations shall use VNGS web ontology, preferably in OWL (web ontology language).
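The URI rules in (a)(1) lend themselves to automated checks. The Python sketch below is one possible illustration, not a method prescribed by this standard: the `vngs` scheme, the namespace and record names, the list of file name extensions, and the `UriRegistry` class are all hypothetical. It checks the scheme://namespace/name form, rejects a name that ends in a file name extension, and keeps the one-URI-per-VNGS and one-VNGS-per-URI mapping.

```python
from urllib.parse import urlsplit

# Illustrative extensions only; a provider would maintain its own list.
FORBIDDEN_EXTENSIONS = (".fastq", ".fast5", ".pod5", ".bam", ".json", ".xml", ".rdf")


def check_vngs_uri(uri):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    parts = urlsplit(uri)
    if not parts.scheme:
        problems.append("missing scheme")
    # Collect the namespace qualifier(s) followed by the name.
    segments = [s for s in parts.path.split("/") if s]
    if parts.netloc:
        segments.insert(0, parts.netloc)
    if len(segments) < 2:
        problems.append("expected namespace qualifier(s) and a name")
    if segments and segments[-1].lower().endswith(FORBIDDEN_EXTENSIONS):
        problems.append("name ends in a file name extension")
    return problems


class UriRegistry:
    """Toy in-memory registry: one URI per VNGS and one VNGS per URI."""

    def __init__(self):
        self._by_uri = {}
        self._by_record = {}

    def assign(self, uri, record_id):
        if self._by_uri.get(uri, record_id) != record_id:
            raise ValueError("URI already identifies another VNGS")
        if self._by_record.get(record_id, uri) != uri:
            raise ValueError("VNGS already has a different URI")
        self._by_uri[uri] = record_id
        self._by_record[record_id] = uri


print(check_vngs_uri("vngs://example-namespace/agents/8f3a2c"))
print(check_vngs_uri("vngs://example-namespace/agents/sample.fastq"))
```

The remaining opacity rules (author name, status, access, software mechanism, disk name, domain name) cannot be detected by syntax alone and would need provider-specific review.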

(b) VNGS Technical and Organizational Requirements

(1) The reference sequence provider is responsible for establishment, maintenance, and potential changes in the ownership of the VNGS URI format including the format description, version, structure, and data representation. VNGS format description and contact information shall be documented.

(2) The sequence provider is responsible for the delegation of user requests.

(3) AOAC takes responsibility for VNGS Standard Requirement updates and error corrections in the specification.

(c) Documentation

The sequence provider will provide a stable and identifiable source for the VNGS format where data on the provenance, maintenance, format structure, data items, data formatting and features of the format are maintained and updated. Data types and metadata shall be documented. The exact format version will be documented according to a change control schedule.

(d) Compatibility, extensibility, and compression


Forward and backward compatibility shall be ensured. Absence of critical information data that is backward compatible shall be noted. Critical information and metadata shall not be omitted for forward compatibility. The addition of new data items for future updates shall be enabled. Encoding and decoding algorithms shall be referenced, e.g., ISO 23092 (19, 20, 21, 22, 23, 24, 25).

(e) Data types

Numerical values shall be denoted as measured, inferred, or assumed data in SI units. Measured data shall include information on the method of obtaining data (if applicable) and measurement precision, uncertainty, and accuracy.

Nucleic acid sequence data shall be encoded according to the recommendations of the International Union of Pure and Applied Chemistry (IUPAC) and the International Union of Biochemistry and Molecular Biology (IUBMB) in the “Biochemical Nomenclature and Related Documents” (known as the White Book), released by the IUPAC-IUBMB Joint Commission on Biochemical Nomenclature and Nomenclature Commission of IUBMB.
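As an illustration of the encoding requirement in (e), the following Python sketch flags characters that are not single-letter nucleotide codes. It assumes the commonly used IUPAC code set (A, C, G, T, U and the ambiguity codes R, Y, S, W, K, M, B, D, H, V, N); whether gap characters or lower-case letters are acceptable is left to the provider, and the function name is hypothetical.

```python
# Single-letter nucleotide codes, including ambiguity codes.
IUPAC_NUCLEOTIDES = frozenset("ACGTURYSWKMBDHVN")


def invalid_positions(sequence):
    """Return (index, character) pairs that are not IUPAC nucleotide codes.

    Lower-case letters are accepted here by upper-casing before the check;
    this is an assumption, not a requirement of the standard.
    """
    return [(i, ch) for i, ch in enumerate(sequence)
            if ch.upper() not in IUPAC_NUCLEOTIDES]


print(invalid_positions("ACGTNRY"))  # []
print(invalid_positions("ACXG-"))    # [(2, 'X'), (4, '-')]
```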

(f) Format validation

The sequence provider should enable checking of the VNGS format with regard to strength, weakness, applicability, and limitations of the data format, and provide a means of enabling validation of a data file against the format specification.

(g) Data versioning and provenance

Data versioning and provenance shall be documented. A complete chain of provenance traceable from isolated biological material shall be available for VNGS. If an entry is deleted, its identifier should remain valid. Every revision of an entry shall remain separately identifiable and include the provenance that explains the history in human or machine-readable format.

(h) Data structure

Data may be expressed as a table or reside in a database. Data should be reifiable to RDF triplets.
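One way to picture the requirement in (h) is to turn each cell of a table row into a subject, predicate, object statement. The Python sketch below uses plain tuples and writes them as simple N-Triples-style lines. The `https://example.org/vngs/` namespace, the column names, and both function names are hypothetical, and all values are written as plain string literals for simplicity.

```python
BASE = "https://example.org/vngs/"  # hypothetical namespace


def row_to_triples(subject_uri, row, predicate_base=BASE + "term/"):
    """Turn one table row (column -> value) into (subject, predicate, object) triples.

    Empty cells (None) produce no triple.
    """
    return [(subject_uri, predicate_base + column, value)
            for column, value in row.items()
            if value is not None]


def to_ntriples(triples):
    """Serialize triples as one statement per line, objects as string literals."""
    lines = []
    for s, p, o in triples:
        literal = (str(o).replace("\\", "\\\\")
                         .replace('"', '\\"')
                         .replace("\n", "\\n"))
        lines.append(f'<{s}> <{p}> "{literal}" .')
    return "\n".join(lines)


row = {"organism_name": "Example organism", "strain_name": "X1", "host": None}
print(to_ntriples(row_to_triples(BASE + "record/8f3a2c", row)))
```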

(i) Ontology Requirements

The sequence provider shall use and maintain VNGS ontology consistent with their genomic sequencing user community. There should be a defined methodology in the referenced ontology community for the maintenance of the VNGS ontology including adding, removing, and deprecating terms, e.g., OBO Foundry. The ontology syntax shall follow OWL or OBO Foundry. This SMPR or suitable publication will serve as the license for the VNGS ontology.

(j) Minimum annotation information

The minimum annotation information required is sample name (sample ID), raw reads, assemblies and alignments, organism name, strain name, identification method, sample type, host, isolation provider name, isolation acquisition identity, taxonomic identification, contact name, and clinical or environmental sample.
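A provider could check the minimum annotation information in (j) mechanically. The Python sketch below assumes the metadata is held as a JSON object; the key names are hypothetical renderings of the items listed in (j), since this standard does not fix field names.

```python
import json

# Hypothetical key names, one per item listed in (j).
REQUIRED_FIELDS = (
    "sample_name", "raw_reads", "assemblies_and_alignments", "organism_name",
    "strain_name", "identification_method", "sample_type", "host",
    "isolation_provider_name", "isolation_acquisition_identity",
    "taxonomic_identification", "contact_name", "clinical_or_environmental",
)


def missing_annotations(record_json):
    """Return the required fields that are absent or empty in a JSON record."""
    record = json.loads(record_json)
    return [field for field in REQUIRED_FIELDS
            if record.get(field) in (None, "")]


record = {field: "placeholder" for field in REQUIRED_FIELDS}
del record["host"]
print(missing_annotations(json.dumps(record)))  # ['host']
```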

(k) Language

The language used shall be English.

(l) Domain


The domain for VNGS shall be defined and include the term Biothreat Agent Next Generation Sequences.

(m) Stable URIs and versioning

Stable URIs for the terms, concepts, and versioning of VNGS shall be maintained by the sequence provider.

(n) Raw Sequence Data

All raw sequence data shall be available with each VNGS. The possible sequence formats are FASTQ (26, 27, 28), FAST5 (29), and pod5 (30). In the case of FAST5, these files may be converted to FASTQ if a human reader is required.
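Because FASTQ is the text-based option in (n), a short reader helps show what a record contains. The Python sketch below assumes the common four-line-per-record layout with unwrapped sequence and quality lines; the function name is hypothetical, and production work would normally rely on an established parser.

```python
def read_fastq(lines):
    """Yield (identifier, sequence, quality_string) from an iterable of lines.

    Assumes four lines per record: '@' header, sequence, '+' separator, quality.
    """
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # skip blank lines between records
        seq = next(it, "").rstrip("\n")
        plus = next(it, "").rstrip("\n")
        qual = next(it, "").rstrip("\n")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError("malformed FASTQ record")
        if len(seq) != len(qual):
            raise ValueError("sequence and quality lengths differ")
        yield header[1:], seq, qual


example = ["@read1\n", "ACGT\n", "+\n", "IIII\n"]
print(list(read_fastq(example)))  # [('read1', 'ACGT', 'IIII')]
```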

(o) Aligned Sequence Data

Aligned sequences shall be included as BAM (Binary Alignment/MAP) formatted files (31, 32).

(p) Annotation Formats

Annotation formats shall include Browser Extensible Data (BED) Format (33), Wiggle Track Format (WIG) (34), General Feature Format (GFF3) (35), Variant Call format (VCF) (36), Gene Transfer Format (GTF) (37), Genome Variation Format (GVF) (38) and/or Synthetic Biology Open Language (SBOL) (39).

(q) Sequence Instrument Quality Metrics

(1) Base quality score.—Statistical algorithms used for base calling shall be known, verified and converted to a Q score (26, 27). Average base quality score Q>20. Single base quality score for the targeted region Q>30.


(2) Artefacts.—No artefacts found in final sequence (40, 41, 42).

(3) Sequencing platform specific error profiles.—All platform associated errors should be resolved (43, 44).

(4) Variation in quality scores across the sequence read.—Sequence reads shall have an overall resolved Q score >20 (45).

(5) Biases in sequence data driven by base composition.—GC-rich sequence bias shall be anticipated and resolved, based on species specificity or nucleic acid repair (46, 47).

(6) Departure from suboptimal library fragment sizes.—If possible, average library fragment size shall be provided as metadata (48).

(7) Contamination from known and unknown species other than the sequencing target.—Contaminating species sequences shall be removed (49).

(8) Insert size.—Insert size and type of library preparation should be provided.

(9) Number of reads.—The minimum read depth is 20X.

(10) Base calling.—Base calling protocol should be provided.

(11) Sequence length distribution.—Library quality should be recorded.

(12) Length of longest contig.—Length of longest contig can be provided.

(13) N50.—N50 should be given in annotations (50).

(14) NG50.—NG50 should be given in annotations (50).

(15) Number of contigs.—Number of contigs should be given in annotations.

(16) Base composition.—The proportions of the four bases (adenine, cytosine, guanine, and thymine or uracil) present in DNA or RNA expressed as the percentage (mol %) of G plus C should be given in annotations.

(17) Coverage.—At the run level, minimum 20X. At the sample level, it depends on the application.


(18) Breadth of coverage.—Coverage for 95% of the genome should be at the minimum level or higher depending on the expected application: 100% of target sequences should be at the minimum coverage or higher.

(19) Cluster density.—The total length of all contigs or scaffolds should approximate the known genome size of the target organism (51).
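Several of the metrics in (q) reduce to simple arithmetic. The Python sketch below shows the base quality score, breadth of coverage, and base composition calculations under stated assumptions: quality characters use the common ASCII offset of 33 (other encodings exist), the mean is a plain arithmetic mean of Q values, per-position depths are supplied as a list, and every function name is hypothetical. The thresholds are the ones given in (q)(1) and (q)(17) to (18).

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string to integer Q scores (assumed offset 33)."""
    return [ord(ch) - offset for ch in quality_string]


def error_probability(q):
    """Base-call error probability implied by Q, from Q = -10 * log10(P)."""
    return 10 ** (-q / 10)


def mean_quality(quality_string, offset=33):
    """Arithmetic mean of the Q scores in one read."""
    scores = phred_scores(quality_string, offset)
    return sum(scores) / len(scores)


def breadth_of_coverage(depths, min_depth=20):
    """Fraction of positions sequenced at min_depth or deeper."""
    return sum(d >= min_depth for d in depths) / len(depths)


def gc_percent(sequence):
    """Percentage of G plus C in a sequence."""
    seq = sequence.upper()
    return 100 * (seq.count("G") + seq.count("C")) / len(seq)


def meets_minimums(quality_string, depths):
    """Mean Q above 20 and at least 95% of positions at 20X or deeper."""
    return mean_quality(quality_string) > 20 and breadth_of_coverage(depths) >= 0.95


print(phred_scores("I5"))                    # [40, 20]
print(breadth_of_coverage([25, 20, 19, 0]))  # 0.5
print(gc_percent("GGCCAATT"))                # 50.0
```

A Q score of 20 corresponds to an error probability of 1 in 100 and Q of 30 to 1 in 1000, which gives a sense of the gap between the average and targeted-region thresholds in (q)(1).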

5 References

1. Beck, L., Coates, S.G., Gee, J., Hadfield, T., Jackson, P., Keim, P., Lindler, L., Ostlund, V.E., Roberto, F., Samuel, J., Sharma, S., Tallent, S., & Wagner, D.M. (2018) J. of AOAC Int. 101, 1667-1707, https://doi.org/10.1093/jaoac/101.6.1665

2. Valdivia-Granda, W.A. (2013) Virulence 4, 745–751, doi: 10.4161/viru.26893

3. Sichtig, H., Minogue, T., Yan, Y., Stefan, C., Hall, A., Tallon, L., Sadzewicz, L., Nadendla, S., Klimke, W., Hatcher, E., Shumway, M., Aldea, D.L., Allen, J., Koehler, J., Slezak, T., Lovell, S., Schoepp, R., & Scherf, U. (2019) Nature Comm. 10, 3313, https://doi.org/10.1038/s41467-019-11306-6

4. Sakai, K., Takeda, M., Shimizu, S., Takahama, T., Yoshida, T., Watanabe, S., Iwasa, T., Yonesaka, K., Suzuki, S., Hayashi, H., Kawakami, H., Nonagase, Y., Tanaka, K., Tsurutani, J., Saigoh, K., Ito, A., Mitsudomi, T., Nakagawa, K., & Nishio, K. (2019) Sci. Rep. 9, 11340, https://doi.org/10.1038/s41598-019-47673-9

5. Pandey, K.R., Maden, N., Poudel, B., Pradhananga, S., & Sharma, A.K. (2012) Genomics, Proteomics & Bioinformatics 10, 317–325, http://dx.doi.org/10.1016/j.gpb.2012.06.006

6. Katsoulakis, E., Duffy, J.E., Hintze, B., Spector, N.L., & Kelley, M.J. (2020) JCO Precis. Oncol. 4, 212-221, doi: 10.1200/PO.19.00118

7. Minogue, T.D., Koehler, J.W., Stefan, C.P., & Conrad, T.A. (2019) Clin. Chem. 65, 383–392, https://doi.org/10.1373/clinchem.2016.266536


8. ISO (2017) Health informatics — Data elements and their metadata for describing structured clinical genomic sequence information in electronic health records (ISO/TS 20428:2017)

9. US Code of Federal Regulations, Title 42, Chapter I, Subchapter F, Part 73, Select agents and toxins (eCFR :: 42 CFR Part 73 -- Select Agents and Toxins)

10. Sims, D., Sudbery, I., Ilott, N.E., Heger, A., & Ponting, C.P. (2014) Nature Reviews Genetics 15, 121-132, https://doi.org/10.1038/nrg3642

11. Bogaerts, B., Delcourt, T., Soetaert, K., Boarbi, S., Ceyssens, P-J., Winand, R., Van Braekel, J., De Keersmaecker, S.C.J., Roosens, N.H.C., Marchal, K., Mathys, V., & Vanneste, K. (2021) J. Clin. Microbiol. 59, e00202-21, doi: 10.1128/JCM.00202-21

12. Portmann, A.-C., Fournier, C., Gimonet, J., Ngom-Bru, C., Barretto, C., & Baert, L. (2018) Front. Microbiol. 9, 446, https://doi.org/10.3389/fmicb.2018.00446

13. Roy, S., Coldren, C., Karunamurthy, A., Kip, N.S., Klee, E.W., Lincoln, S.E., Leon, A., Pullambhatla, M., Temple-Smolkin, R.L., Voelkerding, K.V., Wang, C., & Carter, A.B. (2018) J. Mol. Diag. 20, 4-27, doi: 10.1016/j.jmoldx.2017.11.003

14. Laver, T., Harrison, J., O’Neill, P.A., Moore, K., Farbos, A., Paszkiewicz, K., & Studholme, D.J. (2015) Biomol. Detect. and Quant. 3, 1–8, doi: 10.1016/j.bdq.2015.02.001

15. ISO (2021) Genomics informatics — Reliability assessment criteria for high-throughput gene-expression data (ISO/TS 22690:2021)

16. ISO (2022) Microbiology of the food chain — Whole genome sequencing for typing and genomic characterization of foodborne bacteria — General requirements and guidance (ISO 23418:2022)

17. ISO (2022) Biotechnology — Requirements for data formatting and description in the life sciences (ISO/FDIS 20691), https://fairsharing.org/search/?q=20691, accessed September 28, 2022

18. Jacobs, I., & Walsh, N. (2004) Architecture of the World Wide Web, Volume One: W3C Recommendation, http://www.w3.org/TR/webarch/, accessed September 27, 2022


19. Voges, J., Hernaez, M., Mattavelli, M., & Ostermann, J. (2021) Proc. IEEE 109, 1607-1622, doi: 10.1109/JPROC.2021.3082027

20. ISO (2020) Information technology — Genomic information representation — Part 1: Transport and storage of genomic information (ISO/IEC 23092-1:2020)

21. ISO (2020) Information technology — Genomic information representation — Part 2: Coding of genomic information (ISO/IEC 23092-2:2020)

22. ISO (2020) Information technology — Genomic information representation — Part 3: Metadata and application programming interfaces (APIs) (ISO/IEC 23092-3:2020)

23. ISO (2020) Information technology — Genomic information representation — Part 4: Reference software (ISO/IEC 23092-4:2020)

24. ISO (2020) Information technology — Genomic information representation — Part 5: Conformance (ISO/IEC 23092-5:2020)

25. ISO (2022) Information technology — Genomic information representation — Part 6: Coding of genomic annotations (ISO/IEC DIS 23092-6)

26. Ewing, B., Hillier, L., Wendl, M.C., & Green, P. (1998) Genome Res. 8, 175-185, doi: 10.1101/gr.8.3.175

27. Ewing, B. & Green, P. (1998) Genome Res. 8, 186–194, doi: 10.1101/gr.8.3.186

28. Cock, P.J.A., Fields, C.J., Goto, N., Heuer, M.L., & Rice, P.M. (2010) Nucl. Ac. Res. 38, 1767–1771, https://doi.org/10.1093/nar/gkp1137

29. Gamaarachchi, H., Samarakoon, H., Jenner, S.P., Ferguson, J.M., Amos, T.G., Hammond, J.M., Saadat, H., Smith, M.A., Parameswaran, S., & Deveson, I.W. (2022) Nature Biotechnol. 40, 1026-1029, https://doi.org/10.1038/s41587-021-01147-4

30. Nanoporetech, pod5 file format, https://github.com/nanoporetech/pod5-file-format, accessed September 28, 2022


31. Global Alliance for Genomics and Health SAM/BAM Format Specification Working Group (2022) Sequence Alignment/Map Format Specification, https://samtools.github.io/hts-specs/SAMv1.pdf, accessed September 28, 2022

32. Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Marth, G., Abecasis, G., Durbin, R., & 1000 Genome Project Data Processing Subgroup (2009) Bioinformatics 25(16), 2078–2079, https://doi.org/10.1093/bioinformatics/btp352

33. Pringle, T.H., Zahler, A.M., & Haussler, D. (2002) Genome Res. 12, 996–1006, doi: 10.1101/gr.229102

34. Kent, W.J., Zweig, A.S., Barber, G., Hinrichs, A.S., & Karolchik, D. (2010) Bioinformatics 26, 2204–2207, https://doi.org/10.1093/bioinformatics/btq351

35. General Feature Format 3, http://gmod.org/wiki/GFF3, accessed September 28, 2022

36. Global Alliance for Genomics and Health (2022) The Variant Call Format (VCF) Version 4.3 Specification, https://samtools.github.io/hts-specs/VCFv4.3.pdf, accessed September 28, 2022

37. GTF2.2: A Gene Annotation Format, http://mblab.wustl.edu/GTF22.html, accessed September 28, 2022

38. Reese, M.G., Moore, B., Batchelor, C., Salas, F., Cunningham, F., Marth, G.T., Stein, L., Flicek, P., Yandell, M., & Eilbeck, K. (2010) Genome Biol. 11, R88, https://doi.org/10.1186/gb-2010-11-8-r88

39. McLaughlin, J.A., Beal, J., Mısırlı, G., Grünberg, R., Bartley, B.A., Scott-Brown, J., Vaidyanathan, P., Fontanarrosa, P., Oberortner, E., Wipat, A., Gorochowski, T.E., & Myers, C.J. (2020) Front. Bioeng. Biotechnol. 8, 1009, doi: 10.3389/fbioe.2020.01009

40. Bogaerts, B., Nouws, S., Verhaegen, B., Denayer, S., Van Braekel, J., Winand, R., Fu, Q., Crombé, F., Piérard, D., Marchal, K., Roosens, N.H.C., De Keersmaecker, S.C.J., & Vanneste, K. (2021) Microb. Genom. 7, 000531, doi: 10.1099/mgen.0.000531


41. Sahlin, K. & Medvedev, P. (2021) Nature Comm. 12, 2, https://doi.org/10.1038/s41467-020-20340-8

42. Roberts, H.E., Lopopolo, M., Pagnamenta, A.T., Sharma, E., Parkes, D., Lonie, L., Freeman, C., Knight, S.J.L., Lunter, G., Dreau, H., Lockstone, H., Taylor, J.C., Schuh, A., Bowden, R., & Buck, D. (2021) Sci. Rep. 11, 6408, https://doi.org/10.1038/s41598-021-85354-8

43. Stoler, N. & Nekrutenko, A. (2021) NAR Genomics and Bioinformatics 3, lqab019, doi: 10.1093/nargab/lqab019

44. Wang, Y., Zhao, Y., Bollas, A., Wang, Y., & Au, K.F. (2021) Nature Biotechnol. 39, 1348–1365, https://doi.org/10.1038/s41587-021-01108-x

45. Ward, C.M., To, T.-H., & Pederson, S.M. (2020) Bioinformatics 36, 2587–2588, doi: 10.1093/bioinformatics/btz937

46. Ross, M.G., Russ, C., Costello, M., Hollinger, A., Lennon, N.J., Hegarty, R., Nusbaum, C., & Jaffe, D.B. (2013) Genome Biol. 14, R51, https://doi.org/10.1186/gb-2013-14-5-r51

47. Romiguier, J. & Roux, C. (2017) Front. Genet. 8, 16, https://doi.org/10.3389/fgene.2017.00016

48. ISO (2021) Biotechnology — Massively parallel sequencing — Part 2: Quality evaluation of sequencing data (ISO 20397-2:2021)

49. Vasiljevic, N., Lim, M., Humble, E., Seah, A., Kratzer, A., Morf, N.V., Prost, S., & Ogden, R. (2021) Forensic Sci. Int.: Genetics 53, 102493, https://doi.org/10.1016/j.fsigen.2021.102493

50. Alhakami, H., Mirebrahim, H., & Lonardi, S. (2017) Genome Biol. 18, 93, https://doi.org/10.1186/s13059-017-1213-3

51. Du, H., Hao, Y., & Wang, Z. (2022) Connection Science 34, 857-873, https://doi.org/10.1080/09540091.2021.2012422


Comments Received and Reconciliation

Type of comment: Technical
Line #: 41
Comment: Per London Calling 2022 update from Nanopore Technologies, they will be getting rid of the FAST5 format and moving to the "pod5" format if the move has not been made already. Maybe relevant to include separately or along with FAST5 definition.
SPADA WG Response: accepted

Type of comment: Editorial
Line #: 176-178
Comment: This section is not a complete sentence. Suggest: "At a minimum, annotation information must include sample name...clinical or environmental sample."
SPADA WG Response: accepted

Type of comment: Editorial
Line #: 181
Comment: Remove "human". Not sure why human needs to be specified. The requirement of English should be specific enough.
SPADA WG Response: accepted

Type of comment: Technical
Line #: 192-193
Comment: Could add the new "pod5" Nanopore Technologies format here as well.
SPADA WG Response: accepted

Type of comment: Technical
Line #: 108
Comment: Assigning the responsibility to individual parties to avoid assignment of equivalent VNGS URIs to multiple biothreat agent VNGS is good; however, responsible parties must have the ability to verify assigned URIs do not conflict. For instance, are canonical URLs used?
SPADA WG Response: not accepted, canonical URLs do not have the same URIs.

Type of comment: General
Line #: 88-89
Comment: This may have been in the document and I over looked it but what standards cover Verified Next Generation Sequence (VNGS). Or does that line mean the written standards of the current document.
SPADA WG Response: We mean the current standard.

Type of comment: General
Line #: 47, 71, 95, 142, 166, 214, 328, 352, 376
Comment: I wasn't able to read the comments due to the overlay
SPADA WG Response: Our apologies.

Type of comment: Technical
Line #: 27
Comment: Biothreat agent - Add a DOD context to the definition separate from the public health concerns.
SPADA WG Response: accepted with modification. The literary path used in OMA Appendix O was applied. DOD does not directly specify a list of BSAT.

Type of comment: Technical
Line #: 37
Comment: Coverage - Add definitions for depth and breadth of coverage
SPADA WG Response: accepted

Type of comment: Technical
Line #: 231
Comment: Insert breadth of coverage with definition between (17) and (18). Make (17) depth of coverage.
SPADA WG Response: accepted
