SPADA Draft Documents

match primers, short amplicons, no-template near neighbors, and single-plex PCR, many real-world applications do not fall into this category. Thus, a “ground truth” dataset is needed to help determine model accuracy. The dataset could be used to objectively evaluate algorithms from different research groups. The experimental, “ground truth” PCR dataset would need to capture many details including: (a) the target genome, (b) the presence of contaminating organisms (determined through NGS sequencing), (c) the enzyme and buffer compositions, (d) the primer and probe concentrations, (e) the composition of the amplicon products (by NGS sequencing to reveal the concentrations of the desired amplicon and off-target amplicons, primer dimers, etc.), and (f) the composition of the PCR reaction at each cycle of PCR (e.g., real time monitoring of the fluorescence, along with quantification of primer concentrations and enzyme activity). For this training dataset, both the PCR inputs and outputs would be publicly revealed to enable the user community to improve and validate their in silico methods.

PCR Datasets in Support of Competitions to Spur the Community Forward

Similar to the Critical Assessment of Protein Structure Prediction (CASP), there is a need for an open competition to assess the performance of different computational approaches for in silico PCR, using experimental data, which could provide a quantitative ranking of models by accuracy and spur the development of improved in silico models. For the training set, both the PCR inputs and outputs (described above in Assessing Model Accuracy) would be publicly revealed. For the validation sets, the PCR inputs would be revealed publicly, but the outputs would be held secret for future evaluation (by independent referees) of different contestant methods (i.e. PCR predictions from different research groups). The final goal is to quantitatively evaluate the performance of different scoring methods (i.e. what did different models get right
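The training/validation arrangement described in this section can be illustrated with a minimal sketch. It assumes a hypothetical record layout and a simple mean-absolute-error score on product fractions; the names `PcrRecord`, `public_view`, `mean_abs_error`, and `rank_models`, and the choice of metric, are illustrative assumptions and are not specified by this document.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class PcrRecord:
    """One experimental PCR run; field names are hypothetical."""
    record_id: str
    inputs: Dict[str, str]  # e.g. target genome, primers, buffer (items a-d)
    # Measured fraction per product class (item e); None means withheld.
    outputs: Optional[Dict[str, float]] = None


def public_view(record: PcrRecord, is_training: bool) -> PcrRecord:
    """Training records keep their outputs; validation records hide them."""
    if is_training:
        return record
    return PcrRecord(record.record_id, record.inputs, None)


def mean_abs_error(predicted: Dict[str, float], measured: Dict[str, float]) -> float:
    """Average absolute difference over the measured product classes."""
    total = sum(abs(predicted.get(k, 0.0) - v) for k, v in measured.items())
    return total / len(measured)


def rank_models(
    submissions: Dict[str, Dict[str, Dict[str, float]]],
    held_out: List[PcrRecord],
) -> List[Tuple[str, float]]:
    """Return (model name, mean error) pairs, lowest error first."""
    scores = []
    for model, predictions in submissions.items():
        errors = [
            mean_abs_error(predictions.get(r.record_id, {}), r.outputs)
            for r in held_out
        ]
        scores.append((model, sum(errors) / len(errors)))
    return sorted(scores, key=lambda pair: pair[1])


# Made-up numbers, for illustration only.
truth = [PcrRecord("run1", {"target": "example"}, {"desired": 0.9, "off_target": 0.1})]
submissions = {
    "model_a": {"run1": {"desired": 0.8, "off_target": 0.2}},
    "model_b": {"run1": {"desired": 0.5, "off_target": 0.5}},
}
ranking = rank_models(submissions, truth)
```

In this sketch the referees hold the full records, contestants see only `public_view(record, is_training=False)` for validation runs, and a missing prediction is scored as zero for every product class. A real competition would likely need a richer metric (for example, per-cycle fluorescence curves), which this example does not attempt.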

