A paper from Georgia Institute of Technology researchers will be the first ever to be recognized with a new designation that lets other scientists know that its results have been successfully duplicated.
The new designation scheme is a response to recent studies by the Association for Computing Machinery (ACM) showing that an “uncomfortably large number of research results” fail to meet a basic premise of the scientific method: experimental research results must be reproducible to be validated.
To meet this fundamental challenge, ACM, its Special Interest Group on High Performance Computing (SIGHPC), and the International Conference for High Performance Computing, Networking, Storage, and Analysis (more commonly known as Supercomputing, or SC) are rolling out a new initiative that aims to promote and strengthen the integrity of computing’s research community.
At this week’s SC16 in Salt Lake City, ACM and SC organizers will begin awarding badges to papers judged at the annual event. These badges represent one or more of the ‘Three Rs’ crucial to scientific research: repeatability, replicability and reproducibility.
The inaugural badge for Results Replicated will be awarded to a paper written by a team from Georgia Tech during the SC16 awards ceremony set for Nov. 17.
Badge of honor
“This is a great honor for our Georgia Tech team,” said School of Computational Science and Engineering Professor and co-Executive Director of Georgia Tech’s Institute for Data Engineering and Science (IDEaS) Srinivas Aluru. “Research integrity is of critical importance and I applaud SIGHPC and SC for undertaking this initiative. To be recognized with the inaugural award speaks volumes about the ethics and quality of work undertaken by Georgia Tech Ph.D. students.”
Written by Ph.D. students Patrick Flick, Chirag Jain and Tony Pan, together with Aluru, as part of research conducted for an NSF BIG DATA project, the paper earning the first-ever Three Rs badge is titled “A Parallel Connectivity Algorithm for De Bruijn Graphs in Metagenomic Applications.” Jain led the effort to convert the team’s research software into a reproducible benchmark, which was selected for this initiative at SC16. The badge will be visible on the paper in the ACM Digital Library.
Going forward, the badges will be awarded when a paper’s artifacts (software, scripts, collected data, etc., either created by the authors to be used as part of the study or generated by the experiment itself) are determined to be:
- Repeatable by the same team with the same experimental setup
- Replicable by a different team with the same experimental setup
- Reproducible by a different team with a different experimental setup
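For readers who think in code, the three definitions above can be sketched as a small lookup on two yes/no questions: who ran the experiment, and on what setup. This is only an illustration of the wording in the list; the function name `classify_badge` and its parameters are hypothetical and are not part of any ACM or SC tooling.

```python
from typing import Optional


def classify_badge(same_team: bool, same_setup: bool) -> Optional[str]:
    """Map the two criteria from the list above to one of the Three Rs.

    Hypothetical helper for illustration only.
    """
    if same_team and same_setup:
        return "Repeatable"
    if not same_team and same_setup:
        return "Replicable"
    if not same_team and not same_setup:
        return "Reproducible"
    # Same team with a different setup is not covered by the three
    # definitions given in the article, so no label is returned.
    return None


if __name__ == "__main__":
    # A different team re-running the experiment on the same setup.
    print(classify_badge(same_team=False, same_setup=True))
```

Under this reading, a badge for replicated results corresponds to a different team obtaining the results with the same experimental setup.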
The Georgia Tech paper was presented and judged at last year’s SC event in Austin, Texas. Following a call to all authors of SC papers from the past three years, the GT Computing team agreed to transform a portion of their experiment into an application challenge for the SC16 Student Cluster Competition (SCC).
The SCC pits undergraduate and high school student teams against each other in a 48-hour competition to build a small HPC cluster. Once the cluster is complete, the students must use it to execute a real-world workload based on data sets and other artifacts from the GT Computing paper, as well as others judged at SC.
Along with the badge denoting replicated results for their paper, the Georgia Tech authors will receive a certificate of appreciation from SIGHPC during the SC16 awards ceremony.