TL;DR#
Many algorithms in distribution property testing lack replicability: their outputs can vary from run to run even when the samples are drawn from the same underlying distribution, threatening the reliability of scientific studies. Existing uniformity testers, while sample-efficient, are not replicable, so two analysts running the same test on fresh samples can reach conflicting conclusions depending on which samples happen to be drawn. If these algorithms are used in real-world applications, this unreliability undermines public trust in scientific findings.
This paper tackles the problem by developing a new uniformity tester whose output is replicable with high probability. The researchers achieve this by building their test around a total variation distance statistic, which is less sensitive to outliers than the statistics used by previous methods. The tester's sample complexity is nearly optimal for the problem. The authors also prove a lower bound on sample complexity, showing that their algorithm is near-optimal within a broad class of symmetric testing algorithms.
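To make the mechanism concrete, here is a minimal, hypothetical sketch, not the paper's actual algorithm: it computes the total variation distance between the empirical distribution and uniform, then compares it against a threshold drawn uniformly at random from an assumed gap interval. The function name, the placeholder thresholds `tau_lo` and `tau_hi`, and the constants in the usage example are all illustrative assumptions; the paper's analysis would supply the real concentration gap.

```python
import numpy as np

def replicable_uniformity_tester(samples, n, tau_lo, tau_hi, rng):
    """Hedged sketch of a replicable uniformity tester.

    samples : iid draws from an unknown distribution over {0, ..., n-1}
    tau_lo, tau_hi : assumed thresholds such that the TV statistic
        concentrates below tau_lo when the distribution is uniform and
        above tau_hi when it is far from uniform (placeholders; the
        paper's analysis would supply the actual gap)
    rng : internal randomness; sharing its seed across runs is what
        makes the final accept/reject decision replicable
    """
    m = len(samples)
    counts = np.bincount(samples, minlength=n)

    # Test statistic: total variation distance between the empirical
    # distribution and the uniform distribution over n elements.
    tv_stat = 0.5 * np.abs(counts / m - 1.0 / n).sum()

    # Randomly shifted threshold, drawn once from the shared internal
    # randomness. Two runs on fresh samples can only disagree if their
    # statistics land on opposite sides of this single random cut.
    threshold = rng.uniform(tau_lo, tau_hi)

    return "uniform" if tv_stat <= threshold else "far from uniform"

# Two runs on fresh samples but with the same internal seed: the shared
# random threshold makes matching outputs highly likely.
rng_a = np.random.default_rng(seed=7)
rng_b = np.random.default_rng(seed=7)
sampler = np.random.default_rng()
print(replicable_uniformity_tester(sampler.integers(0, 100, 5000), 100, 0.4, 0.6, rng_a))
print(replicable_uniformity_tester(sampler.integers(0, 100, 5000), 100, 0.4, 0.6, rng_b))
```

The random-threshold rounding above is a standard device in the replicability literature; the paper's contribution, as summarized here, lies in the choice of the total variation statistic and the analysis showing it concentrates well enough to achieve near-optimal sample complexity.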
Key Takeaways#
Why does it matter?#
This paper is important for researchers working on algorithmic replicability and distribution testing. It offers novel theoretical bounds and a replicable uniformity tester, addressing the non-replicable behavior of existing algorithms. The work's focus on real-world applicability speaks directly to the trustworthiness of scientific studies that rely on such tests. It also opens new avenues for research on replicable algorithms, particularly for identity and closeness testing, which face similar challenges.