ML Evaluation Standards
The aim of the workshop is to discuss and propose standards for evaluating ML research, in order to better identify promising new directions and to accelerate real progress in the field. Doing so requires understanding which practices add to or detract from the generalizability and reliability of reported results, and which incentives lead researchers to follow best practices. We may draw inspiration from adjacent scientific fields, from statistics, or from the history of science. Acknowledging that there is no consensus on best practices for ML, the workshop will focus on panel discussions and a few invited talks representing a variety of perspectives. The call for papers welcomes opinion papers as well as more technical papers on the evaluation of ML methods. We plan to summarize the findings and topics that emerge during the workshop in a short report.
Call for papers
We invite two types of papers: opinion papers (up to 4 pages) stating positions on topics such as those listed below, and methodology papers (up to 8 pages, excluding references) about evaluation in ML. Topics may include:
- Establishing benchmarking standards for ML research
- Reliable tools/protocols for benchmarking and evaluation
- Understanding and defining reproducibility for machine learning
- Meta-analyses thoroughly evaluating existing claims across papers
- Incentives for doing better evaluation and reporting results
Submission Site: https://cmt3.research.microsoft.com/SMILES2022
Speakers
Thomas Wolf (Hugging Face)
Frank Schneider (University of Tübingen)
Rotem Dror (University of Pennsylvania)
James Evans (University of Chicago)
Melanie Mitchell (Santa Fe Institute)
Katherine Heller (Google Brain)
Corinna Cortes (Google Research NYC)
Panels
Reproducibility and Rigor in ML
Rotem Dror (University of Pennsylvania)
Sara Hooker (Google Brain)
Koustuv Sinha (Mila, McGill University)
Frank Schneider (University of Tübingen)
Gaël Varoquaux (INRIA)
Slow vs Fast Science
Chelsea Finn (Stanford University)
Michela Paganini (DeepMind)
James Evans (University of Chicago)
Russell Poldrack (Stanford University)
Oriol Vinyals (DeepMind)
Incentives for Better Evaluation
Corinna Cortes (Google Research NYC)
Yoshua Bengio (Mila, Université de Montréal)
John Langford (Microsoft Research)
Kyunghyun Cho (New York University)
Organizers
Stephanie Chan (DeepMind)
Rishabh Agarwal (Google Brain)
Xavier Bouthillier (Mila, Université de Montréal)
Caglar Gulcehre (DeepMind)
Jesse Dodge (Allen Institute for AI)
For any queries, please reach out to the organizers at ml-eval-iclr2022@googlegroups.com.