A community-powered search of machine learning strategy space to find NMR property prediction models
Lars A Bratholm,
Will Gerrard,
Brandon Anderson,
Shaojie Bai,
Sunghwan Choi,
Lam Dang,
Pavel Hanchar,
Addison Howard,
Sanghoon Kim,
Zico Kolter,
Risi Kondor,
Mordechai Kornbluth,
Youhan Lee,
Youngsoo Lee,
Jonathan P Mailoa,
Thanh Tu Nguyen,
Milos Popovic,
Goran Rakocevic,
Walter Reade,
Wonho Song,
Luka Stojanovic,
Erik H Thiede,
Nebojsa Tijanic,
Andres Torrubia,
Devin Willmott,
Craig P Butts and
David R Glowacki
PLOS ONE, 2021, vol. 16, issue 7, 1-16
Abstract:
The rise of machine learning (ML) has created an explosion in the potential strategies for using data to make scientific predictions. For physical scientists wishing to apply ML strategies to a particular domain, it can be difficult to assess in advance what strategy to adopt within a vast space of possibilities. Here we outline the results of an online community-powered effort to swarm search the space of ML strategies and develop algorithms for predicting atomic-pairwise nuclear magnetic resonance (NMR) properties in molecules. Using an open-source dataset, we worked with Kaggle to design and host a 3-month competition which received 47,800 ML model predictions from 2,700 teams in 84 countries. Within 3 weeks, the Kaggle community produced models with comparable accuracy to our best previously published ‘in-house’ efforts. A meta-ensemble model constructed as a linear combination of the top predictions has a prediction accuracy which exceeds that of any individual model, 7-19x better than our previous state-of-the-art. The results highlight the potential of transformer architectures for predicting quantum mechanical (QM) molecular properties.
Date: 2021
Downloads: (external link)
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0253612 (text/html)
https://journals.plos.org/plosone/article/file?id= ... 53612&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pone00:0253612
DOI: 10.1371/journal.pone.0253612