Artificial intelligence driven design of catalysts and materials for ring opening polymerization using a domain-specific language
Nathaniel H. Park,
Matteo Manica,
Jannis Born,
James L. Hedrick,
Tim Erdmann,
Dmitry Yu. Zubarev,
Nil Adell-Mill and
Pedro L. Arrechea
Additional contact information
Nathaniel H. Park: IBM Research–Almaden
Matteo Manica: IBM Research–Zurich
Jannis Born: IBM Research–Zurich
James L. Hedrick: IBM Research–Almaden
Tim Erdmann: IBM Research–Almaden
Dmitry Yu. Zubarev: IBM Research–Almaden
Nil Adell-Mill: IBM Research–Zurich
Pedro L. Arrechea: IBM Research–Almaden
Nature Communications, 2023, vol. 14, issue 1, 1-15
Abstract:
Advances in machine learning (ML) and automated experimentation are poised to vastly accelerate research in polymer science. Data representation is a critical aspect for enabling ML integration in research workflows, yet many data models impose significant rigidity, making it difficult to accommodate a broad array of experiment and data types found in polymer science. This inflexibility presents a significant barrier for researchers to leverage their historical data in ML development. Here we show that a domain-specific language, termed Chemical Markdown Language (CMDL), provides flexible, extensible, and consistent representation of disparate experiment types and polymer structures. CMDL enables seamless use of historical experimental data to fine-tune regression transformer (RT) models for generative molecular design tasks. We demonstrate the utility of this approach through the generation and experimental validation of catalysts and polymers in the context of ring-opening polymerization, although we provide examples of how CMDL can be more broadly applied to other polymer classes. Critically, we show how the CMDL-tuned model preserves key functional groups within the polymer structure, allowing for experimental validation. These results reveal the versatility of CMDL and how it facilitates translation of historical data into meaningful predictive and generative models to produce experimentally actionable output.
Date: 2023
Downloads: (external link)
https://www.nature.com/articles/s41467-023-39396-3 Abstract (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:14:y:2023:i:1:d:10.1038_s41467-023-39396-3
Ordering information: This journal article can be ordered from
https://www.nature.com/ncomms/
DOI: 10.1038/s41467-023-39396-3
Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.