Position: The Pre/Post-Training Boundary Should Govern IP in Industry-Academia ML Collaborations

Dirk Bergemann, Soheil Ghili and Nitzan Mekel-Bobrov

Papers from arXiv.org

Abstract: Industry-academia ML collaborations routinely fail to launch, not for scientific reasons, but because academics must publish while companies must protect models trained on proprietary data, and no standard contract framework resolves this tension. Because contracts are negotiated by legal departments alone, many apparent legal disputes are in fact incentive-misalignment problems that only scientists at the table can correctly diagnose. We propose PBOS (Protect-the-Business / Open-Source-the-Science), a community-adoptable contract template anchored to a single technically grounded boundary: pre-training artifacts (architectures, training code, benchmarks, untrained weights) are open science; post-training artifacts (weights trained on proprietary data) are business IP. This boundary is technically meaningful, legally clean, and auditable, and it could not have been drawn correctly without scientists at the negotiating table. We argue that the ML community should adopt PBOS as its default contract for such collaborations.

Date: 2026-05

Downloads: (external link)
http://arxiv.org/pdf/2605.22632 Latest version (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:arx:papers:2605.22632

Page updated 2026-05-22
Handle: RePEc:arx:papers:2605.22632