Making Stata estimation commands faster through automatic differentiation and integration with Python
Paul Lambert
Additional contact information
Paul Lambert: University of Leicester
2021 Stata Conference from Stata Users Group
Abstract:
Fitting complex statistical models to very large datasets can be frustratingly slow. This is particularly problematic if multiple models need to be fit, for example, when using bootstrapping, cross-validation, or multiple imputation. I will introduce the mlad command, as an alternative to Stata's ml command, to estimate parameters using maximum likelihood. Rather than a Stata or Mata function, mlad requires the likelihood to be written in Python. A key advantage is that there is no need to derive the gradient vector or the Hessian matrix because these are obtained through automatic differentiation using the Python JAX library. In addition, the functions for the likelihood, gradient, and Hessian matrix are compiled and able to use multiple processors. This makes maximizing likelihoods using mlad easier to implement and substantially faster than using ml, while all results are still returned to Stata. Implementing mlad on the author’s own estimation commands leads to speed improvements of 70–98% compared with ml. The syntax of mlad is almost identical to that of ml, making it easy for programmers to add an option to their estimation command so that users with large datasets can benefit from the speed improvements.
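The idea of obtaining the gradient and Hessian by automatic differentiation can be illustrated with a minimal JAX sketch. This is not mlad's interface or the author's code: the logistic model, the function names (loglik, fit), the simulated data, and the plain Newton-Raphson loop are all assumptions chosen for illustration. Only the log likelihood is written by hand; jax.grad and jax.hessian derive the derivatives, and jax.jit compiles them.

```python
import jax
import jax.numpy as jnp

# Use double precision, as is usual for likelihood maximization.
jax.config.update("jax_enable_x64", True)


def loglik(beta, X, y):
    """Log likelihood of a logistic regression (illustrative model only)."""
    eta = X @ beta
    # log L = sum_i [ y_i * eta_i - log(1 + exp(eta_i)) ]
    return jnp.sum(y * eta - jnp.logaddexp(0.0, eta))


# Gradient and Hessian with respect to beta come from automatic
# differentiation; no derivatives are coded by hand. jit compiles each function.
grad_fn = jax.jit(jax.grad(loglik))
hess_fn = jax.jit(jax.hessian(loglik))


def fit(X, y, iters=25):
    """Plain Newton-Raphson iterations (hypothetical helper, not mlad's optimizer)."""
    beta = jnp.zeros(X.shape[1])
    for _ in range(iters):
        g = grad_fn(beta, X, y)
        H = hess_fn(beta, X, y)
        beta = beta - jnp.linalg.solve(H, g)
    return beta


# Simulated data, assumed purely for the example.
key_x, key_y = jax.random.split(jax.random.PRNGKey(0))
n = 500
x = jax.random.normal(key_x, (n,))
X = jnp.column_stack([jnp.ones(n), x])
true_beta = jnp.array([-0.5, 1.0])
y = jax.random.bernoulli(key_y, jax.nn.sigmoid(X @ true_beta)).astype(jnp.float64)

beta_hat = fit(X, y)
print("estimates:", beta_hat)
print("max |gradient| at estimates:", jnp.max(jnp.abs(grad_fn(beta_hat, X, y))))
```

Changing the model in this sketch only requires editing loglik; the gradient and Hessian functions follow automatically, which is the convenience the abstract describes.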
Date: 2021-08-07
New Economics Papers: this item is included in nep-isf
Downloads: (external link)
http://fmwww.bc.edu/repec/scon2021/US21_Lambert.pdf
Persistent link: https://EconPapers.repec.org/RePEc:boc:scon21:24