
fsdaSAS: a package for robust regression for very large datasets including the batch forward search

Francesca Torti, Aldo Corbellini and Anthony C. Atkinson

LSE Research Online Documents on Economics from London School of Economics and Political Science, LSE Library

Abstract: The forward search (FS) is a general method of robust data fitting that moves smoothly from very robust to maximum likelihood estimation. The regression procedures are included in the MATLAB toolbox FSDA. Work on a SAS version of the FS originated from the need to analyse large datasets, a need expressed by law enforcement services operating in the European Union that use our SAS software to detect data anomalies that may point to fraudulent customs returns. For our SAS implementation, the fsdaSAS package, we describe the approximation used to provide fast analyses of large datasets: an FS that progresses through the inclusion of batches of observations rather than one observation at a time. We do, however, test for outliers one observation at a time. We demonstrate that our SAS implementation becomes appreciably faster than the MATLAB version as the sample size increases and is also able to analyse larger datasets. The series of fits provided by the FS leads to the adaptive, data-dependent choice of maximally efficient robust estimates. It also allows the monitoring of residuals and parameter estimates for fits of differing robustness levels. fsdaSAS also applies the idea of monitoring to several robust regression estimators over a range of values of breakdown point or nominal efficiency, leading to adaptive values for these parameters. We also provide a variety of plots linked through brushing. Further programmed analyses include robust transformations of the response in regression. The package also provides the SAS community with methods of monitoring robust estimators for multivariate data, including multivariate data transformations.
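
Illustration (not part of the paper): the abstract's idea of a search that grows the fitting subset in batches can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the fsdaSAS or FSDA implementation. The function names, the simple-regression setting, the all-pairs robust start, the batch size and the toy data are all hypothetical choices made for the example. The monitored quantity is the raw absolute residual of the closest unit outside the subset, a simplified stand-in for a formal outlier test.

```python
from itertools import combinations
from statistics import median


def fit_line(points):
    """Least-squares fit of y = a + b*x on (x, y) pairs; returns (a, b)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("all x values in the subset are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def abs_residuals(data, intercept, slope):
    """Absolute residuals of every unit from the given line."""
    return [abs(y - intercept - slope * x) for x, y in data]


def robust_start(data, start_size):
    """Assumed starting rule: take the line through two points that has the
    smallest median squared residual, then keep the start_size closest units."""
    best = None
    for (x1, y1), (x2, y2) in combinations(data, 2):
        if x1 == x2:
            continue  # a vertical line cannot be written as y = a + b*x
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        score = median(r * r for r in abs_residuals(data, intercept, slope))
        if best is None or score < best[0]:
            best = (score, intercept, slope)
    res = abs_residuals(data, best[1], best[2])
    return sorted(range(len(data)), key=res.__getitem__)[:start_size]


def batch_forward_search(data, start_size=4, batch_size=4):
    """Refit on a growing subset, enlarging it by up to batch_size units per
    step; returns one record per fit so the estimates can be monitored."""
    n = len(data)
    subset = robust_start(data, start_size)
    path = []
    while True:
        intercept, slope = fit_line([data[i] for i in subset])
        res = abs_residuals(data, intercept, slope)
        inside = set(subset)
        outside = [res[i] for i in range(n) if i not in inside]
        path.append({
            "size": len(subset),
            "intercept": intercept,
            "slope": slope,
            "subset": sorted(subset),
            # Simplified monitoring statistic (assumption): raw, unscaled.
            "min_outside": min(outside) if outside else None,
        })
        if len(subset) == n:
            return path
        new_size = min(len(subset) + batch_size, n)
        # The next subset holds the new_size units closest to the current fit.
        subset = sorted(range(n), key=res.__getitem__)[:new_size]


# Toy data (hypothetical): 20 units near y = 1 + 2x, then 3 shifted outliers.
data = [(i, 1 + 2 * i + 0.1 * ((i * 7) % 5 - 2)) for i in range(20)]
data += [(x, 1 + 2 * x + 30) for x in (5, 10, 15)]

path = batch_forward_search(data)
for step in path:
    print(step["size"], round(step["intercept"], 3), round(step["slope"], 3),
          step["min_outside"])
```

In this toy run the printed intercept and slope can be read down the rows in the same spirit as the monitoring described in the abstract: the estimates stay stable while only well-fitting units are in the subset, and the monitored residual rises sharply once only the shifted units remain outside.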

Keywords: approximate analysis; big data; linked plots; monitoring; robust regression
JEL-codes: C1
Pages: 21 pages
Date: 2021-06-01
New Economics Papers: this item is included in nep-big
Citations: 2 (in EconPapers)

Published in Stats, 1 June 2021, 4(2), pp. 327–347. ISSN: 2571-905X

Downloads: (external link)
http://eprints.lse.ac.uk/109895/ Open access version. (application/pdf)



Persistent link: https://EconPapers.repec.org/RePEc:ehl:lserod:109895



 
Page updated 2024-12-28
Handle: RePEc:ehl:lserod:109895