Identifying HIV sequences that escape antibody neutralization using random forests and collaborative targeted learning
Jin Yutong and
Benkeser David
Additional contact information
Jin Yutong: Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA, 30322, USA
Benkeser David: Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA, 30322, USA
Journal of Causal Inference, 2022, vol. 10, issue 1, 280-295
Abstract:
Recent studies have indicated that it is possible to protect individuals from HIV infection using passive infusion of monoclonal antibodies. However, for monoclonal antibodies to confer robust protection, they must be capable of neutralizing many possible strains of the virus. This is particularly challenging for a highly diverse pathogen like HIV. It is therefore of great interest to leverage existing observational data sources to discover antibodies that are able to neutralize HIV via residues where existing antibodies show modest protection. Such information feeds directly into the clinical trial pipeline for monoclonal antibody therapies by providing information on (i) whether and to what extent combinations of antibodies can generate superior protection and (ii) strategies for analyzing past clinical trials to identify in vivo evidence of antibody resistance. These observational data include genetic features of many diverse HIV sequences, as well as in vitro measures of antibody resistance. Our statistical learning problem is to develop methodology for analyzing these data to identify genetic features that are significantly associated with antibody resistance. This is a challenging problem owing to the high-dimensional and strongly correlated nature of the genetic sequence data. To overcome these challenges, we propose an outcome-adaptive, collaborative targeted minimum loss-based estimation approach using random forests. We demonstrate via simulation that the approach enjoys important statistical benefits over existing approaches in terms of bias, mean squared error, and type I error. We apply the approach to the Compile, Analyze, and Tally NAb Panels (CATNAP) database to identify amino acid (AA) positions that are potentially causally related to resistance to neutralization by several different antibodies.
Keywords: HIV; collaborative targeted minimum loss-based estimation; variable importance; random forests
Date: 2022
Downloads: https://doi.org/10.1515/jci-2021-0053 (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:bpj:causin:v:10:y:2022:i:1:p:280-295:n:1
DOI: 10.1515/jci-2021-0053
Journal of Causal Inference is currently edited by Elias Bareinboim, Jin Tian and Iván Díaz
More articles in Journal of Causal Inference from De Gruyter