EconPapers    
Economics at your fingertips  
 

Advancing the science of qualitative patient preference assessment using large language models

Ted Grover, Emanuel Krebs, Deirdre Weymann, Morgan Ehman and Dean A Regier

PLOS Digital Health, 2026, vol. 5, issue 3, 1-18

Abstract: Patient experiences and perspectives are essential for shaping patient-centered healthcare. While large language models (LLMs) in healthcare are typically applied to specific clinical or patient-facing tasks, they have not been used for qualitative patient preference assessment, which often relies on thematic analysis to understand patient views expressed in interviews or focus groups. LLMs show initial promise for performing inductive thematic analysis of healthcare interview or focus group transcripts, yet no empirical studies have investigated LLMs to facilitate qualitative patient preference assessment. We employed the open-source Hermes-3-Llama-3.1-70B LLM to perform inductive thematic analysis on focus group transcripts from a previously published qualitative patient preference assessment study using three optimized prompt frameworks, and evaluated semantic similarity of LLM-generated themes against human-analyzed themes using the Sentence-T5-XXL language embedding model. Sentence-level theme similarity was assessed using Jaccard similarity coefficients (0–1 range), computing coefficient scores across a broad range of discrete cosine similarity thresholds. We further evaluated LLM themes for similarity in lexical diversity and reading grade-level metrics and benchmarked semantic similarity results against published similarity thresholds previously used with qualitative healthcare data. All prompt frameworks generated themes with median Jaccard similarity coefficients between 0.46 and 0.64 against human-analyzed themes, indicating moderate semantic overlap. Our best-performing framework, which instructed the model to pursue thematic saturation, scored closest to human-analyzed themes on all reading grade-level metrics and demonstrated 12% higher semantic overlap with human-analyzed themes compared to published benchmarks. Our worst-performing framework produced themes with moderate semantic overlap and hallucinated findings not identified in the human-analyzed themes.
We demonstrate that LLMs can perform inductive thematic analysis of qualitative patient preference data, producing themes substantively similar in content and style to human-analyzed themes when augmented with sufficient domain-specific context. While LLMs may augment thematic analysis, the contextual nature of qualitative analysis remains a challenge requiring collaborative LLM frameworks that integrate human expertise.

Author summary: The experiences and preferences of patients provide valuable insights for evaluating the risks and benefits of new health products, services, and technologies, and can help guide appropriate decision making throughout the development process. Patient interviews or focus groups are commonly used by researchers to develop a deep understanding of patient perspectives and the perceived benefits or risks of a new healthcare product, service, or technology. While this approach is effective, considerable manual effort and time are required of researchers to uncover themes from the transcripts of these interviews or focus groups. In this study, we demonstrate that applying prompt optimization to open-source large language models can effectively and rapidly generate themes on patient preferences similar in content and style to human-analyzed themes. Our study can inform best practices for large language model use in thematic evidence generation of patient preferences to improve healthcare decision-making and accelerate patient-centered healthcare.
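The abstract's evaluation can be sketched in code: embed each theme sentence, compute cosine similarities between the LLM-generated and human-analyzed sets, and report a Jaccard-style overlap at each similarity threshold. The paper uses Sentence-T5-XXL embeddings; the sketch below assumes precomputed embedding arrays and a best-match rule for deciding when a sentence counts as overlapping, both of which are illustrative assumptions rather than the authors' exact procedure.

```python
import numpy as np

def cosine_sim_matrix(A, B):
    # Row-normalize both sets, then take dot products:
    # entry (i, j) is the cosine similarity of A[i] and B[j].
    A = A / np.linalg.norm(A, axis=1, keepdims=True)
    B = B / np.linalg.norm(B, axis=1, keepdims=True)
    return A @ B.T

def jaccard_at_threshold(llm_emb, human_emb, tau):
    # Assumed matching rule: a sentence "overlaps" if its best cosine
    # similarity to any sentence in the other set reaches tau.
    # Jaccard-style score = matched sentences / all sentences (0-1 range).
    S = cosine_sim_matrix(llm_emb, human_emb)
    llm_matched = int((S.max(axis=1) >= tau).sum())
    human_matched = int((S.max(axis=0) >= tau).sum())
    return (llm_matched + human_matched) / (len(llm_emb) + len(human_emb))

if __name__ == "__main__":
    # Stand-in random vectors in place of real Sentence-T5-XXL embeddings.
    rng = np.random.default_rng(0)
    llm_themes = rng.normal(size=(8, 16))
    human_themes = rng.normal(size=(10, 16))
    # Sweep a range of discrete cosine similarity thresholds.
    for tau in (0.3, 0.5, 0.7):
        print(f"tau={tau}: {jaccard_at_threshold(llm_themes, human_themes, tau):.2f}")
```

Sweeping the threshold, as in the study, shows how sensitive the overlap score is to how strict a "match" must be: the score is 1.0 only if every sentence in each set has a sufficiently similar counterpart in the other.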

Date: 2026

Downloads: (external link)
https://journals.plos.org/digitalhealth/article?id=10.1371/journal.pdig.0001263 (text/html)
https://journals.plos.org/digitalhealth/article/fi ... 01263&type=printable (application/pdf)

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:plo:pdig00:0001263

DOI: 10.1371/journal.pdig.0001263

Access Statistics for this article

More articles in PLOS Digital Health from Public Library of Science
Bibliographic data for series maintained by digitalhealth.

Page updated 2026-03-15
Handle: RePEc:plo:pdig00:0001263