Emergent Analogical Reasoning in Large Language Models: A Replication with Open-weights Alternatives
Andrea Gregor de Varda, Chiara Saponaro and Marco Marelli
No 239, I4R Discussion Paper Series from The Institute for Replication (I4R)
Abstract:
Webb, Holyoak & Lu (2023) compared human reasoners and GPT-3 on several analogy-resolution tasks, documenting human-level or superhuman model performance in most conditions. In this direct replication, we tested a different, open-weights language model (Mixtral-8x7B) either on the same materials (Experiments 1 and 2, "Digit Matrices" and "Letter String" problems) or on a dataset augmented to obtain the desired statistical power (Experiment 4, "Story Analogies"). Our replication confirmed the sign and statistical significance of the reported effects in the Digit Matrices and Story Analogies problems, whereas the model did not surpass the human baseline in the Letter String problems.
Date: 2025
New Economics Papers: this item is included in nep-exp
Downloads:
https://www.econstor.eu/bitstream/10419/319837/1/I4R-DP239.pdf (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:zbw:i4rdps:239