Co-training framework for enhancing survey accuracy while reducing respondent burden in travel data collection

Reem Alolabi and Makoto Chikaraishi

Transportation Research Part A: Policy and Practice, 2026, vol. 205, issue C

Abstract: A major bottleneck in travel behavior analysis is the need for a substantial amount of labeled data, the collection of which typically places a burden on survey respondents. Our study addresses this issue by leveraging semi-supervised learning, specifically the co-training algorithm, which incorporates both labeled (active) and unlabeled (passive) data. We extend the semi-supervised learning concept so that it becomes part of a survey scheme covering both the data collection process and the enrichment of travel attributes. Our experiments, which focus on travel mode identification using GPS data from Hiroshima, Japan, demonstrate that the proposed method outperforms conventional supervised learning methods such as neural networks, KNN, and SVM, particularly when an increased proportion of unlabeled data is incorporated. This strategic use of unlabeled data achieves two apparently conflicting goals: (1) reducing the reliance on extensive manual labeling, thereby alleviating respondent burden, and (2) increasing prediction accuracy. The results of our experiments also reveal that the balance between the proportions of labeled and unlabeled data plays a pivotal role in co-training performance. Our findings further underscore the potential of co-training, beyond its role as a mode identification tool, as a data filtering method: by optimizing the interplay between labeled and unlabeled data, co-training filters noise and refines the dataset. This contributes to enhanced survey accuracy while minimizing the labeling burden. Our results provide useful information for designing an adaptive scheme that dynamically tailors the information solicited from respondents to optimize the balance between data quality and respondent burden.
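
For readers unfamiliar with co-training, the general idea named in the abstract can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the abstract does not state which feature views or base classifiers the paper uses, so the two views (hypothetical speed features and hypothetical stop/road-type features), the nearest-centroid classifier, the toy data, and the names `centroid_fit`, `centroid_predict`, and `co_train` are all assumptions made for this example.

```python
import math

def centroid_fit(X, y):
    """Fit a nearest-centroid model: return {label: mean feature vector}."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

def centroid_predict(model, x):
    """Return (label, confidence); confidence is the distance margin
    between the nearest and second-nearest centroid."""
    dists = sorted((math.dist(x, c), lab) for lab, c in model.items())
    best_dist, best_label = dists[0]
    margin = dists[1][0] - best_dist if len(dists) > 1 else 0.0
    return best_label, margin

def co_train(view_a, view_b, labels, rounds=5, per_round=2):
    """Co-training over two feature views of the same trips.

    labels[i] is a mode label, or None for an unlabeled (passive) trip.
    In each round, one classifier per view is trained on the shared labeled
    pool, and each adds its most confident pseudo-labels to that pool.
    """
    labels = list(labels)
    for _ in range(rounds):
        labeled = [i for i, lab in enumerate(labels) if lab is not None]
        unlabeled = [i for i, lab in enumerate(labels) if lab is None]
        if not unlabeled:
            break
        new_labels = {}
        for view in (view_a, view_b):
            model = centroid_fit([view[i] for i in labeled],
                                 [labels[i] for i in labeled])
            scored = [(centroid_predict(model, view[i]), i) for i in unlabeled]
            scored.sort(key=lambda t: t[0][1], reverse=True)
            for (label, _margin), i in scored[:per_round]:
                new_labels.setdefault(i, label)  # first view to claim i wins
        for i, label in new_labels.items():
            labels[i] = label
    return labels

# Toy data (invented for illustration): 8 trips, 2 of them labeled.
# View A: hypothetical (mean speed, max speed) in km/h.
view_a = [(4.5, 6.0), (5.2, 7.1), (4.8, 6.5), (5.5, 7.4),
          (38.0, 65.0), (42.0, 72.0), (40.5, 68.0), (36.0, 60.0)]
# View B: hypothetical (share of time on footpaths, share on roads).
view_b = [(0.90, 0.10), (0.80, 0.20), (0.85, 0.15), (0.95, 0.05),
          (0.10, 0.90), (0.20, 0.80), (0.15, 0.85), (0.05, 0.95)]
labels = ["walk", None, None, None, "car", None, None, None]

result = co_train(view_a, view_b, labels)
print(result)
```

In this sketch only two of the eight trips carry respondent-provided labels; the rest are filled in from the passive data. The `per_round` parameter controls how many pseudo-labels each view contributes per round, which is one simple way to experiment with the labeled/unlabeled balance the abstract highlights.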

Keywords: Co-training; Respondent burden; Survey accuracy; Travel mode detection; Semantic enrichment
Date: 2026

Downloads: (external link)
http://www.sciencedirect.com/science/article/pii/S0965856425003398
Full text for ScienceDirect subscribers only

Persistent link: https://EconPapers.repec.org/RePEc:eee:transa:v:205:y:2026:i:c:s0965856425003398

Ordering information: This journal article can be ordered from
http://www.elsevier.com/wps/find/supportfaq.cws_home/regional
https://shop.elsevie ... _01_ooc_1&version=01

DOI: 10.1016/j.tra.2025.104706

Transportation Research Part A: Policy and Practice is currently edited by John (J.M.) Rose

More articles in Transportation Research Part A: Policy and Practice from Elsevier
Bibliographic data for series maintained by Catherine Liu.

 
Page updated 2026-02-10
Handle: RePEc:eee:transa:v:205:y:2026:i:c:s0965856425003398