Adaptable address parser with active learning
You-Xuan Lin
International Journal of Data Mining, Modelling and Management, 2023, vol. 15, issue 1, 79-101
Abstract:
Address parsing, the decomposition of address strings into semantically meaningful components, converts unstructured or semi-structured address data into structured data. The flexibility and variability of real-world address formats make parser development a non-trivial task. Even after considerable time and effort have been spent obtaining a capable parser, out-of-domain data still require updating or even re-training, which incurs extra cost. To minimise the cost of model building and updating, this study experiments with active learning for model training and adaptation. Models composed of character-level embeddings and recurrent neural networks are trained to parse addresses in Taiwan. Results show that, with active learning, adding 420 instances to the training data is sufficient for a model to adapt to unfamiliar data while retaining its competence in the original domain. This suggests that active learning is helpful for model adaptation when data labelling is expensive and restricted.
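The abstract does not state which query strategy the study uses, so the following is only a minimal sketch of one common form of active learning (least-confidence uncertainty sampling), not the author's method. All names here (`least_confidence`, `select_batch`, `adapt`, and the `train` and `label` callbacks) are hypothetical, and the model is assumed to return one probability distribution over component labels for each character of an address.

```python
from typing import Callable, List, Sequence, Tuple

# Assumption: a trained model maps an address string to one probability
# distribution over labels per character. This is illustrative only.
Predict = Callable[[str], List[Sequence[float]]]


def least_confidence(char_probs: List[Sequence[float]]) -> float:
    """Uncertainty of one address: 1 minus the mean per-character top probability."""
    if not char_probs:
        return 0.0
    return 1.0 - sum(max(p) for p in char_probs) / len(char_probs)


def select_batch(pool: List[str], predict: Predict, batch_size: int) -> List[str]:
    """Return the batch_size addresses the model is least confident about."""
    ranked = sorted(pool, key=lambda s: least_confidence(predict(s)), reverse=True)
    return ranked[:batch_size]


def adapt(
    labelled: List[Tuple[str, object]],
    pool: List[str],
    train: Callable[[List[Tuple[str, object]]], Predict],  # hypothetical trainer
    label: Callable[[str], object],                        # hypothetical annotator
    rounds: int,
    batch_size: int,
) -> Tuple[Predict, List[Tuple[str, object]]]:
    """Alternate between training and requesting labels for uncertain addresses."""
    labelled = list(labelled)
    pool = list(pool)
    predict = train(labelled)
    for _ in range(rounds):
        if not pool:
            break
        batch = select_batch(pool, predict, batch_size)
        labelled.extend((s, label(s)) for s in batch)
        pool = [s for s in pool if s not in batch]
        predict = train(labelled)  # re-train on the enlarged labelled set
    return predict, labelled
```

In a sketch like this, labelling effort is spent only on the addresses the current model finds hardest, which is the general idea behind adapting a parser to unfamiliar data with few additional labelled instances.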
Keywords: address parsing; record linkage; active learning; model adaptation; recurrent neural network; RNN; address in Taiwan
Date: 2023
Downloads:
http://www.inderscience.com/link.php?id=129991 (text/html)
Access to full text is restricted to subscribers.
Persistent link: https://EconPapers.repec.org/RePEc:ids:ijdmmm:v:15:y:2023:i:1:p:79-101
More articles in International Journal of Data Mining, Modelling and Management from Inderscience Enterprises Ltd