Column-Type Prediction for Web Tables Powered by Knowledge Base and Text
Junyi Wu,
Chen Ye,
Haoshi Zhi and
Shihao Jiang
Additional contact information
Junyi Wu: College of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China
Chen Ye: College of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China
Haoshi Zhi: College of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China
Shihao Jiang: College of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China
Mathematics, 2023, vol. 11, issue 3, 1-15
Abstract:
Web tables are essential for applications such as data analysis. However, they are often incomplete and lack critical information, which makes their content hard to understand. Automatically predicting column types for tables without metadata is therefore important for handling the variety of tables found on the Internet. This paper proposes CNN-Text, a method for this task that fuses CNN prediction with a voting process. We present data augmentation and synthetic column generation approaches to improve the CNN’s performance, and use extracted text to obtain better predictions. The experimental results show that CNN-Text outperforms the baseline methods, demonstrating that it is well suited to table column-type prediction.
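The abstract does not say how the CNN prediction and the voting process are combined. As a rough illustration only, the sketch below assumes the simplest possible fusion: a weighted average of the classifier's per-type probabilities and the share of votes each type receives from text-derived candidates. The function names `vote_scores` and `fuse`, the `weight` parameter, and the example labels are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def vote_scores(candidate_types, labels):
    """Convert a list of candidate type votes into a vote share per label.

    Votes for labels the classifier does not know are ignored.
    """
    counts = Counter(t for t in candidate_types if t in labels)
    total = sum(counts.values())
    if total == 0:
        # No usable votes: every label gets a zero vote share.
        return {label: 0.0 for label in labels}
    return {label: counts[label] / total for label in labels}


def fuse(cnn_probs, candidate_types, weight=0.5):
    """Blend classifier probabilities with vote shares (assumed scheme).

    `weight` is the trust placed in the classifier; the remainder goes
    to the votes. Returns the best label and the fused scores.
    """
    labels = list(cnn_probs)
    votes = vote_scores(candidate_types, labels)
    fused = {
        label: weight * cnn_probs[label] + (1 - weight) * votes[label]
        for label in labels
    }
    best = max(fused, key=fused.get)
    return best, fused


# Illustrative inputs: the classifier slightly prefers "City", but two of
# the three text-derived votes point to "Country".
cnn_probs = {"City": 0.45, "Country": 0.40, "Person": 0.15}
text_votes = ["Country", "Country", "City"]
best_label, scores = fuse(cnn_probs, text_votes)
```

With these made-up inputs the votes tip the decision toward "Country"; with no votes at all, the classifier's own top label is returned unchanged.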
Keywords: column type prediction; knowledge base; convolutional neural network (CNN); text data
JEL-codes: C
Date: 2023
Downloads:
https://www.mdpi.com/2227-7390/11/3/560/pdf (application/pdf)
https://www.mdpi.com/2227-7390/11/3/560/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:11:y:2023:i:3:p:560-:d:1042851
Mathematics is currently edited by Ms. Emma He
Bibliographic data for series maintained by MDPI Indexing Manager.