Column-Type Prediction for Web Tables Powered by Knowledge Base and Text

Junyi Wu, Chen Ye, Haoshi Zhi and Shihao Jiang
Additional contact information
Junyi Wu: College of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China
Chen Ye: College of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China
Haoshi Zhi: College of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China
Shihao Jiang: College of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China

Mathematics, 2023, vol. 11, issue 3, 1-15

Abstract: Web tables are essential for applications such as data analysis. However, they are often incomplete and lack critical information, which makes their content hard to interpret. Automatically predicting column types for tables without metadata is therefore important for handling the wide variety of tables found on the Internet. This paper proposes CNN-Text, a method for this task that fuses CNN prediction and voting processes. We present data augmentation and synthetic column generation approaches to improve the CNN’s performance, and we use extracted text to obtain better predictions. Experimental results show that CNN-Text outperforms the baseline methods, demonstrating that it is well suited to table column-type prediction.

Keywords: column type prediction; knowledge base; convolutional neural network (CNN); text data
JEL-codes: C
Date: 2023

Downloads: (external link)
https://www.mdpi.com/2227-7390/11/3/560/pdf (application/pdf)
https://www.mdpi.com/2227-7390/11/3/560/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:11:y:2023:i:3:p:560-:d:1042851

Mathematics is currently edited by Ms. Emma He

Bibliographic data for series maintained by MDPI Indexing Manager.

 
Page updated 2025-03-19
Handle: RePEc:gam:jmathe:v:11:y:2023:i:3:p:560-:d:1042851