EconPapers    
Economics at your fingertips  
 

Extracting Visually Presented Element Relationships from Web Documents

Radek Burget and Pavel Smrz
Additional contact information
Radek Burget: Faculty of Information Technology, IT4Innovations Centre of Excellence, Brno University of Technology, Brno, Czech Republic
Pavel Smrz: Faculty of Information Technology, IT4Innovations Centre of Excellence, Brno University of Technology, Brno, Czech Republic

International Journal of Cognitive Informatics and Natural Intelligence (IJCINI), 2013, vol. 7, issue 2, 13-29

Abstract: Many documents in the World Wide Web present structured information that consists of multiple pieces of data with certain relationships among them. Although it is usually not difficult to identify the individual data values in the document text, their relationships are often not explicitly described in the document content. They are expressed by visual presentation of the document content that is expected to be interpreted by a human reader. In this paper, the authors propose a formal generic model of logical relationships in a document based on an interpretation of visual presentation patterns in the documents. The model describes the visually expressed relationships between individual parts of the contents independently of the document format and the particular way of presentation. Therefore, it can be used as an appropriate document model in many information retrieval or extraction applications. The authors formally define the model, the authors introduce a method of extracting the relationships between the content parts based on the visual presentation analysis and the authors discuss the expected applications. The authors also present a new dataset consisting of programmes of conferences and other scientific events and the authors discuss its suitability for the task in hand. Finally, the authors use the dataset to evaluate results of the implemented system.

Date: 2013
References: Add references at CitEc
Citations:

Downloads: (external link)
http://services.igi-global.com/resolvedoi/resolve. ... 18/ijcini.2013040102 (application/pdf)

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:igg:jcini0:v:7:y:2013:i:2:p:13-29

Access Statistics for this article

International Journal of Cognitive Informatics and Natural Intelligence (IJCINI) is currently edited by Kangshun Li

More articles in International Journal of Cognitive Informatics and Natural Intelligence (IJCINI) from IGI Global
Bibliographic data for series maintained by Journal Editor ().

 
Page updated 2025-03-19
Handle: RePEc:igg:jcini0:v:7:y:2013:i:2:p:13-29