Using structural information to improve search in Web collections

Edleno S. de Moura, David Fernandes, Berthier Ribeiro‐Neto, Altigran S. da Silva and Marcos André Gonçalves

Journal of the American Society for Information Science and Technology, 2010, vol. 61, issue 12, 2503-2513

Abstract: In this work, we investigate the problem of using the block structure of Web pages to improve ranking results. Starting with basic intuitions provided by the concepts of term frequency (TF) and inverse document frequency (IDF), we propose nine block‐weight functions to distinguish the impact of term occurrences inside page blocks, instead of inside whole pages. These are then used to compute a modified BM25 ranking function. Using four distinct Web collections, we ran extensive experiments to compare our block‐weight ranking formulas with two baselines: (a) a BM25 ranking applied to full pages, and (b) a BM25 ranking that takes into account best blocks. Our results suggest that our block‐weighting ranking method is superior to both baselines across all collections we used, with average gains in precision ranging from 5% to 20%.
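
The idea of weighting term occurrences by the block they appear in can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the nine block‐weight functions, so the example assumes a single hypothetical per-block weight (the `block_weights` mapping) that scales each term occurrence before a standard BM25 formula is applied. The function name, the block labels, and the parameter defaults (`k1=1.2`, `b=0.75`) are likewise assumptions made for illustration.

```python
import math
from collections import Counter


def bm25_block_weighted(query_terms, doc_blocks, block_weights,
                        doc_freq, num_docs, avg_len, k1=1.2, b=0.75):
    """Score one page whose text is split into labeled blocks.

    doc_blocks:    list of (block_label, list_of_tokens) pairs
    block_weights: hypothetical mapping block_label -> weight;
                   labels missing from the mapping default to 1.0
    doc_freq:      mapping term -> number of pages containing the term
    """
    # Weighted term frequency: each occurrence counts as its block's weight
    # instead of counting as 1, so occurrences in low-value blocks matter less.
    weighted_tf = Counter()
    doc_len = 0
    for label, tokens in doc_blocks:
        weight = block_weights.get(label, 1.0)
        doc_len += len(tokens)
        for token in tokens:
            weighted_tf[token] += weight

    # Standard BM25 length normalization, computed on the unweighted length.
    norm = k1 * (1 - b + b * doc_len / avg_len)

    score = 0.0
    for term in query_terms:
        tf = weighted_tf.get(term, 0.0)
        if tf == 0:
            continue
        df = doc_freq.get(term, 0)
        # Non-negative IDF variant commonly used with BM25.
        idf = math.log(1 + (num_docs - df + 0.5) / (df + 0.5))
        score += idf * tf * (k1 + 1) / (tf + norm)
    return score


# Toy page with a main-content block and a navigation block.
page = [("main", ["web", "search", "ranking"]), ("nav", ["home", "search"])]
doc_freq = {"search": 5, "ranking": 2}

plain = bm25_block_weighted(["search"], page, {}, doc_freq, 10, 5.0)
weighted = bm25_block_weighted(["search"], page, {"main": 2.0, "nav": 0.1},
                               doc_freq, 10, 5.0)
```

With an empty `block_weights` mapping every occurrence counts as 1 and the function reduces to BM25 over the full page, which corresponds to baseline (a) in the abstract. How the weights should actually be derived from TF and IDF statistics over blocks is the subject of the article itself and is not reproduced here.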

Date: 2010

Downloads: (external link)
https://doi.org/10.1002/asi.21436

Persistent link: https://EconPapers.repec.org/RePEc:bla:jamist:v:61:y:2010:i:12:p:2503-2513

Ordering information: This journal article can be ordered from
https://doi.org/10.1002/(ISSN)1532-2890

More articles in Journal of the American Society for Information Science and Technology from Association for Information Science & Technology
Bibliographic data for series maintained by Wiley Content Delivery.

 
Page updated 2025-03-19
Handle: RePEc:bla:jamist:v:61:y:2010:i:12:p:2503-2513