Combining sequence and itemset mining to discover named entities in biomedical texts: a new type of pattern

Marc Plantevit, Thierry Charnois, Jiri Klema, Christophe Rigotti and Bruno Cremilleux

International Journal of Data Mining, Modelling and Management, 2009, vol. 1, issue 2, 119-148

Abstract: Biomedical named entity recognition (NER) is a challenging problem. In this paper, we show that mining techniques, such as sequential pattern mining and sequential rule mining, can be useful to tackle this problem but present some limitations. We demonstrate and analyse these limitations and introduce a new kind of pattern called LSR pattern that offers an excellent trade-off between the high precision of sequential rules and the high recall of sequential patterns. We formalise the LSR pattern mining problem first. Then we show how LSR patterns enable us to successfully tackle biomedical NER problems. We report experiments carried out on real datasets that underline the relevance of our proposition.

Keywords: LSR patterns; left-sequence-right patterns; sequential patterns; biomedical NER; named entity recognition; constraint-based pattern mining; biomedical texts; sequential rule mining; gene names; protein names; text mining; information extraction
Date: 2009

Downloads: (external link)
http://www.inderscience.com/link.php?id=26073 (text/html)
Access to full text is restricted to subscribers.

Persistent link: https://EconPapers.repec.org/RePEc:ids:ijdmmm:v:1:y:2009:i:2:p:119-148

More articles in International Journal of Data Mining, Modelling and Management from Inderscience Enterprises Ltd
Bibliographic data for series maintained by Sarah Parker.

 
Page updated 2025-03-19
Handle: RePEc:ids:ijdmmm:v:1:y:2009:i:2:p:119-148