A novel approach to solve exact matching problem using multi-splitting of text patterns
Shashank Srivastav,
P. K. Singh and
Divakar Yadav
Additional contact information
Shashank Srivastav: Madan Mohan Malaviya University of Technology
P. K. Singh: Madan Mohan Malaviya University of Technology
Divakar Yadav: Indira Gandhi National Open University
International Journal of System Assurance Engineering and Management, 2023, vol. 14, issue 4, No 22, 1457-1466
Abstract:
Exact-match approaches are popular and widely used in applications such as texting, network intrusion detection systems, web-based search engines and molecular biology. The efficiency of these approaches is measured by the time taken to search the text and by how effectively heap memory is used. This paper proposes an approach that aims to improve both search time and memory usage by splitting the query pattern used to search the text corpus. The proposed approach divides the query pattern P into multiple sub-patterns $P_k, P_{k-1}, \ldots, P_2$ and $P_1$, where k depends on the pattern length. The text corpus is scanned from right to left, and the sub-patterns are likewise searched in right-to-left order. The proposed approach applies the traditional Boyer-Moore approach to minimize the comparison cost, using a bad match table built for the characters of the first sub-pattern $P_1$ only. The sub-patterns other than $P_1$ are matched using the brute-force approach. To find an exact match, each sub-pattern is aligned at the beginning of the next one, so that together they form $P_k P_{k-1} \ldots P_2 P_1$. A comparative study of the traditional approaches and the proposed solution indicates that the proposed approach outperforms the traditional approaches in terms of memory utilization and matching time.
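The abstract gives only an outline of the method, so the following is a minimal sketch of one way the described steps could fit together, not the authors' implementation. Several details are assumptions made for illustration: the names `split_pattern`, `build_shift_table` and `search` are hypothetical, the pattern is cut into fixed-size parts of length `part_len` (the abstract says only that k depends on the pattern length), and the bad match table is a mirrored Horspool-style shift table suited to a right-to-left scan of the text.

```python
def split_pattern(pattern, part_len):
    """Split pattern into [P1, P2, ..., Pk], where P1 is the rightmost part.

    Assumption: fixed-size parts; the leftmost part Pk may be shorter.
    """
    parts = []
    end = len(pattern)
    while end > 0:
        start = max(0, end - part_len)
        parts.append(pattern[start:end])
        end = start
    return parts


def build_shift_table(p1):
    """Bad match table for P1 only, for a window that moves leftwards.

    Maps a character to the smallest index j >= 1 where it occurs in P1.
    Characters not in the table shift the window by len(P1).
    """
    table = {}
    for j in range(len(p1) - 1, 0, -1):
        table[p1[j]] = j  # smaller j overwrites larger j
    return table


def search(text, pattern, part_len=4):
    """Return start indices of all exact matches, found right to left."""
    if not pattern or len(pattern) > len(text):
        return []
    parts = split_pattern(pattern, part_len)
    p1 = parts[0]
    m1 = len(p1)
    prefix_len = len(pattern) - m1  # room needed to the left of P1
    table = build_shift_table(p1)
    matches = []
    s = len(text) - m1  # rightmost possible start of P1
    while s >= prefix_len:
        if text[s:s + m1] == p1:
            # P1 matched: check P2, ..., Pk by brute force, each one
            # placed immediately to the left of the previous part.
            end = s
            ok = True
            for part in parts[1:]:
                start = end - len(part)
                for offset, ch in enumerate(part):
                    if text[start + offset] != ch:
                        ok = False
                        break
                if not ok:
                    break
                end = start
            if ok:
                matches.append(s - prefix_len)
        # Shift the window left using the character under P1's first slot.
        s -= table.get(text[s], m1)
    return matches


print(search("abracadabra abracadabra", "abracadabra"))  # [12, 0]
```

Because the text is scanned from right to left, matches are reported in descending order of position. Only P1 needs a shift table in this sketch, which is consistent with the abstract's claim of reduced memory use, though the paper's actual table layout and splitting rule may differ.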
Keywords: Information retrieval; Exact pattern matching; Pattern splitting; Partition-based pattern matching; String matching
Date: 2023
Downloads:
http://link.springer.com/10.1007/s13198-023-01948-7 Abstract (text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:spr:ijsaem:v:14:y:2023:i:4:d:10.1007_s13198-023-01948-7
Ordering information: This journal article can be ordered from
http://www.springer.com/engineering/journal/13198
DOI: 10.1007/s13198-023-01948-7
International Journal of System Assurance Engineering and Management is currently edited by P.K. Kapur, A.K. Verma and U. Kumar
More articles in International Journal of System Assurance Engineering and Management from Springer, The Society for Reliability, Engineering Quality and Operations Management (SREQOM), India, and Division of Operation and Maintenance, Lulea University of Technology, Sweden
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.