Sequential Pattern Mining Algorithm Based on Text Data: Taking the Fault Text Records as an Example

Xinglong Yuan, Wenbing Chang, Shenghan Zhou and Yang Cheng
Additional contact information
Xinglong Yuan: School of Reliability and System Engineering, Beihang University, Beijing 100191, China
Wenbing Chang: School of Reliability and System Engineering, Beihang University, Beijing 100191, China
Shenghan Zhou: School of Reliability and System Engineering, Beihang University, Beijing 100191, China
Yang Cheng: Center for Industrial Production, Aalborg University, 9220 Aalborg, Denmark

Sustainability, 2018, vol. 10, issue 11, 1-19

Abstract: Sequential pattern mining (SPM) is an effective and important method for analyzing time series. This paper proposes an SPM algorithm for mining fault sequential patterns in text data. Because text data is poorly structured and the same concept can be expressed in many different forms, traditional SPM algorithms cannot be applied to text data directly. The proposed algorithm is designed to solve this problem. First, the study measures the similarity of fault text records and groups similar faults into one class. For this step, the paper proposes a new text similarity measurement model based on the word embedding distance. Compared with classic text similarity measurement methods, this model achieves good results in short text classification. Then, on the basis of this fault classification, the paper proposes an SPM algorithm with an event window, a soft time constraint that lets the user obtain a desired number of sequential patterns. Finally, the study uses the fault text records of an aircraft as experimental data for mining fault sequential patterns. The experiments show that the algorithm can effectively mine sequential patterns in text data. The proposed algorithm can be widely applied to text time series data in many fields, such as industry, business, and finance.
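
The two stages named in the abstract (grouping similar fault texts, then mining ordered patterns inside an event window) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The toy word vectors, the averaged-vector cosine similarity, the greedy grouping, the threshold, and the reading of the event window as "the next N events" are all assumptions made for illustration; the paper's own similarity model and window definition may differ.

```python
import math
from collections import Counter

# Toy 2-D word vectors, invented for illustration only (not from the paper).
TOY_VECTORS = {
    "pump": (1.0, 0.0), "leak": (0.9, 0.1), "leakage": (0.9, 0.2),
    "sensor": (0.0, 1.0), "failure": (0.1, 0.9), "fault": (0.1, 1.0),
}

def embed(text):
    """Average the toy vectors of the known words in one fault record."""
    vecs = [TOY_VECTORS[w] for w in text.lower().split() if w in TOY_VECTORS]
    if not vecs:
        return (0.0, 0.0)
    return tuple(sum(component) / len(vecs) for component in zip(*vecs))

def cosine(u, v):
    """Cosine similarity of two 2-D vectors; 0.0 if either is the zero vector."""
    norm = math.hypot(*u) * math.hypot(*v)
    return sum(a * b for a, b in zip(u, v)) / norm if norm else 0.0

def classify(records, threshold=0.95):
    """Greedy grouping (an assumed stand-in for the paper's classification):
    each record joins the first class whose representative is similar enough,
    otherwise it starts a new class."""
    representatives, labels = [], []
    for text in records:
        vec = embed(text)
        for idx, rep in enumerate(representatives):
            if cosine(vec, rep) >= threshold:
                labels.append(idx)
                break
        else:
            representatives.append(vec)
            labels.append(len(representatives) - 1)
    return labels

def mine_pairs(labels, window=1, min_support=2):
    """Count ordered class pairs (a, b) where b is among the next `window`
    events after a, and keep the pairs seen at least `min_support` times."""
    counts = Counter()
    for i, first in enumerate(labels):
        for second in labels[i + 1 : i + 1 + window]:
            counts[(first, second)] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}

# Hypothetical fault records, already sorted by time.
records = ["pump leak", "sensor failure", "pump leakage",
           "sensor fault", "pump leak", "sensor failure"]
labels = classify(records)
patterns = mine_pairs(labels, window=1, min_support=2)
```

In this sketch, widening `window` admits more candidate pairs and raising `min_support` prunes them, which is one simple way to steer how many patterns are returned.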

Keywords: time series; sequential pattern mining; data analytics; text similarity; text mining
JEL-codes: O13 Q Q0 Q2 Q3 Q5 Q56
Date: 2018

Downloads: (external link)
https://www.mdpi.com/2071-1050/10/11/4330/pdf (application/pdf)
https://www.mdpi.com/2071-1050/10/11/4330/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jsusta:v:10:y:2018:i:11:p:4330-:d:184548

Sustainability is currently edited by Ms. Alexandra Wu

Bibliographic data for series maintained by MDPI Indexing Manager.

 
Page updated 2025-03-19
Handle: RePEc:gam:jsusta:v:10:y:2018:i:11:p:4330-:d:184548