Novel Linguistic Steganography Based on Character-Level Text Generation

Xiang, Lingyun; Yang, Shuanghui; Liu, Yuhang; Li, Qian; Zhu, Chengzhang

Lingyun Xiang, Shuanghui Yang, Yuhang Liu, Qian Li and Chengzhang Zhu
Additional contact information
Lingyun Xiang: Hunan Provincial Key Laboratory of Intelligent Processing of Big Data on Transportation, Changsha University of Science and Technology, Changsha 410114, China
Shuanghui Yang: School of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha 410114, China
Yuhang Liu: School of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha 410114, China
Qian Li: Faculty of Engineering and Information Technology, University of Technology Sydney, Ultimo, NSW 2007, Australia
Chengzhang Zhu: Academy of Military Sciences, Beijing 100091, China

Mathematics, 2020, vol. 8, issue 9, 1-18

Abstract: With the development of natural language processing, linguistic steganography has become a research hotspot in information security. However, most existing linguistic steganographic methods suffer from low embedding capacity. This paper therefore proposes a character-level linguistic steganographic method (CLLS) that embeds secret information into characters instead of words by employing a long short-term memory (LSTM) based language model. First, the method uses the LSTM model and a large-scale corpus to construct and train a character-level text generation model; the best-evaluated model obtained during training serves as the prediction model for generating stego text. Then, the secret information is used as control information to select a character from the predictions of the trained character-level text generation model. The secret information is thus hidden in the generated text, because predicted characters with different prediction probabilities are encoded as different secret bit values. For the same secret information, the generated stego text varies with the starting string of the text generation model, so we design a selection strategy that changes the starting string to produce a number of candidate stego texts and chooses the highest-quality candidate as the final stego text. Experimental results demonstrate that, compared with other similar methods, the proposed method has the fastest running speed and the highest embedding capacity. Extensive experiments are also conducted to verify the effect of the number of candidate stego texts on the quality of the final stego text. The results show that the quality of the final stego text increases with the number of candidate stego texts, but the growth rate gradually slows.
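
The embedding idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: a smoothed bigram count model stands in for the trained LSTM, the toy corpus, the function names, the fixed number of bits per character `k`, and the use of perplexity as the quality score are all assumptions made for illustration, and the receiver is assumed to know the length of the starting string.

```python
import math
from collections import Counter, defaultdict

# Toy corpus (assumption); the paper trains an LSTM on a large-scale corpus.
CORPUS = "the quick brown fox jumps over the lazy dog and the cat sat on the mat " * 3
ALPHABET = sorted(set(CORPUS))

def train_bigram(text):
    """Count next-character frequencies: a toy stand-in for the LSTM model."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(text, text[1:]):
        counts[prev][nxt] += 1
    return counts

def ranked_candidates(model, prev):
    """Characters ordered by descending count after `prev`, ties alphabetical."""
    counts = model[prev]
    return sorted(ALPHABET, key=lambda c: (-counts[c], c))

def embed(model, bits, start, k=1):
    """Hide k bits per generated character by picking the candidate of that rank."""
    assert len(bits) % k == 0 and 2 ** k <= len(ALPHABET)
    text = start
    for i in range(0, len(bits), k):
        rank = int(bits[i:i + k], 2)
        text += ranked_candidates(model, text[-1])[rank]
    return text

def extract(model, stego, start_len, k=1):
    """Recover the bits by recomputing each character's rank with the same model."""
    bits = []
    for i in range(start_len, len(stego)):
        rank = ranked_candidates(model, stego[i - 1]).index(stego[i])
        bits.append(format(rank, f"0{k}b"))
    return "".join(bits)

def perplexity(model, text):
    """Add-one smoothed bigram perplexity, used here as an assumed quality score."""
    log_sum = 0.0
    for prev, nxt in zip(text, text[1:]):
        counts = model[prev]
        p = (counts[nxt] + 1) / (sum(counts.values()) + len(ALPHABET))
        log_sum += math.log(p)
    return math.exp(-log_sum / (len(text) - 1))

def best_stego(model, bits, starts, k=1):
    """Generate one candidate per starting string and keep the lowest perplexity."""
    candidates = [embed(model, bits, s, k) for s in starts]
    return min(candidates, key=lambda t: perplexity(model, t))

model = train_bigram(CORPUS)
secret = "0110100111"
stego = best_stego(model, secret, ["th", "qu", "do"], k=1)
recovered = extract(model, stego, start_len=2, k=1)
```

In this sketch, raising `k` hides more bits per character at the cost of choosing less probable characters, and adding more starting strings widens the pool from which the final stego text is chosen.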

Keywords: linguistic steganography; LSTM; automatic text generation; character-level language model
JEL-codes: C
Date: 2020

Downloads: (external link)
https://www.mdpi.com/2227-7390/8/9/1558/pdf (application/pdf)
https://www.mdpi.com/2227-7390/8/9/1558/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:8:y:2020:i:9:p:1558-:d:411938

Mathematics is currently edited by Ms. Emma He

Bibliographic data for series maintained by MDPI Indexing Manager.