Reasoning about unstructured data de-identification

Patricia Thaine and Gerald Penn
Additional contact information
Patricia Thaine: PhD Candidate, University of Toronto; Co-Founder & CEO, Private AI, Canada
Gerald Penn: Professor of Computer Science, University of Toronto; Co-Founder & Chief Science Officer, Private AI, Canada

Journal of Data Protection & Privacy, 2020, vol. 3, issue 3, 299-309

Abstract: We frame the problem of de-identifying unstructured text within the greater landscape of privacy-enhancing technologies. We then cover what sort of background knowledge can be gained from only stylistic information about a written document and how we can use research on authorship attribution and author profiling to improve our understanding about the sorts of inferences that can be made from an otherwise de-identified text. Finally, we provide a risk score for determining the likelihood that a message will be attributed to a particular author within a dataset using only author profiling tools.

Keywords: anonymisation; de-identification; authorship attribution; author profiling; unstructured data; risk
JEL-codes: K2
Date: 2020

Downloads: (external link)
https://hstalks.com/article/5711/download/ (application/pdf)
https://hstalks.com/article/5711/ (text/html)
Requires a paid subscription for full access.

Persistent link: https://EconPapers.repec.org/RePEc:aza:jdpp00:y:2020:v:3:i:3:p:299-309

More articles in Journal of Data Protection & Privacy from Henry Stewart Publications
Bibliographic data for series maintained by Henry Stewart Talks.

Page updated 2025-03-19
Handle: RePEc:aza:jdpp00:y:2020:v:3:i:3:p:299-309