Untapped Location Discovery on Social Media by Combining Geospatial Clustering with Natural Language Processing
Siddharth Mehta,
Gautam Jain and
Shuchi Mala
Additional contact information
Siddharth Mehta: Amity School of Engineering and Technology, Noida, Uttar Pradesh, India
Gautam Jain: Amity School of Engineering and Technology, Noida, Uttar Pradesh, India
Shuchi Mala: Amity School of Engineering and Technology, Noida, Uttar Pradesh, India
Journal of Information & Knowledge Management (JIKM), 2024, vol. 23, issue 03, 1-17
Abstract:
This methodology combines geospatial clustering and Natural Language Processing (NLP) into a framework for discovering unexplored geotags in social media. The framework comprises the collection of data from social media platforms; preprocessing of the data with the Pandas, Natural Language Toolkit (NLTK) and SpaCy libraries for NLP analysis, including sentiment analysis and named entity recognition; spatial clustering with the Density-Based Spatial Clustering of Applications with Noise (DBSCAN), K-Means and HDBSCAN algorithms; and visualisation with the Matplotlib and Folium libraries. Data analysis and statistics were carried out with the Pandas and NumPy libraries, followed by an exploration step in which more data are selected and collected based on the results of the previous step. In addition, a prediction model has been developed that predicts the cluster of a location from its name by comparing the name against the preprocessed comma-separated values (CSV) data file. Certain locations, such as small-scale hospitals or little-known tourist places, are not currently tagged in available map applications. The framework can help researchers and policy makers identify such locations, gain insights from social media data and assess the potential of that data for decision-making in various fields.
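The clustering and name-lookup steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which DBSCAN implementation was used, so a small pure-Python DBSCAN over haversine distances stands in for it here, and the in-memory `posts` list stands in for the preprocessed CSV file. The place names, coordinates, `eps_km`, `min_samples` and the helper `predict_cluster` are all hypothetical and chosen only for illustration.

```python
import difflib
import math

EARTH_RADIUS_KM = 6371.0088


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def dbscan(points, eps_km, min_samples):
    """Return one label per point: 0, 1, ... for clusters and -1 for noise."""
    unvisited, noise = None, -1
    labels = [unvisited] * len(points)

    def neighbours(i):
        # Brute-force neighbourhood query; includes the point itself.
        return [j for j, p in enumerate(points)
                if haversine_km(points[i], p) <= eps_km]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not unvisited:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_samples:
            labels[i] = noise  # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == noise:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not unvisited:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_samples:
                queue.extend(reachable)  # j is a core point, keep expanding
    return labels


# Hypothetical geotagged posts: (place name, latitude, longitude).
posts = [
    ("Clinic A", 28.5700, 77.3200),
    ("Clinic B", 28.5703, 77.3204),
    ("Clinic C", 28.5698, 77.3197),
    ("Lake View", 28.6500, 77.2000),
    ("Lake Cafe", 28.6503, 77.2003),
    ("Lake Trail", 28.6497, 77.1998),
    ("Remote Post", 27.9000, 78.1000),
]

labels = dbscan([(lat, lon) for _, lat, lon in posts],
                eps_km=0.2, min_samples=3)

# Name-to-cluster table, standing in for the preprocessed CSV file.
cluster_of = {name.lower(): label
              for (name, _, _), label in zip(posts, labels)}


def predict_cluster(name):
    """Return the cluster of the closest known place name, or None."""
    match = difflib.get_close_matches(name.lower(), list(cluster_of),
                                      n=1, cutoff=0.6)
    return cluster_of[match[0]] if match else None
```

In this toy data the two dense groups of posts form separate clusters and the isolated post is labelled as noise. The fuzzy name matching is one simple way to compare a queried name with stored names, and the abstract does not specify how the paper's prediction model performs that comparison.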
Keywords: Geospatial; data analysis; clustering; Natural Language Processing
Date: 2024
Downloads:
http://www.worldscientific.com/doi/abs/10.1142/S0219649224500254
Access to full text is restricted to subscribers
Persistent link: https://EconPapers.repec.org/RePEc:wsi:jikmxx:v:23:y:2024:i:03:n:s0219649224500254
DOI: 10.1142/S0219649224500254
Journal of Information & Knowledge Management (JIKM) is currently edited by Professor Suliman Hawamdeh
More articles in Journal of Information & Knowledge Management (JIKM) from World Scientific Publishing Co. Pte. Ltd.