Machine Learning Using Cassandra as a Data Source: The Importance of Cassandra's Frozen Collections in Training and Retraining Models
Radhika Kanubaddhi
Journal of Artificial Intelligence General science (JAIGS) ISSN:3006-4023, 2024, vol. 1, issue 1, 219-228
Abstract:
This paper explores the integration of Apache Cassandra as a data source for machine learning (ML) applications, emphasizing the role of Cassandra's frozen collections in model training and retraining. The study highlights how Cassandra's distributed and scalable architecture enables efficient storage and retrieval of large, diverse datasets essential for machine learning tasks. A key focus is placed on the functionality of frozen collections within Cassandra, which allow for compact storage of complex data structures like lists, sets, and maps. By using these frozen collections, machine learning models can be trained and retrained more effectively, improving data consistency, performance, and scalability. The paper also presents case studies and experiments demonstrating how leveraging frozen collections can optimize the machine learning pipeline, reducing latency and enhancing real-time model updates.
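As a rough illustration of the idea in the abstract, the sketch below shows how rows holding a frozen collection might be turned into training inputs. It is not taken from the paper: the keyspace `ml`, the table `training_samples`, its columns, and the helper `to_training_arrays` are all hypothetical. The `Sample` tuples stand in for rows that a driver would return, so the example runs without a cluster. It relies on one known property of frozen collections: the whole collection is stored and read as a single value, so it is replaced as a unit rather than updated element by element.

```python
from typing import NamedTuple

# Hypothetical schema (an assumption, not from the paper). The frozen list
# is written and read as one value; individual elements cannot be updated
# in place, so a new feature vector replaces the old one entirely.
CREATE_TABLE = """
CREATE TABLE IF NOT EXISTS ml.training_samples (
    sample_id uuid PRIMARY KEY,
    features frozen<list<double>>,
    label int
)
"""


class Sample(NamedTuple):
    """Stand-in for one row read from the hypothetical table."""
    features: tuple
    label: int


def to_training_arrays(rows):
    """Split rows into a feature matrix X and a label vector y.

    Raises ValueError if the feature vectors differ in length, since most
    training code expects a rectangular matrix.
    """
    X = [list(r.features) for r in rows]
    y = [r.label for r in rows]
    widths = {len(v) for v in X}
    if len(widths) > 1:
        raise ValueError(f"inconsistent feature lengths: {sorted(widths)}")
    return X, y


# In-memory rows used in place of a real query result.
rows = [Sample((0.1, 0.2, 0.3), 1), Sample((0.4, 0.5, 0.6), 0)]
X, y = to_training_arrays(rows)
```

Because each frozen value arrives whole, a retraining job under these assumptions can read one column per row and pass the result straight to a model, with the length check guarding against rows written under an older feature layout.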
Keywords: Machine learning; Apache Cassandra; Frozen collections; Distributed databases; Data storage; Model retraining; Big data
Date: 2024
Downloads: https://newjaigs.com/index.php/JAIGS/article/view/228 (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:das:njaigs:v:1:y:2024:i:1:p:219-228:id:228
Journal of Artificial Intelligence General science (JAIGS) ISSN:3006-4023 is currently edited by Justyna Żywiołek
More articles in Journal of Artificial Intelligence General science (JAIGS) ISSN:3006-4023 from Open Knowledge
Bibliographic data for series maintained by Open Knowledge.