
Optimization of Scalable Machine Learning Pipelines for Big Data Analytics in Distributed Systems

Tian Qi

Artificial Intelligence and Digital Technology, 2023, vol. 1, issue 1, 24-35

Abstract: This paper proposes an optimization approach for machine learning pipelines in distributed systems aimed at improving scalability and performance for big data analytics. The approach addresses key challenges such as data partitioning, load balancing, resource management, and fault tolerance. Experimental results demonstrate significant improvements in throughput, latency, scalability, and resource utilization, with up to a 43% increase in throughput and a 35% reduction in resource consumption. The optimized pipeline not only performs better under increasing dataset sizes and node counts but also exhibits enhanced fault tolerance and cost efficiency. This study contributes to advancing the efficiency and effectiveness of machine learning pipelines in distributed environments, offering valuable insights for large-scale data processing and analysis.

Keywords: machine learning; distributed systems; big data; scalability; optimization; performance
Date: 2023

Downloads: (external link)
https://soapubs.com/index.php/ICSS/article/view/207/217 (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:axf:icssaa:v:1:y:2023:i:1:p:24-35

More articles in Artificial Intelligence and Digital Technology from Scientific Open Access Publishing
Bibliographic data for series maintained by Yuchi Liu.

 
Page updated 2025-04-15
Handle: RePEc:axf:icssaa:v:1:y:2023:i:1:p:24-35