SwiftKV: A Metadata Indexing Scheme Integrating LSM-Tree and Learned Index for Distributed KV Stores

Wang, Zhenfei; Feng, Jianxun; Dun, Longxiang; Bao, Ziliang; Du, Chunfeng

Zhenfei Wang, Jianxun Feng, Longxiang Dun, Ziliang Bao and Chunfeng Du
Additional contact information
Zhenfei Wang: School of Computer Science and Artificial Intelligence, Zhengzhou University, Zhengzhou 450001, China
Jianxun Feng: School of Computer Science and Artificial Intelligence, Zhengzhou University, Zhengzhou 450001, China
Longxiang Dun: School of Computer Science and Artificial Intelligence, Zhengzhou University, Zhengzhou 450001, China
Ziliang Bao: School of Computer Science and Artificial Intelligence, Zhengzhou University, Zhengzhou 450001, China
Chunfeng Du: School of Computer Science and Artificial Intelligence, Zhengzhou University, Zhengzhou 450001, China

Future Internet, 2025, vol. 17, issue 9, 1-21

Abstract: Optimizing metadata indexing remains critical for enhancing distributed file system performance. The traditional Log-Structured Merge-Tree (LSM-Tree) architecture, while effective for write-intensive operations, exhibits significant limitations when handling massive metadata workloads, most notably suboptimal read performance and substantial indexing overhead. Existing learned indexes perform well on read-only workloads but struggle to support modifications such as inserts and updates. This paper proposes SwiftKV, a novel metadata indexing scheme that combines the LSM-Tree with learned indexes to address these issues. First, SwiftKV employs a dynamic partition strategy to narrow the metadata search range. Second, a two-level learned index block, consisting of Greedy Piecewise Linear Regression (Greedy-PLR) and Linear Regression (LR) models, replaces the typical Sorted String Table (SSTable) index block and predicts locations faster than binary search. Third, SwiftKV incorporates a load-aware construction mechanism and parallel optimization to minimize training overhead and improve efficiency. This work bridges the gap between the write efficiency of LSM-Trees and the query performance of learned indexes, offering a scalable, high-performance solution for modern distributed file systems. A prototype of SwiftKV is implemented on top of RocksDB. The experimental results show that it reduces the memory usage of index blocks by 30.06% and reduces read latency by 1.19× to 1.60× without affecting write performance. Furthermore, SwiftKV’s two-level learned index achieves a 15.13% reduction in query latency and a 44.03% reduction in memory overhead compared to a single-level model. Across all YCSB workloads, SwiftKV outperforms the other schemes.
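
Illustrative sketch (not from the paper): the abstract describes a two-level learned index that predicts a key's location instead of binary-searching an SSTable index block. The Python below is a minimal sketch of that general idea under stated assumptions, not SwiftKV's implementation. It assumes strictly increasing numeric keys, uses a simplified shrinking-cone greedy segmentation as a stand-in for Greedy-PLR, and assumes the LR model sits on top to pick the segment; which model sits at which level in SwiftKV is not stated in the abstract. All function names and the error bound delta are hypothetical.

```python
import math
from bisect import bisect_left, bisect_right


def build_segments(keys, delta):
    """Greedy shrinking-cone segmentation (simplified stand-in for Greedy-PLR).

    Each segment (start_key, start_pos, slope) predicts the position of every
    key it covers to within +/- delta. Keys must be strictly increasing.
    """
    segments, i, n = [], 0, len(keys)
    while i < n:
        k0, p0 = keys[i], i
        lo, hi = 0.0, float("inf")          # feasible slope range (the "cone")
        j = i + 1
        while j < n:
            dx = keys[j] - k0
            new_lo = max(lo, (j - p0 - delta) / dx)
            new_hi = min(hi, (j - p0 + delta) / dx)
            if new_lo > new_hi:             # cone is empty: close the segment
                break
            lo, hi = new_lo, new_hi
            j += 1
        slope = 0.0 if hi == float("inf") else (lo + hi) / 2
        segments.append((k0, p0, slope))
        i = j
    return segments


def fit_top(starts):
    """Least-squares line mapping a segment's start key to its index,
    plus the maximum absolute error observed on the start keys."""
    n = len(starts)
    mx, my = sum(starts) / n, (n - 1) / 2
    var = sum((x - mx) ** 2 for x in starts)
    a = sum((x - mx) * (i - my) for i, x in enumerate(starts)) / var if var else 0.0
    b = my - a * mx
    err = max(abs(a * x + b - i) for i, x in enumerate(starts))
    return a, b, err


def build_index(keys, delta=8):
    """Assumed layout: (segment start keys, segments, top-level line, delta)."""
    segments = build_segments(keys, delta)
    starts = [s[0] for s in segments]
    return starts, segments, fit_top(starts), delta


def lookup(index, keys, key):
    """Return the position of key in keys, or None if it is absent."""
    starts, segments, (a, b, err), delta = index
    m = len(starts)
    # Level 1: the line predicts the segment; search only a small window.
    f = a * key + b
    lo = max(0, min(m - 1, math.floor(f - err) - 1))
    hi = max(lo, min(m, math.ceil(f + err) + 2))
    s = bisect_right(starts, key, lo, hi) - 1
    if s < 0:
        return None
    # Level 2: the segment predicts the position to within +/- delta.
    k0, p0, slope = segments[s]
    pred = p0 + slope * (key - k0)
    lo = max(0, math.floor(pred - delta) - 1)
    hi = min(len(keys), math.ceil(pred + delta) + 2)
    pos = bisect_left(keys, key, lo, hi)
    return pos if pos < len(keys) and keys[pos] == key else None
```

In this sketch the index stores one tuple per segment rather than one entry per key, which is the kind of saving a learned index block aims for; the final bounded search touches at most a few more than 2 × delta entries. A real index block would also need to handle byte-string keys, duplicates, and serialization, none of which is attempted here.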

Keywords: metadata indexing; KV storage; LSM-Tree; dynamic partitioning; learned index
JEL-codes: O3
Date: 2025

Downloads:
https://www.mdpi.com/1999-5903/17/9/398/pdf (application/pdf)
https://www.mdpi.com/1999-5903/17/9/398/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jftint:v:17:y:2025:i:9:p:398-:d:1738320

Future Internet is currently edited by Ms. Grace You
