SSD Data Tiering
Dragonfly v1.21.0 introduces a powerful new feature: SSD data tiering. With it, Dragonfly can leverage SSD/NVMe devices as a secondary storage tier that complements RAM. By intelligently offloading specific data to fast disk storage, Dragonfly can significantly reduce physical memory usage, potentially by a factor of 2x-5x, while maintaining sub-millisecond average latency.
How It Works
Dragonfly's data tiering focuses on string values exceeding 64 characters in size. When enabled, these longer strings are offloaded to the SSD tier. Shorter strings and other data types, along with the primary hashtable index, remain in high-speed memory for rapid lookup. This tiered approach maintains high performance while reducing memory consumption. When accessed, offloaded data is seamlessly retrieved from the SSD and brought back into memory. Write, delete, and expire operations are managed entirely in memory, so operations on offloaded keys remain efficient.
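For example, with tiering enabled, a string value longer than 64 characters becomes a candidate for offloading, while a short value stays in RAM. The snippet below is a minimal illustration; the key names are arbitrary, and offloading happens asynchronously, so the counter may lag behind the writes:

```bash
# A 128-character value: eligible for offloading to the SSD tier.
redis-cli set big_key "$(head -c 128 /dev/zero | tr '\0' 'x')"

# A short value: stays in RAM together with the hashtable index.
redis-cli set small_key "hello"

# The number of currently offloaded values is reported by the tiered stats.
redis-cli info tiered | grep tiered_entries:
```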
Enabling Data tiering
The feature can be enabled by passing the `--tiered_prefix <nvme_path>/<basename>` flag. Dragonfly will automatically check the free disk space on the partition hosting `<nvme_path>` and deduce the maximum capacity it can use. To explicitly set the maximum disk space capacity for data tiering, use `--tiered_max_file_size=<size>`, for example `--tiered_max_file_size=96G`.
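As an illustration, assuming a local NVMe disk mounted at `/mnt/nvme0` (a hypothetical path), the server could be started like this:

```bash
# Start Dragonfly with SSD data tiering enabled.
# /mnt/nvme0 is a hypothetical mount point of a locally attached NVMe disk;
# offloaded values are stored in files whose names start with the given prefix.
dragonfly --tiered_prefix=/mnt/nvme0/dfly_tiered \
          --tiered_max_file_size=96G   # optional: cap on-disk usage explicitly
```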
Checking Data tiering metrics
Dragonfly provides detailed metrics to help you monitor and analyze data tiering performance.
When running `redis-cli info tiered`, you get the following metrics:
- `tiered_entries`: how many values are offloaded.
- `tiered_entries_bytes`: how much data was offloaded, in bytes.
- `tiered_total_stashes`: how many offload requests were issued.
- `tiered_total_fetches`: how many times offloaded items were read from disk.
- `tiered_total_deletes`: how many times offloaded items were deleted from disk.
- `tiered_total_uploads`: how many times offloaded items were promoted back to RAM.
- `tiered_allocated_bytes`: how much disk space was used by data tiering.
- `tiered_capacity_bytes`: the maximum size of the tiered on-disk capacity.
- `tiered_pending_read_cnt`: currently pending I/O read requests. A high number indicates that the server is bottlenecked on disk read I/O.
- `tiered_pending_stash_cnt`: currently pending I/O write requests. A high number indicates that the server is bottlenecked on disk write I/O.
- `tiered_cold_storage_bytes`: cooling queue capacity in bytes.
- `tiered_ram_hits`: how many times an entry lookup resulted in an in-memory hit.
- `tiered_ram_misses`: how many times an entry lookup resulted in a disk read.
- `tiered_ram_cool_hits`: how many times an entry lookup resulted in a cooling buffer hit.
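As a rough illustration of how these counters can be used (assuming a local instance on the default port), the in-memory hit ratio can be derived from `tiered_ram_hits` and `tiered_ram_misses`:

```bash
# Snapshot the tiering counters most useful for day-to-day monitoring.
redis-cli info tiered | grep -E 'tiered_(entries|allocated_bytes|ram_hits|ram_misses)'

# RAM hit ratio = tiered_ram_hits / (tiered_ram_hits + tiered_ram_misses)
hits=$(redis-cli info tiered | awk -F: '/^tiered_ram_hits:/ {print $2+0}')
misses=$(redis-cli info tiered | awk -F: '/^tiered_ram_misses:/ {print $2+0}')
awk -v h="$hits" -v m="$misses" 'BEGIN {printf "RAM hit ratio: %.2f\n", (h+m) ? h/(h+m) : 0}'
```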
Performance
Performance benchmarks against Elasticache and Memcached, conducted on AWS instances, demonstrate Dragonfly's advantage. We ran load tests on an r6gd.xlarge instance on AWS and compared Dragonfly against datastores with similar features: an Elasticache Data Tiering cache.r6gd.xlarge instance and a self-hosted Memcached/ExtStore server, also running on r6gd.xlarge. The test consisted of writing a 90GB dataset into each store and then reading it back randomly with a uniform key distribution.
During the write phase with 200K RPS, Memcached had to drop ~18% of the workload in order to cope with memory pressure. Elasticache, on the other hand, throttled down the traffic to 66.5K RPS. Dragonfly handled 200K RPS with 8ms P99 latency and with enough RAM reserves to digest even more data.
During the read phase, Dragonfly was the only datastore that could actually saturate the local SSD's IOPS, reaching 60K RPS for reads. Elasticache reached 23.5K RPS and Memcached only 16K RPS. At 60K RPS, Dragonfly exhibited 5ms P99 latency, while both Memcached and Elasticache reached 190ms P99 latency.
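The exact load-generation harness is not described above; as a rough sketch, a comparable write-then-read workload could be produced with a tool such as memtier_benchmark. The host, value size, key range, and thread counts below are illustrative assumptions, not the parameters of the test itself:

```bash
# Write phase: populate the store, each client writing its own key range.
memtier_benchmark -s <dragonfly_host> -p 6379 \
  --ratio=1:0 --key-pattern=P:P --key-maximum=90000000 \
  -d 1024 -t 8 -c 25 --requests=allkeys

# Read phase: random reads with a uniform key distribution.
memtier_benchmark -s <dragonfly_host> -p 6379 \
  --ratio=0:1 --key-pattern=R:R --key-maximum=90000000 \
  -t 8 -c 25 --test-time=300 --distinct-client-seed
```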
System requirements
Dragonfly Data Tiering requires Linux kernel version 5.19 or higher with the io_uring API enabled; io_uring is a critical requirement for SSD data tiering. Additionally, `<nvme_path>` must point to a relatively fast SSD disk. For cloud workloads, using instances with locally attached SSDs is recommended. See below for specific cloud provider suggestions.
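A quick way to sanity-check the kernel requirement on a host is shown below. This is a minimal sketch: the version comparison assumes a standard `uname -r` format, and the `kernel.io_uring_disabled` sysctl only exists on newer kernels (it is absent on 5.19, where io_uring is enabled by default unless blocked by seccomp or container policies):

```bash
# Check that the kernel is 5.19 or newer.
required="5.19"
current="$(uname -r | cut -d- -f1)"
if [ "$(printf '%s\n%s\n' "$required" "$current" | sort -V | head -n1)" = "$required" ]; then
  echo "Kernel $current satisfies the 5.19+ requirement"
else
  echo "Kernel $current is older than $required; data tiering will not work"
fi

# On kernels that expose this sysctl, a value of 0 means io_uring is available.
sysctl kernel.io_uring_disabled 2>/dev/null || echo "kernel.io_uring_disabled sysctl not present"
```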
GCP
GCP offers local SSD hardware with excellent performance characteristics. Any instance with local SSD storage will suffice.
AWS
AWS provides a variety of instance families with local SSD disks attached; the detailed performance characteristics of the local SSDs for each instance type are documented by AWS. We recommend choosing an instance type from the r6gd or m6gd families.
Notes
Data tiering is currently in alpha, which means it's under development and may have limited functionality or stability. If you encounter any issues while using data tiering, please report them by filing an issue.
Limitations:
- Data tiering does not currently support BITOP and HLL operations.