ECE PhD Proposal Review: Maher Kachmar
October 15, 2020 @ 4:00 pm - 5:00 pm
PhD Proposal Review: Active Resource Partitioning and Planning for Storage Systems using Time Series Forecasting and Machine Learning Techniques
Maher Kachmar
Location: Zoom Link
Abstract: In today’s enterprise storage systems, supported data services such as snapshot delete or drive rebuild can incur tremendous performance overhead if executed inline alongside heavy foreground IO, often leading to missed Service Level Objectives (SLOs). Moreover, static partitioning of storage system resources such as CPU cores or memory caches may lead to missed Service Level Agreement (SLA) targets such as data reduction rate (DRR). However, typical storage system applications such as Virtual Desktop Infrastructure (VDI) or web services follow a repetitive workload pattern that can be learned and/or forecasted. Learning these workload patterns allows us to address several storage system resource partitioning and planning challenges that may not be overcome with traditional manual tuning and primitive feedback mechanisms.
We propose a priority-based background scheduler that learns repetitive storage system workload patterns and allows storage systems to maintain peak performance and meet SLOs while supporting a number of data services. When foreground IO demand intensifies, system resources are dedicated to servicing foreground IO requests, and any background processing that can be deferred is recorded to be processed in future idle cycles, as long as our forecaster predicts that the storage pool has remaining capacity. The smart background scheduler adopts a resource partitioning model that allows foreground and background IO to execute together as long as foreground IO is not impacted, harnessing any free cycles to clear background debt. Using traces from VDI and web services applications, we show that our technique outperforms a static method that sets fixed limits on the deferred background debt, reducing SLO violations from 54.6% (with a fixed background debt watermark) to only 6.2% when the watermark is dynamically adjusted by our smart background scheduler.
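The deferral behavior described above can be illustrated with a minimal sketch. It is not the proposal's implementation: the class and function names, the seasonal-naive forecaster, and the rule for deriving the debt limit from forecast idle cycles are all assumptions chosen for illustration, and work is measured in arbitrary units per interval.

```python
from dataclasses import dataclass


@dataclass
class BackgroundScheduler:
    """Hypothetical model of a debt-deferring background scheduler."""
    capacity: float      # work units the system can complete per interval
    debt: float = 0.0    # deferred background work not yet processed

    def step(self, foreground: float, background: float, debt_limit: float):
        """Run one interval; return (background_done, foreground_impacted)."""
        self.debt += background
        # Foreground IO is served first; only leftover cycles clear debt.
        free = max(self.capacity - foreground, 0.0)
        done = min(free, self.debt)
        self.debt -= done
        # Debt above the limit cannot be deferred and must run inline,
        # which competes with foreground IO (modeled as an SLO impact).
        forced = max(self.debt - debt_limit, 0.0)
        self.debt -= forced
        return done + forced, forced > 0.0


def seasonal_naive(history, period, horizon):
    """Assumed forecaster: repeat the last full period of observed demand."""
    last = history[-period:]
    return [last[i % period] for i in range(horizon)]


def dynamic_debt_limit(forecast, capacity, pool_headroom):
    """Assumed rule: allow as much debt as forecast idle cycles could clear,
    capped by the space the storage pool has left to hold deferred work."""
    idle = sum(max(capacity - f, 0.0) for f in forecast)
    return min(idle, pool_headroom)
```

A static watermark corresponds to passing the same `debt_limit` on every call to `step`, while a dynamic one recomputes it from the forecast each interval. The sketch makes no attempt to reproduce the reported violation rates.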
This thesis also proposes a smart capacity planning and recommendation tool that ensures the right number of drives is available in the storage pool to meet both capacity and performance constraints without over-provisioning storage. Aided by forecasting models that characterize workload patterns, we can predict future storage pool utilization and drive over-wearing. Similarly, to meet SLOs, the tool recommends expanding pool space in order to defer more background work through larger debt bins. We also propose a content-aware learning cache (CALC) that uses machine learning techniques to actively partition the storage system cache among a deduplication data digest cache, a content cache, and an address-based data cache to improve cache hit performance while maximizing data reduction rate (DRR).
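As a rough illustration of the capacity planning idea, the sketch below forecasts pool usage with a straight-line fit and converts the prediction into a drive recommendation. The linear model, the function names, and the 80% target utilization are assumptions made for this example. It covers only the capacity constraint, not the performance or drive-wear aspects mentioned above.

```python
import math


def linear_forecast(used_tb, steps_ahead):
    """Least-squares line through past usage, extrapolated steps_ahead
    intervals past the last sample. Needs at least two samples."""
    n = len(used_tb)
    mean_x = (n - 1) / 2
    mean_y = sum(used_tb) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(used_tb))
    slope = sxy / sxx
    return mean_y + slope * (n - 1 + steps_ahead - mean_x)


def drives_to_add(used_tb, steps_ahead, drive_tb, current_drives,
                  target_util=0.8):
    """Drives to add so predicted usage stays under target_util of raw
    capacity (hypothetical policy; returns 0 if the pool already suffices)."""
    predicted = linear_forecast(used_tb, steps_ahead)
    needed = math.ceil(predicted / (drive_tb * target_util))
    return max(needed - current_drives, 0)
```

For example, with usage samples of 10, 12, 14, and 16 TB, the fit predicts 24 TB four intervals ahead; with 4 TB drives at 80% target utilization, that calls for 8 drives in total.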