ECE PhD Proposal Review: Maher Kachmar

October 15, 2020 @ 4:00 pm - 5:00 pm

PhD Proposal Review: Active Resource Partitioning and Planning for Storage Systems using Time Series Forecasting and Machine Learning Techniques

Maher Kachmar

Location: Zoom Link

Abstract: In today’s enterprise storage systems, supported data services such as snapshot delete or drive rebuild can impose substantial performance overhead if executed inline alongside heavy foreground IO, often causing missed Service Level Objectives (SLOs). Moreover, static partitioning of storage system resources such as CPU cores or memory caches may lead to missed Service Level Agreements (SLAs), for example on data reduction rate (DRR). However, typical storage system applications such as Virtual Desktop Infrastructure (VDI) or web services follow repetitive workload patterns that can be learned and/or forecasted. Learning these workload patterns allows us to address several storage system resource partitioning and planning challenges that may not be overcome with traditional manual tuning and primitive feedback mechanisms.

We propose a priority-based background scheduler that learns the repetitive workload patterns of a storage system and allows it to maintain peak performance and meet SLOs while supporting a number of data services. When foreground IO demand intensifies, system resources are dedicated to servicing foreground IO requests, and any background processing that can be deferred is recorded to be processed in future idle cycles, as long as our forecaster predicts that the storage pool has remaining capacity. The smart background scheduler adopts a resource partitioning model that allows foreground and background IO to execute together as long as foreground IO is not impacted, harnessing any free cycles to clear background debt. Using traces from VDI and web services applications, we show that our technique outperforms a static method that sets fixed limits on the deferred background debt: SLO violations fall from 54.6% with a fixed background debt watermark to only 6.2% when the watermark is dynamically adjusted by our smart background scheduler.
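The deferral idea described above can be illustrated with a small sketch. This is not the author's scheduler: the names `SchedulerState`, `dynamic_debt_limit`, and `schedule_interval`, the abstract "work units", and the rule for deriving the debt limit from a forecast are all assumptions made for illustration. The sketch serves foreground IO first, uses leftover cycles to clear background debt, and forces background work inline only when the debt exceeds its limit.

```python
from dataclasses import dataclass


@dataclass
class SchedulerState:
    # Deferred background work ("debt"), in hypothetical work units.
    debt: float = 0.0


def dynamic_debt_limit(forecast_idle_capacity, pool_headroom):
    """Assumed rule: debt may grow only as far as the forecast idle cycles
    can later absorb, and never beyond the pool's remaining headroom."""
    return min(sum(forecast_idle_capacity), pool_headroom)


def schedule_interval(state, fg_demand, bg_arrivals, capacity, debt_limit):
    """Run one scheduling interval and return (fg_served, bg_served)."""
    pending = state.debt + bg_arrivals
    # Foreground IO has priority, up to the system's capacity.
    fg_served = min(fg_demand, capacity)
    # Any free cycles are harnessed to clear background debt.
    bg_served = min(pending, capacity - fg_served)
    remaining = pending - bg_served
    # Debt above the limit can no longer be deferred: it runs inline
    # and takes cycles away from foreground IO.
    forced = min(max(0.0, remaining - debt_limit), fg_served)
    fg_served -= forced
    bg_served += forced
    state.debt = remaining - forced
    return fg_served, bg_served
```

In this toy model, a fixed watermark corresponds to passing a constant `debt_limit`, while the dynamic variant recomputes it each interval from the forecast. A larger limit lets a burst of foreground IO pass without forced background work, at the cost of more debt to clear later.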

This thesis also proposes a smart capacity planning and recommendation tool that ensures the right number of drives is available in the storage pool to meet both capacity and performance constraints without over-provisioning storage. Aided by forecasting models that characterize workload patterns, we can predict future storage pool utilization and drive over-wearing. Similarly, to meet SLOs, the tool recommends expanding pool space in order to defer more background work through larger debt bins. We also propose a content-aware learning cache (CALC) that uses machine learning techniques to actively partition the storage system cache between a deduplication data digest cache, a content cache, and an address-based data cache to improve cache hit performance while maximizing data reduction rate (DRR).
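As a rough illustration of the capacity planning idea, the sketch below sizes a pool from forecast values so that both constraints hold. The function name `drives_needed`, the per-drive capacity and IOPS figures, and the sizing rule (take the peak of each forecast and the larger of the two drive counts) are assumptions for illustration only; the proposal does not specify how its tool computes a recommendation.

```python
import math


def drives_needed(forecast_capacity_tb, forecast_iops,
                  drive_capacity_tb, drive_iops):
    """Smallest drive count that covers the forecast peak in both
    capacity and performance (hypothetical sizing rule)."""
    # Drives required to hold the largest forecast pool utilization.
    by_capacity = math.ceil(max(forecast_capacity_tb) / drive_capacity_tb)
    # Drives required to sustain the largest forecast IO rate.
    by_performance = math.ceil(max(forecast_iops) / drive_iops)
    # The binding constraint decides; anything more is over-provisioning.
    return max(by_capacity, by_performance)


# Hypothetical forecasts: utilization in TB and load in IOPS.
recommended = drives_needed([40, 55, 61], [90_000, 120_000],
                            drive_capacity_tb=8, drive_iops=50_000)
```

Here capacity is the binding constraint: 61 TB on 8 TB drives needs 8 drives, while the IOPS peak needs only 3.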

Details

Date: October 15, 2020
Time: 4:00 pm - 5:00 pm
Website: https://northeastern.zoom.us/j/94544080683

Other

Department: Electrical and Computer Engineering
Topics: MS/PhD Thesis Defense