
Jinkun Zhang PhD Proposal Review

August 11, 2023 @ 2:00 pm - 3:00 pm

Location: ISEC 305

Title: Low-latency forwarding, caching and computation placement in data-centric networks

Committee Members:
Prof. Edmund Yeh (Advisor)
Prof. Stratis Ioannidis
Prof. Kaushik Chowdhury

Abstract:
With the exponential growth of data- and computation-intensive network applications, such as real-time augmented reality/virtual reality rendering and large-scale language model training, traditional cloud computing frameworks exhibit inherent limitations. These limitations include significant round-trip delays caused by backhaul network capacity bottlenecks and exorbitant costs associated with centralized computing power, e.g., training GPT-4 requires over 16,000 A100 GPUs.
To address these challenges, dispersed computing has emerged as a promising next-generation networking paradigm. By enabling geographically distributed nodes with heterogeneous computation capabilities to collaborate, dispersed computing overcomes the bottlenecks of traditional cloud computing and facilitates in-network computation tasks, including the training of large models.

Furthermore, in data-centric networking, communication and computation revolve around data names rather than host addresses.
The deployment of network caches, by enabling data reuse, offers substantial benefits for data-centric networks.
For instance, consider a scenario where multiple machine learning applications train different models simultaneously. These applications could (partially) share data samples and intermediate results, making carefully designed data-reuse mechanisms necessary. Optimal caching of data or intermediate results can significantly reduce the overall training cost compared to each application independently gathering and transmitting data.
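As a toy illustration of this data-reuse argument, the sketch below compares total cost with and without a shared cache of an intermediate result; the cost numbers and the simple cache model are hypothetical, not taken from the proposal:

    # Toy illustration (hypothetical costs): several training applications
    # share one intermediate result. Without caching, each application
    # fetches raw data and recomputes the intermediate; with a cache, it
    # is computed once and then served cheaply to the others.

    FETCH_COST = 10.0      # pull raw data samples across the backhaul
    COMPUTE_COST = 5.0     # derive the shared intermediate result
    CACHE_HIT_COST = 1.0   # serve the intermediate from a nearby cache

    def total_cost(num_apps: int, cached: bool) -> float:
        if not cached:
            # Every application independently gathers data and recomputes.
            return num_apps * (FETCH_COST + COMPUTE_COST)
        # One full fetch-and-compute, then cheap cache hits for the rest.
        return (FETCH_COST + COMPUTE_COST) + (num_apps - 1) * CACHE_HIT_COST

    for n in (2, 5, 10):
        print(n, total_cost(n, cached=False), total_cost(n, cached=True))

The gap between the two columns grows with the number of applications sharing the result, which is the intuition behind caching intermediate results rather than raw data alone.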

To efficiently manage computation and storage resources in heterogeneous data-centric networks, several frameworks have been proposed with different design objectives, such as optimizing throughput or incorporating multicast flows. However, these approaches do not minimize average user delay, despite the latency sensitivity of many real-world applications.

This proposal aims to address this gap by introducing a low-latency framework that jointly optimizes packet forwarding, storage deployment, and computation placement. The proposed framework effectively supports data-intensive and latency-sensitive computation applications in data-centric networks with heterogeneous communication, storage, and computation capabilities.

Specifically, to minimize user latency in congestible networks, we model the delays caused by link transmissions and CPU computations using traffic-dependent nonlinear functions. We formulate the joint forwarding, caching, and computation problem as an NP-hard mixed-integer non-submodular optimization, for which no constant-factor approximation algorithms are currently known. To make progress, we divide the joint problem into two subproblems: joint forwarding/computation and joint forwarding/caching. Despite the non-convexity of the former subproblem, we provide a set of sufficient optimality conditions that lead to a distributed algorithm with polynomial-time convergence to the global optimum. For the latter subproblem, we demonstrate its NP-hardness and non-submodularity, even after continuous relaxation. We show that the objective function is the sum of a convex function and a geodesically convex function, and propose a set of conditions that yield a finite bound on the gap from the optimum. To the best of our knowledge, our method represents the first analytical progress on the joint caching and forwarding problem with arbitrary topology and nonlinear costs. Furthermore, our theoretical bound yields a constant-factor approximation under additional assumptions.
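For concreteness, one common way to instantiate such traffic-dependent nonlinear delay functions is the M/M/1-style queueing delay; this particular form, and the symbols F_{ij}, C_{ij}, G_i, K_i below, are an illustrative assumption rather than the proposal's committed model:

    % Illustrative (assumed) delay model: M/M/1-style queueing delays.
    % F_{ij}: flow on link (i,j); C_{ij}: capacity of link (i,j);
    % G_i: computation workload at node i; K_i: CPU capacity at node i.
    D_{ij}(F_{ij}) = \frac{F_{ij}}{C_{ij} - F_{ij}}, \qquad 0 \le F_{ij} < C_{ij},
    D_{\mathrm{total}} = \sum_{(i,j)} D_{ij}(F_{ij}) + \sum_{i} \frac{G_i}{K_i - G_i}.

Each term is increasing and convex in its load and grows without bound as traffic approaches capacity, which is what makes the joint problem congestion-aware and inherently nonlinear.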

As future work, we propose to develop a novel in-network large model training framework, building upon the aforementioned method.
Due to the substantial model size and the extensive data samples required for training, centralized model storage and training are largely infeasible for small and mid-sized service providers.
Consequently, we will adopt horizontal model partitioning and distribute different model layers across the network nodes through caching.
Data samples or batches are input into the network and undergo the forward-backward procedure for training. Our objective is to jointly optimize data forwarding and model/computation placement, thereby minimizing the total cost of transmission, computation, and storage.
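A minimal sketch of the kind of placement decision this objective induces is given below, assuming a hypothetical three-node line network with made-up computation, storage, and link costs; brute-force search here stands in for the proposed optimization:

    # Minimal sketch (hypothetical costs/topology): place model layers on a
    # line of network nodes to minimize transmission + computation + storage
    # cost. Each layer is assigned to exactly one node; activations cross a
    # link whenever consecutive layers sit on different nodes.
    from itertools import product

    NODES = [0, 1, 2]
    COMPUTE = {0: 4.0, 1: 2.0, 2: 1.0}   # per-layer computation cost per node
    STORAGE = {0: 1.0, 1: 1.0, 2: 3.0}   # per-layer storage cost per node
    LINK = 2.5                           # cost per link hop for activations
    NUM_LAYERS = 4

    def cost(placement):
        c = sum(COMPUTE[n] + STORAGE[n] for n in placement)
        # Activations traverse one link per hop between consecutive layers.
        c += LINK * sum(abs(a - b) for a, b in zip(placement, placement[1:]))
        return c

    best = min(product(NODES, repeat=NUM_LAYERS), key=cost)
    print("best placement:", best, "cost:", cost(best))

The exhaustive search is exponential in the number of layers; the point of the proposed framework is precisely to replace it with a principled joint optimization over forwarding and placement.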

Finally, we introduce several additional network resource allocation problems related to data-centric networking, thereby expanding the scope of our proposal.

Details

Date:
August 11, 2023
Time:
2:00 pm - 3:00 pm
Website:
https://northeastern.zoom.us/j/92849848970

Other

Department
Electrical and Computer Engineering
Topics
MS/PhD Thesis Defense
Audience
MS, PhD