Offloading Algorithms for Maximizing Inference Accuracy on Edge Device in an Edge Intelligence System

Fresa, Andrea; Champati, Jaya Prakash

doi:10.1109/TPDS.2023.3267458

dc.contributor.author	Fresa, Andrea
dc.contributor.author	Champati, Jaya Prakash
dc.date.accessioned	2023-07-19T10:00:01Z
dc.date.available	2023-07-19T10:00:01Z
dc.date.issued	2023-04-19
dc.identifier.issn	1045-9219	es
dc.identifier.uri	https://hdl.handle.net/20.500.12761/1731
dc.description.abstract	With the emergence of edge computing, the problem of offloading jobs between an Edge Device (ED) and an Edge Server (ES) has received significant attention. Motivated by the fact that an increasing number of applications use Machine Learning (ML) inference on the data samples collected at the EDs, we study the problem of offloading inference jobs by considering the following novel aspects: 1) in contrast to a typical computational job, the processing time of an inference job depends on the size of the ML model, and 2) recently proposed Deep Neural Networks (DNNs) for resource-constrained devices provide the choice of scaling down the model size by trading off the inference accuracy. Considering that multiple ML models are available at the ED, and a powerful ML model is available at the ES, we formulate an Integer Linear Programming (ILP) problem with the objective of maximizing the total inference accuracy of n data samples at the ED subject to a time constraint T on the makespan. Noting that the problem is NP-hard, we propose an approximation algorithm, Accuracy Maximization using LP-Relaxation and Rounding (AMR²), and prove that it results in a makespan of at most 2T and achieves a total accuracy that is lower than the optimal total accuracy by a small constant, implying that AMR² is asymptotically optimal. Further, if the data samples are identical, we propose Accuracy Maximization using Dynamic Programming (AMDP), an optimal pseudo-polynomial time algorithm. Furthermore, we extend AMR² to the case of multiple ESs, where each ES is equipped with a powerful ML model. As a proof of concept, we implemented AMR² on a Raspberry Pi, equipped with MobileNets, that is connected to a server equipped with ResNet, and studied the total accuracy and makespan performance of AMR² for image classification.	es
dc.description.sponsorship	Jaya Prakash Champati	es
dc.language.iso	eng	es
dc.publisher	IEEE	es
dc.title	Offloading Algorithms for Maximizing Inference Accuracy on Edge Device in an Edge Intelligence System	es
dc.type	journal article	es
dc.journal.title	IEEE Transactions on Parallel and Distributed Systems	es
dc.type.hasVersion	VoR	es
dc.rights.accessRights	open access	es
dc.volume.number	34	es
dc.issue.number	7	es
dc.identifier.doi	10.1109/TPDS.2023.3267458	es
dc.page.final	2039	es
dc.page.initial	2025	es
dc.subject.keyword	Edge Intelligence	es
dc.subject.keyword	Edge Computing	es
dc.subject.keyword	IoT	es
dc.subject.keyword	Data models	es
dc.subject.keyword	Computational modeling	es
dc.subject.keyword	Inference algorithms	es
dc.subject.keyword	Costs	es
dc.subject.keyword	Servers	es
dc.subject.keyword	Approximation algorithms	es
dc.subject.keyword	Scheduling	es
dc.subject.keyword	computational complexity	es
dc.subject.keyword	deep learning (artificial intelligence)	es
dc.subject.keyword	dynamic programming	es
dc.subject.keyword	edge computing	es
dc.subject.keyword	inference mechanisms	es
dc.subject.keyword	integer programming	es
dc.subject.keyword	linear programming	es
dc.subject.keyword	resource allocation	es
dc.description.refereed	TRUE	es
dc.description.status	pub	es
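
The identical-samples case mentioned in the abstract can be illustrated with a small dynamic program. The sketch below is not the authors' AMDP algorithm; it is a minimal illustration under assumptions that the record does not state: processing times are positive integers, the ED runs its samples one after another, each offloaded sample costs a fixed `es_time` (communication plus server processing), the ED and ES work in parallel, and the makespan is the larger of the two totals. The function name `max_total_accuracy` and all parameter names are hypothetical.

```python
from typing import List, Optional, Tuple

NEG_INF = float("-inf")


def max_total_accuracy(
    n: int,
    T: int,
    ed_models: List[Tuple[int, float]],
    es_time: int,
    es_accuracy: float,
) -> Optional[Tuple[float, int]]:
    """Return (best total accuracy, number of offloaded samples), or None.

    Assumptions (illustrative, not from the paper):
      - ed_models holds (time, accuracy) per sample for each ED model,
        with positive integer times;
      - each offloaded sample takes es_time on the ES path;
      - ED and ES run in parallel, so both totals must be <= T.
    """
    # best[j][t]: max total accuracy when exactly j samples run on the ED
    # within at most t time units (NEG_INF if that is impossible).
    best = [[NEG_INF] * (T + 1) for _ in range(n + 1)]
    for t in range(T + 1):
        best[0][t] = 0.0
    for j in range(1, n + 1):
        for t in range(T + 1):
            for p, a in ed_models:
                if p <= t and best[j - 1][t - p] > NEG_INF:
                    best[j][t] = max(best[j][t], best[j - 1][t - p] + a)

    answer: Optional[Tuple[float, int]] = None
    for k in range(n + 1):  # k = number of samples sent to the ES
        if k * es_time > T:
            break  # the ES path alone would exceed the time constraint
        local = best[n - k][T]
        if local == NEG_INF:
            continue  # the remaining samples do not fit on the ED
        total = local + k * es_accuracy
        if answer is None or total > answer[0]:
            answer = (total, k)
    return answer


if __name__ == "__main__":
    # Hypothetical numbers: two ED models and one ES model.
    print(max_total_accuracy(3, 4, [(1, 0.5), (2, 0.7)], 2, 0.9))
```

The table has (n + 1)(T + 1) entries and each entry scans the ED models once, so the running time grows with the numeric value of T rather than with its bit length, which is what makes an approach of this kind pseudo-polynomial.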

Files in this item

Name: AMR2_2-4.pdf
Size: 1.137 MB
Format: PDF

This item appears in the following Collection(s)

IMDEA Networks
