Professor Daniela Tuninetti receives grant to develop framework for distributed computation
Professor Tuninetti receives grant to develop framework for distributed computation
Daniela Tuninetti, professor and interim department head in University of Illinois Chicago’s Electrical and Computer Engineering Department, has received a three-year, $475,000 grant from the National Science Foundation to develop a framework for distributed computation, which would enable faster information exchange for future data- and computation-intense applications.
The grant, CIF: Small: Fundamental Tradeoffs Between Communication Load and Storage Resources in Distributed systems, begins October 1 and will fund two Ph.D. students. The research covers two related problems: distributed cache-aided ‘Fog Radio Access Network’ architectures, which models a type of data network envisaged for 5G wireless systems, and peer-to-peer distributed data shuffling, which has applications in big data and machine learning.
From the crush of people downloading videos or movies to their smartphones during an evening commute to complex algorithms in big data and machine learning, an ever-increasing amount of data is being processed on modern wireless networks. The amount of data can no longer be processed by a single machine or stored on a single machine. Computing devices must operate on parts of the data and exchange bits of information in order to be able to compute the final result. Finding the optimal way to split this data into pieces for each machine to process so as to minimize the communication load is part of what Tuninetti is studying.
“Each machine will do something on one piece of the data. You need to get all results—partial results—back to the same point before you can compute the final result,” Tuninetti said. “It’s not the time the CPU is crunching this data but rather exchanging the information over a resource-constrained communication network that may take a long time.”
It’s not only a matter of ensuring the computation is split in the most efficient way, it’s building in redundancy. Each machine will compute more than what is strictly needed to complete a calculation, and that little extra will make the communication phase faster–even if it takes more memory to complete the local computation.
“You are trading off local storage space for computation speed. We call it communication load. There are less things you need to communicate,” Tuninetti said. “We are trying to figure out the tradeoff between those storage space resources and the number of things you have to communicate, either through a central server or other workers in distributed computation problem.”
Tuninetti will be using error-correcting codes to take care of the distributed nature of the computations. While codes have been around for decades to protect information from various communication impairments, only in recent years have they been applied in distributed computing frameworks. Erasure-correcting codes cleverly add redundancy to the information data so that you can recover your data even if some pieces get lost (i.e., are erased). They essentially fill in the blanks of what information you don’t have, by treating it as if you ‘lost’ data, even if you never received it in the first place as part of your machine’s piece of a computation. Those codes can be used to speed up the communication phase among machines.
“Part of this grant is using codes to speed up data shuffling when machines have to exchange information—this is what I’m most excited about,” Tuninetti said.
The other part of the grant involves pre-loading information a user may want into its local memory ahead of time, to save on the overall number of transmissions once the user requests a piece of content. This reduces the communication load on congested networks during peak times.
“The idea is that we can somehow store at 3 a.m. in the memory of a phone something the user may want to watch or listen to at 5 p.m. If it’s stored locally, I don’t have to get it from the network, and it’s a way to balance the network utilization,” Tuninetti said.
“About five years ago people realized we shouldn’t look at each device individually but in a network. Not just what a particular user would want, but what all the users globally would want,” Tuninetti said.” The reduction in the communication load scales proportionally to the amount of memory globally available in the network. The more users, the merrier.”
Reducing congestion load: an example
Serving many users at once, or “creating multicast opportunities,” is key to reducing congestion load. For example, say Alice and Bob want to listen to two different songs. Assume that Alice has already stored locally SongA, which she listened to at some point in the past, and now wants to listen to SongB. At the same time Bob has already stored locally SongB and now wants to listen to SongA.
A classical approach is to deliver first SongB to Alice, and then SongA to Bob, for a total of two songs.
Instead, Alice and Bob can be served at the same time by delivering the “coded data” SongA SongB, which has the size of just one single song. Alice and Bob can take the coded data and subtract off what they have stored locally; this allows each of them to recover their desired song. As both songs were sent in one transmission, the new “coded delivery” has cut in half the number of required transmissions.