Tuninetti researching ways to minimize use of communication resources while ensuring privacy
Professor and ECE Department Head Daniela Tuninetti is researching ways to minimize the use of communication resources in computer networks during information retrieval while maintaining the privacy of the people or entities that initiated the query.
Tuninetti, in collaboration with Mingyue Ji at the University of Utah, and Hua Sun at the University of North Texas, will develop novel codes and algorithms to efficiently execute complex queries on massive databases that are stored on multiple remote servers.
Caching, the practice of storing copies of data or files in temporary locations close to end users, speeds up download times and increases the efficiency of the network. Browsers and websites load faster because frequently accessed elements, such as homepage images, have already been downloaded.
Still, increasing demands on network resources strain available bandwidth, especially at peak usage times. An example of this is a streaming service such as Netflix, which will have far more customers trying to access movies at 7 p.m. than 7 a.m.
Caching data locally can help handle high demand, but this approach is inefficient when each service provider manages its own cache. Coded caching, in which all caches are managed as a single network, can yield significant bandwidth savings. This has made coded caching an active research area because, in theory, it allows a tradeoff between (expensive) network bandwidth and (cheaper) local storage.
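To make the bandwidth-for-storage tradeoff concrete, here is a minimal sketch of a textbook-style coded caching example with two files and two users. It is an illustration under stated assumptions, not the scheme being developed in this project: the file contents, the half-file cache placement, and the helper `xor_bytes` are all hypothetical.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length byte strings (hypothetical helper)."""
    if len(a) != len(b):
        raise ValueError("inputs must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


# Two hypothetical files, each split into two equal halves.
file_a = b"AAAAaaaa"
file_b = b"BBBBbbbb"
a1, a2 = file_a[:4], file_a[4:]
b1, b2 = file_b[:4], file_b[4:]

# Placement phase (off-peak): each user caches one half of every file.
cache_user1 = {"a1": a1, "b1": b1}
cache_user2 = {"a2": a2, "b2": b2}

# Delivery phase (peak): user 1 requests file A, user 2 requests file B.
# Uncoded delivery would send a2 and b1 separately (8 bytes in total).
# Coded delivery broadcasts a single XOR of the two missing halves (4 bytes).
broadcast = xor_bytes(a2, b1)

# Each user cancels the half it already holds to recover the half it lacks.
user1_a2 = xor_bytes(broadcast, cache_user1["b1"])
user2_b1 = xor_bytes(broadcast, cache_user2["a2"])

recovered_a = cache_user1["a1"] + user1_a2
recovered_b = user2_b1 + cache_user2["b2"]
```

In this toy setting, one broadcast serves both users at once, halving the peak-time traffic compared with sending each missing half separately. The saving depends on the caches having been filled in a coordinated way beforehand.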
While most research on coded caching has focused on single file retrieval, its application is expected to be especially beneficial when a request involves retrieving the result of computational tasks on large data sets. In such instances, downloading and locally storing every data point in the data set is not only impractical, and possibly unwise, but also very demanding on the network bandwidth.
For instance, a researcher may want to compute the average blood pressure of a patient group. With coded caching, there is no need to download every individual blood pressure value/data point. Instead, a function of the data set can be downloaded, allowing the researcher to obtain the needed result while also ensuring the privacy of the individual medical records.
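The idea of downloading a function of the data rather than the data itself can be sketched as follows. This is a plain aggregation example under assumed names (`PartialAggregate`, `local_aggregate`, `combine_average`) and invented readings. It does not implement coded caching and offers no formal privacy guarantee; it only shows that an average can be obtained without transferring individual values.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class PartialAggregate:
    """Summary a server returns in place of its raw records (hypothetical)."""
    total: float
    count: int


def local_aggregate(readings: List[float]) -> PartialAggregate:
    """Runs on a server: reduce its local readings to a sum and a count."""
    return PartialAggregate(total=float(sum(readings)), count=len(readings))


def combine_average(parts: Iterable[PartialAggregate]) -> float:
    """Runs at the researcher: combine the summaries into one average."""
    parts = list(parts)
    count = sum(p.count for p in parts)
    if count == 0:
        raise ValueError("no readings available")
    return sum(p.total for p in parts) / count


# Invented systolic blood pressure readings held on two remote servers.
server_1 = [120, 130, 110]
server_2 = [140, 125]

# Only two small summaries cross the network, not five individual values.
parts = [local_aggregate(server_1), local_aggregate(server_2)]
average = combine_average(parts)
```

Each server sends two numbers regardless of how many records it holds, so the download size no longer grows with the size of the data set.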
“What we are trying to do is go beyond single file retrieval,” Tuninetti said. “So how do you manage a network of distributed caches in a way that you can deliver results of functions that may be useful for those users, but in a secure and private way?”
While caching improves download speed, it risks exposing confidential or sensitive data. To improve security, Tuninetti and her collaborators are using error-correcting codes, which have long been used to protect information from errors or defects, such as those on a scratched CD. Whenever information is stored or transmitted in a way that can be altered by something beyond your control, you can add redundancy to protect it. This concept can extend to errors caused by data that was not stored in the first place.
“This is an emerging application of similar ideas: now you don’t have errors caused by nature, but you might have missing information because you know you didn’t have space to store it,” Tuninetti said. “From the perspective of the end user, a piece of information is missing; it doesn’t matter who caused that missing piece not to be there.”
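The point that redundancy can stand in for a missing piece, whatever the reason it is missing, can be illustrated with the simplest erasure code: a single XOR parity block. This is a minimal sketch with invented block contents and a hypothetical helper `xor_blocks`. The codes used in the research are presumably more sophisticated.

```python
from functools import reduce
from typing import Sequence


def xor_blocks(blocks: Sequence[bytes]) -> bytes:
    """XOR equal-length blocks together, byte position by byte position."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Three invented data blocks plus one redundant parity block.
data_blocks = [b"wxyz", b"1234", b"abcd"]
parity = xor_blocks(data_blocks)

# Suppose the middle block is unavailable: scratched, lost in transit,
# or simply never stored because there was no room for it.
available = [data_blocks[0], data_blocks[2], parity]

# XORing everything that is available reproduces the missing block.
recovered = xor_blocks(available)
```

The recovery step is identical whether the block was damaged or deliberately left out of the cache, which matches the end user's perspective described in the quote.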
Tuninetti is focused on a fundamental understanding of the secure distributed computation aspects of the project, capturing some of the difficulty and complexity of computing on all the data being accumulated by the growing number of sensors and AI-powered applications.
“We want to know what the fundamental trade-offs in such systems are,” Tuninetti said. “Nothing comes for free, so how much is it going to ‘cost’ to be able to have that distributed, secure, private system for function retrieval?”
Tuninetti’s past work on single file retrieval without security or privacy considerations was also sponsored by the National Science Foundation under award number 1910309 “CIF: Small: Fundamental Tradeoffs Between Communication Load and Storage Resources in Distributed Systems.”
Tuninetti’s share of the new $1.2 million, four-year National Science Foundation award, number 2312229, “Collaborative Research: CIF: Medium: Fundamental Limits of Cache-aided Multi-user Private Function Retrieval,” is $460,000.