Outsourcing machine learning is a rising trend in industry. Major tech firms have launched cloud platforms that conduct computation-heavy tasks, such as running data through a convolutional neural network (CNN) for image classification. Resource-strapped small businesses and other users can upload data to those services for a fee and get back results in several hours.
But what if there are data leaks? In recent years, researchers have explored various secure-computation techniques to protect such sensitive data. Those methods, however, have performance drawbacks that make neural network evaluation (testing and validating) sluggish, sometimes as much as a million times slower, limiting their wider adoption.
A novel encryption method secures data used in online neural networks without dramatically slowing their runtimes. The approach holds promise for using cloud-based neural networks for medical image analysis and other applications that use sensitive data.
The system blends two conventional techniques, homomorphic encryption and garbled circuits, in a way that helps the networks run orders of magnitude faster than they do with conventional approaches. The system, called GAZELLE, was tested on two-party image-classification tasks. A user sends encrypted image data to an online server evaluating a CNN running on GAZELLE. Both parties then pass encrypted information back and forth to classify the user's image. Throughout the process, the system ensures that the server never learns any uploaded data, while the user never learns anything about the network parameters. In those tests, GAZELLE ran 20 to 30 times faster than state-of-the-art models, while reducing the required network bandwidth by an order of magnitude.
CNNs process image data through multiple linear and nonlinear layers of computation. Linear layers do the complex math (linear algebra) and assign values to the data. At a certain threshold, the data is passed to nonlinear layers that do simpler computation, make decisions (such as identifying image features), and send the data to the next linear layer. The end result is an image with an assigned class, such as vehicle, animal, person, or anatomical feature.
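As a rough illustration of that alternation, the Python sketch below runs an input through a linear layer, a nonlinear step, and a final linear layer, then picks the highest-scoring class. The weights, labels, and function names are invented for this example, and ReLU is assumed as the nonlinearity; a real CNN would use convolutions and far larger layers.

```python
def linear(weights, bias, x):
    """Linear layer: matrix-vector product plus a bias term."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    """Nonlinear layer (assumed here to be ReLU): clamp negatives to zero."""
    return [max(0, v) for v in x]


def classify(x, layers, labels):
    """Alternate linear and nonlinear layers, then return the top-scoring label."""
    *hidden, (last_weights, last_bias) = layers
    for weights, bias in hidden:
        x = relu(linear(weights, bias, x))
    scores = linear(last_weights, last_bias, x)
    return labels[scores.index(max(scores))]


# Hypothetical two-layer network with made-up weights.
LAYERS = [
    ([[1, -1], [-1, 1]], [0, 0]),
    ([[1, 0], [0, 1]], [0, 0]),
]
LABELS = ["vehicle", "animal"]
```

Calling `classify([3, 1], LAYERS, LABELS)` walks the input through both layers and returns one of the two labels.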
Recent approaches to securing CNNs have involved applying homomorphic encryption or garbled circuits to process data throughout an entire network. These techniques are effective at securing data, but they render complex neural networks inefficient.
Homomorphic encryption, used in cloud computing, receives encrypted data, called ciphertext, performs computation entirely on that ciphertext, and generates an encrypted result that a user can then decrypt. When applied to neural networks, this technique is particularly fast and efficient at computing linear algebra; however, it must introduce a little noise into the data at each layer. Over multiple layers, noise accumulates, and the computation needed to filter that noise grows increasingly complex, slowing computation speeds.
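To make the idea of computing on ciphertext concrete, here is a toy sketch built on the Paillier cryptosystem, a classic additively homomorphic scheme. Paillier is a stand-in chosen for brevity rather than the scheme used in GAZELLE, and unlike the noise-based schemes described above it does not accumulate noise. The tiny primes make it completely insecure, the function names are hypothetical, and the weights are assumed to be non-negative integers.

```python
import math
import random


def keygen(p=293, q=433):
    """Toy key generation; real deployments use primes of 1024 bits or more."""
    n = p * q
    phi = (p - 1) * (q - 1)
    mu = pow(phi, -1, n)  # modular inverse, needs Python 3.8+
    return n, (phi, mu)


def encrypt(n, m):
    """Encrypt integer m (0 <= m < n) under public key n, with generator n + 1."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (1 + m * n) * pow(r, n, n2) % n2


def decrypt(n, priv, c):
    """Recover the plaintext from ciphertext c using the private key."""
    phi, mu = priv
    u = pow(c, phi, n * n)
    return (u - 1) // n * mu % n


def encrypted_dot(n, weights, enc_x):
    """Server side: compute Enc(sum of w_i * x_i) without ever decrypting.

    Multiplying ciphertexts adds plaintexts; raising a ciphertext to a
    power multiplies its plaintext by that constant.
    """
    n2 = n * n
    acc = encrypt(n, 0)
    for w, c in zip(weights, enc_x):
        acc = acc * pow(c, w, n2) % n2
    return acc
```

In this sketch the user would encrypt an input vector, the server would call `encrypted_dot` with its private weights, and only the user could decrypt the resulting score.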
Garbled circuits are a form of secure two-party computation. The technique takes an input from both parties, does some computation, and sends a separate output to each party. In that way, the parties send data to one another, but they never see the other party's data, only the relevant output on their side. The bandwidth needed to communicate data between parties, however, scales with computation complexity, not with the size of the input. In an online neural network, this technique works well in the nonlinear layers, where computation is minimal, but the bandwidth becomes unwieldy in math-heavy linear layers.
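A minimal sketch of a single garbled AND gate shows where that bandwidth cost comes from. One party (the garbler) replaces each wire's 0 and 1 with random labels and encrypts the truth table; the other (the evaluator) holds one label per input wire and can decrypt exactly one row. The construction below is a simplified textbook variant with hypothetical names, and it omits the oblivious-transfer step by which the evaluator would normally obtain its own input label.

```python
import hashlib
import random
import secrets

TAG = b"\x00" * 8  # marker that lets the evaluator recognise the valid row


def _pad(label_a, label_b):
    """Derive a one-time pad from the pair of input labels."""
    return hashlib.sha256(label_a + label_b).digest()[:24]


def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def garble_and_gate():
    """Garbler: pick two random labels per wire, then encrypt the AND truth table."""
    labels = {w: (secrets.token_bytes(16), secrets.token_bytes(16))
              for w in ("a", "b", "out")}
    table = []
    for bit_a in (0, 1):
        for bit_b in (0, 1):
            out_label = labels["out"][bit_a & bit_b]
            pad = _pad(labels["a"][bit_a], labels["b"][bit_b])
            table.append(_xor(pad, out_label + TAG))
    random.SystemRandom().shuffle(table)  # hide which row is which
    return labels, table


def evaluate(table, label_a, label_b):
    """Evaluator: try each row; only the matching one decrypts to a valid tag."""
    pad = _pad(label_a, label_b)
    for row in table:
        plain = _xor(pad, row)
        if plain.endswith(TAG):
            return plain[:16]
    raise ValueError("no row decrypted correctly")
```

Every gate in the circuit needs its own table of four ciphertexts sent across the network, so the traffic grows with the number of gates rather than with the size of the input.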
Combining the two techniques gets around their inefficiencies. In GAZELLE, a user uploads ciphertext to a cloud-based CNN. The user must have the garbled-circuits technique running on their own computer. The CNN does all the computation in the linear layer, then sends the data to the nonlinear layer. At that point, the CNN and user share the data. The user does some computation on garbled circuits and sends the data back to the CNN. By splitting and sharing the workload, the system restricts the homomorphic encryption to doing complex math one layer at a time, so data doesn't become too noisy. It also limits the communication of the garbled circuits to just the nonlinear layers, where the technique performs best.
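One common way for two parties to hold a value jointly between such steps is additive secret sharing, in which each side keeps a random-looking share and only the sum of the two shares reveals the value. The sketch below assumes that approach for illustration; the article does not spell out the exact hand-off, and the modulus and function names are hypothetical. The nonlinear step is simulated in the clear as a stand-in for the garbled circuit, which in a real system would compute the same result without revealing the recombined value to either party.

```python
import secrets

P = 2**31 - 1  # hypothetical modulus for the shared values


def to_signed(v):
    """Interpret a residue modulo P as a signed integer."""
    return v - P if v > P // 2 else v


def split(value):
    """Split value into two additive shares that sum to it modulo P."""
    server_share = secrets.randbelow(P)
    client_share = (value - server_share) % P
    return server_share, client_share


def relu_on_shares(server_share, client_share):
    """Stand-in for the garbled circuit: recombine, apply ReLU, re-share.

    This simulation recombines the value openly; a garbled circuit would
    perform the same steps on hidden wire labels instead.
    """
    y = to_signed((server_share + client_share) % P)
    return split(max(0, y) % P)
```

After `relu_on_shares` returns, each party again holds one share, and the server can carry its side into the next linear layer.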