Comments on “VERSA: Verifiable Secure Aggregation for Cross-Device Federated Learning”

Federated learning (FL) allows a large number of users to collaboratively train machine learning (ML) models by sending only their local gradients, not their raw training data, to a central server for aggregation in each training iteration. The two main security issues of FL, namely the privacy of the gradient vectors and the verification of the correctness of the aggregated gradient, are attracting increasing attention from industry and academia. To protect the privacy of the gradients, secure aggregation was proposed; to verify the correctness of the aggregated gradient, verifiable secure aggregation, which requires the server to provide a verifiable aggregated gradient, was proposed. In 2021, Hahn et al. proposed VERSA, a verifiable secure aggregation protocol. In this article, we point out a flaw in VERSA which shows that VERSA does not work. To address the flaw, we present several approaches with different advantages and disadvantages. We hope that identifying this flaw will help avoid similar errors in future designs of verifiable secure aggregation.


I. INTRODUCTION
Federated learning (FL) [7] is a promising collaborative machine learning (ML) framework that allows models to be trained on sensitive real-world data while preserving its privacy. In FL, each user trains a copy of a global model locally on their data and computes a local gradient vector, which is then sent to a centralized server (i.e., the aggregator). The server combines these gradient vectors into an aggregated gradient, which is then sent back to all users. Upon receiving the aggregated gradient, each user updates the global model and proceeds to the next training iteration. However, this FL process raises at least the following two important privacy concerns: 1) The server can learn some information about the users' local training data by analyzing their gradient vectors. This type of attack is often referred to as an inference attack [8]. 2) The server may manipulate the global model at will by providing each user with a malformed aggregated gradient. In particular, a "lazy" server may curtail the aggregation operation to save computational cost or, worse, maliciously forge an aggregated gradient. In response to the above privacy issues, the concepts of secure aggregation [1] and verifiable secure aggregation [4], [9] were proposed.
The first secure aggregation protocol was proposed by Bonawitz et al. [1]. It uses a double-masking technique, Shamir's secret sharing (SSS), a key agreement (KA) protocol, and symmetric encryption to protect the privacy of the local gradient vectors and to handle dropouts. Recently, Hahn et al. [5] added to [1] a verification mechanism that does not require a Trusted Authority (TA), with the goal of ensuring the correctness of the aggregated gradient. However, in this comment, we show that the scheme in [5] does not work by pointing out a mathematical error.

II. PRELIMINARIES
Since our cryptanalysis of [5] relies heavily on the key agreement, we review it here. A key agreement protocol consists of a tuple of algorithms KA = (KA.Setup, KA.Gen, KA.Agree), defined as follows:
• $\text{KA.Setup}(1^{\lambda}) \to \text{KApp}$: outputs a public parameter KApp.
• $\text{KA.Gen}(\text{KApp}) \to (pk_u, sk_u)$: outputs a public/secret key pair $(pk_u, sk_u)$ for any user $u$.
• $\text{KA.Agree}(sk_u, pk_v) \to s_{u,v}$: outputs a shared secret $s_{u,v}$.
The above key agreement protocol satisfies the following requirements:
1) Correctness. For any key pairs $(pk_u, sk_u), (pk_v, sk_v) \leftarrow \text{KA.Gen}(\text{KApp})$ generated by users $u$ and $v$, respectively, we have $\text{KA.Agree}(sk_u, pk_v) = \text{KA.Agree}(sk_v, pk_u)$, which means that users $u$ and $v$ successfully negotiate a session key (i.e., the shared secret).
2) Security in the Honest-but-Curious Model. For any probabilistic polynomial-time (PPT) adversary who is given two honestly generated public keys $pk_u$ and $pk_v$ (but neither of the corresponding secret keys), the shared secret $s_{u,v}$ computed from those keys must be indistinguishable from a uniformly random string. Specifically, for any $(pk_u, sk_u), (pk_v, sk_v) \leftarrow \text{KA.Gen}(\text{KApp})$ and $s_{u,v} \leftarrow \text{KA.Agree}(sk_u, pk_v)$, we have
$$s_{u,v} \approx_c r$$
in the view of any PPT adversary who is given the public keys $(pk_u, pk_v)$, where $r$ is a uniformly random string and "$\approx_c$" indicates that the two distributions are computationally indistinguishable.
Like [1], Hahn et al. [5] also used the Diffie-Hellman key agreement scheme [3] with a hash function. That is, $\text{KA.Setup}(1^{\lambda}) \to (G, q, g, H)$ samples a group $G$ of prime order $q$, along with a generator $g$ and a hash function $H$; $\text{KA.Gen}(G, q, g, H) \to (g^{x_u}, x_u)$ samples a random element $x_u \leftarrow \mathbb{Z}_q$ as the secret key $sk_u$ and computes $g^{x_u}$ as the public key $pk_u$; and $\text{KA.Agree}(x_u, g^{x_v}) \to s_{u,v}$ outputs the shared secret $s_{u,v} = H((g^{x_v})^{x_u})$.
1545-5971 © 2023 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
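The hashed Diffie-Hellman key agreement reviewed above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the toy parameters `P`, `Q`, `G` (a safe prime, the order of its prime-order subgroup, and a generator of that subgroup) are far too small for real use, SHA-256 stands in for the hash function $H$, and the function names `ka_gen` and `ka_agree` are ours, not taken from [1] or [5].

```python
import hashlib
import secrets

# Toy parameters (illustrative assumption, insecure): P = 2*Q + 1 is a safe
# prime and G = 4 generates the subgroup of prime order Q modulo P.
P, Q, G = 2039, 1019, 4
BYTE_LEN = (P.bit_length() + 7) // 8


def ka_gen():
    """KA.Gen: sample a nonzero secret key x in Z_q, return (pk, sk) = (g^x, x)."""
    sk = secrets.randbelow(Q - 1) + 1
    return pow(G, sk, P), sk


def ka_agree(sk_u, pk_v):
    """KA.Agree: s_{u,v} = H(pk_v^{sk_u}), with SHA-256 standing in for H."""
    shared = pow(pk_v, sk_u, P)
    digest = hashlib.sha256(shared.to_bytes(BYTE_LEN, "big")).digest()
    return int.from_bytes(digest, "big")


pk_u, sk_u = ka_gen()
pk_v, sk_v = ka_gen()
# Correctness: both users derive the same shared secret.
assert ka_agree(sk_u, pk_v) == ka_agree(sk_v, pk_u)
```

Note that correctness is a statement about exactly two users: the secret $s_{u,v}$ depends on both $sk_u$ and $sk_v$, and nothing in the definition relates it to a secret shared with a third user.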

III. VERIFIABLE SECURE AGGREGATION
Secure aggregation is the problem of computing a multiparty sum in a secure manner, where no user reveals its input vector in the clear (even to the aggregator). That is, the aggregator (e.g., the server) computes $x = \sum_{u \in U} x_u$ without learning each user's individual contribution, where $x_u$ is the private vector of user $u \in U$ and $U$ is the set of users.
Verifiable secure aggregation combines secure aggregation with verification of the correctness of the aggregated vector $x$. That is, in addition to computing the sum $x$ in a secure fashion, the aggregator also proves to the users that it computed the sum $x$ honestly by providing a "proof" of correctness of the sum.
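To make the definition of secure aggregation concrete, the following Python sketch shows the basic idea of pairwise additive masks that cancel in the sum. It is a deliberately simplified single-masking illustration, not the double-masking protocol of [1]: it ignores dropouts and secret sharing, the modulus `R` and the function names are hypothetical, and a seeded `random.Random` merely stands in for masks that each pair of users would derive from a shared secret.

```python
import random

R = 2**16  # hypothetical modulus for vector entries


def mask_inputs(inputs, seed=0):
    """Return y_u = x_u + sum_{v>u} m_{u,v} - sum_{v<u} m_{u,v} (mod R)."""
    rng = random.Random(seed)  # stand-in for pairwise agreed mask seeds
    n, dim = len(inputs), len(inputs[0])
    masked = [list(x) for x in inputs]
    for u in range(n):
        for v in range(u + 1, n):
            # Mask known only to users u and v: u adds it, v subtracts it.
            m = [rng.randrange(R) for _ in range(dim)]
            for i in range(dim):
                masked[u][i] = (masked[u][i] + m[i]) % R
                masked[v][i] = (masked[v][i] - m[i]) % R
    return masked


def aggregate(vectors):
    """Server side: entrywise sum modulo R."""
    return [sum(col) % R for col in zip(*vectors)]


xs = [[1, 2, 3], [10, 20, 30], [100, 200, 300]]
ys = mask_inputs(xs)
# The pairwise masks cancel, so the server recovers only the sum.
assert aggregate(ys) == aggregate(xs) == [111, 222, 333]
```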

IV. REVIEW OF HAHN ET AL.'S VERSA
In [5], the authors proposed VERSA, which uses the double-masking technique proposed by Bonawitz et al. [1], Shamir's secret sharing, a key agreement scheme, symmetric encryption, and a pseudorandom generator (PRG). Structurally, VERSA consists of two components: a secure aggregation protocol that is identical to the one proposed in [1], and a verification mechanism that relies on a KA scheme and a PRG. Unlike other verifiable secure aggregation protocols, the authors of [5] claimed that their verification mechanism does not require a TA. Thus, the innovation of VERSA lies in the design of a verification mechanism that does not rely on a TA. Next, we review this verification mechanism and point out an error in it.
The Verification Mechanism in VERSA [5]. In VERSA, the server needs to compute a sum and a "proof" for the sum. To this end, users provide their masked inputs together with corresponding auxiliary information that helps the server generate a "proof" of correctness of the sum. Specifically, given an input vector $x_u$, each user $u$ computes and sends two masked vectors $y_u, \bar{y}_u$ (using the double-masking technique; we refer readers to [1], [5] for technical details), where $y_u$ is the masked vector of $x_u$, $\bar{y}_u$ is the masked vector of $F(x_u) = a \circ x_u + b$, and $\circ$ is the Hadamard product. The two vectors $(a, b)$ are secret vectors hidden from the server, and all users can compute the same pair of vectors $(a, b)$. Upon receiving $y_u, \bar{y}_u$, the server computes and returns the sum $z = \sum_{u \in U} y_u$ and the proof $\bar{z} = \sum_{u \in U} \bar{y}_u$, where $z = \sum_{u \in U} x_u$ and $\bar{z} = a \circ \sum_{u \in U} x_u + |U| \cdot b$. Each surviving user verifies $z$ by checking whether the following equality holds:
$$\bar{z} = a \circ z + |U| \cdot b. \quad (1)$$
Now let us focus on how the authors of [5] built the two vectors $(a, b)$.
The authors proposed the following approach: given the public keys of all surviving users $v \in U$, each user $u$ in the secure aggregation protocol of VERSA first runs the KA scheme locally to compute a set of shared secrets $\{s_{u,v}\}_{v \in U}$, where $s_{u,v} \leftarrow \text{KA.Agree}(sk_u, pk_v)$ for $v \in U$. User $u$ then computes $\alpha = \sum_{v \in U} s_{u,v}$ and $a = \text{PRG}(\alpha \| 0)$, $b = \text{PRG}(\alpha \| 1)$.
Overall, every surviving user $v$ is supposed to compute $\alpha$ from $\{s_{v,u}\}_{u \in U}$, derive $(a, b)$ from it, and then verify (1).
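The check in (1) can be illustrated with a short Python sketch. It assumes that all users already hold the same pair $(a, b)$ and that the masks have canceled, so the server's outputs equal the plain sums; the integer vectors, the helper names, and the concrete values of `a` and `b` are hypothetical and chosen only for illustration.

```python
def vec_sum(vectors):
    """Entrywise sum of a list of equal-length vectors."""
    return [sum(col) for col in zip(*vectors)]


def encode(x, a, b):
    """F(x) = a ∘ x + b, with ∘ the Hadamard (entrywise) product."""
    return [ai * xi + bi for ai, xi, bi in zip(a, x, b)]


def verify(z, z_bar, a, b, n_users):
    """Check equality (1): z_bar == a ∘ z + |U| * b."""
    expected = [ai * zi + n_users * bi for ai, zi, bi in zip(a, z, b)]
    return z_bar == expected


# Hypothetical secret vectors shared by all users and hidden from the server.
a, b = [3, 5, 7], [11, 13, 17]
xs = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

z = vec_sum(xs)                                  # honest sum
z_bar = vec_sum([encode(x, a, b) for x in xs])   # honest proof
assert verify(z, z_bar, a, b, len(xs))

forged = [z[0] + 1] + z[1:]                      # server tampers with the sum
assert not verify(forged, z_bar, a, b, len(xs))
```

The check is only meaningful if every user evaluates it with the same $(a, b)$ that was used to form $\bar{y}_u$; a user holding a different pair would reject even an honest sum.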
From the verification mechanism described above and (1), every surviving user $v$ needs to compute the same pair of vectors $(a, b)$ using the above approach. In other words, since the two vectors $(a, b)$ are determined by $\alpha$, the verification mechanism requires that, for $\alpha = \sum_{v \in U} s_{u,v}$ and $\alpha' = \sum_{u \in U} s_{v,u}$ computed by any two users $u$ and $v$, respectively, the following equality holds:
$$\alpha = \alpha'. \quad (2)$$
However, we argue that (2) does not hold. By running the KA scheme, two users $u$ and $v$ obtain only one shared secret that is identical on both sides, that is, $s_{u,v} = s_{v,u}$, where $s_{u,v} \leftarrow \text{KA.Agree}(sk_u, pk_v)$ and $s_{v,u} \leftarrow \text{KA.Agree}(sk_v, pk_u)$ are computed by users $u$ and $v$, respectively. In other words, apart from $s_{u,v}$ and $s_{v,u}$ being equal, the other shared secrets in $\{s_{u,v}\}_{v \in U}$ and $\{s_{v,u}\}_{u \in U}$ are not necessarily equal, because they are computed with the public keys of other users. For example, by the security of the KA, we cannot guarantee that $s_{u,w} = s_{v,w}$ (recall that the KA used is a two-party key agreement protocol; see Section II), where $s_{u,w} \leftarrow \text{KA.Agree}(sk_u, pk_w)$ and $s_{v,w} \leftarrow \text{KA.Agree}(sk_v, pk_w)$. Therefore, since (2) does not hold, the verification mechanism does not work. The following toy example illustrates this assertion.
Toy Example. For simplicity, we consider three users $u, v, w \in U$ whose goal is to compute the same $\alpha$. With the private key $sk_u$ and the public keys $pk_v, pk_w$, user $u$ can run the KA to compute two shared secrets $s_{u,v} \leftarrow \text{KA.Agree}(sk_u, pk_v)$, $s_{u,w} \leftarrow \text{KA.Agree}(sk_u, pk_w)$ and then $\alpha_u = s_{u,v} + s_{u,w}$. Likewise, user $v$ can compute two shared secrets $s_{v,u} \leftarrow \text{KA.Agree}(sk_v, pk_u)$, $s_{v,w} \leftarrow \text{KA.Agree}(sk_v, pk_w)$ and $\alpha_v = s_{v,u} + s_{v,w}$, and user $w$ can compute two shared secrets $s_{w,u} \leftarrow \text{KA.Agree}(sk_w, pk_u)$, $s_{w,v} \leftarrow \text{KA.Agree}(sk_w, pk_v)$ and $\alpha_w = s_{w,u} + s_{w,v}$. By the correctness of the KA, we have $s_{u,v} = s_{v,u}$, $s_{u,w} = s_{w,u}$, and $s_{v,w} = s_{w,v}$. Putting these together, we have $\alpha_u = \alpha_v = \alpha_w$ if and only if $(s_{u,w} = s_{v,w},\ s_{v,u} = s_{w,u})$ holds, which happens with negligible probability due to the security of the KA. In other words, $\alpha_u = \alpha_v = \alpha_w$ holds only with negligible probability.
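The toy example can also be reproduced numerically. The Python sketch below instantiates the KA as hashed Diffie-Hellman over an insecure toy group and lets three users compute their values of $\alpha$ and the derived vectors $(a, b)$. Everything concrete here is an assumption made for illustration: the group parameters, the fixed secret keys, SHA-256 as $H$, and SHAKE-256 as the PRG; none of these choices come from [5].

```python
import hashlib

# Toy safe-prime group (illustrative, insecure): G = 4 has prime order Q mod P.
P, Q, G = 2039, 1019, 4


def ka_agree(sk_u, pk_v):
    """KA.Agree: s_{u,v} = H(pk_v^{sk_u}), with SHA-256 standing in for H."""
    shared = pow(pk_v, sk_u, P)
    digest = hashlib.sha256(shared.to_bytes(2, "big")).digest()
    return int.from_bytes(digest, "big")


def prg(alpha, tag, length=4):
    """Hypothetical PRG: expand alpha || tag into `length` 32-bit integers."""
    seed = alpha.to_bytes(40, "big") + bytes([tag])
    stream = hashlib.shake_256(seed).digest(4 * length)
    return [int.from_bytes(stream[i:i + 4], "big") for i in range(0, 4 * length, 4)]


# Fixed secret keys for reproducibility; public keys are pk = g^sk.
sk = {"u": 5, "v": 7, "w": 11}
pk = {name: pow(G, x, P) for name, x in sk.items()}

# Each user sums the secrets it shares with the other two users.
alpha = {
    me: sum(ka_agree(sk[me], pk[other]) for other in sk if other != me)
    for me in sk
}
# a = PRG(alpha || 0), b = PRG(alpha || 1), computed separately by each user.
ab = {me: (prg(alpha[me], 0), prg(alpha[me], 1)) for me in sk}

# Pairwise correctness holds, e.g. s_{u,v} = s_{v,u} ...
assert ka_agree(sk["u"], pk["v"]) == ka_agree(sk["v"], pk["u"])
# ... but the three users end up with three different values of alpha,
# and hence with different vectors (a, b).
assert len(set(alpha.values())) == 3
assert ab["u"] != ab["v"]
```

In this run, $\alpha_u - \alpha_v = s_{u,w} - s_{v,w}$, which is nonzero because the two secrets are hashes of different group elements; this is the same condition that appears in the toy example.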
Suggestions. Because KA is a two-party key agreement protocol, it cannot be used by more than two parties to compute the same value $\alpha$. We suggest the following alternative solutions: 1) a multi-party key agreement protocol [6]; 2) a multi-party computation protocol [2]; 3) a Trusted Authority that distributes the value $\alpha$. However, note that 1) requires a large amount of computation, 2) requires more communication rounds and traffic, and 3) is always a strong assumption. Therefore, users can weigh the pros and cons and then choose the appropriate method.
V. CONCLUSION