A code-based hybrid signcryption scheme

A key encapsulation mechanism (KEM) that takes as input an arbitrary string, i.e., a tag, is known as a tag-KEM, while a scheme that combines signature and encryption is called signcryption. In this paper, we present a code-based signcryption tag-KEM scheme. We utilize a code-based signature and an IND-CCA2 (indistinguishability under adaptive chosen ciphertext attack) secure version of McEliece's encryption scheme. The proposed scheme uses an equivalent subcode as a public code for the receiver, making the NP-completeness of the subcode equivalence problem one of our main security assumptions. We then use the signcryption tag-KEM to design a code-based hybrid signcryption scheme. A hybrid scheme deploys asymmetric- as well as symmetric-key encryption. We give security analyses of both our schemes in the standard model and prove that they are IND-CCA2 and SUF-CMA (strong existential unforgeability under chosen message attack) secure.


Introduction
In public-key cryptography, the authentication and confidentiality of communication between a sender and a receiver are ensured by a two-step approach called signature-then-encryption. In this approach, the sender uses a digital signature scheme to sign a message and then encrypts it using an encryption algorithm. The cost of delivering a message in a secure and authenticated way using the signature-then-encryption approach is essentially the sum of the cost of a digital signature and that of encryption.
In 1997, Y. Zheng introduced a new cryptographic primitive called signcryption to provide both authentication and confidentiality in a single logical step [67]. In general, one can expect the cost of signcryption to be noticeably less than that of signature-then-encryption. Zheng's signcryption scheme is based on the hardness of the discrete logarithm problem. Since Zheng's work, a number of signcryption schemes based on different hardness assumptions have been introduced; see for example [67,68,61,63,39,7,8,28,58,64,38,66]. Of these, the most efficient ones have followed Zheng's approach, i.e., used symmetric-key encryption as a black-box component [7,8,28]. It has been of interest to many researchers to study how a combination of asymmetric- and symmetric-key encryption schemes could be used to build efficient signcryption schemes in a more general setting.
To that end, Dent in 2004 proposed the first formal composition model for hybrid signcryption [25] and in 2005 developed an efficient model for signcryption KEMs in the outsider- and the insider-secure settings [26,27]. In the outsider-secure setting, the adversary is assumed to be distinct from the sender and receiver, while in the insider-secure setting the adversary is assumed to be a second party (i.e., either sender or receiver).
In order to improve the model for the insider-secure setting in hybrid signcryption, Bjørstad and Dent in 2006 proposed a model based on encryption tag-KEM rather than regular encryption KEM [14]. Their model provides a simpler description of signcryption with a better generic security reduction for the signcryption tag-KEM construction. A year after Bjørstad and Dent's work, Yoshida and Fujiwara reported the first study of multi-user setting security of signcryption tag-KEMs [65] which is a more suitable setting for the analysis of insider-secure schemes.
Motivation: Most of the aforementioned signcryption schemes are based on the hardness of either the discrete logarithm or the integer factorization problem and would be broken with the arrival of sufficiently large quantum computers. Therefore it is of interest to design signcryption schemes for the post-quantum era.
Organization: This paper is organized as follows. In Section 2, we first recall some basic notions of coding theory and then briefly describe relevant encryption and signature schemes that are of interest to this work. Section 3 has the definition and framework of signcryption and hybrid signcryption, and a brief review of the relevant security model. We present our signcryption and hybrid signcryption schemes in Section 4 and then provide security analyses of the proposed schemes in Section 5. We provide a set of parameters for the hybrid signcryption scheme in Section 6 and then conclude in Section 7.
Notations: In this paper we use the following notations:
- $\mathbb{F}_q$: finite field of size $q$, where $q = p^m$ is a prime power.
- $\mathcal{C}$: $\mathbb{F}_q$-linear code of length $n$.
- $x$: a word or vector of $\mathbb{F}_q^n$.
- $\mathrm{wt}(x)$: weight of $x$.
- $W_{q,n,t}$: the set of $q$-ary vectors of length $n$ and weight $t$.

Preliminaries
In this section, we recall some notions pertaining to coding theory and code-based cryptography.

Coding theory and some relevant hard problems
Let us consider the finite field $\mathbb{F}_q$. A $q$-ary linear code $\mathcal{C}$ of length $n$ and dimension $k$ over $\mathbb{F}_q$ is a vector subspace of dimension $k$ of $\mathbb{F}_q^n$. It can be specified by a full-rank matrix $G \in \mathbb{F}_q^{k \times n}$, called a generator matrix of $\mathcal{C}$, whose rows span the code. Namely, $\mathcal{C} = \{xG : x \in \mathbb{F}_q^k\}$. A linear code can also be defined by the right kernel of a matrix $H \in \mathbb{F}_q^{r \times n}$, called a parity-check matrix of $\mathcal{C}$, as follows:
$$\mathcal{C} = \{x \in \mathbb{F}_q^n : Hx^T = 0\}.$$
The Hamming distance between two codewords is the number of positions (coordinates) where they differ. The minimal distance of a code is the minimum of the Hamming distances between any two distinct codewords.
The weight of a word or vector $x \in \mathbb{F}_q^n$, denoted by $\mathrm{wt}(x)$, is the number of its nonzero positions. The minimal weight of a code $\mathcal{C}$ is then the minimal weight of all nonzero codewords. In the case of a linear code $\mathcal{C}$, the minimal distance is equal to the minimal weight of the code.
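As an illustration of these definitions, the following Python sketch builds a toy binary code and checks them numerically. The binary [7,4] Hamming code, its matrices `G` and `H`, and all helper names are assumptions chosen for illustration only; they are not part of the schemes discussed in this paper.

```python
from itertools import product

# Assumed toy example: systematic generator and parity-check matrices
# of the binary [7,4] Hamming code (G = [I | A], H = [A^T | I]).
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]
H = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]

def encode(x, G):
    """Return the codeword xG over F_2."""
    return [sum(xi & g for xi, g in zip(x, col)) % 2 for col in zip(*G)]

def syndrome(y, H):
    """Return H y^T over F_2; it is zero exactly when y is a codeword."""
    return [sum(h & yi for h, yi in zip(row, y)) % 2 for row in H]

def wt(x):
    """Hamming weight: number of nonzero positions."""
    return sum(1 for xi in x if xi != 0)

def distance(x, y):
    """Hamming distance: number of positions where x and y differ."""
    return sum(1 for a, b in zip(x, y) if a != b)

# Enumerate the code as {xG : x in F_2^4}.
codewords = [encode(list(x), G) for x in product([0, 1], repeat=4)]
min_weight = min(wt(c) for c in codewords if any(c))
min_distance = min(distance(a, b) for a in codewords for b in codewords if a != b)
print(len(codewords), min_weight, min_distance)
```

On this toy code, the two descriptions (row space of `G`, kernel of `H`) give the same set of words, and the minimal weight and minimal distance coincide, as stated above for linear codes.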
Below we recall some hard problems that are relevant to our discussions and analyses presented in this article.
Problem 1. (Binary syndrome decoding (SD) problem) Given a matrix $H \in \mathbb{F}_2^{r \times n}$, a vector $s \in \mathbb{F}_2^r$, and an integer $\omega > 0$, find a vector $y \in \mathbb{F}_2^n$ such that $\mathrm{wt}(y) = \omega$ and $s = yH^T$.
The syndrome decoding problem was proven to be NP-complete in 1978 by Berlekamp et al. [13]. It is equivalent to the following problem.
Problem 2. (General decoding (GD) problem) Given a matrix $G \in \mathbb{F}_2^{k \times n}$, a vector $y \in \mathbb{F}_2^n$, and an integer $\omega > 0$, find two vectors $m \in \mathbb{F}_2^k$ and $e \in \mathbb{F}_2^n$ such that $\mathrm{wt}(e) = \omega$ and $y = mG \oplus e$.
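The link between Problems 1 and 2 can be seen on a toy instance: the syndrome of $y = mG \oplus e$ depends only on $e$, so a solution of the syndrome decoding instance yields a solution of the general decoding instance. The sketch below assumes the binary [7,4] Hamming code as a stand-in and uses exhaustive search, which is only feasible at this toy size; all names are hypothetical.

```python
from itertools import combinations

# Assumed toy code: binary [7,4] Hamming code in systematic form.
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]
H = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]

def encode(m, G):
    """Return mG over F_2."""
    return [sum(mi & g for mi, g in zip(m, col)) % 2 for col in zip(*G)]

def syndrome(y, H):
    """Return y H^T over F_2."""
    return [sum(a & b for a, b in zip(y, row)) % 2 for row in H]

def solve_sd(H, s, w):
    """Exhaustive search for a weight-w vector e with e H^T = s (Problem 1)."""
    n = len(H[0])
    for support in combinations(range(n), w):
        e = [1 if i in support else 0 for i in range(n)]
        if syndrome(e, H) == s:
            return e
    return None

# A general decoding instance (Problem 2): y = mG xor e with wt(e) = 1.
m = [1, 0, 1, 1]
e = [0, 0, 0, 0, 0, 1, 0]
y = [a ^ b for a, b in zip(encode(m, G), e)]

# Reduce to syndrome decoding: y H^T = e H^T because G H^T = 0.
s = syndrome(y, H)
e_found = solve_sd(H, s, 1)
codeword = [a ^ b for a, b in zip(y, e_found)]
m_found = codeword[:4]  # G is systematic, so the message is the first 4 bits
print(e_found, m_found)
```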
The following problem is used in the security proof of the underlying signature that we use in this paper. It was first considered by Johansson and Jonsson in [36]. It was analyzed later by Sendrier in [57].
Problem 3. (Decoding One Out of Many (DOOM) problem) Given a matrix $H \in \mathbb{F}_q^{r \times n}$, a set of vectors $s_1, s_2, \dots, s_N \in \mathbb{F}_q^r$, and an integer $\omega$, find a vector $e \in \mathbb{F}_q^n$ and an integer $i$ such that $1 \le i \le N$, $\mathrm{wt}(e) = \omega$, and $s_i = eH^T$.
Problem 4. (Goppa code distinguishing (GCD) problem) Given a matrix $G \in \mathbb{F}_2^{k \times n}$, decide whether $G$ is a random binary matrix or a generator matrix of a Goppa code.
Faugère et al. [30] showed that Problem 4 can be solved in special cases of Goppa codes with high rate. The following is one of the problems, which the security assumption of our scheme's underlying signature mechanism relies on.
Problem 5. (Generalized $(U, U+V)$ code distinguishing problem) Given a matrix $H \in \mathbb{F}_q^{r \times n}$, decide whether $H$ is a parity-check matrix of a generalized $(U, U+V)$-code.
Problem 5 was shown to be hard in the worst case by Debris-Alazard et al. [22] since it is NP-complete. Below, we recall the subcode equivalence problem, which is one of the problems on which the security assumption of our scheme is based. This problem was proven to be NP-complete in 2017 by Berger et al. [10].
Problem 6. (Subcode equivalence problem [10]) Given two linear codes $\mathcal{C}$ and $\mathcal{D}$ of length $n$ and respective dimensions $k$ and $k'$, $k \le k'$, over the same finite field $\mathbb{F}_q$, determine whether there exists a permutation $\sigma$ of the support such that $\sigma(\mathcal{C})$ is a subcode of $\mathcal{D}$.

Code-based encryption
The first code-based encryption scheme was introduced in 1978 by R. McEliece [45]. Below (in Figure 1), we give the McEliece scheme with the Fujisaki-Okamoto transformation [16], which comprises three algorithms: key generation, encryption, and decryption.
The main drawback of the McEliece encryption scheme is its very large key size. To address this issue, many variants of McEliece's scheme have been proposed, see for example [11,12,46,47,9,52]. In order to reduce the size of both public and private keys in code-based cryptography, H. Niederreiter in 1986 introduced a new cryptosystem [49]. Niederreiter's cryptosystem is a dual version of McEliece's cryptosystem with some additional properties such that the ciphertext length is relatively smaller. Indeed, the public key in Niederreiter's cryptosystem is a parity-check matrix instead of a generator matrix. In addition, ciphertexts are syndrome vectors instead of erroneous codewords. However, the McEliece and the Niederreiter schemes are equivalent from the security point of view due to the fact that Problems 1 and 2 are equivalent.
Code-based hybrid encryption: A hybrid encryption scheme is a cryptographic protocol that features both an asymmetric- and a symmetric-key encryption scheme. The first component is known as the Key Encapsulation Mechanism (KEM), while the second is called the Data Encapsulation Mechanism (DEM). The framework was first introduced in 2003 by Cramer and Shoup [21], and the first code-based hybrid encryption scheme was later introduced in 2013 by Persichetti [53] using Niederreiter's encryption scheme. Persichetti's scheme was implemented in 2017 by Cayrel et al. [17]. After Persichetti's work, some other code-based hybrid encryption schemes have been reported, e.g., [43].

Code-based signature
Designing a secure and practical code-based signature scheme is still an open problem. The first secure code-based signature scheme was introduced by Courtois et al. (CFS) [20]. It is a full domain hash (FDH) like signature with two security assumptions: the indistinguishability of random binary linear codes and the hardness of syndrome decoding problem. To address some of the drawbacks of Courtois et al.'s scheme, Dallot proposed a modified version, called mCFS, which is provably secure. Unfortunately, this scheme is not practical due to the difficulties of finding a random decodable syndrome. In addition, the assumption of the indistinguishability of random binary Goppa codes has led to the emergence of attacks as described in [30]. One of the latest code-based signature schemes of this type is called Wave [23]. It is based on generalized (U, U + V )-codes. It is secure and more efficient than the CFS signature scheme. In addition, it has a smaller signature size than almost all finalist candidates in the NIST post-quantum cryptography standardization process [5].
Apart from the full domain hash approach, it is possible to design signature schemes by applying the Fiat-Shamir transformation [31] to an identification protocol. To this end, one may use a code-based identification scheme like that of Stern [62], Jain et al. [35], or Cayrel et al. [18]. This approach however leads to a signature scheme with a very large signature size. To address this issue, Lyubashevsky's framework [40] can apparently be adapted. Unfortunately, almost all code-based signature schemes in Hamming metric designed by using this framework have been cryptanalyzed [15,54,55,32,41,60]. The only one which has remained secure so far is a rank metric-based signature scheme proposed by Aragon et al. [1].
In Figure 2, we recall Debris-Alazard et al.'s signature scheme (Wave), which is of interest for this work. In Wave, the secret key is a tuple of three matrices $sk = (S, H_{sk}, P)$, where $S \in \mathbb{F}_q^{r \times r}$ is an invertible matrix, $H_{sk} \in \mathbb{F}_q^{r \times n}$ is a parity-check matrix of a generalized $(U, U+V)$-code, and $P \in \mathbb{F}_2^{n \times n}$ is a permutation matrix. The public key is a matrix $pk = H_{pk}$, where $H_{pk} = S H_{sk} P$. The steps of the signature and verification processes are given in Figure 2. For additional details, the reader is referred to [24,23].

Signcryption and security model
In this section, we first recall the definition of signcryption followed by the signcryption tag-KEM framework and its security model under the insider setting.

Signcryption and its tag-KEM framework
Signcryption: A signcryption scheme is a tuple of algorithms SC = (Setup, KeyGen$_s$, KeyGen$_r$, Signcrypt, Unsigncrypt) [3], where:
* Setup($1^\lambda$) is the common parameter generation algorithm, with $\lambda$ the security parameter,
* KeyGen$_s$ (resp. KeyGen$_r$) is a key-pair generation algorithm for the sender (resp. receiver),
* Signcrypt is the signcryption algorithm, and
* Unsigncrypt is the unsigncryption algorithm.
Signcryption tag-KEM: A signcryption tag-KEM, denoted by SCTKEM, is a tuple of algorithms [14]:
SCTKEM = (Setup, KeyGen$_s$, KeyGen$_r$, Sym, Encap, Decap),
where
- Setup is an algorithm for generating common parameters.
- KeyGen$_s$ (resp. KeyGen$_r$) is the sender (resp. receiver) key generation algorithm. It takes as input the global information $I$ and returns a private/public key pair $(sk_s, pk_s)$ (resp. $(sk_r, pk_r)$) that is used to send (resp. receive) signcrypted messages.
- Sym is a symmetric key generation algorithm. It takes as input the private key of the sender $sk_s$ and the public key of the receiver $pk_r$ and outputs a symmetric key $K$ together with internal state information $\varpi$.
- Encap takes as input the state information $\varpi$ together with an arbitrary string $\tau$, which is called a tag, and outputs an encapsulation $E$.
- Decap is the decapsulation/verification algorithm. It takes as input the sender's public key $pk_s$, the receiver's private key $sk_r$, an encapsulation $E$, and a tag $\tau$. It returns either the symmetric key $K$ or the unique error symbol $\bot$.
Hybrid signcryption tag-KEM+DEM: It is simply a combination of a signcryption tag-KEM (SCTKEM) and a regular Data Encapsulation Mechanism (DEM).

Insider security for signcryption tag-KEM
IND-CCA2 game in signcryption tag-KEM: It corresponds to a game between a challenger and a probabilistic polynomial-time adversary $A_{CCA2}$ in which the latter tries to distinguish whether a given session key $K$ is the one embedded in an encapsulation or not. During this game, $A_{CCA2}$ has adaptive access to three oracles for the attacked user, corresponding to the algorithms Sym, Encap, and Decap [14,29,65]. The game is described in Figure 3 below. During Step 7, the adversary $A_{CCA2}$ is not allowed to query the decapsulation oracle on $(E, \tau)$. The advantage of the adversary is defined as the absolute difference between its probability of guessing correctly and $1/2$. A signcryption tag-KEM is IND-CCA2 secure if, for any adversary $A$, its advantage in the IND-CCA2 game is negligible with respect to the security parameter $\lambda$.
SUF-CMA game for signcryption tag-KEM: This game is a challenge between a challenger and a probabilistic polynomial-time adversary (i.e., a forger) $F_{CMA}$. In this game, the forger tries to generate a valid encapsulation $E$ from the sender to any receiver, with adaptive access to the three oracles. The adversary is allowed to come up with the presumed secret key $sk_r$ as part of its forgery [65]. The adversary $F_{CMA}$ wins the SUF-CMA game if $\bot \neq \mathrm{Decap}(pk_s, sk_r, E, \tau)$ and the encapsulation oracle never returned $E$ when queried on the tag $\tau$. The advantage of $F_{CMA}$ is the probability that $F_{CMA}$ wins the SUF-CMA game. A signcryption tag-KEM is SUF-CMA secure if the probability that $F_{CMA}$ wins the SUF-CMA game is negligible.

Definition 1.
A signcryption tag-KEM is said to be secure if it is IND-CCA2 and SUF-CMA secure.
Oracles
1. $O_{Sym}$ is the symmetric key generation oracle. It takes as input a public key $pk$ and computes $(K, \varpi) = \mathrm{Sym}(sk_s, pk)$. It then stores the value $\varpi$ (hidden from the view of the adversary, and overwriting any previously stored value) and returns the symmetric key $K$.
2. $O_{Encap}$ is the key encapsulation oracle. It takes an arbitrary tag $\tau$ as input and checks whether there exists a stored value $\varpi$. If there is not, it returns $\bot$ and terminates. Otherwise, it erases the value from storage and returns $E = \mathrm{Encap}(\varpi, \tau)$.
3. $O_{Decap}$ is the decapsulation/verification oracle. It takes as input an encapsulation $E$, a tag $\tau$, and any sender's public key $pk$, and returns $\mathrm{Decap}(pk, sk_r, E, \tau)$.
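The stateful behaviour of the three oracles (store on Sym, erase on Encap, delegate on Decap) can be sketched as follows. This is a minimal sketch under stated assumptions: `StubScheme` is a hypothetical, insecure placeholder that only mimics the Sym/Encap/Decap interface with hash calls, and `None` stands for the error symbol $\bot$. It is not the code-based scheme of this paper.

```python
import hashlib
import os

class StubScheme:
    """Hypothetical, insecure stand-in exposing the Sym/Encap/Decap interface."""

    def sym(self, sk_s, pk_r):
        state = os.urandom(16)                      # internal state information
        return hashlib.sha256(state).digest(), state

    def encap(self, state, tag):
        # The state is sent in the clear here; a real scheme would hide it.
        return (state, hashlib.sha256(state + tag).digest())

    def decap(self, pk_s, sk_r, E, tag):
        state, check = E
        if check != hashlib.sha256(state + tag).digest():
            return None                             # error symbol
        return hashlib.sha256(state).digest()

class Oracles:
    """Oracles offered to the adversary in the insider-security games."""

    def __init__(self, scheme, sk_s, sk_r):
        self.scheme, self.sk_s, self.sk_r = scheme, sk_s, sk_r
        self.stored = None                          # hidden from the adversary

    def o_sym(self, pk):
        K, state = self.scheme.sym(self.sk_s, pk)
        self.stored = state                         # overwrites any previous value
        return K

    def o_encap(self, tag):
        if self.stored is None:
            return None                             # no stored state: return error
        state, self.stored = self.stored, None      # erase the state after use
        return self.scheme.encap(state, tag)

    def o_decap(self, pk, E, tag):
        return self.scheme.decap(pk, self.sk_r, E, tag)

oracles = Oracles(StubScheme(), b"sk_s", b"sk_r")
K = oracles.o_sym(b"pk_r")
E = oracles.o_encap(b"tag")
print(oracles.o_decap(b"pk_s", E, b"tag") == K)
```

Erasing the stored state in `o_encap` means each symmetric key can be encapsulated under one tag only, which matches the oracle description above.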

Generic security criteria of hybrid signcryption tag-KEM+DEM
Security criteria for hybrid signcryption: The security of a hybrid signcryption tag-KEM+DEM depends on that of the underlying signcryption tag-KEM and DEM. However, it is important to note that in the standard model a signcryption tag-KEM is secure if it is both IND-CCA2 and SUF-CMA secure. Therefore, the generic security criterion for hybrid signcryption tag-KEM+DEM is given by the following theorem:
Theorem 1. [65,14] Let HSC be a hybrid signcryption scheme constructed from a signcryption tag-KEM and a DEM. If the signcryption tag-KEM is IND-CCA2 secure and the DEM is one-time secure, then HSC is IND-CCA2 secure. Moreover, if the signcryption tag-KEM is SUF-CMA secure, then HSC is also SUF-CMA secure.

Code-based hybrid signcryption
In this section, we first design a code-based signcryption tag-KEM scheme. Then we combine it with a one-time (OT) secure DEM for designing a hybrid signcryption tag-KEM+DEM scheme.

Code-based signcryption tag-KEM scheme
For designing our code-based signcryption tag-KEM scheme, we use the McEliece scheme as the underlying encryption scheme. More specifically, in order to achieve the IND-CCA2 security for our schemes, we use McEliece's scheme with the Fujisaki-Okamoto transformation [33,16]. The authors of [16] gave an instantiation of this scheme using generalized Srivastava (GS) codes. Indeed, by using GS codes, it seems possible to choose secure parameters even for codes defined over relatively small extension fields. However, Barelli and Couvreur recently introduced an efficient structural attack [6] against some of the candidates in the NIST post-quantum cryptography standardization process. Their attack is against code-based encryption schemes using some quasi-dyadic alternant codes with extension degree 2. It works specifically for schemes based on GS code called DAGS [4]. Therefore, in our work, we use the Goppa code with the Classic McEliece parameters. As for the underlying signature scheme, we use the code-based Wave [23] as described earlier.
Because we use Wave, the sender's secret key is a generalized $(U, U+V)$-code over a finite field $\mathbb{F}_q$ with $q > 2$, and the sender's public key is a parity-check matrix of a code equivalent to it. To reduce the public key size, we use a permuted Goppa subcode for the receiver's public key. Thus, we include the subcode equivalence problem as one of the security assumptions of our scheme. In Figure 5, we describe the algorithm Setup, which provides the common parameters for our scheme.
We give the key generation algorithms in Figure 6, where we denote the sender key generation algorithm by KeyGen$_s$ and that of the receiver by KeyGen$_r$. The receiver algorithm KeyGen$_r$ returns as signcryption public key a generator matrix $G_{pk,r} \in \mathbb{F}_2^{k \times n_r}$ of a code equivalent to a Goppa subcode. It returns as signcryption secret key the tuple $(g_r, \Gamma_r, S_r^{-1}, P_r)$, where $\Gamma_r$ and $g_r$ are, respectively, the support and the polynomial of a Goppa code, $S_r \in \mathbb{F}_2^{k \times k_r}$ is a full-rank matrix, and $P_r$ is a permutation matrix. The sender key generation algorithm KeyGen$_s$ returns as private key three matrices: an invertible matrix $S_s \in \mathbb{F}_3^{(n_s-k_s) \times (n_s-k_s)}$, a parity-check matrix $H_{sk,s} \in \mathbb{F}_3^{(n_s-k_s) \times n_s}$ of a random generalized $(U, U+V)$-code, and a permutation matrix $P_s \in \mathbb{F}_2^{n_s \times n_s}$. The sender public key is a parity-check matrix $H_{pk,s} \in \mathbb{F}_3^{(n_s-k_s) \times n_s}$ of an equivalent generalized $(U, U+V)$-code, given by $H_{pk,s} = S_s H_{sk,s} P_s$.
[Figure 6 (receiver key generation), recoverable final steps: with $k_r = n_r - mt$, set $sk_r = (g_r, \Gamma_r, S_r, P_r)$ and $pk_r = G_{pk,r} = S_r G_{sk,r} P_r$; return $sk_r$ and $pk_r$.]
In Figure 7, we give the design of the symmetric key generation algorithm Sym of our scheme. The algorithm Sym takes as input the bit length $\ell$ of the symmetric encryption key. It outputs an internal state information $\varpi$ and the session key $K$, where $\varpi$ is randomly chosen from $\mathbb{F}_2^{\ell}$ and $K$ is computed by using the hash function $H_0$. Figure 8 provides a description of the encapsulation and decapsulation algorithms of our signcryption tag-KEM scheme. We denote the encapsulation algorithm by Encap and the decapsulation algorithm by Decap.
In the encapsulation algorithm, the sender first performs a particular Wave signature on the message $m = \tau \,\|\, \varpi$, where $\varpi$ is the internal state information and $\tau$ is the input tag. The signature in the Wave scheme comprises two parts: an error vector $e \in \mathbb{F}_3^{n_s}$ and a random binary vector $y$. In our scheme, $z$ is the hash of a random coin $y \in \mathbb{F}_2^{\kappa}$. The sender then performs an encryption of $m' = H_1(\tau) \,\|\, \varpi$. The encryption that we use in our scheme is the IND-CCA2 secure McEliece encryption scheme with the Fujisaki-Okamoto transformation introduced by Cayrel et al. [16]. During the encryption, the sender adaptively uses the random binary vector $y$ as a random coin. The resulting ciphertext is denoted by $c$. The output is given by $E = (e, c)$.
In the decapsulation algorithm Decap, the receiver first recovers the internal state information $\varpi$ by using the algorithm Decrypt and the second part of the signature of $m$. Then it verifies the signature and computes the session key $K$ by using $\varpi$.
[Figure 8 (encapsulation and decapsulation), recoverable steps: the output of Encap is an encapsulation of the internal state information $\varpi$; Decap returns $\bot$ on failure, and otherwise computes $K := H_0(\varpi)$ and returns $K$.]
The algorithm Decrypt that we use in the decapsulation algorithm of our scheme is described in Figure 9. It is similar to that described in [16], but we introduce the following modifications:
• we use an encoding function $\phi$;
• the output is not only the clear message $m$ but a pair $(m, y)$, where $y$ is the preimage of the error vector $\sigma$ under the encoding function $\phi$.
Completeness of our signcryption tag-KEM: Let $\tau$ be a tag, and let $(sk_s, pk_s)$ (resp. $(sk_r, pk_r)$) be the sender's (resp. receiver's) key pair generated by the algorithm KeyGen with input $1^\lambda$. Let $(K, \varpi) := \mathrm{Sym}(sk_s, pk_r)$ be a pair consisting of a session key and an internal state information. Let $E := (e, c)$ be an encapsulation of the internal state information $\varpi$. Assuming that the encapsulation and decapsulation are performed by an honest user, we have:
- The receiver can recover the pair $(\tau' \,\|\, \varpi, y)$ from $c$ and verify successfully that $e H_{pk,s}^T = H_2(\tau \,\|\, \varpi \,\|\, y)$ and $\tau' = H_1(\tau)$. In other words, the receiver performs a successful signature verification of the message $m := \tau \,\|\, \varpi$ signed by an honest user using the dual version of mCFS signature.
- Therefore, it can compute the session key $K := H_0(\varpi)$.

Code-based hybrid signcryption
Here we use the signcryption tag-KEM described in Section 4.1 to design a code-based hybrid signcryption scheme. For the data encapsulation, we propose the use of a regular OT-secure symmetric encryption scheme. We denote the symmetric encryption algorithm being used by SymEncrypt and the symmetric decryption algorithm by SymDecrypt. Figure 10 gives the design of our code-based hybrid signcryption tag-KEM+DEM. In this design, the algorithms Setup, KeyGen$_s$, and KeyGen$_r$ are the same as those of our signcryption tag-KEM. The algorithms Sym and Encap are those of our signcryption tag-KEM in Section 4.1.
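The way a tag-KEM and a DEM fit together can be sketched as below. Two assumptions are made loudly: first, the DEM ciphertext is used as the tag passed to Encap, following the generic tag-KEM+DEM composition of [14] (Figure 10 is not reproduced here, so this ordering is an assumption about the design); second, `StubTagKEM` is a hypothetical, insecure placeholder for the code-based tag-KEM, and the DEM is a SHAKE-256 keystream XOR chosen only for illustration.

```python
import hashlib
import os

class StubTagKEM:
    """Hypothetical, insecure stand-in for the signcryption tag-KEM interface."""

    def sym(self, sk_s, pk_r):
        state = os.urandom(16)
        return hashlib.sha256(b"K" + state).digest(), state

    def encap(self, state, tag):
        return state + hashlib.sha256(state + tag).digest()

    def decap(self, pk_s, sk_r, E, tag):
        state, check = E[:16], E[16:]
        if check != hashlib.sha256(state + tag).digest():
            return None                      # error symbol
        return hashlib.sha256(b"K" + state).digest()

def sym_encrypt(K, m):
    """Illustrative one-time DEM: XOR with a SHAKE-256 keystream derived from K."""
    pad = hashlib.shake_256(K).digest(len(m))
    return bytes(a ^ b for a, b in zip(m, pad))

sym_decrypt = sym_encrypt  # XOR stream: decryption equals encryption

def signcrypt(kem, sk_s, pk_r, m):
    K, state = kem.sym(sk_s, pk_r)           # session key and internal state
    C = sym_encrypt(K, m)                    # DEM ciphertext
    E = kem.encap(state, C)                  # assumption: the tag is C
    return E, C

def unsigncrypt(kem, pk_s, sk_r, E, C):
    K = kem.decap(pk_s, sk_r, E, C)          # verifies E against the tag C
    if K is None:
        return None
    return sym_decrypt(K, C)

kem = StubTagKEM()
E, C = signcrypt(kem, b"sk_s", b"pk_r", b"hello")
print(unsigncrypt(kem, b"pk_s", b"sk_r", E, C))
```

Binding the encapsulation to the DEM ciphertext through the tag is what lets a modified ciphertext be rejected at decapsulation time, before any symmetric decryption takes place.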

Security analysis
Before discussing the security of our hybrid scheme, let us state the following assumptions for our security analysis:
Assumption 1: The advantage of any probabilistic polynomial-time algorithm A in solving the problem of decoding random linear codes is negligible with respect to the length n and dimension k of the code.
Assumption 2: The advantage of any probabilistic polynomial-time algorithm A in solving the (U, U + V) distinguishing problem is negligible with respect to the length n and dimension k of the code.
Assumption 3: The advantage of any probabilistic polynomial-time algorithm A in solving the subcode equivalence problem is negligible with respect to the length n and dimension k of the code.
Assumption 4: The advantage of any probabilistic polynomial-time algorithm A in solving the decoding one out of many (DOOM) problem is negligible with respect to the length n and dimension k of the code.
Assumption 5: The advantage of any probabilistic polynomial-time algorithm A in solving the Goppa code distinguishing problem is negligible with respect to the length n and dimension k of the code.

Information-set decoding algorithm
In code-based cryptography, the best-known non-structural attacks rely on information-set decoding. The information-set decoding algorithm was introduced by Prange [56] for decoding cyclic codes. Since the publication of Prange's work, there have been several works studying how to invert code-based encryption schemes using information-set decoding (see [2], Section 4.1).
For a given linear code of length n and dimension k, the main idea behind the information-set decoding algorithm is to find a set of k coordinates of a garbled vector that are error-free and such that the restriction of the code's generator matrix to these positions is invertible. Then, the original message can be computed by multiplying the encrypted vector by the inverse of the submatrix.
Thus, those k bits determine the codeword uniquely, and hence the set is called an information set. It is sometimes difficult to determine the exact resistance to this type of attack. However, it is always lower-bounded by the ratio of information sets without errors to total possible information sets, i.e.,
$$\frac{\binom{n-\omega}{k}}{\binom{n}{k}}, \tag{1}$$
where $\omega$ is the Hamming weight of the error vector. Therefore, well-chosen parameters can avoid these non-structural attacks. In our scheme, we use the parameters of the Wave signature [23] for the sender and those of Classic McEliece [2] for the receiver in the underlying encryption scheme.
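The information-set idea and the ratio of error-free information sets can be made concrete on a toy instance. The sketch below is a plain Prange-style search under stated assumptions: the binary [7,4] Hamming code is a hypothetical stand-in, the helper names are illustrative, and the trial count ignores the cost of each linear-algebra step. It is not the attack model used for the parameter selection of this paper.

```python
import random
from fractions import Fraction
from math import comb, log2

def error_free_ratio(n, k, w):
    """Fraction of k-subsets of the n positions that avoid all w error positions."""
    return Fraction(comb(n - w, k), comb(n, k))

def expected_trials_log2(n, k, w):
    """log2 of the inverse ratio, i.e. of the expected number of random trials."""
    return log2(comb(n, k)) - log2(comb(n - w, k))

def solve_gf2(A, b):
    """Solve A x = b over F_2 for a square A; return None if A is singular."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col]), None)
        if pivot is None:
            return None
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col and M[r][col]:
                M[r] = [u ^ v for u, v in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]

def prange(G, y, w, rng, max_tries=10_000):
    """Guess k positions, assume they are error-free, solve for m, check wt(e)."""
    k, n = len(G), len(G[0])
    for _ in range(max_tries):
        info_set = rng.sample(range(n), k)
        # Row j of A encodes the constraint (mG)_j = y_j for position j.
        A = [[G[i][j] for i in range(k)] for j in info_set]
        m = solve_gf2(A, [y[j] for j in info_set])
        if m is None:
            continue                      # restriction of G not invertible
        c = [sum(mi & G[i][j] for i, mi in enumerate(m)) % 2 for j in range(n)]
        e = [u ^ v for u, v in zip(y, c)]
        if sum(e) == w:
            return m, e
    return None

# Assumed toy instance: [7,4] Hamming code, one error.
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]
m = [1, 0, 1, 1]
e = [0, 0, 0, 0, 0, 1, 0]
c = [sum(mi & G[i][j] for i, mi in enumerate(m)) % 2 for j in range(7)]
y = [u ^ v for u, v in zip(c, e)]

found = prange(G, y, 1, random.Random(0))
print(found, error_free_ratio(7, 4, 1), expected_trials_log2(7, 4, 1))
```

For realistic parameters the same two functions `error_free_ratio` and `expected_trials_log2` can be evaluated directly, since they only rely on exact integer binomials.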

Key recovery attack
In code-based cryptography, usually, the first step in the key recovering attack is to perform a distinguishing attack on the public code in order to identify the family of the underlying code. Once successful, the attacker can then perform any well-known attack against this family of underlying codes to recover the secret key. When the underlying code is a Goppa code, the main distinguishing attack technique consists of evaluating the square code or the square of the trace code of the corresponding public code [30,?,?]. Note that this technique usually works for a Goppa code with a high rate. Compared to many other code-based encryption schemes, in which the public code is equivalent to an alternant or a Goppa code, in this work the public code is a permuted Goppa subcode. Thus, in addition to the indistinguishability of Goppa codes, the subcode equivalence problem becomes one of our security assumptions. Moreover, to the best of our knowledge, there is no attack reported in the literature on distinguishing a code equivalent to a Goppa subcode. Therefore, by using the subcode equivalence problem as a security assumption, we can keep our scheme out of the purview of the distinguishing attack even though the underlying code is a Goppa code. Throughout the rest of our analysis, we assume that the attacker knows that the family of the underlying code is a Goppa code. In our case, the key recovery attack is at two different levels: the first one is on the sender side, and the second one is on the receiver side.
On the receiver side, the key recovery attack consists of recovering the Goppa polynomial $g_r$ and the support $\Gamma_r = (\alpha_0, \dots, \alpha_{n-1})$ from the public matrix. The natural way to do this is a brute-force attack: one can determine the sequence $(\alpha_0, \dots, \alpha_{n-1})$ from $g_r$ and the set $\{\alpha_0, \dots, \alpha_{n-1}\}$, or alternatively determine $g_r$ from $(\alpha_0, \dots, \alpha_{n-1})$. A good choice of parameters can avoid this attack. For an irreducible Goppa code, $g_r$ ranges over the monic irreducible polynomials of degree $t$ over $\mathbb{F}_{2^m}$, whose number is
$$\frac{1}{t} \sum_{d \mid t} \mu(d)\, 2^{mt/d} \approx \frac{2^{mt}}{t},$$
where $\mu$ is the Möbius function. By using the parameters of Classic McEliece, we can see that the complexity of a brute-force attack to find the Goppa polynomial is more than $2^{800}$ for the parameters proposed in [2].
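To get a feeling for the size of the search space for the Goppa polynomial, one can count monic irreducible polynomials with the standard Möbius-inversion formula. The sketch below assumes that "number of choices of the Goppa polynomial" means the number of monic irreducible polynomials of degree t over the field with 2^m elements; the values m = 13 and t = 128 are illustrative assumptions and are not taken from this paper.

```python
from math import log2

def mobius(n):
    """Möbius function, computed by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0          # n has a squared prime factor
            result = -result
        p += 1
    if n > 1:
        result = -result          # one remaining prime factor
    return result

def count_monic_irreducible(q, t):
    """Number of monic irreducible polynomials of degree t over F_q."""
    total = sum(mobius(d) * q ** (t // d) for d in range(1, t + 1) if t % d == 0)
    return total // t

# Assumed illustrative values (hypothetical, for the order of magnitude only).
m, t = 13, 128
count = count_monic_irreducible(2 ** m, t)
print(f"about 2^{log2(count):.1f} candidate Goppa polynomials")
```

The small cases are easy to check by hand: over the field with two elements there is one irreducible polynomial of degree 2 and there are two of degree 3.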
It is also important to note that if the adversary knows the underlying Goppa code $C_{sk}$, performing the key recovery attack implies solving a computational instance of the subcode equivalence problem. Indeed, this corresponds to finding the permutation $\sigma$ such that $\sigma(C_{pk})$ is a subcode of $C_{sk}$. Finding the permutation $\sigma$ is equivalent to solving the following system:
$$G_{pk,r}\, X_\sigma\, H_{sk,r}^{T} = 0, \tag{2}$$
where $H_{sk,r}$ is a parity-check matrix of the underlying Goppa code $C_{sk,r}$, $G_{pk,r}$ is the generator matrix of the public code $C_{pk}$, and $X_\sigma = (x_{i,j})$ is the matrix of the unknown permutation $\sigma$. Note that solving (2) is equivalent to solving a variant of the permuted kernel problem [37]. A natural way to solve (2) is a brute-force attack, whose cost is of order $O(n!)$. However, the adversary could use Georgiades' technique [34], whose complexity in our case is given by (3). Recently, Paiva and Terada introduced in [51] a new technique for solving (2); the work factor of their attack applied to our scheme is given by (4). From (3) and (4), we can see that a well-chosen set of parameters can avoid the attack of Georgiades as well as that of Paiva and Terada.
In the case of the sender, the key recovery attack consists of first solving the $(U, U+V)$ distinguishing problem for finite fields of cardinality $q = 3$. Therefore, under Assumption 2 and with a well-chosen set of parameters, this attack would fail.
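The brute-force search over all permutations can be written down directly for a toy instance: a permutation is accepted when every permuted generator row of the small code has zero syndrome with respect to the large code. The sketch below assumes the binary [7,4] Hamming code as the large code and a hidden permutation of a 2-dimensional subcode as the public code; all names and values are hypothetical, and the n! enumeration is only feasible at this size.

```python
from itertools import permutations

def in_code(word, H):
    """True when the word has zero syndrome with respect to H over F_2."""
    return all(sum(a & b for a, b in zip(word, row)) % 2 == 0 for row in H)

def permute(word, sigma):
    """Move coordinate i of the word to position sigma[i]."""
    out = [0] * len(word)
    for i, s in enumerate(sigma):
        out[s] = word[i]
    return out

def find_subcode_permutation(G_C, H_D):
    """Exhaustive O(n!) search for sigma such that sigma(C) is a subcode of D."""
    n = len(H_D[0])
    for sigma in permutations(range(n)):
        if all(in_code(permute(row, sigma), H_D) for row in G_C):
            return sigma
    return None

# Assumed large code D: [7,4] Hamming code, given by a parity-check matrix.
H_D = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]
# Two codewords of D spanning a 2-dimensional subcode.
subcode_rows = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
]
# Hide the subcode behind a secret permutation: permute(G_C row, secret) is in D.
secret = (2, 0, 3, 1, 6, 4, 5)
G_C = [[row[s] for s in secret] for row in subcode_rows]

sigma = find_subcode_permutation(G_C, H_D)
print(sigma)
```

The permutation returned need not equal the hidden one, since several permutations can map the small code into the large one; any of them answers the decision problem.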

IND-CCA and SUF-CMA security
In code-based cryptography, the main approach to a chosen-ciphertext attack against the McEliece encryption scheme consists of adding two errors to the received word. If the decryption succeeds, it means that the error vector in the resulting word has the same weight as the previous one. In our signcryption tag-KEM scheme, this implies either recovering the session key K or distinguishing encapsulation of two different session keys from (e, c, τ ). We see that the recovery of the session key K corresponds to the recovery of plaintext in a IND-CCA2 secure version of McEliece's cryptosystem (see [16] Subsection 3.2). We now have the following theorem: Theorem 2. Under Assumptions 1, 3, and 5, the signcryption tag-KEM scheme described in Subsection 4.1 is IND-CCA2 secure.
Proof. Let A CCA2 be a PPT adversary against the signcryption tag-KEM scheme described in Subsection 4.1 in the signcryption tag-KEM IND-CCA2 game. Let us denote its advantage by ǫ CCA2,SCTKEM . For proving Theorem 2 we need to bound ǫ CCA2,SCTKEM .
Game 0: This game is the normal signcryption tag-KEM IND-CCA2 game. Let $X_0$ denote the event that the adversary wins Game 0 and $\Pr(X_0)$ the probability that it happens. Then we have $\Pr(X_0) = \epsilon_{CCA2,SCTKEM}$.

Game 1: This game corresponds to the simulation of the hash function oracle. It is the same as Game 0 except that the adversary has access to the hash function oracle: it looks for some pair $(\tau^*, y^*) \in \mathbb{F}_2^{\lambda} \times \mathbb{F}_2^{\kappa}$ such that $eH_s^T = H_2(\tau^* \| \varpi \| H_1(y^*))$. Then, it tries to continue by computing $c'$. We can see that it could succeed at least when the following collisions happen: Therefore, if $q_h$ is the number of queries allowed and $X_1$ the event that $\mathcal{A}_{CCA2}$ wins Game 1, then we have:

Game 2: This game is the same as Game 1 except that the error vector $e$ in the encapsulation output is generated randomly. We can see that the best way for the adversary to proceed is to split $c$ as $(c_0 \| c_1)$ and then try to invert either $c_0$, to recover the error $\sigma$, or $c_1$, to recover the internal state $\varpi_b$ directly. That means that the adversary is able either to solve the syndrome decoding problem or to invert a one-time pad function. Therefore we have: where $\epsilon_{SD}$ is the advantage of an adversary against the syndrome decoding problem, $\nu$ is a negligible function, and $\ell$ is the bit length of the symmetric encryption.

Game 3: This game is the same as Game 2 except for a change in the key generation algorithm: a random code is chosen as the underlying code instead of a Goppa code. We can see that this change is indistinguishable. In fact, distinguishing this change corresponds to solving, in part, the Goppa code distinguishing problem. Thus, we have where $\epsilon_{GCD}(\lambda)$ is the advantage of a PPT adversary in the Goppa code distinguishing problem and $\lambda$ is the security parameter. If there is a PPT adversary $\mathcal{A}$ capable of distinguishing this change, we can use it to construct an adversary $\mathcal{A}_{GCD}$ that solves the Goppa code distinguishing problem as follows:
1. On receiving an instance $G \in \mathbb{F}_2^{k \times n}$ of a generator matrix of a code $C$ in the Goppa code distinguishing problem, $\mathcal{A}_{GCD}$ extracts a generator matrix $G'$ of a subcode $C'$ of $C$ and forwards it to $\mathcal{A}$.
2. $\mathcal{A}$ replies with 1 if the change has happened, i.e., the underlying code is not a Goppa code. It replies with 0 otherwise.
3. If $\mathcal{A}_{GCD}$ receives 1 from $\mathcal{A}$, it means that $C$ is not a Goppa code and $\mathcal{A}_{GCD}$ outputs 0; otherwise it returns 1, i.e., $C$ is a Goppa code.
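The three steps of the reduction above can be sketched as a thin wrapper around the distinguisher. This is only an illustrative sketch: the names `extract_subcode_generator` and `make_gcd_adversary` are hypothetical, the distinguisher is passed in as an arbitrary callable, and the subcode is obtained by simply selecting rows of $G$, which is an assumption made for simplicity and not necessarily the map used by the key generation of the scheme.

```python
import random


def extract_subcode_generator(G, k_sub, rng):
    """Return k_sub rows of G, i.e. a generator matrix of a subcode of C.

    Assumption: G has full row rank (it is a generator matrix), so any
    subset of its rows is linearly independent. Selecting rows is just one
    simple way to obtain a subcode.
    """
    chosen = rng.sample(range(len(G)), k_sub)
    return [list(G[i]) for i in chosen]


def make_gcd_adversary(distinguisher, k_sub, seed=0):
    """Build A_GCD from a distinguisher A for the Game 2 / Game 3 change.

    distinguisher(G_sub) is assumed to return 1 if it believes the
    underlying code is not a Goppa code, and 0 otherwise (step 2).
    """
    rng = random.Random(seed)

    def adversary_gcd(G):
        # Step 1: extract a subcode generator matrix and forward it to A.
        G_sub = extract_subcode_generator(G, k_sub, rng)
        # Step 2: A answers 1 ("change happened") or 0.
        answer = distinguisher(G_sub)
        # Step 3: output 0 ("not Goppa") if A said 1, and 1 ("Goppa") otherwise.
        return 0 if answer == 1 else 1

    return adversary_gcd
```

With this convention, $\mathcal{A}_{GCD}$ is correct exactly when $\mathcal{A}$ is, which is why its advantage carries over to the Goppa code distinguishing problem.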

Game 4: This game is the same as Game 3 except that the public key is a random matrix instead of a generator matrix of a permuted subcode. We can see that this change is indistinguishable according to the subcode equivalence assumption. Thus we have: where $\epsilon_{ES}(\lambda)$ is the advantage of a PPT adversary in the subcode equivalence problem and $\lambda$ is the security parameter.

Moreover, we can show that if an adversary $\mathcal{A}_{CCA2}$ wins this game, we can use it to construct an adversary $\mathcal{A}_{McE}$ for attacking the underlying McEliece scheme in the public key encryption IND-CCA2 game (called PKE.Game in Appendix A). For more details on the underlying McEliece encryption scheme and its IND-CCA2 security proof, the reader is referred to Appendix C. We now proceed as follows:
• Given the receiver public key $pk$, which corresponds to a receiver public key of the signcryption tag-KEM, $\mathcal{A}_{McE}$ does the following:

Let $\epsilon_{PKE}$ be the advantage of $\mathcal{A}_{McE}$ in the PKE.Game. Note that the target ciphertext $c$ can be uniquely decrypted to $H_1(\tau) \| \varpi_\delta$. Therefore any $(c, \tau')$ other than $(c, \tau)$ cannot be a valid signcryption ciphertext unless a collision of $H_1$ takes place, i.e., $H_1(\tau_i) = H_1(\tau)$. The correct answer to any decryption query with $c_i = c$ is $\perp$. Decryption queries from $\mathcal{A}_{CCA2}$ are correctly answered since $c_i$ is decrypted by the decryption oracle of the PKE.Game.
When $\mathcal{A}_{CCA2}$ outputs $\tilde{\delta}$, it means that $\varpi_\delta$ is embedded in $c_i$; otherwise $\varpi_{1-\delta}$ is embedded. It means that the adversary $\mathcal{A}_{McE}$ wins the PKE.Game with the same probability as $\mathcal{A}_{CCA2}$ wins Game 4 when no collision of $H_1$ has happened. Let $\tilde{X}$ be the event that a collision of $H_1$ has happened and $\tilde{X}_4$ the event that $\mathcal{A}_{McE}$ wins the PKE.Game. Let us denote by $\epsilon_{pke}$ the probability of the event $\tilde{X}_4$ and by $\epsilon_{col}$ that of $\tilde{X}$. Therefore we have:

$\Pr(X_4 \mid \neg\tilde{X}) = \Pr(\tilde{X}_4) \implies \Pr(X_4) \le \Pr(\tilde{X}_4) + \Pr(\tilde{X})$

By putting it all together, we conclude our proof.

Proof. Let $\mathcal{F}_{CMA}$ be an adversary against our signcryption tag-KEM in the SUF-CMA game and $\epsilon_{CMA}$ its advantage. To forge a signcryption, the adversary $\mathcal{F}_{CMA}$ first needs to find a pair $(e, y) \in W_{q,n,\omega} \times \mathbb{F}_2^{k}$ such that $eH_{pk,s}^T = H_2(\tau \| \varpi \| y)$. Then, it tries to find $r \in \mathbb{F}_2^{\kappa}$ such that $H_1(r) = y$, i.e., it wins the target pre-image free game (see Appendix B) against the cryptographic hash function $H_1$. We can see that finding $(e, y) \in W_{q,n,\omega} \times \mathbb{F}_2^{k}$ such that $eH_{pk,s}^T = H_2(\tau \| \varpi \| y)$ corresponds to a forgery of the underlying Wave signature scheme. Let $\epsilon_{PreIm}$ be the advantage of an adversary in the pre-image free game against a cryptographic hash function. Let $\mathcal{A}_{Wave,CMA}$ be an adversary against the Wave signature in the EUF-CMA game and $\epsilon_{Wave,EUF}$ its advantage. Let $X$ be the event that $\mathcal{A}_{Wave,CMA}$ wins. Let $\tilde{X}$ be the event that the adversary is able to find a pre-image $x$ of $y$ under $H_1$ such that $x \in \mathbb{F}_2^{\kappa}$. We have:

$\Pr(\mathcal{F}_{CMA} \text{ wins}) = \Pr(X \text{ and } \tilde{X}) \le \Pr(X) + \Pr(\tilde{X}) \le \epsilon_{Wave,EUF} + \frac{\epsilon_{PreIm}}{2^{\kappa}}$

Note that, because $H_1$ is a cryptographic hash function, $\epsilon_{PreIm}$ is negligible, and that concludes our proof.

Corollary 1. The signcryption tag-KEM described in Subsection 4.1 is secure.
The above corollary is a consequence of Theorems 2 and 3. We then have the following.

Proof. Proposition 1 is a consequence of Theorem 1. Indeed, under Assumptions 1, 3, and 5, the underlying signcryption tag-KEM is IND-CCA2 secure (see Theorem 2). In addition, the symmetric encryption scheme used is OT-secure. Therefore, a direct application of Theorem 1 completes the proof.

Proof. Under Assumptions 2 and 4, the underlying signcryption tag-KEM is SUF-CMA secure and, therefore, according to Theorem 1, the proposed hybrid signcryption tag-KEM + DEM is SUF-CMA secure.

Parameter values
For our scheme, we choose parameters such that $\lambda_0 = \lambda + 2\log_2(q_{sign})$ and $\lambda_{McE}$, of the underlying Wave signature and McEliece's encryption, respectively, satisfy $\max(\lambda_0, \lambda_{McE}) \le \binom{n_r}{t}$. According to the sender and receiver keys, the size of our ciphertext is given by $|E| = |e| + |c| + |C| = 2n_s + n_r + k + 2\ell$. Table 1 gives suggested values of the parameters of our scheme. These values have been derived using those of Wave [5] and Classic McEliece [2] for NIST PQC Level 1 security. According to the values given in Table 1, the ciphertext size in bits of our scheme is on the order of $|E| = 2.9 \times 10^4$.
Parameter   n_s    k_U    k_V    ω      m    t    n_r    k      ℓ
Value       8492   3558   2047   7980   12   64   3488   1815   512
Table 1. Parameter values of the proposed scheme.

Table 2 provides key sizes of our scheme in terms of relevant parameters. Then in Table 3 we give a numerical comparison of key and ciphertext sizes of our scheme with some existing lattice-based hybrid signcryption schemes. The rationale behind comparing our scheme against lattice-based schemes is that no code-based hybrid signcryption scheme exists in the literature and the underlying hard problems in both code- and lattice-based schemes are considered quantum-safe. For the lattice-based schemes in our comparison, the parameters, including the plaintext size of 512 bits, are from [58, Table 2]. We can see that for post-quantum security level 1 the proposed scheme has the smallest key and ciphertext sizes.

Conclusion
In this paper, we have proposed a new signcryption tag-KEM based on coding theory. The security of our scheme relies on known hard problems in coding theory. We have used the proposed signcryption scheme to design a new code-based hybrid signcryption tag-KEM+DEM. We have proven that the proposed schemes are IND-CCA2 and SUF-CMA secure against any probabilistic polynomial-time adversary. The proposed scheme has a smaller ciphertext size compared to the pertinent lattice-based schemes.
Step 3: $x \longrightarrow \mathcal{A}_{PreIm}(H, y)$ such that $x \in X$.

Figure 1, we need the following definition:

Definition 2. ($\gamma$-uniformity [16]) Let $\Pi$ be a public key encryption scheme and let $\mathcal{R}$ be the set from which the randomness used in the (probabilistic) encryption is chosen. For a given key pair $(pk, sk)$, a plaintext $x$, and a string $y$, we define $\gamma(x, y) = \Pr[r \xleftarrow{\$} \mathcal{R} : y = E_{pk}(x, r)]$, where the notation $E_{pk}(x, r)$ makes the role of the randomness $r$ explicit. We say that $\Pi$ is $\gamma$-uniform if, for any key pair $(pk, sk)$, any plaintext $x$, and any ciphertext $y$, $\gamma(x, y) \le \gamma$ for a certain $\gamma \in \mathbb{R}$.
We can now state the following lemma.

Proof. For any vector $y \in \mathbb{F}_2^{n_r}$, either $y$ is a word at distance $t$ from the code $C$ with generator matrix $G_{pk,r}$, or it is not. When $y$ is not at distance $t$ from $C$, the probability for it to be a valid ciphertext is equal to 0. Otherwise there is only one choice of $r$ and $e$ such that $y = rG_{pk,r} \oplus e$, i.e.,

$\Pr(d(y, C) = t) = \frac{1}{2^{k}\binom{n_r}{t}}$

Theorem 4. Under Assumptions 1, 3, and 5, the McEliece scheme based on a subcode of a Goppa code with the Fujisaki-Okamoto transformation described in Figure 1 is IND-CCA2 secure.
Proof. In Figure 1, the symmetric encryption used is the XOR function, which is a one-time pad. Under Assumptions 1 and 3, the underlying McEliece encryption scheme is one-way secure. Therefore, according to Theorem 12 of [33], the McEliece scheme with the Fujisaki-Okamoto transformation is IND-CCA2 secure.
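The counting argument used for the $\gamma$-uniformity lemma (each word at distance $t$ from the code arises from exactly one pair $(r, e)$) can be checked exhaustively on a toy code. The sketch below is an illustration under stated assumptions: it uses the binary [7,4] Hamming code with $t = 1$ as a stand-in for the public subcode, and the helper names `encode`, `weight_t_errors`, and `ciphertext_distribution` are hypothetical. It does not model the Fujisaki-Okamoto layer, only the map $(r, e) \mapsto rG \oplus e$.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations, product
from math import comb

# Systematic generator matrix of the binary [7,4] Hamming code (minimum
# distance 3, so t = 1 errors are uniquely decodable). Toy stand-in only.
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]


def encode(r, G):
    """Compute the codeword rG over F_2."""
    n = len(G[0])
    return tuple(sum(r[i] & G[i][j] for i in range(len(G))) % 2 for j in range(n))


def weight_t_errors(n, t):
    """Yield every binary vector of length n and Hamming weight t."""
    for support in combinations(range(n), t):
        yield tuple(1 if j in support else 0 for j in range(n))


def ciphertext_distribution(G, t):
    """Probability of each y = rG xor e for uniform r and uniform weight-t e."""
    k, n = len(G), len(G[0])
    counts = Counter()
    for r in product((0, 1), repeat=k):
        codeword = encode(r, G)
        for e in weight_t_errors(n, t):
            y = tuple(a ^ b for a, b in zip(codeword, e))
            counts[y] += 1
    total = 2**k * comb(n, t)  # number of (r, e) pairs
    return {y: Fraction(c, total) for y, c in counts.items()}


dist = ciphertext_distribution(G, t=1)
gamma = max(dist.values())  # equals 1 / (2^k * C(n, t)) when decoding is unique
```

On this toy code every reachable $y$ has the same probability $1/(2^4 \cdot \binom{7}{1})$, and vectors that are not at distance $t$ from the code never appear, which mirrors the two cases in the proof.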