Block-Sparse Recovery with Optimal Block Partition

Abstract—This paper presents a convex recovery method for block-sparse signals whose block partitions are unknown a priori. We first introduce a nonconvex penalty function, where the block partition is adapted to the signal of interest by minimizing the mixed $\ell_2/\ell_1$ norm over all possible block partitions. Then, by exploiting a variational representation of the $\ell_2$ norm, we derive the proposed penalty function as a suitable convex relaxation of the nonconvex one. For a block-sparse recovery model designed with the proposed penalty, we develop an iterative algorithm which is guaranteed to converge to a globally optimal solution. Numerical experiments demonstrate the effectiveness of the proposed method.


I. INTRODUCTION
BLOCK-SPARSITY is a special type of sparsity in which the nonzero components are clustered in blocks. For recovery of a block-sparse signal whose block partition is known a priori, extensive research, e.g., [1]–[10], shows the effectiveness of the mixed $\ell_2/\ell_1$ norm using the known block partition. In many applications, however, the information on the block partition is not available, e.g., in acoustic signal recovery [11]–[13], image restoration [14]–[16], change detection [17]–[19], and radar signal processing [20]–[23]. For instance, the target signal of the phased array weather radar [21]–[25] is block-sparse in the Fourier domain due to the narrow bandwidth of the power spectrum, but the block partition is unknown because it depends on the unknown mean and standard deviation of the Doppler frequency. In such situations, the estimation accuracy of the mixed $\ell_2/\ell_1$ norm often degrades because a pre-fixed block partition is used instead of the ideal one.
Motivated by these observations, we consider the estimation problem of $x \in \mathbb{C}^N$ which is block-sparse across unknown non-overlapping blocks $B_1, \ldots, B_K$ (see Fig. 1 for an illustration). More precisely, we suppose that the subvector $x_{B_k} := (x_n)_{n \in B_k}$ contains only (nearly) zero components for many $k$, i.e., $x_{B_k} \approx 0$ for many $k \in \{1, \ldots, K\}$.
We use the term block-sparse in a strict sense, i.e., $B_k$ consists only of consecutive indices for each $k = 1, \ldots, K$:
$$B_k = \{n \in \{1, \ldots, N\} \mid n_k \le n \le m_k\}$$
for some $n_k, m_k \in \{1, \ldots, N\}$.

This work was supported in part by the Japan Society for the Promotion of Science under Grants-in-Aid 21K17827, 21H01592, and 19K20361. (Corresponding author: Hiroki Kuroda.) The authors are with the College of Information Science and Engineering, Ritsumeikan University, Shiga 525-8577, Japan (e-mail: kuroda@media.ritsumei.ac.jp; d-kita@media.ritsumei.ac.jp).
Several attempts have been made to deal with the unknown blocks $B_1, \ldots, B_K$. A commonly used modification is to use overlapping blocks in the mixed $\ell_2/\ell_1$ norm, e.g., [26]–[33]. Among them, the latent group lasso (LGL) method [32], [33] presents a sophisticated approach to select relevant blocks from pre-defined overlapping blocks. The LGL penalty is defined as the minimum of a convex function, and thus the corresponding regularization model can be solved as a convex optimization problem (see Appendix A for details). However, since the problem size grows with the number of candidate blocks, the LGL method has a restriction on the number of candidate blocks, which degrades its estimation accuracy. Meanwhile, the greedy methods [34], [35] have similar restrictions on the number of candidate blocks due to their computational complexity. Bayesian approaches [36], [37] have also been presented, but they have to solve challenging nonconvex optimization problems. Thus, a convex method which can cope with the unknown blocks $B_1, \ldots, B_K$ of different sizes is highly desired.
In this paper, we propose a convex recovery method for $x$ block-sparse across the unknown blocks $B_1, \ldots, B_K$. Our main contribution is to introduce a novel convex penalty function, named the latent optimally partitioned (LOP)-$\ell_2/\ell_1$ penalty, in which the block partition is automatically adapted to the signal of interest. More precisely, we first design a nonconvex penalty function as the minimum of the mixed $\ell_2/\ell_1$ norm over all possible block partitions. Then, to derive a tight convex relaxation of the nonconvex penalty function, using a variational representation of the $\ell_2$ norm (see Lemma 1), we represent the nonconvex penalty function as the minimization of a convex function under an $\ell_0$ pseudo-norm constraint on latent variables. Finally, the LOP-$\ell_2/\ell_1$ penalty is derived by replacing the $\ell_0$ pseudo-norm constraint with its convex envelope, i.e., the $\ell_1$ norm constraint. To compute an optimal solution of a block-sparse recovery model designed with the LOP-$\ell_2/\ell_1$ penalty, we reformulate it into a convex optimization problem involving the latent variables of the LOP-$\ell_2/\ell_1$ penalty. By applying proximal splitting techniques, e.g., [38]–[46], to the reformulated problem with the aid of the computation of the proximity operator shown in [47], [48], we obtain an iterative algorithm with guaranteed convergence to the optimal solution. Numerical experiments on both synthetic examples and real-world data, where block sizes and positions are unknown, show that the proposed method achieves superior estimation accuracy to existing convex methods including the LGL method.
The rest of this paper is organized as follows. In Section II, we introduce the LOP-$\ell_2/\ell_1$ penalty and show its extension to higher dimensions. In Section III, we design the block-sparse recovery model with the LOP-$\ell_2/\ell_1$ penalty and develop an iterative algorithm which converges to an optimal solution of the model. Section IV provides numerical examples, followed by the conclusion in Section V.
A preliminary short version of this paper is to be presented at a conference [49].
Notations: $\mathbb{R}$, $\mathbb{R}_+$, and $\mathbb{C}$ respectively denote the sets of all real numbers, all nonnegative real numbers, and all complex numbers. For matrices and vectors, we denote the transpose and the Hermitian transpose respectively by $(\cdot)^\top$ and $(\cdot)^H$. For $x = (x_1, \ldots, x_N) \in \mathbb{C}^N$ and an index set $I \subset \{1, \ldots, N\}$, $x_I := (x_n)_{n \in I}$ denotes the subvector of $x$ indexed by $I$. We define the support of $x \in \mathbb{C}^N$ by $\mathrm{supp}(x) := \{n \in \{1, \ldots, N\} \mid x_n \ne 0\}$. For a set $S$, $|S|$ denotes the cardinality of $S$. The $\ell_2$ norm, the $\ell_1$ norm, and the $\ell_0$ pseudo-norm of $x \in \mathbb{C}^N$ are respectively defined by $\|x\|_2 := \sqrt{x^H x}$, $\|x\|_1 := \sum_{n=1}^{N} |x_n|$, and $\|x\|_0 := |\mathrm{supp}(x)|$. We define the operator norm of $L \in \mathbb{C}^{M \times N}$ by $\|L\|_{\mathrm{op}} := \sup_{\|x\|_2 \le 1} \|Lx\|_2$. A function $g$ from a finite-dimensional Hilbert space $\mathcal{U}$ to $\mathbb{R} \cup \{\infty\}$ is called convex if $g(\beta u + (1-\beta)v) \le \beta g(u) + (1-\beta)g(v)$ for every $u, v \in \mathcal{U}$ and $\beta \in (0, 1)$. The set of all proper lower semicontinuous convex functions from $\mathcal{U}$ to $\mathbb{R} \cup \{\infty\}$ is denoted by $\Gamma_0(\mathcal{U})$.

II. DESIGN OF PROPOSED PENALTY FUNCTION
To evaluate the block-sparsity of $x \in \mathbb{C}^N$ across fixed non-overlapping blocks $B_1, \ldots, B_j \subset \{1, \ldots, N\}$, the mixed $\ell_2/\ell_1$ norm is widely used as a convex penalty function. The mixed $\ell_2/\ell_1$ norm is defined as the sum (i.e., the $\ell_1$ norm) of the block-wise $\ell_2$ norms:
$$\sum_{k=1}^{j} \sqrt{|B_k|}\, \|x_{B_k}\|_2, \tag{1}$$
where we employ the weight $\sqrt{|B_k|}$ following the suggestions in, e.g., [5]–[8]. The mixed $\ell_2/\ell_1$ norm promotes block-sparsity by pushing the components in each $B_k$ toward zero together. However, since the mixed $\ell_2/\ell_1$ norm excessively penalizes blocks composed of both zero and nonzero components, its performance degrades when $B_1, \ldots, B_j$ do not match the ground-truth blocks $B_1, \ldots, B_K$. To avoid this block mismatch, the block partition is adapted to $x$ in the proposed penalty function, named the latent optimally partitioned (LOP)-$\ell_2/\ell_1$ penalty. As shown in Fig. 2, we first introduce a nonconvex penalty function $\psi_K(x)$ as the minimum of the mixed $\ell_2/\ell_1$ norm over all partitions into at most $K$ blocks. Then, as a suitable convex relaxation of $\psi_K(x)$, we derive the LOP-$\ell_2/\ell_1$ penalty $\Psi_\alpha(x)$. Concretely, we define $\psi_K(x)$ by
$$\psi_K(x) := \min_{j \in \{1, \ldots, K\}} \ \min_{(B_1, \ldots, B_j) \in \mathcal{P}_j} \ \sum_{k=1}^{j} \sqrt{|B_k|}\, \|x_{B_k}\|_2, \tag{2}$$
where $\mathcal{P}_j$ consists of all partitions of $\{1, \ldots, N\}$ into $j$ non-overlapping blocks of consecutive indices, i.e., $B_k = \{n \in \{1, \ldots, N\} \mid n_k \le n \le m_k\}$ for some $n_k, m_k \in \{1, \ldots, N\}$ ($k = 1, \ldots, j$).
Note that the ground-truth number of blocks need not be known exactly; $K$ can be set to an upper bound of it. Note also that, since the sizes of $B_1, \ldots, B_j$ are adapted, the non-overlapping condition $B_k \cap B_{k'} = \emptyset$ ($k \ne k'$) is not restrictive.
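To make the definitions concrete, the following sketch (our illustration, not part of the paper) evaluates the mixed $\ell_2/\ell_1$ norm (1) for a given partition and computes $\psi_K(x)$ in (2) by dynamic programming over partitions into at most $K$ consecutive blocks; indices are 0-based.

```python
import numpy as np

def mixed_l21(x, blocks):
    """Weighted mixed l2/l1 norm (1): sum_k sqrt(|B_k|) * ||x_{B_k}||_2."""
    return sum(np.sqrt(len(b)) * np.linalg.norm(x[b]) for b in blocks)

def psi_K(x, K):
    """Nonconvex penalty (2): minimum of (1) over all partitions of
    {0,...,N-1} into at most K consecutive blocks, via O(K N^2) DP."""
    N = len(x)
    # cum[i] = sum of |x_n|^2 for n < i, so a block's energy is a difference
    cum = np.concatenate(([0.0], np.cumsum(np.abs(x) ** 2)))
    cost = lambda i, j: np.sqrt(j - i) * np.sqrt(cum[j] - cum[i])  # block {i,...,j-1}
    best = np.full((K + 1, N + 1), np.inf)
    best[0, 0] = 0.0
    for k in range(1, K + 1):
        for j in range(1, N + 1):
            best[k, j] = min(best[k - 1, i] + cost(i, j) for i in range(j))
    return min(best[k, N] for k in range(1, K + 1))
```

For small $N$, brute-force enumeration of partitions gives the same value, which is a convenient correctness check for the dynamic program.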
To derive the convex relaxation of $\psi_K(x)$, we exploit the following lemma, which shows a variational representation of the $\ell_2$ norm.
Lemma 1. Define $\varphi : \mathbb{C} \times \mathbb{R} \to \mathbb{R} \cup \{\infty\}$ by
$$\varphi(x, \tau) := \begin{cases} \dfrac{|x|^2}{2\tau} + \dfrac{\tau}{2}, & \text{if } \tau > 0, \\[1mm] 0, & \text{if } \tau = 0 \text{ and } x = 0, \\[1mm] \infty, & \text{otherwise}, \end{cases} \tag{4}$$
which is a coercive lower semicontinuous convex function. Then, the block-wise $\ell_2$ norm is variationally represented as
$$\sqrt{|B|}\, \|x_B\|_2 = \min_{\tau \in \mathbb{R}} \sum_{n \in B} \varphi(x_n, \tau) \tag{5}$$
for any $B \subset \{1, \ldots, N\}$.
Proof. See Appendix C.
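As a quick numerical sanity check of Lemma 1 (our illustration, not part of the paper), the minimum of $\tau \mapsto \sum_{n \in B} \varphi(x_n, \tau)$ over $\tau > 0$ indeed matches $\sqrt{|B|}\,\|x_B\|_2$ when $x_B \ne 0$:

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)
xB = rng.standard_normal(5)                       # a nonzero block x_B with |B| = 5
# sum of phi(x_n, tau) over n in B, for tau > 0
f = lambda t: np.sum(np.abs(xB) ** 2) / (2 * t) + len(xB) * t / 2
res = minimize_scalar(f, bounds=(1e-8, 1e3), method="bounded")
print(res.fun, np.sqrt(len(xB)) * np.linalg.norm(xB))   # nearly equal
print(res.x, np.linalg.norm(xB) / np.sqrt(len(xB)))     # minimizing tau
```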

Remark 1. Lemma 1 is a slight modification of the existing result
$$\sqrt{|B|}\, \|x_B\|_2 = \inf_{\tau > 0} \sum_{n \in B} \left( \frac{|x_n|^2}{2\tau} + \frac{\tau}{2} \right),$$
used in, e.g., [33] for the analysis of the LGL penalty and [50] for the design of a graph-sparse penalty. Since the domain of $\tau \mapsto \frac{|x|^2}{2\tau} + \frac{\tau}{2}$ is not closed, this representation is inappropriate for use in concrete optimization algorithms; indeed, the applicability of the algorithm of [50] is limited for this reason [51]. Lemma 1 resolves this issue by using the lower semicontinuous convex function $\varphi$.
By applying Lemma 1 to each $\sqrt{|B_k|}\, \|x_{B_k}\|_2$ in (2), we can rewrite $\psi_K(x)$ as
$$\psi_K(x) = \min_{j \in \{1, \ldots, K\}} \ \min_{(B_1, \ldots, B_j) \in \mathcal{P}_j} \ \min_{(\tau_1, \ldots, \tau_j) \in \mathbb{R}^j} \ \sum_{k=1}^{j} \sum_{n \in B_k} \varphi(x_n, \tau_k).$$
Introducing a latent vector $\sigma = (\sigma_1, \ldots, \sigma_N) \in \mathbb{R}^N$ by $\sigma_n := \tau_k$ for every $n \in B_k$, as illustrated in Fig. 3, we see that $\sigma$ is characterized by the condition $\|D\sigma\|_0 \le j - 1$, where $D \in \mathbb{R}^{(N-1) \times N}$ is the first difference operator defined by $(D\sigma)_n := \sigma_{n+1} - \sigma_n$, because each $B_k$ consists only of consecutive indices ($k = 1, \ldots, j$). In the example of Fig. 3, where $j = 7$, $D\sigma$ has only $j - 1 = 6$ nonzero components.
Thus, we can represent $\psi_K(x)$ as
$$\psi_K(x) = \min_{\sigma \in \mathbb{R}^N, \ \|D\sigma\|_0 \le K-1} \ \sum_{n=1}^{N} \varphi(x_n, \sigma_n).$$
Finally, based on the fact that the $\ell_1$ norm is the tightest convex relaxation of the $\ell_0$ pseudo-norm, we derive the LOP-$\ell_2/\ell_1$ penalty by replacing the $\ell_0$ pseudo-norm constraint with an $\ell_1$ norm constraint:
$$\Psi_\alpha(x) := \min_{\sigma \in \mathbb{R}^N, \ \|D\sigma\|_1 \le \alpha} \ \sum_{n=1}^{N} \varphi(x_n, \sigma_n), \tag{6}$$
where $\alpha \in \mathbb{R}_+$ is a tuning parameter related to the number of blocks.

Theorem 1. For any $\alpha \in \mathbb{R}_+$, the LOP-$\ell_2/\ell_1$ penalty satisfies $\Psi_\alpha \in \Gamma_0(\mathbb{C}^N)$ and is coercive.

Proof. See Appendix D.
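The latent-variable mechanism behind (6) is easy to visualize numerically. The sketch below (our illustration) builds the first difference operator $D$ as a sparse matrix and checks that a blockwise-constant $\sigma$ on $j$ consecutive blocks yields $\|D\sigma\|_0 = j - 1$:

```python
import numpy as np
import scipy.sparse as sp

N = 12
# First difference operator: (D sigma)_n = sigma_{n+1} - sigma_n
D = sp.diags([-np.ones(N - 1), np.ones(N - 1)], offsets=[0, 1], shape=(N - 1, N))

sizes = [3, 2, 4, 3]                       # a partition into j = 4 consecutive blocks
sigma = np.repeat([1.0, 0.5, 2.5, 0.8], sizes)   # blockwise-constant latent vector
print((D @ sigma != 0).sum())              # prints 3 = j - 1
```

The $\ell_1$ constraint in (6) relaxes exactly this count of jumps in $\sigma$.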
As its special instances, the LOP-$\ell_2/\ell_1$ penalty reproduces the mixed $\ell_2/\ell_1$ norm with the coarsest partition and with the finest partition. This is shown precisely in the next theorem.
Theorem 2. The LOP-$\ell_2/\ell_1$ penalty reproduces the mixed $\ell_2/\ell_1$ norm with the single largest block $\{1, \ldots, N\}$ for $\alpha = 0$ and with the smallest blocks $(\{k\})_{k=1}^{N}$ for $\alpha \to \infty$, i.e.,
$$\Psi_0(x) = \sqrt{N}\, \|x\|_2 \tag{7}$$
and
$$\lim_{\alpha \to \infty} \Psi_\alpha(x) = \|x\|_1. \tag{8}$$
This theorem also suggests that the tuning parameter $\alpha$ indeed controls the number of blocks in the LOP-$\ell_2/\ell_1$ penalty, though blocks do not explicitly appear in (6). Note that, in our experiments shown in Section IV, the performance of the LOP-$\ell_2/\ell_1$ penalty is fairly robust against the choice of $\alpha$.
Remark 2 (Generalization to higher dimensions). For a matrix $X \in \mathbb{C}^{N \times M}$, the LOP-$\ell_2/\ell_1$ penalty can be naturally extended by replacing the 1D difference operator with a 2D one:
$$\Psi^{(\mathrm{2d})}_\alpha(X) := \min_{\Sigma \in \mathbb{R}^{N \times M}, \ \|D_{\mathrm{2d}}(\Sigma)\|_1 \le \alpha} \ \sum_{n=1}^{N} \sum_{m=1}^{M} \varphi(X_{n,m}, \Sigma_{n,m}), \tag{9}$$
where $D_{\mathrm{2d}}$ computes differences in the vertical and horizontal directions. Note that we do not focus only on rectangular blocks: $\Psi^{(\mathrm{2d})}_\alpha$ can manage blocks of various shapes because most components of $D_{\mathrm{2d}}(\Sigma)$ are zero whenever $\Sigma$ is constant on each block, irrespective of the block shapes. The LOP-$\ell_2/\ell_1$ penalty can be naturally extended to signals in higher dimensions, e.g., $p$th-order tensors, by modifying the difference operator in similar ways. A sketch of such a 2D difference operator is given below.
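The following sketch (our illustration; the exact stacking and weighting in (9) follow the paper's definition) applies vertical and horizontal first differences to a latent matrix that is constant on two non-rectangular blocks, showing that most differences vanish regardless of block shape:

```python
import numpy as np

def D2d(S):
    """Vertical and horizontal first differences of S
    (shapes (N-1, M) and (N, M-1), respectively)."""
    return S[1:, :] - S[:-1, :], S[:, 1:] - S[:, :-1]

S = np.zeros((6, 8))
S[:3, :5] = 1.0          # an L-shaped arrangement of two constant regions
S[3:, 2:] = 2.5
Dv, Dh = D2d(S)
print(np.count_nonzero(Dv), np.count_nonzero(Dh))   # few nonzeros: block boundaries only
```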

III. PROPOSED BLOCK-SPARSE RECOVERY MODEL AND ITS OPTIMIZATION ALGORITHM
We present a block-sparse recovery model using the LOP-$\ell_2/\ell_1$ penalty (6), and develop an iterative algorithm which converges to an optimal solution of the model. Specifically, we consider the following regularization model:
$$\hat{x} \in \arg\min_{x \in \mathbb{C}^N} \ f(Lx) + \lambda \Psi_\alpha(x), \tag{10}$$
where $f(Lx)$ is a convex data-fidelity function with $f : \mathbb{C}^J \to \mathbb{R}_+$ and $L \in \mathbb{C}^{J \times N}$, and $\lambda > 0$ is the regularization parameter. We suppose that $f \in \Gamma_0(\mathbb{C}^J)$ and that its proximity operator can be computed efficiently. Such examples include the square error¹ $f(u) = \frac{1}{2}\|y - Au\|_2^2$ with $L = I$, and the absolute error $f(u) = \|y - u\|_1$ with $L = A$, where $y$ is the known observation vector, $A$ is the known measurement matrix, and $I$ denotes the identity matrix; closed-form proximity operators for both examples are sketched below. The existence of a solution of (10) is guaranteed by the coercivity of $\Psi_\alpha \in \Gamma_0(\mathbb{C}^N)$, as shown in the next theorem.

Theorem 3. For any $\lambda > 0$ and $\alpha \in \mathbb{R}_+$, the solution set of (10) is nonempty.

Proof. It follows from Theorem 1 by [52, Corollary 11.16].
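For reference, these are the standard closed forms of $\mathrm{prox}_{\gamma f}$ for the two examples above with $L = A$, i.e., $f(u) = \frac{1}{2}\|y - u\|_2^2$ and $f(u) = \|y - u\|_1$ (a sketch for real-valued data; for complex data, replace np.sign(d) with d/|d|):

```python
import numpy as np

def prox_sq(u, y, gamma):
    """prox of gamma * 0.5 * ||y - .||_2^2: a convex combination with y."""
    return (u + gamma * y) / (1.0 + gamma)

def prox_abs(u, y, gamma):
    """prox of gamma * ||y - .||_1: soft-thresholding centered at y."""
    d = u - y
    return y + np.sign(d) * np.maximum(np.abs(d) - gamma, 0.0)
```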
We show that an optimal solution of the proposed model (10) can be obtained by utilizing proximal splitting techniques [38]–[46] as follows. By plugging the definition of $\Psi_\alpha(x)$ in (6) into (10), the optimization problem (10) is translated into
$$\underset{x \in \mathbb{C}^N, \ \sigma \in \mathbb{R}^N}{\text{minimize}} \ f(Lx) + \lambda \sum_{n=1}^{N} \varphi(x_n, \sigma_n) \quad \text{subject to} \quad \|D\sigma\|_1 \le \alpha. \tag{11}$$
To apply the primal-dual algorithm [38]–[43] shown in Appendix B, by introducing the auxiliary variables $u = Lx$ and $\eta = D\sigma$, we further rewrite the optimization problem (11) as
$$\underset{v = (x, \sigma, u, \eta)}{\text{minimize}} \ G(v) \quad \text{subject to} \quad Hv = 0, \tag{12}$$
where
$$G(v) := \lambda \sum_{n=1}^{N} \varphi(x_n, \sigma_n) + f(u) + \iota_{B_1^\alpha}(\eta), \qquad H := \begin{bmatrix} \mu_1 L & O & -\mu_1 I & O \\ O & \mu_2 D & O & -\mu_2 I \end{bmatrix},$$
$\iota_{B_1^\alpha}$ is the indicator function of the $\ell_1$ ball $B_1^\alpha := \{\eta \mid \|\eta\|_1 \le \alpha\}$, $I$ and $O$ respectively denote the identity and zero matrices of appropriate sizes, and $\mu_1$ and $\mu_2$ satisfying
$$\mu_1 \sqrt{\|L\|_{\mathrm{op}}^2 + 1} \le 1 \quad \text{and} \quad \mu_2 \sqrt{\|D\|_{\mathrm{op}}^2 + 1} \le 1 \tag{13}$$
are introduced so that $\|H\|_{\mathrm{op}} \le 1$.
Applying the primal-dual algorithm (26) in Appendix B to the problem (12), we obtain the proposed algorithm shown in Algorithm 1, where the updates are written in terms of $(x, \sigma, u, \eta)$. The proximity operator of $\gamma G$ in (26) reduces to the proximity operators of $\gamma\lambda\varphi$, $\gamma f$, and $\gamma \iota_{B_1^\alpha}$, since $\gamma G$ is the separable sum of these functions. Based on [48, Example 2.4], we can compute the proximity operator of $\gamma\lambda\varphi$ as
$$\mathrm{prox}_{\gamma\lambda\varphi}(x, \sigma) = (p^\star, \tau^\star) \tag{14}$$
with
$$p^\star = \frac{\tau^\star}{\tau^\star + \gamma\lambda}\, x, \qquad \tau^\star = \max\left\{ \sigma + \frac{\gamma\lambda}{2}\left(s^2 - 1\right),\ 0 \right\}, \tag{15}$$
where $s > 0$ is the unique positive root of
$$s^3 + \left( \frac{2\sigma}{\gamma\lambda} + 1 \right) s - \frac{2|x|}{\gamma\lambda} = 0,$$
and can be given explicitly via Cardano's formula (see Appendix F for the derivation).
Remark 3 (Computational complexity of the proposed method). We summarize the computational cost of Algorithm 1 per iteration. We can compute $\tilde{x}^{(i+1)}$, $\tilde{u}^{(i+1)}$, and $r$ with $O(N)$ operations since $D$ has only $2(N-1)$ nonzero entries. The proximity operator of $\gamma\lambda\varphi$ can be computed by (14) and (15) with $O(N)$ operations. By setting $L = A$, $\mathrm{prox}_{\gamma f}$ can be computed with $O(N)$ operations for both the absolute error and the square error (see footnote 1). Note that, for the square error, setting $L = I$ can be beneficial when, e.g., $A^H A$ is diagonal. The computation of the $\ell_1$ ball projection $P_{B_1^\alpha}$ by (18) and (19) requires $O(N \log N)$ operations for sorting; sophisticated $\ell_1$ ball projection algorithms with $O(N)$ expected complexity, e.g., [53], can also be used. In particular, this discussion implies that the computations regarding the LOP-$\ell_2/\ell_1$ penalty can be implemented with $O(N)$ operations. Thus, per iteration, the proposed method has the same computational complexity as the mixed $\ell_2/\ell_1$ regularization method using non-overlapping blocks when the primal-dual algorithm (26) in Appendix B is applied. A sketch of the proximity operator of $\gamma\lambda\varphi$ is given below.
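The following per-component sketch implements (14) and (15) as reconstructed above (gl denotes $\gamma\lambda$). It uses np.roots as a stand-in for the closed-form Cardano expression of Appendix F, so treat it as illustrative rather than the paper's exact routine:

```python
import numpy as np

def prox_phi(x, sigma, gl):
    """prox of (gamma*lambda)*phi at (x, sigma), applied componentwise.
    For x_n != 0, s is the unique positive root of
        s^3 + (2*sigma/gl + 1) s - 2|x|/gl = 0;
    the boundary case (p, tau) = (0, 0) is covered by the max in (15)."""
    p = np.zeros(len(x), dtype=complex)
    t = np.zeros(len(x))
    for n in range(len(x)):
        ax = abs(x[n])
        if ax > 0.0:
            a = 2.0 * sigma[n] / gl + 1.0
            r = np.roots([1.0, 0.0, a, -2.0 * ax / gl])      # s^3 + a s - 2|x|/gl
            s = max(np.real(r)[np.abs(np.imag(r)) < 1e-8])   # unique positive real root
        else:
            s = 0.0                                          # x_n = 0 reduces to tau-only step
        tau = max(sigma[n] + gl * (s * s - 1.0) / 2.0, 0.0)
        p[n] = (tau / (tau + gl)) * x[n]
        t[n] = tau
    return p, t
```

Solving one depressed cubic per component keeps the overall cost at $O(N)$, consistent with Remark 3.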

IV. NUMERICAL EXPERIMENTS
To show the effectiveness of the proposed block-sparse recovery model (10) using the LOP-$\ell_2/\ell_1$ penalty (6), we conduct numerical experiments on both synthetic and real-world data. We compare the LOP-$\ell_2/\ell_1$ model with regularization models using existing convex penalties: the $\ell_1$ norm, the mixed $\ell_2/\ell_1$ norm (1), and the LGL penalty (24) in Appendix A.

A. Synthetic Examples
We consider the estimation of a block-sparse signal $x \in \mathbb{R}^N$ from noisy compressive measurements. More precisely, we define the measurements by $y := Ax + \varepsilon$, where the entries of $A \in \mathbb{R}^{d \times N}$ ($d < N$) are drawn from the i.i.d. Gaussian distribution $\mathcal{N}(0, 1)$, and $\varepsilon \in \mathbb{R}^d$ is white Gaussian noise. The block-sparse signal $x$ is randomly generated by the following scheme. We set $N = 250$, and $x$ has 80 nonzero components, which are randomly divided into 4 blocks. The blocks are randomly located under the condition $|\mathrm{supp}(x)| = 80$. The amplitudes of the nonzero components are drawn i.i.d. from $\mathcal{N}(0, 1)$.
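A minimal sketch of this data generation follows (our reading of the scheme; the number of measurements d and the noise level are placeholders, since their exact values are not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(0)
N, nnz, nblk, d = 250, 80, 4, 100          # d and noise scale below are assumptions

# randomly divide the 80 nonzero components into 4 block sizes
cuts = np.sort(rng.choice(np.arange(1, nnz), nblk - 1, replace=False))
sizes = np.diff(np.concatenate(([0], cuts, [nnz])))          # sums to 80

# randomly place the blocks with at least one zero between them
free = N - nnz
gcut = np.sort(rng.choice(np.arange(1, free), nblk, replace=False))
gaps = np.diff(np.concatenate(([0], gcut)))                  # gap before each block

x = np.zeros(N)
pos = 0
for g, s in zip(gaps, sizes):
    pos += g
    x[pos:pos + s] = rng.standard_normal(s)                  # i.i.d. N(0,1) amplitudes
    pos += s

A = rng.standard_normal((d, N))                              # i.i.d. N(0,1) entries
y = A @ x + 0.05 * rng.standard_normal(d)                    # noisy measurements
```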
In the proposed regularization model (10), we use the square error by setting $f(u) = \frac{1}{2}\|y - Au\|_2^2$ and $L = I$. The existing penalties are also combined with the square error in similar ways. Note that the regularization parameter is tuned independently for each model to obtain the best accuracy. The proposed model (10) is solved by Algorithm 1, where we set $\gamma = 10^{-1}$, $\mu_1 = 1/\sqrt{2}$, and $\mu_2 = 1/\sqrt{5}$, which satisfy the condition (13) for convergence to an optimal solution. The existing models are also solved by applying the primal-dual algorithm shown in Appendix B. We terminate the iterations when the norm of the difference between the variables of successive iterates falls below the threshold $10^{-4}$.
We compare the proposed model (10) and the existing models in terms of the normalized mean square error (NMSE) $\|x - \hat{x}\|_2^2 / \|x\|_2^2$, where $\hat{x}$ denotes the estimate of $x$.
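For completeness, the evaluation metric is simply:

```python
import numpy as np

def nmse(x_true, x_hat):
    """NMSE = ||x - x_hat||_2^2 / ||x||_2^2 used in the comparisons."""
    return np.linalg.norm(x_true - x_hat) ** 2 / np.linalg.norm(x_true) ** 2
```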

B. Application to Phased Array Weather Radar
A major goal of the phased array weather radar (PAWR) [24], [25] is to estimate the backscattered signal $X \in \mathbb{C}^{M \times N}$ from the noisy observations $Y := AX + E \in \mathbb{C}^{d \times N}$ obtained by transmitting fan beams, where $A \in \mathbb{C}^{d \times M}$ is a known array manifold matrix, $E \in \mathbb{C}^{d \times N}$ is white Gaussian noise, and $M$, $N$, and $d$ are respectively the numbers of elevation angles, pulses, and array elements. This estimation problem is called beamforming in the PAWR literature [21]–[23]. As shown in [22], [23], the backscattered signal $X$ exhibits block-sparsity in the Fourier domain for each elevation angle. Since the block partition depends on the unknown mean and standard deviation of the Doppler frequency at each elevation angle, it is also unknown and differs across elevation angles. Thus, the proposed approach is suitable for PAWR beamforming. Specifically, we design a LOP-$\ell_2/\ell_1$ model for PAWR beamforming as
$$\hat{X} \in \arg\min_{X \in \mathbb{C}^{M \times N}} \ \frac{1}{2}\|Y - AX\|_{\mathrm{fro}}^2 + \lambda \tilde{\Psi}^{(\mathrm{2d})}_\alpha(FX^\top), \tag{20}$$
where $\|\cdot\|_{\mathrm{fro}}$ denotes the Frobenius norm, and $F \in \mathbb{C}^{N \times N}$ is the normalized discrete Fourier transform matrix. To exploit the column-wise block-sparsity of $FX^\top$, we use $\tilde{\Psi}^{(\mathrm{2d})}_\alpha$, slightly modified from (9) by replacing $D_{\mathrm{2d}}$ with the vertical difference operator $D_{\mathrm{v}}$. To apply Algorithm 1, we rewrite (20) in the form of (11) as follows. Since $F$ is a unitary matrix, by letting $Z = FX^\top$ and substituting the definition of $\tilde{\Psi}^{(\mathrm{2d})}_\alpha$, the optimization problem (20) can be translated into
$$\underset{Z \in \mathbb{C}^{N \times M}}{\text{minimize}} \ \frac{1}{2}\|Y - A(F^H Z)^\top\|_{\mathrm{fro}}^2 + \lambda \tilde{\Psi}^{(\mathrm{2d})}_\alpha(Z). \tag{21}$$

Further, by defining $\bar{y}$, $\bar{A}$, $z$, and $\bar{D}_{\mathrm{v}}$ through vectorization (e.g., $\bar{A} = A \otimes F^H$ and $\bar{y} = \mathrm{vec}(Y^\top)$ for $z = \mathrm{vec}(Z)$), the problem (21) is expressed in the vectorized form (22), which is in the form of (11). We can therefore solve (22) by Algorithm 1 with $f(z) = \frac{1}{2}\|\bar{y} - \bar{A}z\|_2^2$, $L = I$, and $D = \bar{D}_{\mathrm{v}}$. From the solution $\hat{z}$ of (22), we obtain the solution of (20) by $\hat{X} = (F^H \mathrm{vec}^{-1}(\hat{z}))^\top$, where $\mathrm{vec}^{-1}$ is the inverse of the vectorization operator. The existing penalties are also combined with the square error and solved in similar ways, where the regularization parameters are tuned independently to obtain the best results. We conduct a numerical simulation in a setting similar to [21]–[23]. The backscattered signal $X$ is generated based on the real reflection intensity measured by the PAWR at Osaka University.³ Table I shows the NMSE averaged over 50 independent trials, where the LGL model uses overlapping blocks of size 2, and $\ell_2/\ell_1$ (a) and (b) respectively refer to the mixed $\ell_2/\ell_1$ models using block sizes 2 and 4. The results show that the LOP-$\ell_2/\ell_1$ model achieves better estimation accuracy than the existing models and is quite robust against the choice of the tuning parameter $\alpha$. In Fig. 5, we show examples of the power spectra of the ground truth and the estimates, i.e., the squared magnitudes of the entries of $FX^\top$ and $F\hat{X}^\top$, from which it can be seen that the LOP-$\ell_2/\ell_1$ model recovers the block-sparse spectra more faithfully than the existing models.

³The PAWR data is courtesy of Prof. Tomoo Ushio, Dr. Hiroshi Kikuchi, and Dr. Eiichi Yoshikawa.

C. Application of Proposed 2D Penalty to Speech Denoising
To show the effectiveness of the 2D extension (9) of the LOP-$\ell_2/\ell_1$ penalty, we conduct experiments on speech denoising exploiting the block-sparsity of the spectrogram. We generate noisy speech as $y := s + \varepsilon \in \mathbb{R}^d$, where $s$ is a 2-second clip of speech taken from [54] with a 16 kHz sampling rate (i.e., $d = 32000$), and $\varepsilon$ is white Gaussian noise. The spectrogram $X := \mathcal{F}(s)$ of the clean speech, where $\mathcal{F}$ denotes the short-time Fourier transform (STFT), is expected to be block-sparse because entries of $X$ in adjacent frequency bins and frames tend to be zero or nonzero together. The block partition depends on the characteristics of the speech to be recovered, and thus is unknown a priori. Hence, the proposed approach is suitable for exploiting the block-sparsity of the spectrogram. Specifically, we apply the 2D LOP-$\ell_2/\ell_1$ penalty (9) to speech denoising as
$$\hat{X} \in \arg\min_{X \in \mathbb{C}^{N \times M}} \ \frac{1}{2}\|y - \mathcal{F}^{-1}(X)\|_2^2 + \lambda \Psi^{(\mathrm{2d})}_\alpha(X), \tag{23}$$
where $N$ and $M$ are respectively the numbers of frequency bins and frames, and $\mathcal{F}^{-1}$ denotes the inverse STFT. We implement the STFT as a semi-orthogonal transformation by using the Hann window of 32 ms (i.e., 512 samples) with 75% overlap and appropriate zero padding; this setting implies $N = 512$ and $M = 253$. Since $D_{\mathrm{2d}}$ and $\mathcal{F}^{-1}$ are linear operators, we can solve (23) by Algorithm 1 through a reformulation with the vectorization of $X$, similarly to the reformulation of (21) into (22), where we choose $\mu_2 \le 1/3$ according to the operator norm of $D_{\mathrm{2d}}$. Note that $\mathcal{F}^{-1}$ can be computed as $\mathcal{F}^*$, the adjoint of $\mathcal{F}$, thanks to the semi-orthogonality of $\mathcal{F}$. The existing penalties are used in manners similar to (23), where the regularization parameters are tuned independently to obtain the best NMSE. Experiments are conducted for male and female speech with input SNRs of 5 dB and 10 dB. Table II reports the NMSE averaged over 20 independent trials, where $\ell_2/\ell_1$ (a) and (b) respectively denote the mixed $\ell_2/\ell_1$ models using block sizes $2 \times 4$ and $2 \times 2$, which yield the best results among the tested block sizes. From Table II, we see that the 2D LOP-$\ell_2/\ell_1$ model achieves better estimation accuracy than the existing models under every experimental condition. In addition, the performance of the 2D LOP-$\ell_2/\ell_1$ model is fairly robust against the choice of $\alpha$. Fig. 6 illustrates the magnitude spectrograms of the clean speech and the estimates, i.e., the magnitudes of the entries of $X$ and $\hat{X}$, for the denoising of male speech with 10 dB input SNR. Rectangular artifacts are observed in Fig. 6(b), which shows the magnitude spectrogram obtained by the mixed $\ell_2/\ell_1$ model with block size $2 \times 4$. These artifacts are considered to be caused by the rectangular blocks used in the mixed $\ell_2/\ell_1$ norm. Note that, since the blocks appear explicitly in the mixed $\ell_2/\ell_1$ norm, it is difficult to employ blocks of various shapes due to the computational complexity. On the other hand, the 2D LOP-$\ell_2/\ell_1$ penalty can handle blocks of various shapes because the blocks are implicitly controlled via the 2D difference operator $D_{\mathrm{2d}}$ (see Remark 2). Indeed, from Fig. 6(c), we see that the 2D LOP-$\ell_2/\ell_1$ model considerably reduces such artifacts and more accurately recovers the detailed shape of the spectrogram.
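The STFT setting above can be reproduced with standard tooling, as in the following sketch (our illustration; scipy's boundary handling and one-sided output mean the returned shape differs from the paper's two-sided count of N = 512 bins and M = 253 frames):

```python
import numpy as np
from scipy.signal import stft

fs, d = 16000, 32000
s = np.random.default_rng(0).standard_normal(d)   # stand-in for the speech clip
# 32 ms Hann window (512 samples) with 75% overlap (hop 128)
f, t, Z = stft(s, fs=fs, window="hann", nperseg=512, noverlap=384)
print(Z.shape)   # (one-sided frequency bins, frames)
```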
V. CONCLUSION

We presented a convex recovery method for block-sparse signals whose block partitions are not available a priori. We first introduced a nonconvex penalty function $\psi_K$ in (2), where the block partition is optimized for the signal of interest by minimizing the mixed $\ell_2/\ell_1$ norm. Then, as a tight convex relaxation of $\psi_K$, we derived the proposed latent optimally partitioned (LOP)-$\ell_2/\ell_1$ penalty $\Psi_\alpha$ in (6). For the block-sparse recovery model (10) using the LOP-$\ell_2/\ell_1$ penalty, by applying proximal splitting techniques, we developed the iterative solver in Algorithm 1, which is guaranteed to converge to an optimal solution. Numerical experiments on both synthetic and real-world data illustrate the effectiveness of the proposed method.

APPENDIX A
LATENT GROUP LASSO PENALTY

A commonly used modification of the mixed $\ell_2/\ell_1$ norm for unknown block partitions is to use pre-defined overlapping blocks $\tilde{B}_1, \ldots, \tilde{B}_{\tilde{K}}$, where the non-overlapping condition $\tilde{B}_k \cap \tilde{B}_{k'} = \emptyset$ ($k \ne k'$) is omitted, e.g., as in [26]–[31]. However, this modification cannot remove the blocks containing both zero and nonzero components. To alleviate this drawback, the latent group lasso (LGL) method [32], [33] selects relevant blocks from the pre-defined overlapping blocks $\tilde{B}_1, \ldots, \tilde{B}_{\tilde{K}}$. Specifically, the LGL penalty is defined by⁴
$$\mathrm{LGL}(x) := \min_{(w_1, \ldots, w_{\tilde{K}}) \in (\mathbb{C}^N)^{\tilde{K}}} \ \sum_{k=1}^{\tilde{K}} \sqrt{|\tilde{B}_k|}\, \|w_k\|_2 \quad \text{subject to} \quad \sum_{k=1}^{\tilde{K}} w_k = x \ \text{ and } \ \mathrm{supp}(w_k) \subset \tilde{B}_k. \tag{24}$$
The LGL penalty can also be written as a minimization over a stacked latent vector $w \in \mathbb{C}^{\tilde{N}}$ with $\tilde{N} := \sum_{k=1}^{\tilde{K}} |\tilde{B}_k|$, with $I_p \in \mathbb{R}^{p \times p}$ and $O_{p,q} \in \mathbb{R}^{p \times q}$ respectively denoting the identity matrix of order $p$ and the zero matrix of size $p \times q$. Thus, when the LGL penalty is combined with a convex data-fidelity function, the resulting regularization model can be solved as a convex optimization problem in $w \in \mathbb{C}^{\tilde{N}}$. However, it is intractable to use all possible blocks as $\tilde{B}_1, \ldots, \tilde{B}_{\tilde{K}}$ because the problem size becomes too large, i.e., $\tilde{N} = N(N+1)(N+2)/6$ in this case. Due to this issue, $\tilde{B}_1, \ldots, \tilde{B}_{\tilde{K}}$ are typically restricted to blocks of a fixed size. On the other hand, the proposed approach is free from this issue because the block partition is implicitly optimized in the proposed penalty (6) with the latent vector $\sigma$ of size $N$.

APPENDIX B
PRIMAL-DUAL ALGORITHM FOR NONSMOOTH CONVEX OPTIMIZATION PROBLEMS
The Chambolle–Pock primal-dual algorithm [38], [39] and the Loris–Verhoeven algorithm [40], [41] are widely used for solving nonsmooth convex optimization problems. As explained in [46], these algorithms are essentially the same for the following problem:
$$\underset{v \in \mathcal{V}}{\text{minimize}} \ G(v) \quad \text{subject to} \quad Hv = 0, \tag{25}$$
and the iteration can be written as
$$v^{(i+1)} = \mathrm{prox}_{\gamma G}\big(v^{(i)} - H^*(Hv^{(i)} + r^{(i)})\big), \qquad r^{(i+1)} = r^{(i)} + Hv^{(i+1)}, \tag{26}$$
where we suppose that $G \in \Gamma_0(\mathcal{V})$, $H : \mathcal{V} \to \mathcal{W}$ is a linear operator, $H^* : \mathcal{W} \to \mathcal{V}$ is the adjoint of $H$, $\mathcal{V}$ and $\mathcal{W}$ are finite-dimensional Hilbert spaces, and $\gamma > 0$. This algorithm is also known as the linearized augmented Lagrangian algorithm [42], [43]. The convergence of (26) to an optimal solution of (25) is shown in the following lemma.

Lemma 2. Suppose that an optimal solution of (25) exists, the so-called qualification condition⁵ $0 \in \mathrm{ri}(H(\mathrm{dom}(G)))$ holds, $\|H\|_{\mathrm{op}} \le 1$, and $\gamma > 0$. Let $v^{(0)} \in \mathcal{V}$ and $r^{(0)} \in \mathcal{W}$. Then, $(v^{(i)})_{i=1}^{\infty}$ generated by (26) converges to an optimal solution of (25).
Note that, when $\|H\|_{\mathrm{op}} > 1$, by replacing $H$ in (25) with $H/C$ where $C \ge \|H\|_{\mathrm{op}}$, the condition $\|H\|_{\mathrm{op}} \le 1$ can be satisfied without changing the solution set of (25). Thus, the iteration (26) can compute an optimal solution of (25) for general $H$.
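A minimal sketch of the iteration (26) as reconstructed above follows; since the original display of (26) was lost in extraction, the exact form in the paper may differ slightly (e.g., in the ordering of the updates), so treat this as illustrative only:

```python
import numpy as np

def primal_dual(prox_gG, H, Hs, v0, n_iter=2000):
    """Iteration (26) for: minimize G(v) subject to H v = 0, ||H||_op <= 1.
    prox_gG : v -> prox_{gamma G}(v); H, Hs : the linear operator and its adjoint."""
    v = v0
    r = np.zeros_like(H(v0))
    for _ in range(n_iter):
        v = prox_gG(v - Hs(H(v) + r))   # linearized primal step
        r = r + H(v)                    # dual ascent on the constraint H v = 0
    return v
```

This form corresponds to the Chambolle–Pock iteration with primal step $\gamma$ and dual step $1/\gamma$, which is why $\|H\|_{\mathrm{op}} \le 1$ suffices for any $\gamma > 0$ in Lemma 2.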
APPENDIX C
PROOF OF LEMMA 1

First, we prove the properties of $\varphi$ defined by (4). Notice that $\varphi$ can be written as
$$\varphi(x, \tau) = \sup_{w \in \mathbb{C}} \left( \mathrm{Re}(\overline{w} x) - \frac{|w|^2 - 1}{2}\, \tau \right),$$
i.e., as a pointwise supremum of affine functions of $(x, \tau)$, which implies that $\varphi$ is lower semicontinuous and convex; coercivity follows since $\varphi(x, \tau) \ge \tau/2$ and $\varphi(x, \tau) \ge |x|$ on its domain. Next, we prove the representation (5), distinguishing two cases.
a) (Case of $x_B \ne 0$): For $\tau > 0$, we have
$$\sum_{n \in B} \varphi(x_n, \tau) = \frac{\|x_B\|_2^2}{2\tau} + \frac{|B|}{2}\, \tau.$$
Since $x_B \ne 0$ in this case, the right-hand side is a strictly convex function of $\tau > 0$, and its minimum is attained at $\tau = \|x_B\|_2 / \sqrt{|B|}$. Thus, we have
$$\min_{\tau > 0} \sum_{n \in B} \varphi(x_n, \tau) = \sqrt{|B|}\, \|x_B\|_2.$$
For $\tau = 0$, since there exists $n \in B$ such that $x_n \ne 0$ in this case, we have $\sum_{n \in B} \varphi(x_n, 0) = \infty$. For $\tau < 0$, the definition of $\varphi$ in (4) readily implies $\sum_{n \in B} \varphi(x_n, \tau) = \infty$. Summarizing, we obtain the equation (5). b) (Case of $x_B = 0$): For $\tau > 0$, since $x_n = 0$ for every $n \in B$ in this case, we have $\sum_{n \in B} \varphi(x_n, \tau) = \frac{|B|}{2}\tau > 0$. For $\tau = 0$, we have $\sum_{n \in B} \varphi(x_n, 0) = 0$. For $\tau < 0$, we have $\sum_{n \in B} \varphi(x_n, \tau) = \infty$. Thus, the minimum is attained at $\tau = 0$, and $\min_{\tau \in \mathbb{R}} \sum_{n \in B} \varphi(x_n, \tau) = 0$, which implies (5) since $x_B = 0$ in this case.

APPENDIX E
PROOF OF THEOREM 2
The equation (7) is proven as
$$\Psi_0(x) = \min_{\sigma \in \mathbb{R}^N, \ \|D\sigma\|_1 \le 0} \ \sum_{n=1}^{N} \varphi(x_n, \sigma_n) = \min_{\tau \in \mathbb{R}} \sum_{n=1}^{N} \varphi(x_n, \tau) = \sqrt{N}\, \|x\|_2,$$
where the second equality holds because $\|D\sigma\|_1 \le 0$ forces $\sigma_1 = \cdots = \sigma_N$, and the last equality holds by Lemma 1 with $B = \{1, \ldots, N\}$.
Hiroki Kuroda received the B.E. degree in computer science, and the M.E. and the Ph.D. degrees in information and communications engineering from the Tokyo Institute of Technology, Tokyo, Japan, in 2013, 2015 and 2019, respectively. From April 2019 to March 2020, he was a postdoctoral researcher with the National Institute of Advanced Industrial Science and Technology, Tsukuba, Japan. Since April 2020, he has been an Assistant Professor with the College of Information Science and Engineering, Ritsumeikan University, Shiga, Japan. His current research interests are in signal processing and its applications, sparse modeling, and optimization theory. He received the Seiichi Tejima Doctoral Dissertation Award from the Tokyo Institute of Technology in 2020.
Daichi Kitahara received the B.E. degree in computer science in 2012, and the M.E. and the Ph.D. degrees in communications and computer engineering in 2014 and 2017 from the Tokyo Institute of Technology, Tokyo, Japan, respectively. From April 2014 to March 2017, he was a recipient of the Research Fellowship of the Japan Society for the Promotion of Science. Since April 2017, he has been an Assistant Professor with the College of Information Science and Engineering, Ritsumeikan University, Shiga, Japan. His current research interests include signal processing and its applications, inverse problem, optimization theory, and multivariate spline theory.