This paper introduces a new spatial-domain self-interference cancellation (SIC) precoding method, named constrained minimum mean square error (C-MMSE), for an asymmetric massive multiple-input multiple-output (mMIMO) full-duplex (FD) system. The main idea is to translate the commonly used singular value decomposition (SVD)-based null-space projection approach, which is infeasible in our considered system model, into an optimization problem under the MMSE criterion, where additional constraints are imposed to perform SIC. A theoretical derivation of the C-MMSE precoder is presented, followed by a performance comparison with conventional MMSE precoding, where no SIC constraints are added. We theoretically show that the C-MMSE scheme outperforms the conventional one in terms of SIC and allows the FD system to operate in an almost interference-free environment. Additionally, we assess the performance of the proposed method under imperfect channel state information (CSI) to further evaluate the robustness of our spatial precoder in more realistic conditions. We show that the C-MMSE precoder outperforms MMSE in terms of interference suppression ratio (ISR), even under imperfect CSI. Moreover, the C-MMSE achieves the same spectral efficiency (SE) as a hypothetical perfect-SIC scheme over a wide SNR range, whereas the SE of the MMSE precoder saturates at high SNR.
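For reference, the unconstrained baseline can be sketched as a standard textbook MMSE (regularized zero-forcing style) transmit precoder; this is not the paper's C-MMSE, and the dimensions, regularization, and power normalization below are illustrative assumptions.

```python
import numpy as np

def mmse_precoder(H, sigma2):
    """Generic (unconstrained) MMSE precoder W = H^H (H H^H + sigma2 I)^-1,
    normalized to unit transmit power. Illustrative baseline only; the
    paper's C-MMSE adds self-interference cancellation constraints."""
    K = H.shape[0]                                   # number of users/streams
    W = H.conj().T @ np.linalg.inv(H @ H.conj().T + sigma2 * np.eye(K))
    return W / np.linalg.norm(W)                     # unit-power normalization
```

At low noise the effective channel H @ W approaches a scaled identity, i.e. inter-stream interference vanishes, which is the behavior the ISR metric quantifies.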
We introduce SIGNOVA, a new semi-supervised framework for detecting anomalies in streamed data. While our initial examples focus on detecting radio-frequency interference (RFI) in digitized signals within the field of radio astronomy, SIGNOVA's applicability extends to any type of streamed data. The framework comprises three primary components. Firstly, we use the signature transform to extract a canonical collection of summary statistics from observational sequences. This allows us to represent variable-length visibility samples as finite-dimensional feature vectors. Secondly, each feature vector is assigned a novelty score, calculated as the Mahalanobis distance to its nearest neighbor in an RFI-free training set. By thresholding these scores we identify observation ranges that deviate from the expected behavior of RFI-free visibility samples without relying on stringent distributional assumptions. Thirdly, we integrate this anomaly detector with Pysegments, a segmentation algorithm, to localize consecutive observations contaminated with RFI, if any. This approach provides a compelling alternative to classical windowing techniques commonly used for RFI detection. Importantly, the complexity of our algorithm depends on the RFI pattern rather than on the size of the observation window. We demonstrate how SIGNOVA improves the detection of various types of RFI (e.g., broadband and narrowband) in time-frequency visibility data. We validate our framework on data from the Murchison Widefield Array (MWA) telescope, on simulated data, and on data from the Hydrogen Epoch of Reionization Array (HERA).
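The second component, scoring each feature vector by its Mahalanobis distance to the nearest neighbor in an RFI-free training set, can be sketched as follows; the ridge term and the pooled covariance estimate are illustrative assumptions, not SIGNOVA's exact recipe.

```python
import numpy as np

def novelty_scores(train, queries):
    """Score each query vector by the Mahalanobis distance to its nearest
    neighbor in a clean (assumed RFI-free) training set."""
    # pooled covariance from the training set; small ridge keeps it invertible
    cov = np.cov(train, rowvar=False) + 1e-6 * np.eye(train.shape[1])
    prec = np.linalg.inv(cov)
    scores = []
    for q in queries:
        d = train - q                                      # offsets to all training points
        m = np.sqrt(np.einsum("ij,jk,ik->i", d, prec, d))  # Mahalanobis distances
        scores.append(m.min())                             # nearest-neighbor distance
    return np.array(scores)
```

Thresholding these scores then flags feature vectors that sit far from any clean training sample, without assuming a parametric distribution for the clean data.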
Functional near-infrared spectroscopy (fNIRS) is a valuable non-invasive tool for monitoring brain activity. The classification of fNIRS data in relation to conscious activity holds significance for advancing our understanding of the brain and facilitating the development of brain-computer interfaces (BCI). Many researchers have turned to deep learning to tackle the classification challenges inherent in fNIRS data, owing to its strong generalization and robustness. In fNIRS applications, reliability is critical, and calibration is one mathematical formulation of the reliability of a model's confidence. However, many researchers overlook this important issue. To address this gap, we propose integrating calibration into the fNIRS field and assess the reliability of existing models. Surprisingly, our results indicate poor calibration performance in many proposed models. To advance the development of calibration in the fNIRS field, we summarize three practical tips. Through this letter, we hope to emphasize the critical role of calibration in fNIRS research and argue for enhancing the reliability of deep learning-based predictions in fNIRS classification tasks. All data from our experimental process are openly available on GitHub.
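Calibration as "reliability of confidence" is commonly quantified by the Expected Calibration Error (ECE); a minimal sketch follows, with equal-width binning as an assumption (the letter's exact metric may differ).

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: bin predictions by confidence, then average
    |accuracy - mean confidence| weighted by bin population."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            acc = correct[mask].mean()    # empirical accuracy in this bin
            conf = confidences[mask].mean()
            ece += mask.mean() * abs(acc - conf)
    return ece
```

A perfectly calibrated model (e.g. 75% confidence, 75% accuracy) yields ECE 0; overconfident models yield large ECE even when accuracy is high.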
In this paper, we explore an indoor downlink cooperative hybrid visible light communication (VLC)/radio frequency (RF) scenario using a relay node to reduce the system outage probability. In particular, information can be transmitted to the end user either directly through the VLC link or via the relay node. To re-transmit the decoded information to the end user through the RF link, the relay utilizes energy harvested from the source light-emitting diode (LED) at the ceiling. We derive the analytical expression for the outage probability of the relay-aided hybrid VLC/RF system, considering the randomness of location and receiver orientation for both the relay and the end user. Furthermore, we investigate the effects of the direct current (DC) bias, the data-rate threshold, and different distributions for the location and orientation of the end user and relay on the outage probability of the system.
Speech overlap, which occurs when multiple people speak simultaneously, poses a significant challenge in audio and speech processing. The presence of overlapping speech segments significantly degrades the performance of technologies such as Automatic Speech Recognition (ASR), speaker identification, and diarization systems. This degradation becomes more pronounced in diverse acoustic environments with background noise and reverberation. To address this issue, we introduce BiConNet, a novel dual-branch architecture that combines the strengths of Convolutional Neural Networks (CNN) and Bidirectional Long Short-Term Memory (BiLSTM) for robust detection of overlapping speech in diverse acoustic conditions. The CNN branch performs frame-level spectral feature extraction, while the BiLSTM branch captures temporal dependencies in both forward and backward directions. Features from both branches are concatenated, resulting in a robust feature representation. We also examined the impact of Mel Frequency Cepstral Coefficients (MFCC), Gammatone Frequency Cepstral Coefficients (GTCC), and Power Normalized Cepstral Coefficients (PNCC) as spectral features on BiConNet's performance. To validate its effectiveness in various acoustic environments, we constructed a dataset derived from the GRID corpus, including conversations with different gender combinations and recording conditions: clean, noisy, reverberant, and combined noise and reverberation. Experimental results show that BiConNet outperforms various state-of-the-art methods in detecting overlapping speech segments under these conditions. Furthermore, our analysis of computational efficiency reveals that BiConNet provides competitive training and inference times, demonstrating its practicality for real-world applications.
This paper describes a method for automatically transforming the structure and characteristics of an image processing dataflow graph to improve performance and/or lower memory utilization compared to baseline tools. Embedded image processing applications are often executed on Digital Signal Processors or their modern equivalent, Vision Processing Units. The software usually performs a series of pixel-level operations for basic color conversion, channel extraction and combining, arithmetic, and filtering. These steps can often be efficiently described as a graph. For this reason, standard libraries such as OpenVX are used, which provide a graph-based programming model where the nodes are chosen from a repertoire of common pixel-level operations and the edges represent the flow of images as they progress through the processing stages. Generally speaking, each node is processed sequentially in the order implied by the data dependencies defined by the graph structure, with all intermediate values stored in external memory. In the proposed framework, we developed performance models for both the direct memory access subsystem and the L1 data cache to allow selected intermediate values to be stored in on-chip scratchpad memory and to choose the most appropriate tile size. In this way, we decompose the graph to fuse specific sets of nodes, associating their internal edges with on-chip buffers; the tile size is then optimized for each fused set of nodes. In this paper, we describe our performance models and our approach to graph decomposition and tile size selection. The proposed performance models are accurate to within 2% on average, and the overall graph optimization achieves an average speedup of 1.3x and reduces average DRAM utilization from 100% to as low as 15%.
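The fusion idea, grouping producer-consumer chains so their internal edges can live in on-chip buffers, can be illustrated with a greedy linear-chain fusion over a DAG; the node names and the chain-only legality rule are simplifying assumptions, and the paper's framework additionally weighs DMA/cache cost models and tile sizes when deciding what to fuse.

```python
def fuse_chains(nodes, edges):
    """Greedily fuse maximal linear chains of a dataflow DAG.
    Edges internal to a returned group could use on-chip buffers;
    edges between groups would remain in external memory."""
    succ = {n: [] for n in nodes}
    pred = {n: [] for n in nodes}
    for a, b in edges:
        succ[a].append(b)
        pred[b].append(a)

    def is_head(n):
        # a node starts a chain unless it has exactly one predecessor
        # whose only successor is this node
        return len(pred[n]) != 1 or len(succ[pred[n][0]]) != 1

    groups = []
    for n in nodes:
        if not is_head(n):
            continue
        chain = [n]
        while len(succ[chain[-1]]) == 1:
            nxt = succ[chain[-1]][0]
            if len(pred[nxt]) != 1:   # joins break the chain
                break
            chain.append(nxt)
        groups.append(chain)
    return groups
```

A straight pipeline fuses into one group, while a branch point splits groups, mirroring how fan-out forces intermediates back to shared memory.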
Cryptography has become an essential tool in information security, preserving data confidentiality, integrity, and availability. However, despite rigorous analysis, cryptographic algorithms may still be susceptible to attack when used on real-world devices. Side-channel attacks (SCAs) are physical attacks that target cryptographic equipment through quantifiable phenomena such as power consumption, operation timing, and electromagnetic (EM) radiation. These attacks are considered a significant threat to cryptography since they compromise the security of an algorithm by recovering a device's internal cryptographic key through observation of its physical implementation. The literature on SCAs has focused on real-world devices, yet with the growing popularity of sophisticated devices like smartphones, fresh approaches to SCAs are necessary. One such approach is electromagnetic side-channel analysis (EM-SCA), which gathers information by capturing EM radiation. EM-SCA has been demonstrated to recover sensitive data such as encryption keys and has the potential to identify malicious software, retrieve data, and identify program activity. This study aims to evaluate how well EM-SCA compromises encryption under various application scenarios, as well as to examine the role of EM-SCA in digital forensics and law enforcement. In this regard, addressing the susceptibility of encryption algorithms to EM-SCA approaches can provide digital forensic investigators with the tools they need to overcome the challenges posed by strong encryption, allowing them to continue playing a crucial role in law enforcement and the justice system. Furthermore, this paper seeks to define the current state of EM-SCA in terms of attacking encryption: the encryption algorithms and encrypted devices that are most vulnerable and most resistant to EM-SCA, and the most promising EM-SCA approaches for attacking encryption.
This study will provide a comprehensive analysis of EM-SCA in the context of law enforcement and digital forensics and point towards potential directions for further research.
The application of millimeter-wave (mmWave) radar sensors for people monitoring has raised considerable interest in the context of Active Assisted Living (AAL), especially since the processing of radar signals can provide valuable information about the observed subjects. Correct recognition of ongoing behavior, however, cannot be decoupled from detecting where the subject is positioned. Detection approaches based on Constant False Alarm Rate (CFAR) algorithms sometimes fail to correctly identify the presence of targets within the observed scenario, especially in complex environments such as indoors. This paper proposes the use of a mmWave Multiple Input Multiple Output (MIMO) radar in combination with a You Only Look Once (YOLO) neural-network-based algorithm for the detection of moving people in indoor environments by processing the entire radar data cube at once. The approach is validated through experimental tests involving subjects walking in linear or random patterns, different radar configurations, and different indoor environments. By jointly exploiting the angle, Doppler, and range information about the target, the proposed approach proves very effective in the examined scenarios.
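For context, the cell-averaging CFAR baseline that the paper contrasts with can be sketched in one dimension; the guard/training window sizes and false-alarm probability below are illustrative, not taken from the paper's experiments.

```python
import numpy as np

def ca_cfar(power, guard=2, train=8, pfa=1e-3):
    """Cell-averaging CFAR over a 1-D range profile.
    The threshold factor follows the standard CA-CFAR formula
    alpha = N * (pfa**(-1/N) - 1), with N training cells."""
    n = len(power)
    N = 2 * train
    alpha = N * (pfa ** (-1.0 / N) - 1.0)
    detections = np.zeros(n, dtype=bool)
    for i in range(train + guard, n - train - guard):
        lead = power[i - guard - train : i - guard]       # cells before the CUT
        lag = power[i + guard + 1 : i + guard + 1 + train]  # cells after the CUT
        noise = (lead.sum() + lag.sum()) / N               # local noise estimate
        detections[i] = power[i] > alpha * noise
    return detections
```

In cluttered indoor scenes the local noise estimate is easily biased by nearby reflectors, which is the failure mode motivating the learned detector.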
Motor unit (MU) decomposition generally requires a time-consuming and labor-intensive manual inspection/editing process from human operators to ensure high accuracy. In this study, we propose and validate a rule-based auto-editing method that could potentially substitute for the manual process. Methods: The proposed auto-editing framework (auto-editor) consists of four main rules for adding and removing spikes based on the height of the innervation pulse train (IPT) and the regularity of the firing rate of the identified motor unit. The rules were optimized and validated on an open-source database including raw MU spike trains estimated with the convolution kernel compensation method and the corresponding spike trains manually edited by eight human operators. Results: Across 110 motor units, the average rate of agreement between the auto-editor and human operators reached 99.2%, with the auto-editor making more than 10 corrections per motor unit on average to the raw spike trains. More importantly, the characteristics of motor unit behavior, including the MU firing rate and recruitment threshold, were consistent between the human operators and the proposed auto-editor. Conclusion: With a simple but effective rule-based auto-editing framework, MU-refinement performance comparable to that of human operators was achieved. Significance: The proposed auto-editing framework has the potential to standardize MU editing practice, lower the requirements for expert knowledge and specialized training in MU decomposition, and provide an expandable framework open to contributions from the community.
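The firing-regularity side of such rules can be illustrated with a simple check; this is not the paper's four-rule framework (which also uses IPT heights), and the tolerance parameter is a hypothetical choice for illustration only.

```python
import numpy as np

def flag_suspect_spikes(spike_times, tol=0.5):
    """Illustrative regularity rule: flag a spike whose preceding
    inter-spike interval (ISI) is much shorter than the median ISI,
    suggesting a possible false-positive spike."""
    t = np.asarray(spike_times, dtype=float)
    isi = np.diff(t)                      # inter-spike intervals
    med = np.median(isi)                  # typical firing interval
    suspects = []
    for i in range(1, len(t)):
        if isi[i - 1] < (1.0 - tol) * med:  # spike arrives far too early
            suspects.append(i)
    return suspects
```

A real auto-editor would combine such regularity evidence with the IPT amplitude at the candidate spike before deciding to remove (or add) it.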
This paper is a continuation of my theory for solving the pointwise fluid-flow approximation model for time-varying queues. The long-standing simulative approach is replaced by an exact analytical solution based on a constant ratio β (Ismail's ratio). The stability dynamics of the time-varying M/E_k/1 queueing system are then examined numerically in relation to time, β, and the queueing parameters.
The paper provides a comprehensive overview of Neural Architecture Search (NAS), emphasizing its evolution from manual design to automated, computationally driven approaches. It covers the inception and growth of NAS, highlighting its application across various domains, including medical imaging and natural language processing (NLP). The paper details the shift from expert-driven design to algorithm-driven processes, exploring initial methodologies such as reinforcement learning and evolutionary algorithms. It also discusses the challenge of computational demands and the emergence of efficient NAS methodologies, such as Differentiable Architecture Search and hardware-aware NAS. The paper further elaborates on NAS's applications in computer vision, NLP, and beyond, demonstrating its versatility and potential for optimizing neural network architectures across different tasks. Future directions and challenges, including computational efficiency and integration with emerging AI domains, are addressed, showcasing NAS's dynamic nature and its continued evolution towards more sophisticated and efficient architecture search methods.
This paper details an experiment utilizing ESP8266 modules as servers to wirelessly control diverse electrical appliances in home automation. The experiment showcased the modules' ability to respond to commands via a web interface on mobile, desktop, and tablet platforms. While most of the experiment ran smoothly, occasional freezing and connectivity disruptions were observed. This abstract summarizes the experiment's successes, discusses the challenges encountered, and outlines a forward-looking perspective, including the integration of a custom PCB for enhanced system stability.
The futuristic sixth-generation (6G) networks will empower ultra-reliable and low-latency communications (URLLC), enabling a wide array of mission-critical applications such as mobile edge computing (MEC) systems, which are largely unsupported by fixed communication infrastructure. To remedy this issue, unmanned aerial vehicles (UAVs) have recently come to the limelight as facilitators of MEC for Internet of Things (IoT) devices, since they provide desirable line-of-sight (LoS) communications compared to fixed terrestrial networks, thanks to their added flexibility and three-dimensional (3D) positioning. In this paper, we consider UAV-enabled relaying for MEC systems for uplink transmissions in 6G networks, and we aim to minimize mission completion time subject to resource-allocation constraints, including UAV transmit power, UAV CPU frequency, decoding error rate, blocklength, communication bandwidth, and task partitioning, as well as 3D UAV positioning. Moreover, to solve the non-convex optimization problem, we propose three different algorithms: successive convex approximation (SCA), an altered genetic algorithm (AGA), and smart exhaustive search (SES). Based on time complexity, execution time, and convergence analysis, we select AGA to solve the given optimization problem. Simulation results demonstrate that the proposed algorithm successfully minimizes the mission completion time, performs power allocation at the UAV side to mitigate information leakage and eavesdropping, and determines 3D UAV positioning, yielding better results than the fixed benchmark sub-methods. Lastly, subject to 3D UAV positioning, AGA can also effectively reduce the decoding error rate to support URLLC services.
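Of the three solvers, the genetic-algorithm machinery can be illustrated generically; this toy real-coded GA (tournament selection, blend crossover, Gaussian mutation, elitism) shows the general mechanism only, and none of the paper's AGA alterations, chromosome encoding, or URLLC constraints are reproduced here.

```python
import numpy as np

def genetic_minimize(f, bounds, pop=40, gens=60, seed=0):
    """Minimize f over a box via a toy real-coded genetic algorithm."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds).T
    x = rng.uniform(lo, hi, size=(pop, len(lo)))          # initial population
    for _ in range(gens):
        fit = np.apply_along_axis(f, 1, x)
        # tournament selection: winner of each random pair survives
        i, j = rng.integers(pop, size=(2, pop))
        parents = np.where((fit[i] < fit[j])[:, None], x[i], x[j])
        # blend crossover with a shuffled partner, plus Gaussian mutation
        w = rng.uniform(size=(pop, len(lo)))
        children = w * parents + (1 - w) * parents[rng.permutation(pop)]
        children += rng.normal(scale=0.05 * (hi - lo), size=children.shape)
        children = np.clip(children, lo, hi)
        children[0] = x[fit.argmin()]                     # elitism: keep the best
        x = children
    fit = np.apply_along_axis(f, 1, x)
    return x[fit.argmin()], float(fit.min())
```

Elitism makes the best objective value monotonically non-increasing across generations, which is one reason GA-style solvers converge reliably on non-convex objectives like mission completion time.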
Point cloud classification and segmentation are crucial tasks in point cloud processing and have a wide range of applications, such as autonomous driving and robot grasping. Pioneering methods, including PointNet, VoxNet, and DGCNN, have made substantial advancements. However, most of these methods do not consider the long-range geometric relationships among points seen from different perspectives within the point cloud, which limits feature extraction and prevents further improvement in classification and segmentation accuracy. To address this issue, we propose an adaptive multi-view graph convolutional network (AM-GCN), which comprehensively synthesizes both the global geometric features of the point cloud and the local features within the projection planes of multiple views through an adaptive graph-construction method. First, an adaptive rotation module in AM-GCN is proposed to predict a more favorable viewing angle for projection. Then, a multi-level feature extraction network can be flexibly constructed from spatial-based or spectral-based graph convolution layers. Finally, AM-GCN is evaluated on ModelNet40 for classification, ShapeNetPart for part segmentation, and ScanNetv2 and S3DIS for scene segmentation, demonstrating its robustness and competitive performance compared with existing methods; notably, it achieves state-of-the-art performance in many categories.
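The graph-construction step shared by such networks can be sketched as a plain k-nearest-neighbor adjacency over the point cloud; AM-GCN's adaptive, rotation-aware, multi-view construction is richer, so treat this as the generic starting point it builds on.

```python
import numpy as np

def knn_adjacency(points, k=3):
    """Directed k-NN adjacency matrix for an (n, d) point cloud:
    A[i, j] = 1 iff point j is one of the k nearest neighbors of i."""
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)            # exclude self-loops
    idx = np.argsort(d2, axis=1)[:, :k]     # k closest neighbors per point
    A = np.zeros((len(points), len(points)))
    rows = np.repeat(np.arange(len(points)), k)
    A[rows, idx.ravel()] = 1.0
    return A
```

Graph convolution layers (spatial or spectral) then aggregate features along the edges of this adjacency, so how the graph is built directly shapes which geometric relationships the network can exploit.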
This paper introduces a novel nonparametric framework for data imputation, coined multilinear kernel regression and imputation via the manifold assumption (MultiL-KRIM). Motivated by manifold learning, MultiL-KRIM models data features as a point cloud located in or close to a user-unknown smooth manifold embedded in a reproducing kernel Hilbert space. Unlike typical manifold-learning routes, which seek low-dimensional patterns via regularizers based on graph-Laplacian matrices, MultiL-KRIM builds instead on the intuitive concept of tangent spaces to manifolds and incorporates collaboration among point-cloud neighbors (regressors) directly into the data-modeling term of the loss function. Multiple kernel functions are allowed to offer robustness and rich approximation properties, while multiple matrix factors offer low-rank modeling, integrate dimensionality reduction, and streamline computations with no need for training data. Two important application domains showcase the functionality of MultiL-KRIM: time-varying graph-signal (TVGS) recovery, and reconstruction of highly accelerated dynamic magnetic resonance imaging (dMRI) data. Extensive numerical tests on real and synthetic data demonstrate MultiL-KRIM's remarkable speedups over its predecessors and its outperformance of prevalent "shallow" data-imputation techniques, with a more intuitive and explainable pipeline than deep-image-prior methods.
Objective: Digital subtraction angiography (DSA) is critically important for cerebrovascular disease diagnosis and treatment. However, artifacts and noise are inevitable and reduce image quality, which can make clinical diagnosis difficult. In this paper, we introduce a novel deep learning architecture that exploits an information-decoupling training strategy to generate high-quality DSA images. Methods: We propose the generative decoupling network, a feature-decoupling convolutional network that maximizes the difference between different structures through a decoupling training strategy. In this network, an axial residual block and a learnable sampling method are proposed to strengthen feature extraction. Results: The results showed that our proposed method significantly outperforms existing methods on the DSA generation task. We evaluated the method using SSIM, PSNR, VSI, FID, and FSIM, obtaining 93.57%, 24.18 dB, 98.04%, 351.59, and 89.95%, respectively. Conclusion: Our method can produce high-quality DSA images with little or even no artifact and noise. Significance: The proposed method can effectively reduce artifacts and noise and generate high-quality DSA images with complete and clear vascular structures.
In this research, we present SLYKLatent, a novel approach to enhancing gaze estimation by addressing appearance-instability challenges in datasets due to aleatoric uncertainties, covariant shifts, and test-domain generalization. SLYKLatent uses self-supervised learning for initial training on facial expression datasets, followed by refinement with a patch-based tri-branch network and an inverse explained-variance-weighted training loss function. Our evaluation on benchmark datasets achieves an 8.7% improvement on Gaze360, rivals top MPIIFaceGaze results, and leads on a subset of ETH-XGaze by 13%, surpassing existing methods by significant margins. Additionally, adaptability tests on RAF-DB and AffectNet show 86.4% and 60.9% accuracy, respectively. Ablation studies confirm the effectiveness of SLYKLatent's novel components. This approach has strong potential for human-robot interaction.
The year 1948 witnessed the historic moment of the birth of classic information theory (CIT). Guided by CIT, modern communication techniques have approached the theoretic limitations, such as the entropy function H(U), the channel capacity C = max_{p(x)} I(X;Y), and the rate-distortion function R(D) = min_{p(x̂|x): E d(x,x̂) ≤ D} I(X;X̂). Semantic communication paves a new direction for future communication techniques, but a guiding theory is still missing. In this paper, we try to establish a systematic framework of semantic information theory (SIT). We investigate the behavior of semantic communication and find that synonymy is its basic feature, so we define the synonymous mapping between semantic information and syntactic information. Stemming from this core concept, we introduce the measures of semantic information, such as the semantic entropy H_s(Ũ), the up/down semantic mutual information I^s(X̃;Ỹ) and I_s(X̃;Ỹ), the semantic capacity C_s = max_{p(x)} I_s(X̃;Ỹ), and the semantic rate-distortion function R_s(D) = min_{p(x̂|x): E d_s(x̃,x̂) ≤ D} I_s(X̃;X̂). Furthermore, we prove three coding theorems of SIT by using random coding and (jointly) typical decoding/encoding: the semantic source coding theorem, the semantic channel coding theorem, and the semantic rate-distortion coding theorem. We find that the limits of SIT are extended by synonymous mapping, that is, H_s(Ũ) ≤ H(U), C_s ≥ C, and R_s(D) ≤ R(D). All these results constitute the basis of semantic information theory. In addition, we discuss the semantic information measures in the continuous case. In particular, for the band-limited Gaussian channel, we obtain a new channel capacity formula, C_s = B log[S^4(1 + P/(N_0 B))], where S is the synonymous length. In summary, the theoretic framework of SIT proposed in this paper is a natural extension of CIT and may reveal great performance potential for future communication.
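The Gaussian-channel claim can be written out in full; assuming the synonymous length satisfies S ≥ 1, the semantic capacity decomposes into the Shannon capacity plus a non-negative synonymy term, which makes the inequality C_s ≥ C immediate.

```latex
% Claimed extensions of the classical limits under a synonymous mapping:
H_s(\tilde{U}) \le H(U), \qquad C_s \ge C, \qquad R_s(D) \le R(D).

% Band-limited Gaussian channel with synonymous length S \ge 1:
C_s = B \log\!\left[ S^4 \left( 1 + \frac{P}{N_0 B} \right) \right]
    = \underbrace{B \log\!\left( 1 + \frac{P}{N_0 B} \right)}_{\text{Shannon capacity } C}
      \; + \; 4B \log S \;\ge\; C,
% since S \ge 1 implies 4B \log S \ge 0, with equality iff S = 1
% (no synonymy), recovering the classical formula.
```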