As conventional communication systems based on classical information theory closely approach the Shannon capacity, semantic communication is emerging as a key enabling technology for further improving communication performance. However, it remains unsettled how to represent semantic information and characterise the theoretical limits of semantic-oriented compression and transmission. In this paper, we consider a semantic source characterised by a set of correlated random variables whose joint probability distribution can be described by a Bayesian network. We give the information-theoretic limit on lossless compression of the semantic source and introduce a low-complexity encoding method that exploits conditional independence. We further characterise the limits on lossy compression of the semantic source, deriving upper and lower bounds on the rate-distortion function. We also investigate lossy compression of the semantic source with side information at both the encoder and decoder, and obtain the corresponding rate-distortion function. We prove that the optimal code for the semantic source is the combination of the optimal codes of each conditionally independent set given the side information.
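For context, the lossless limit above follows from the chain rule over the Bayesian network: the joint entropy decomposes into a sum of conditional entropies given each variable's parents, which is exactly the structure a low-complexity encoder can exploit. The sketch below illustrates this on a small hypothetical three-variable network; the topology and probabilities are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch (illustrative, not the paper's construction): for a semantic
# source described by a Bayesian network, the lossless-compression limit is the
# joint entropy, which factorizes as sum_i H(X_i | Pa(X_i)).
import itertools
import math

# Hypothetical 3-variable network: X1 -> X2, X1 -> X3 (X2, X3 cond. independent given X1).
p_x1 = {0: 0.6, 1: 0.4}
p_x2_given_x1 = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.3, 1: 0.7}}  # p_x2_given_x1[x1][x2]
p_x3_given_x1 = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.5, 1: 0.5}}

def h_bits(probs):
    """Entropy in bits of a probability vector."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# H(X1)
h1 = h_bits(p_x1.values())
# H(X2 | X1) = sum_x1 p(x1) H(X2 | X1 = x1), and similarly for X3.
h2 = sum(p_x1[x1] * h_bits(p_x2_given_x1[x1].values()) for x1 in p_x1)
h3 = sum(p_x1[x1] * h_bits(p_x3_given_x1[x1].values()) for x1 in p_x1)

# Joint entropy via brute-force enumeration, for comparison.
joint = []
for x1, x2, x3 in itertools.product([0, 1], repeat=3):
    joint.append(p_x1[x1] * p_x2_given_x1[x1][x2] * p_x3_given_x1[x1][x3])

print(f"sum of conditional entropies: {h1 + h2 + h3:.4f} bits")
print(f"brute-force joint entropy:    {h_bits(joint):.4f} bits")  # identical
```

The two printed values coincide, reflecting H(X1, X2, X3) = H(X1) + H(X2 | X1) + H(X3 | X1) under the assumed conditional independence.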
As a novel paradigm, semantic communication offers an effective way to break through the development bottlenecks facing classical communication systems. However, it remains unsolved how to measure the information transmission capability of a given semantic communication method and compare it with that of classical communication methods. In this paper, we first review the semantic communication system, including its system model and the two typical coding and transmission methods for its implementation. To address the open issue of measuring the information transmission capability of semantic communication methods, we propose a new universal performance measure called Information Conductivity. We give its definition and physical significance to demonstrate its effectiveness in representing the information transmission capability of semantic communication systems, and elaborate on its measurement methods, degrees of freedom, and progressive analysis. Experimental results in image transmission scenarios validate its practical applicability.
Video transmission requires considerable bandwidth, and currently deployed schemes prove inadequate for scenes in which talking heads feature prominently. Motivated by advances in talking-head generation technology, this paper introduces a semantic transmission system tailored for talking-head videos. The system extracts semantic information from the talking-head video and faithfully reconstructs the source video at the receiver; only a one-shot reference frame and compact semantic features are required for the entire transmission. Specifically, we analyze video semantics in the pixel domain frame by frame and jointly process multi-frame semantic information to seamlessly incorporate spatial and temporal information. Variational modeling is used to evaluate the varying importance of grouped semantics, thereby guiding the allocation of bandwidth resources to semantics and enhancing system efficiency. The whole end-to-end system is formulated as an optimization problem equivalent to achieving optimal rate-distortion performance. We evaluate our system on both reference-frame and video transmission; experimental results demonstrate that it improves the efficiency and robustness of communications. Compared with classical approaches, our system saves over 90% of bandwidth at comparable perceptual quality.
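The abstract does not spell out the allocation rule, so the following is only a rough sketch of how variational importance estimates could drive bandwidth allocation: each semantic group is assigned channel symbols in proportion to a rate proxy derived from an assumed Gaussian variational posterior. The group names, scales, and budget are hypothetical.

```python
# Rough sketch (assumptions, not the paper's exact rule): grouped semantic
# features get bandwidth in proportion to a variational rate estimate, here the
# differential entropy of an assumed Gaussian posterior per group.
import numpy as np

def allocate_bandwidth(group_sigmas, total_symbols, min_symbols=1):
    """Split a channel-symbol budget across semantic groups by estimated rate."""
    sigmas = np.asarray(group_sigmas, dtype=float)
    # Gaussian differential entropy (bits) as an importance proxy per group.
    rate_proxy = 0.5 * np.log2(2 * np.pi * np.e * sigmas**2)
    rate_proxy = np.clip(rate_proxy, 1e-3, None)        # keep weights positive
    weights = rate_proxy / rate_proxy.sum()
    # Rounding means the total can deviate slightly from the budget.
    return np.maximum(min_symbols, np.round(weights * total_symbols)).astype(int)

# Hypothetical groups: identity/appearance, head pose, expression, background.
sigmas = [1.8, 0.6, 0.9, 0.2]          # learned posterior scales (illustrative)
print(allocate_bandwidth(sigmas, total_symbols=256))
```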
Recently, deep learning-based semantic communication has garnered widespread attention, with numerous systems designed for transmitting diverse data sources, including text, images, and speech. While efforts have been directed toward improving system performance, many studies have concentrated on enhancing the structure of the encoder and decoder, often overlooking the resulting increase in model complexity and the additional storage and computational burdens imposed on smart devices. Furthermore, existing work tends to prioritize explicit semantics while neglecting the potential of implicit semantics. This paper aims to enhance the receiver's decoding capability easily and effectively, without modifying the encoder and decoder structures. We propose a novel semantic communication system with variational neural inference for text transmission. Specifically, we introduce a simple but effective variational neural inferer at the receiver that infers the latent semantic information within the received text; this information is then used to assist the decoding process. Simulation results show a significant enhancement in system performance and improved robustness.
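As a reference point, the sketch below shows one minimal way a variational neural inferer could be attached at the receiver in PyTorch: it maps received features to a Gaussian latent, samples via the reparameterization trick, and hands the latent to the decoder alongside the received features. The dimensions, wiring, and KL regularizer are assumptions for illustration, not the paper's architecture.

```python
# Minimal PyTorch sketch of the idea (dimensions and wiring are assumptions,
# not the paper's architecture): a variational inferer maps received features
# to a Gaussian latent, and the sampled latent conditions the decoder.
import torch
import torch.nn as nn

class VariationalInferer(nn.Module):
    def __init__(self, feat_dim=128, latent_dim=32):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU())
        self.mu = nn.Linear(128, latent_dim)
        self.logvar = nn.Linear(128, latent_dim)

    def forward(self, received_feats):
        h = self.backbone(received_feats)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        # KL term against a standard normal prior, used as a regularizer.
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        return z, kl

# The decoder then consumes [received_feats, z] instead of received_feats alone.
feats = torch.randn(8, 128)                 # a batch of received text features
z, kl = VariationalInferer()(feats)
decoder_input = torch.cat([feats, z], dim=-1)
print(decoder_input.shape, kl.item())
```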
Toward the sixth generation (6G) of mobile communication, several communication models have been proposed to meet the growing challenges of increasingly demanding tasks. The rapid development of artificial intelligence (AI) foundation models provides significant support for efficient and intelligent communication interactions. In this paper, we propose an innovative semantic communication paradigm: a task-oriented semantic communication system with foundation models. First, we segment the image using task prompts based on the segment anything model (SAM) and contrastive language-image pre-training (CLIP), and adopt Bezier curves to refine the masks and improve segmentation accuracy. Second, we apply differentiated semantic compression and transmission approaches to the segmented content. Third, we fuse the different semantic information using a conditional diffusion model to generate high-quality images that satisfy users' specific task requirements. Finally, experimental results show that the proposed system compresses semantic information effectively and improves the robustness of semantic communication.
We consider an image semantic communication system over a time-varying fading Gaussian MIMO channel with a finite number of channel states. A deep learning-aided broadcast approach is proposed to enable adaptive semantic transmission across the different channel states. We combine the classic broadcast approach with an image transformer to implement this adaptive joint source and channel coding (JSCC) scheme. Specifically, we use a neural network (NN) to jointly optimize the hierarchical image compression and the superposition code mapping within this scheme. The learned transformers and codebooks allow the receiver to recover the image with adaptive quality and a low error rate in each channel state. Simulation results show that the proposed scheme dynamically adapts the coding to the current channel state and outperforms existing intelligent schemes that use a fixed coding block.
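For background on the broadcast-approach component, the sketch below computes the layered rates achievable with superposition coding and successive decoding over a finite set of Gaussian channel states; this is the classical calculation that the learned transformers and codebooks are built around. The state gains, power split, and SNR are illustrative assumptions.

```python
# Background sketch (textbook broadcast approach, not the learned scheme itself):
# with K fading states ordered from weakest to strongest, layer k carries power
# fraction alpha_k and is decoded by every state >= k, treating higher layers as noise.
import numpy as np

def layered_rates(gains, alphas, snr):
    """Per-layer rates (bits/symbol) with superposition coding + successive decoding."""
    gains = np.asarray(gains, float)      # channel power gains, sorted ascending
    alphas = np.asarray(alphas, float)    # power split across layers, sums to 1
    rates = []
    for k in range(len(gains)):
        interference = alphas[k + 1:].sum()          # layers not yet decoded
        sinr = gains[k] * snr * alphas[k] / (1.0 + gains[k] * snr * interference)
        rates.append(np.log2(1.0 + sinr))
    return np.array(rates)

gains = [0.2, 1.0, 3.0]                  # hypothetical channel states
alphas = [0.7, 0.2, 0.1]                 # hypothetical power split
r = layered_rates(gains, alphas, snr=10.0)
print(r, "cumulative rate per state:", np.cumsum(r))
```

The cumulative rate per state shows how image quality can improve progressively as the channel state improves, which is the behavior the adaptive JSCC scheme targets.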
To facilitate the emerging applications and demands of edge intelligence (EI)-empowered 6G networks, model-driven semantic communication has been proposed to reduce transmission volume by deploying artificial intelligence (AI) models that provide semantic extraction and recovery capabilities. Nevertheless, it is not feasible to preload all AI models on resource-constrained terminals, so in-time model transmission becomes a crucial problem. This paper proposes an intellicise model transmission architecture to guarantee reliable transmission of models for semantic communication. The mathematical relationship between model size and performance is formulated by a recognition error function supported by experimental data. We account for the characteristics of wireless channels and derive a closed-form expression for the model transmission outage probability (MTOP) over the Rayleigh channel. In addition, we define the effective model accuracy (EMA) to jointly evaluate the communication and intelligence performance of model transmission. We then propose a joint model selection and resource allocation (JMSRA) algorithm to maximize the average EMA of all users. Simulation results demonstrate that the average EMA of the JMSRA algorithm exceeds that of baseline algorithms by about 22%.
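The paper's MTOP expression additionally couples the rate requirement to model size through the recognition error function; the standard Rayleigh-fading building block underneath it is sketched below with hypothetical numbers (model size, deadline, bandwidth, and average SNR are all assumptions).

```python
# Standard Rayleigh-fading outage building block (illustrative; the paper's MTOP
# further ties the rate requirement to the transmitted model's size): with an
# exponentially distributed channel power gain, P_out = 1 - exp(-gamma_th / avg_snr),
# where gamma_th = 2^(R/B) - 1 is the SNR needed to support rate R over bandwidth B.
import math

def rayleigh_outage(rate_bps, bandwidth_hz, avg_snr_linear):
    gamma_th = 2 ** (rate_bps / bandwidth_hz) - 1
    return 1.0 - math.exp(-gamma_th / avg_snr_linear)

# Hypothetical numbers: a 2 MB model to be delivered within 2 s over 10 MHz at 15 dB.
model_bits = 2e6 * 8
required_rate = model_bits / 2.0                  # bits per second
print(rayleigh_outage(required_rate, 10e6, avg_snr_linear=10 ** (15 / 10)))
```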
The concept of semantic communication provides a novel approach for applications in scenarios with limited communication resources. In this paper, we propose an end-to-end (E2E) semantic molecular communication system that aims to enhance the efficiency of molecular communication by reducing the amount of transmitted information. Specifically, following the joint source-channel coding paradigm, the network is designed to encode the task-relevant information into the concentration of the information molecules, which is robust to the degradation of the molecular communication channel. Furthermore, we propose a channel network to enable E2E learning over the non-differentiable molecular channel. Experimental results demonstrate the superior performance of the semantic molecular communication system over conventional methods in classification tasks.
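One common way to realize such a channel network is to fit a small neural surrogate to input/output samples of the non-differentiable channel and backpropagate through the surrogate during E2E training. The sketch below illustrates the idea with a toy Poisson-count stand-in for the molecular channel; the surrogate here only fits the conditional mean, and all architectural choices are assumptions rather than the paper's design.

```python
# Sketch of the surrogate-channel idea (toy stand-in, not the paper's model):
# the "true" molecular channel here is a Poisson count of received molecules,
# which provides no useful gradient; a small MLP is fit to (tx, rx) samples and
# used as a differentiable channel during end-to-end training.
import torch
import torch.nn as nn

def molecular_channel(concentration, gain=20.0):
    """Non-differentiable channel: Poisson-distributed molecule counts."""
    return torch.poisson(gain * concentration)

channel_net = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(channel_net.parameters(), lr=1e-2)

# Fit the surrogate to samples drawn from the real (simulated) channel.
for _ in range(500):
    tx = torch.rand(256, 1)                       # transmitted concentrations in [0, 1]
    rx = molecular_channel(tx)
    loss = nn.functional.mse_loss(channel_net(tx), rx)  # fits the conditional mean
    opt.zero_grad()
    loss.backward()
    opt.step()

# During E2E training, gradients flow encoder -> channel_net -> decoder.
tx = torch.rand(4, 1, requires_grad=True)
channel_net(tx).sum().backward()
print(tx.grad)                                     # usable gradient w.r.t. the input
```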
In this paper, we associate mutual information with frame error rate (FER) performance and propose novel quantized decoders for polar codes. Based on the optimal quantizer for binary-input discrete memoryless channels (B-DMCs), the proposed decoders quantize the virtual subchannels of polar codes to maximize the mutual information (MMI) between the source bits and the quantized symbols. The nested structure of polar codes ensures that MMI quantization can be implemented stage by stage. Simulation results show that the proposed MMI decoders with 4 quantization bits outperform existing nonuniform quantized decoders that minimize the mean-squared error (MMSE) with 4 quantization bits, and even outperform uniform MMI quantized decoders with 5 quantization bits. Furthermore, the proposed 5-bit quantized MMI decoders approach floating-point decoders with negligible performance loss.
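To make the MMI idea concrete, the sketch below brute-forces a contiguous-boundary quantizer for a finely discretized binary-input AWGN channel and compares the mutual information before and after 2-bit quantization; this is the kind of per-subchannel MMI quantization the decoders apply stage by stage, though the paper's construction, channel parameters, and search method are not reproduced here.

```python
# Illustrative MMI quantizer search (not the paper's stage-by-stage construction):
# discretize a binary-input AWGN channel finely, then pick contiguous-bin
# boundaries that maximize the mutual information of the quantized output.
import itertools
import numpy as np

def mutual_info(p_y_given_0, p_y_given_1):
    """I(X;Y) in bits for a uniform binary input and the given output pmfs."""
    p_y = 0.5 * (p_y_given_0 + p_y_given_1)
    def term(p_y_given_x):
        mask = p_y_given_x > 0
        return 0.5 * np.sum(p_y_given_x[mask] * np.log2(p_y_given_x[mask] / p_y[mask]))
    return term(p_y_given_0) + term(p_y_given_1)

# Fine discretization of the BI-AWGN output (BPSK +/-1, noise std sigma).
sigma, edges = 0.8, np.linspace(-4, 4, 201)
centers = 0.5 * (edges[:-1] + edges[1:])
pdf = lambda mu: np.exp(-(centers - mu) ** 2 / (2 * sigma**2))
p0, p1 = pdf(+1.0), pdf(-1.0)
p0, p1 = p0 / p0.sum(), p1 / p1.sum()

def quantize(p, boundaries):
    """Merge fine bins into contiguous quantization regions."""
    return np.array([p[a:b].sum() for a, b in zip((0, *boundaries), (*boundaries, len(p)))])

# Brute-force the best 3 interior boundaries -> a 4-level (2-bit) quantizer.
candidates = range(10, len(centers) - 10, 5)
best = max(itertools.combinations(candidates, 3),
           key=lambda b: mutual_info(quantize(p0, b), quantize(p1, b)))
print("unquantized I(X;Y):", round(mutual_info(p0, p1), 4), "bits")
print("2-bit MMI quantized:", round(mutual_info(quantize(p0, best), quantize(p1, best)), 4), "bits")
```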
Although belief propagation bit-flip (BPBF) decoding improves the error-correction performance of polar codes, it relies on exhaustive flipping to approach the error-correction performance of CA-SCL decoding, resulting in high decoding complexity and latency. To alleviate this issue, we combine the LDPC-CRC-Polar coding scheme with BPBF and propose an improved belief propagation decoder for LDPC-CRC-Polar codes with bit-freezing (LDPC-CRC-Polar codes BPBFz). The proposed decoder employs the LDPC code to ensure the reliability of the flipping set, i.e., the critical set (CS), and to update it dynamically; the modified CS is then used to identify error-prone bits. The proposed LDPC-CRC-Polar codes BPBFz achieves remarkable error-correction performance, comparable to that of the CA-SCL ($L$ = 16) decoder in the medium-to-high signal-to-noise ratio (SNR) region. It gains up to 1.2 dB and 0.9 dB at a fixed BLER of $10^{-4}$ compared with BP and BPBF (CS-1), respectively. In addition, it has lower decoding latency than CA-SCL and BPBF; for example, it is 15 times faster than CA-SCL ($L$ = 16) in the high-SNR region.
In this paper, we propose a Multi-token Sector Antenna Neighbor Discovery (M-SAND) protocol to enhance the efficiency of neighbor discovery in asynchronous directional ad hoc networks. The central idea is to maintain multiple tokens across the network. To prevent mutual interference among multi-token holders, we introduce time and space non-interference theorems. Furthermore, we propose a master-slave strategy between tokens: when the master token holder (MTH) performs neighbor discovery, it decides which 1-hop neighbor becomes the next MTH and which 2-hop neighbors become new slave token holders (STHs). In this way, the MTH and multiple STHs can discover their neighbors simultaneously without interfering with one another. Building on this foundation, we provide a complete procedure for the M-SAND protocol. We also conduct theoretical analyses of the maximum number of STHs and the lower bound of the multi-token generation probability. Finally, simulation results demonstrate the time efficiency of the M-SAND protocol: compared with the Q-SAND protocol, which uses only one token, the total neighbor discovery time is reduced by 28% when 6 beams and 112 nodes are employed.
The telecommunications industry is becoming increasingly aware of potential subscriber churn driven by the growing popularity of smartphones in the mobile Internet era, the rapid development of telecommunications services, the implementation of the number portability policy, and intensifying competition among operators. At the same time, users' consumption preferences and choices are evolving. Because retaining existing customers is far less expensive than acquiring new ones, accurate churn prediction models are essential. However, conventional and learning-based algorithms can only go so deep into a single subscriber's data: they cannot account for changes in a subscriber's subscription and they ignore the coupling and correlation between features. In addition, current churn prediction models suffer from a high computational burden, ambiguous weight assignment, and significant resource costs. Existing prediction algorithms based on network models primarily consider the private information users share through text and pictures, ignoring the reference value provided by other users on the same package. To address these issues, this work proposes a user churn prediction model based on a Graph Attention Convolutional Neural Network (GAT-CNN). The main contributions of this paper are as follows. First, we present a three-tiered hierarchical cloud-edge cooperative framework that increases the volume of user feature input through two aggregations at the device, edge, and cloud layers. Second, we extend the use of users' own data by introducing self-attention and graph convolution models to track the relative changes of users and packages simultaneously. Finally, we build an integrated offline-online churn prediction system that combines the strengths of the two models, and we experimentally validate the efficacy of cloud-edge collaborative training and inference. In summary, the proposed GAT-CNN churn prediction model effectively addresses the drawbacks of conventional algorithms and offers telecom operators crucial decision support for developing subscriber retention strategies and cutting operational expenses.
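The abstract does not detail the GAT-CNN layers; as a reference for the graph-attention building block such a model relies on, the sketch below implements a single-head graph attention layer in PyTorch over a toy subscriber graph in which edges connect users on the same package. All sizes and the adjacency are hypothetical.

```python
# Reference sketch of one graph-attention layer (a generic GAT-style building
# block, not the paper's exact GAT-CNN): nodes are subscribers, edges link
# users who share a package, plus self-loops (all sizes are hypothetical).
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphAttentionLayer(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)
        self.attn = nn.Linear(2 * out_dim, 1, bias=False)

    def forward(self, x, adj):
        h = self.W(x)                                     # (N, out_dim)
        n = h.size(0)
        pairs = torch.cat([h.unsqueeze(1).expand(n, n, -1),
                           h.unsqueeze(0).expand(n, n, -1)], dim=-1)
        e = F.leaky_relu(self.attn(pairs).squeeze(-1), 0.2)   # raw attention scores
        e = e.masked_fill(adj == 0, float("-inf"))            # only real edges attend
        alpha = torch.softmax(e, dim=-1)
        return F.elu(alpha @ h)                               # aggregated node features

# Toy graph: 5 users, edges between users on the same package, plus self-loops.
adj = torch.eye(5)
adj[0, 1] = adj[1, 0] = adj[2, 3] = adj[3, 2] = 1
x = torch.randn(5, 16)                                        # per-user feature vectors
print(GraphAttentionLayer(16, 8)(x, adj).shape)               # torch.Size([5, 8])
```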
In this paper, we formulate the precoding problem for integrated sensing and communication (ISAC) waveforms as a non-convex quadratically constrained quadratic program (QCQP), in which the weighted sum of the communication multi-user interference (MUI) and the gap between the dual-use waveform and an ideal radar waveform is minimized under peak-to-average power ratio (PAPR) constraints. We propose an efficient algorithm based on the alternating direction method of multipliers (ADMM), which decouples the multiple variables and provides a closed-form solution for each subproblem. In addition, to improve the sensing performance in both the spatial and temporal domains, we propose a new criterion for designing the ideal radar waveform, in which the beam pattern is made similar to the ideal one and the integrated sidelobe level of the ambiguity function in each target direction is minimized over the region of interest. The limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) algorithm is applied to the design of the ideal radar waveform, which serves as a reference for the design of the dual-function waveform. Numerical results indicate that the designed dual-function waveform offers good communication quality of service (QoS) and sensing performance.
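To illustrate the ADMM splitting pattern (not the paper's full dual-function design, which also includes the radar-similarity term and the exact PAPR constraints), the sketch below minimizes the MUI energy ||HX - S||_F^2 subject to a per-entry amplitude cap used as a crude stand-in for the PAPR constraint: the X-update has a closed form, the Z-update is an entrywise projection, and a scaled dual variable ties them together.

```python
# Stripped-down ADMM sketch (illustrative splitting only, not the paper's full
# dual-function design): minimize ||H X - S||_F^2 over the waveform X subject
# to a per-entry amplitude cap, a crude stand-in for the PAPR constraint.
import numpy as np

rng = np.random.default_rng(0)
K, N, L = 4, 8, 16                      # users, transmit antennas, symbols per frame
H = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2)
S = np.exp(1j * 2 * np.pi * rng.integers(0, 4, (K, L)) / 4)   # QPSK targets
amp_cap, rho = 0.6, 1.0

X = np.zeros((N, L), complex)
Z = np.zeros_like(X)
U = np.zeros_like(X)
# Precompute the matrix inverse used by the closed-form X-update.
A_inv = np.linalg.inv(H.conj().T @ H + rho * np.eye(N))

for _ in range(200):
    # X-update: least squares with a proximal term, solved in closed form.
    X = A_inv @ (H.conj().T @ S + rho * (Z - U))
    # Z-update: projection onto the amplitude-capped set (entrywise clipping).
    V = X + U
    Z = V * np.minimum(1.0, amp_cap / np.maximum(np.abs(V), 1e-12))
    # Dual update.
    U += X - Z

print("MUI energy:", np.linalg.norm(H @ Z - S) ** 2)
print("max entry amplitude:", np.abs(Z).max(), "(cap =", amp_cap, ")")
```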