Introduction

The emergence of semantic communications (SemCom), which relies on deep learning to drive efficient semantic encoding and decoding of information across a wireless network with low spectrum consumption, has been a game changer in wireless communications.  

However, while the information may be transmitted efficiently and have high-quality semantic refinement, the existing SemCom structure is limited by the lack of context-reasoning ability and background knowledge provisioning. Improving this element requires a tremendous amount of pretraining data and background knowledge preparation, which is where Generative Artificial Intelligence (GAI) offers potential. 

In this paper, the authors discuss a novel GAI-integrated SemCom network (GAI-SCN) framework in a cloud-edge-mobile design to enable better SemCom training efficiency. To date, only a few studies have explored the potential of incorporating GAI with SemCom. Key highlights of the paper are outlined below. 

Benefits

GAI models can produce vast amounts of multimodal content (text, image, and video) with a certain degree of authenticity, material that is valuable for semantic training and background knowledge. SemCom can readily access existing AI-generated content (AIGC) to give semantic coding models a better understanding of the context information. This helps to fine-tune the pretraining data with high-quality, precise content.

With well-trained GAI models, only a few prompts need to be sent in each SemCom process instead of the whole source information, which significantly reduces the bandwidth required while retaining the original meaning.
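The bandwidth argument above can be made concrete with a rough back-of-the-envelope sketch. The function names and the example sizes are illustrative assumptions, not figures from the paper; the sketch only compares the size of a text prompt with the size of the source it stands in for.

```python
def payload_bytes(prompt: str) -> int:
    """Size of a text prompt on the wire, assuming plain UTF-8 encoding."""
    return len(prompt.encode("utf-8"))


def bandwidth_saving(source_size_bytes: int, prompt: str) -> float:
    """Fraction of payload saved by sending the prompt instead of the source.

    Hypothetical helper: it ignores protocol overhead and the cost of
    regenerating the content at the receiver.
    """
    if source_size_bytes <= 0:
        raise ValueError("source_size_bytes must be positive")
    return 1.0 - payload_bytes(prompt) / source_size_bytes


# Illustrative numbers only: a 2 MB image described by a short prompt.
prompt = "a red car parked beside a tree at sunset"
saving = bandwidth_saving(2_000_000, prompt)
print(f"prompt: {payload_bytes(prompt)} bytes, saving: {saving:.4%}")
```

The saving only holds if the receiver's GAI model can regenerate content with the intended meaning from the prompt, which is the condition the passage above places on "well-trained" models.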

The biggest benefit of combining GAI with a semantic communication network is that it results in low spectrum consumption, which leads to more efficient spectrum utilization. When spectrum consumption is low, less of the wireless network’s bandwidth, which is a finite resource, is used. Using less bandwidth also reduces interference with other nearby wireless signals and allows the network to accommodate more devices and applications.

Solution

A novel GAI-integrated SemCom network (GAI-SCN) framework in a cloud-edge-mobile design incorporates both global and local GAI models with the Joint Source-Channel Coding (JSCC) process. JSCC is a technique in which source coding (compressing data such as an image or video) and channel coding are performed jointly, so that transmission is optimized for both the source information and the characteristics of the wireless channel rather than treating the two as separate steps. This not only enables efficient, high-quality delivery of meaning, but also significantly reduces transmission traffic and latency (the delay between when a user requests data and when they receive it).
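To illustrate the idea of a single joint mapping, here is a toy sketch. It is not the deep-learning JSCC used in the paper: it is a simple analog (linear) scheme, chosen as an assumption for illustration, in which source samples are scaled to a power budget and sent directly as channel symbols, with no separate bit-level source and channel codes. All function names and parameters are hypothetical.

```python
import math
import random


def jscc_encode(source, repeat, power=1.0):
    """Map source samples directly to channel symbols.

    Each sample is scaled to meet the average power budget and repeated
    `repeat` times, so one linear step plays both the compression and the
    protection role.
    """
    energy = sum(x * x for x in source) / len(source)
    gain = math.sqrt(power / energy) if energy > 0 else 1.0
    symbols = [gain * x for x in source for _ in range(repeat)]
    return symbols, gain


def awgn(symbols, snr_db, power=1.0, rng=None):
    """Add white Gaussian noise for a given SNR (seeded for repeatability)."""
    if rng is None:
        rng = random.Random(0)
    noise_std = math.sqrt(power / (10 ** (snr_db / 10)))
    return [s + rng.gauss(0.0, noise_std) for s in symbols]


def jscc_decode(received, repeat, gain):
    """Average the repeated symbols and undo the transmit scaling."""
    return [
        sum(received[i:i + repeat]) / (repeat * gain)
        for i in range(0, len(received), repeat)
    ]


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


source = [random.Random(1).uniform(-1, 1) for _ in range(200)]
symbols, gain = jscc_encode(source, repeat=3)
for snr_db in (0.0, 10.0, 20.0):
    recovered = jscc_decode(awgn(symbols, snr_db), 3, gain)
    print(f"SNR {snr_db:>4} dB -> MSE {mse(source, recovered):.5f}")
```

In this toy scheme the reconstruction error shrinks smoothly as the channel improves, one of the properties usually cited in favor of joint designs. The sketch assumes the receiver knows the transmit gain.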

How it works

The framework essentially divides up where data is processed into three areas: the mobile device, referred to as the terminal device (TD); the edge layer, a nearby server that can manage computationally intensive tasks; and the cloud layer where data resources important for pre-training are housed. 
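The three-way split described above can be pictured as a simple routing rule. The sketch below is an assumption for illustration only: the capacity thresholds, the `compute_units` cost estimate, and the `needs_pretraining_data` flag are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from enum import Enum


class Layer(Enum):
    MOBILE = "mobile"  # terminal device (TD)
    EDGE = "edge"      # nearby server for computationally intensive tasks
    CLOUD = "cloud"    # houses the data resources used for pre-training


@dataclass(frozen=True)
class Task:
    name: str
    compute_units: float                  # hypothetical cost estimate
    needs_pretraining_data: bool = False  # hypothetical flag


def assign_layer(task: Task, td_capacity: float = 1.0,
                 edge_capacity: float = 50.0) -> Layer:
    """Pick the lowest layer that can plausibly handle the task."""
    if task.needs_pretraining_data:
        return Layer.CLOUD
    if task.compute_units <= td_capacity:
        return Layer.MOBILE
    if task.compute_units <= edge_capacity:
        return Layer.EDGE
    return Layer.CLOUD


for task in (
    Task("keyword extraction", 0.5),
    Task("image regeneration", 20.0),
    Task("global model pre-training", 500.0, needs_pretraining_data=True),
):
    print(f"{task.name}: {assign_layer(task).value}")
```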

The workflow has three successive stages: Initial Network Preparation Stage (mobile layer), GAI-integrated SemCom Service Provisioning Stage (edge layer), and Model Synchronization and Update Stage (cloud layer). The image below outlines the process.

A SemCom-enabled cellular network scenario, where there are multiple terminal devices (TDs) of senders and receivers within the coverage of base stations (BSs).

 

In a SemCom-enabled cellular network, there are multiple terminal devices (TDs) of senders and receivers within the coverage of base stations (BSs), commonly referred to as cell phone towers. When messages containing text, images, and video are transmitted, the SemCom function kicks in at the TD, with its various components handling different jobs to prepare the message for transmission to the BS. The TD is also equipped with a small GAI function to handle lightweight tasks. Some computationally intensive tasks are offloaded to cloud-based GAI models to reduce activity in the wireless network. Once the information arrives at the BS, the BS schedules and coordinates the information.

The proposed GAI-SCN design focuses primarily on how to accurately transmit and recover the core semantics between the source and destination. It does not factor in signal transmission inconsistency, which is a common occurrence.

Challenges

Several issues with this framework have been identified, including device hardware limitations, content inconsistency, and users' reluctance to share information.

Limited Device Resources for Supporting AI Modules
Today’s mobile phones have limited storage, memory, computational units, and battery power to support the sophisticated AI-enabled computing modules in the proposed GAI-SCN. Next-generation mobile phones equipped with advanced model compression and acceleration technologies can help to reduce the complexity and size of AI networks and make this model commercially viable.

Comparisons between original and recovered images by the proposed GAI-SCN framework in terms of three metrics: A) Semantic similarity by spaCy; B) Object quantity discrepancy; C) Recovery ratio of original objects.

 

Randomness of Content Rendered in GAI-SCN
Another challenge is that the content produced by AIGC and by the cloud GAI may vary even when given the same keywords and goals. Other message discrepancies come from uncertainty in the information processed by the semantic decoder, caused by knowledge mismatches or semantic errors. Addressing this requires more training on keyword extraction and more semantic calibration.
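One way to quantify such discrepancies is with object-level measures like the two named in the figure caption above (object quantity discrepancy and recovery ratio of original objects). The paper's exact definitions are not given here, so the formulas below are assumptions: the discrepancy is taken as the absolute difference in object counts, and the recovery ratio as the share of original objects that reappear in the recovered image. The object labels are assumed to come from some detector that is not shown.

```python
from collections import Counter


def object_quantity_discrepancy(original, recovered):
    """Absolute difference in the number of detected objects (assumed definition)."""
    return abs(len(original) - len(recovered))


def recovery_ratio(original, recovered):
    """Share of original objects that reappear in the recovered image.

    Labels are matched as a multiset, so two dogs in the original need
    two dogs in the recovered image to count as fully recovered.
    """
    if not original:
        raise ValueError("original must contain at least one object")
    matched = sum((Counter(original) & Counter(recovered)).values())
    return matched / len(original)


# Hypothetical detector outputs for one image pair.
original_objects = ["dog", "dog", "car", "tree"]
recovered_objects = ["dog", "car", "car", "person", "bench"]
print(object_quantity_discrepancy(original_objects, recovered_objects))
print(recovery_ratio(original_objects, recovered_objects))
```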

Inactive Sharing of Background Knowledge and Personal Preferences
Both AIGC and SemCom depend on access to large amounts of data. If that data is not made available, these functions are limited. A score-based or rewards-based incentive scheme could encourage users to contribute personal data voluntarily.

Conclusion

As one of the early forays into the potential of a GAI-assisted SemCom network to enhance communication resource usage and user experience, this paper provides both initial ideas and further lines of inquiry in this area.
