TY - JOUR
T1 - Generative AI-Driven Multi-Agent DRL for Task Allocation in UAV-Assisted EMPD within 6G-Enabled SAGIN Networks
AU - Betalo, Mesfin Leranso
AU - Ullah, Inam
AU - Tesema, Fiseha Berhanu
AU - Wu, Zongze
AU - Li, Jianqiang
AU - Bai, Xiaoshan
PY - 2025/6
Y1 - 2025/6
N2 - The Internet of Health Monitoring (IoHM) plays a vital role in Emergency Medical Package Delivery (EMPD) by enabling real-time monitoring and transmission of critical health data through interconnected devices in 6G networks. UAVs act as Aerial Base Stations (ABSs), facilitating data collection and transmission between GAI-IoHM devices and edge servers. This is crucial for efficient communication in 6G-enabled Space-Air-Ground Integrated Networks (SAGIN). However, UAVs supporting EMPD face challenges related to limited energy and computational capacity, especially during task offloading to edge servers. To address these constraints, this paper proposes the integration of Generative Artificial Intelligence (GAI) into UAVs for intelligent policy learning, enabling adaptive decision-making, real-time diagnostics, and efficient path planning under uncertainty. We present a novel multi-agent deep reinforcement learning (MADRL)-based joint optimization framework for cooperative task allocation, trajectory planning, and power management (CTATP) in a 6G-enabled SAGIN architecture. The problem is modeled as a Partially Observable Markov Decision Process (POMDP) to capture dynamic and uncertain operational conditions. To solve this, we introduce the GAI-based Deep Deterministic Double Policy Gradient (GAI-DD3PG) algorithm, which leverages a generative actor-network to learn adaptive, energy-efficient control policies from latent action spaces. Simulations in urban emergency scenarios with 10–20 UAVs demonstrate GAI-DD3PG’s efficacy. Compared to Proximal Policy Optimization (PPO), Deep Deterministic Policy Gradient (DDPG), Multi-Agent Federated Reinforcement Learning (MAFRL), and Greedy heuristics, GAI-DD3PG achieves a 20% energy reduction, 15% higher delivery success rate, 25% shorter trajectories, and 30% improved resource utilization. These results highlight its potential for reliable EMPD in complex 6G SAGIN environments.
AB - The Internet of Health Monitoring (IoHM) plays a vital role in Emergency Medical Package Delivery (EMPD) by enabling real-time monitoring and transmission of critical health data through interconnected devices in 6G networks. UAVs act as Aerial Base Stations (ABSs), facilitating data collection and transmission between GAI-IoHM devices and edge servers. This is crucial for efficient communication in 6G-enabled Space-Air-Ground Integrated Networks (SAGIN). However, UAVs supporting EMPD face challenges related to limited energy and computational capacity, especially during task offloading to edge servers. To address these constraints, this paper proposes the integration of Generative Artificial Intelligence (GAI) into UAVs for intelligent policy learning, enabling adaptive decision-making, real-time diagnostics, and efficient path planning under uncertainty. We present a novel multi-agent deep reinforcement learning (MADRL)-based joint optimization framework for cooperative task allocation, trajectory planning, and power management (CTATP) in a 6G-enabled SAGIN architecture. The problem is modeled as a Partially Observable Markov Decision Process (POMDP) to capture dynamic and uncertain operational conditions. To solve this, we introduce the GAI-based Deep Deterministic Double Policy Gradient (GAI-DD3PG) algorithm, which leverages a generative actor-network to learn adaptive, energy-efficient control policies from latent action spaces. Simulations in urban emergency scenarios with 10–20 UAVs demonstrate GAI-DD3PG’s efficacy. Compared to Proximal Policy Optimization (PPO), Deep Deterministic Policy Gradient (DDPG), Multi-Agent Federated Reinforcement Learning (MAFRL), and Greedy heuristics, GAI-DD3PG achieves a 20% energy reduction, 15% higher delivery success rate, 25% shorter trajectories, and 30% improved resource utilization. These results highlight its potential for reliable EMPD in complex 6G SAGIN environments.
KW - Generative artificial intelligence
KW - multi-agent deep reinforcement learning
KW - trajectory
KW - power optimization
KW - energy-efficient UAV operation
KW - space-air-ground integrated network
UR - https://doi.org/10.1109/JIOT.2025.3579780
U2 - 10.1109/JIOT.2025.3579780
DO - 10.1109/JIOT.2025.3579780
M3 - Article
SN - 2327-4662
JO - IEEE Internet of Things Journal
JF - IEEE Internet of Things Journal
ER -