Generative AI-Driven Multi-Agent DRL for Task Allocation in UAV-Assisted EMPD within 6G-Enabled SAGIN Networks

Mesfin Leranso Betalo; Inam Ullah; Fiseha Berhanu Tesema; Zongze Wu; Jianqiang Li; Xiaoshan Bai

doi:10.1109/JIOT.2025.3579780

Research output: Journal Publication › Article › peer-review

Abstract

The Internet of Health Monitoring (IoHM) plays a vital role in Emergency Medical Package Delivery (EMPD) by enabling real-time monitoring and transmission of critical health data through interconnected devices in 6G networks. UAVs act as Aerial Base Stations (ABSs), facilitating data collection and transmission between GAI-IoHM devices and edge servers. This is crucial for efficient communication in 6G-enabled Space-Air-Ground Integrated Networks (SAGIN). However, UAVs supporting EMPD face challenges related to limited energy and computational capacity, especially during task offloading to edge servers. To address these constraints, this paper proposes the integration of Generative Artificial Intelligence (GAI) into UAVs for intelligent policy learning, enabling adaptive decision-making, real-time diagnostics, and efficient path planning under uncertainty. We present a novel multi-agent deep reinforcement learning (MADRL)-based joint optimization framework for cooperative task allocation, trajectory planning, and power management (CTATP) in a 6G-enabled SAGIN architecture. The problem is modeled as a Partially Observable Markov Decision Process (POMDP) to capture dynamic and uncertain operational conditions. To solve this, we introduce the GAI-based Deep Deterministic Double Policy Gradient (GAI-DD3PG) algorithm, which leverages a generative actor-network to learn adaptive, energy-efficient control policies from latent action spaces. Simulations in urban emergency scenarios with 10–20 UAVs demonstrate GAI-DD3PG’s efficacy. Compared to Proximal Policy Optimization (PPO), Deep Deterministic Policy Gradient (DDPG), Multi-Agent Federated Reinforcement Learning (MAFRL), and Greedy heuristics, GAI-DD3PG achieves a 20% energy reduction, 15% higher delivery success rate, 25% shorter trajectories, and 30% improved resource utilization. These results highlight its potential for reliable EMPD in complex 6G SAGIN environments.

Original language	English
Journal	IEEE Internet of Things Journal
DOIs	https://doi.org/10.1109/JIOT.2025.3579780
Publication status	Published - Jun 2025

Keywords

Generative artificial intelligence
multi-agent deep reinforcement learning
trajectory
power optimization
energy-efficient UAV operation
space-air-ground integrated network

Access to Document

10.1109/JIOT.2025.3579780

Cite this

@article{3f005f00142245819517ffa267eb5124,

title = "Generative AI-Driven Multi-Agent DRL for Task Allocation in UAV-Assisted EMPD within 6G-Enabled SAGIN Networks",

abstract = "The Internet of Health Monitoring (IoHM) plays a vital role in Emergency Medical Package Delivery (EMPD) by enabling real-time monitoring and transmission of critical health data through interconnected devices in 6G networks. UAVs act as Aerial Base Stations (ABSs), facilitating data collection and transmission between GAI-IoHM devices and edge servers. This is crucial for efficient communication in 6G-enabled Space-Air-Ground Integrated Networks (SAGIN). However, UAVs supporting EMPD face challenges related to limited energy and computational capacity, especially during task offloading to edge servers. To address these constraints, this paper proposes the integration of Generative Artificial Intelligence (GAI) into UAVs for intelligent policy learning, enabling adaptive decision-making, real-time diagnostics, and efficient path planning under uncertainty. We present a novel multi-agent deep reinforcement learning (MADRL)-based joint optimization framework for cooperative task allocation, trajectory planning, and power management (CTATP) in a 6G-enabled SAGIN architecture. The problem is modeled as a Partially Observable Markov Decision Process (POMDP) to capture dynamic and uncertain operational conditions. To solve this, we introduce the GAI-based Deep Deterministic Double Policy Gradient (GAI-DD3PG) algorithm, which leverages a generative actor-network to learn adaptive, energy-efficient control policies from latent action spaces. Simulations in urban emergency scenarios with 10–20 UAVs demonstrate GAI-DD3PG{\textquoteright}s efficacy. Compared to Proximal Policy Optimization (PPO), Deep Deterministic Policy Gradient (DDPG), Multi-Agent Federated Reinforcement Learning (MAFRL), and Greedy heuristics, GAI-DD3PG achieves a 20% energy reduction, 15% higher delivery success rate, 25% shorter trajectories, and 30% improved resource utilization. These results highlight its potential for reliable EMPD in complex 6G SAGIN environments.",

keywords = "Generative artificial intelligence, multi-agent deep reinforcement learning, trajectory, power optimization, energy-efficient UAV operation, space-air-ground integrated network",

author = "Betalo, {Mesfin Leranso} and Inam Ullah and Tesema, {Fiseha Berhanu} and Zongze Wu and Jianqiang Li and Xiaoshan Bai",

year = "2025",

month = jun,

doi = "10.1109/JIOT.2025.3579780",

language = "English",

journal = "IEEE Internet of Things Journal",

issn = "2327-4662",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - JOUR

T1 - Generative AI-Driven Multi-Agent DRL for Task Allocation in UAV-Assisted EMPD within 6G-Enabled SAGIN Networks

AU - Betalo, Mesfin Leranso

AU - Ullah, Inam

AU - Tesema, Fiseha Berhanu

AU - Wu, Zongze

AU - Li, Jianqiang

AU - Bai, Xiaoshan

PY - 2025/6

Y1 - 2025/6

AB - The Internet of Health Monitoring (IoHM) plays a vital role in Emergency Medical Package Delivery (EMPD) by enabling real-time monitoring and transmission of critical health data through interconnected devices in 6G networks. UAVs act as Aerial Base Stations (ABSs), facilitating data collection and transmission between GAI-IoHM devices and edge servers. This is crucial for efficient communication in 6G-enabled Space-Air-Ground Integrated Networks (SAGIN). However, UAVs supporting EMPD face challenges related to limited energy and computational capacity, especially during task offloading to edge servers. To address these constraints, this paper proposes the integration of Generative Artificial Intelligence (GAI) into UAVs for intelligent policy learning, enabling adaptive decision-making, real-time diagnostics, and efficient path planning under uncertainty. We present a novel multi-agent deep reinforcement learning (MADRL)-based joint optimization framework for cooperative task allocation, trajectory planning, and power management (CTATP) in a 6G-enabled SAGIN architecture. The problem is modeled as a Partially Observable Markov Decision Process (POMDP) to capture dynamic and uncertain operational conditions. To solve this, we introduce the GAI-based Deep Deterministic Double Policy Gradient (GAI-DD3PG) algorithm, which leverages a generative actor-network to learn adaptive, energy-efficient control policies from latent action spaces. Simulations in urban emergency scenarios with 10–20 UAVs demonstrate GAI-DD3PG’s efficacy. Compared to Proximal Policy Optimization (PPO), Deep Deterministic Policy Gradient (DDPG), Multi-Agent Federated Reinforcement Learning (MAFRL), and Greedy heuristics, GAI-DD3PG achieves a 20% energy reduction, 15% higher delivery success rate, 25% shorter trajectories, and 30% improved resource utilization. These results highlight its potential for reliable EMPD in complex 6G SAGIN environments.

KW - Generative artificial intelligence

KW - multi-agent deep reinforcement learning

KW - trajectory

KW - power optimization

KW - energy-efficient UAV operation

KW - space-air-ground integrated network

UR - https://doi.org/10.1109/JIOT.2025.3579780

U2 - 10.1109/JIOT.2025.3579780

DO - 10.1109/JIOT.2025.3579780

M3 - Article

SN - 2327-4662

JO - IEEE Internet of Things Journal

JF - IEEE Internet of Things Journal

ER -
