Ph.D. (Engg) : Distributed Artificial Intelligence Technology for Robotic Swarms: An Interpretable Online Learning Perspective

Start: 2026-06-10T10:30:00+05:30
End: 2026-06-10T12:00:00+05:30

June 10 @ 10:30 AM - 12:00 PM

Virtual Event

Robotic swarms offer immense potential for critical operations such as search and rescue, surveillance, and environmental monitoring, owing to their distributed nature and redundancy. However, the operational reliability of a swarm depends heavily on a continuous, closed-loop process: extracting high-fidelity collective situational awareness from distributed observations, and then using that awareness for cooperative decision control in non-stationary environments. To operate reliably across unstructured and dynamic domains, the swarm must remain resilient in adverse situations. For instance, in real-world target searching and surrounding, environmental hazards can cause physical sensors to degrade or fail, leading to significant prediction biases, communication link drops, or a complete loss of ground truth. Further, robots operating in such dynamically evolving situations are frequently tasked with conflicting spatial objectives, such as aggressively pursuing an agile target while safely navigating dense clutter, which requires intelligent control under limited actuation and actuator velocity saturation. To address these demands, this thesis develops an online-learning-based Distributed Artificial Intelligence Technology (DAITy) for robotic swarms, which establishes a collective Situational Awareness – Decision Control (SA-DC) loop capable of resilient swarm cooperation and coordination across unstructured and non-stationary environments.

This thesis first focuses on extracting high-fidelity collective situational awareness from distributed observations under severe sensory biases and intermittent communication failures, enabling the robots to accurately determine their own location and predict target behavior. Two frameworks are proposed: Distributed Learning-based Decentralized Cooperative Localization (DL-DCL) and Distributed Online learning-based Multi-Estimate (DOME) fusion. These supervised, loss-driven online learning frameworks dynamically learn an information fusion strategy that combines pose estimates (or target position predictions) from onboard sensors (or prediction algorithms) and from neighboring robots. To handle dynamically evolving situations and recover target trajectories despite severe environmental interference, both frameworks combine periodic reset mechanisms, which shed historical inertia, with bounded loss functions, which prevent faulty sensors from causing explosive estimation errors. The theoretical analysis of both frameworks establishes the convergence of the fusion weights to the optimal local or social estimate/prediction; DL-DCL’s analysis assumes a deterministic communication network, while DOME’s assumes a random network. Quantitative evaluations demonstrate that DL-DCL improves own-pose estimation performance by approximately 40% under severe sensor faults, while DOME achieves up to a 74% reduction in target position prediction loss compared to baselines that include both covariance-based and online learning methods.
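As a rough illustration of the loss-driven fusion idea described above, the following minimal Python sketch uses a generic multiplicative-weights update with a bounded loss and a periodic reset. It is not the DL-DCL or DOME algorithm: the loss shape, the learning rate `eta`, the reset period `reset_every`, and all function names are assumptions made purely for illustration.

```python
import math

def bounded_loss(estimate, reference, scale=1.0):
    """Squared error squashed into [0, 1), so a faulty source cannot blow up the update."""
    err2 = sum((e - r) ** 2 for e, r in zip(estimate, reference))
    return 1.0 - math.exp(-err2 / scale)

def fuse(estimates, weights):
    """Convex combination of the per-source estimates."""
    dim = len(estimates[0])
    return [sum(w * est[d] for w, est in zip(weights, estimates)) for d in range(dim)]

def update_weights(weights, losses, eta=2.0):
    """Multiplicative-weights step followed by renormalisation (assumed update rule)."""
    raw = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    total = sum(raw)
    return [r / total for r in raw]

def run(stream, n_sources, reset_every=50, eta=2.0):
    """Process (estimates, reference) pairs; return final weights and last fused estimate."""
    weights = [1.0 / n_sources] * n_sources
    fused = None
    for t, (estimates, reference) in enumerate(stream, start=1):
        fused = fuse(estimates, weights)
        losses = [bounded_loss(e, reference) for e in estimates]
        weights = update_weights(weights, losses, eta)
        if t % reset_every == 0:
            # Periodic reset: forget accumulated history and return to uniform weights.
            weights = [1.0 / n_sources] * n_sources
    return weights, fused
```

In this sketch a source that consistently matches the supervisory reference quickly gains most of the weight, while the reset lets a previously penalised source regain influence if conditions change.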

While supervised online learning is effective when landmarks or leaders are available, swarms often operate in unstructured, GPS-denied environments that lack such ground-truth supervisors entirely. To tackle this challenge, the second contribution of this thesis introduces a reward-driven online learning framework based on the concepts of independent learning, or self-learning. The Autonomous Online Learning (AOL) framework enables resilient cooperative target monitoring through a reward-driven weighted information fusion process that extracts an accurate target location from limited, intermittent exteroception among the robots in the swarm. Three variants are developed, introducing a novel perturbation-greedy reward design that facilitates exploration-exploitation in the fusion weight space. The AOL framework enables the swarm to isolate faulty robots and prioritize reliable information dynamically in adverse situations. Through rigorous convergence analysis, the framework theoretically guarantees that the fusion weights converge to prioritize the most accurate information source. The top-performing variant, AOL-1P, demonstrates a 182.2% to 652% improvement in target detection scores and a 94.7% to 150.4% improvement in tracking closeness across varying swarm sizes over established baselines, ensuring robust monitoring even when 50% of the total swarm population is undergoing permanent sensor failures.

With high-fidelity situational awareness established, the swarm must execute distributed decision control in non-stationary environments. Standard online optimization methods are often ineffective or inefficient on physical robotic swarms because of the dual challenge of simultaneous multi-objective balancing and actuator descent-step saturation, particularly when target maneuvers force the swarm into dense clutter that requires seamless transitions between target tracking, pursuit, and collision avoidance. As the third contribution, this thesis proposes the Softmax-Adaptive Objective Balancing in Multi-Objective Online Gradient Descent (SAO-MOOGD) framework. SAO-MOOGD uses a bounded loss signal to update objective weights through a softmax function, enabling dynamic weighted averaging of the objective gradients. This allows simultaneous, real-time balancing of competing goals, such as decoupled collision avoidance versus target tracking. Theoretical analysis yields dynamic regret bounds that explicitly quantify the error induced by physical velocity saturation, showing that sub-linear regret growth can be guaranteed when proportional velocity controllers are appropriately tuned. The analysis further proves exponential decay of the aggregation gap regret under non-degenerate loss margins. Complementing these theoretical guarantees, the approach leverages the 1/2-Lipschitz continuity of the softmax operator to ensure smooth physical trajectory blending, effectively eliminating heuristic limit-cycling. Across extensive ROS-Gazebo evaluations using TurtleBot3s, SAO-MOOGD achieves improved multi-objective performance. Compared to high-tracking heuristic baselines, it exhibits up to 5.5% lower target tracking accuracy while reducing collisions by up to 67% and lowering control effort by 30%. Further, it demonstrates improved resilience, incurring a reduction of less than 1% in tracking accuracy even under 40% communication packet drops and severe Gaussian sensory noise.
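The softmax-weighted gradient blending described above can be pictured with the minimal Python sketch below. It is only an illustration under stated assumptions, not the SAO-MOOGD update itself: the choice that objectives with a larger bounded loss receive a larger weight, the `temperature`, the proportional `gain`, the speed limit `v_max`, and all function names are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def saturate(step, v_max):
    """Clip the step to norm v_max, mimicking actuator velocity saturation."""
    norm = math.sqrt(sum(s * s for s in step))
    if norm <= v_max or norm == 0.0:
        return list(step)
    return [v_max * s / norm for s in step]

def blended_step(position, gradients, losses, temperature=1.0, gain=0.5, v_max=0.2):
    """One descent step on a softmax-weighted average of the objective gradients."""
    # Assumption: objectives with a larger current bounded loss get more weight.
    weights = softmax([l / temperature for l in losses])
    dim = len(position)
    blended = [sum(w * g[d] for w, g in zip(weights, gradients)) for d in range(dim)]
    step = saturate([-gain * b for b in blended], v_max)
    return [p + s for p, s in zip(position, step)], weights
```

Here `gradients` would hold one gradient per objective (for example tracking and collision avoidance) and `losses` the corresponding bounded loss values in [0, 1]; because the weights vary smoothly with the losses, the blended direction changes gradually rather than switching abruptly between objectives.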

During real-world deployments, maintaining precise geometric configurations under physical disturbances is crucial for the Situational Awareness – Decision Control loop. As the final contribution, this thesis introduces the Topological Online Learning for Displacement-based (TOLD) formation control framework. Unlike conventional robust controllers that regulate node-level inputs without modifying the interaction topology, TOLD performs real-time edge-level adaptation to preserve the swarm’s structural integrity in dynamically evolving situations. Two strategies are proposed: Online Gradient Flow (OGF) with unconstrained weights, and Online Exponential Gradient Flow (OExpGF) with non-negative convex weights. Theoretical analysis proves that, under directed communication topologies, the convex OExpGF strategy drives single-integrator agents to asymptotic consensus, whereas the unconstrained OGF approach guarantees bounded structural distortion. Evaluated in both simulation and physical hardware experiments using Crazyflie 2.0 quadrotors, TOLD significantly reduces formation distortion: in hardware experiments, OGF achieves a 62% reduction and OExpGF a reduction of over 31.4% in median formation distortion compared to static interaction topologies.
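The difference between unconstrained and non-negative convex edge weights can be seen in the following minimal Python sketch, which contrasts a plain gradient step with a standard exponentiated-gradient step. It is a discrete-time illustration only and does not reproduce the OGF or OExpGF flows: the gradient vector `grad`, the step size `eta`, and the function names are assumptions.

```python
import math

def gradient_step(weights, grad, eta=0.1):
    """Unconstrained update: weights may turn negative and need not sum to one."""
    return [w - eta * g for w, g in zip(weights, grad)]

def exp_gradient_step(weights, grad, eta=0.1):
    """Exponentiated-gradient update: weights stay positive and sum to one."""
    raw = [w * math.exp(-eta * g) for w, g in zip(weights, grad)]
    total = sum(raw)
    return [r / total for r in raw]
```

In this picture `weights` would be one agent's weights on its incoming edges and `grad` a hypothetical gradient of some distortion measure with respect to those weights; the multiplicative form keeps the weights on the probability simplex without an explicit projection.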

Overall, this thesis develops Distributed Artificial Intelligence Technology (DAITy) for robotic swarms from an interpretable online learning perspective, connecting collective information to decentralized situational awareness and decision control. Theoretical analysis, high-fidelity simulations, and real-world hardware experiments confirm the effectiveness and practical applicability of the proposed frameworks in unstructured and dynamically evolving situations.

Speaker: Shubhankar Gupta

Research Supervisor: Prof. Suresh Sundaram

Teams Meeting Link: https://teams.microsoft.com/meet/48118742709693?p=5x8mXnmrvKS0j3tx9g
Meeting ID: 481 187 427 096 93
Passcode: 9JQ6Tg2d

Details

Date: June 10
Time: 10:30 AM - 12:00 PM
Event Category: Thesis Colloquium / Defence

Venue

STC Seminar Hall, Dept. of Aerospace Engineering
