Research

This page provides research highlights. Please see our publications page for more information, or feel free to contact us; we enjoy discussing interesting ideas and pursuing new collaborations.


Multi-Robot Task Allocation Games in Dynamically Changing Environments
Researchers: Shinkyu Park, Desmond Zhong, and Naomi Ehrich Leonard
Abstract: We propose a game-theoretic multi-robot task allocation framework that enables a large team of robots to optimally allocate tasks in dynamically changing environments. As our main contribution, we design a decision-making algorithm that defines how the robots select tasks to perform and how they repeatedly revise their task selections in response to changes in the environment. Our convergence analysis establishes that the algorithm enables the robots to learn and asymptotically achieve the optimal stationary task allocation. Through experiments with a multi-robot trash collection application, we assess the algorithm’s responsiveness to changing environments and resilience to failure of individual robots.
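The central object in a framework of this kind is the revision rule each robot applies when it reconsiders its current task. As a rough illustration only, and not the paper's specific algorithm, the sketch below implements a standard pairwise-comparison revision protocol from population games; the payoff function, rate constant, and variable names are assumptions.

import numpy as np

rng = np.random.default_rng(0)

def revise_task(current_task, payoffs, revision_rate=0.1):
    # One pairwise-comparison revision step for a single robot (illustrative).
    # payoffs[j] is an estimated payoff of performing task j under the current
    # allocation and environment; names and constants are placeholders.
    candidate = rng.integers(len(payoffs))
    gain = max(payoffs[candidate] - payoffs[current_task], 0.0)
    # Switch with probability proportional to the payoff improvement.
    if rng.random() < min(revision_rate * gain, 1.0):
        return candidate
    return current_task

Because each robot only ever switches toward tasks with higher estimated payoff, repeated application of a rule like this is what lets the team track changes in the environment without central coordination.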
Related Publications:
S. Park, Y. D. Zhong, and N. E. Leonard, “Multi-robot task allocation games in dynamically changing environments”, in 2021 International Conference on Robotics and Automation (ICRA), Xi’an, China, 2021. [PDF]

A General Model of Opinion Dynamics with Tunable Sensitivity
Researchers: Anastasia Bizyaeva, Alessio Franci, and Naomi Ehrich Leonard
Abstract: We present a model of continuous-time opinion dynamics for an arbitrary number of agents that communicate over a network and form real-valued opinions about an arbitrary number of options. The model generalizes linear and nonlinear models in the literature. Drawing from biology, physics, and social psychology, we introduce an attention parameter to modulate social influence and a saturation function to bound inter-agent and intra-agent opinion exchanges. This yields simply parameterized dynamics that exhibit the range of opinion formation behaviors predicted by model-independent bifurcation theory but not exhibited by linear models or existing nonlinear models. Behaviors include rapid and reliable formation of multistable consensus and dissensus states, even in homogeneous networks, as well as ultra-sensitivity to inputs, robustness to uncertainty, flexible transitions between consensus and dissensus, and opinion cascades. Augmenting the opinion dynamics with feedback dynamics for the attention parameter results in tunable thresholds that govern sensitivity and robustness. The model provides new means for systematic study of dynamics on natural and engineered networks, from information spread and political polarization to collective decision making and dynamic task allocation.
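To make the structure of such dynamics concrete, here is a minimal sketch of a single-option, single-dimensional instance with tanh as a representative saturation; the functional form, parameter names, and values are illustrative assumptions, not the paper's exact model.

import numpy as np

def opinion_step(x, A, u, d=1.0, alpha=0.2, gamma=1.0, b=None, dt=0.01):
    # Euler step of an illustrative single-option opinion model:
    #   dx_i/dt = -d*x_i + u*tanh(alpha*x_i + gamma*sum_j A_ij*x_j) + b_i
    # u is the attention parameter, tanh the saturating exchange, b_i an input.
    if b is None:
        b = np.zeros_like(x)
    dx = -d * x + u * np.tanh(alpha * x + gamma * (A @ x)) + b
    return x + dt * dx

In this sketch, increasing u strengthens the saturated interaction term relative to the linear damping, which is the mechanism an attention parameter is meant to tune.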
Related Publications:
A. Bizyaeva, A. Franci, and N. E. Leonard, “A General Model of Opinion Dynamics with Tunable Sensitivity”, in arXiv:2009.04332 [math.OC], 2020. [arXiv]
A. Bizyaeva, A. Matthews, A. Franci, and N. E. Leonard, “Patterns of nonlinear opinion formation on networks”, to appear in American Control Conference (ACC), 2021. [arXiv]
A. Franci, M. Golubitsky, A. Bizyaeva, and N. E. Leonard, “A model-independent theory of consensus and dissensus decision making”, in arXiv:1909.05765v2 [math.OC]. [arXiv]
R. Gray, A. Franci, V. Srivastava, and N. E. Leonard, “Multi-agent decision-making dynamics inspired by honeybees”, in IEEE Transactions on Control of Network Systems, Vol. 5, No. 2, June 2018, pp. 793-806. [PDF] [arXiv]

Distributed Cooperative Decision Making in Multi-agent Multi-armed Bandits
Researchers: Peter Landgren, Vaibhav Srivastava, and Naomi Ehrich Leonard
Abstract: We study a distributed decision-making problem in which multiple agents face the same multi-armed bandit (MAB), and each agent makes sequential choices among arms to maximize its own individual reward. The agents cooperate by sharing their estimates over a fixed communication graph. We consider an unconstrained reward model in which two or more agents can choose the same arm and collect independent rewards, and a constrained reward model in which agents that choose the same arm at the same time receive no reward. We design a dynamic, consensus-based, distributed estimation algorithm for cooperative estimation of mean rewards at each arm. We leverage the estimates from this algorithm to develop two distributed algorithms: coop-UCB2 and coop-UCB2-selective-learning, for the unconstrained and constrained reward models, respectively. We show that both algorithms achieve group performance close to the performance of a centralized fusion center. Further, we investigate the influence of the communication graph structure on performance. We propose a novel graph explore-exploit index that predicts the relative performance of groups in terms of the communication graph, and we propose a novel nodal explore-exploit centrality index that predicts the relative performance of agents in terms of the agent locations in the communication graph.
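As a rough sketch of the mechanism involved, though not the exact coop-UCB2 index, which uses graph-dependent constants, each agent below keeps running estimates of arm means and sample counts, mixes them with its neighbors' estimates via a consensus matrix, and pulls the arm with the largest UCB-style index; all names and constants are illustrative.

import numpy as np

def coop_ucb_round(t, mu_hat, n_hat, P, pull_arm, gamma=2.0):
    # mu_hat[k, i]: agent k's estimate of arm i's mean reward
    # n_hat[k, i] : agent k's estimate of how often arm i has been sampled
    # P           : row-stochastic consensus matrix matching the communication graph
    # pull_arm(k, i) returns agent k's observed reward from arm i
    n_agents, n_arms = mu_hat.shape
    reward_obs = np.zeros_like(mu_hat)
    count_obs = np.zeros_like(n_hat)
    for k in range(n_agents):
        bonus = np.sqrt(gamma * np.log(t + 1) / np.maximum(n_hat[k], 1e-9))
        arm = int(np.argmax(mu_hat[k] + bonus))
        reward_obs[k, arm] = pull_arm(k, arm)
        count_obs[k, arm] = 1.0
    # Running consensus: mix neighbors' totals, then fold in the new observations.
    totals = P @ (mu_hat * n_hat + reward_obs)
    n_new = P @ (n_hat + count_obs)
    mu_new = totals / np.maximum(n_new, 1e-9)
    return mu_new, n_new

The consensus mixing is what lets each agent benefit from samples it never collected itself, which is why a well-connected communication graph pushes group performance toward that of a centralized fusion center.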
Related Publications:
P. Landgren, V. Srivastava, and N. E. Leonard, “Distributed cooperative decision making in multi-agent multi-armed bandits”, in Automatica, Vol. 125, Mar. 2021. [arXiv]
P. Landgren, V. Srivastava, and N. E. Leonard, “Distributed cooperative decision-making in multiarmed bandits: Frequentist and Bayesian algorithms”, in Conference on Decision and Control (CDC), Las Vegas, NV, 2016, pp. 167-172. [PDF] [PDF with correction] [arXiv]

Adaptive susceptibility and heterogeneity in contagion models on networks
Researchers: Renato Pagliara and Naomi Ehrich Leonard
Abstract: Contagious processes, such as spread of infectious diseases, social behaviors, or computer viruses, affect biological, social, and technological systems. Epidemic models for large populations and finite populations on networks have been used to understand and control both transient and steady-state behaviors. Typically it is assumed that after recovery from an infection, every agent will either return to its original susceptible state or acquire full immunity to reinfection. We study the network SIRI (Susceptible-Infected-Recovered-Infected) model, an epidemic model for the spread of contagious processes on a network of heterogeneous agents that can adapt their susceptibility to reinfection. The model generalizes existing models to accommodate realistic conditions in which agents acquire partial or compromised immunity after first exposure to an infection. We prove necessary and sufficient conditions on model parameters and network structure that distinguish four dynamic regimes: infection-free, epidemic, endemic, and bistable. For the bistable regime, which is not accounted for in traditional models, we show how there can be a rapid resurgent epidemic after what looks like convergence to an infection-free population. We use the model and its predictive capability to show how control strategies can be designed to mitigate problematic contagious behaviors.
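For intuition, the sketch below simulates a discrete-time stochastic SIRI process on a network, where the only change from a standard SIR/SIS setup is that recovered nodes can be reinfected at a modified rate; the rate names and values are illustrative, not the paper's parameters.

import numpy as np

rng = np.random.default_rng(1)

def siri_step(state, A, beta=0.3, beta_r=0.1, delta=0.2, dt=0.1):
    # state[i] in {0: S, 1: I, 2: R}; A is the (weighted) adjacency matrix.
    # beta_r is the reinfection rate after first recovery: smaller than beta
    # models partial immunity, larger than beta models compromised immunity.
    infected = (state == 1).astype(float)
    pressure = A @ infected                       # infection pressure on each node
    p_SI = 1.0 - np.exp(-beta * pressure * dt)    # S -> I probability
    p_RI = 1.0 - np.exp(-beta_r * pressure * dt)  # R -> I probability
    p_IR = 1.0 - np.exp(-delta * dt)              # I -> R probability
    draw = rng.random(state.shape)
    new_state = state.copy()
    new_state[(state == 0) & (draw < p_SI)] = 1
    new_state[(state == 2) & (draw < p_RI)] = 1
    new_state[(state == 1) & (draw < p_IR)] = 2
    return new_state

Running a simulation like this with beta below threshold but beta_r above it gives a feel for the bistable regime described above: the first wave appears to die out, yet a small perturbation among recovered nodes can trigger a resurgent epidemic.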
Related Publications:
R. Pagliara and N. E. Leonard, “Adaptive susceptibility and heterogeneity in contagion models on networks”, in IEEE Transactions on Automatic Control, Vol. 66, No. 2, pp. 581-594, Feb. 2021. [arXiv]
R. Pagliara, B. Dey, and N. E. Leonard, “Bistability and resurgent epidemics in reinfection models”, in IEEE Control Systems Letters, Vol. 2, No. 2, pp. 290-295, 2018. [PDF]
Y. Zhou, S. A. Levin, and N. E. Leonard, “Active control and sustained oscillations in actSIS epidemic dynamics”, in IFAC Workshop on Cyber-Physical & Human Systems (CPHS), 2020. [arXiv]

Optimal evasive strategies for multiple interacting agents with motion constraints
Researchers: William Lewis Scott and Naomi Ehrich Leonard
Abstract: We derive and analyze optimal control strategies for a system of pursuit and evasion with a single speed-limited pursuer and multiple heterogeneous evaders with limits on speed, angular turning rate, and lateral acceleration. The goal of the pursuer is to capture a single evader in the minimum time possible, and the goal of each evader is to avoid capture if possible, or else delay capture for as long as possible. Optimal strategies are derived for the one-on-one differential game, and these form the basis of strategies for the multiple-evader system. We propose a pursuer strategy of optimal target selection that leads to capture in bounded time. For evaders, we prove that any evader not initially targeted can avoid capture. We also consider optimal strategies for agents with radius-limited sensing capabilities, proving conditions for evader capture avoidance through a local strategy of risk reduction. We show how evaders aggregate in response to a pursuer, much as animals do in the wild.
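As a toy illustration of target selection only, and not the paper's optimal selection rule or the constrained-turning dynamics, the sketch below moves a single pursuer one step toward its currently closest evader; speeds, names, and the time step are assumptions.

import numpy as np

def pursuer_step(p_pos, evader_pos, v_p=1.0, dt=0.05):
    # p_pos: pursuer position, shape (2,); evader_pos: evader positions, shape (n, 2).
    # Greedy target selection: chase whichever evader is currently closest,
    # a crude proxy for minimum time-to-capture.
    dists = np.linalg.norm(evader_pos - p_pos, axis=1)
    target = int(np.argmin(dists))
    heading = (evader_pos[target] - p_pos) / max(dists[target], 1e-9)
    return p_pos + v_p * dt * heading, target

Pairing a rule like this with evaders that flee the pursuer already reproduces, qualitatively, the aggregation of evaders described above.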
Related Publications:
W. L. Scott and N. E. Leonard, “Minimum-time trajectories for steered agent with constraints on speed, lateral acceleration, and turning rate”, in ASME Journal of Dynamic Systems, Measurement and Control, Vol. 140, No. 7, p. 071017, July 2018. [PDF]
W. L. Scott and N. E. Leonard, “Dynamics of pursuit and evasion in a heterogeneous herd”, in 53rd IEEE Conference on Decision and Control (CDC), pp. 2920-2925, 2014. [PDF]
W. L. Scott and N. E. Leonard, “Pursuit, herding and evasion: A three-agent model of caribou predation”, in 2013 American Control Conference (ACC), pp. 2978-2983, 2013. [PDF]