    Meta-Reinforcement Learning for Adaptive Spacecraft Guidance during Multi-Target Missions

    Paper number

    IAC-21,C1,4,7,x63391

    Author

    Mr. Lorenzo Federici, Italy, Sapienza University of Rome

    Coauthor

    Mr. Andrea Scorsoglio, United States, University of Arizona

    Coauthor

    Mr. Alessandro Zavoli, Italy, Sapienza University of Rome

    Coauthor

    Prof. Roberto Furfaro, United States, University of Arizona

    Year

    2021

    Abstract
    The use of micro/nano-spacecraft for deep-space exploration is becoming a reality, with the recent success of NASA's Mars Cube One mission to Mars and JAXA's PROCYON mission to a near-Earth object (NEO). The use of electric propulsion engines in micro-spacecraft may considerably improve the scientific return of future exploration missions, as the saved propellant mass can be allocated to additional scientific instrumentation. However, electric propulsion delivers very low thrust, which usually leads to many-revolution, long-duration transfers. This characteristic greatly increases the mathematical complexity and the computational load of both trajectory design and onboard spacecraft guidance. Additionally, micro-spacecraft are typically required to have greater decision-making autonomy than standard aerospace systems. Indeed, in the event of an unforeseen mid-course malfunction, the combined effect of a reduced number of ground station accesses and limited propellant margins shifts the balance in favor of small real-time trajectory adjustments rather than larger, but delayed, ground-controlled corrections. An autonomous and computationally lightweight onboard guidance system is particularly relevant in multi-target exploration missions, where the need to repeatedly change the trajectory to target different bodies clashes with the limited computing capabilities of the flight hardware. This research addresses these challenges by demonstrating that a novel deep reinforcement meta-learning procedure, named Model-Agnostic Meta-Learning (MAML), is a powerful yet computationally cheap means of providing a micro-spacecraft with autonomous and adaptive guidance capabilities during a multi-asteroid exploration mission.
The idea underlying MAML is to train a deep neural network by reinforcement learning on a fairly general distribution of environments, in such a way that it achieves maximal performance on any task from that distribution using only a small amount of new data and one or a few gradient descent steps, even though it was not explicitly trained on that task. In our investigation, a Proximal-Policy-Optimization-driven MAML approach will be used to teach a micro-spacecraft how to perform optimal low-thrust rendezvous maneuvers between any pair of asteroids belonging to a large list of NEOs, by training a relatively small network to solve a single-target rendezvous problem. A fast and simple fine-tuning of the parameters of the network's top layer will then be enough to produce a high-quality and robust guidance law on any transfer leg. Results obtained with more traditional RL approaches will also be reported to assess the effectiveness of MAML in the framework of multi-target trajectories.
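The two-level structure of MAML described above (a per-task gradient step nested inside an outer update of the shared initialization) can be illustrated with a deliberately tiny sketch. This is not the PPO-driven procedure of the paper: it assumes a supervised toy problem in which each "task" is a line y = slope * x, the "network" is a single scalar weight w, and the task distribution, learning rates, and all function names are hypothetical choices made for illustration only.

```python
import random


def loss(w, slope, xs):
    """Mean squared error of the model y = w*x on a task with true law y = slope*x."""
    return sum((w * x - slope * x) ** 2 for x in xs) / len(xs)


def grad(w, slope, xs):
    """Analytic derivative of loss() with respect to w."""
    return sum(2.0 * (w * x - slope * x) * x for x in xs) / len(xs)


def adapt(w, slope, xs, inner_lr):
    """Inner loop: a single gradient-descent step on one task."""
    return w - inner_lr * grad(w, slope, xs)


def meta_train(n_iters=500, tasks_per_batch=8, n_samples=10,
               inner_lr=0.1, outer_lr=0.05, seed=0):
    """Outer loop: improve the initial weight so that one inner step works well."""
    rng = random.Random(seed)
    w = 0.0  # shared initialization that MAML optimizes
    for _ in range(n_iters):
        meta_grad = 0.0
        for _ in range(tasks_per_batch):
            # Hypothetical task distribution: slopes drawn uniformly from [1, 3].
            slope = rng.uniform(1.0, 3.0)
            support = [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]
            query = [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]
            w_adapted = adapt(w, slope, support, inner_lr)
            # Chain rule through the inner step:
            # d(w_adapted)/dw = 1 - inner_lr * (second derivative of the support loss).
            curvature = 2.0 * sum(x * x for x in support) / n_samples
            meta_grad += grad(w_adapted, slope, query) * (1.0 - inner_lr * curvature)
        w -= outer_lr * meta_grad / tasks_per_batch
    return w


if __name__ == "__main__":
    w_meta = meta_train()
    new_slope = 2.7  # a task not seen during meta-training
    xs_new = [0.1 * k for k in range(-5, 6)]
    w_new = adapt(w_meta, new_slope, xs_new, inner_lr=0.1)
    print("meta-learned initialization:", w_meta)
    print("loss before adaptation:", loss(w_meta, new_slope, xs_new))
    print("loss after one step:   ", loss(w_new, new_slope, xs_new))
```

In this toy setting the second-order term of the meta-gradient reduces to a scalar factor, which is why it can be written by hand; with a neural-network policy trained by reinforcement learning, that term would normally be obtained by automatic differentiation or dropped in a first-order approximation.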
    Abstract document

    IAC-21,C1,4,7,x63391.brief.pdf

    Manuscript document

    IAC-21,C1,4,7,x63391.pdf (🔒 authorized access only).

    To get the manuscript, please contact IAF Secretariat.