Time-Optimal Attitude Controller Design for Precision Pointing of an Agile Spacecraft using Deep Reinforcement Learning
- Paper number
IAC-22,C1,1,7,x73398
- Author
Mr. Debajyoti Chakrabarti, India, ISRO Satellite Centre (ISAC)
- Coauthor
Dr. Vinod Kumar, India, Indian Space Research Organization (ISRO)
- Year
2022
- Abstract
In this research work, we design a deep reinforcement learning based controller for attitude control of an agile spacecraft. Such spacecraft must meet stringent time-optimal, precise attitude-pointing requirements. Remote sensing satellites in low Earth orbit are typically placed in Sun-synchronous retrograde orbits; whenever an imaging requirement arises, the spacecraft must perform an agile maneuver to achieve a precise Earth-pointing orientation. We use control moment gyroscopes (CMGs) in a pyramid configuration as actuators to meet the agility requirement. CMGs are known to exhibit a geometric singularity problem: a condition in which the CMG cluster can produce no control torque along a commanded direction. A singularity-robust inverse steering law has been implemented to address this problem. Reinforcement learning (RL) is a goal-directed computational approach in which learning proceeds through interaction with a dynamic environment: the controller (or agent) takes a series of actions to maximize the cumulative reward for the given problem. The agent observes the current state of the environment and, from the observed state, selects an action under its current policy. As the environment varies dynamically, the reward changes accordingly; the agent uses the reward to evaluate the optimality of the current policy and to update it if necessary. This observation-action-reward cycle continues until learning is complete. Deep reinforcement learning uses deep neural network architectures within the RL framework to approximate the policy and value functions. The novelty of this work is the use of the state-of-the-art deep deterministic policy gradient (DDPG) algorithm to train the agent and control the orientation of an agile spacecraft under the specified constraints.
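As an illustration of the singularity-robust inverse steering law mentioned above, a commonly used form replaces the Moore-Penrose pseudoinverse of the CMG torque Jacobian with a regularized inverse that stays finite at geometric singularities. The sketch below is not taken from the paper; the Jacobian values and the gains `lam0` and `mu` are illustrative assumptions.

```python
import numpy as np

def sr_inverse_steering(A, tau_cmd, lam0=0.01, mu=10.0):
    """Singularity-robust (SR) inverse steering law (illustrative sketch).

    A        : 3x4 CMG torque Jacobian (pyramid configuration)
    tau_cmd  : commanded control torque, shape (3,)
    lam0, mu : tuning gains (assumed values, not from the paper)
    Returns the gimbal-rate command, shape (4,).
    """
    # Singularity measure: m -> 0 as the cluster approaches a singularity
    m = np.sqrt(np.linalg.det(A @ A.T))
    # Regularization is blended in only near singular gimbal states
    lam = lam0 * np.exp(-mu * m)
    # SR inverse A^T (A A^T + lam*I)^-1 remains finite when A A^T is singular
    return A.T @ np.linalg.solve(A @ A.T + lam * np.eye(3), tau_cmd)

# Hypothetical pyramid-CMG Jacobian at some gimbal state, for demonstration
A = np.array([[ 0.5, -1.0,  0.5,  1.0],
              [ 1.0,  0.5, -1.0,  0.5],
              [ 0.4,  0.4,  0.4,  0.4]])
tau_cmd = np.array([0.1, -0.05, 0.02])
gimbal_rates = sr_inverse_steering(A, tau_cmd)
```

Away from a singularity the measure `m` is large, the regularization term vanishes, and the law reduces to the ordinary pseudoinverse, so the produced torque matches the command; near a singularity it trades a small torque error for bounded gimbal rates.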
DDPG is a model-free, off-policy, actor-critic algorithm for learning appropriate actions in continuous action domains. Through high-fidelity simulation studies, we establish that deep reinforcement learning can be a viable solution for agile spacecraft attitude control under nominal as well as singularity conditions. This proof-of-concept solution exhibits a degree of autonomy, with the capability to recover from uncertainty.
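The DDPG structure described here, an off-policy actor-critic with a replay buffer and soft target-network updates, can be sketched in miniature. The sketch below is a toy illustration only: a scalar "pointing error" plant stands in for the paper's high-fidelity simulator, and linear actor/critic parameterizations stand in for the deep networks; all gains and constants are assumed.

```python
import random
from collections import deque
import numpy as np

rng = np.random.default_rng(0)

def plant_step(s, a):
    """Toy 1-D attitude-error plant: state = pointing error, action = torque."""
    s_next = 0.9 * s + 0.1 * a
    reward = -(s_next ** 2 + 0.01 * a ** 2)   # penalize error and control effort
    return s_next, reward

phi = lambda s, a: np.array([s * s, s * a, a * a])  # critic features, Q = w . phi
actor_w, critic_w = 0.0, np.zeros(3)                # online actor and critic
t_actor_w, t_critic_w = actor_w, critic_w.copy()    # target networks

buffer = deque(maxlen=10_000)                       # replay buffer (off-policy)
gamma, alpha, beta, tau = 0.95, 1e-3, 1e-3, 5e-3    # assumed hyperparameters

s = 1.0
for _ in range(1000):
    a = actor_w * s + 0.1 * rng.standard_normal()   # policy + exploration noise
    s_next, r = plant_step(s, a)
    buffer.append((s, a, r, s_next))                # observation-action-reward
    s = s_next if abs(s_next) < 10 else 1.0         # crude episode reset

    # Minibatch update from replayed transitions
    for (bs, ba, br, bs2) in random.sample(buffer, min(16, len(buffer))):
        a2 = t_actor_w * bs2                        # target policy action
        td_target = br + gamma * (t_critic_w @ phi(bs2, a2))
        td_err = td_target - critic_w @ phi(bs, ba)
        critic_w += alpha * td_err * phi(bs, ba)    # critic TD step
        # Deterministic policy gradient: dQ/da * dmu/dw
        dq_da = critic_w[1] * bs + 2.0 * critic_w[2] * ba
        actor_w += beta * dq_da * bs
    # Soft (Polyak) target-network updates
    t_actor_w += tau * (actor_w - t_actor_w)
    t_critic_w = t_critic_w + tau * (critic_w - t_critic_w)
```

The replay buffer is what makes the method off-policy: updates draw on transitions generated by earlier versions of the policy, while the slowly tracking target networks stabilize the temporal-difference targets.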
- Abstract document
- Manuscript document
IAC-22,C1,1,7,x73398.pdf (🔒 authorized access only).
To get the manuscript, please contact IAF Secretariat.
