
How to implement a low level controller? #96

Open
HimGautam opened this issue Jun 14, 2022 · 5 comments
Labels
question Further information is requested

Comments

@HimGautam

Hi @JacopoPan ,
I want to use a reinforcement learning agent for position tracking. My idea is that the input to the RL agent will be the X, Y, Z positions of the drone and of the target, and it will output linear velocities. These linear velocities will then be tracked by a PID controller acting as the low-level controller. Can you please tell me how to do this with your simulator?

@JacopoPan JacopoPan added the question Further information is requested label Jun 14, 2022
@JacopoPan
Member

Hi @HimGautam

there already is a subclass of BaseAviary that works like that (it has positions in the observations and takes velocity vectors as actions): https://github.com/utiasDSL/gym-pybullet-drones/blob/master/gym_pybullet_drones/envs/VelocityAviary.py
It is used (without learning) in this example: https://github.com/utiasDSL/gym-pybullet-drones/blob/master/examples/velocity.py
You probably want to add the target to the observation vector and to the reward signal.
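If it helps, the reward part of that change can be as simple as a distance term. A minimal sketch (the function name `target_reward` and the exact reward shape are my assumptions, not the library's API; the idea is what one could return from a custom `_computeReward()` override):

```python
import numpy as np

def target_reward(drone_pos, target_pos):
    """Hypothetical shaping term: negative squared distance to the target.
    Returns 0 at the target and increasingly negative values farther away."""
    d = np.asarray(drone_pos, dtype=float) - np.asarray(target_pos, dtype=float)
    return -float(np.dot(d, d))
```

Adding the target to the observation would then be a matter of concatenating `target_pos` onto the observation array in a `_computeObs()` override.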

@HimGautam
Author

Hi @JacopoPan,
I checked out that environment. I wanted to know what the action vector represents.
Action vector: X Y Z fract. of MAX_SPEED_KMH
That statement appears in the file above. If X, Y, Z represent the velocities in their respective directions, what is the 4th term?

@JacopoPan
Member
JacopoPan commented Jun 14, 2022

Look at the implementation: X, Y, Z are transformed into a unit vector that is then multiplied by the 4th parameter.
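In other words, the commanded velocity is a direction times a magnitude. A conceptual mirror of that preprocessing (the helper name `velocity_from_action` is mine; the real logic lives in VelocityAviary):

```python
import numpy as np

def velocity_from_action(action, max_speed_ms):
    """Normalize the XYZ components of the action into a unit direction,
    then scale it by the 4th component (a fraction in [0, 1]) times the
    speed limit. Mirrors VelocityAviary's behavior conceptually."""
    xyz = np.asarray(action[:3], dtype=float)
    norm = np.linalg.norm(xyz)
    direction = xyz / norm if norm > 0 else np.zeros(3)
    return direction * float(action[3]) * max_speed_ms
```

So the 4th term controls how fast the drone moves along the direction given by the first three.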

Is this for this project of yours https://github.com/HimGautam/LearningToFly ?

@HimGautam
Author

Hi @JacopoPan,
The VelocityAviary you told me about can also be obtained using ActionType.VEL in BaseSingleAgentAviary. When I used this with an RL agent, i.e., with NN inputs (X, Y, Z, X_d, Y_d, Z_d, R, P, Y, V_x, V_y, V_z, W_x, W_y, W_z) and actions (V_x, V_y, V_z, fraction of MAX_SPEED_KMH), the agent just couldn't learn anything related to position tracking: it just goes up and up with some jittering. I am really confused about which states to include in the RL agent, because unnecessary inputs can hamper the learning process.

Also, I have added small cubes in 3D space to represent the target position within an episode. But when I change the target position during the episode, the cubes corresponding to the previous target are still there, which makes a big mess. How can I delete the previous target cubes from the simulation?

@JacopoPan
Member

The simpler point first: you should be able to remove any URDF you imported into the Bullet scene using p.removeBody(bodyUniqueId).

W.r.t. your results, from what I see on this page https://github.com/HimGautam/LearningToFly, I think you already get a rather interesting policy, seemingly correlated with the positions of the targets. What I note is (1) a bias in the positions where the quad stabilizes and (2) oscillations. Because you are using ActionType.VEL, and therefore the internal PID controller, be aware that over/undershooting and oscillations are common with poorly tuned gains.

If I were you, I would try a few things: create new target positions/cubes closer to each other; examine the learned policy (when a new cube is created, does it start outputting a vector that connects the current quad position to the new target? Does the 4th value, the magnitude, decrease as the quad approaches the target?); compare to another agent implementation (e.g. PPO).
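That "does the policy point at the target" check can be automated; a sketch (the helper name `direction_error` is my own, not part of the library):

```python
import numpy as np

def direction_error(action_xyz, drone_pos, target_pos):
    """Angle in radians between the policy's commanded XYZ direction and
    the straight line from the drone to the target; values near 0 mean
    the policy is pointing at the target."""
    a = np.asarray(action_xyz, dtype=float)
    b = np.asarray(target_pos, dtype=float) - np.asarray(drone_pos, dtype=float)
    na, nb = np.linalg.norm(a), np.linalg.norm(b)
    if na == 0.0 or nb == 0.0:
        return np.pi  # degenerate case: no direction to compare
    cos = np.clip(np.dot(a, b) / (na * nb), -1.0, 1.0)
    return float(np.arccos(cos))
```

Logging this (and the 4th action component) each time a new cube appears would show directly whether the agent has learned the intended behavior.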
