The following videos show the evolution of the obstacle-avoidance behavior after 0, 20, and 100 generations:
The obstacle avoidance scenario was used to create a set of robots that could navigate their environment while avoiding collisions with walls and other robots. Robots placed in this scenario had no prior driving experience and were controlled by neural networks with randomly initialized weights. The resulting trained robots were used as seed populations for all subsequent scenarios.
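As a concrete picture of what such an untrained controller might look like, the sketch below builds a seed population of tiny feedforward networks with random weights. The layer sizes, sensor count, and wheel-speed outputs are assumptions for illustration, not details taken from the project:

```python
import numpy as np

def random_controller(n_sensors=8, n_hidden=6, n_outputs=2, rng=None):
    """Build one robot 'brain': a small feedforward net with random weights.

    Layer sizes are illustrative; the two outputs are taken to be
    left/right wheel speeds.
    """
    rng = rng or np.random.default_rng()
    return {
        "w1": rng.normal(0, 1, (n_hidden, n_sensors)),  # sensors -> hidden
        "w2": rng.normal(0, 1, (n_outputs, n_hidden)),  # hidden -> wheels
    }

def drive(controller, sensor_readings):
    """Map range-sensor readings to wheel speeds in [-1, 1]."""
    hidden = np.tanh(controller["w1"] @ sensor_readings)
    return np.tanh(controller["w2"] @ hidden)

# Seed population: every robot starts with no "driving experience".
seed_population = [random_controller() for _ in range(8)]
```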
Setup: Four green robots, four red robots, five green waypoints, and five red waypoints were placed at random locations across the simulation environment.
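A minimal sketch of that placement step, assuming a rectangular arena and a minimum spawn spacing (both values are made up for illustration):

```python
import random

ARENA_W, ARENA_H = 20.0, 20.0  # assumed arena dimensions
MIN_SPACING = 1.0              # assumed minimum spawn separation

def place_entities(counts, rng=None):
    """Scatter robots and waypoints at random, non-overlapping positions."""
    rng = rng or random.Random()
    placed = []
    for kind, n in counts.items():
        for _ in range(n):
            while True:
                x, y = rng.uniform(0, ARENA_W), rng.uniform(0, ARENA_H)
                if all((x - px) ** 2 + (y - py) ** 2 >= MIN_SPACING ** 2
                       for _, px, py in placed):
                    placed.append((kind, x, y))
                    break
    return placed

layout = place_entities({
    "green_robot": 4, "red_robot": 4,
    "green_waypoint": 5, "red_waypoint": 5,
})
```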
Fitness Evaluation: Robots were encouraged to explore as much of the map as they could in the allotted time, using the following fitness function:
Fitness = DistanceTraveled * AreaExplored
Robots were not explicitly punished for crashing, though crashing indirectly reduced their fitness by preventing further exploration. Waypoints were placed in the environment so that the robots would grow accustomed to their presence; robots received no reward for passing them.
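A sketch of how this fitness could be computed from a robot's trajectory, assuming AreaExplored is approximated by counting visited grid cells; the cell size and the (x, y) trajectory format are assumptions:

```python
import math

CELL_SIZE = 0.5  # assumed grid resolution for approximating AreaExplored

def fitness(trajectory):
    """Fitness = DistanceTraveled * AreaExplored.

    `trajectory` is a list of (x, y) positions sampled each timestep.
    Crashing is not penalized directly, but a robot stuck against a wall
    stops accumulating distance and new cells, so its fitness stops growing.
    """
    distance = sum(
        math.dist(trajectory[i], trajectory[i + 1])
        for i in range(len(trajectory) - 1)
    )
    visited_cells = {
        (int(x // CELL_SIZE), int(y // CELL_SIZE)) for x, y in trajectory
    }
    area = len(visited_cells) * CELL_SIZE ** 2
    return distance * area
```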
Quantitative Evaluation: Robots displayed a learning curve with small oscillations but a steady upward trend, which appeared to level off by the hundredth generation. The learning curves are shown in the backgrounds of each simulation video: the red line shows the best-performing robot across all populations, the green line the worst-performing robot, and the blue line the population average.
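The bookkeeping behind those curves is straightforward; a minimal sketch, assuming per-generation fitness scores are recorded in a list of lists, with the line colors matching the description above:

```python
import matplotlib.pyplot as plt

def plot_learning_curves(history):
    """Plot per-generation best/worst/average fitness.

    `history` is a list of per-generation fitness lists, e.g.
    history[g] = [fitness of each robot in generation g].
    """
    best = [max(gen) for gen in history]
    worst = [min(gen) for gen in history]
    average = [sum(gen) / len(gen) for gen in history]

    plt.plot(best, color="red", label="best robot")
    plt.plot(worst, color="green", label="worst robot")
    plt.plot(average, color="blue", label="average robot")
    plt.xlabel("generation")
    plt.ylabel("fitness")
    plt.legend()
    plt.show()
```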
Qualitative Evaluation: Observations made after 0, 20, and 100 generations are printed below each of the following videos.
Obstacle Avoidance training after 0 generations: When the robots were first exposed to the map, the majority drove directly into a wall, a good percentage sat in one place and spun, and a few drove at least a short distance. Surprisingly, some individuals performed quite well on their first attempt: these robots drove parallel to a nearby wall and even followed it around corners.
Obstacle Avoidance training after 20 generations: Robots seemed to be getting used to their environment. They followed walls competently but still had trouble avoiding other robots. Some robots seemed to have learned that they could earn more points by driving faster, but most drove slowly and carefully.
Obstacle Avoidance training after 100 generations: Robots had almost completely mastered wall avoidance. They still collided with each other occasionally but appeared to be “aware” of one another's presence. Robots drove at top speed most of the time, though some slowed down when approaching corners, dead ends, or other robots. Many robots avoided passing others, turning around instead when an oncoming robot approached.