AI plays Flappy Bird using Neuroevolution

In these section I will tell you how I made an AI that plays Flappy Bird game.

Introduction

AI is taking control of the whole world very rapidly. From diagnosing cancer to generating different & unique versions of 'Game of Thrones' ending, AI is in action everywhere. In March 2016, AlphaGo beat Lee Sedol in a five-game match with a final score of 4-1. The AlphaGo was trained by Reinforcement Learning. The basic idea behind Reinforcement Learning is to reward the 'agent' for his good decisions and punish him for the bad decisions. The reward will encourage him to take the similar decisions under the similar circumstances in future and the punishment will tell him not to repeat such actions in future. We are going to apply same mechanism to our flappy bird game and watch him play the game while we sit back & assume that we ourselves are playing this game so perfectly. The reward will be the increment in score whenever the bird succesfully crosses the pipe and the punishment will be his death whenever he collides with the pipe or ground or the sky( My Neural Network was unable to find some other word).

What is NEAT?

NEAT stands for Neuroevolution of Augmenting Topologies. As the name suggests, it is the evolution of neural networks, or the evolution of the weights of neural network. It mimics the process of natural evolution. Every generation starts with some number( 300 in our case) of population( bird). The first genration's individual are not smart, thier brain has random neural networks, but out of that randomness, there is a chance that some individuals might take some good decisions during their lifetime. When all the individuals of that generation die, the two individual are chosen based on their good decisions, and they are crossovered( mixing of the weights of both neural network), followed by mutation( random twerks in weights). After this process a new individual is born, same procedure goes on for all the individual of that population. Now we have a new genration with some traits from their parents and some acquired traits( due to mutation). This new genration again goes for the only goal in their life, which is to get maximum score before dying. Again after all the individual die, their traits are passed on to next generation. So by this way, every genration gets smarter as compared to their previous generation.

Now we will apply this theory to our flappy bird and make him immortal.

Implementation

Note: In this section we will not be focussing on the making of the game, our main focus will be on AI. If you want to know about the making of game, which I recommend, click here.

So, let's start the journey!

For implementing this algorithm, the first thing we need is the population of bird. So we initialize our program with appending the population of birds in our birds array.


			    def __init__(self):
			        for _ in range(self.population):
			            self.birds.append(Bird(self))
			        self.gameLoop()
			

The gameLoop() is our main function of the game, where the loop which controls the whole game, resides. But first, let's take a look at our Bird() class.

			
				def __init__(self, game):
			        self.x = 50
			        self.radius = 15
			        from game import Game
			        self.y = Game.height // 2
			        self.y_vel = 0
			        self.gravity = 1
			        self.score = 0
			        self.fitness = 0
			        self.distance = 0
			        self.brain = NeuralNetwork(8, 5, 2)
			        self.game = game
			        self.inputs = 8 * [0]
			
		

This is our __init__ function of Bird(). For now, ignore the other initialisations and focus on self.brain = NeuralNetwork(8,5,2).
Now, we will look at the __init__ function of NeuralNetwork() class.

				
				    def __init__(self, input_nodes, hidden_nodes, output_nodes):
				        self.input_nodes = input_nodes
				        self.hidden_nodes = hidden_nodes
				        self.output_nodes = output_nodes
				        self.in_hidden_weights = np.random.rand(self.hidden_nodes, self.input_nodes)
				        self.hidden_output_weights = np.random.rand(self.output_nodes, self.hidden_nodes)
				        self.in_hidden_biases = np.random.rand(self.hidden_nodes, 1)
				        self.hidden_output_biases = np.random.rand(self.output_nodes, 1)
				
			

As you can see, NeuralNetwork() takes three parameter, input_nodes, hidden_nodes, output_nodes, and then with these parameters we initialize 4 randomly generated matrices self.in_hidden_weights, self.hidden_output_weights, self.in_hidden_biases, self.hidden_output_biases. These function initialises a 3-layered neural network, the first layer be the input layer with number of nodes in it as input_nodes, the second layer as hidden layer with number of nodes in it as hidden_nodes and the third layer as output layer with number of nodes in it as output_nodes. And the 4 randomly generated matrices are the weights of the neural network with self.in_hidden_weights as the weights between input layer & hidden layer, self.hidden_output_weights as the weights between hidden layer & output layer, self.in_hidden_biases & self.hidden_output_biases as the biases between input layer & hidden layer, hidden layer & output layer respectively.

Now let's look at the variable self.brain of Bird() class.

self.brain = NeuralNetwork(8, 5, 2)

So, this indicates that every bird object has a brain variable which is nothing but the randomly generated Neural Network.

Now let's look at the __init__ method of Game() class, where our program started.


			    def __init__(self):
			        for _ in range(self.population):
			            self.birds.append(Bird(self))
			        self.gameLoop()
			

So, what this particular block of code doing is appending 300 birds with their respective brain(Neural Network) into self.birds array and then calling the self.gameLoop() method which is the mainLoop handling our game.

Now, let's take a look at self.gameLoop() method.

				
				    def gameLoop(self):
				    	gameExit = False
				        while not gameExit:
				            for bird in self.birds:
				                bird.showBird()
				                bird.think()
				                bird.predict()
				                bird.distance += 1
				                bird.y += bird.y_vel
				                bird.y_vel += bird.gravity
				                if bird.collidingWall() or bird.collidingPipe():
				                    self.savedBirds.append(bird)
				                    self.birds.remove(bird)
				                    if len(self.birds) == 0:
				                        self.pipes = []
				                        ga = GA(self)
				                        ga.nextGenration()
				                        self.generation += 1
				                        self.gameLoop()
				            pygame.display.update()
				            self.clock.tick(30)
				        pygame.quit()
				        quit()
				
			

So, what this self.gameLoop() method doing is traversing through every bird in self.birds list and performing some operations on them. Let's look at these operation one-by-one.

1. showBird()

Let's look at the method showBird method of bird class.

				
			    	def showBird(self):
				        from game import Game
				        pygame.draw.circle(Game.gameDisplay, Game.black, (self.x, self.y), self.radius)
				
			

Apart from the self.brain, Bird() class also have some other variables such as it's x position, it's y position, it's radius(Since our bird is actually a circle XD) etc. So what this showBird() is doing is just drawing a circle at given x,y with it's radius.

2. think()

Let's see what think() method of bird do

				
			    	def think(self):
				        # Inputs to neural network
				        # 1. horizontal distance from the start of pipe
				        # 2. horizontal distance from the end of pipe
				        # 3. vertical distance from the upper pipe
				        # 4. verical distance from the lower pipe
				        # 5. y position of bird
				        # 6. y velocity of bird
				        # 7. vertical distance from sky
				        # 8. vertical distance from ground

				        pipe = self.findClosestPipe()

				        # 1. horizontal distance from the start of pipe

				        dis_1 = pipe.x - (self.x + self.radius)
				        dis_1 /= self.game.width
				        self.inputs[0] = dis_1

				        # 2. horizontal distance from the end of pipe

				        dis_2 = (pipe.x + 75) - (self.x + self.radius)
				        dis_2 /= self.game.width
				        self.inputs[1] = dis_2

				        # 3. vertical distance from the upper pipe

				        dis_3 = (self.y - self.radius) - (pipe.y - pipe.gap)
				        dis_3 /= self.game.height
				        self.inputs[2] = dis_3

				        # 4. verical distance from the lower pipe

				        dis_4 = (self.y + self.radius) - (pipe.y)
				        dis_4 /= self.game.height
				        self.inputs[3] = dis_4

				        # 5. y position of bird

				        y_pos = self.y
				        y_pos /= self.game.height
				        self.inputs[4] = y_pos

				        # 6. y velocity of bird

				        y_vel = self.y_vel
				        y_vel /= self.game.height
				        self.inputs[5] = y_vel

				        # 7. vertical distance from sky

				        ver_dis_1 = (self.y - self.radius)
				        ver_dis_1 /= self.game.height
				        self.inputs[6] = ver_dis_1

				        # 8. vertical distance from ground

				        ver_dis_2 = (self.y + self.radius) - self.game.height
				        ver_dis_2 /= self.game.height
				        self.inputs[7] = ver_dis_2
				
			

Remember the self.brain = NeuralNetwork(8,5,2)? So our neural network takes 8 inputs, which are:

  1. horizontal distance from the start of pipe
  2. horizontal distance from the end of pipe
  3. vertical distance from the upper pipe
  4. vertical distance from the lower pipe
  5. y position of bird
  6. y velocity of bird
  7. vertical distance from sky
  8. vertical distance from ground

So, what this function does is calculate the inputs for the neural network and normalize them.

3. predict()

Let's see what the predict() method does.

		    	
		    		def predict(self):
				        self.inputs = np.reshape(self.inputs, (8, 1))
				        output = self.brain.feedforward(self.inputs)
				        output = np.argmax(output)

				        if output == 0:
				            self.moveUp()
				        elif output == 1:
				            pass
		    	
			

So, as the name suggests, the inputs which was calculated in think(), the neural network takes it and predicts whether the bird should jump up or not.

Till now, bird was performing some action, like calculating the input and predicting whether it should jump or not, Now we will handle the consequences of the action performed by the bird in previous step.

				
					 if bird.collidingWall() or bird.collidingPipe():
	                    self.savedBirds.append(bird)
	                    self.birds.remove(bird)
	                    if len(self.birds) == 0:
	                        self.pipes = []
	                        ga = GA(self)
	                        ga.nextGenration()
	                        self.generation += 1
	                        self.gameLoop()
	            pygame.display.update()
				
			

This block of code checks that if the bird is colliding with walls(ground or sky) or with the pipes, then it deletes the bird from the self.birds list(which is equivalent to dying). Then if the list becomes empty then our NEAT algorithm comes in action.

nextGenration()

Since our list has became empty, which means our generation has died, now it's time to call the nextGeneration.

				
				    def nextGenration(self):
				        self.calculateFitness()
				        if (self.game.foundBestBird):
				            child = Bird(self.game)
				            child.brain = self.game.bestBirdBrain
				            self.game.birds.append(child)
				        else:
				            self.game.birds.append(self.game.savedBirds[random.randrange(self.game.population)])
				        for i in range(self.game.population - 1):
				            self.game.birds.append(self.pickOne())
				        self.game.savedBirds = []
				
			

calculateFitness() calculates the fitness of each bird based on the distance travelled by him and and it's score. Let's see how it does.

				
			    	def calculateFitness(self):
						sum = 0
						for bird in self.game.savedBirds:
						    bird.calculateFitness()
						    sum += bird.fitness
						for bird in self.game.savedBirds:
						    bird.fitness /= sum
				
			

So, we traverse through every bird object of the previous generation which is saved in our self.savedBirds list, and then call the method calculateFitness() of the bird class.

Note that this is not recursion, the current calculateFitness() is of GA() class which handles NEAT algorithm, and the method calculateFitness() it is calling is from Bird() class.

				
					def calculateFitness(self):
        				self.fitness = math.pow(2, self.score) + (self.distance ** 2)
				
			

The calculateFitness() method of Bird() class calculates the fitness of bird. It is a function of score as well as distance, but more priority is given to score of the bird. So, the more the score and the distance, the more the fitness, & more chances of getting picked up for mating.

Now, we will see calculateFitness() method of GA() class again.

				
			    	def calculateFitness(self):
						sum = 0
						for bird in self.game.savedBirds:
						    bird.calculateFitness()
						    sum += bird.fitness
						for bird in self.game.savedBirds:
						    bird.fitness /= sum
				
			

So, as a whole, what this method does is, it calculates the fitness of every bird and normalize them.

				
				    def nextGenration(self):
				        self.calculateFitness()
				        if (self.game.foundBestBird):
				            child = Bird(self.game)
				            child.brain = self.game.bestBirdBrain
				            self.game.birds.append(child)
				        else:
				            self.game.birds.append(self.game.savedBirds[random.randrange(self.game.population)])
				        for i in range(self.game.population - 1):
				            self.game.birds.append(self.pickOne())
				        self.game.savedBirds = []
				
			

pickOne()

				
			 	def pickOne(self):
			        r1 = random.uniform(0, 1)
			        index = 0
			        while r1 > 0:
			            r1 -= self.game.savedBirds[index].fitness
			            index += 1
			        index -= 1

			        bird1 = self.game.savedBirds[index]

			        r2 = random.uniform(0, 1)
			        index2 = 0

			        while r2 > 0:
			            r2 -= self.game.savedBirds[index2].fitness
			            index2 += 1
			        index2 -= 1

			        bird2 = self.game.savedBirds[index2]

			        child = Bird(self.game)

			        child.brain.in_hidden1_weights = bird1.brain.crossover(bird1.brain.in_hidden1_weights,
			                                                               bird2.brain.in_hidden1_weights)
			        child.brain.in_hidden1_biases = bird1.brain.crossover(bird1.brain.in_hidden1_biases,
			                                                              bird2.brain.in_hidden1_biases)
			        child.brain.hidden1_output_weights = bird1.brain.crossover(bird1.brain.hidden1_output_weights,
			                                                                   bird2.brain.hidden1_output_weights)
			        child.brain.hidden1_output_biases = bird1.brain.crossover(bird1.brain.hidden1_output_biases,
			                                                                  bird2.brain.hidden1_output_biases)

			        child.brain.mutate(child.brain.in_hidden1_weights, 0.3)
			        child.brain.mutate(child.brain.in_hidden1_biases, 0.3)
			        child.brain.mutate(child.brain.hidden1_output_weights, 0.3)
			        child.brain.mutate(child.brain.hidden1_output_biases, 0.3)

			        return child
				
			

This method looks a little bigger. So, first, let's see what it is returning.
return child
So, it returns a child. The job of pickOne() is to append the 300 children to our self.birds list. But before returning child, it will need his parents(2 birds of previous generation). So for each 300 birds, there will be a pair of birds who are crossovered.
But how does the pair is selected?
We select two parent one-by one, based on their fitness, the more the fitness, more is the probabiity of them to get selected. It's not necessary that the two parent will be different, sometimes, the parent can be the same bird, it all depends on the fitness value of the birds in the previous generation. After selecting the parents, the bird is crossovered. Which means, we take all the 4 matrices of parent 1 and mix them with the corresponding matrices of it's counterpart.
How do we mix them? We take the upper half rows of parent 1, lower half rows of parent 2 and combine them to return a matrix, which is a combination of both the matrices.

				
			    	def crossover(self, mat1, mat2):
				        childMat = np.zeros((mat1.shape[0], mat1.shape[1]))
				        x = mat1.shape[0] // 2
				        childMat[:x], childMat[x:] = mat1[:x], mat2[x:]
        				return childMat
				
			

Now we have the 4 matrices of child bird which has the combination of traits of his parents, but the bird should acquire some new traits, so we mutate the 4 matrices.

What mutate() method does is ,it traverse through the matrix, and pick element based on the rate, which is the probability, in our case 30% and twerk the element by adding a random number between -0.1 to 0.1. The rate defines, the probability of the element getting twerked. In our case, while traversing through the matrix, the probability of twerking the particular element is 30%.

				
			    	def mutate(self, mat, rate):
				        for i in range(mat.shape[0]):
				            if rate > (random.uniform(0, 1)):
				                for j in range(mat.shape[1]):
				                    mat[i][j] += random.uniform(-0.1, 0.1)
				
			

Now after mutation, the child is ready to get appended in our self.birds list.

Now let's summarize the nextGeneration() method.

				
				    def nextGenration(self):
				        self.calculateFitness()
				        if (self.game.foundBestBird):
				            child = Bird(self.game)
				            child.brain = self.game.bestBirdBrain
				            self.game.birds.append(child)
				        else:
				            self.game.birds.append(self.game.savedBirds[random.randrange(self.game.population)])
				        for i in range(self.game.population - 1):
				            self.game.birds.append(self.pickOne())
				        self.game.savedBirds = []
				
			

In nextGeneration() method, we first calculate the fitness of every bird, then we check if the best bird is found, i.e if the any of the birds has surpassed the best score which is initially 0. If the best bird is found then we apeend the child with the brain of that best bird and the rest 299 birds by pickOne() method, otherwise, we append one random bird from the population and rest 299 by pickOne() method.

After this, our code again goes to gameLoop() method and the same whole process goes on for new generation.

Conclusion

This is how the NEAT works, the full code is available on my GitHub.

After the DeepMind's AlphaGo has beaten Lee Sedol, who has won 18 International titles, the question that pops in my mind is, has the Artificial Intelligence already surpassed the human intelligence?