Making a DotA2 Bot Using ML

Designing a resource-efficient machine-learning algorithm

Musashi Schroeder
18 min read · May 30, 2019
The bot roster

Problem

In December of 2018, the creators of AI Sports gave a presentation and introduced the DotA2 AI Competition to the school. DotA (Defense of the Ancients) is a game played by two teams, each consisting of five players who can choose from over one hundred different heroes. The goal of the game is to destroy the opponent's base while defending your own. Each hero has access to at least four unique abilities and can purchase items that grant abilities of their own. Items are purchased with gold earned from destroying opponent structures, or from defeating players or creeps: NPC (non-player character) units that spawn to help defend and attack bases.

The complexity of the game comes not only from the roster of characters, but from the ever-changing state of the map. Full-information games such as chess or Go withhold no information from players, allowing them to see every possible action on the board at any given time. The DotA map, by contrast, includes a “fog of war” which hides any part of the map not seen by a player or their teammates. Each hero's abilities also have “cooldowns” — after a player uses an ability, they cannot use it again for a set amount of time — and consume mana as a resource. A player has access to this information for their allies but not for their opponents, and must take that uncertainty into consideration when engaging in fights.

The rules for the competition were to program a full team of five bots to play in Captain's Mode. Captain's Mode sets one member of each team as a captain, giving them the ability to choose the heroes for the rest of the team and to “ban” heroes, selecting ones that the opponent's team cannot use. In order to avoid having all of our characters banned, we needed to program at least sixteen of them. Our limitation, set by Maurice, a staff member at 42 Silicon Valley, was that we could not use Valve's built-in “desires,” a system that provides default behaviors for the bots to execute. Instead of using the default bot behaviors, we were tasked with writing the code from the ground up. The API for DotA2 is written in Lua and allows players to create their own bots. The competition was originally designed to use a C++ API written by the AI Sports creators, but due to “complications,” our team used Lua instead.

Overview of DotA battlefield with labels, image from https://dota2.gamepedia.com/Map

How It Was Solved

Learning Lua and the API

In order to create the bot, we first read through the API and looked for examples that other users had created. The DotA API was made available in early 2016, though it hasn't received any meaningful updates since approximately October of 2017. The first resource we used was a getting-started guide written by RuoyuSon, which explained where to find other resources, how to start games, and useful console commands for the testing process. Valve also provides small example bot scripts in the game's directory that can be used to get started. With the API and these examples in hand, we naively believed we could have a crude, working version of the bot within a week.

The first challenge came in selecting the heroes we wanted to use and starting the game. What we didn't know at the time was that if the hero selection code has an error, the entire game will crash without displaying anything. The example provided by Valve can be used to quickly create hero selection code for All Pick Mode, but is unusable for Captain's Mode, so we read through other examples of selection code. Although the current iteration of the bot allows human players to play against and alongside it, the original version was only meant to play against another bot in Captain's Mode. Getting a simple version of hero selection working took a little over a week, though it has since been modified to support All Pick Mode and human players.

After getting the game to start, we began experimenting with making heroes walk to locations on the map. We quickly learned that not knowing Lua made writing code, and understanding other people's examples, difficult. While we were able to make bots walk to certain locations or purchase items, we frequently made syntax errors, and finding bugs took considerable time. After a frustrating two weeks, we stopped to learn the language properly before engaging with the API again.

As the tournament drew near, we were still figuring out Lua and fighting to understand the API. Our heroes moved to the correct locations and were able to fight enemy creeps and opponents, albeit poorly, but they would never retreat, resulting in death after death. Even against the easiest default bot difficulty, Passive, we were unable to win. We implemented a crude retreat function — simply telling the bots to run back to their base if they took too much damage — which helped, but left a lot to be desired. We could now consistently beat the Passive bot, but usually ended the game with close to 100 deaths on our side, and we were lucky to see two on the opponent's.
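
For illustration, that first retreat rule amounted to a single health check. Here is a minimal sketch in Go (the actual bot is written in Lua, and the types and names below are hypothetical):

```go
package main

import "fmt"

// Hero is a hypothetical stand-in for the bot's view of its own hero;
// the real bot reads these values from Valve's Lua API.
type Hero struct {
	Health    float64
	MaxHealth float64
}

// shouldRetreat is the entire early retreat logic: run back to base
// once health drops below a fixed fraction of max health.
func shouldRetreat(h Hero, threshold float64) bool {
	return h.Health/h.MaxHealth < threshold
}

func main() {
	h := Hero{Health: 320, MaxHealth: 1000}
	fmt.Println(shouldRetreat(h, 0.4)) // true: time to run home
}
```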

The next step, now that the groundwork for bot behaviors had been laid, was to individualize each bot so that it could use its abilities. Each bot used its skills based on conditions, allowing it to fight the enemy. At this point our lack of DotA experience began to show: although the bots were able to use skills, they didn't use them optimally, simply because we didn't know what optimal was. We frequently asked more experienced players for tips and slowly made the bot stronger. Finally, we were able to defeat the Passive bot with a positive score. We attempted to beat the Easy difficulty, but struggled; the difference between the two was significant, and we needed to implement more behaviors in order to win.

State Machine

Up to this point, all code had been written as predefined actions for each bot to execute. The complexity of DotA gradually made it more and more difficult to separate what actions to take and when. Should we fight in this instance, or run away? When should we focus on hitting creeps? While we were able to defeat the Passive difficulty consistently, we realized that Easy would be a significant hurdle. We began to discuss possible options, and landed on the State Machine.

Example of State Machine

When modifying bot behaviors, it had become impossible to cleanly separate when each action should be performed. The behaviors were so closely intertwined that adjusting one would affect the performance of another, and none of them worked particularly well. We were also unable to add new behaviors neatly without disrupting other parts of the code. By creating the State Machine, we were able to separate each behavior and use weighted averages to decide which would be the most optimal at any instant of the game. The code for each bot runs every frame, constantly recalculating a weight for each behavior. Assuming we programmed the bot well, it could now decide for itself what to do based on the game state.

At this point we were able to separate each behavior into its own distinct code, broken down into components and conditions. Components are pieces of information that are always factored into a behavior's weight, while conditions add or subtract from the weight only under specific circumstances. Separating the code allowed us to make each behavior perform better: previously, each behavior relied on the others, but with the State Machine we executed only the parts of the code we needed, only when we needed them.
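
As a rough sketch of the idea (the bot itself is written in Lua; the behavior names, components, and numbers below are invented for illustration), each behavior computes a weight from its always-counted components plus any conditions that currently apply, and the highest-weighted behavior wins the frame:

```go
package main

import "fmt"

// GameState stands in for whatever the bot can observe; the fields
// here are hypothetical.
type GameState struct {
	HealthFrac  float64 // current health / max health
	EnemiesNear int
	CreepsNear  int
}

// A Behavior scores itself from components (always counted) and
// conditions (counted only when their trigger applies).
type Behavior struct {
	Name   string
	Weight func(s GameState) float64
}

// pickBehavior runs every frame and returns the highest-weighted behavior.
func pickBehavior(s GameState, behaviors []Behavior) Behavior {
	best, bestW := behaviors[0], behaviors[0].Weight(s)
	for _, b := range behaviors[1:] {
		if w := b.Weight(s); w > bestW {
			best, bestW = b, w
		}
	}
	return best
}

func main() {
	behaviors := []Behavior{
		{"retreat", func(s GameState) float64 {
			w := 1 - s.HealthFrac // component: always counted
			if s.EnemiesNear >= 2 {
				w += 0.5 // condition: only when outnumbered
			}
			return w
		}},
		{"farm", func(s GameState) float64 {
			return 0.1 * float64(s.CreepsNear) // component: creeps to farm
		}},
	}
	s := GameState{HealthFrac: 0.3, EnemiesNear: 2, CreepsNear: 3}
	fmt.Println(pickBehavior(s, behaviors).Name) // retreat
}
```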

While some of us set up the State Machine, we also continued to improve the non-State Machine version, to the point that we were able to defeat the Easy difficulty. We were once again seeing 100 deaths on the scoreboard, but would get more kills on our side and eke out wins. The code from the non-State Machine version easily slotted into our new bot, allowing us to continue working without any significant delays.

One of the benefits of the State Machine was the modularity of the system. Previously, the bot's generic behaviors lived in two files that needed comments written throughout just to tell which part of the code you were looking at; the new version had a separate file for each weight, and the behaviors were separated so they did not interact with one another. The modularity allowed multiple people to work on different parts of the project without affecting anyone else's work, improving clarity, simplicity, and the team's workflow.

We were also preparing for our first bot-versus-bot match against another team in the competition, but the State Machine was untested and not ready to use, which gave us one last chance to see how the previous version held up. Before we started our scrimmage, we decided to make sure both teams' code ran properly. When both sides ended up with a random mix of the opponent's heroes and their own, the teams realized we had made an error in the picking phase. Both teams were able to fix the issue, but it was another instance of fighting with the API, something that would persist throughout the entire process. Around this time, Maurice notified us that the tournament would be postponed for a month, giving us a chance to continue improving our bots.

During testing against Valve's proprietary bots, we frequently had to restart games because of compatibility issues between their bots and Captain's Mode. We decided to make our own picking mode for the two teams in order to speed up the process and cut down on unnecessary restarts. We gave our opponent's bots a random team of five and used that team for much of our testing. What we didn't know at the time was that this would come back to bite us later.

Our team continued to work with the State Machine, adding more behaviors that we had been unable to implement before. As the behaviors increased, we also started to see improvements in our matches against Valve's bots. After defeating Easy, we were able to beat Medium within 24 hours, and the next day we beat Hard and Unfair back to back. We were ecstatic, having not expected to beat Unfair until much later down the line, but as we watched the opponent's bots more closely, our jaws dropped. Two of the opponent's bots didn't buy items, and one didn't use any abilities. Although we were able to win, still a feat in and of itself, it wasn't a real victory against the Unfair bot.

What we didn't know was that Valve had only implemented specific skill usage and item purchasing for 46 of the bots. We changed the opponent's lineup to five of those bots, and while we could put up a good fight against Hard and win about forty percent of the time, we rarely won against Unfair. We began to discuss what we could do to increase our win rate, resulting in our first roster change. After looking at the heroes we had implemented, at the time only five, we decided to switch out heroes for ones that would hopefully fit our overall game plan better. We immediately saw an increase in win rate and, while we had become attached to the heroes we had chosen, we began to treat hero swaps as a standing option as we continued to program.

Data Gathering

We continued to implement more behaviors and features in the State Machine and, as we did, saw a slow but steady increase in performance in our matches. In order to see how well we did when including something new, we had to watch a full game to observe the specific behavior and to see whether or not we won the match. All of the bot's weights were hand-tuned, and any tweaks we made might not be visible within a single game. Even sped up, a game would take between ten and fifteen minutes, so gathering any meaningful data could mean hours of just watching. To speed up this process and make sure that any change we added was meaningful, we began building a way of gathering data over hundreds of games using Python, Go, and Docker.

Maurice gave us access to fifteen computers which we could use to run games and gather data. By this point, we had researched a “headless” mode for DotA: we could run games without graphics, which sped up the games themselves and let us run multiple instances without using the GPU. Using Docker, we set up a client-server connection that allowed us to use virtual machines on fourteen of those computers. We calculated that we could optimally run up to four games per computer, so we ran four virtual machines on each at six times speed. Altogether, we were able to run games approximately 300 times faster than before (fourteen machines × four games × six-times speed ≈ 336×).

Each game could range between fifteen and eighty minutes. Docker Swarm distributed the total number of games requested evenly across all of our worker computers. For 56 games or fewer (fourteen machines × four slots) this was fine, but anything more was suboptimal: with game lengths varying so widely, an even split left some machines idle while others were still grinding through their longest games. We initially attempted to deploy using Docker Swarm, but it made more sense for us to create our own solution. It would need to be customizable, work well on a distributed network, and have support for simple concurrency. We decided to use Go because it met our criteria and was easy to build and deploy. Finally, Python was used to graph our results as histograms and line graphs.
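
A minimal sketch of that kind of runner in Go (illustrative only; the post doesn't include the real code): rather than pre-splitting games evenly, workers pull the next game from a shared queue the moment they finish one, so a single long game never idles the rest of the fleet.

```go
package main

import (
	"fmt"
	"sync"
)

// runGame stands in for launching one headless DotA2 match and
// returning its result; here it is just a stub.
func runGame(id int) string {
	return fmt.Sprintf("game %d finished", id)
}

func main() {
	const totalGames = 500
	const workers = 56 // e.g. 14 machines x 4 games each

	jobs := make(chan int)
	results := make(chan string)
	var wg sync.WaitGroup

	// Workers pull jobs as they free up, instead of receiving a
	// fixed, evenly pre-split share of the games.
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for id := range jobs {
				results <- runGame(id)
			}
		}()
	}

	// Feed the queue, then close it so the workers drain and exit.
	go func() {
		for id := 0; id < totalGames; id++ {
			jobs <- id
		}
		close(jobs)
	}()

	go func() { wg.Wait(); close(results) }()

	for r := range results {
		fmt.Println(r)
	}
}
```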

Data showing wins and losses over time

Using this setup, we were able to run 500 games over the course of an hour, giving us meaningful data. While it was still necessary to watch games to confirm that behaviors worked properly, we could now test changes at scale and gather the data to determine whether a change was beneficial or detrimental to the bot.

As we went into the final weeks, we played with the idea of incorporating a genetic algorithm. The State Machine weights were all hand-tuned based on our observations. Our Farm, Hunt, and Retreat weights in particular were so closely tied together that changing the values of one would dramatically alter the way the bots played, and their win rates would generally decrease. We knew the weights were at a good point, but were sure they weren't optimal, especially since different characters play differently and sharing the same weights made them all play more or less the same. A genetic algorithm could use machine learning to tune each weight, giving us the most ideal numbers to defeat the default bots, and hopefully our opponents in the tournament. An ambitious goal was to create different genes for each character, giving each a unique play style, but we knew that without more time or computing power, we would have to make do with the hand-tuned weights we had.

A week before the competition, we stopped adding major features, only including small changes that our data decisively showed would increase the win rate. By the end, the State Machine version achieved a consistent win rate above 98% against the Valve bots. Just as we were ready for the competition, Maurice messaged us to inform us that it had once again been extended, this time by another month.

Genetic Algorithm

With the month-long extension to the tournament, we began to discuss how we could create a genetic algorithm. In the end, we decided to use Go once again, because our data-gathering programs had already been written in it, making it easier to tie the programs together.

Genetic algorithm flowchart, from arwn

In order to get the genetic algorithm to work, we needed to run multiple iterations of our bot. From those iterations, we would take the top five heroes' genes and “breed” them by shuffling, averaging, and splicing them together. The next generation would be made up of slightly modified versions (using a 10% mutation probability to choose which genes to change, and a 10% mutation rate for how much to change each chosen gene), which we would then gather data on, repeating the process until the beginning of the competition. Our plan was to replace the current hand-tuned genes with the new machine-learned ones.
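
A minimal sketch of the breed-and-mutate step in Go (illustrative; the real gene layout and crossover details aren't shown in this post):

```go
package main

import (
	"fmt"
	"math/rand"
)

// Genes holds the tunable State Machine values for one hero.
type Genes []float64

// crossover splices two parents: genes before a random cut come from
// parent a, the rest are the average of both parents.
func crossover(a, b Genes) Genes {
	child := make(Genes, len(a))
	cut := rand.Intn(len(a))
	for i := range a {
		if i < cut {
			child[i] = a[i]
		} else {
			child[i] = (a[i] + b[i]) / 2
		}
	}
	return child
}

// mutate changes each gene with probability p, scaling it by up to
// +/- rate (e.g. p = 0.10, rate = 0.10 for the initial settings).
func mutate(g Genes, p, rate float64) {
	for i := range g {
		if rand.Float64() < p {
			g[i] *= 1 + (rand.Float64()*2-1)*rate
		}
	}
}

func main() {
	parentA := Genes{1.0, 0.5, 2.0}
	parentB := Genes{0.8, 0.7, 1.5}
	child := crossover(parentA, parentB)
	mutate(child, 0.10, 0.10)
	fmt.Println(child)
}
```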

Our first step was making sure we could run the genetic algorithm using Go and Docker while modifying the Lua scripts at the same time. Each bot's gene was a Lua file containing the values we wanted to mutate. We used Go to read in the gene file, mutate the values, and output the new gene using a gene template. The new genes were then used for the subsequent iterations.
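
For illustration, Go's text/template makes the write-back step straightforward; the Lua file layout below is a hypothetical stand-in for the project's actual gene format:

```go
package main

import (
	"os"
	"text/template"
)

// geneTemplate renders a gene vector as a Lua table the bot scripts
// could load; this layout is a guess at what such a file looks like.
const geneTemplate = `-- auto-generated gene file
return {
{{- range $name, $value := . }}
    {{ $name }} = {{ printf "%.4f" $value }},
{{- end }}
}
`

func main() {
	// Hypothetical gene values after mutation.
	genes := map[string]float64{
		"farm_weight":    1.2,
		"hunt_weight":    0.8,
		"retreat_weight": 1.5,
	}
	tmpl := template.Must(template.New("gene").Parse(geneTemplate))
	if err := tmpl.Execute(os.Stdout, genes); err != nil {
		panic(err)
	}
}
```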

Having successfully created a way to read and write our new gene files, instead of making one generic genetic algorithm as originally planned, we created genes for each hero we were using. To make this work, each file had to include the name of the hero it belonged to. Unfortunately we could only train five heroes at a time, so we opted to train our starting lineup and use our hand-tuned genes for the rest of the heroes we had implemented.

Finishing the genetic algorithm ended up taking longer than planned. We hoped to have it running and training within a week, but needed a few more days to iron out bugs. We had each built separate parts of the genetic algorithm, and piecing them together took some time.

Finally, the genetic algorithm worked, but as we began running the first generations, we ran into multiple issues. We had been having intermittent problems with our Docker containers not running games, but had chosen to ignore them because the slowdown in data collection wasn't significant. If one computer malfunctioned and dropped off the network, however, the server would hang, waiting for data to come in from the downed machine. The genetic algorithm needed to run non-stop, working through generation after generation, and if a worker failed to respond, the server could never move on to the next generation because it was still waiting for the remaining games to come in. It made little sense for us to monitor the computers in shifts all day, so we added a timeout: if a container did not respond within a set period, the server gave up on it and moved on.
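
In Go, that timeout is a natural fit for select with a deadline; a minimal sketch, with all names and durations assumed:

```go
package main

import (
	"fmt"
	"time"
)

// collectResults waits for up to `expected` game results, but stops at
// the deadline, so one downed worker can no longer hang a generation.
func collectResults(results <-chan string, expected int, timeout time.Duration) []string {
	var collected []string
	deadline := time.After(timeout)
	for i := 0; i < expected; i++ {
		select {
		case r := <-results:
			collected = append(collected, r)
		case <-deadline:
			fmt.Printf("timed out with %d/%d results; moving on\n",
				len(collected), expected)
			return collected
		}
	}
	return collected
}

func main() {
	results := make(chan string, 3)
	results <- "game 1 finished"
	results <- "game 2 finished"
	// The third result never arrives, simulating a downed worker.
	got := collectResults(results, 3, 2*time.Second)
	fmt.Printf("collected %d of 3 results\n", len(got))
}
```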

In the end, after about four days of starting and stopping the genetic algorithm, we finally had it working. While running it and confirming that it worked, we decided to change our team lineup in favor of one we thought could raise our win rate. When we set up the genes we wanted to manipulate, we went through them as a team and adjusted them to starting numbers we believed made sense for the genetic algorithm. At that time, we chose to manipulate approximately 25 components and conditions, the “genes,” from our Farm, Hunt, and Retreat weights. This change, combined with a new hero selection for the opposing team, dropped our win rate from 98% to 80%. While the genetic algorithm was slowly raising the win rate, we agreed as a team that if we could boost it early on by switching or adding heroes, it was worth testing. After the switch, the initial 80% rose closer to 90%.

As we observed the bot, we knew time was running out and the win rate wasn't growing fast enough. Although it was a risky decision that could cause a drastic drop in win rate, we decided to raise the mutation settings from a 10% mutation probability and 10% mutation rate to 15% and 25% respectively. We calculated that in the most ideal situation, breeding out a useless gene would take at least thirty generations, or at least one week. We wanted to reduce that number, and figured that roughly doubling the amount of mutation would give us a higher rate of change, for better or for worse. After days of observing the results, the risk paid off and the bot saw a faster and more consistent increase in win rate.

Fitness growth over time

Once we were sure of the outcome, we began adding more genes to manipulate from other State Machine weights. Another problem we had been unable to solve throughout the project was how to play early in the game versus near the end. In DotA, the play styles of the two phases are drastically different: behaviors that are important early on matter less as the game goes longer, and vice versa. Our strategy up to this point had been to trade a slightly weaker start for a more powerful finish. We had tried tweaking the weights multiple times, but even when the bots played better at the beginning, the manipulated weights would fail late game, dropping the overall win rate. Now that we had a working genetic algorithm, we added various health multipliers for it to adjust, and also decided to add multipliers based on how powerful a hero is. Heroes go from level 1 to 25 and get stronger as they gain levels. By hand, we were never able to manipulate the weights in a way that allowed for both early- and late-game adjustments; with the genetic algorithm, we could leave it up to the computer to decide when to play differently.
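
As a toy illustration of the idea (the actual multipliers aren't shown in this post), a pair of genes can interpolate a weight between an early-game and a late-game value as a hero levels from 1 to 25, leaving both endpoints for the genetic algorithm to tune:

```go
package main

import "fmt"

// scaledWeight blends an early-game and a late-game weight by hero
// level, letting the genetic algorithm tune both endpoints instead of
// us hand-picking one compromise value. Values here are hypothetical.
func scaledWeight(early, late float64, level int) float64 {
	t := float64(level-1) / 24.0 // 0.0 at level 1, 1.0 at level 25
	return early*(1-t) + late*t
}

func main() {
	// e.g. a retreat weight that matters more early than late
	fmt.Println(scaledWeight(1.5, 0.6, 1))  // 1.5
	fmt.Println(scaledWeight(1.5, 0.6, 25)) // 0.6
}
```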

After one more hero change, we settled on our final roster and continued to let the genetic algorithm work. A few days before the tournament, we saw that the bot had finally reached a 99% win rate for one generation, but it dropped to 96% the next. While our rapid manipulation of genes had created a powerful bot, as it approached its theoretical peak the 25% mutation rate changed too much at once and dragged the win rate back down. We decided that in order to preserve our win rate, we would need to slow down the mutation: the mutation probability was dropped to 7% and the mutation rate to 15%.

As we changed our genetic algorithm once again, we decided to take another risk. Up to this point we had been taking the top five genes from each hero as parents, breeding them, and using the offspring for the next generation. While this had worked for us, it was not how a genetic algorithm is classically used. In a genetic algorithm, every gene should have a chance of being picked, but we were actively selecting which to use. The value of keeping lower-win-rate bots in the pool is diversity: a bot may not have performed as well in its own generation, but its genes may be an important ingredient for increasing the win rate in a future one. To make sure those lower-win-rate genes had a chance of being selected, we weighted the selection of all genes by win rate, giving higher win rates a higher probability of being picked while still leaving the lower win rates a chance, albeit smaller.
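
This is essentially fitness-proportionate (roulette-wheel) selection; a minimal sketch in Go, with hypothetical names:

```go
package main

import (
	"fmt"
	"math/rand"
)

// Candidate is one evolved gene set and its measured fitness.
type Candidate struct {
	Name    string
	WinRate float64 // used as fitness
}

// selectParent picks a candidate with probability proportional to its
// win rate, so weaker genes still occasionally enter the breeding pool.
func selectParent(pool []Candidate) Candidate {
	total := 0.0
	for _, c := range pool {
		total += c.WinRate
	}
	r := rand.Float64() * total
	for _, c := range pool {
		r -= c.WinRate
		if r <= 0 {
			return c
		}
	}
	return pool[len(pool)-1] // guard against floating-point drift
}

func main() {
	pool := []Candidate{{"a", 0.92}, {"b", 0.75}, {"c", 0.40}}
	fmt.Println(selectParent(pool).Name)
}
```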

We also discussed a change in strategy. Taking the genes from each individual bot was necessary, but we considered the importance of taking all of the genes from one “team” of bots. A bot that at first glance looked like it had less potential could, as part of a team, carry genes that were an important key to victory. We weighed the benefits of switching from breeding individual bots to breeding only teams, but couldn't justify losing out on the most powerful heroes' genes. As we concluded that we should keep selecting individual bots, a thought popped into our heads: what if we did both? Taking the individual genes was undeniably important, but by also breeding bots from the same team alongside the strongest individuals, we believed we could unlock the potential of both approaches.

Conclusion

The genetic algorithm seems to have improved the bot, although its play style is considerably different from the original non-genetic version. Where we were once much more aggressive early, we now play more conservatively and aim to win mostly by destroying structures and taking the occasional team fight. The older bot would group up more often as a team, forcing opponent bots to react and resulting in more fights. If we continued to work on the project, I believe the next step would be to have the bot train against itself and against the older, non-genetic version.

Through this project, I've been able to learn multiple programming languages, familiarize myself with Docker, and see the importance of documentation when working as a team. The reason I decided to work on this project was less an interest in DotA2 and more a desire to understand machine learning. I'd heard the term many times and read about it, but didn't have a real understanding of what it entailed or how to actually program it. Participating in this project gave me a unique opportunity to work with machine learning, and it has deepened my understanding of programming as a whole.
