The plan for this report was to analyze how big data is used in the sports betting industry, both by bookmakers to set odds and by gamblers to gain a competitive edge. Research began with the bookmakers. Many of the algorithms and methods they use are not public, for obvious reasons, but a good deal of information is available about the types of data bookmakers use and where they get it. Various companies dedicated to gathering and analyzing sports data also work in collaboration with bookmakers. The research found that odds compiling starts with game prediction models built on historical sports data. Both player and team data are analyzed to predict the score of a game and player performance.

After the big data methods bookmakers use for game predictions had been identified, research turned to the algorithms and models that gamblers and researchers use to predict winning bets. One of the first found was an algorithm made by a researcher at the University of Tokyo that used the odds offered by different bookmakers as its primary data. This paper showed that the odds put out by bookmakers do not reflect the exact probability predicted for a game, and that there is more to how bookmakers determine their odds than game prediction models alone. This led back to further research on how bookmakers come up with their odds. That research showed that game prediction probabilities are the basis of odds compiling but are not the only factor. So that the expected return always favors the bookmaker, bookmakers add a margin to the predicted probability of an outcome, often based on public opinion. This reduces risk and ensures a positive expected return for bookmakers.
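To make the effect of the margin concrete, the short Python sketch below converts a set of predicted probabilities into decimal odds with a margin built in, then checks what that does to a bettor's expected return. It is an illustration only: the function names, the example probabilities, and the 5% margin are hypothetical, and the margin is spread proportionally across outcomes for simplicity, whereas real bookmakers may weight it by public opinion as described above.

```python
def quoted_odds(fair_probs, margin):
    """Convert predicted probabilities into decimal odds that include a margin.

    Assumption: the margin is applied proportionally to every outcome.
    """
    if abs(sum(fair_probs) - 1.0) > 1e-9:
        raise ValueError("predicted probabilities must sum to 1")
    return [1.0 / (p * (1.0 + margin)) for p in fair_probs]


def implied_probabilities(odds):
    """Probability implied by each decimal odd (1 / odd)."""
    return [1.0 / o for o in odds]


def overround(odds):
    """Amount by which the implied probabilities exceed 1 (the built-in margin)."""
    return sum(implied_probabilities(odds)) - 1.0


def bettor_expected_return(true_prob, odd):
    """Expected profit per unit staked on an outcome with the given true probability."""
    return true_prob * odd - 1.0


# Hypothetical predicted probabilities for home win, draw, away win.
fair = [0.50, 0.30, 0.20]
odds = quoted_odds(fair, margin=0.05)

for p, o in zip(fair, odds):
    print(f"predicted p={p:.2f}  quoted odds={o:.3f}  "
          f"bettor expected return={bettor_expected_return(p, o):+.4f}")
print(f"overround: {overround(odds):.4f}")
```

Under these assumptions the implied probabilities of the quoted odds sum to more than 1, and the bettor's expected return is negative on every outcome, which is the bookmaker's edge.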

After this, more research was done into prediction models for betting, which led to looking into many different algorithms and projects. Finding details of the algorithms that big companies have created proved challenging, since these companies keep their methods private. An article by a former odds compiler pointed toward the Poisson distribution for soccer predictions, a very popular game prediction method used by both bookmakers and gamblers. The original plan was to take Premier League data, build a Poisson distribution model in R from one year’s worth of data, and compare its predictions to actual results. This did not end up happening, so the project became a report without a programming component. Further research was done into other betting models, some of which proved more accurate than others. Many of them used artificial intelligence and machine learning, in particular neural networks.
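For reference, the core of a Poisson approach to soccer prediction can be sketched in a few lines. The Python example below is not the model originally planned for this project and is not fitted to any Premier League data. It assumes each team's goals follow an independent Poisson distribution, and the scoring rates passed in (1.6 and 1.1 goals per match) are hypothetical placeholders for values that would normally be estimated from historical results.

```python
from math import exp, factorial


def poisson_pmf(k, rate):
    """Probability of exactly k goals when goals follow a Poisson(rate) distribution."""
    return exp(-rate) * rate ** k / factorial(k)


def match_probabilities(home_rate, away_rate, max_goals=10):
    """Return (home win, draw, away win) probabilities.

    Assumptions: home and away goals are independent, and scorelines
    above max_goals per team are ignored (their probability is tiny).
    """
    home_win = draw = away_win = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, home_rate) * poisson_pmf(a, away_rate)
            if h > a:
                home_win += p
            elif h == a:
                draw += p
            else:
                away_win += p
    return home_win, draw, away_win


# Hypothetical expected goals per match for the home and away team.
home_win, draw, away_win = match_probabilities(1.6, 1.1)
print(f"home win: {home_win:.3f}  draw: {draw:.3f}  away win: {away_win:.3f}")
```

The resulting probabilities could then be compared with the probabilities implied by a bookmaker's odds, which is the kind of comparison the original plan had in mind.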