I mentioned on LinkedIn that I thought it would be a fun diversion to build a PyTorch model to forecast the US Presidential Election. This was an interesting project for a few reasons.
At the time I did this, it was about two weeks before the presidential election in the United States; as I write this, the election is next Tuesday (2024). My goal was twofold: could I train a neural network against a true unknown (something that has not happened as of model training and this post), and could I do it with limited resources while still producing something specific enough to yield a concrete prediction?
Machine learning takes many forms. Today, when someone mentions artificial intelligence (AI), they are most likely talking about a large language model (LLM) like ChatGPT or one of the many other chat interfaces. While LLMs are a form of AI, machine learning is one specific part of the larger AI landscape.
Machine learning uses multi-dimensional arrays to 'learn' about something. Information is passed into the first of many layers. Each layer does some calculations on its input to learn features about it, then passes the result on to the next, hidden, layer. This repeats across many layers until the network outputs a much smaller array of numbers. Once it gets to the end, it compares that output to the answer it should have produced, works backwards through the layers, and adjusts its internal numbers to shrink the error. Repeating that cycle is what is called training a machine learning model. There's a lot more that goes into it, but at a high level, this is what is going on inside the model.
There are a few 'engines' that do all this. They all represent data as numerical arrays in a format called "tensors." Two of the popular engines are TensorFlow and PyTorch.
PyTorch is a free, open-source library for Python. It is what I decided to use to build the election model.
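To make the forward-and-backward idea concrete, here is a minimal sketch in PyTorch: a tiny network doing one forward pass and one backward pass on made-up tensors. This is not the election model itself, just the mechanics described above.

```python
import torch
import torch.nn as nn

# Made-up data: a batch of 16 examples, each with 3 input features.
x = torch.randn(16, 3)
y = torch.randn(16, 1)

# A tiny network: 3 inputs -> 8 hidden units -> 1 output.
model = nn.Sequential(nn.Linear(3, 8), nn.ReLU(), nn.Linear(8, 1))
loss_fn = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

prediction = model(x)          # forward pass: data flows through the layers
loss = loss_fn(prediction, y)  # how far off the output is from the target
loss.backward()                # backward pass: compute gradients layer by layer
optimizer.step()               # adjust the weights to shrink the error
optimizer.zero_grad()          # reset gradients before the next pass
```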
I limited the amount of data I would ingest. There is quite a bit of data out there, but since this was a diversionary weekend side project for me, I really didn't want to invest too much time in acquiring data. As such, I limited myself to historical CPI, GDP, 'sentiment' in the form of historical polling data immediately prior to previous elections, and actual election voting results on a state-by-state basis going back to 1976.
The accuracy of a model can be quantified during training by the 'loss function.' The lower this number, the better the model will perform. To a point. When a model trains for too long, the loss can get very low and give the impression that the model is really, really good. But like a student who never paid attention in class and decided to memorize the answers, the result is a good score without really understanding or learning what they were supposed to learn. When a model memorizes the input data, it is called 'overfitting,' and it means that when the model sees something new, it won't know how to handle it. So we want to minimize the loss function without overfitting our model. It's a balance.
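A common way to catch overfitting is to hold out a validation set and watch both losses during training. Here's a hedged sketch of that pattern; the tensor names (X_train, y_train, X_val, y_val) and the patience settings are assumptions for illustration, not the exact code I ran.

```python
import torch
import torch.nn as nn

def fit(model, X_train, y_train, X_val, y_val, epochs=2000, patience=50):
    """Train while watching a held-out validation set for overfitting."""
    loss_fn = nn.MSELoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
    best_val, stale = float("inf"), 0
    for epoch in range(epochs):
        model.train()
        optimizer.zero_grad()
        loss = loss_fn(model(X_train), y_train)
        loss.backward()
        optimizer.step()

        model.eval()
        with torch.no_grad():
            val_loss = loss_fn(model(X_val), y_val).item()

        # Training loss falling while validation loss climbs is the
        # memorizing-student pattern: stop before it gets worse.
        if val_loss < best_val:
            best_val, stale = val_loss, 0
        else:
            stale += 1
            if stale >= patience:
                print(f"Stopping at epoch {epoch}; best val loss {best_val:.4f}")
                break
    return model
```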
With my model, I found that the loss varied wildly from one training session to the next. As a result, I don't feel confident that this specific model is the most accurate, but the loss did get down to about 0.10. Lower would have been better, but considering the data I had, I was happy with this.
I started to see some trending results, and this gave me hope that I was honing in on a model that could at least pass as acceptable. For this limited dataset, I used pandas to organize and structure the data, which was based on state results by year. I then normalized it, since I had vote tallies as three input features sitting next to CPI figures; normalization helps prevent the former from being overweighted in the model.
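Here's roughly what that normalization step looks like. The file name and column names below are placeholders, since the real dataset's schema isn't shown here.

```python
import pandas as pd

# Placeholder file and column names for illustration.
df = pd.read_csv("state_results_by_year.csv")

features = ["dem_votes", "rep_votes", "other_votes", "cpi", "gdp", "poll_margin"]

# Min-max scale every feature to [0, 1] so vote tallies in the millions
# don't dominate CPI and GDP values that live on a much smaller scale.
df[features] = (df[features] - df[features].min()) / (
    df[features].max() - df[features].min()
)
```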
The PyTorch model itself is fairly basic at a high level: multiple hidden layers, going from the input to 64 units, then 32, then 1 output. The target output was a value from -1 to 1 signifying which party each state would favor. Under the hood there is much more math going on. There are 'activation functions' that help the layers keep learning rather than all drifting to zero over time. There is 'backpropagation,' the backwards pass up through the layers that does the learning. Plus much more.
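A network with that shape might look something like this in PyTorch. This is my reconstruction of the described architecture rather than a verbatim listing; in particular, the ReLU activations and the Tanh on the output (to squash predictions into the -1 to 1 range) are assumptions.

```python
import torch.nn as nn

class ElectionNet(nn.Module):
    """Input features -> 64 -> 32 -> 1, as described above."""

    def __init__(self, n_features):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(n_features, 64),
            nn.ReLU(),        # activation keeps gradients from dying out
            nn.Linear(64, 32),
            nn.ReLU(),
            nn.Linear(32, 1),
            nn.Tanh(),        # squashes the output into [-1, 1]: party lean
        )

    def forward(self, x):
        return self.layers(x)
```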
The model 'learns' by running this entire process thousands of times, each pass called an 'epoch.' Each epoch produces its own loss value, and the model learns by searching the multi-dimensional loss surface for the weights that give the lowest loss.
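That 'searching' is gradient descent, and a toy example makes it visible: minimize a one-dimensional bowl-shaped loss, f(w) = (w - 3)², and watch the weight roll downhill to the minimum.

```python
import torch

w = torch.tensor([0.0], requires_grad=True)  # start far from the answer
optimizer = torch.optim.SGD([w], lr=0.1)

for epoch in range(100):
    loss = (w - 3) ** 2  # current height on the loss surface
    optimizer.zero_grad()
    loss.backward()      # slope at the current point
    optimizer.step()     # step downhill

print(w.item())  # ~3.0, the bottom of the bowl
```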
I ran the Python script to forecast each state using this model. Once a state's result was forecast, I awarded that state's electoral votes to the winning party, keeping a running total. The script then output each state's winning party followed by the overall result. On that note, when I saw some runs giving New York to the Republicans, I attributed it to a lack of data.
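The tallying logic is simple enough to sketch. The electoral-vote map below is abridged to four states, `state_features` stands in for each state's normalized feature tensor, and the sign convention (positive output = Democrat) is an assumption for illustration.

```python
import torch

# Abridged 2024 electoral-vote counts; the real script covers all states plus DC.
ELECTORAL_VOTES = {"California": 54, "Texas": 40, "Pennsylvania": 19, "New York": 28}

def tally(model, state_features):
    dem_total, rep_total = 0, 0
    model.eval()
    with torch.no_grad():
        for state, ev in ELECTORAL_VOTES.items():
            lean = model(state_features[state]).item()  # value in [-1, 1]
            if lean > 0:                                # assumed: positive = Democrat
                dem_total += ev
                print(f"{state}: Democrat")
            else:
                rep_total += ev
                print(f"{state}: Republican")
    print(f"Democrats {dem_total}, Republicans {rep_total}")
```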
So, to put myself out there with just over a week to go, I'm posting the results of my PyTorch model, and I will treat myself to a glass of wine if it is accurate. After all, this was more about working through PyTorch and building a model on limited datasets than about getting the best result. Had I wanted that, I would have ingested more data, going back at least a few more decades. I would have also added data capturing sentiment, other world events (war or no war), and other 'October surprises,' such as either candidate becoming involved in some large scandal in the weeks before the election.
According to my PyTorch model, Trump wins with 307 electoral votes.
We will see how this goes. To prove I'm not altering the results after the fact, I posted my model's results on X.com and on LinkedIn, both a week prior to the election.
In case you were wondering, this is the state-by-state breakdown. I used the interactive map tool from Real Clear Politics to build the following graphic:
Ciao! I'm Scott Sullivan, a software engineer with a specialty in machine learning. I spend my time in the tranquil countryside of Lancaster, Pennsylvania, and in northern Italy visiting family, close to Cinque Terre and La Spezia. Professionally, I'm using my Master's in Data Analytics and my Bachelor's in Computer Science to turn code into insights with Python, PyTorch, and DFE superpowers, while on a quest to create AI that's smarter than your average bear.