It’s part of our nature to want to know the future.
Whether we realize it or not, almost every decision we make is based on a prediction about future events.
Experience has taught us, for example, that if it’s the month of December and you see frost on the ground, you’ll need a jacket because it'll feel cold outside.
Sounds obvious, right? But this is actually the same framework that some of the world’s most sophisticated algorithms use.
You see, the process of making predictions boils down to three primary components: collect data, organize it, then use it for making future decisions.
The process of making predictions boils down to three primary components: collect data, organize it, then use it for making future decisions.
In the back of your mind, you’ve stored logged memories of seeing frost on the ground during the month of December and feeling cold as you walk outside. It’s through this model that you’ll be able to predict what it will feel like today despite not yet having stepped outside.
The world of Machine Learning and Artificial Intelligence is all about using computers to imitate this process of decision making. Models are trained based on past events, then used to predict the future.
The good news is you can take advantage of this technology without needing a PhD in Computer Science. In fact, I’ll be walking you through how to predict foot traffic with little more than a spreadsheet and some simple demand forecasting.
You can take advantage of this technology without needing a PhD in Computer Science.
Step 1: Collect the data
Step one in building your forecast is data collection, and the most important part is making sure your data is clean and consistent.
In the case of foot traffic, your options range from keeping a manual log to automated systems like a door sensor or cameras with built in image analysis. With whatever option you choose, you just want to avoid gaps in your data.
Our template contains three separate columns: the date, the weekday in the form of a number, and your customer count for the day.
Computers prefer numbers over text, so you always want to map text-based categories to a number.
Step 2: Organize the data
Now that we’ve established our process for collecting data, we’ll need to organize it in a way that can be used for future predictions.
We can start to build our forecast by creating a scatter plot of our data. Luckily for us, most spreadsheet programs have this as a built-in function. Just select your Day of Week and Customer Count columns, find the Insert tab/drop-down, then select Scatter Plot from the chart section.
Your result will look something like this:
Now it’s time to fit an estimator to your data. This process involves finding out what path you’d need to take in order to draw a line that minimizes the cumulative distance between the line and each dot on your plot.
There are numerous theories and approaches to drawing this line. In the interest of time, I won’t be going into those details. For this example, we’ll be using a polynomial regression in light of the fact that we have one input (day of week) and a dynamic output (customer counts).
Most spreadsheet programs have built-in features that will calculate and draw the line for you. This is often as simple as right clicking on your chart and selecting “add trend line.” From there you can configure your options for fitting the data.
And voila, you have your model:
Step 3: Time to deploy!
Up to this point you’ve done little more than create a spreadsheet that would make your colleagues think you’re smart were they to walk by your computer.
You need to make the data actionable.
Behind that fancy line of yours is an equation. It’s through that equation that you can enter a value for the x-axis, and find where you should end up on the y-axis.
For example if I were to plug the number 2 into the equation, the result will be ~160. If I used 4, it’d be ~145.
It’s going to be through this equation that we’re able to make predictions about the future.
On a separate page in our spreadsheet, we can use some special formatting to give us dynamic forecasts about the week ahead based on our log.
For this particular use-case, it’s a good idea to put a cap on your forecast’s memory. Reason being, you don’t want customer counts from too far back to impact your predictions on the week ahead. As would be my recommendation, the template included with this post automatically updates its polynomial equation based on the previous 4-weeks.
Be Creative
Now that you have the know-how, it’s time to think outside the box.
Interested in monthly demand for moisturizers based on sales from the past 12 months? Is there a time of day when foot traffic calls for more or less staff?
In order to start predicting the future, all you need to do is collect data, organize your model and deploy it into action.