Thursday, June 6, 2024

Another Inside Look At One Of My Models

Last time I did this it was pretty fun. I know you guys liked it too because it's always in the top 10 of most viewed posts on here. And I really don't mind 'spilling any secrets' now because, like I said in the last post, these models almost certainly aren't profitable as-is anymore. At this point a lot of the stuff in there is almost 10 years old. You're lucky if you get two or three full seasons out of any model without serious, constant tweaking. And finally, even if they are profitable, it would be for a tiny amount of EV, and quite frankly, I really don't give a shit either way. Try 'em out for yourself or don't. I'm very clearly not even remotely guaranteeing anything. All I can say is that any model I'll share on here absolutely worked for me for at least a few full seasons and I did make some real, actual money with them. And I used to LOVE reading about other people's models. Everyone seems to approach it in a completely unique way and I always learned something reading about anyone's attempt at modeling.

So here goes. Today I'll be going into my NHL expected shots on goal/goals/points model. The model I made last in my sports betting career and the one I'm probably most proud of. I have somewhat of a bizarre, almost intimate love-hate relationship with all my models, this one in particular. It's big, it takes a lot of work and time to use and I always felt like it was running bad. Like I should have been winning way more with it even though it was quite profitable for a while.

I can still remember the early days of building it and the little light bulb moments along the way. When you're building a model, it's in your head quite literally 24/7. You take 2 steps forward and 1 step back. I was like a sponge, trying to absorb any and all information I possibly could about hockey analytics and modeling in general. I even paid someone I respected from a message board for a small piece of the model that I'll explain.

So let's get into it. This one will be a little different in that it will have more of a narrative structure to it. To explain this model I will have to explain the first NHL model I made that eventually became this one.

One of the first models I made was for NHL player vs player matchups, aka who would record the most points in a game. This is where the basic genesis for my later model came from. Basically, I used a player's Corsi instead of shots and then turned that number back into shots. (Corsi is simply all shots AT the net instead of ON the net, named after Jim Corsi, a goalie coach for the Buffalo Sabres who first started tracking it for his goalies.) Corsi, especially in smaller samples, is better than Shots because there are way more data points. People used to always say 'Corsi measures possession' without really knowing what that meant. It's true, it does a better job of measuring possession than shots, but only because it counts more events.

I used to always think of this example: say a player gains possession of the puck in his own zone. He goes the length of the ice, dekes out all 5 guys, dekes the goalie out and rings a shot off the post or misses a wide open net by half an inch. NONE of that gets recorded into the game log! Hitting the post doesn't count as a shot on goal. Does that seem accurate to you? Now with advanced analytics we could call it a good zone exit, a good zone entry and a Grade A shot attempt from a super high percentage spot on the ice. But when I was making this model, just counting up the shots AT the net was considered fairly groundbreaking.

And the reason, in my opinion, was not so much that it was super important to count up every shot attempt, but that by counting shot attempts, you were really measuring where the puck was on the ice. The real skill, the real work being done on the hockey rink is getting the puck and the play into the opponent's zone and keeping it out of yours. Whether or not the puck happens to bounce off someone and go in, or be on net or not, has a ton of noise and luck in it. The good teams play more in the offensive zone and convert that time into shots AT the net. Corsi did a good job of capturing that. You can't put a shot at the net if you're always in your own zone.

(In the long term, shooting percentage and Corsi percentage are pretty fluky and noisy, and for the most part any team giving up more shot attempts than it took usually lost. Although I will say, some teams did seem better at suppressing shot quality. John Tortorella-coached teams come to mind. His whole thing was everyone collapsing down in front of their own net and EVERYONE had to sell out to block shots. Another team that comes to mind is the Boston Bruins, especially when Chara was on the team. Great defenders not only suppress shot quantity, but suppress shot quality. However, by and large, any team giving up more shots than they took would eventually lose.)

My models always treated all shot attempts as equal which I know wasn't an accurate assessment. I should have made adjustments later on at least for High Danger, Mid Danger and Low Danger shots. One of those things that was always on my to-do list.

Anyway, getting back on track. I know I talked about this a little bit on here before, but in my first model I assumed that a player's Corsi per game for the current season was his true skill level. I would take either the last 3 years' worth or his full career's worth of what I called 'Corsi percentage', or the rate at which he turned 'Corsi's' into 'Goals'. It's the same exact thing as 'shooting percentage' but I just used Corsi instead of Shots. (I know it sounds funny saying 'Corsi's' and most people nowadays call them Shots AT the net while just 'shots' is considered Shots ON the net. That's dumb and confusing though and I will never, EVER not call them Corsi.)

So I would take a player's Corsi per game, multiply it by his expected Corsi into Goals percentage, and now I have his raw expected Goals. Now you have to adjust for opponent. When I first started, I just divided his raw number by the league average goals allowed per game and then multiplied that by the opponent's goals given up per game, which is maybe a little bit clunky but really not that bad. Now I have his expected goals. The next step is to get his expected assists. At first I did the same thing as goals; I took the player's assists per game, divided by the league average assists per game given up, then multiplied by the opponent's assists per game given up. So now I would have the player's expected goals and assists; add them up and you have his expected points.

Do the same thing for whoever he's up against and you'll end up with something like Player A .875 and Player B .62. Which is great and all, but you still need to then turn those numbers into percentages, and then turn THAT into betting terms. For all that, all you need is a Poisson calculator, which you can find online for free. I'm not going to walk you through how to use Poisson but if you want to bet on props, you absolutely 100% need to learn it. It isn't that hard, I did it all by myself. So you end up with something like Player A will outscore Player B 65% of the time (I'm making that number up) and then you convert that into a betting line using an odds converter. 65% equals -186. So if the bet was, say, -170 or better, it would be a bet on Player A. If the line was, say, +200 or better, it would be a bet on Player B.
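For anyone who'd rather see those steps as code than plug numbers into an online calculator, here's a minimal sketch. It is not the original spreadsheet. It assumes each player's points are independent Poisson variables and that a tie is a push (both are assumptions, not something stated above), and the function names and the opponent-adjustment inputs are made up for illustration. The .875 and .62 are the example numbers from the post.

```python
import math

def opponent_adjusted(raw_per_game, league_avg_allowed, opp_allowed):
    """Scale a raw per-game number by how much the opponent allows vs. league average."""
    return raw_per_game / league_avg_allowed * opp_allowed

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def head_to_head(lam_a, lam_b, max_points=15):
    """Return (P(A > B), P(B > A), P(tie)), assuming independent Poisson points."""
    p_a = p_b = p_tie = 0.0
    for a in range(max_points + 1):
        pa = poisson_pmf(a, lam_a)
        for b in range(max_points + 1):
            joint = pa * poisson_pmf(b, lam_b)
            if a > b:
                p_a += joint
            elif b > a:
                p_b += joint
            else:
                p_tie += joint
    return p_a, p_b, p_tie

def to_american(p):
    """Convert a win probability (0 < p < 1) into a fair American line."""
    if not 0.0 < p < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    if p >= 0.5:
        return round(-100 * p / (1 - p))
    return round(100 * (1 - p) / p)

# Expected points from the post's example: Player A .875, Player B .62
p_a, p_b, p_tie = head_to_head(0.875, 0.62)

# Assumption: ties push, so only the non-tie outcomes matter for the price
p_a_no_tie = p_a / (p_a + p_b)

print(f"A wins {p_a:.3f}, B wins {p_b:.3f}, tie {p_tie:.3f}")
print(f"A excluding ties: {p_a_no_tie:.3f}, fair line {to_american(p_a_no_tie):+d}")
print(f"Sanity check from the post: 65% -> {to_american(0.65):+d}")
```

If your book grades ties differently (some list a three-way price), the tie handling is the part to change.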

This is a really simple model but believe it or not, it worked great for like 3 years straight. I think the linemakers would simply take each player's raw points per game and just use that as their line. Most of my bets would be unders on guys playing against top 5 defensive teams, or overs against really bad defensive teams. You'd also sometimes catch guys running good or bad on getting their shot attempts through.

Adjustments:

I made many adjustments along the way with this model. First, I changed the way I calculated assists. Assists in NHL game logs are counted two ways, primary and secondary assists. Primary assists are when you pass the puck to the goal scorer, secondary assists are when you pass the puck to someone who passes to the goal scorer. Primary assists are more stable, meaning they're more likely to repeat going forward. So all I did was break my assist column into two columns, primary and secondary. For primary assists I would multiply by something like 1.2 and for secondary I would multiply by something like .8. (This will probably horrify certain people, but I just completely made those numbers up. I would monitor them closely and would change them slightly every now and then, but that was the basic idea. The big thing, in my eyes anyway, was to make sure I was taking away from secondary assists the same amount I was adding to primary assists.)
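As a tiny sketch of that weighting, assuming the 1.2 and .8 are applied to per-game primary and secondary assist rates (the function name, the `shift` parameter, and the sample rates are hypothetical):

```python
def weighted_assists(primary_per_game, secondary_per_game, shift=0.2):
    """Bump primary assists up and secondary assists down by the same multiplier shift.

    shift=0.2 gives the 1.2 / 0.8 multipliers mentioned above; those were
    eyeballed numbers, not fitted ones.
    """
    return primary_per_game * (1 + shift) + secondary_per_game * (1 - shift)

# Hypothetical player: 0.30 primary and 0.20 secondary assists per game
print(weighted_assists(0.30, 0.20))
```

Keeping a single `shift` knob is one way to make sure whatever gets added to the primary multiplier comes off the secondary one.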

The other big change was that I changed the way I adjusted for opponent. It occurred to me that I should try to use the market game line and total in some way. I knew there was a way to extrapolate each team's expected goals using the game line and total, but I didn't know how exactly. First, I tried to reverse engineer each team's expected goals through Poisson using their team totals, which wasn't a bad idea but didn't quite work since team total markets aren't exactly efficient. This is where I paid for a piece of this model. I reached out to TomG from the message boards, who was/is a true OG, one of the best old school DFS guys out there and I know is reading this right now (shoutout Tom, leave a comment!). For 50 bucks, he sold me his little Pythagorean-theorem way of inputting the game line and total so that it spits out each team's expected goals. This was huge.

So now I had each team's expected goals for each individual matchup. I would simply use this number against their season long GF/60 number. For instance, say Team A is expected to score 3.25 goals tonight but for the season they average 2.9 goals per 60 minutes. So you know the team is expected to score more than average. (If a player's raw expected goals per game is .3, you divide that by his team's GF/60 and then multiply that by the team's expected total. So using the example above, you'd get .336.)

More goals equals more assists, too. For assists, I did some work and found that for the most part, teams get assists at a fairly constant rate with regards to how many goals they score. Something like 1.7 assists for each goal. So I changed my expected assists to match their expected goals. (At certain points I did this a little differently too. I would add up each team's goals and assists and then divide them to get each team's "assists per goal ratio" and use that number. Good teams did actually get more assists per goal than bad teams. It depended where we were in the season.)
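Here's the arithmetic from that example as a short sketch. The team's expected goals (3.25) is taken as a given input, since the formula that turns the game line and total into team goals isn't described here and I'm not going to guess at it. The function name is made up for illustration.

```python
def scale_to_team_total(raw_per_game, team_gf60, expected_team_goals):
    """Scale a player's raw expected goals by tonight's team total vs. season GF/60."""
    return raw_per_game / team_gf60 * expected_team_goals

# Numbers from the example: raw xG .3, season GF/60 2.9, tonight's team total 3.25
player_xg = scale_to_team_total(0.30, 2.9, 3.25)
print(round(player_xg, 3))  # 0.336

# Team-level expected assists at roughly 1.7 assists per goal
ASSISTS_PER_GOAL = 1.7
team_expected_assists = ASSISTS_PER_GOAL * 3.25
print(round(team_expected_assists, 3))  # 5.525
```

Swapping `ASSISTS_PER_GOAL` for a team-specific ratio (season assists divided by season goals) gives the variant mentioned in the parenthetical.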

Model Limitations and Problems:

Whenever you're making a model, you have to make a bunch of assumptions, some of which you know aren't great. And you have to constantly be attacking your own model, looking for flaws. For one, my model treated every single shot attempt as being equal. I had a really good idea that I never implemented, which was to count Corsi (all shot attempts), Fenwick (all unblocked shot attempts), shots (all shots on net) and goals (shots that go in) as slightly different, weight them all accordingly and then use that as my Corsi number (any attempt at all is a Corsi).
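Since that idea never got built, the following is only a guess at what it could look like. It splits a player's attempts into non-overlapping buckets (blocked, missed, saved, goal) and weights each one. Every weight below is a placeholder with nothing behind it, and the function name is made up.

```python
def weighted_attempts(corsi, fenwick, shots, goals, weights=None):
    """Collapse nested shot counts into one weighted 'Corsi-like' number.

    corsi >= fenwick >= shots >= goals, so the buckets are the differences.
    """
    if weights is None:
        # Placeholder weights, purely illustrative
        weights = {"blocked": 0.8, "missed": 0.9, "saved": 1.1, "goal": 1.3}
    blocked = corsi - fenwick   # attempts that were blocked
    missed = fenwick - shots    # unblocked attempts that missed the net
    saved = shots - goals       # shots on net that were stopped
    return (weights["blocked"] * blocked
            + weights["missed"] * missed
            + weights["saved"] * saved
            + weights["goal"] * goals)

# With every weight at 1.0 this is just plain Corsi again
flat = {"blocked": 1.0, "missed": 1.0, "saved": 1.0, "goal": 1.0}
print(weighted_attempts(10, 8, 5, 1, weights=flat))
```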

The second one, and probably the biggest issue with this model, was that I completely ignored special teams. I tried for a long time to incorporate power plays and penalty kills, but for this model I never did (I did eventually, which I'll get into later). My reasoning was that if I just use "all situations", I was taking into account power play time/points anyway. The glaring issue here though is that you can't take into account the fact that a team is playing an opponent that takes or draws a lot of penalties (or vice versa). And getting even more granular, certain individual players could draw penalties more than average. Sometimes a lot more than average. Incorporating that, as well as injuries, was another thing always on my to-do list.

Even with these problems, this model crushed. Eventually I completely ran out of books to take my action on this prop so I had to adjust. My main book at the time actually just took this entire bet off the board, but there were still places to bet on shots on goal and individual player points over/under.

Shots on goal seemed like a really easy prop to attack. I used the same logic as above but just used Shots where I had used Goals. That was easy and worked fine, but I wasn't beating 'will player A score a goal, yes or no'. I realized that my model was great for matchups but the numbers it was spitting out weren't all that accurate. They were accurate against each other, but not on their own. So this is where I basically started over and created my end boss NHL model.

The first thing I knew I needed to do was incorporate special teams. So for each matchup, I had 'overall PP time', 'PP TOI per game', and 'penalties drawn and penalties taken'. Then you need the league wide average power play time per penalty (which is slightly below 2 minutes because sometimes you get a penalty while on the power play. This was another place you could find slight edges. Some teams would run good or bad at turning their penalties drawn into actual PP time. For example, getting a penalty while on the PP or vice versa). So I would run their penalties drawn against the opponent's penalties taken vs the average team and I would get each team's expected power play minutes. You then take each player's power play share (the percent of the team's overall power play time that the player is on the ice for) and you have the expected time on ice on the power play for each player. (I also regressed this number to the mean a little bit.)
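A rough sketch of that power play step, under a few assumptions that aren't spelled out above: the drawn-vs-taken comparison is done the same multiplicative way as the earlier opponent adjustment, and "regressing to the mean" is a simple weighted blend toward the player's season PP TOI per game. All names and input numbers are hypothetical.

```python
def expected_pp_minutes(team_drawn_pg, opp_taken_pg, league_avg_pg, minutes_per_penalty):
    """Expected team power play minutes for one game.

    team_drawn_pg: penalties the team draws per game
    opp_taken_pg: penalties the opponent takes per game
    league_avg_pg: league average penalties per team per game
    minutes_per_penalty: league average PP time per penalty (a bit under 2)
    """
    expected_penalties = team_drawn_pg * opp_taken_pg / league_avg_pg
    return expected_penalties * minutes_per_penalty

def regress(value, mean, weight):
    """Blend value toward mean; weight=0 keeps value, weight=1 returns mean."""
    return (1 - weight) * value + weight * mean

# Hypothetical matchup: a team that draws a lot vs. an opponent that takes a lot
team_pp = expected_pp_minutes(3.8, 4.0, 3.5, 1.8)

# Hypothetical player on the ice for 60% of his team's PP time,
# regressed 25% toward his season average of 3.0 PP minutes per game
player_pp_raw = team_pp * 0.60
player_pp = regress(player_pp_raw, 3.0, 0.25)
print(round(team_pp, 2), round(player_pp, 2))
```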

Then I had three sections for each player: All situations, Power plays, and All situations minus Power Play time (which is a proxy for 5v5 plus penalty kill time). I would do each player's Corsi into shots and then shots into goals per minute for all situations and PP. The difference would be all situations minus PP. You add up the outputs for PP plus all situations minus PP and you get the player's expected shots, from which you can then get expected goals.

I was constantly changing and tweaking this model and there's plenty of stuff I didn't get into because it's pretty technical (and it was so long ago it's hard to remember exactly what I did and when). The end result though was that I would get each player's expected Corsi and shots on net per minute for PP time and All situations minus PP time. Multiply each rate by his expected time on ice in that situation and you get his expected output for the game.
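Putting the per-minute pieces together might look something like the sketch below. The rates, conversion percentages, and ice times are invented, and the Poisson step for pricing an over/under is the same assumption used for the matchup bets, not something confirmed for this version of the model.

```python
import math

def shots_per_minute(corsi_per_minute, corsi_to_shot_pct):
    """Turn a Corsi rate back into a shots-on-goal rate."""
    return corsi_per_minute * corsi_to_shot_pct

def expected_shots(pp_rate, pp_toi, non_pp_rate, total_toi):
    """Expected shots = PP rate * PP minutes + non-PP rate * remaining minutes."""
    non_pp_toi = max(total_toi - pp_toi, 0.0)
    return pp_rate * pp_toi + non_pp_rate * non_pp_toi

def prob_over(line, lam):
    """P(X > line) for a Poisson mean lam; meant for half-point lines like 2.5."""
    k_max = math.floor(line)
    cdf = sum(math.exp(-lam) * lam ** k / math.factorial(k) for k in range(k_max + 1))
    return 1.0 - cdf

# Hypothetical player: 18 total minutes, 3 of them expected on the power play
pp_rate = shots_per_minute(0.50, 0.50)      # 0.25 shots per PP minute
non_pp_rate = shots_per_minute(0.24, 0.50)  # 0.12 shots per non-PP minute
lam = expected_shots(pp_rate, 3.0, non_pp_rate, 18.0)

print(f"expected shots {lam:.2f}, P(over 2.5) = {prob_over(2.5, lam):.3f}")
```

The same `prob_over(0.5, expected_goals)` call gives a 'will he score, yes or no' probability, which is the prop that never ran as well.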

This model worked great for Shots but for some reason, never really did great on yes/no expected goals. I'm not really sure why. Goals are very fluky and since I wasn't super confident in it, I only tried it sparingly so it's entirely possible I just ran bad. But I'm not quite sure about that.

Anyway, that's about it. As you can see, it was a lot of work. For each game, I would have to put in all the team info and then go bet by bet, player vs player. TOI, corsi, shots, plus the last 4 years of corsi into goals or shots to get their corsi into goals/shots percentage. And that's just half the battle. Once you get the outputs and turn them into betting terms, now you have to actually place the bet which is a whole 'nother thing.

If anyone out there would like this model sent to them and have me walk them through how to use it, reach out to me and we can probably work something out. I'd probably charge less than a grand for a complete walk-through. Honestly though there's enough info here to at least get you on your way. But there are a few things I did leave out.

I know this was a little bit clunky. Looking back, it's crazy to me how long ago this was and how much time and effort went into it. I don't regret anything but man, it did take up a lot of my time. 

Check back soon, I have some crypto and politics stuff I want to get into. 

BTC price: $71k

BTC market cap: $1.4T

BTC dominance: 53%





