ABOUT HASLAMETRICS
Hi, my name is Erik Haslam. I am a full-time electrical engineer and a self-taught, part-time disciple of college basketball analytics. I have been a Wisconsinite all my life, growing up in South Milwaukee and later spending my college years at the University of Wisconsin-Madison from 1992 to 1996. Despite failing to achieve my dream of making the Wisconsin basketball team (a five-time walk-on and, thus, a five-time reject), I earned a varsity letter at the UW in 1996 as a member of the Wisconsin rowing team. After graduating in December of 1996 with a Bachelor of Science degree in Electrical Engineering (with a healthy dose of Computer Science classes thrown in along the way), I began my professional career as an engineer in the electric utility industry, a role I still maintain to this day. Circa 1990, I became deeply intrigued with the NCAA Men's Division I Basketball Tournament. By the mid-1990s, I made a point to obtain my USA Today newspaper first thing the day following Selection Sunday so I'd have my team capsules on hand to make intelligent decisions when picking my brackets. Once I graduated college (and was no longer obligated to attend classes), my employers knew well ahead of time not to expect me at work on the first Thursday and Friday of March Madness. In fact, I still take vacation on each of those days every single year. Eventually, the approach of perusing published team capsules wasn't good enough, and, by the mid-2000s, I began exploring the world of advanced, possession-based basketball metrics, teaching myself the basics of data mining, normalized distribution, stepwise regression methods, and variable selection. That research combined with years of professional experience in coding and SQL database administration/usage allowed me to design scripts and applications to help me rate each team and create methodology to predict the outcomes of college basketball contests. I reserved the domain name Haslametrics.com in March of 2014. 
My goal is to provide unique statistical insight and to offer predictive analysis based on teams' prior performances in a given season. The data displayed on this site is free of charge, so use it however you'd like. Will these numbers help you predict a perfect bracket? Not a chance. That goal is not realistically attainable, and as stated, the statistics provided here evaluate past performance, not future performance. Nonetheless, past performance and future performance are undoubtedly correlated. Therefore, if you're not familiar with many of the teams in college basketball this season, you could do a lot worse than making your bracket selections based on some of the numbers I'm supplying you here. On a personal note, I presently reside in Oregon, WI, a village about 15 miles south of the University of Wisconsin-Madison, with my lovely wife (Britt), our two beautiful daughters (Brielle and Alyssa), and our two pups (Maizy and Nilla).
ABOUT THE METHODOLOGY
I've explored many ways to categorize and rate each of the teams in Division I college basketball. So what makes my numbers unique?
First of all, I have sworn off several of the more popular methods endorsed by hoops stats enthusiasts to rate teams. This includes, most notably, Dean Oliver's "Four Factors of Basketball Success" (effective field goal percentage, turnover percentage, offensive rebounding percentage, and free throw rate). Together, these variables are frequently utilized to construct a regression equation to predict offensive and defensive efficiency. After experimenting with linear and logistic regression for years, I have chosen to reject that methodology in my predictive analysis based on the opinion that sample sizes for one particular team in one single season are just too small. I believe there just isn't enough data available to properly formulate a reliable equation that way.
Second, I have narrowed my analysis down to the bare necessities: specifically, how often teams shoot, how close to the basket each shot is, how well teams shoot from different locations on the floor, and how often steals and offensive rebounds affect the shot selection and success. One must also factor in those same traits from a defensive perspective (e.g. how often, how well, and from where a team's opponent shoots). Using this shooting data alongside an estimate of the number of trips upcourt a team and its opponent will make, I can scientifically make a prediction for the result of any contest.
Third, based on play-by-play logs that I have collected and parsed, I only utilize data for a particular game where the outcome of said contest is still in question. Using a formula to determine when a game is essentially "over," I can truncate data that is likely to be contaminated by bench players ("scrubs") getting time on the floor when a lead is out of reach.
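A minimal sketch of the idea in the second point above, written in Python: shot frequency, shot location, and accuracy are combined with an estimate of trips upcourt to project a score. This is an illustration under assumptions, not the formula used on this site. The zone names, the ShotProfile structure, and every number are hypothetical, and free throws, steals, offensive rebounds, and the defensive side are left out for brevity.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ShotProfile:
    """Hypothetical per-zone shooting profile for one team (not the site's model)."""
    attempts_per_trip: float  # how often the team shoots from this zone per trip
    make_pct: float           # fraction of those attempts that go in
    points: int               # value of a made shot from this zone


def projected_points(profile: Dict[str, ShotProfile], trips: float) -> float:
    """Expected points = trips * sum over zones of (attempt rate * make pct * value)."""
    per_trip = sum(
        zone.attempts_per_trip * zone.make_pct * zone.points
        for zone in profile.values()
    )
    return trips * per_trip


# Made-up numbers for illustration only.
team_a = {
    "near": ShotProfile(attempts_per_trip=0.40, make_pct=0.60, points=2),
    "mid": ShotProfile(attempts_per_trip=0.20, make_pct=0.40, points=2),
    "three": ShotProfile(attempts_per_trip=0.30, make_pct=0.35, points=3),
}

print(projected_points(team_a, trips=70))
```

Running the same calculation for the opponent, with each side's rates tempered by the other's defensive tendencies, would give a projected margin. How that blending is done is not described here, so it is left out of the sketch.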
I should mention that, while play-by-play data is available for over 90% of college basketball games, it is not available for all games. Despite this fact, I still get more informative "bang-for-the-buck" statistics from using play-by-play data vs. the box score data typically used by other stats enthusiasts. In cases where play-by-play data is not available, I attempt to extrapolate as much data as possible from the box scores.
Prior to 2020-21, the algorithm I utilize knew nothing about each team's history and, therefore, treated all teams as absolute equals on Day 1 of the season. That resulted in some unfamiliar names appearing near the top of the rankings over the first month or two of the season as the algorithm continued to build a larger and larger sample set. Starting with the 2020-21 season, I began to implement preseason baselines, which are Day 1 ratings estimates based on team prestige, last season's ratings, returning players, transfers, recruiting scores, and coaching changes. This allows me to provide educated estimates of ratings and projections on the very first day of the season. With or without preseason baselines, the ratings should become increasingly accurate over the course of each season as more and more actual results roll in.
This website provides you the option of viewing either time-dependent data (the default) or time-independent data. Time-dependent data puts a greater weight on recent contests and is a good indicator of how good a team is as of a specific date during the year. Time-independent data puts equal weight on all games, regardless of when they took place. It is primarily used to evaluate a team's overall performance throughout an entire season. Time-independent data is not provided until the season has been completed each year.
Bracketology estimates you see on this site are a service I started to provide in early 2016.
These estimates are based on a combination of each team's present power rating (i.e. all-play percentage) mixed with the value/damage of each win/loss (i.e. record quality). The estimates I provide are all about "deserves": specifically, which teams deserve inclusion and what seed each of those teams has earned. These projections are not intended to predict how the Selection Committee itself will ultimately decide the field of 68, a process we all know to be greatly flawed. There are plenty of other sites out there that do that. Instead, my estimates seed the teams largely according to overall performance instead of wins and losses, and they are powered by the team ratings you see on the main page. Keep in mind that basketball analytics is like wine tasting: there is no right or wrong answer. This is merely one man's method to mold the numbers into something meaningful. Therefore, if you disagree with my philosophies, I will not presume your methodologies to be inferior. In fact, it's quite the opposite. I'm always intrigued by new and creative ways to shape game data into something unique and useful. Feel free to shoot me an email or contact me on X (@haslametrics) with any questions or suggestions you have. I'll try to respond as quickly as possible.
ABOUT THE DATA
Many of the fields on this site that measure team quality require more explanation. Here is a list of those fields along with a description of each one:
All fields listed in bold above are utilized for rating both a team's offense and a team's defense. For example, Team "A" not only has an offensive efficiency (the number of points scored per 100 trips vs. the average opponent) but also a defensive efficiency (the number of points allowed per 100 trips vs. the average opponent). All ratings are adjusted for the quality of each opponent, and game data is eliminated once a contest has been deemed to be mathematically decided (or, as we say on X, #AnalyticallyFinal).
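For readers unfamiliar with per-100-trip figures, the raw arithmetic can be sketched as follows. The function name and sample numbers are hypothetical. The opponent adjustment and the removal of "analytically final" game data are not modeled, so this shows only the unadjusted starting point.

```python
def efficiency_per_100(points: float, trips: float) -> float:
    """Raw points per 100 trips upcourt; no opponent adjustment is applied."""
    if trips <= 0:
        raise ValueError("trips must be positive")
    return 100.0 * points / trips


# Made-up single-game totals for illustration only.
offensive = efficiency_per_100(points=77, trips=70)  # points scored per 100 trips
defensive = efficiency_per_100(points=63, trips=70)  # points allowed per 100 trips

print(offensive, defensive)
```

A higher offensive figure and a lower defensive figure are both better, since the first counts points scored and the second counts points allowed.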