Monday, September 18, 2017

Alternative Season 25 Clustering

So, I've been doing these historical analyses since season 16, but season 25 has seen a big jump in views.  For that reason, I want to reemphasize that these are not meant to be predictions, and to share a bit more analysis.

One challenge in these data analyses is identifying a comparable group.  Back when I started these analyses, I chose to split combat, basketball, and tennis athletes ("CBT") from other athletes, because they had done poorly through the first 15 seasons.  That means that Monica Seles is still in Nikki Bella's comparison group.  It was tempting to drop Monica from that group.  Instead, I broadened the age range to include Paige VanZant, despite her being over a decade younger than Nikki.  Had I dropped Monica instead, the final clusters would have been:


Notice that Nikki's historical average would have filled a gap between Nick and Terrell, pushing both Nikki and Terrell up into Cluster 2 and collapsing Clusters 3 and 4 into a single Cluster 3.  A "within-one" interpretation of these clusters would yield more conservative results, broadening the "likely" range for all couples.  However, three clusters are a lot less interesting than four, because a couple in the middle cluster can place literally anywhere and still land within one cluster of where they were grouped.  That was partly why I went with the more interesting four clusters.
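To make the "within-one" point concrete, here is a minimal sketch in Python.  It assumes the clusters are simply numbered 1 through k in rank order, and the function name `within_one` is made up for illustration; it is not the code behind the actual analysis.

```python
def within_one(cluster, n_clusters):
    """Return the clusters that count as 'within one' of the given cluster.

    Assumption: clusters are numbered 1..n_clusters in rank order.
    """
    return [c for c in range(1, n_clusters + 1) if abs(c - cluster) <= 1]


def covers_everything(cluster, n_clusters):
    """True if a within-one reading of this cluster spans every cluster."""
    return len(within_one(cluster, n_clusters)) == n_clusters


for k in (3, 4):
    for cluster in range(1, k + 1):
        print(k, cluster, within_one(cluster, k), covers_everything(cluster, k))
```

With three clusters, the middle cluster's within-one range is all three clusters, so any finish is consistent with it.  With four clusters, no cluster's within-one range spans everything, so every grouping rules out at least one outcome.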

My plan was to see how this season played out before deciding whether the "athlete" category should be re-categorized for future analyses.  Given the unexpectedly high view total, I wanted to share this information now.  Again, these results aren't meant to be predictive, but they can be interesting and insightful for what's historically more or less likely.
