Monday, September 18, 2017

Alternative Season 25 Clustering

So, I've been doing these historical analyses since season 16, but season 25 has seen a big jump in views.  For that reason, I want to reemphasize that these are not meant to be predictions, and to share a bit more analysis.

One challenge in these data analyses is identifying a comparable group.  Back when I started these analyses, I chose to split combat, basketball, and tennis athletes ("CBT") from other athletes, because they had done poorly through the first 15 seasons.  That means that Monica Seles is still in Nikki Bella's comparison group.  It was tempting to drop Monica from that group.  Instead, I broadened the age range to include Paige VanZant, despite her being over a decade younger than Nikki.  Had I dropped Monica instead, the final clusters would have been:


Notice that Nikki's historical average would have filled the gap between Nick and Terrell, pushing both Nikki and Terrell up into Cluster 2 and collapsing Clusters 3 and 4 into a single Cluster 3.  A "within-one" interpretation of these clusters would yield more conservative results, broadening the "likely" range for all couples.  However, three clusters are a lot less interesting than four, because a star in the middle cluster can place anywhere and still finish within one cluster of where they started.  That was partly why I went with the more interesting four clusters.

My plan was to see how this season played out before deciding whether the "athlete" category should be re-categorized for future analyses.  Given the unexpectedly high view total, I wanted to share this information now.  Again, these results aren't meant to be predictive, but they can be interesting and insightful for what's historically more or less likely.

Saturday, September 16, 2017

DWTS Season 25 - Historical Comparisons


First, a preview / graphical summary of the ensuing DWTS Season 25 historically-based analysis:

Note: See link for an alternative clustering.

This Season 25 analysis is similar to the Season 24 analysis.  These analyses are not meant to be predictive.  They are merely an assessment of relative strength based on historical comparisons.  Job-age comparison groups for each star are listed below.  The "job" categories come from the defunct "Cast DWTS" game formerly on ABC's website.  The age ranges are typically the star's age +/- 5 years (with exceptions noted).  Past season results are scaled from 1-12 (representing 1st place through 12th place) so that results are comparable regardless of field size.  Withdrawals and All-Star season results are excluded.
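For anyone who wants to try the 1-12 scaling at home, here is a minimal sketch.  The exact formula isn't spelled out above, so the linear rescaling below is an assumption: first place always maps to 1 and last place always maps to 12, whatever the field size.  The function name and parameters are made up for illustration.

```python
def scale_placement(place: int, field_size: int, scale_max: int = 12) -> float:
    """Map a finishing place in a field of `field_size` onto a 1..scale_max scale.

    ASSUMPTION: a simple linear rescaling. The original analysis may have
    used a different rule for putting seasons on a common 1-12 scale.
    """
    if field_size < 2:
        raise ValueError("field_size must be at least 2")
    if not 1 <= place <= field_size:
        raise ValueError("place must be between 1 and field_size")
    # 1st place stays at 1; last place lands on scale_max.
    return 1 + (place - 1) * (scale_max - 1) / (field_size - 1)


# Hypothetical example: 6th place in an 11-couple season.
print(scale_placement(6, 11))
```

With this rule, a 12-couple season is left unchanged, while smaller or larger fields are stretched or squeezed to fit the same scale.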

The comparison groups are listed from strongest average to weakest average.  (Scroll down to view all groups):

Notes: Age range was adjusted for Lindsey, Jordan, Frankie, Nikki, Nick, Barbara, and Debbie because the default range gave too many or too few comparisons.  Job categories were broadened for Drew and Barbara because of inadequate comparison.  "Singer" was the closest available category to "musician" for Lindsey, but that's admittedly a debatable point.

Here's a summary of the above age-job group averages, arranged in ascending clusters of average placement:


Here are the pro averages, from strongest to weakest.  (The average placement for new pros from seasons 4-25 is indicated in red):

Next are the height group averages, for same gender height +/- 1 inch.  These are especially interesting for season 25, imho:


Next are the overall weighted averages*.  They are arranged in clusters of average placement, from strongest to weakest.  Wildcards (highlighted in blue) are Victoria (physically challenged, but with an inspirational story) and Debbie (new pro).  EDIT: Marking Nikki as a wildcard, due to problematic comparison group (explained in addendum post).

*Technical Note: Averages are weighted as 50% Age-Job average, 33% Pro average, 17% Height average, based on correlation analysis of weighted historical averages versus actual historical results.
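As a rough illustration of the weighting in the technical note, here is a small sketch.  The 50/33/17 weights come from the note; the function name and the sample averages are hypothetical and are not taken from the actual tables.

```python
# Weights from the technical note: 50% age-job, 33% pro, 17% height.
WEIGHTS = {"age_job": 0.50, "pro": 0.33, "height": 0.17}


def weighted_average(age_job: float, pro: float, height: float) -> float:
    """Combine the three historical averages (each on the 1-12 scale)."""
    return (
        WEIGHTS["age_job"] * age_job
        + WEIGHTS["pro"] * pro
        + WEIGHTS["height"] * height
    )


# Hypothetical star: age-job average 4.0, pro average 6.0, height average 8.0.
print(round(weighted_average(4.0, 6.0, 8.0), 2))
```

Because the weights sum to 1, the result stays on the same 1-12 scale as the inputs, and a lower number still means a stronger historical placement.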

Comments:
First, a reminder that these are not predictions.  Inevitably, some stars will beat their historical averages and some will fall short.  That said, based on historical comparisons, stars within the same cluster are likely most competitive with each other relative to the field.  Also, stars are more likely to finish within one cluster of their historically-based clustering than to finish further away.  (This was true for the Season 24 analysis.)

Based on historical comparisons, it would be less likely that stars in the top cluster finish lower than 7th; stars in the 2nd cluster finish lower than 10th; stars in the 3rd cluster finish higher than 4th; and stars in the bottom cluster finish higher than 8th.  (These results are summarized graphically in the figure topping this post.)  I'll leave it to the reader to predict their most likely "less likely" results.
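The cluster bounds in the paragraph above can also be written as a small lookup.  This is only a sketch of those four statements, with a hypothetical function name; the field size is passed in as a parameter rather than assumed.

```python
# (best, worst) "more likely" finishing range per cluster, taken from the
# paragraph above. None means "no limit on that side" (i.e. last place).
CLUSTER_BOUNDS = {
    1: (1, 7),     # top cluster: less likely to finish lower than 7th
    2: (1, 10),    # 2nd cluster: less likely to finish lower than 10th
    3: (4, None),  # 3rd cluster: less likely to finish higher than 4th
    4: (8, None),  # bottom cluster: less likely to finish higher than 8th
}


def likely_range(cluster: int, field_size: int) -> tuple[int, int]:
    """Return the (best, worst) places that are historically more likely."""
    best, worst = CLUSTER_BOUNDS[cluster]
    if worst is None:
        worst = field_size
    # Never report a worst place beyond the size of the field.
    return best, min(worst, field_size)


# Hypothetical usage with a 13-couple field (field size is an assumption here).
for cluster in sorted(CLUSTER_BOUNDS):
    print(cluster, likely_range(cluster, 13))
```

Anything outside the returned range is one of the "less likely" results, not an impossible one.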