Odyssey Comparative Rating System

Bob
Catspaw Rocks!
Posts: 705
Joined: September 2006
Location: The Metroplex

Post by Bob »

I've thought for some time about the issue of providing accurate ratings for Odyssey episodes and the problems that the current systems have.

I think the five-star system that is commonly used, the percentile system (as seen on AIOWiki), and, as much as I hate to admit it, my personal rating system (the one you can see in many of my reviews, next to the traditional five-star rating) all leave something to be desired.

The symptom of this problem appears when you realize that very few episodes have truly bad ratings. Even episodes that are almost universally considered to be bad (such as "Fairy Tal-E-Vision" and "Idol Minds") have come off relatively strong in the percentile ratings, and the others are all rated at 90%. This might not sound like a problem, until you consider that "Karen" (which won the first Odyssey episode competition, and I think is considered a great episode) is rated slightly lower than the first two Mulligan episodes.

The five-star system isn't much better, as just about any episode that's considered to be good or average will usually hit 4 stars ("above average"), and it has to be unthinkably awful (or have a character that someone doesn't like) to get reviewed as a 1.

As far as my rating system goes, well, my ratings are directly based on the rating system that's used in 2K hockey games. The less said about that the better.

The Odyssey episode competitions, I think, are a step in the right direction, but they are very slow-paced, don't allow many episodes to be ranked at a time, and have a limited set of episode comparisons (plus a limited number of reviewers, which skews the results).

What I want to do is create something like the Kitten War page, or hockey-reference's NHL Elo rating system. With every episode ever created in one long list, we could determine which is the best by having people go on and click which one they think is better, over and over again, until there's a stable list of rankings.

What I'd like to know is what sorts of ratings you think should make up this list, though. Not all episodes are made alike; for instance, some are serious, and some are not. Some are "easy listening" and some are a bit deeper.

I have three questions that I think I would like to ask in each matchup between episodes:
"Which is better written?"
"Which is more entertaining?"
"Which is more applicable to your life?"

I'm not sure these are the best questions to ask, though, to rate episodes, so I'd like to get input on this.

Thanks for your time, and for trudging through my difficult writing style.
Reddo
Catspaw Rocks!
Posts: 858
Joined: March 2008
Location: Smalltown, Saskatchewan

Post by Reddo »

yeah, pretty much every rating system is broken in some way; most of the time it comes down to people not rating things consistently or just rating 0 or 100 and nothing in between.

making a list of all the episodes would be a little overwhelming, and if you can only click one episode there will be a lot that are tied at 0 votes or something. how do those compare to each other? also that would bias the older episodes over time. Say you have the system open for a whole year and then a really awesome episode comes out: even though people think it's good, it will take a long time for it to catch up to the other episodes.
Dallas R.
My posts are revolutionary
Posts: 403
Joined: January 2009
Location: Wisconsin

Post by Dallas R. »

I would love to come up with a better rating system. I hate the fact that almost every episode averages about 4 stars. When I rate an average episode as a three, and everyone else gives it fours, I feel as if I'm saying I didn't like the episode very much. Case in point: I want to give an episode like "Never for Nothing" four stars because, while it was very good, it still isn't 'The Best'. I save my five stars for episodes like "Exit" and "The Mortal Coil". However, then we get episodes like "Wooton's Broken Pencil Show". I was in the minority who did enjoy this show, but not nearly as much as "Never for Nothing". I won't give it a two, because that seems as if I really disliked the episode, but giving it a three would put it on the same level as everything just below "Never for Nothing". So where do episodes like "How to Sink a Sub" and "You're Two Kind" fit in with this spectrum? They're better than Broken Pencil Show, but not as good as "Never for Nothing". I would love to have a percentage rating where people put in more accurate judgements.

The full list idea would be cool, though I don't think you should make it so that someone can select an episode more than once. Maybe you could have the full list of episodes, and people could choose their top ten comedy episodes, top ten serious episodes, and top ten all-around great episodes. It might be a good idea to have a bottom ten in each category as well.
Bob
Catspaw Rocks!
Posts: 705
Joined: September 2006
Location: The Metroplex

Post by Bob »

@Reddo: Well, the thing about this system is that it only presents you two choices at a time. You can look at the complete list of ratings for reference, to see what people think the best episodes are, but you don't have to go through every single one and rate them above each other, which makes it easier. ;)

How Kitten Wars (and the NHL Elo system) work is that they pick two choices out of the whole list, and you choose which of those you think is better. Over time, and with many ratings, the idea is that you should get a decent list of how all of the choices stand, relative to each other.

Kitten Wars solves the 0 vote problem by having a percentage system (so 1 victory and 0 losses means it has a 100% rating); for the NHL player rating system, they start them off at a "base rating" of 1500 or so, so new players are assumed to be "average", and then the voters skew them one way or another. Either of these choices is better than having no rating at all. The Kitten Wars system is better in that it allows an episode to climb through the ranks more quickly (which helps people bring great new eps to the top), but it also requires a lot more reviews before you can clearly rank every episode, whereas the Elo system, after some setup time, should be more precise and rarely have duplicate ratings.
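For anyone who wants to see the two approaches side by side, here's a rough Python sketch. It is only an illustration, not how either site is actually implemented: the 1500 base rating comes from the description above, but the K factor of 32, the 400-point scale (both standard Elo conventions), and all of the function names are my own assumptions.

```python
import random

BASE_RATING = 1500.0  # starting rating for an unrated episode (figure from the post)
K_FACTOR = 32.0       # assumption: a common Elo step size, not taken from any site


def pick_matchup(episodes, rng=random):
    """Pick two different episodes to show the voter."""
    return tuple(rng.sample(episodes, 2))


def expected_score(rating_a, rating_b):
    """Standard Elo estimate of the chance that A is preferred over B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def record_elo_vote(ratings, winner, loser):
    """Update both Elo ratings after one vote; unrated episodes start at BASE_RATING."""
    r_win = ratings.get(winner, BASE_RATING)
    r_lose = ratings.get(loser, BASE_RATING)
    surprise = 1.0 - expected_score(r_win, r_lose)  # bigger when an underdog wins
    ratings[winner] = r_win + K_FACTOR * surprise
    ratings[loser] = r_lose - K_FACTOR * surprise


def record_percentage_vote(tallies, winner, loser):
    """Kitten-War-style bookkeeping: just count wins and losses."""
    tallies.setdefault(winner, [0, 0])[0] += 1
    tallies.setdefault(loser, [0, 0])[1] += 1


def win_percentage(tallies, episode):
    """Share of matchups won, or None if the episode has never been shown."""
    wins, losses = tallies.get(episode, [0, 0])
    total = wins + losses
    return None if total == 0 else 100.0 * wins / total


if __name__ == "__main__":
    episodes = ["Karen", "Idol Minds", "Clara", "The Time Has Come"]
    ratings, tallies = {}, {}
    first, second = pick_matchup(episodes)
    record_elo_vote(ratings, first, second)          # pretend the voter picked `first`
    record_percentage_vote(tallies, first, second)
    print(ratings, win_percentage(tallies, first))
```

With the percentage version, one vote is enough to send an episode to 100% or 0%, which is why it moves fast but needs lots of votes to settle down. With the Elo version, a single vote between two unrated episodes only moves each of them by half of K.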

@Dallas R.: I definitely sympathize with you. I came up with my sports-game-rating system because I felt the whole 5-star deal wasn't working out for me, and every episode got too many stars -- thus not clearly showing me how "Emily the Genius" compares to "Unbecoming Jay" (since they both have 4-star reviews). At the same time, though, only I really know what I mean when I use my system because I'm probably the only one who plays those games who is also on this webboard.

(For a bit of explanation: in the 2K hockey games, a rating like "75" is a third-line player, something like "80" might be a second-liner, "86" is a first-liner, and 90 and above is a star. Still, this obviously does not exactly correlate to Odyssey episodes -- and in this case, unlike in the games, the practical difference between "1" and "65" for a bad episode is really a moot point.)

That said, though, I don't think selecting an episode more than once should be a serious problem. You have to be able to rate the same episode several times, against a variety of other episodes, to get precise results. Anyway, there's something like 700 x 700 possible episode matchups (roughly 245,000 distinct pairs, if you don't count an episode against itself or the same pair in reverse order), so the odds of you getting the exact same one twice are pretty slim, to say the least.
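To put a rough number on "pretty slim", here's a small Python sketch. It assumes about 700 episodes (the ballpark figure above) and that every matchup is drawn uniformly at random from all distinct pairs, which a real site might not do; the function names are made up for the example.

```python
import math


def distinct_matchups(num_episodes):
    """Number of unordered pairs of different episodes."""
    return math.comb(num_episodes, 2)


def repeat_probability(num_episodes, num_votes):
    """Chance that one voter sees at least one matchup twice in num_votes draws,
    assuming each draw is uniform over all distinct pairs (birthday-problem style)."""
    total = distinct_matchups(num_episodes)
    p_all_different = 1.0
    for already_seen in range(num_votes):
        p_all_different *= max(0.0, 1.0 - already_seen / total)
    return 1.0 - p_all_different


if __name__ == "__main__":
    print(distinct_matchups(700))        # 244650 distinct pairs
    print(repeat_probability(700, 100))  # chance of any repeat within 100 votes
```

So even somebody who sits there and votes a hundred times is unlikely to be shown the same pair twice, as long as the pairs really are picked at random.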

In any case, I don't think it's as easy to rank ten (or thirty, as the case may be) episodes once as it is to pick one out of two episodes however many times it takes to get an accurate rating for all of them, and with the best-of-ten lists, many mid-range episodes would likely not have any rating at all.

I'll provide an example: "The Pretty Good Samaritan". I like it, but there are easily ten other episodes that would fit above it in every case if I was making the lists you mentioned. Thus, my positive feelings about it wouldn't even appear on such a list. All people would find out from the top ten lists are the extremes (whether good or bad), which I think we already know, subconsciously -- everybody knows that people like "The Time Has Come" and "Clara", and that people don't like split episodes. Thus, I'm not sure that making such top-ten lists official would really help us to see how all the episodes compare to each other.