Tuesday, March 15, 2005

25 or 6 to 4

(Should I try to do some more?)

Thanks to Ombudsman reader Dan Dormer (new motto: "doing the Ombudsman's job so he doesn't have to") for pointing me to a post on his blog about Play Magazine's changing scoring system. Over three issues, the magazine changed from a long-used letter-grade system (A+, B-, etc.) to a star-based system (out of four stars) before finally settling on a ten-point scale that the magazine's Editor-in-Chief, Dave Halverson, says makes everyone happy.

In my short history doing game reviews, I haven't run into many issues more contentious than the subject of review scores. When I started writing reviews for my college paper, The University of Maryland Diamondback, I made a point of not including any sort of summary score with my copy. I felt that these scores distract from the actual text of the review and are too discrete to differentiate between varied gameplay experiences.

This worked fine for about two years, until the paper's new section editor asked me to start adding a letter grade to each review, along with a "bottom line" synopsis (example from a review of F-Zero GX: "Sega and Nintendo collaborated on a well made game, but it's a little too hard for its own good"). At the time, I was a little miffed about this, thinking my articles should be good enough to stand on their own. Now, I realize my editor was just giving the readers what they want. A review score provides a good entry point for a reader who might not be sure if he wants to read a specific review. Sure, there will always be some readers who just skim through and read nothing but the score, but they probably aren't your target audience anyway.

Another blow-up over game scoring occurred around the same time at GameCritics.com. A heated message board thread on the subject got started for some reason or other, and the opinions were wild and varied. Up to that point, GameCritics had a 10-point scale for its reviews, with demarcations at half-point intervals (so 9.5 was allowed, 9.7 was not). Recommendations ranged from allowing all tenth-of-a-point values in scores (for more precision), to allowing only whole-number scores (for more separation), to switching to a letter-grade system (for more understanding), to getting rid of scores altogether (for more focus on the words). Eventually people got tired of arguing and none of the proposed changes came to pass.

Personally, I don't care what rating system you use as long as the scores match well with the writing. A person who has read your reviews for a year should be able to guess the rating (within a margin of error) in a blind taste test. Changing the system as Play did can be very confusing for the reader, and should probably be done very rarely.

And a message to all you skimmers: take a few minutes and read past the final score. You might just enjoy yourself.

9 comments:

  1. I find the best numeric system is 10 with .5 increments, allowing for 20-21 options. I've scored about 200 games in the past 4 years, and by that point I need that amount of specificity to distinguish between them, compared to a 10-option system. However, 100-option systems like IGN's are just absurd.

    A letter grading system is what I currently use on my blog, though it is just a conversion from my 20-option system, where an A+ is 10, A is 9.5, A- is 9, and so on. That basically limits it to 13 options, with anything 4 or below getting a common F; by the time you get down that low it doesn't really matter, because crap is crap.

    I tend to appreciate scores because usually I read a review first if I'm interested in a game and second if the score catches my eye. A game I'm not interested in that gets an average score I feel safe in ignoring. Without a score, the review could say the game is great and I probably wouldn't bother reading to find out.

  2. Five stars (with half stars) is precise enough IMO. This gives ten possibilities (if you include zero) which is about the only reasonable gradation you can actually make.

    Why stars instead of a score out of ten? Because once you start getting into numbers, there is the psycho-social pressure to start passing out halves - leading to the twenty point list that Erik described. Plus, the star system is already in use in most movie reviews (usually with four) so it is instantly familiar to the reader.

    This is getting overly scientific, of course, but I don't see much point in weighted averages of graphics and gameplay or game ratings broken down into percentages. When evaluating a game, is there really a meaningful difference between 93 and 96? Is it the same as the difference between 77 and 80? Too many possibilities give the illusion of science behind all this when we reviewers are usually just eyeballing it.

    Readers do want ratings, and I think they serve a purpose to journalist watchers. They are a shorthand way to detect bias, sloppy reviewing, genre unfamiliarity, etc. In an ideal world, we could read every word of everybody's reviews and then distill that into some sort of wisdom or meaning. A review score lets us do that a lot more quickly. It's not a substitute for text, but can help us ascribe meaning to it.

  3. The site I write for uses the 100-point "percentage" scale, and yes, it can get a little silly sometimes differentiating between a 9.1 and a 9.2 or a 6.7 and a 6.8, but you have to admit, GameRankings has basically made the 100-point scale the standard.

  4. John:

    No more than Rottentomatoes has for film reviews. Gamerankings is a compilation site and therefore will use a percentage. This has no bearing on what an individual site or reviewer will or should use.

  5. My vote is for five stars, no half-stars. Having only five options would help the reviewer keep his/her emotions in check and seriously consider where each game belongs in the scheme of things -- and there is plenty of evidence that most game reviewers need help.

    (Is that the scheme Next Gen used? I can't remember if they had half-stars.)

    However, when editors control themselves and commit to staff-wide consistency, it does not matter what system is used, because the standards of such writers will make any system useful and legitimate. Take Gamespot's system, for example. That thing would be an unholy mess in the hands of many reviewers, but their staff is extremely discerning, so it works.

  8. I wonder if there's not some weight to the Siskel and Ebert "Thumbs up/thumbs down" system. The reason we read reviews is to figure out whether a game is worth our money, whether it is "good" or "bad."

    The thing is that for most of the gaming magazines or sites we develop our own "good/bad" ranking scale. An 8 on IGN doesn't necessarily mean a thumbs up, but it certainly does for PC Gamer or Gamespot.

    But I think if magazines started using this thumbs up/thumbs down, or an equivalent of just plain "good/bad," it might encourage people to read more into the essay. A 10 point score system gives away too much for the person to have to read the article to pick up the nuances of the game.

  9. I recently started using a rating system on my website. I resisted it for years because I don't like the whole concept of assigning a score to anything in the creative realm, but I realized that most readers are lazy and easily intimidated by longer reviews. They want some kind of quick summary, particularly if they are just skimming through a listing of reviews trying to choose one to read.

    After a lot of thought, I went with a 4 star system (with half-star increments). I found that the 0-100 percentage system gives way too much flexibility and ends up giving the reader no real point of reference. I also found that even on a scale out of 10, where 5 should generally be the average or median score, in practice 7 usually turns out to be average. On most game sites you will NEVER see a score lower than 5 out of 10, even for the worst games. I don't see the point of this; I think you should always choose a rating system where each rating does actually get used. I also went with 4 stars instead of 5 because 3/5 stars is too convenient as a middle of the road score (although 2 out of 4 essentially becomes the division between good and bad instead).

    It's a tough issue because ratings can be so arbitrary. I think each publication needs to take it upon itself to uphold some sort of consistency. I also hate the idea of rating individual areas of a game. Are graphics and audio as important as gameplay? I think one overall score should be sufficient, because all that matters is the sum of the parts. As others have said, you start to give the illusion that reviewing creative projects is a scientific process... which it clearly is not.
