Call for input: Standardizing on a rating rubric


Aran

We have the Spiderweb tables, which rate scenarios on a single score from 1 to 5, and the Lyceum CSR, which rates from 1 to 10 (afaik).

 

I actually intend the Blades Forge review section to put more focus on the text of the review than the rating, since the former is far more informative to the undecided player or in fact the designer looking for feedback. But since the purpose of peer review is also to classify the scenarios as masterpieces or rubbish, and rank them, it's clear we need some kind of score to compare them.

 

The single score (preferably on the 1-10 scale, since it is a wider range, and the CSR is better maintained) has an appealing simplicity. But what if I want (this is hypothetical) hack and slash? Or what if I couldn't care less about plot or combat as long as there are lots of original riddles? It is with this in mind that I'm proposing a multi-score rubric that better rates specific aspects of scenarios.

 

However, I've released only a single scenario and actually reviewed perhaps 3-4 of the ones I've played. I'm no judge on this process.

 

Ideally, I would like the rubric to be sufficiently general to apply to both BoA and BoE with only minor differences in labels ("scripting / nodework", etc.), as this would greatly reduce technical complexity.

 

Suggestions are quite welcome. :)

Link to comment
Share on other sites

Rubric?

 

I developed one that can be found here.

While I still haven't gotten around to properly applying it (I've only used it for the half-dozen scenarios that I've rated on Spiderweb), theoretical testing suggests that the average scenario will score somewhere between a 4 and a 7 on this rubric, depending on where the author's talents and focus lie.

Tests also suggest that it is difficult to score above a 9 or below a 2.

 

I put more focus on plot, balance, and presentability than I do on gameplay mechanics, because these are the aspects with which players will interact the most. That, and because mechanics vary greatly from scenario to scenario; they become very difficult to standardize.

I also devote a full point out of 10 to custom scripting, because I feel that a solid, creative, and effective original script really does prove the difference between a novice writer and an experienced scenario designer.

However, in the end, I feel that a designer's job is only done properly if the player comes away from the scenario with the desire to re-experience the powerful emotional highs that should come with its execution and completion. After all, the point of a good scenario is to not only tell a good story and give good execution, but to give payoff for the story's correct completion.

Therefore, novelty and replayability are ranked highest among all considered factors.

 

EDIT: fixed formatting in the link.

EDIT 2: After further consideration, I might as well say...

Clearly, this isn't exactly what Aran is going for, as he said, quote, "But what if I want (this is hypothetical) hack and slash? Or what if I couldn't care less about plot or combat as long as there are lots of original riddles?" I just wanted to give an example of how it could be done, not how it should.

 

The ideal rubric would therefore have to balance all elements, instead of having a storytelling emphasis like mine does.

 

I think that an ideal presentation of scenario ratings would be to have a table that presents and compares the average ratings of the individual elements of the rubric for each scenario, as well as the overall score. After all, unlikely as it sounds, the scenario with the best scripting could potentially be the one with the worst plot.

It would also be quite cool to somehow script said table so that you could reorganize it according to which aspect of the rubric you want to look at (so listing by best plot, gameplay, combat, etc.). But as I have minimal web design experience, I have no idea how one could implement that.
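
For what it's worth, the re-sortable table boils down to sorting rows by one column. A minimal sketch in Python, purely for illustration (nothing in this thread says what the site is built with); the scenario names, category names, and numbers are all made up:

```python
# Hypothetical per-scenario category averages; None means nobody rated that category.
table = [
    {"scenario": "Scenario A", "plot": 8.5, "combat": 4.0, "scripting": 9.0},
    {"scenario": "Scenario B", "plot": 6.0, "combat": 7.5, "scripting": None},
    {"scenario": "Scenario C", "plot": 3.0, "combat": 9.0, "scripting": 5.5},
]

def sort_by(rows, category):
    """Return rows ordered best-first by one category; unrated rows go last."""
    rated = [r for r in rows if r.get(category) is not None]
    unrated = [r for r in rows if r.get(category) is None]
    return sorted(rated, key=lambda r: r[category], reverse=True) + unrated

for row in sort_by(table, "combat"):
    print(row["scenario"], row["combat"])
```

Clicking a column header would just call something like `sort_by` with that column's name.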

 

--------------------

The Silent Assassin will now proceed to do the dance of the 2 veils, 3 handkerchiefs, a dust rag, and a shop towel.

Watch your wallets. Trust me.

Link to comment
Share on other sites

It sounds like Aran is actually asking for something like each element of a scenario to be rated, possibly 1-10. Your rubric would work fine, but only if the breakdown of scoring were shown along with the total.

 

—Alorael, who agrees that it might be nice to separate elements. Technical achievements probably mean more within the Blades community than to casual players (who are how numerous...?). Plot is almost always important, though. And how about the septuagenarian eskimos who love mindless combat?

Link to comment
Share on other sites

I think the categories of Lenar's rubric are good, but they needn't be combined into a single overall rating.

  1. Playability
  2. Plot/Exposition?
  3. Combat
  4. Puzzles
  5. Roleplay?
  6. Aesthetic
  7. Custom Graphics
  8. Town Functionality
  9. Novelty
  10. Replayability?
  11. Scripting/Nodework
  12. Outdoor Functionality
  13. Dialogue
  14. Prefab Party?
  15. Treasure Balance?

I added #12 onwards, and the ?'s are the ones that don't quite seem good enough for me.

Each category should have a rating of 1-5 or 1-10 or whatever, plus an N/A option (e.g. for a scenario without an outdoors or without combat, etc.)

Also, some of the above categories could probably be combined.
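
A rough sketch of how per-category averages could treat the N/A option: an N/A is simply left out of that category's average rather than counted as a zero. This is Python for illustration only; the shortened category list and the sample reviews are made up, and `None` stands in for N/A.

```python
from statistics import mean

# A shortened, illustrative subset of the categories listed above.
CATEGORIES = ["Playability", "Combat", "Puzzles", "Outdoor Functionality"]

def category_averages(reviews):
    """Average each category over the reviews that rated it; None means N/A."""
    result = {}
    for cat in CATEGORIES:
        scores = [r[cat] for r in reviews if r.get(cat) is not None]
        # A category nobody rated stays N/A instead of becoming 0.
        result[cat] = mean(scores) if scores else None
    return result

# Two made-up reviews of a scenario with no outdoors.
reviews = [
    {"Playability": 8, "Combat": 6, "Puzzles": None, "Outdoor Functionality": None},
    {"Playability": 6, "Combat": 7, "Puzzles": 9, "Outdoor Functionality": None},
]
print(category_averages(reviews))
```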

Link to comment
Share on other sites

Brett's Rubric for those who haven't seen it. I don't have time to reread it right this second, and I don't remember my impression of it the first time I saw it. Anyway, Brett was pretty big on rubrics if I'm not mistaken, and he was definitely pretty big on designing, so his thoughts on the topic are definitely worth noting.
Link to comment
Share on other sites

Usually I'm not a fan of rubrics; they tend to focus too much on the parts and not as much on the scenario as a whole. But in the case of the Blades Forge, a rubric presents the interesting possibility of, say, searching scenarios based on combat scores or plot scores, which would be pretty cool.

 

I think if you give reviewers a rubric to fill out, you should also have a field for overall score, which would be independent of the sum of the rubric scores and would instead be a subjective 1-10 Lyceum style score. There's still something to be said for coming up with a score based on impressions of the scenario rather than picking apart its strengths and weaknesses.

Link to comment
Share on other sites

I would rarely bother to rate a scenario if it required more than 3 different numeric scores. Enjoyment, replay value, and technical. I would also like to see validation of reviews by allowing users to give a thumbs up or down to the review, depending on how it matches their experience. This will allow a user to pick scenarios that are compatible based on previous reviews. By the same light, and totally abusable, I would like to see the ability to call into question the review (and reviewer) by allowing a series of thumbs down to suspend that right to review. If no one is finding a review to be helpful, should it be allowed?

The notion of rubrics is great, but when put into practice only a few people will take the time and effort to make sure that they are fully utilized. The remainder (me) will just do what they want and not care about the balance. See the Lyceum for what you can expect as a range of reviews, and I would suggest that those submitted during the ratings contests be ignored.

 

:)

Link to comment
Share on other sites

That is a very good notion.

 

Perhaps the score should consist of only ~3 separate rankings (and a subjective total ranking), but the review could still allow the reviewer to select from a list which aspects are strengths/weaknesses of the scenario, and which type of player the scenario would most/least appeal to.

 

This reduces the number of numerical scores the reviewer has to give, and reduces the number of decisions. Deciding if "Difficulty of combat" was a weakness or a strength (or neither) of the scenario is much easier than rating it from 1-10.

 

It doesn't significantly reduce the information the player wants - they won't be searching for a "scenario that ranks around 8/10 in plot on average", but rather a scenario "that around 80% of people think would attract players who care about plot".
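
To make the "80% of people" figure concrete, here is a minimal sketch in Python (for illustration only). The `strengths` field and the sample reviews are hypothetical; the point is just that each reviewer ticks aspects from a list and the site reports the share of reviewers who ticked a given one.

```python
def strength_share(reviews, aspect):
    """Fraction of reviews that ticked `aspect` as a strength, or None if no reviews."""
    if not reviews:
        return None
    hits = sum(1 for r in reviews if aspect in r["strengths"])
    return hits / len(reviews)

# Five made-up reviews; four of them list plot as a strength.
reviews = [
    {"strengths": {"plot", "dialogue"}},
    {"strengths": {"plot"}},
    {"strengths": {"combat"}},
    {"strengths": {"plot", "combat"}},
    {"strengths": {"plot"}},
]
print(f"{strength_share(reviews, 'plot'):.0%} of reviewers list plot as a strength")
```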

 

And yes, meta-moderation will certainly be there. Whether a bad feedback is enough to unpublish a review is another question (personally I'd say leave it up but rank reviews by feedback, and calculate a weighted average score based on feedback). And whether reviewers have a karma - that is, whether feedback on one review can influence their other reviews - is still a different question.
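
One possible shape for the feedback-weighted average, as a Python sketch. The weighting formula is an assumption made for this example, not something decided in this thread: each review's weight is a smoothed share of its helpful votes, so a review with no feedback gets a neutral 0.5 and a universally panned one still counts a little rather than being unpublished.

```python
def review_weight(helpful, unhelpful):
    """Smoothed share of helpful votes (assumed formula); no votes gives 0.5."""
    return (helpful + 1) / (helpful + unhelpful + 2)

def weighted_score(reviews):
    """Average of review scores, weighted by the feedback each review received."""
    if not reviews:
        return None
    weights = [review_weight(r["helpful"], r["unhelpful"]) for r in reviews]
    total = sum(w * r["score"] for w, r in zip(weights, reviews))
    return total / sum(weights)

# Made-up numbers: a well-received 9/10 review and a panned 3/10 review.
reviews = [
    {"score": 9, "helpful": 8, "unhelpful": 0},
    {"score": 3, "helpful": 0, "unhelpful": 8},
]
print(round(weighted_score(reviews), 2))
```

With these numbers the plain average would be 6, while the weighted one leans strongly towards the review people found helpful.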

Link to comment
Share on other sites

My suggestion:

 

Three rating categories (0-10, with 0 being absolutely bereft of any sort of redeeming quality whatsoever and 10 being absolutely perfect, plus N/A), plus an Overall rating and a textual review field.

 

The Overall Rating is independent, but has an optional "Calculate Average" button. It may be too complicated, but then I know you like a challenge, Aran. This button, when clicked, asks the player to relatively weight each rating category, and then uses those weights to calculate a weighted average to put in the Overall Rating. The categories would be weighted on a 1-5 + N/A importance scale, where N/A discounts the category from the average, and the three categories, including those rated N/A, would be weighted evenly by default.
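
A minimal sketch of what the "Calculate Average" button could compute, in Python for illustration. It reflects one reading of the description above: a category is left out of the average if either its rating or its importance weight is N/A (`None` here), and when no weights are given every category gets the same weight. The function name and sample numbers are made up.

```python
def overall_rating(ratings, weights=None):
    """Weighted average of category ratings (0-10) using 1-5 importance weights.

    A category whose rating or weight is None (N/A) is skipped entirely.
    """
    if weights is None:
        # Default: all categories weighted evenly.
        weights = {cat: 3 for cat in ratings}
    total = 0.0
    weight_sum = 0
    for cat, score in ratings.items():
        w = weights.get(cat)
        if score is None or w is None:
            continue
        total += score * w
        weight_sum += w
    return total / weight_sum if weight_sum else None

ratings = {"Writing": 9, "Gameplay": 6, "Aesthetics": 3}
print(overall_rating(ratings))  # even weights
print(overall_rating(ratings, {"Writing": 5, "Gameplay": 3, "Aesthetics": 1}))
```

With even weights this is just the plain mean (6.0); weighting Writing at 5 and Aesthetics at 1 pulls the result up to 66/9, about 7.3.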

 

Each of the three specific categories would also have an explanation:

 

1. Writing

This category encompasses storyline, plot, themes, dialogue, descriptions, cliché or lack thereof, etc. The literary aspects of the scenario.

 

2. Gameplay

This category encompasses combat, puzzles, balance (how well the combat, treasure, EXP, etc. interact), engine modifications (such as the water system in "Destiny of the Spheres" or the rune system in "Nebulous Times Hence"), flow (or something, not sure what to call it. For instance, do you constantly have to blindly guess what you have to do or say to progress the scenario, or are the outdoors so unnecessarily large that it takes three weeks real time to get from one town to the next?), etc. What the player actually does while playing the scenario.

 

3. Aesthetics, Functionality, Etc.

This category includes graphics, spelling and grammar, town design, cutscenes, technical noding/scripting, bugginess, whether the scenario contains far too much gratuitous foul language/lewdness/etc. (if that's important to the person rating the scenario), artistic merit, etc. The presentation of the scenario.

 

Aside from the numerical rankings, the reviewer must have the opportunity, or even be encouraged, to give a Textual Review of the scenario, to explain in depth the strengths and weaknesses of the scenario.

 

Does it sound like a good system to you guys? I know the designers would probably want a separate rating for technical innovation, but ultimately that won't matter at all to most players, and this can always be discussed in the textual review.

 

EDIT: Alternately, perhaps the Blades Forge could include a feature called A Chronology of Technical Innovations or something like that, which chronicles the various innovations and which scenarios created them, and this could be referenced in a Technical Innovations section on each scenario page. The section could be hidden for scenarios without a notable innovation and only be visible if there's something referenced in it.

 

EDIT 2: Fixed a mistake (typed "rated" where it should have said "weighted") and edited the third category slightly.

 

EDIT 3: Also, perhaps a five-star (or some other symbol(s), a Blades of Exile/Avernum theme would be cool, but that's more an aesthetic thing than anything important) Review Usefulness rating system might be preferable to just thumbs up or thumbs down.

Link to comment
Share on other sites

My instincts are similar to ADOS's. The three major components of a successful game (or scenario) really boil down to the storytelling, the gameplay (mainly combat) and the technical part.

 

The "technical part" corresponds to scripting and sloppiness and such, and basically asks: are there any seams showing? Do the typos, dysfunctional scripts, strange combat happenings, contradictions, ugly graphics, bad zonification, or so on interfere with my suspension of disbelief? Or was it constructed competently?

 

Aran -- I think it would be very useful to have some binary (or trinary, etc.) categories used to describe each scenario as well -- more or less separate from subjective ratings -- and would make it much easier both to organize the scenarios and to learn about a scenario at a glance. For example:

 

Prefab Party / Any Party

Singleton Required / Any Party

Epic / Regular Length / Short

Linear / Open-Ended

Many Riddles / Few Riddles

Set in World of Avernum / Medieval / Modern / etc.

Combat / No Combat

 

I'm sure there are others that could be added, or perhaps some of these are unnecessary.

--t

Link to comment
Share on other sites

The inclusion of separate ratings for puzzles, combat, and so on, would be very useful for players like me, who are just looking for something that matches their current mood.

 

The potential categories (among the ones I look at) include: plot, combat, puzzles, atmosphere, and unusual mechanics.

 

For categories like combat, puzzles, and non-standard mechanics it would also be good to include a "High/Medium/Low" rating of how important they are to a scenario. (If I am not in the mood for puzzles, searching for scenarios with low puzzle score wouldn't work, because low puzzle score might mean that scenario is full of bad puzzles, rather than lacks them completely.)

 

In summary, I'd like to see ratings of quality in the 5 categories (plot, atmosphere, puzzles, combat, and non-standard mechanics), as well as an indication of how important each category is to the scenario (for the times when I want to avoid puzzles/combat/whatever).
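
A quick sketch of why the two numbers need to be separate, in Python for illustration; the scenarios and values are invented. Filtering on importance and sorting on quality are independent steps, so "no puzzles, good combat" becomes a straightforward query.

```python
# Made-up data: quality is a 1-10 average, importance is High/Medium/Low.
scenarios = [
    {"name": "Scenario A",
     "quality": {"puzzles": 2.0, "combat": 8.0},
     "importance": {"puzzles": "High", "combat": "Medium"}},
    {"name": "Scenario B",
     "quality": {"puzzles": 5.0, "combat": 7.0},
     "importance": {"puzzles": "Low", "combat": "High"}},
    {"name": "Scenario C",
     "quality": {"puzzles": 6.0, "combat": 9.0},
     "importance": {"puzzles": "Low", "combat": "High"}},
]

def pick(scenarios, avoid, want):
    """Keep scenarios where `avoid` barely matters, best `want` quality first."""
    kept = [s for s in scenarios if s["importance"][avoid] == "Low"]
    return sorted(kept, key=lambda s: s["quality"][want], reverse=True)

for s in pick(scenarios, avoid="puzzles", want="combat"):
    print(s["name"], s["quality"]["combat"])
```

Scenario A has the lowest puzzle score here, but it is full of (bad) puzzles, so searching on the score alone would have returned exactly the wrong scenario.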

 

PS Reviews of reviews are such a good idea that we should take it a step further: reviews of reviews of reviews. :p

 

Seriously, if you don't agree with a review, write your own.

 

EDIT: The thumbs up/down systems are useful for places like Amazon, where they are a good way to sort a list of dozens, or even hundreds of reviews. However, considering that we'll be lucky to have a dozen reviews per scenario in Blades Forge, it would be more beneficial for people to write their own reviews, rather than give anonymous ratings to existing ones.

Link to comment
Share on other sites

Perhaps a search function where you can search for keywords in the Textual Review and sort results by any of the four ratings would help? For instance, you could search for scenarios with the word "puzzles" in the reviews and sort them by Gameplay.

 

Or perhaps, in addition to the four ratings and textual review, there could be a Keywords field, where you could put, say "puzzles" or "atmosphere". It could also be encouraged to place a + sign before keywords that indicate good qualities and a - sign before keywords that indicate bad qualities. For instance, a scenario with good puzzles could have the keyword "+puzzles". Then, you could sort results by the number of "+puzzles" keywords the scenarios received.
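
Something like this, sketched in Python just to illustrate the idea; the scenario names and keywords are invented, and each review counts at most once per keyword:

```python
# Made-up data: scenario name -> one keyword list per review.
scenarios = {
    "Scenario A": [["+puzzles", "-combat"], ["+puzzles"]],
    "Scenario B": [["+atmosphere"], ["-puzzles"]],
    "Scenario C": [["+puzzles"]],
}

def rank_by_keyword(scenarios, keyword):
    """Return (scenario, count) pairs, most reviews carrying `keyword` first."""
    tally = {
        name: sum(1 for keywords in reviews if keyword in keywords)
        for name, reviews in scenarios.items()
    }
    return sorted(tally.items(), key=lambda item: item[1], reverse=True)

for name, count in rank_by_keyword(scenarios, "+puzzles"):
    print(name, count)
```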

 

EDIT: I find combat and puzzles to sort of go together, though. Unless you're a septuagenarian eskimo, good combat will be a sort of puzzle in and of itself. Also, the number of scenarios with puzzles is too few relative to those without to warrant a separate category, in my opinion.

Link to comment
Share on other sites

The point of rating reviews was to give weight to them. A review that was universally given "thumbs down" would have less weight than ... any other review. Also, it would be a way to possibly learn if you could just mimic another player because you found yourself in agreement on all their reviews.

Link to comment
Share on other sites

I'm honestly not sure we have enough reviewers at this point in time to merit a thumb system, as interesting as it sounds. I'm personally not sold on it, even if we had the numbers to make it worthwhile... I am also of the "Don't like it? Write your own." school, if only because it's so difficult to get a substantial number of reviews anyway.

 

As for the actual scoring, I kinda like Aran's strengths/weaknesses idea. Might be because that's how a lot of my Lyceum reviews read, but yeah. I like it. There does still need to be room for a textual review and the classic "random number between 1 and 10" score, though.

Link to comment
Share on other sites

Quote:
This button, when clicked, asks the player to relatively weight each rating category, and then uses those weights to calculate a weighted average to put in the Overall Rating.

That's a lot of input parameters for getting one single value. If the "calculate average" is optional anyhow, it would be enough to make it unweighted. If the reviewer doesn't like the unweighted average, they can adjust it directly without having to mess around with the weights until they get one they like.

Technical Innovations:

There is so far the possibility to reference a scenario in an article. It would be a useful feature to reverse this feature and list the referencing articles on the scenario page, too. As most technical innovations are documented by an article, this would also take the place of that feature.
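
Reversing the reference is essentially building a backlink index. A minimal sketch in Python, for illustration only; it says nothing about how the site actually stores references, and the article titles and scenario names are hypothetical:

```python
from collections import defaultdict

# Hypothetical data: article title -> scenarios that the article references.
articles = {
    "Article 1": ["Scenario A", "Scenario B"],
    "Article 2": ["Scenario A"],
}

def backlinks(articles):
    """Invert article -> scenarios into scenario -> referencing articles."""
    index = defaultdict(list)
    for title, scenario_names in articles.items():
        for name in scenario_names:
            index[name].append(title)
    return dict(index)

print(backlinks(articles))
```

A scenario page would then list whatever `backlinks` holds under its own name, and show nothing if it has no entry.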

Quote:
Prefab Party / Any Party
Singleton Required / Any Party
Epic / Regular Length / Short
Linear / Open-Ended
Many Riddles / Few Riddles
Set in World of Avernum / Medieval / Modern / etc.
Combat / No Combat

Good point - though surely this is pretty objective stuff that the scenario designer can answer when creating the scenario. We don't need the reviewers' opinion on whether the scenario has a pre-fab party; it does or it doesn't.

---

By the way, note that I could place the tagging system that's already on Graphics and Scripts onto Scenarios too. This would allow any kind of flexibility in categorizing scenarios with arbitrary labels.

Also, reviews would be rated at most with a single rating from 1-10, or in Slashdot's and Amazon's style with a "helpful / unhelpful, fair / unfair" binary. This is common practice, and allows a far better moderation than the idea of "choosing a couple of good reviewers who have access to the reviewing system". Seriously, I'll already have a problem getting people to want to contribute. Placing artificial barriers on who gets to do so would be counter-productive. It's not as if I'm giving away rewards for reviewing, so it cannot be portrayed as any kind of privilege. Jury duty?

-----

Also, as Reviews are nodes themselves, I really like the idea of enabling comments for them. This is an opening for discussion and constructive (hopefully?) feedback.

In general, I'll choose the path that is least restrictive on the content the users put on the site. Wherever there is a choice between enabling an option to offer feedback and not doing so, the feedback is better - so rating and commenting on reviews look particularly appealing. I could see the latter option being used by the author too, to respond to criticisms.
Link to comment
Share on other sites
