Brainstorming Company Names Revisited
- Total Cost: $35.94
- 387 HITs
- TurKit code for this experiment: code.js, all files
- TurKit Version: 0.1.42
I’ve been gone for a bit working on a research paper, and then attending conferences, but now it is time to get back to business.
This experiment extends the previous blog post about brainstorming company names. In that post, it seemed like iteration wasn’t making a difference, except to encourage fewer responses. This time, we required each worker to contribute the maximum number of responses. We also reduced that maximum from 10 to 5, since coming up with 10 names felt daunting. This also reduced the number of names we needed to rate, which is the most expensive part of this experiment.
Finally, we decided to show all the names suggested so far in the iterative condition. Previously, we showed only the best 10 names, but selecting those required rating the names mid-experiment, which seemed problematic for a number of reasons. Most notably, it was an awkward blend of using the ratings both as part of the iterative process and as the evaluation metric between the iterative and non-iterative (or parallel) conditions.
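To make the two conditions concrete, here is a minimal Python sketch of the process flow. This is not the actual TurKit code (see code.js for that); the function names, the iteration count, and the stand-in worker responses are all hypothetical, chosen only to illustrate the difference between the conditions.

```python
import random

NAMES_PER_WORKER = 5   # each worker must contribute exactly 5 names
NUM_ITERATIONS = 6     # hypothetical count; see code.js for the real parameters

def ask_worker_for_names(company_description, names_so_far):
    """Placeholder for posting a HIT and collecting one worker's answers.

    In the real experiment this is a Mechanical Turk HIT; here we just
    fabricate strings so the sketch runs. The worker in the iterative
    condition would see names_so_far on the HIT page.
    """
    return [f"name-{random.randrange(10000)}" for _ in range(NAMES_PER_WORKER)]

def iterative_condition(company_description):
    # Each successive worker sees *all* names suggested so far.
    names = []
    for _ in range(NUM_ITERATIONS):
        names += ask_worker_for_names(company_description, names)
    return names

def parallel_condition(company_description):
    # Workers never see each other's suggestions.
    names = []
    for _ in range(NUM_ITERATIONS):
        names += ask_worker_for_names(company_description, [])
    return names
```

Either way, each condition yields the same number of names per company, so the two conditions cost the same to run and to rate.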
The new iterative HIT looks like this:
The parallel version doesn’t have the “Names suggested so far:” section.
We also changed the rating scale from 1-5 to 1-10, since 1-10 felt more intuitive and provided a bit more granularity. It would be nice to run experiments concentrating on rating scales to verify that this was a good choice (anyone?). Here is the new scale:
We brainstormed names for 6 new fake companies (we had 4 in the previous study). You can read the descriptions for each company in the “Raw Results” link below.
This graph shows the average rating of names generated in each iteration of the iterative processes (blue), along with the average rating of all names generated in the parallel processes (red). Error bars show standard error.
Names generated in the iterative processes averaged 6.38, compared with 6.23 in the parallel processes. This difference is not quite significant (two-sample t(357) = 1.56, p = 0.12), but it does appear that iteration is having an effect. Names generated in the last two iterations of the iterative processes averaged 6.57, which is significantly higher than the parallel average (two-sample t(237) = 2.48, p = 0.014), at least in the statistical sense; the actual difference is relatively small: 0.34.
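For readers who want to check the arithmetic: the error bars above are standard errors of the mean, and the comparisons are pooled two-sample t-tests (the degrees of freedom equal n1 + n2 - 2). The sketch below implements both from scratch on made-up ratings; it is a generic illustration, not our analysis script, and the sample data is invented, not the experiment's.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def standard_error(xs):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    m = mean(xs)
    var = sum((x - m) ** 2 for x in xs) / (len(xs) - 1)  # unbiased variance
    return math.sqrt(var / len(xs))

def pooled_t(a, b):
    """Two-sample t statistic with pooled variance; df = len(a) + len(b) - 2."""
    ma, mb = mean(a), mean(b)
    va = sum((x - ma) ** 2 for x in a) / (len(a) - 1)
    vb = sum((x - mb) ** 2 for x in b) / (len(b) - 1)
    df = len(a) + len(b) - 2
    sp2 = ((len(a) - 1) * va + (len(b) - 1) * vb) / df  # pooled variance
    t = (ma - mb) / math.sqrt(sp2 * (1 / len(a) + 1 / len(b)))
    return t, df

# Invented ratings on the 1-10 scale, just to exercise the functions:
iterative_ratings = [7, 6, 8, 6, 7, 5, 8, 7]
parallel_ratings = [6, 5, 7, 6, 5, 6, 7, 5]
t, df = pooled_t(iterative_ratings, parallel_ratings)
```

With the real data, `a` and `b` would be the per-name ratings from the two conditions, and `df` would come out to 357 (or 237 for the last-two-iterations comparison).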
There is also the issue of iteration 4. Why is it so low? This appears to be a coincidence—3 of the contributions in this iteration were considerably below average. Two of these contributions were made by the same turker (for different companies). A number of their suggestions appear to have been marked down for being grammatically awkward: “How to Work Computer”, and “Shop Headphone”. The other turker suggested names that could be considered offensive: “the galloping coed” and “stick a fork in me”.
These results suggest that iteration may be good after all, in some cases, if we do it right, maybe. Naturally we will continue to investigate this. We have already done a couple of studies with similar results to this one, suggesting that iteration does have an effect. After posting these studies on the blog (soon), the hope will be to start studying more complicated iterative tasks.