When I reflect on my own way of thinking as I assign scores to students’ knowledge, motivation, and ability, it would make no sense to deny the subjective elements of my evaluation method. Somehow I integrate my memories of the student’s character, attitude, and performance. Of course, with students I was close to, I had numerous conversations about very different aspects of life, from work ethics to philosophy of science, and from politics to love. I try to be objective, but it is difficult to avoid what is called the halo effect. The halo effect is a form of cognitive bias in which our overall impression of a person determines our evaluation of specific traits and performance. The concept goes back to Edward Thorndike (1874-1949), a psychologist who used it in a study published about a hundred years ago to describe the way commanding officers rated their soldiers. I have found it very difficult to separate my judgment of a student’s motivation from my judgment of her performance, or her likability from her analytical skills. Since I became aware of the halo effect, I have made more effort to rate each item independently of all the others. Fortunately, a student is evaluated by several other people, so maybe (yes, maybe) the individual biases average out. Collective wisdom is supposed to be more accurate than individual judgment, as we discuss now.
Francis Galton (1822-1911), a half-cousin of Charles Darwin, loved to count and measure everything. While he has a bad reputation for founding the field of eugenics, with its goal of improving the genetic quality of the human population, he helped make biology, psychology, and sociology, among other fields, more quantitative. A famous story tells that he visited the West of England Fat Stock and Poultry Exhibition, where, among other animals, an ox was on display. He asked the guests to estimate the weight of the animal. About 800 people participated, and the median estimate was very close to the real value. (The median is the value lying at the midpoint of a frequency distribution of observed values.) The take-home message of this observation is that the accuracy of the crowd’s collective estimate exceeds that of individual experts. The notion was named and popularized as The Wisdom of Crowds, the title of a 2005 book by James Surowiecki. We don’t have to believe that the opinion of the crowd is impeccable. Surowiecki argued that the crowd’s estimate is really good if people’s individual opinions are independent. Nietzsche recognized and sharply criticized the herd instinct we humans have. If we let ourselves be influenced (led by others like sheep, as Nietzsche writes), then the crowd’s calculation leads to a biased result. I am influenced by the work of a leading computational social science group in Zurich, Switzerland, directed by Dirk Helbing. They gave people several neutral questions that asked them to estimate data related to demography or crime (population density, the number of rapes in a given year in Switzerland, etc.). When people did not communicate with each other, they got a better result than when they could exchange opinions. With communication, the range of estimates narrowed and the center of opinions shifted away from the true value. Their finding was surprising.
We generally believe that consensus implies better decision making; however, initially small deviations from the “good” value may be amplified by the herding mechanism.
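The herding mechanism can be illustrated with a toy simulation. The sketch below is not the design of the Zurich experiments; it is a minimal model under stated assumptions: every person has an unbiased private guess, and in the herded crowd each person blends that guess with the average of the estimates announced before. The true value, the noise level, the crowd size, and the influence weight are all hypothetical numbers chosen for illustration.

```python
import random
import statistics

TRUE_VALUE = 1000.0  # hypothetical true quantity, e.g. a weight
NOISE_SD = 200.0     # hypothetical spread of private guesses
CROWD_SIZE = 100     # hypothetical number of people per crowd
INFLUENCE = 0.7      # hypothetical weight given to earlier public opinions


def private_guesses(rng, n=CROWD_SIZE):
    """Independent, unbiased guesses scattered around the true value."""
    return [rng.gauss(TRUE_VALUE, NOISE_SD) for _ in range(n)]


def herded_estimates(private, influence=INFLUENCE):
    """Each person blends a private guess with the mean of earlier public estimates."""
    public = []
    running_sum = 0.0
    for guess in private:
        if public:  # the first speaker has nobody to follow
            consensus = running_sum / len(public)
            guess = (1 - influence) * guess + influence * consensus
        public.append(guess)
        running_sum += guess
    return public


def compare(trials=300, seed=0):
    """Average median error and range for independent and herded crowds."""
    rng = random.Random(seed)
    sums = {"independent": [0.0, 0.0], "herded": [0.0, 0.0]}
    for _ in range(trials):
        private = private_guesses(rng)
        crowds = {"independent": private, "herded": herded_estimates(private)}
        for label, crowd in crowds.items():
            sums[label][0] += abs(statistics.median(crowd) - TRUE_VALUE)
            sums[label][1] += max(crowd) - min(crowd)
    return {
        label: {"error": err / trials, "spread": spread / trials}
        for label, (err, spread) in sums.items()
    }


if __name__ == "__main__":
    for label, stats in compare().items():
        print(f"{label:12s} median error {stats['error']:7.1f}   range {stats['spread']:7.1f}")
```

In this toy model the herded crowd starts from exactly the same private guesses as the independent one, yet its estimates cluster in a narrower range while its median tends to land farther from the true value, because the errors of the earliest speakers are passed on to everyone who follows.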
What we see is that if opinions are distributed over a larger range, the estimate is better. Along the same lines, a diverse population of problem solvers performs better than a much more uniform group of well-performing solvers, as the model calculations of the complex systems scientist Scott Page from the University of Michigan demonstrated.
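The link between the spread of opinions and the quality of the collective estimate can be made concrete with an identity often associated with Page’s work, the diversity prediction theorem: the squared error of the crowd’s average equals the average squared error of the individuals minus the variance of their estimates. The sketch below checks this arithmetic on made-up numbers. Note that it uses the mean rather than the median as the crowd’s estimate, and that the function name and the example values are hypothetical.

```python
def diversity_decomposition(estimates, truth):
    """Return (crowd error, average individual error, diversity), all squared."""
    n = len(estimates)
    crowd = sum(estimates) / n  # the crowd's estimate is the plain average
    crowd_error = (crowd - truth) ** 2
    average_individual_error = sum((x - truth) ** 2 for x in estimates) / n
    diversity = sum((x - crowd) ** 2 for x in estimates) / n
    return crowd_error, average_individual_error, diversity


if __name__ == "__main__":
    # Three made-up guesses for a quantity whose true value is 1200.
    crowd_error, individual_error, diversity = diversity_decomposition(
        [1000.0, 1150.0, 1300.0], truth=1200.0
    )
    print(crowd_error, individual_error, diversity)
    # crowd_error equals individual_error - diversity (up to rounding)
```

For a fixed level of individual accuracy, the more the guesses differ from one another, the smaller the error of their average.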
Let’s take a step back! Can we consider even an individual to be a crowd? First, there are people who may hold more than one opinion about something or somebody. Also, people may give different estimates of the same quantity several weeks later. Since averaging has turned out to be useful, even individuals may benefit from integrating their different perspectives: crowds and crowdsourcing may exist within a single mind!