Rating graduate student applicants

December. As a college professor, my seasonal duty is to write letters of recommendation and to rate students on several criteria to help them get accepted by graduate schools. Students have to ask a number of professors to evaluate them. Occasionally I have to tell a student that I would not be able to be a strong recommender, so it is better not to ask me. We evaluators combine quasi-objective data (say, grades) and subjective impressions to generate a rating score. Subjectivity is far from being identical to randomness, and college professors don’t have better ways to help students and graduate programs find a good match. An admission committee has a strong interest in ensuring that it accepts only mature, polite, reliable and stable people into its program, and my professional duty is to help.

CollegeNET is a corporation that provides software as a service to many universities, among other things for admissions and application evaluation. There are six criteria to rate students:

  • Knowledge in chosen field
  • Motivation and perseverance toward goals
  • Ability to work independently
  • Ability to express thoughts in speech and writing
  • Ability/potential for college teaching
  • Ability to plan and conduct research

We have to choose among five options: Exceptional (Upper 5%), Outstanding (Next 15%), Very Good (Next 15%), Good (Next 15%), and a fifth option covering the Next 50%.

(In some other software, “exceptional” means the upper 2%. I have noticed that I am ready to place students in the exceptional category when it is defined as the upper 5%, but only very infrequently when it is defined as the upper 2%.)

How do we generate the numbers and choose the appropriate rubric? In principle, a micro-rationalist, bottom-up approach would work: teachers could collect and store data on students going back decades, and they might have a formal algorithm to calculate the percentages. I believe, though, that many of us still adopt top-down strategies. I ask myself: do I want to grant an “all exceptional” set of grades? Does the applicant have a clear weakest point, so that I should check the third or maybe the fourth rubric? How about checking four exceptional and two outstanding rubrics?
For better or worse, decision makers calculate the sum of the grades and analyze the grade distribution.
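
The bottom-up approach can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the record of past scores is made up, the function names are hypothetical and not part of any admissions software, and the fifth band keeps a generic name because the form does not label it.

```python
from bisect import bisect_right

# Cumulative cutoffs (in % counted from the top) for the five bands of the form.
# The fifth band has no label on the form, so a generic name is used here.
BANDS = [
    (5, "Exceptional"),
    (20, "Outstanding"),
    (35, "Very Good"),
    (50, "Good"),
    (100, "fifth band"),
]


def top_percent(score, past_scores):
    """Percentage of past students who scored strictly higher than `score`."""
    ranked = sorted(past_scores)
    better = len(ranked) - bisect_right(ranked, score)
    return 100.0 * better / len(ranked)


def band(score, past_scores, bands=BANDS):
    """Return the first band whose cumulative cutoff covers the score."""
    position = top_percent(score, past_scores)
    for cutoff, label in bands:
        if position < cutoff:
            return label
    return bands[-1][1]  # nobody scored lower: stay in the last band


# Hypothetical record: 100 past students with composite scores 1..100.
history = list(range(1, 101))
print(band(97, history))  # 3 of 100 scored higher
print(band(90, history))  # 10 of 100 scored higher

# The stricter variant, where "Exceptional" means the upper 2%.
STRICT = [(2, "Exceptional")] + BANDS[1:]
print(band(97, history, STRICT))
```

With the 5% cutoff the student with score 97 lands in the top band; with the 2% cutoff the same student drops to the second one, which is the effect I notice in my own ratings.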

As Churchill might have said: Quantification is the worst form of evaluation, except for all the others.

News from the world of ranking: Napoleon was the Best General Ever, and the Math Proves it.

Ethan Arsht published an analysis titled Ranking Every* General in the History of Warfare. He adopted a “system of Wins Above Replacement (WAR). WAR is often used as an estimate of a baseball player’s contributions to his team. It calculates the total wins added (or subtracted) by the player compared to a replacement-level player. For example, a baseball player with 5 WAR contributed 5 additional wins to his team, compared to the average contributions of a high-level minor league player. WAR is far from perfect, but provides a way to compare players based on one statistic.”

Arsht constructed a database from Wikipedia. It includes 3,580 unique battles and 6,619 generals, and using a fairly simple linear model he obtained some remarkable results:

“Among all generals, Napoleon had the highest WAR (16.679) by a large margin. In fact, the next highest performer, Julius Caesar (7.445 WAR), had less than half the WAR accumulated by Napoleon across his battles. Napoleon benefited from the large number of battles in which he led forces. Among his 43 listed battles, he won 38 and lost only 5. Napoleon overcame difficult odds in 17 of his victories, and commanded at a disadvantage in all 5 of his losses. No other general came close to Napoleon in total battles. While Napoleon commanded forces in 43 battles, the next most prolific general was Robert E. Lee, with 27 battles (the average battle count was 1.5). Napoleon’s large battle count allowed him more opportunities to demonstrate his tactical prowess. Alexander the Great, despite winning all 9 of his battles, accumulated fewer WAR largely because of his shorter and less prolific career….”

For details, visit his website; see also the comments on his post.
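
To make the idea of "wins above replacement" concrete, here is a minimal sketch. It is not Arsht's linear model, which I do not reproduce here. It assumes that for every battle we already have the probability that a replacement-level general would have won it; the function name, the generals and the constant 0.5 are hypothetical.

```python
def wins_above_replacement(outcomes, replacement_win_prob):
    """Sum over battles of (actual result - replacement's expected result).

    outcomes: 1 for a win, 0 for a loss, one entry per battle.
    replacement_win_prob: assumed probability that a replacement-level
        general would have won the same battle.
    """
    if len(outcomes) != len(replacement_win_prob):
        raise ValueError("one probability per battle is required")
    return sum(won - p for won, p in zip(outcomes, replacement_win_prob))


# Hypothetical general A: 10 battles, 8 wins, even odds for a replacement.
general_a = wins_above_replacement([1] * 8 + [0] * 2, [0.5] * 10)

# Hypothetical general B: undefeated, but only 3 battles.
general_b = wins_above_replacement([1] * 3, [0.5] * 3)

print(general_a)  # 3.0
print(general_b)  # 1.5
```

Even in this toy version the undefeated general B collects less WAR than the more prolific general A, the same pattern the quotation describes for Alexander the Great and Napoleon.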


People have a desire to compare themselves with others. In many cultures children learn that they should win to demonstrate that they are better, stronger and more successful than others.

Direct comparison might lead to very different results, ranging from Muhammad Ali’s famous “I am the greatest” to “The grass is always greener on the other side of the fence.” Actually, Ali stated even more: “I’m not the greatest. I’m the double greatest. Not only do I knock ’em out, I pick the round. I’m the boldest, the prettiest, the most superior, most scientific, most skillfullest fighter in the ring today.” In principle we may believe that self-qualification is suspicious and leads to woolf-boolf type biased ranking. Ali’s statement about himself, however, is endorsed by the “collective wisdom”: almost everyone from the generation who saw him in the ring believes that Ali really was the greatest. When the U.S. Army measured Ali’s IQ at 78, he said, “I only said I was the greatest, not the smartest.” I find it amazing how objectively he describes himself: “It’s just a job. Grass grows, birds fly, waves pound the sand. I beat people up.”

As opposed to Ali’s positive self-ranking, another class of comparison expresses an inferiority complex. The idea behind the saying “The grass is always greener on the other side of the fence” may have its origin in the Art of Love, a poem by Ovid (43 BCE – 17 or 18 CE). He wrote, “The harvest is always richer in another man’s field.” Other proverbs express a similar attitude: “The apples on the other side of the wall are the sweetest,” “Our neighbour’s hen seems a goose,” and “Your pot broken seems better than my whole one.” These all convey the message that others have a better life and are more fortunate than we are. The German version of the proverb states, “Kirschen in Nachbars Garten schmecken immer besser” (The cherries in the neighbour’s garden always taste better). Life might be miserable if you always feel that others have better things. The feeling makes you envious and might lead to anxiety and other mental health problems. The suggestion of Robert Fulghum, author of the former best seller All I Really Need to Know I Learned in Kindergarten, is not only more objective but also offers a viable strategy: “The grass is not, in fact, always greener on the other side of the fence. No, not at all. Fences have nothing to do with it. The grass is greenest where it is watered. When crossing over fences, carry water with you and tend the grass wherever you are.”

The idiom “comparing apples and oranges” refers to situations in which two items practically cannot be compared. Apples and oranges are thought to be incomparable or incommensurable. In many European languages the phrase “comparing apples and pears” is used instead. Since comparison is the basis of any ranking procedure, and it has a unique role in our decision making, I will argue that we need to find the balance between accepting reality and making an effort to change things towards future successes.



Is a horse bigger or smaller than a cow?

Ferenc Jánossy (1914-1997), an engineer-turned-economist from a legendary Hungarian family (he was the stepson of George Lukács (1885-1971), one of the founders of the philosophy of “Western Marxism”), wrote a book in Hungarian whose title translates as “The measurability and a new measuring method of economic development level”. It was a revelation at that time. Jánossy explained his approach clearly:

”The first issue is how qualitatively different objects can be compared quantitatively. Every child knows that an elephant is bigger than a sparrow. They would agree without the least doubt that the cow is smaller than the elephant, but bigger than the sparrow. Ranking animals according to size, they would place the cat between the cow and the sparrow without any hesitation. But suddenly the child is faced by the problem of the horse. Where should the horse go? Is it bigger or smaller than the cow? When comparing objects of different characteristics, ranking is no longer so simple because taking into consideration various features may lead to various ranking results. (The horse is taller, yet shorter than the cow.)..”

Generalizing the above game, Jánossy finds that the greater the qualitative difference between two items, the greater the quantitative difference needed to make a ranking by size reliable. Qualitative difference limits quantitative comparability; this is what Jánossy calls the “criterion of comparability”. Obviously, the critical limit depends on the features compared. (If ranking is only according to height, then the horse-cow dilemma does not even arise.) Ranking, however, is not the aim but only the means, so the basis of the comparison cannot be changed just to make ranking easier. A clear definition of the organizing principle may lower the critical limit but cannot eliminate it.
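
The horse-cow dilemma can be played out in code. The sketch below is a simplification of my own, not Jánossy's method: it only checks whether one animal is ahead on every feature considered, and it ignores his critical limit. The sizes are made-up illustrative numbers, not measurements, and the function name is hypothetical.

```python
def compare(a, b):
    """Compare two feature dictionaries on the features they share.

    Returns 'bigger', 'smaller', 'equal' or 'incomparable' (for a versus b).
    """
    features = a.keys() & b.keys()
    a_ahead = any(a[f] > b[f] for f in features)
    b_ahead = any(b[f] > a[f] for f in features)
    if a_ahead and b_ahead:
        return "incomparable"  # the features disagree, as with horse and cow
    if a_ahead:
        return "bigger"
    if b_ahead:
        return "smaller"
    return "equal"


# Illustrative, made-up sizes in metres.
animals = {
    "elephant": {"height": 3.0, "length": 6.0},
    "horse": {"height": 1.6, "length": 2.2},
    "cow": {"height": 1.4, "length": 2.4},
    "sparrow": {"height": 0.1, "length": 0.15},
}

print(compare(animals["elephant"], animals["cow"]))  # bigger
print(compare(animals["sparrow"], animals["cow"]))   # smaller
print(compare(animals["horse"], animals["cow"]))     # incomparable

# Restricting the comparison to height makes the dilemma disappear.
print(compare({"height": 1.6}, {"height": 1.4}))     # bigger
```

The elephant and the sparrow cause no trouble because they differ from the cow in the same direction on both features. The horse and the cow are too close, and the two features pull in opposite directions.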

The next question is how to move from ranking to measuring: how to make a quantitative statement, or rather, under what conditions it can be stated quantitatively that Sweden is more advanced than Turkey. If a feature is not additive and cannot be traced back to some additive feature, it cannot be measured. If a feature is measurable, then the comparison of two objects can be decomposed into two steps of “numerical measurements along a fixed scale”. This means that the critical limit of measurability matches the given absolute scale and the limit of comparability of the object. If the examined feature of the objects can be measured along an absolute scale, the critical limit can be expressed numerically.

This example hints that ranking and rating need appropriate methods, and that the methods have limits. It is vital to accept that comparability has its limits!

A not-so-beautiful tale: an example of intentionally biased ranking from a Hungarian folktale

László Arany (1844-1898), the son of the celebrated poet János Arany (1817-1882), the “Shakespeare of ballads”, collected Hungarian folktales. One of these tales taught children how decisions that are supposed to be made collectively can be manipulated by the strongest participant.

A number of animals escaped from their homes and fell into a trap. They were not able to get out, and became very hungry. There wasn’t any food around, so the wolf suggested a solution: “Well, my dear friends! What to do now? We should eat soon, otherwise we will starve to death. I have an idea! Let us read out the names of all of us, and the ugliest one will be eaten.” Everybody agreed (I have never understood why). The wolf appointed himself judge, and counted:
“Woolf-boolf, oh! So great! Fox-box also great, my deer-my beer very great, rabbit-babbit also great, cock-bock also great, my hen-my-ben, you are not great...” And they ate the hen. And so on: next time cock-bock became food... (Thanks to Judit Zerkowitz for the translation from Hungarian.)

This is a great example of how objectivity is undermined when one of the voters controls the election.