Algorithmic bias
Experts at White & Case, one of the world's leading law firms, explained in their paper Algorithms and bias: What lenders need to know that even clearly unintentional, algorithm-driven financial technologies may lead to discriminatory decisions. Why? In the age of Big Data, creditors and lenders now have access to so-called nontraditional data, such as Internet activity, shopping patterns, and other data that are not necessarily directly related to creditworthiness. These data are analyzed using the now very popular techniques of machine learning.
Traditional algorithms use rules of arithmetic and logic defined by the designer of the algorithm. Say, IF the borrower paid back her previous credits without any delay THEN increase her credit score by X points. Machine learning techniques, by contrast, do not rely on a previously defined algorithm; they generate algorithms based on patterns found in large datasets. Take, for example, the approval of a loan request. The software has stored and analyzed data on the financial behavior of many thousands of previous customers. Loading the credit history of a new applicant as input data, a machine learning algorithm might calculate as output something like the probability that the applicant will default.
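To make the contrast concrete, here is a minimal sketch in Python. The feature names and training data are entirely made up for illustration; the point is only the difference between a designer-written rule and a model that learns a default probability from past customers.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# --- Traditional approach: a rule written by the designer ---
def rule_based_score(base_score, repaid_on_time, bonus=25):
    """IF the borrower repaid previous credits without delay,
    THEN increase her credit score by a fixed number of points."""
    return base_score + bonus if repaid_on_time else base_score

# --- Machine learning approach: the rule is not written, it is learned ---
# Hypothetical historical data: [income, debt ratio, number of late payments]
X_past = np.array([[52_000, 0.30, 0],
                   [31_000, 0.55, 2],
                   [78_000, 0.20, 0],
                   [24_000, 0.70, 4],
                   [45_000, 0.40, 1]], dtype=float)
y_default = np.array([0, 1, 0, 1, 0])   # 1 = the customer defaulted

model = LogisticRegression().fit(X_past, y_default)

# A new applicant's history goes in; a default probability comes out
new_applicant = np.array([[40_000, 0.45, 1]], dtype=float)
print(rule_based_score(650, repaid_on_time=True))   # designer's rule: 675
print(model.predict_proba(new_applicant)[0, 1])     # learned P(default)
```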
There are justified concerns that algorithms might produce biased, and perhaps unfavorable, decisions for minorities. Ideally, decision makers should take into account only the data about the borrower that the Equal Credit Opportunity Act (ECOA) permits. But we live in networks, surrounded by our neighbors, friends, peers, and so on. If creditors analyze data about your social network friends to rate your creditworthiness, this may lead to discrimination based on data that creditors are not permitted to consider. Living address can matter a lot, and ZIP code is considered a dangerous variable; the term redlining expresses the discriminatory practice (historically applied to people who lived in black inner-city neighborhoods).
A machine learning algorithm may find that there is a correlation between your creditworthiness and the financial behavior of your friends or neighbors. It is a complicated situation: a creditor cannot deny your request on the basis that many of your friends were late in repaying their loans. Creditors should also explain the basis of a credit denial. But if nontraditional data are used, it is very difficult to give a transparent and understandable explanation.
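A minimal sketch of why such proxy variables are dangerous (Python, with entirely synthetic data and made-up variable names): even if the protected attribute itself is excluded from the model, a correlated proxy such as ZIP code or the friends' average delinquency can carry much of the same information.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 2_000

# Synthetic world: a protected group attribute we are NOT allowed to use
protected = rng.integers(0, 2, n)

# A "neutral-looking" proxy (think ZIP code, or friends' late payments)
# that is strongly correlated with the protected attribute
proxy = protected + rng.normal(0, 0.3, n)

# Outcome partly driven by the protected attribute (structural inequality)
default = (0.8 * protected + rng.normal(0, 0.5, n)) > 0.9

# The model never sees `protected`, only the proxy ...
model = LogisticRegression().fit(proxy.reshape(-1, 1), default)
denied = model.predict(proxy.reshape(-1, 1))

# ... yet denial rates differ sharply between the two groups
for g in (0, 1):
    print(f"group {g}: denial rate {denied[protected == g].mean():.2f}")
```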
It is understandable that data from your social network should not be used to evaluate your financial future. However, we choose many of our future activities based on recommendation systems. These recommendations influence our choices of hotels, restaurants, dating partners, and movies, just to give an unranked list.
Recommender algorithms also use data about "stuff that my friends like." We will discuss this in more detail in Sec. 10.1; a small preview follows below.
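As a preview, here is a minimal user-based collaborative filtering sketch in Python with a toy ratings matrix (all numbers invented): items are suggested because users with similar tastes, such as your friends, liked them.

```python
import numpy as np

# Toy ratings matrix: rows = users, columns = items, 0 = not yet rated
ratings = np.array([[5, 4, 0, 1],
                    [4, 5, 1, 0],
                    [1, 0, 5, 4],
                    [0, 1, 4, 5]], dtype=float)

def recommend(user, ratings, k=1):
    """Score unrated items by the ratings of users similar to `user`."""
    norms = np.linalg.norm(ratings, axis=1)
    sims = ratings @ ratings[user] / (norms * norms[user] + 1e-9)
    sims[user] = 0.0                       # ignore self-similarity
    scores = sims @ ratings                # similarity-weighted ratings
    scores[ratings[user] > 0] = -np.inf    # only suggest unseen items
    return np.argsort(scores)[::-1][:k]

print(recommend(user=0, ratings=ratings))  # item(s) liked by similar users
```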
Should we like algorithms? If you are ready to answer the question with "No," have you thought about whether or not we would be better off returning to personality-based, subjective credit evaluation?
In May 2016, the Obama administration's Treasury Department issued a white paper titled Opportunities and Challenges in Online Marketplace Lending. In addition to the traditional players, online marketplace lending companies have emerged, offering faster credit for consumers and small businesses. It was good news that the Treasury Department found it important to analyze the opportunities and risks presented by this new type of crediting system.
Towards fair algorithms?
Computer scientists have realized that algorithms might lead, even unintentionally, to discrimination. Why? Data mining methods rest on an assumption that comes from the pioneers of modern science, such as Galileo, Kepler, and Newton: looking into data from the past gives us the ability to predict the future. While this method worked wonderfully for predicting the motion of the planets, should we also assume that historical data about social behavior are useful for prediction?
There are now algorithms for forecasting crimes based on historical data. Patterns in the time of day, seasonality, weather, location (vicinity of bars, bus stops, and similar factors), past crime levels, and other data give police departments a better chance to prevent potential crimes. As always, while the goal of \emph{predictive policing} promises to be race-neutral and objective, there are also justified concerns that the algorithmic approach leads to the emergence of new problems related to security, privacy, and the constitutional rights of citizens (A.G. Ferguson: \emph{The Rise of Big Data Policing: Surveillance, Race, and the Future of Law Enforcement}). Again: the algorithms behind predictive policing, much more often than not, help the work of law enforcement, but they are not silver bullets that eliminate crime. As the Lithuanian data scientist Indrė Žliobaitė, now at the University of Helsinki in Finland, writes in a position paper about "fairness-aware machine learning": "usually predictive models are optimized for performing well in the majority of the cases, not taking into account who is affected the worst by the remaining inaccuracies". It is a very difficult question; we all know that there are human faces and fates behind the numbers.
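A minimal sketch of Žliobaitė's point, in Python with synthetic data of my own invention: a model tuned for overall accuracy can look excellent on average while its errors fall disproportionately on a small minority group.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 10_000
minority = rng.random(n) < 0.05          # 5% of the population

# Synthetic feature: informative for the majority, pure noise for the minority
signal = rng.normal(0, 1, n)
y = (signal + rng.normal(0, 0.2, n)) > 0
x = np.where(minority, rng.normal(0, 1, n), signal)

model = LogisticRegression().fit(x.reshape(-1, 1), y)
pred = model.predict(x.reshape(-1, 1))

err = pred != y
print(f"overall error:  {err.mean():.2%}")    # looks good on average
print(f"majority error: {err[~minority].mean():.2%}")
print(f"minority error: {err[minority].mean():.2%}")  # roughly a coin flip
```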
We know the horror stories in which intentionally neutral algorithms produced sexist or racist output, and I decided not to repeat them here. One reason they happen is that machines learn from examples extracted from data, and data generated by humans reflect human bias. Yes, it may happen that algorithms sustain prejudice or maintain social hierarchy.
Social scientists and computer scientists should cooperate to generate "ethical algorithms." Ethics, i.e., moral philosophy, investigates what is "good" or "bad" behavior. (I leave the answers to the philosophers.) From the perspective of machine learning, the question is how to train algorithms to make moral decisions. The data can then be preprocessed, and unethical data could be eliminated. We may expect that many studies will be conducted to understand the scope and limits of building ethical algorithms, and we should accept that "fairness" is far from being a well-defined concept.
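One concrete preprocessing idea from the fairness-aware machine learning literature is reweighing (Kamiran and Calders): training examples are weighted so that group membership and outcome become statistically independent in the training data before any model is fit. The sketch below, in Python with synthetic data, is only an illustration of that idea, not a recipe endorsed here.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def reweigh(protected, y):
    """Weight each example so that group membership and outcome
    become independent in the (weighted) training data."""
    w = np.ones(len(y), dtype=float)
    for g in np.unique(protected):
        for c in np.unique(y):
            mask = (protected == g) & (y == c)
            expected = (protected == g).mean() * (y == c).mean()
            observed = mask.mean()
            w[mask] = expected / observed
    return w

# Synthetic, skewed training set
rng = np.random.default_rng(2)
protected = rng.integers(0, 2, 1_000)
x = rng.normal(0, 1, (1_000, 3))
y = (x[:, 0] + 0.6 * protected + rng.normal(0, 1, 1_000)) > 0.5

weights = reweigh(protected, y)
model = LogisticRegression().fit(x, y, sample_weight=weights)
```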
Beyond the algorithms: the Lending Circles and the credit score game
Not only algorithms, people can also learn :-). Mark Kear, a researcher now at the University of Arizona, describes and analyzes an example of how immigrants learned that (i) they should play the credit score game, and (ii) it is possible to improve their credit history. Kear was a participant in and observer of a Lending Circle. Lending Circles are organized by the Mission Asset Fund (MAF), a San Francisco-based nonprofit organization, to help increase the credit scores of low-income families. People learn strategies to report data that improve their creditworthiness. MAF managed to increase credit scores significantly (by 168 points in one case study).
Instead of a summary
As John von Neumann wrote in his paper Can we survive technology?:
All experience shows that even smaller technological changes than those now in the cards profoundly transform political and social relationships. Experience also shows that these transformations are not a priori predictable and that most contemporary ‘first guesses’ concerning them are wrong. For all these reasons, one should take neither present difficulties nor presently proposed reforms too seriously. \dots
The one solid fact is that the difficulties are due to an evolution that, while useful and constructive, is also dangerous. Can we produce the required adjustments with the necessary speed? The most hopeful answer is that the human species has been subjected to similar tests before and seems to have a congenital ability to come through, after varying amounts of trouble. To ask in advance for a complete recipe would be unreasonable. We can specify only the human qualities required: patience, flexibility, intelligence.