Stereotypes often carry a negative connotation, so many people eschew making decisions based on them. A stereotype, however, is simply a generalization about how a group of people behaves. Even when a stereotype is statistically accurate, many believe we shouldn’t use it to make decisions affecting an individual, because it may not be universally valid.
Say you’re a restaurant owner who sells alcohol in your establishment. Most of us accept that a person needs to be a certain age to legally consume alcohol. That rule is not based on a universally valid generalization: some 30-year-olds may drink irresponsibly, while some 18-year-olds might be capable of drinking responsibly. Some 16-year-olds could also handle alcohol sensibly. The problem is that you can’t test every individual for alcohol consumption competence. Who would be trusted to design such a test? Would you, as a restaurant owner, even have the time to test everyone ordering alcohol on a busy Friday night? Even for something like alcohol tolerance, if we’re confident that the vast majority of 16-year-olds are not mature enough to drink, is it worth the cost of testing them all, knowing most will fail? Individualized testing isn’t always feasible, and it comes with its own errors and potential for abuse.
Here’s another example. You’re walking through an unsafe neighborhood when you see a group of Black teenagers approaching. You might feel uneasy and consider crossing the street to avoid them. This reaction is based on a stereotype, not on the actual behavior or intentions of the individuals. It’s not that you dislike these people; it’s about instinctively avoiding perceived danger based on past experiences or stories suggesting a group of Black teenagers in a bad neighborhood might spell trouble. Whether the stereotype is right or wrong does not matter to you in that moment; you’re making a quick decision to circumvent a situation you might regret later. After all, what harm could come from taking a little detour?
Just as with age requirements for drinking, you can’t test these individuals a priori. When we lack knowledge about individuals, we often have to make decisions based on what we know about the group they belong to¹. This grouping is often based on race, gender, facial features, physique, clothes, or style because these are the most immediate visual cues. Sure, without knowing who they are and where they come from, such generalizations can lead to unfair and biased decisions, but wouldn’t you rather be cautious and avoid a potentially dangerous situation, even if it means making an imperfect judgment, especially if this judgment does no harm to them?
Some might argue, “What if these Black teenagers don’t fit this generalization?” While it’s one thing to challenge the accuracy of a generalization or propose an alternative for a specific context, it’s another to dismiss a statistically accurate generalization due to exceptions. The main question we need to address is what determines when the use of stereotypes is appropriate.
Stereotypes can provide a heuristic, or rule of thumb, that simplifies the decision-making process. This can lead to quicker decisions by reducing the complexity and the number of variables considered. When we lack knowledge of individual variability and testing individuals a priori is expensive, relying on the stereotype can avoid or minimize the potential impact of the worst-case scenario².
It’s naive to argue that generalizations about a group of people must be universally valid to be useful at all — we rely on such stereotypes all the time, and in fact, the human race wouldn’t have survived without them.
What’s important is understanding how using stereotypes can be problematic downstream for those being stereotyped and knowing when alternatives like individualized testing are more appropriate.
Many a generalization based on race, gender, or religion, for instance, deserves special scrutiny due to its historical misuse. Nevertheless, it’d be erroneous to claim that using a stereotype is “wrong” simply because there are exceptions to the underlying generalization.
¹ This is at the heart of Bayesian thinking. In the scenario above, we can consider the group of Black teenagers (a finite population) to be a random sample from a “super population” of all Black teenagers in unsafe neighborhoods that satisfies a certain distributional characteristic.
For instance, we can build a hierarchical Bayesian model where each Black teenager can be thought of as being drawn from a Normal distribution with a certain variance-covariance matrix. We then add a layer of probabilistic structure by assuming the mean of this Normal distribution has a prior distribution that is also Normal, with a mean of zero and a large variance, reflecting high uncertainty (a “dispersed” prior).
In real life, a prior is often informative. People might have some preconceived notions about encountering Black teenagers in unsafe neighborhoods. The influence of this prior is greater when we lack specific data on the individuals and diminishes as we gather more data about them.
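To make this concrete, here is a minimal numerical sketch of how the prior and the data trade off, assuming a simplified one-dimensional Normal-Normal conjugate model with known observation variance (the `posterior` helper and all numbers are purely illustrative, not part of any real analysis).

```python
import numpy as np

# A sketch of the Normal-Normal conjugate update described above, assuming a
# one-dimensional outcome with known observation variance. All numbers are
# illustrative, not estimates from real data.

prior_mean = 0.0    # dispersed prior: mean zero ...
prior_var = 100.0   # ... with a large variance (high uncertainty)
obs_var = 1.0       # assumed known variance of each observation


def posterior(prior_mean, prior_var, obs_var, data):
    """Posterior mean and variance of the group-level mean, given `data`."""
    n = len(data)
    if n == 0:
        return prior_mean, prior_var
    # Precision-weighted average: the prior's weight shrinks as n grows.
    post_var = 1.0 / (1.0 / prior_var + n / obs_var)
    post_mean = post_var * (prior_mean / prior_var + np.sum(data) / obs_var)
    return post_mean, post_var


rng = np.random.default_rng(0)
true_mean = 2.0
for n in [0, 1, 10, 100]:
    data = rng.normal(true_mean, np.sqrt(obs_var), size=n)
    mean, var = posterior(prior_mean, prior_var, obs_var, data)
    print(f"n={n:3d}  posterior mean={mean:5.2f}  posterior variance={var:7.4f}")
```

With no data, the posterior simply echoes the dispersed prior (mean 0, variance 100); by the time a hundred observations have accumulated, the posterior mean sits near the data and the prior has almost no influence, which is exactly the behavior described above.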
² Minimizing the worst-case scenario, or minimizing the maximum possible “loss,” is the minimax principle in statistics: a criterion for choosing a decision rule in statistical decision theory, particularly under conditions of uncertainty.
Formally, let θ be the parameter to be estimated and δ a decision rule. The risk function R(θ, δ) represents the expected loss incurred when using decision rule δ if the true parameter is θ. The minimax principle is defined as:
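$$
\delta^{*} = \arg\min_{\delta} \max_{\theta} R(\theta, \delta)
$$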
This equation means that we first determine the maximum risk for each decision rule δ and then choose the rule that minimizes this maximum risk.
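As a sanity check on the definition, here is a toy sketch in which R(θ, δ) is tabulated on a small grid of parameter values for three candidate rules; the rule names and risk values are made up purely for illustration.

```python
import numpy as np

# Toy minimax selection: tabulate the risk R(theta, delta) for a few candidate
# decision rules (rows) over a few possible values of theta (columns), then
# pick the rule whose worst-case risk is smallest. All values are hypothetical.

decision_rules = ["delta_1", "delta_2", "delta_3"]
risk = np.array([
    [1.0, 4.0, 2.0],   # R(theta, delta_1) across three values of theta
    [2.0, 2.5, 2.0],   # R(theta, delta_2)
    [0.5, 5.0, 1.0],   # R(theta, delta_3)
])

worst_case = risk.max(axis=1)        # maximum risk over theta, for each rule
best = int(np.argmin(worst_case))    # rule that minimizes the maximum risk

print("Worst-case risk per rule:", dict(zip(decision_rules, worst_case.tolist())))
print("Minimax choice:", decision_rules[best])
```

Here delta_2 is the minimax choice: its worst-case risk (2.5) beats the other rules’ worst cases (4.0 and 5.0), even though each of the other rules does better for some particular values of θ.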