Illusion of predictability #2

I started a project called “the illusion of predictability” right at the beginning of my PhD. It seemed like an interesting idea. It questioned whether a statistical tool widely used in economics can actually improve decision making and prediction; it was about merging academia with reality.

I really believed I was onto something.

I teamed up with a fellow believer, an exceptional psychologist. We used real and knowledgeable people in our experiments. They contributed voluntarily to our quest. It was both exciting and terrifying. The results were polarized and so were the comments.

I would like to thank all the contributors once again.

My previous post featured some very insightful reactions to the project. Below are some more. Also, I list at the end of this post the main article and the comment articles that followed it, along with our reply to them.

Several recent discussions about our project in the blogosphere:

As mentioned above, several scientists have also commented on the project in reply papers published alongside the main article, and we then wrote our own replies to their commentaries. Here are all the discussion papers:

The main article can be found HERE

Scott Armstrong’s comments about the project can be found HERE

Keith Ord’s comments on the project can be found HERE

Nassim Taleb and Daniel Goldstein’s comments can be found HERE

Stephen Ziliak’s comments can be found HERE

Our reply to comments can be found HERE

Illusion of predictability #1

A study I recently conducted with Robin Hogarth has been featured in various blogs, including Harvard Business Review and Reuters.

These posts also feature some very insightful discussions.

Here they are:

Harvard Business Review – Blog by Justin Fox

Reuters – Blog by Felix Salmon


Simulated experience #1

In a random group of 25 people, what is the probability that at least two of them share a birthday?

This is the famous birthday problem.

We have difficulty understanding the probabilistic structure of this problem. When I asked 100 university students this question, the average response I got was “less than 1%.”

The correct answer is actually around 57%!

The analytic solution of the problem is not straightforward. One has to think in terms of combinations and chain together several calculations.

And these don’t come naturally to us.
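Still, for reference, the whole calculation reduces to one product: find the chance that all 25 birthdays are distinct and subtract it from one. A minimal sketch in Python (ignoring leap years):

```python
from math import prod

def birthday_match_probability(n, days=365):
    """Probability that at least two of n people share a birthday."""
    # Chance that all n birthdays are distinct: (365/365) * (364/365) * ...
    all_distinct = prod((days - i) / days for i in range(n))
    return 1 - all_distinct

print(round(birthday_match_probability(25), 3))  # 0.569
```

Note that the probability already crosses 50% at a group of just 23 people.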

So what to do?

If you think about it, experiencing frequencies of outcomes would actually be much easier. This would mean going out there, meeting many, many groups of 25 people, and observing whether there are matching birthdays in each group.

Sounds like a hard task to accomplish though.

But wait; we could use simulations.

Check THIS site for instance.

What you do here is generate 25 random birth dates and see whether any two are the same.

Then reset and meet a new group of 25. Then again. Then again. Then again…

Once you meet a dozen groups like this, you’ll notice that in around half of them, in fact, there is a match.
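The same “meet many groups” exercise can be automated in a few lines. A minimal Monte Carlo sketch in Python (the 10,000 trials and the fixed seed are my own choices, made only so the run is reproducible):

```python
import random

def has_shared_birthday(group_size, days=365):
    """Simulate one group and check for a repeated birthday."""
    birthdays = [random.randrange(days) for _ in range(group_size)]
    return len(set(birthdays)) < len(birthdays)

random.seed(0)  # fixed seed for a reproducible run
trials = 10_000
matches = sum(has_shared_birthday(25) for _ in range(trials))
print(matches / trials)  # hovers around 0.57
```

Each simulated group is one “meeting”; the running fraction of groups with a match converges to the analytic answer.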

Consider now the power of simulations: the possibility of experiencing the outcomes of complex and relevant probabilistic scenarios, such as investment decisions, pension schemes, or insurance regimes. You would gain much more insight into the problem and possibly make better decisions.

Well… Try it with a group larger than 25 this time.

You’ll be surprised.

The invasion

Imagine that you are a general at a military compound near the border. From your vast experience, you know that when enemy troops mass at the border, the probability of an invasion towards your territory is 75%.


You don’t have direct access to information about enemy troops and have to rely on your intelligence sources. As it turns out, your sources are reliable: when an intelligence report states that enemy troops are massing, they really are there!


Your sources tell you that enemy troops are at the border. What is the probability of an invasion?

The nice thing about this problem is that it can be applied to many fields, including business or finance, where news about an event gives clues about the occurrence of another underlying event, which is what you are actually interested in.

The answer, though, is unsettling.

No… It is not 75%.

In fact, it could be virtually anything between 0% and 100%.

Answering 75% implicitly assumes that the intelligence report is independent of the enemy’s plans. But perhaps the enemy is cunning and lets your sources see its troops only when no invasion is planned; in that case, the answer would be 0%.
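The dependence can be made explicit with Bayes’ rule. In the sketch below, the chances that the report is produced given the enemy’s plan (`p_seen_invasion`, `p_seen_no_invasion`) are hypothetical parameters; as they vary, the answer swings across the whole range:

```python
def p_invasion_given_report(p_seen_invasion, p_seen_no_invasion, p_invasion=0.75):
    """P(invasion | report), given how likely a report is under each enemy plan."""
    # Bayes' rule: weight each scenario by how likely it is to trigger a report.
    num = p_invasion * p_seen_invasion
    den = num + (1 - p_invasion) * p_seen_no_invasion
    return num / den

# Report independent of the enemy's plan: the naive 75% answer.
print(p_invasion_given_report(0.5, 0.5))  # 0.75
# A cunning enemy shows its troops only when no invasion is planned.
print(p_invasion_given_report(0.0, 1.0))  # 0.0
```

The 75% answer is recovered only in the special case where the report carries no information about the plan itself.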

This question is taken from a 1980 article by the famous social scientist Hillel J. Einhorn and shows us how our judgments can be driven by true yet worthless and irrelevant information.

The cloud

When someone tells you that a certain event “X” causes another thing “Y”, be suspicious.

There could indeed be a positive relation between the two. But, first of all, causation is a complex issue, and it can go one way or the other.

It could be that actually “Y” is causing “X”.

Even if the direction of causation is well established, there is a further issue to be considered: how important is “X” as a cause of “Y”?

In their 2008 book The Cult of Statistical Significance, Stephen Ziliak and Deirdre McCloskey talk about the “oomph” effect to express the essential task of determining whether or not a certain phenomenon is an important determinant of an outcome in a given context.

Consider for instance the graphs A and B. In which one would you say X is a more important determinant of Y? Although the average effect of one more X on Y is the same on both occasions, the cloud of noise is much bigger in B, suggesting that in this case most of what ultimately causes Y is relatively independent of X.

Or consider the following.

Smoking is bad for you. This is a fact. So you would advise a friend who is, say, going to a conference not to smoke too much.

Given the same relation between health and smoking, would you give the same advice to someone going to fight in a war?


How does the stock market work? Is it predictable? Which new firms will succeed next year? Which ones will fail? Which products will be popular? Why exactly? Will you lead a healthy life? How much control do you have over it?

More importantly… How much of all these will depend on chance? What is the weight of randomness in all of this?

Somehow, the understanding of randomness does not come naturally to us. And by us, I mean humans.

Is an outcome considered random because we do not know how it was generated? Or was it really randomly conceived? A bit of both maybe?

How does one characterize or even recognize chance events?

The problem is that, possibly, we are not yet equipped with the innate statistical sophistication that is needed to understand such issues intuitively. Humans have been living on earth for a very… very long time. The probability theory that deals with the understanding of randomness is only a few centuries old.

The first discussions of such concepts are found in the letters that went back and forth between Blaise Pascal and Pierre de Fermat. Both were brilliant men, and their primary concern was to decipher gambles and possibly perform better in them. The letters date back to the 1650s.

Hence, given the relative recency of our adventure with randomness, it might be that we will need several centuries more to evolve and fully grasp its nature and consequences.

We are still primitive in the face of randomness.


The transparency of a description denotes how correctly it is perceived and how accurately it is understood by people. If I tell you that, as a side effect, the use of a certain drug increases the probability of, say, becoming completely bald by 100%, you would have second thoughts about even touching the pill. But you are missing a crucial piece of information there: the base on which this statistic was calculated. For instance, an increase from one in a million to two in a million would also constitute a 100% increase. The drug seems less scary now, doesn’t it? Hence, the description that includes the base rates is the more transparent one to the human mind in this case.
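The two framings can be put side by side with trivial arithmetic. The one-in-a-million baseline below is the hypothetical figure from the example above:

```python
def risk_framings(base, new):
    """Express a risk change in relative and absolute terms."""
    relative = (new - base) / base  # 1.0 means "+100%": the scary framing
    absolute = new - base           # change in raw probability: the transparent one
    return relative, absolute

rel, abs_change = risk_framings(1 / 1_000_000, 2 / 1_000_000)
print(f"relative increase: {rel:.0%}")         # 100%
print(f"absolute increase: {abs_change:.7f}")  # 0.0000010
```

Same data, two very different emotional reactions.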

Gerd Gigerenzer, Wolfgang Gaissmaier, Elke Kurz-Milcke, Lisa M. Schwartz and Steven Woloshin, in their 2007 report published in Psychological Science in the Public Interest (downloadable here), dug deeper into the issue. At one point, they discuss why abortions in England and Wales increased dramatically around 1995. The reason was that birth control pills were rumored to have an undesirable side effect, expressed in a non-transparent way, which created an overreaction against their use. Ironically, it was found that the abortion procedure increases the probability of facing that same side effect even further; hence the importance of the transparency of a description, especially in medicine.

The most curious cases are medical tests. They can make two types of mistakes: finding a sick person healthy or a healthy person sick. They are commonly designed above all to avoid the former (finding a sick person healthy). As a result, one observes more cases of the latter, especially when there are many more healthy people than sick ones. Consider a test that always correctly identifies a sick person, and that correctly finds no disease in a healthy person 90% of the time, i.e., 10% of the time it erroneously claims that a healthy person is sick. Say the probability that a random person in the population has the disease is 1%. We grab a random person from the population and test him. The test says he has the disease. What is the probability that he is really sick, then?

Answer: Take the test again.
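(If you want the actual number too: with these figures, Bayes’ rule gives only about 9%. A minimal sketch:)

```python
def p_sick_given_positive(prevalence, sensitivity, specificity):
    """P(sick | positive test) via Bayes' rule."""
    true_pos = prevalence * sensitivity             # sick people flagged sick
    false_pos = (1 - prevalence) * (1 - specificity)  # healthy people flagged sick
    return true_pos / (true_pos + false_pos)

# The test from the text: never misses a sick person, 90% correct on healthy ones,
# 1% of the population actually sick.
print(round(p_sick_given_positive(0.01, 1.0, 0.90), 3))  # 0.092
```

With so many healthy people taking the test, false positives swamp the true ones; that is why repeating the test is such good advice.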

False hopes

False hopes are everywhere. Apparently there are devices that make you thinner while you are asleep. These will only be beaten by the ones that get you skinnier as you eat.

I get emails, at least once a month, telling me to forward them to ten people if I want my wishes to come true. They used to include some story to persuade me, or maybe even scare me somehow. The last one just had a Tweety picture on it, yet it had passed through hundreds of wishful individuals.

Politicians use false hopes all the time, and there are still fortune tellers on TV. In Spain, there are channels dedicated to them 24 hours a day.

I had read somewhere on the Internet that in Las Vegas, slot machines in the airports have higher stakes, designed to get people’s hopes up and attract them. I do not know about that, but I stumbled upon something interesting about online casinos: they are full of false dreams.

Try it out. Go to an online casino that lets you play roulette at a virtual table with virtual money. Apply the gambler’s fallacy: bet progressively on one color, always making sure that a win covers your accumulated losses with a small margin. I tried a few years ago, and in just four hours my 25 virtual dollars grew to become 4000. The virtual roulette had a memory, and it wanted me to win.
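On an honest wheel, that progressive betting scheme (essentially a martingale) is a recipe for ruin, not riches. Here is a sketch of the strategy on an American roulette wheel (18 winning pockets out of 38); the bankroll, base bet, and target are hypothetical numbers taken from my own session:

```python
import random

def martingale_session(rng, bankroll=25, base_bet=1, target=4000, p_win=18/38):
    """Double the bet after every loss; stop at the target or when broke."""
    bet = base_bet
    while bankroll >= bet:
        if rng.random() < p_win:
            bankroll += bet  # a win recovers all losses plus the base bet
            bet = base_bet
            if bankroll >= target:
                return bankroll
        else:
            bankroll -= bet  # a loss: double up and try again
            bet *= 2
    return bankroll          # can no longer cover the next bet: ruin

rng = random.Random(42)  # fixed seed for a reproducible run
finals = [martingale_session(rng) for _ in range(1000)]
reached = sum(f >= 4000 for f in finals)
print(f"{reached} of 1000 sessions reached the target; the rest went bust")
```

A short losing streak is enough to outgrow any bankroll, so almost every session ends in ruin. An honest wheel has no memory.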

I am not immune to false hopes. I ended up buying $25 worth of chips with my real money the next day. Lost it in less than a minute.

I was hopeless.