I read a little bit more about Black Swan Events (BSEs) and maybe I learned a bit about how they could be predicted. Besides arguing that past data sets can be incredibly deceptive (I can count on no fingers the number of days that humanity has been wiped out by a giant meteor, but it can happen), Taleb stated that a data set that is very regular, with lots of verifiable data, and that tends to look bell curve worthy after a few samples, is less likely to have Black Swan Events. By contrast, if even a small portion of the data set is seemingly erratic, then it may be more prone to wild fluctuations.
If this holds true, it would seem the definitions of large and small are key. If we have a data set that is just large enough to generate a bell curve, but too small to contain any outliers, then it looks safe. Looking safe is as dangerous as can be. I am sure that further reading will offer more insight into this, and if it doesn’t, that may be a tiny BSE just for me.
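To put a rough number on "too small to contain outliers", here is a minimal sketch. It assumes, purely for illustration, that an extreme event shows up independently in each observation with some small probability; the 1-in-1,000 rate and the function name prob_no_outlier are my own hypothetical choices, not anything from Taleb.

```python
def prob_no_outlier(p_extreme: float, n_observations: int) -> float:
    """Chance that n independent observations contain zero extreme events.

    Assumes each observation is extreme with probability p_extreme,
    independently of the others (a simplifying assumption).
    """
    return (1.0 - p_extreme) ** n_observations


# Hypothetical rate: one extreme event per 1,000 observations.
p = 0.001

for n in (100, 1_000, 10_000):
    print(f"n = {n:>6}: P(no outlier seen) = {prob_no_outlier(p, n):.4f}")
```

Under those assumptions a sample of 100 comes up outlier-free roughly nine times out of ten, so it looks safe even though the extreme event is sitting right there in the underlying process. Only at around 10,000 observations does a clean-looking sample become very unlikely.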
Deeper than just this data set estimation, Taleb seems to be stating that we can’t just look at numbers; we need understanding to reduce risk. Let me provide a real example. Take a look at Pidgin’s download stats for 2010. It could be argued either way. One could argue that the data is consistent and could be arranged on a bell curve if you chose your axes carefully, or one could argue that April as a lower bound (365,803) and November as an upper bound (1,500,663) are clearly too extreme.
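For a sense of scale, this quick sketch compares the two monthly figures quoted above. The variable names are mine, and the other ten months are deliberately left out because I have not reproduced them here.

```python
# The two monthly download counts quoted above for Pidgin in 2010.
april_downloads = 365_803       # lowest month
november_downloads = 1_500_663  # highest month

# How many times larger the best month is than the worst month.
ratio = november_downloads / april_downloads

print(f"November saw about {ratio:.1f}x the downloads of April")
```

A best month roughly four times the worst is exactly the sort of spread you can read either way, which is why the raw numbers alone do not settle the argument.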
Without understanding why these numbers fluctuate, what use are they in predicting the future? As with most projects available for download at SourceForge, more frequent updates, greater usefulness, and greater popularity all contribute to downloads. I wrote that last sentence before checking the Files released page for Pidgin, but please notice that November 2010 has 3 releases, and neither March nor April has even 1. Clearly someone with an understanding of open source software communities could have predicted that, because I just did.
An understanding of the software and the community around that software doesn’t guarantee correct predictions. I easily could have been wrong, for example if some event unknown to me had landed Pidgin on the front page of Digg and Slashdot. If understanding is what it takes to predict the future with accuracy, then how do you know when you know enough?
In Chapter 4 Taleb describes a turkey that gets regular feedings until the day it is slaughtered. If the turkey had stopped to ask “why am I being fed?”, rather than simply accepting it as part of a greater pattern, maybe he could have discerned that he was being fattened up. I hate to blame the turkey; it is hard to think freely when you are raised in captivity or otherwise indoctrinated. For example, look at all the Christians and Muslims out there.

