Data for Prediction

I read a little bit more about BSEs (Black Swan Events) and maybe I learned a bit about how they could be predicted. Besides arguing that past data sets can be incredibly deceptive (I can count on no fingers the number of days that humanity has been wiped out by a giant meteor, but it can happen), Taleb stated that a data set that is very regular, with lots of verifiable data, and that tends to be bell curve worthy after a few samples, is less likely to have Black Swan Events. By contrast, if even a small portion of the data set is seemingly erratic, then it may be more prone to wild fluctuations.

[Image: Awesome Face hitting Earth. Caption: Sample BSE]


If this holds true, it would seem the definitions of large and small are key. If we have a data set that is just large enough to generate a bell curve, but too small to contain outliers, then it looks safe. Looking safe is as dangerous as can be. I am sure that further reading will provide more insight into this, and if it doesn’t, that may be a tiny BSE just for me.
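
To put a rough number on that worry, here is a minimal sketch. It assumes each observation independently has some small probability p of being an extreme outlier; both p and the sample sizes below are made-up values for illustration, not anything taken from Taleb. Under that assumption, the chance that a sample of n observations contains no outlier at all is (1 - p)^n.

```python
def prob_no_outlier(p: float, n: int) -> float:
    """Chance that n independent observations contain zero outliers,
    assuming each one is an outlier with probability p (hypothetical)."""
    return (1.0 - p) ** n

# Hypothetical rate: one observation in a thousand is extreme.
p = 0.001
for n in (30, 100, 1000, 10000):
    print(n, round(prob_no_outlier(p, n), 3))
```

With these made-up numbers, a sample of 100 observations has roughly a 90% chance of containing no outlier at all, so it would look perfectly safe most of the time.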

Deeper than just this data set estimation, Taleb seems to be stating that we can’t just look at numbers; we need understanding to reduce risk. Let me provide a real example. Take a look at Pidgin’s download stats for 2010. It could be argued either way: one could say the data is consistent and could be arranged on a bell curve if you chose your axes carefully, or one could argue that April as a lower bound (365,803) and November as an upper bound (1,500,663) are clearly too extreme.
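
The spread between those two bounds is easy to check. This is only arithmetic on the two figures quoted above, not a fetch of the actual statistics page.

```python
# Monthly download counts quoted above for Pidgin in 2010.
april_downloads = 365_803       # lowest month
november_downloads = 1_500_663  # highest month

ratio = november_downloads / april_downloads
print(f"November had about {ratio:.1f}x the downloads of April")
```

A factor of roughly four between the best and worst months is what makes the "too extreme" reading plausible.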

Without understanding why these numbers fluctuate, what use are these numbers in predicting the future? As with most projects available for download at SourceForge, more frequent updates, greater usefulness, and popularity all contribute to downloads. I wrote that last sentence before checking the Files released page for Pidgin, but please notice that November 2010 has 3 releases, and neither March nor April has even 1. Clearly someone with an understanding of open source software communities could have predicted that, because I just did.

An understanding of the software and the community around that software doesn’t guarantee correct predictions. I easily could have been wrong. For example, some event unknown to me could have caused Pidgin to land on the front page of Digg and Slashdot. If understanding is what it takes to predict the future with accuracy, then how do you know when you know enough?

In Chapter 4, Taleb describes a turkey that gets regular feedings until the day it is slaughtered. If the turkey had stopped to ask “why am I being fed?”, rather than simply accepting it as part of a greater pattern, maybe he could have discerned that he was being fattened up. I hate to blame the turkey; it is hard to think freely when you are raised in captivity or otherwise indoctrinated. For example, look at all the Christians and Muslims out there.


I Guessed Right, but Why?

I was right about my simple call center prediction. I cannot provide exact details, but I can say more than 80% of my calls had to do with Canadian finances. That was an easy one to guess though; I had a holiday and a slow, predictable work week helping me.

How should I predict tomorrow? I could examine past weeks and expect an easier time because of the holidays. I could look at what I did last year around this time. I could easily do these things and say that it will be an easygoing work week, just like any other, but with a bit less work. Such a simplistic prediction would hardly seem in the spirit of this exercise.

I want to be better at predicting the unpredictable. I know it is impossible, but I still plan on learning something. So let’s pick something a little less statistics-based. I have been working on two projects with two of my co-workers. These are for the betterment of the company, but because they involve putting forth effort, I do not think that anyone will contact me about them all day if I do not bring them up.

I either love or hate being cynical, but I will keep doing it because it is easy and frequently accurate.


SOPA Hopefully Not a Black Swan Event

A “black swan event”, as described by Nassim Nicholas Taleb, is an event that is difficult to predict due to rarity, can cause widespread impact, and is explained or justified later in a way that makes it seem to have retroactive continuity. I just started reading his book describing such events. I pondered a bit about things like 9/11, the start of WWII, the invention of the Internet, or the namesake event, the discovery of the black swan. Why are these so difficult to predict? (Why ornithologists matter enough to name a kind of event is beyond me.)

I pondered about the prediction mechanism intrinsic to the human mind. It is my understanding that for all actions we take, our mind has a model, a prediction, of how it will play out. For most actions this is simple and automatic. Pick up a nearby object. Before you did, you had an estimate of how much resistance that object would provide, maybe some information about how it would feel in your hand. If your actual experience is different, then you will know that something is off. However, your brain primarily exists to decide how you move. As a person moves through childhood they create, and later in adulthood refine, their internal model for movement. Automatically our minds do this with all our movements.

What would happen if this same amount of effort were put into predicting our larger scale decisions? I know many who put their faith in god, or simply accept that they cannot understand the world, rather than try to make a real impact. I see no reason why effort in this regard couldn’t help us develop decision making abilities as refined as our ability to walk down a path.

I will make predictions. Many will be wrong, and those failures will force me to look at new information in new ways. Occasionally I will get some right; this could mean I am lucky or it could mean I am getting better at understanding what makes a good prediction.

Here is a simple prediction of tomorrow: My work in the tech support call center will be more heavily focused on supporting our Canadian financial customers than would normally be expected. I will be busier than on past holidays and I will not be asked to do any special tasks. I say this because most of my customers will still be on vacation, and the large number of Canadian shoppers on Boxing Day will stress the kinds of financial systems I support. Most of the managers will be out on vacation, so special tasks will be unlikely.

Here is the prediction for the near future: SOPA, if passed in its current form, will be a black swan event. There will be chaos in the IT industry, and legal and illegal channels will immediately be used to combat it. There will be unpredictable casualties in the form of censorship, commerce disruption, boycotts, maybe even the creation of a non-DNS naming system which could cost untold billions to implement. The IT sector has never been so unified on a piece of American politics. Even with recent GoDaddy shenanigans, there is a very small minority of IT companies that are pro-SOPA. With the entire Business Software Alliance against it, hopefully it will not pass. If it does, strange things will happen to the Internet; commerce and the flow of information worldwide will be altered because of the amount of American control over DNS. The strange occurrences will catch people who aren’t IT professionals completely off guard.
