Are YOU a Predictably Irrational Analyst?

Posted by bkloss | Analytics, YouTube | Saturday 21 February 2009 2:31 am

Let’s begin the conversation with a little attention test.  Please count the number of passes the white team makes in the video below.


Yep, it’s easy to miss what’s right in front of your face.  If you can’t trust your finely tuned visual perception system to spot bears with an affinity for Michael Jackson, perhaps you’re overlooking other important things in your day-to-day life as a data analyst as well.

Last night I attended a talk by Dan Ariely, author of the New York Times best seller Predictably Irrational.  Throughout the talk, Dan offered evidence that the human brain is hardwired to make irrational choices that feel intuitively logical.  He drew a multitude of examples from cognitive and social psychology and connected them to the seemingly irrational decisions made in economics.  Seemingly irrational, that is, until the listener is made aware of the biological and environmental factors that work against the voice of reason in subtle and undetectable ways.

I’m going to extend Dan’s ideas to the sphere of data analysis to explore how some of our innate biases influence results.

Brain Jam

Dan spoke about a study conducted in a grocery store.  Shoppers were given the opportunity to sample jam at a tasting booth stocked with either 6 or 24 varieties, depending on which store entrance they walked in through.  The booth with 24 different samples did a great job of drawing people in: 60% of those who entered the store stopped to sample the jam, vs. 40% for the booth with 6 flavors.  At both booths, the people who stopped for a taste test tried around two different types of jam before continuing to shop.

Which group do you think bought jam from the store shelves more often after leaving the booth?  The group that sampled from the 6-jam table bought jam off the shelf 30% of the time.  The group that sampled from the table of 24… 3%.  And since roughly 3% of customers already have jam on their shopping list when they walk in, the big booth effectively added nothing at all.

So, why the 27% difference?  The more complicated a decision, the more likely someone is to go with the default: in this case, not buying jam.  As analysts, we value parsimonious, intuitive models.  We like one solid number to go by, be it a P-value or an R², and so we slip into tunnel vision.  We have to make a decision about which model is better, so there's a little voice whispering in the background, softly telling us to ignore crucial factors and make things 'easy'.

Did you check the assumptions?

Did you look at the distributions of the input variables?

Many practitioners fall into the trap of 'model first, ask questions later'.  Once we have found a significant set of predictors, we go back and say, 'Well, that distribution looks *pretty* normal', or 'It's significant if we impute missing values with the mean'.  Of course, we all know this is the wrong approach.  Don't let yourself fall victim to irrational decisions in the face of a complex model (sorry neural nets ;) not this time!)
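To see why that second excuse is dangerous, here's a minimal Python sketch (with made-up numbers) of what mean imputation does to a skewed variable: it quietly shrinks the spread, which in turn shrinks standard errors and flatters significance tests.

```python
import statistics

# Hypothetical, illustrative data: a right-skewed input variable
# where half the observations are missing.
observed = [1.0, 1.2, 0.9, 1.1, 15.0]   # note the heavy right tail
with_missing = observed + [None] * 5     # five missing values

# Mean imputation: plug the gaps with the average of what we saw.
mean = statistics.mean(observed)
imputed = [x if x is not None else mean for x in with_missing]

# The imputed values sit exactly at the mean, so they contribute
# zero deviation -- the spread of the 'complete' data is deflated.
print(statistics.stdev(observed))   # spread of the real data
print(statistics.stdev(imputed))    # smaller, artificially so
```

The imputed series always shows a smaller standard deviation than the data it was built from, and every downstream test that divides by that spread looks a little more 'significant' than it should.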

Abstraction = Cheating

Dan was asked to explain the prevalence of stock brokers using *fuzzy math* to inflate their quarterly bonuses.  Now it’s easy to see the conflict of interest that arises from performance based bonuses.  There are, however, a number of more subtle factors that lead to an increase or decrease in the tendency of people to cheat.

Dan set up a study where subjects were asked to complete 20 pen-and-paper math problems in a short period of time.  After the time elapsed, subjects scored their own quiz.  Here's the twist: for each correct answer, participants received either a dollar or a token that they exchanged for a dollar three feet away from where they received it.  I figured it shouldn't make a difference whether they were in the token group or the dollar group; essentially, they're the same thing.  Wrong!  Those who got tokens for right answers had MUCH higher self-reported scores than subjects who got dollars.  Does the presence of tokens increase mathematical ability?  I'm afraid not.  The token group was cheating, plain and simple.

It turns out that people are more willing to cheat when a level of abstraction is involved.  The more steps that separate a person from the money or goal they seek, the more likely they are to cheat the system, given the opportunity.  People were willing to cheat in the study because they were getting tokens, not dollars.  Perhaps even more interesting: a great number of subjects in the study were willing to cheat just a little bit.  It was not the case, as many people assume, that a few people cheated by a wide margin.

In data analysis, we take real world components of a business problem such as money or time and add a level of abstraction.  Dollars become probabilities and people turn into ROC curves.  With each added step of abstraction, it becomes easier for an analyst to justify fudging the numbers.  After all, it’s not real people being turned down for a loan by your credit scoring model, it’s just a slightly more significant set of predictor coefficients.

Moral of the story: test your newfound prediction system on a validation set!  Analysts have a tendency to create predictive models that are shaped and biased toward achieving the end goal.  If you want to ensure that you haven't tweaked the process just to get a significant outcome, expose your model to a new set of similar observations and see if you come out with the same result.

People who cheat are not born with a mental defect.  Instead, they are created, in part, by abstract situations.  Would you deflate a P-value?  Given the right set of circumstances, it's more probable than you'd like to believe.

How do you stop people from cheating?  Raise their awareness of a value system.  Subjects in the previous study were much less likely to cheat if they swore on a Bible at the beginning.  Yes, this method reduced the likelihood of cheating for all subjects, from born-again Christians to self-reported atheists.

Similarly, remember that your model has real world consequences on the process or people it predicts.  That may hinder your hardwired inclination to act like a hedge fund manager.

Free Marketing Tip

Although somewhat unrelated, this last paragraph outlines a cool marketing tip that Dan shared and I can’t help but pass along.  For those of you who aren’t in marketing, this tip can be adapted for the task of picking a wingman for Saturday night.

Consider the following subscription options from an Economist Magazine landing page.

When Dan asked a class of grad students what they would buy, how many do you think chose print-only access for $125?  Nobody, of course; grad students can count!  Here's the buying breakdown:

Nice work, Economist: most people went for the costly option.  So what happened when the 'silly' print-only choice was excluded?

Wow!  It looks like the earlier addition of a seemingly inconsequential choice tipped the scales in the Economist's favor.  Without the print-only option, the grad students overwhelmingly preferred the low-cost online access plan.  Hey, what do you expect?  They're living on a stipend.

Here’s the key takeaway.  If you want to make an option more attractive, include a similar option that is slightly less desirable.

You may be saying, 'Gee, that's great, but how do I increase my chances of picking up the girl of my dreams on Saturday night?'  Easy: simply bring along a slightly less desirable version of yourself.  OK, I admit it, that may be pretty difficult to pull off.  But honestly, people, if you want dating advice, you've come to the wrong blog.

I recommend you read Dan’s title ‘Predictably Irrational’ or any of my recent favorite books listed below.

1 Comment »

  1. Comment by osris — February 21, 2009 @ 3:39 am

    Very useful ideas to remember. Most often we want data mining to do wonders and we keep overcooking the model. Simple is significantly better.
