Being deceived by randomness

2010-02-04, Craig McQueen, Business Intelligence

An important part of getting value out of a Business Intelligence system is interpreting results correctly. We can easily be fooled into attributing a cause to results that are actually random.

Consider the following graph of sales over time.

[Graph: sales over time]

It looks like something caused sales to drop for three months. Management may scurry around trying to find out what happened during those months. Inevitably, some event that happened around the same time will be picked as the cause, and blame will be assigned. The opposite happens too: sales could be way up for three months, and there would be a line of people willing to take credit.

Any set of time series data is bound to contain a string of events that looks like order, typically called a ‘streak’. Examples are everywhere: home run streaks, a hockey team's string of losses, and the investment performance of a mutual fund manager. Skill certainly plays into the results, but elements outside our control introduce randomness as well. Stock markets are affected by many factors, such as world events, and are therefore chaotic and very unpredictable. This introduces a significant random element into a mutual fund manager’s performance.

There are other examples outside of time series data. Implied order from random events can be used to a marketer's advantage. Consider an unscrupulous investment newsletter trying to obtain subscribers through direct mail. They pick four stocks and predict whether each price will be up or down at the end of the next six months. With two possible directions for each of four stocks, there are a total of sixteen possible outcomes in this scenario. The investment newsletter prepares 16 different mailings, each predicting a different one of the 16 outcomes. One of them will have correct predictions for all four stocks at the end of the six months.
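
The counting behind this scheme can be sketched in a few lines of Python. This is only an illustration: the names `mailings`, `correct_mailings` and `lucky_recipients` are hypothetical, and the even split of recipients across mailings is an assumption.

```python
from itertools import product

STOCKS = 4  # number of stocks in the scenario

# Every possible up/down pattern for the four stocks: 2**4 = 16 mailings.
mailings = list(product(("up", "down"), repeat=STOCKS))


def correct_mailings(actual):
    """Return the mailings whose predictions match the actual outcome."""
    return [m for m in mailings if m == tuple(actual)]


def lucky_recipients(total_recipients):
    """Recipients of the all-correct mailing, assuming an even split."""
    return total_recipients // len(mailings)
```

Whatever the market does, `correct_mailings` finds exactly one matching mailing, so some group of recipients always sees a perfect record.
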
If 10,000 people are sent the mailings, split evenly across the 16 versions, then 10,000 / 16 = 625 people will receive a mailing that correctly predicts the direction of all four stocks. This would be a seemingly miraculous feat, one that motivates those 625 people to sign up for the newsletter.

People seek to find reason in randomness. We are tricked because a large enough set of random data will contain some stretch that looks like order.

When collecting and interpreting data, be careful about drawing conclusions from patterns that may simply be random. Specific considerations include:

- Use result metrics as a signal to investigate the cause further, but be sure to consider “random event” as one of the potential causes.
- Use metrics that lead to results in addition to the results themselves. For instance, the number of sales calls leads to revenue results.
- If you have enough data, use statistical techniques to demonstrate that an event is statistically significant.

For further discussion on randomness, see the book “The Drunkard’s Walk: How Randomness Rules Our Lives” by Leonard Mlodinow.
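
As a rough illustration of how often streaks show up in random data, the sketch below estimates the chance that a series of months contains a run of "down" months by pure chance. It rests on simplifying assumptions that are not from the article: each month is independently "down" with probability 0.5, and the 24-month window, three-month streak and function names are hypothetical choices.

```python
import random


def longest_run(flags):
    """Length of the longest run of consecutive True values."""
    best = current = 0
    for flag in flags:
        current = current + 1 if flag else 0
        best = max(best, current)
    return best


def streak_probability(months=24, streak=3, trials=20_000, seed=0):
    """Estimate the chance that random data contains a 'down' streak.

    Assumption: each month is independently 'down' with probability 0.5.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        downs = [rng.random() < 0.5 for _ in range(months)]
        if longest_run(downs) >= streak:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(f"Chance of a 3-month down streak in 24 months: {streak_probability():.2f}")
```

Under these assumptions, a three-month dip somewhere in two years of data is more likely than not, even when nothing caused it. Real sales data is rarely this simple, so treat the result as a sanity check and not as a formal significance test.
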