Data Stories That Should Scare You

Dominic Vincent Ligot

8 November 2018

If you’re looking for a good cheap big data scare, here are a few suggestions picked out by our over-fitted recommendation engine. Although the movies vary in scope and tone, they are each quite horrifying in their own little way.

Fortunately these are all just stories and bear no resemblance to real life.

Or do they?

Invasion Of The Excel Army

At first glance, everything looked the same. It wasn't. Something evil had taken possession of the company.

Synopsis: The story depicts a rabid invasion of spreadsheet analysis that begins with a single report in a finance department. Pivot Tables, macros, and cell formulas spread from spreadsheet to spreadsheet and assimilate the memories and personalities of each manager and decision maker in the company. Little by little, a company VP discovers this “quiet” invasion and realizes the entire company is being run by spreadsheets.

Review: The movie starts slow, but it quickly grows on you as the eerie disposition of otherwise benign analysts becomes more ominous. The audience will easily relate to the VP protagonist as he uncovers the conspiracy, but will quickly abandon hope for him when he starts fighting to rid the company of data duplication and impose spreadsheet governance.

The Markov Chain(saw) Massacre

The events of that day were to lead to the discovery of one of the most bizarre crimes in the annals of data science.

Synopsis: The film follows a group of analysts on a company's business intelligence team who fall victim to a band of sadistic, cannibalistic third-party statisticians in the middle of a critical audit.

Review: Although quaint, the title is actually misleading since there are no actual Markov Chains to be found in the entire movie – but that’s the least of the movie’s problems. The primary conflict builds around the pervasive use of brute-force random forests and logistic regression without heed for proper variable selection and significance testing. The movie repeatedly stresses the lesson that statistics used without domain knowledge or common sense will always be prone to error, while leaving the audience guessing which variables will survive till the end of the movie.

Trivia: This film was banned outright in several countries, and numerous theaters later stopped showing the film in response to complaints about its denigration of statisticians.

Night Of The... Living Reports

George, those dead reports won’t stay dead!

Synopsis: The story follows a group of IT managers who are trapped in an office while auditing their company’s data extracts. The suspense begins when their tally shows that data extracts are being sent to six hundred and sixty-six (666) executives, which is strange for a company of only two hundred (200) employees.

Review: Compelling start, but unfortunately an anti-climactic finish. The IT team’s decision to launch a “Big Data” project to migrate the data extracts to an open-source distributed file system fails to inspire once it is revealed that [Spoiler Alert] 80% of the 666 reports were the exact same table, albeit with one or two columns added, or just aggregate versions of the same data. The unexpected twist at the end might be worth sitting through the three-hour run-time, though. Tip: strong stomachs are required when the main characters discover the fate of the original requesters of some of the legacy reports.

Bonus: Watch out for a brief outtake after the end credits – a popular digital transformation evangelist gets his head blown off by a sniper shot in mid-sentence while saying the word “Hadoop”.

Rosemary's Data

They're coming to get you, Rosemary. It's too late.

Synopsis: A data analyst fears that her manager has made a pact with beings from other departments, promising them access to her database as an unofficial data source in exchange for political favors and a promotion.

Review: Here’s one part data fable, two parts political and organizational commentary. If you get a sick sense of enjoyment from watching Rosemary sink into the depths of paranoia at her manager’s dubious moves, and react by making her database queries and ETL scripts ever more elaborate, all to no avail, this is the fright-trip for you.

The truth: regardless of how much data engineering or data science mastery your team has, analysis and information will always be at the mercy of the political gods in any organization.

Doc is the founder of Cirrolytix, where he helps organizations gain competitive advantage through data and analytics. At home he's also a frustrated horror movie critic.

This article first appeared on LinkedIn. The wonderful dated B-movie art is courtesy of Pulp-o-mizer.


