Data Dredging | Geckoboard

Data Dredging

Data dredging is the failure to acknowledge that a correlation found in the data was in fact the result of chance.

Tests for statistical significance only work if you’ve defined your hypothesis upfront. Historically, this has been a problem with clinical trials, where researchers have ‘data-dredged’ their results and switched what they were testing for after seeing the data. It helps explain why so many results published in scientific journals have subsequently been shown to be wrong. To avoid this, it’s now becoming standard practice to register clinical trials, stating in advance what the primary endpoint measure is.
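
To see why the hypothesis has to be fixed in advance, it can help to count how often pure noise produces a "significant" result when many comparisons are tried. The sketch below is illustrative only and makes several assumptions that are not in the original text: the tests are independent, every null hypothesis is true, the threshold is 0.05, and each p-value is therefore drawn uniformly from [0, 1]. The function names `familywise_error_rate` and `simulate_dredging` are hypothetical.

```python
import random


def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive across independent tests
    when every null hypothesis is actually true."""
    return 1.0 - (1.0 - alpha) ** num_tests


def simulate_dredging(num_tests: int, alpha: float = 0.05,
                      trials: int = 20000, seed: int = 0) -> float:
    """Estimate the same probability by simulation.

    Assumption: under a true null hypothesis a p-value is uniform on [0, 1],
    so each test is modelled as a single uniform random draw.
    """
    rng = random.Random(seed)
    studies_with_a_hit = 0
    for _ in range(trials):
        # A "dredged" study runs num_tests comparisons and reports any hit.
        if any(rng.random() < alpha for _ in range(num_tests)):
            studies_with_a_hit += 1
    return studies_with_a_hit / trials


if __name__ == "__main__":
    for m in (1, 5, 20):
        print(m, round(familywise_error_rate(m), 3), round(simulate_dredging(m), 3))
```

Under these assumptions, a single pre-specified test has a 5% chance of a false positive, while trying 20 comparisons and reporting whichever one crosses the threshold gives roughly a 64% chance of finding something. Registering one primary endpoint in advance corresponds to the `num_tests = 1` case.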

Related Reading:

xkcd cartoon: Do green jelly beans cause acne?
Statistical errors and P values explained: Why P values aren’t as reliable as many scientists assume
How to avoid P Hacking and false positives in research studies