Thursday, May 8, 2014

Why Data Science Needs Statistics

If you've read my earlier posts about why a scientific approach is important to data science, you won't find it surprising that I recommend Jeff Leek's recent post on the Simply Statistics blog, "Why Big Data Is in Trouble: They Forgot about Applied Statistics". Leek, a biostatistics professor at Johns Hopkins, and one of the instructors in Coursera's Data Science specialization, argues that a number of recent big data failures, including that of Google Flu Trends, can be chalked up to a lack of statistical knowledge among the researchers in question. Leek cites sampling, data collection, causal logic, model specification, and sensitivity analysis as areas where a solid knowledge of applied statistics could have prevented serious errors. It's a short but cogent read.

2 comments:

  1. We at COEPD provides finest Data Science and R-Language courses in Hyderabad. Your search to learn Data Science ends here at COEPD. Here, we are an established training institute who have trained more than 10,000 participants in all streams. We will help you to convert your passion to learn into an enriched learning process. We will accelerate your career in data science by mastering concepts of Data Management, Statistics, Machine Learning and Big Data.

    https://www.coepd.com/AnalyticsTraining.html

    ReplyDelete