Thursday, May 8, 2014

Why Data Science Needs Statistics

If you've read my earlier posts about why a scientific approach is important to data science, you won't be surprised that I recommend Jeff Leek's recent post on the Simply Statistics blog, "Why Big Data Is in Trouble: They Forgot about Applied Statistics". Leek, a biostatistics professor at Johns Hopkins and one of the instructors in Coursera's Data Science specialization, argues that a number of recent big data failures, including that of Google Flu Trends, can be traced to a lack of statistical knowledge among the researchers involved. He cites sampling, data collection, causal logic, model specification, and sensitivity analysis as areas where a solid knowledge of applied statistics could have prevented serious errors. It's a short but cogent read.
