Friday, March 14, 2014

Why Scientists Make Better Data Scientists

Have a look at this blog post by Mike Walker on why it's useful for data scientists to have a scientific background. The link came to me in a list of "featured articles" I receive weekly from Data Science Central. The tl;dr is that analysts without scientific training (the author singles out those with undergraduate business degrees) lack the tools for distinguishing correlation from causation. This leads to a range of maladies, including spurious correlations, cherry-picked data, and "disconnected facts" strung together to construct a fallacious narrative. Walker acknowledges that not all successful analysis requires starting out with a hypothesis, but stresses that there are scientifically rigorous ways to explore data for unexpected relationships, such as A/B testing.
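To make the cherry-picking point concrete, here is a minimal Python sketch. It is not taken from Walker's post, and the function names, sample sizes, and seed are all illustrative assumptions. It generates one series of pure noise, compares it against many other independent noise series, and keeps whichever correlates most strongly. Nothing here is causally related to anything else, yet the "best" match can still look impressive.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def strongest_chance_correlation(n_points=20, n_candidates=1000, seed=0):
    """Return the strongest correlation found between one random series
    and many unrelated random series (all values are hypothetical noise)."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_points)]
    best = 0.0
    for _ in range(n_candidates):
        # Each candidate is drawn independently of the target.
        candidate = [rng.gauss(0, 1) for _ in range(n_points)]
        r = pearson(target, candidate)
        if abs(r) > abs(best):
            best = r
    return best


if __name__ == "__main__":
    r = strongest_chance_correlation()
    print(f"Strongest correlation among unrelated series: {r:.2f}")
```

With only 20 observations and 1,000 candidate series to choose from, the strongest correlation is typically far from zero. That is roughly what happens when an analyst searches a large dataset for relationships without a hypothesis or a held-out test to check them against.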

I find this refreshing after spending a great deal of time lately looking at job ads for data scientists: most ads focus on experience with specific software packages, rather than experience conducting rigorous research. I suppose the former is a more objective measure than the latter, but I'm not sure how useful it is to hire based on what applications a person has used before, especially in a profession where the state of the art changes rapidly. Another problem is that people have started slapping the term "data scientist" on a wide variety of jobs: I've seen it applied frequently to database architect positions, or even to positions that have more to do with software development than data analysis.

At the moment, all of this matters to me because the contract on which I was working ended last December, and I'm now looking for a job again. I've had two good interviews, but I'm finding it very hard to break into the profession with a background different from that of traditional data analysts and business analysts. One thing I have learned is the power of networking: one of my interviews came from a contact my wife made while carpooling, and the other resulted from my submitting a resume to a small-business group recommended by a former co-worker. (Oh, and if anyone has any good job leads, I'm happy to network here, too. :)


Those of you who frequently visit my links page might have noticed that I've updated it quite a bit over the past few weeks, particularly in the sections covering online courses ("Self-teaching Resources" and "Formal Learning Resources"). Coursera and Udacity have some interesting new offerings that you might want to check out. I'm also planning to add a section listing portals and other commercial websites, and I need to go through all the links to make sure the information on them is up to date. As ever, if you have any suggestions for additional resources, please let me know!
