Friday, July 11, 2014

Programming Languages for Big Data, Part 2

I mentioned the recent study on the relative speeds of programming languages to Tommy Jones, a specialist in natural language processing and a fellow member of Data Community DC. Being more industrious than I am, he dove into the code used by the authors of the paper in question. In their R code, he found gems such as a triple-nested "for" loop inside a "while" loop (instead of the "apply" functions or vectorized operations that idiomatic, and usually much faster, R code would use), which made the comparisons pretty useless, at least in the case of R. See Tommy's blog, Biased Estimates, for more details.
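To give a feel for how much implementation style alone can skew a benchmark, here is a minimal sketch in Python (not R, and not the code from the paper). The function names and the test matrix are made up for illustration. Both functions compute the same row sums; one uses explicit index loops, and the other leans on the built-in `sum`, which plays roughly the role that vectorized or "apply"-style code plays in R.

```python
import timeit


def row_sums_loop(matrix):
    """Row sums using explicit index-based nested loops."""
    sums = []
    for i in range(len(matrix)):
        total = 0
        for j in range(len(matrix[i])):
            total = total + matrix[i][j]
        sums.append(total)
    return sums


def row_sums_builtin(matrix):
    """Row sums using the built-in sum(), whose inner loop runs in C."""
    return [sum(row) for row in matrix]


if __name__ == "__main__":
    # Hypothetical test data: a 200 x 200 matrix of integers.
    matrix = [[i * j for j in range(200)] for i in range(200)]

    # Both versions must agree before their timings are worth comparing.
    assert row_sums_loop(matrix) == row_sums_builtin(matrix)

    loop_time = timeit.timeit(lambda: row_sums_loop(matrix), number=20)
    builtin_time = timeit.timeit(lambda: row_sums_builtin(matrix), number=20)
    print(f"explicit loops: {loop_time:.4f} s")
    print(f"built-in sum:   {builtin_time:.4f} s")
```

The exact timings depend on the machine and the interpreter, so none are quoted here. The point is only that two correct implementations in the same language can differ noticeably in speed, which is why a cross-language comparison needs idiomatic code in every language.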

Nonetheless, it's a pretty interesting question, and I'd love to see someone who's proficient in all of the languages involved try this test again with better code. I'm still intrigued by the very high speed of MATLAB/Octave, something that leads Andrew Ng to recommend those languages over R for prototyping. Tommy pointed out to me, though, that since R is closer to being a full-featured language, it's more flexible than MATLAB or Octave.
