Friday, July 11, 2014

Programming Languages for Big Data, Part 2

I mentioned the recent study on the relative speeds of programming languages to Tommy Jones, a specialist in natural language processing and a fellow member of Data Community DC. Being more industrious than I am, he dove into the code used by the paper's authors. In their R code, he found gems such as a triple-nested "for" loop inside a "while" loop (instead of the "apply" functions or vectorized operations that idiomatic R favors, and which are usually much faster). That makes the comparisons pretty useless, at least in the case of R. See Tommy's blog, Biased Estimates, for more details.

Nonetheless, it's a pretty interesting question, and I'd love to see someone who's proficient in all of the languages involved try this test again with better code. I'm still intrigued by the very high speed of MATLAB/Octave (something that leads Andrew Ng to recommend those languages over R for prototyping). Tommy pointed out to me, though, that since R is closer to being a full-featured language, it's more flexible than MATLAB or Octave.
