What Is R?

Ranjit Mishra
17/07/2017

R is fast becoming a must-know language, driven by the growing demand for data science skills.

R is a computer programming language which is particularly well suited to handling and sorting the large datasets associated with Big Data projects.

The software environment used to create code in R is open source, meaning it is free to download, anyone can use it, and there is a plethora of guidance and advice available on how to use it most effectively. However, commercial distributions are also available, which often offer additional proprietary functionality or support packages.

Named after the shared first initial of the two men who first developed the language at the University of Auckland, Robert Gentleman and Ross Ihaka, R has become very popular in recent years and continues to grow in popularity, thanks to the explosion in analytic activities being carried out by businesses.

R's strengths as a statistical programming language stem from the fact that it was designed from the ground up to facilitate matrix arithmetic: carrying out complex, often automated calculations on data held in a grid of rows and columns. R is very good for creating programs which carry out calculations on these datasets, even as the datasets keep growing, and which produce visualisations that are refreshed as the data changes.
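As a rough illustration of what working on a grid of rows and columns looks like, here is a minimal sketch in base R. The store names, sales figures and weights are made up for the example and are not taken from any real dataset.

```r
# A small grid of rows and columns: three stores, four quarters of sales.
# All numbers here are invented for illustration.
sales <- matrix(
  c(10, 12,  9, 14,
    20, 18, 22, 25,
     5,  7,  6,  8),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("north", "south", "east"),
                  c("Q1", "Q2", "Q3", "Q4"))
)

# Arithmetic is vectorised: this scales every cell at once, with no loop.
sales_scaled <- sales * 1.2

# Summaries along rows and columns.
store_totals  <- rowSums(sales)   # one total per store
quarter_means <- colMeans(sales)  # one mean per quarter

# Matrix multiplication: a weighted score per store,
# giving later quarters more weight (hypothetical weights).
weights  <- c(0.1, 0.2, 0.3, 0.4)
weighted <- sales %*% weights

print(store_totals)
print(quarter_means)
print(weighted)
```

None of these steps needs an explicit loop, which is a large part of what makes this kind of calculation concise in R.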

The ability to produce these visualisations is another core strength of R. Its designers realised that visualisation is key to understanding the complex datasets being explored, so they built functionality for translating data into charts, graphs and complex multi-dimensional plots, as well as many user-defined methods of visualisation, into its core.
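To give a feel for the built-in plotting functions, the sketch below draws a scatter plot and a histogram using only base R and the `mtcars` dataset that ships with it. Writing to a PDF in a temporary directory is simply a choice made for this example, so that it also runs outside an interactive session; the file name is arbitrary.

```r
# mtcars is a small dataset bundled with R, so nothing needs downloading.
out_file <- file.path(tempdir(), "mtcars_plots.pdf")

# Open a PDF graphics device; every plot until dev.off() is drawn into it.
pdf(out_file, width = 8, height = 4)

# Put two plots side by side on one page.
par(mfrow = c(1, 2))

# Scatter plot: fuel economy against car weight.
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)",
     ylab = "Miles per gallon",
     main = "Fuel economy vs weight")

# Histogram: how fuel economy is distributed across the cars.
hist(mtcars$mpg,
     xlab = "Miles per gallon",
     main = "Distribution of mpg")

# Close the device so the file is written to disk.
invisible(dev.off())
```

In an interactive session the `pdf()` and `dev.off()` calls can be dropped, and the plots appear directly in a graphics window.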

Online, R code is rarely visible, as its results usually sit behind polished graphical interfaces. Companies such as Google, Facebook and Twitter have reported using R for data analysis, although that does not mean their public-facing services run on R. R is often cited as one of the most widely used programming languages for data science. APIs exist for many of these services, allowing applications written in R to access data from these outside sources and include it in their own analytics routines.

Thanks to this huge user base, just about every function that you might need for data analysis is available, often through open source extensions (known as packages) made available by the community. R can also call code written in other languages such as C++ or Java, for example through add-on packages such as Rcpp and rJava, so resources coded in those languages can be made available. Because R itself is available for all major operating systems, R code can usually be ported between Unix, Windows and Mac environments with little change.


