Learning anything starts with the basics. So here we start with the basics of R and RStudio: the R environment, basic operations, the various data types, R scripts and saving your work in R, a first R program, R functions, the most common errors in R, R help, and more.
Why R?
The main reason to learn R is that it is an easy language to pick up. Someone who puts in real effort for one or two weeks can start doing analytics. The learning curve is gentle. There are various levels of learning R: becoming a very good R developer takes time, but a few basics are enough for essential analytics, and you do not need to go very deep to get started. Generally, learning any good data analysis tool, whether SPSS, R, SAS, or another, involves three major steps, which are described below.
The first step is to understand the basics: the environment, the syntax, how to debug, where errors occur, where output appears, and so on. We should not get lost in the environment of the language or the tool. We need to be confident with the tool first. Even if we are not writing large programs, we should know which command produces which result.
Second, a data analyst should be good at data handling. Getting data from various sources, merging it, and preparing it is crucial. Preparing the data for analysis is as important as the analysis itself. We cannot always expect someone to hand us clean data in one place. We may have to fetch the data from different sources, prepare it, create new variables, create new data sets, filter the data, and so on. All of this comes under data handling, which is the second step.
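The data handling tasks described above (merging sources, creating a variable, filtering) can be sketched in a few lines of base R. This is only an illustration: the `customers` and `orders` data frames, their columns, and the 200 threshold are made-up assumptions, not data from this course.

```r
# Hypothetical data arriving from two separate sources
customers <- data.frame(
  id   = c(1, 2, 3, 4),
  name = c("Asha", "Ben", "Chen", "Dina")
)
orders <- data.frame(
  id     = c(1, 2, 2, 4),
  amount = c(250, 120, 80, 400)
)

# Merge the two sources on the shared key column "id"
# (customers without any order are dropped by default)
combined <- merge(customers, orders, by = "id")

# Create a new variable: TRUE for orders of 200 or more
combined$large_order <- combined$amount >= 200

# Filter: keep only the rows flagged as large orders
large_orders <- combined[combined$large_order, ]
print(large_orders)
```

In real work the two data frames would typically be read from files or a database rather than typed in by hand.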
The third step is to understand the important functions and basic libraries, and to perform the analysis itself.
These are the three major steps that a data scientist or data analyst should focus on while learning a data tool. This session covers the first step: the basics, the environment, and the coding syntax.
What is R?
R is a programming language for data manipulation, statistical computing, graphics, and data analytics. It is used mostly for statistical computing, and it has good graphics capabilities. These are the two areas where R is most prominent.
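As a small taste of both areas, the sketch below computes a few summary statistics and draws a histogram with base R. The `scores` vector is hypothetical sample data invented for this illustration.

```r
# Hypothetical sample: exam scores for eight students
scores <- c(72, 85, 90, 66, 78, 95, 88, 81)

# Statistical computing: basic summary statistics
mean_score   <- mean(scores)    # arithmetic mean
median_score <- median(scores)  # middle value
sd_score     <- sd(scores)      # sample standard deviation

print(mean_score)
print(median_score)
print(summary(scores))          # min, quartiles, mean, max

# Graphics: a histogram drawn with base R graphics
hist(scores, main = "Distribution of scores", xlab = "Score")
```

The statistics functions and `hist()` all ship with a standard R installation, so no extra packages are needed to run this.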
R is better described as a programming language and environment than as a tool. For example, SAS is usually regarded as a tool, whereas R should be understood as a language with excellent graphical capabilities. It offers many statistical algorithms that are not yet available in other tools, largely because R is open source while many of the others are not. Researchers often implement new methods in R first, many university professors recommend R to their students, and many new algorithms are written in R.
R is a comprehensive data analytics environment. It can connect to most common types of database, it has good visualization capabilities, and almost all widely used statistical algorithms are available in it. New solutions in big data, data mining, machine learning, and data visualization keep appearing in R. So it is fair to call R a comprehensive analytical tool.
Installation of R
- Go to the R homepage
- Locate the download link http://cran.r-project.org/
- Select the version relevant to your operating system and download it.
- Install it by running the downloaded installer (an .exe file on Windows).
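Once the installation finishes, one quick way to confirm that R works is to open the R console and ask for its version. This is a minimal sketch; the version string it prints depends on the release you installed.

```r
# Print the version of R that is currently running
print(R.version.string)

# Show details about the session: R version, platform, attached packages
sessionInfo()
```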
What is RStudio?
RStudio is for people who do not like working in a command-line interface. Let us first look at what R is, and then at what RStudio is.
On its own, R is used through a command-line interface. That is, you do everything by entering commands, with very little use of the mouse. Not everyone likes a command-line interface, so an IDE called RStudio was built on top of R.
R provides the algorithms; RStudio makes navigating and coding in R much easier. Version control, project management, browsing the files involved, and reviewing the code you have written are all easier in RStudio.
RStudio is not a separate analysis tool but a kind of skin built on R. R by itself is a fairly raw programming interface. RStudio offers many shortcuts, so we do not need to type a command for everything. For example, to download or import data, we can use the GUI in RStudio. This makes both coding and version control easier and more efficient.
How different is R from RStudio?
They are not two competing tools: RStudio runs on top of R. R is the core, i.e., any command written in RStudio is passed to R.
If we write something like print(co2), a small command that prints co2, one of R's built-in datasets, the command is sent to R in the background, and the output is then fetched and shown in RStudio.
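The same command can be typed into the RStudio console or into plain R, and the result is identical because R does the work in both cases. A minimal sketch, using only the built-in co2 dataset and base R functions:

```r
# co2 ships with R in the built-in 'datasets' package
print(co2)            # prints the full monthly time series

# A few quick ways to inspect the same object
print(class(co2))     # the kind of object co2 is (a time series)
print(head(co2))      # the first six observations
print(length(co2))    # the total number of observations
```

Note that R is case sensitive: co2 and CO2 are two different built-in datasets.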
RStudio is just an IDE. IDE stands for Integrated Development Environment, that is, an environment built to make coding in R easier. RStudio is what we will actually use for coding. Installing RStudio is also easy. It is open source and can be downloaded from the RStudio website and then installed.
- Go to the RStudio homepage and locate the download link https://www.rstudio.com/
- Select the version relevant to your operating system and download it.
- Install it by running the downloaded installer (an .exe file on Windows).
After installation, we can start programming. The programming guidelines follow in the upcoming sessions. In the next section, we will discuss the R environment.