R Programming

Course Duration: 60 hrs + Case Study

About R Programming

R is widely regarded by researchers, data scientists, and analytics professionals as one of the most powerful statistical analysis tools.
R is open-source software, distributed as part of the GNU Project, and runs on multiple operating systems (cross-platform).
R has strong graphical capabilities for data visualization and reporting.
R can read and write a wide range of file formats.
R has a large and active user community.

Course Overview

This advanced-level course focuses on the following key areas:

  • Creating Datasets and Reading Datasets from Other Sources (TXT, Excel, CSV, SPSS, SAS)
  • Data Manipulation and Management
  • Basic Statistics in R [Measures of Central Tendency, Dispersion, Correlation, Regression]
  • Graphics in R [Base and Advanced Graphics]
  • Advanced Statistics in R [Building Models in R: Linear Regression, GLM Regression, Logistic Regression]

What We Offer

Training under the guidance of a Data Scientist with 20+ years of experience, a postgraduate degree from IIT, a PhD from Boston University, and 40+ research papers on Data Science.
After training, an internship on live projects at our development partners (Ideal Analytics/ArcVision).
Case studies on real industry data
Classroom training with flexible timing
Customized/On-demand training
Unlimited access to exclusive Study Materials on Cloud

Chapter-1: Getting Started

1.1 Learning objectives
1.2 Download and Install R and RStudio
1.3 Working in the R Windowing Environment
1.4 Install and Load Packages


Chapter-2: Basic Building Blocks in R

2.1 Learning Objectives
2.2 R as a Calculator
2.3 Work with variables
2.4 Understand Data Types
2.5 Store Data in Vectors
2.6 Call Functions


Chapter-3: Advanced Data Structures in R

3.1 Learning Objectives
3.2 Create and Access Information in Data Frames
3.3 Create and Access Information in Lists
3.4 Create and Access Information in Matrices
3.5 Create and Access Information in Arrays


Chapter-4: Reading Data into R

4.1 Learning Objectives
4.2 Reading CSV Files
4.3 Understanding Why Excel Files Are Not Easily Accessible in R
4.4 Read from Databases
4.5 Read Data files from other Statistical Tools
4.6 Load binary R files
4.7 Load Data included with R
4.8 Scrape Data from the web


Chapter-5: Making Statistical Graphs

5.1 Learning Objectives
5.2 Using Datasets for Creating Graphs
5.3 Making Histograms, Bar Graphs, Line Graphs, Scatterplots, Boxplots, etc. with Base Graphics
5.4 Introduction to ggplot2
5.5 Histograms and density plots with ggplot2
5.6 Scatterplots with ggplot2
5.7 Box and violin plots with ggplot2
5.8 Creating Line plots
5.9 Control colour and shapes
5.10 Add themes to graphs


Chapter-6: Basics of Programming

6.1 Learning Objectives
6.2 The Classic “Hello World” Example
6.3 Basics of Function Arguments
6.4 Return a Value from a Function
6.5 Flexibility with do.call
6.6 If Statements for Controlling Program Flow
6.7 If-Else Statements
6.8 Multiple checks using Switch
6.9 Checks on Entire Vectors
6.10 Check Compound Statements
6.11 Iteration: for and while Loops
6.12 Control Loops with Break and Next


Chapter-7: Data Munging

7.1 Learning Objectives
7.2 Repeating Matrix Operations – the apply function
7.3 Repeating List Operations
7.4 The mapply function
7.5 The aggregate function
7.6 The plyr package
7.7 Combining Datasets
7.8 Joining Datasets
7.9 Switch storage paradigms


Chapter-8: Manipulating Strings

8.1 Learning Objectives
8.2 Combine Strings Together
8.3 Extract Text


Chapter-9: Basic Statistics

9.1 Learning Objectives
9.2 Drawing numbers from Probability Distributions
9.3 Summary Statistics: Mean, Variance, SD, Correlation
9.4 Compare samples with t-tests and Analysis of Variance


Chapter-10: Linear Models

10.1 Learning Objectives
10.2 Fit simple Linear models
10.3 Exploring the Data
10.4 Fit multiple Regression Models
10.5 Fit Generalised Linear Models (GLM)
10.6 Fit Logistic Regression
10.7 Fit Poisson Regression
10.8 Analyze Survival Data
10.9 Assess Model Quality and Residuals
10.10 Compare Models


Chapter-11: Other Models

11.1 Learning Objectives
11.2 Select variables and improve predictions with elastic net
11.3 Decrease uncertainty with weakly informative priors
11.4 Fit Non-Linear Least Squares
11.5 Splines
11.6 Generalised Additive Models (GAM)
11.7 Fit Decision Trees to make a Random Forest


Chapter-12: Time Series Analysis

12.1 Learning Objectives
12.2 Understanding ACFs and PACFs
12.3 Fit and Assess ARIMA Models


Chapter-13: Text Mining

13.1 Learning Objectives
13.2 Text Extraction & Manipulation
13.3 Sentiment Analysis
13.4 Social Media Analytics: Case Studies


Chapter-14: SQL & R Integration

14.1 Concept of SQL
14.2 SQL Operations
14.3 SQL Server and RStudio Integration
14.4 Producing Statistical Graphs and Charts from Data Imported from SQL Server


Chapter-15: Introduction to Machine Learning

15.1 Statistical Learning vs. Machine Learning
15.2 Major Classes of Learning Algorithms: Supervised vs. Unsupervised Learning
15.3 Concept of Overfitting and Underfitting (Bias-Variance Trade-off) & Performance Metrics
15.4 Types of Cross-Validation (Train & Test, Bootstrapping, K-Fold Validation, etc.)
15.5 Recursive Partitioning (Decision Trees)
15.6 Ensemble Models (Random Forest, Bagging & Boosting)
15.7 K-Nearest Neighbors

We offer case studies drawn from different industries. You can choose the one most applicable to you.

Case Study 1: Regression Analysis

How can you assess whether you are paying the correct price when buying a property?
Pricing is a very important function for any business. The right price can make the difference between profit and loss. In this case study we take property pricing as an example to gain a deeper understanding of regression analysis.

Step – 1: Data Preparation
A. Checking for outliers
B. Checking for missing values and how to treat them
C. Basic univariate and bivariate analysis, i.e. checking how the variables are distributed and how they correlate
Step – 2: Principal Component Analysis
Step – 3: Traditional Regression Analysis with variable selection
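
To give a flavour of these steps, here is a minimal sketch in base R. It is not the course material itself: the data are simulated, and the column names (price, area, bedrooms, age), the injected outlier and missing values, the median imputation, and the use of backward stepwise selection are all assumptions made for illustration.

```r
# Simulated property data (hypothetical; not the course dataset)
set.seed(42)
n        <- 200
area     <- round(rnorm(n, mean = 1200, sd = 300))            # square feet
bedrooms <- pmax(1, round(area / 450 + rnorm(n, sd = 0.5)))
age      <- round(runif(n, 0, 40))                            # years
price    <- 50 * area + 8000 * bedrooms - 600 * age + rnorm(n, sd = 10000)
homes    <- data.frame(price, area, bedrooms, age)
homes$area[c(5, 17)] <- NA    # inject missing values for the demo
homes$price[10]      <- 5e5   # inject one obvious outlier

# Step 1A: flag outliers with the boxplot (1.5 * IQR) rule and drop them
outliers <- boxplot.stats(homes$price)$out
homes    <- homes[!homes$price %in% outliers, ]

# Step 1B: count missing values, then impute with the median (one simple option)
colSums(is.na(homes))
homes$area[is.na(homes$area)] <- median(homes$area, na.rm = TRUE)

# Step 1C: univariate summaries and pairwise correlations
summary(homes)
cor(homes)

# Step 2: principal component analysis on the standardised predictors
pca <- prcomp(homes[, c("area", "bedrooms", "age")], center = TRUE, scale. = TRUE)
summary(pca)

# Step 3: linear regression with backward stepwise variable selection (by AIC)
full     <- lm(price ~ area + bedrooms + age, data = homes)
selected <- step(full, direction = "backward", trace = 0)
coef(selected)
```

In a real project the treatment of outliers and missing values would depend on the business context rather than on a fixed rule.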


Case Study 2: Marketing Analytics

As a key decision and strategy maker at an online retail store specializing in apparel and clothing, how can you identify opportunities to improve P&L by establishing an analytics practice? Background of behavioural analytics: people tend to follow involuntary patterns (behaving like similar people around them), and detecting those patterns is precisely the idea behind marketing analytics.

Step – 1: EDA – Exploratory Data Analysis
A. Exploring different patterns, i.e. the distribution of customers across the number of product categories each customer purchased.
B. Why customers buy different product categories.
C. Categorization of customers based on the number of product categories they purchased.
D. Which category contributes the highest sales?
Step – 2: Association Analysis
E. Support/Confidence/Lift – Apriori concept
F. Market Basket Analysis
Step – 3: Customer Segmentation
G. Classification/Clustering
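
The support, confidence, and lift measures used in association analysis can be computed by hand, as in the minimal base R sketch below. The baskets and the helper functions `support()` and `rule_metrics()` are hypothetical and written only for illustration; this is not an implementation of the Apriori algorithm, only of the measures it relies on.

```r
# Hypothetical transactions: each element is one customer's basket
baskets <- list(
  c("shirt", "jeans"),
  c("shirt", "jeans", "belt"),
  c("shirt", "belt"),
  c("jeans"),
  c("shirt", "jeans", "shoes")
)

# EDA: how many customers bought 1, 2, 3, ... distinct items
table(lengths(baskets))

# Support: share of baskets that contain every item in `items`
support <- function(items, baskets) {
  mean(vapply(baskets, function(b) all(items %in% b), logical(1)))
}

# Metrics for the rule lhs => rhs
rule_metrics <- function(lhs, rhs, baskets) {
  s_both <- support(c(lhs, rhs), baskets)
  s_lhs  <- support(lhs, baskets)
  s_rhs  <- support(rhs, baskets)
  conf   <- s_both / s_lhs             # P(rhs | lhs)
  c(support = s_both, confidence = conf, lift = conf / s_rhs)
}

m <- rule_metrics("shirt", "jeans", baskets)
print(m)
```

A lift below 1, as for this toy rule, suggests the two items appear together slightly less often than would be expected if they were bought independently.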


Case Study 3: Scorecard Modelling

Given the ongoing turmoil in credit markets, a critical reassessment of credit risk modelling approaches is needed more than ever. This modelling approach generates a probability-of-default score for each customer based on a set of independent variables (which may differ according to business requirements). The score can then be used for predictive modelling, MIS reporting, etc.

Step – 1: EDA – Exploratory Data Analysis
A. Data import and basic data sanity check.
B. Exploring different patterns i.e. distribution of data
C. Variable selection approaches (categorical & numerical).
D. Training and validation data creation.
Step – 2: Model Preparation
E. Creating indicator variables
F. Apply stepwise regression
Step – 3: Validation of Model
G. Check for multicollinearity (using correlation matrix, VIF)
H. Generate Score using logistic regression.
I. KS calculation
J. Coefficient validation, coefficient stability and score stability.
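
The scoring and KS steps can be sketched in base R as follows. This is an illustration under stated assumptions, not the course's scorecard: the loan data are simulated, the variable names (income, utilisation, default) and the 70/30 split are hypothetical, and with only two predictors the VIF reduces to a single auxiliary regression.

```r
# Simulated loan data (hypothetical; not real credit data)
set.seed(1)
n           <- 1000
income      <- rnorm(n, mean = 50, sd = 15)   # in thousands
utilisation <- runif(n)                       # share of credit limit in use
p_default   <- plogis(-1 - 0.04 * (income - 50) + 2.5 * (utilisation - 0.5))
default     <- rbinom(n, size = 1, prob = p_default)
loans       <- data.frame(default, income, utilisation)

# Training and validation data (70/30 split)
train_idx <- sample(n, size = round(0.7 * n))
train     <- loans[train_idx, ]
valid     <- loans[-train_idx, ]

# Multicollinearity check: VIF = 1 / (1 - R^2) of one predictor on the other
r2         <- summary(lm(income ~ utilisation, data = train))$r.squared
vif_income <- 1 / (1 - r2)

# Generate the score: predicted probability of default from logistic regression
fit         <- glm(default ~ income + utilisation, data = train, family = binomial)
valid$score <- predict(fit, newdata = valid, type = "response")

# KS statistic: largest gap between the cumulative score distributions
# of defaulters and non-defaulters in the validation data
bad_cdf  <- ecdf(valid$score[valid$default == 1])
good_cdf <- ecdf(valid$score[valid$default == 0])
grid     <- sort(unique(valid$score))
ks       <- max(abs(bad_cdf(grid) - good_cdf(grid)))
ks
```

A higher KS value indicates that the score separates defaulters from non-defaulters more clearly; comparing it between training and validation data is one way to look at score stability.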


Case Study 4: Web Scraping & Text Analysis

The rapid growth of the World Wide Web over the past two decades has tremendously changed the way we share, collect, and publish data. Firms, public institutions, and private users provide every imaginable type of information, and new channels of communication generate vast amounts of data on human behavior. Web scraping is the process of extracting data from websites; text analysis algorithms are then applied to analyze that data. Examples include Twitter analysis and Google data analysis.

Step – 1: Setup connection
A. Create an API key for a developer account.
B. Run an API request to fetch data.
Step – 2: Data Extraction
C. Save the data returned by the API to Excel/CSV.
D. Data analysis and sanity check (dealing with missing data)
Step – 3: Text mining
E. Apply different algorithms, such as sentiment analysis.

Tania Chakraborty

In-house Faculty (R & SAS)

Tania, with a background in engineering, has 3+ years of hands-on working experience with various analytics tools, mainly SAS & R. She played a major role in the student data analysis of two entire countries, Dominica & St. Kitts, on the popular student management software “openSIS”.
She has also worked on various other data analysis projects, such as Data Analysis on US Economic Indices, Twitter Sentiment Analysis, and GDP rates.
Alongside her project work, she provides training on Big Data analytics using Hadoop and R, Base SAS & Advanced SAS. She has already trained over a hundred high-profile MNC professionals in Data Analytics. She is the most junior but most appreciated faculty member on our team.

Tanushree Bhattacharyya

Guest Faculty (Advanced Excel, R)

Tanushree holds an M.Sc. in Econometrics & Statistics and has 8 years of experience in analytics and market research. She currently works at a large MNC and is proficient in statistical tools such as SAS, Advanced Excel, VBA, Access, SQL, SPSS, and Quantum. She is highly skilled in data analysis, building statistical models, creating publication-quality reports, and automating models with VBA/SAS/SPSS, with an excellent track record of managing clients and projects and exceeding expectations. She is an expert in handling analytical projects involving statistical techniques such as demand forecasting, multivariate techniques, optimization, and segmentation, and in reporting the insights to management to fulfill business requirements. She has been involved with NIVT for over a year and has an excellent track record of training professionals in Excel, VBA, and R programming. On behalf of NIVT, she has conducted training at corporate houses such as Dynamic Level.

Debajyoti Chakraborty

In-house Faculty (R & SAS)

Debajyoti is a Statistical Analyst, a member of the Actuarial Society of India, and an analytics trainer on statistical software (SAS, R, MS Excel) with basic knowledge of SQL. He is a graduate in Statistics, with Maths and Computer Science as other subjects, and has over 3 years of work experience as a Data Analyst. As a Statistical Analyst, he has worked on multiple industry projects, including dashboarding and analytics implementation for Retail and Healthcare projects. As an Actuarial Analyst, he assisted in Claim Analytics. As an Analytics Trainer, he provides training to industry professionals and students on statistical software packages (SAS, R, MS Excel from beginner to advanced, and SPSS), oversees data analysis projects undertaken by students, and shares knowledge to help them complete projects on time.