Data Science Course Starts June 10th 2016

Data Science
(With SAS, R, WEKA, SPSS & Excel)*

Module 1: Data Visualization and Summarization
Part-1: Descriptive Statistics
 Introduction to Advanced Data Analytics
 Descriptive and inferential statistics for various business problems
 Types of Variables
 Measures of central tendency
 Dispersion
 Variable Distributions
 Probability Distributions
 Normal Distribution and Properties
 Skewness and Kurtosis
 Five-number summary analysis
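
As an illustration of the five-number summary topic above, here is a minimal sketch in Python. Python is not one of the course tools (SAS, R, WEKA, SPSS, Excel); it is used here only as an assumption for a compact, runnable example. The function name `five_number_summary` and the sample values are hypothetical. Quartile conventions differ between tools, so the numbers may not match SAS or SPSS output exactly.

```python
from statistics import quantiles

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of at least two numbers."""
    data = sorted(values)
    # method="inclusive" treats the data as the whole population, so the
    # quartiles always lie between the observed minimum and maximum.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]

# Hypothetical sample, made up for illustration only.
sample = [2, 4, 4, 5, 7, 9, 10, 12, 15]
summary = five_number_summary(sample)
print(summary)
```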
Part-2: Data quality and outlier treatment
 Outlier treatment with robust measurements
 Outlier treatment with central tendency Mean
 Outlier with Min Max methods
 Imputation with series means or median values
 Z score Calculation
 Data Normalization
 Sampling and estimation
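
The Z score calculation and data normalization topics listed above can be sketched as follows. This is a minimal Python example under stated assumptions: Python is not one of the course tools, the helper names `z_scores` and `min_max` are hypothetical, and the population standard deviation is used (a sample standard deviation would be an equally valid choice).

```python
from statistics import mean, pstdev

def z_scores(values):
    """Standardize values to mean 0 and (population) standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)  # assumes the values are not all identical
    return [(v - mu) / sigma for v in values]

def min_max(values):
    """Rescale values linearly onto the interval [0, 1]."""
    lo, hi = min(values), max(values)  # assumes hi > lo
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical data, made up for illustration only.
values = [10, 20, 30, 40, 50]
print(z_scores(values))
print(min_max(values))
```

A common rule of thumb flags observations with an absolute Z score above about 3 as candidate outliers; the exact cutoff is a judgment call.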

Part-3: Test of Hypothesis
 Null/Alternative Hypothesis formulation
 Type I and Type II errors
 One Sample T-TEST
 Paired T-TEST
 Independent Sample T-TEST
 Analysis of Variance (ANOVA)
 MANOVA
 Chi-Square Test (Non-Parametric Tests)
 Kruskal-Wallis
 Mann-Whitney
 Wilcoxon, McNemar test
Module 2: Data Preparation and Quality Check
Part-4: Data Validation and Data Imputation
 Proc Univariate techniques & analysis (SAS)
 Q-Q probability plots
 Cumulative frequency (P-P plots)
 Explore analysis (SPSS)
 Stem-and-leaf analysis
 Kolmogorov-Smirnov test
 Shapiro-Wilk test
 Anderson-Darling test

Part-5: Data Transformation
 Log transformation
 Arcsine transformation
 Box-Cox transformation
 Square root transformation
 Inverse transformation
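
The transformations listed above are closely related: the Box-Cox family includes the log transformation as a special case and is a shifted, rescaled version of the square root and inverse transformations for particular parameter values. A minimal Python sketch follows. Python is not one of the course tools, and the function name `box_cox` and the parameter name `lam` are hypothetical.

```python
import math

def box_cox(x, lam):
    """Box-Cox transform of a single strictly positive value x.

    lam = 0    gives the natural log transformation,
    lam = 0.5  is a shifted, rescaled square root,
    lam = -1   is a shifted, sign-flipped inverse (1 - 1/x).
    """
    if x <= 0:
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1) / lam

# Hypothetical values, made up for illustration only.
for lam in (0, 0.5, 1, -1):
    print(lam, box_cox(4.0, lam))
```

In practice the parameter is usually estimated from the data (for example by maximum likelihood) rather than fixed by hand.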
Module 3: Predictive Analytics (Supervised Learning)
Part-6: Predictive modeling & Diagnostics
 Correlation - Pearson, Kendall, Wilcox
 Simple Linear Regression (SLR)
 Multiple Linear Regression (MLR)
 Residual analysis
 Auto Correlation
 VIF Analysis
 CP Indexing
 Eigen Value for PCA Analysis
 Homoscedasticity
 Heteroskedasticity
 Stepwise regression
 Forward Regression
 Backward Regression
 Quadratic Regression
 Transformed Regression
 Dummy Variables Regression

Part-7: Logistic Regression Analysis
 Logistic Regression
 Discriminant Regression Analysis
 Multiple Discriminant Analysis
 Stepwise Discriminant Analysis
 Binary Regression Analysis
 Probit and Logit Models
 Estimation of probability using logistic regression
 Wald Test statistics for Model
 Hosmer-Lemeshow test
 Nagelkerke R-squared
 Maximum likelihood estimation
 AIC
 BIC (Bayesian information criterion)
Module 4: Advanced Analytics (Unsupervised Learning)
Part-8: Dimension Reduction Analysis
 Introduction to Factor Analysis
 Principal component analysis
 Reliability Test
 KMO MSA tests, Eigen Value Interpretation
 Rotation and Extraction steps
 Varimax Rotation
 Confirmatory Factor Analysis
 Exploratory Factor Analysis
 Factor Score for Regression

Part-9: Cluster Analysis
 Introduction to Cluster Techniques
 Hierarchical clustering
 K Means clustering
 Ward's Method
 Variation Methods
 Linkage Methods
 Centroid Methods
Module 5: Data Mining and Machine Learning
Part-10: Data Mining and Machine Learning
 Data partition (Training, Validation, Testing)
 Data Explore Analysis
 Data Testing Analysis
 Data Transform Analysis
 Linear Model
 Non Linear Model
 Random Forest Analysis
 Tree analysis (CHAID )
 J48 pruned & unpruned
 SVM (Support Vector Machine)
 ANN (Artificial Neural Network)

 Model Evaluation Testing
 Error/ Confusion matrices
 ROC
 MAPE
 Lift Curve
 Sensitivity
 Misclassification Rate
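
Several of the evaluation measures listed above come straight from the counts in a two-class confusion matrix. The following is a minimal Python sketch under stated assumptions: Python is not one of the course tools, the function name `confusion_counts` and the label vectors are hypothetical, and the positive class is coded as 1.

```python
def confusion_counts(actual, predicted):
    """Return (tp, fp, tn, fn) for binary labels coded as 0 and 1."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    return tp, fp, tn, fn

# Hypothetical labels, made up for illustration only.
actual    = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 0, 1, 0, 1, 0, 1, 0]

tp, fp, tn, fn = confusion_counts(actual, predicted)
# Sensitivity: share of actual positives that the model caught.
sensitivity = tp / (tp + fn)
# Misclassification rate: share of all cases predicted incorrectly.
misclassification_rate = (fp + fn) / (tp + fp + tn + fn)
print(tp, fp, tn, fn, sensitivity, misclassification_rate)
```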

Note: * The Business Analytics course can be implemented in these tools, subject to the packages chosen.