Machine Learning and Predictive Analysis Boot Camp

Overview

This hands-on machine learning course advances participants’ data analysis skills. The course covers real-world predictive modeling and basic machine learning techniques that will help participants excel at data analysis in their organizations. It immerses participants in working with R to lay a solid data science foundation and trains them in techniques that enable them to leverage their data in more sophisticated and powerful ways.

What You'll Learn

Understanding machine learning and data science
Introduction to data mining
Working with missing values, outliers and duplicate records
Working with linear regression models and classification models
Performing cluster analysis
Learning dimension reduction techniques

Curriculum

Module 1: Overview of Data Science

Data science as a quantitative discipline
- How to define the scope of Data Science
- The many faces of Data Science: Data Mining, Data Analysis, Data Analytics, Machine Learning, Predictive Modeling, Statistical Learning, Mathematical Modeling. What are these all about?
- Data Mining as a data exploration process
- Machine Learning: supervised vs. unsupervised
- Machine Learning vs. Predictive Analytics
- Big Data Analytics: what it is and why it’s important
Overview of the data mining process cycle
- Understanding business needs and identifying new business opportunities
- Formulating a business problem and associated requirements
- Defining key quantitative metrics to measure success and evaluating business benefits
- Translating business requirements into technical requirements and documentation
- Formulating data models based on business and technical requirements
- Identifying a set of quantitative models based on technical requirements and metrics of success
- Running the models and evaluating results
- Selecting the best model
- Deploying the model

Module 2: The Data Foundation

Data sources
Types of data
- Structured vs. unstructured data
- Static data vs. real-time data
- Types of data attributes: numerical vs. categorical
- Role of the time factor and time trends in data analysis
Working with missing values
- Main causes of missing data
- Understanding the importance of missing information
- Types of missing information
- Restoring missing values
- Imputing missing values and selecting imputation techniques
- Understanding and evaluating potential consequences of manipulating records with missing values
Working with outliers
- Defining quantitative criteria for outlier detection in 1D cases
- Understanding the role of outliers in model building
- Deciding on outlier removal
- Defining outlier detection metrics in multi-dimensional space
Working with duplicate records
- Defining duplicates
- Understanding sources of duplicates
- Deciding on duplicate removal

Module 3: Sampling and hypothesis testing

Why sampling may be important for Machine Learning
Sampling techniques and sample bias
Statistical hypothesis
Z-score, t-score and F statistic
P-values
Applying hypothesis testing to model evaluation

Module 4: Machine learning fundamentals

What is Machine Learning?
Supervised vs. unsupervised learning
Overview of supervised Machine Learning
- Regression models
- Classification models
Overview of unsupervised Machine Learning
- Clustering methods
- Principal component analysis and dimension reduction
- Association rules
Overview of major steps in building and testing quantitative models
- Criteria for model selection
- How to prepare a training set
- Criteria for selecting model attributes/predictors
- Working with collinear variables
- Addressing the imbalance problem
- Dealing with over-fitting; bias-variance tradeoff
- Validation and cross-validation

Module 5: Building a linear regression model with R

Univariate regression vs. multiple regression
Overview of the mathematical foundation of linear regression: least squares method vs. maximum likelihood method
Model assumptions
Working with continuous attributes
Dealing with collinear variables
Model subset selection:
- Forward stepwise selection
- Backward selection
- Shrinkage methods: ridge regression and Lasso
- Dimension reduction
- Information criteria
Automating the model selection procedure
Model parameter evaluation: R-squared vs. adjusted R-squared
Validating the model
Working with categorical variables
Considering input variable interactions

Module 6: Example of building a classification model with R

Dealing with imbalanced training sets
Understanding the confusion matrix
Evaluating binary classifiers using ROC / AUC

Module 7: Example of cluster analysis with R

Overview of the mathematical foundation of cluster analysis
K-means clustering method
- Algorithm overview
- Convergence criteria
- How to determine the number of clusters

Module 8: Dimension reduction techniques with R

What is dimension reduction?
The practical goals of dimension reduction implementation
Principal component analysis vs. singular value decomposition
How many components to choose

Module 9: Class conclusion

What was not covered in the class
Big Data Analytics – the future of machine learning: main tools and concepts

Who should attend

The course is highly recommended for:

Data analysts
Machine learning professionals
Business analysts
Data mining specialists

Prerequisites

Participants need intermediate-level data analysis skills and basic knowledge of descriptive statistics. Experience working with R is beneficial. Technical requirements: R installed, along with some R packages. Installing RStudio is helpful but not required.

Curriculum modules

Module 1: Overview of Data Science

Module 2: The Data Foundation

Module 3: Sampling and hypothesis testing

Module 4: Machine learning fundamentals

Module 5: Building a linear regression model with R

Module 6: Example of building a classification model with R

Module 7: Example of cluster analysis with R

Module 8: Dimension reduction techniques with R

Module 9: Class conclusion
