Data Science with Base SAS and SAS Enterprise Miner


Pairview Training

Summary

Price
£2,376 inc VAT
Study method
Onsite
Duration
4 days
Qualification
No formal qualification
Certificates
  • Certificate of completion - Free
Additional info
  • Tutor is available to students

Overview

This instructor-led course gives delegates an overview of data mining and the fundamentals of using some of the most in-demand tools (e.g. SPSS, SAS, R) for advanced predictive analytics. The principles and practice of data mining are illustrated throughout using the CRISP-DM methodology, and the course follows the stages of a typical data mining project: reading data, data exploration, data transformation, modelling, and interpretation of results. Delegates learn the basics of reading, exploring, and manipulating data with the chosen tool, and then how to build and use models from that data.

Description

  1. Introduction to data mining
  • Data Mining Methodologies
  • Definition, Description and Business Application
  • Best Practice for Data Mining
  • Nodes Description and Modelling Steps in SAS Enterprise Miner
  2. The basics of using the relevant analytics tool (SPSS, SAS, R, etc.)
  • A brief look at the SAS tool
  • A brief look at the R tool
  • A brief look at the IBM SPSS tool (OPTIONAL)
  3. Reading data files
  • Reading a data file into SAS
  • Reading a data file into R
  • Reading a data file from a SQL database
  4. Data Understanding and Manipulation
  • Dealing with missing data
  • Variable Attributes
  • Character Variables
  • Numeric Variables
  • Date Variables
  5. Searching for Relationships among fields
  • Using the application to identify relationships
  • Studying relationships between two categorical variables
  • Correlation between two numeric fields
  • Analysing the relationship between a numeric and a categorical field
  6. Selecting, Sampling and Partitioning Data for Modelling
  • Sorting and selecting observations
  • Using the Sample node to select records
  • Using the Partition node to partition data
  7. Preparing Data for Modelling
  • Cleaning and balancing the data
  • Numeric data transformation
  • Binning data values
  • Data partitioning
  8. Modelling Techniques

– Creating Models with Decision Trees

  • Explain how decision trees identify split points
  • Build Decision Trees in interactive mode
  • Change splitting rules
  • Explain how missing values can be handled by decision trees
  • Assess probability using a decision tree
  • Prune decision trees
  • Interpret results of the decision tree node, including: trees, leaf statistics, treemaps, score rankings overlay, fit statistics, output, variable importance, subtree assessment plots
  • Explore model output (exported) data sets
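
As an illustration of how a decision tree can identify a split point on a numeric input, here is a minimal Python sketch. It is not taken from the course materials: it assumes Gini impurity as the splitting criterion (SAS Enterprise Miner offers several criteria), and the function names and the small ages/responded data set are made up for the example.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 target labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_split(values, labels):
    """Return (threshold, weighted_impurity) for the best rule 'value <= threshold'.

    Candidate thresholds are the midpoints between consecutive distinct
    sorted values. If no split improves on the unsplit impurity, the
    threshold is None.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no threshold can separate identical values
        threshold = (lo + hi) / 2.0
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / n
        if impurity < best[1]:
            best = (threshold, impurity)
    return best


# Hypothetical data: customer age and whether they responded (1) or not (0).
ages = [22, 25, 31, 38, 45, 52]
responded = [0, 0, 0, 1, 1, 1]
threshold, impurity = best_split(ages, responded)
print(threshold, impurity)
```

In this toy data the two classes separate cleanly, so the chosen threshold falls midway between the oldest non-responder and the youngest responder.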

– Creating Models with Regression Technique

  • Explain the relationship between target variable and regression technique
  • Explain linear regression
  • Explain logistic regression (Logit link function, maximum likelihood)
  • Explain the impact of missing values on regression models
  • Select inputs for regression models using forward, backward, stepwise selection techniques
  • Adjust thresholds for including variables in a model
  • Interpret a logistic regression model using log odds
  • Interpret the results of a REGRESSION node (Output, Fit Statistics, Score Ranking Overlay charts)
  • Use fit statistics and iteration plots to select the optimum regression model for different decision types
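
To make the log odds interpretation concrete, the following Python sketch converts a fitted logistic regression equation into a probability and an odds ratio. The intercept, the coefficients, and the input names (age, prior purchase) are hypothetical values chosen for illustration, not output from any model built on the course.

```python
import math


def predict_probability(intercept, coefficients, inputs):
    """Apply the inverse logit link to a linear predictor (the log odds)."""
    log_odds = intercept + sum(b * x for b, x in zip(coefficients, inputs))
    return 1.0 / (1.0 + math.exp(-log_odds))


def odds_ratio(coefficient):
    """Multiplicative change in the odds for a one-unit increase in an input."""
    return math.exp(coefficient)


# Hypothetical fitted model: log odds = -2.0 + 0.05 * age + 0.7 * prior_purchase
INTERCEPT = -2.0
COEFFICIENTS = [0.05, 0.7]

p = predict_probability(INTERCEPT, COEFFICIENTS, [40, 1])
print(round(p, 3))                 # probability for a 40-year-old prior purchaser
print(round(odds_ratio(0.7), 3))   # odds multiplier for having a prior purchase
```

A coefficient of 0.7 on the log odds scale means the odds of the primary outcome roughly double for customers with a prior purchase, holding age fixed.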

– Predictive Model Assessment

  • Explain reasons for oversampling data
  • Adjust prior probabilities
  • Build a profit/loss matrix
  • Add a profit/loss matrix to a predictive model
  • Determine an appropriate value to use for expected profit/loss for the primary outcome (from the data, possibly a mean value)
  • Optimize models based on expected profit/loss
  • Compare models using model assessment statistics
  • ROC Chart
  • Score Rankings Chart, including (cumulative) % response chart, (cumulative) Lift chart, gains chart.
  • Total expected profit
  • Effect of oversampling
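
The interplay between oversampling, prior probabilities, and a profit/loss matrix can be sketched in a few lines of Python. This is an illustrative example only: the profit values, the 50% sample rate, and the 5% population rate are assumptions, and the function names are hypothetical rather than anything provided by SAS Enterprise Miner.

```python
def adjust_for_priors(p_sample, sample_rate, true_rate):
    """Rescale a probability from an oversampled model to the true prior.

    sample_rate is the proportion of the primary outcome in the training
    sample; true_rate is its proportion in the population.
    """
    num = p_sample * true_rate / sample_rate
    den = num + (1.0 - p_sample) * (1.0 - true_rate) / (1.0 - sample_rate)
    return num / den


# Hypothetical profit matrix, keyed by (actual outcome, decision).
PROFIT = {
    ("responder", "contact"): 9.0,
    ("responder", "ignore"): 0.0,
    ("non_responder", "contact"): -1.0,
    ("non_responder", "ignore"): 0.0,
}


def expected_profit(p_responder, decision):
    """Expected profit of a decision given the probability of a response."""
    return (p_responder * PROFIT[("responder", decision)]
            + (1.0 - p_responder) * PROFIT[("non_responder", decision)])


def best_decision(p_responder):
    """Pick the decision with the highest expected profit."""
    return max(("contact", "ignore"),
               key=lambda d: expected_profit(p_responder, d))


# The model was trained on a 50/50 sample; the true response rate is 5%.
p_adjusted = adjust_for_priors(0.8, sample_rate=0.5, true_rate=0.05)
print(round(p_adjusted, 4), best_decision(p_adjusted))
```

With these assumed profits the break-even probability is 0.1, so a case scored at 0.8 on the oversampled data is still worth contacting after adjustment, while one scored at 0.5 is not.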

– Deploying and using models

  • Score data sets
  • Configure a data set to be scored
  • Use the SCORE node to score new data
  • Save scored data to an external location with the SAVE DATA node

Certificates

Certificate of completion

Digital certificate - Included
