Guerrilla Data Analytics (GDAT)
How to Get Beyond Monitoring
From Linear Regression to Machine Learning
Complete with a review of the R programming language
and application of R statistical tools
Data comes from the Devil.
Therefore, you need to know how to waterboard it with the right
statistical tools to get it to tell you the truth.
Contents
1 Purpose
2 Certification
3 Course Goals
4 Dates and Registration
5 Who Should Attend?
6 Course Outline
6.1 GDAT Day 1
6.2 GDAT Day 2
6.3 GDAT Day 3
6.4 GDAT Day 4
6.5 GDAT Day 5
7 Registration and Materials
7.1 Registration
7.2 Textbook
7.3 Location
7.4 Meals
1 Purpose
You already understand the essential concepts of computer system capacity planning
(e.g., Level II certification) and
you've collected cubic light years of performance data. But now you realize that's not sufficient.
Why? Because raw performance data is not the same thing as performance information.
To extract the pertinent information, you need to transform your data.
And that's precisely what this class teaches you.
Moreover, the data analysis techniques we present are general purpose, and therefore
not tied to any particular computing platform or data collection tools.
Although there are no prerequisites, it is strongly recommended that you take the
Level II GCaP class
before embarking on this Level III GDAT class.
2 Certification
This class (GDAT) corresponds to Guerrilla Capacity Planner: Level III certification,
where the levels are defined as:
- Level I: Entry level for newbies, e.g., Guerrilla Boot Camp (GBOOT),
which is usually offered on a demand basis only.
Please contact Performance Dynamics if you would like to take this Level I class.
- Level II: Exposure to a wide variety of computer systems capacity planning concepts, methods, and
tools that can be adapted opportunistically to support the needs of
enterprise-level platform-independent performance management.
- Level III: Detailed study of a particular capacity planning technique or performance analysis tool,
e.g., Guerrilla Data Analysis Techniques (GDAT).
A printed certificate reflecting the level of achievement is awarded to each attendee at the completion of the
respective course.
3 Course Goals
After completing this course, students will know how to:
- Transform data into information.
- Use statistical techniques and tools to reduce the number of metrics that
need to be monitored and analyzed.
- Apply regression analysis to determine the scalability of web applications and services.
4 Dates and Registration
Check out the schedule for dates and online registration.
5 Who Should Attend?
Computer system administrators, mainframe system operators, network system administrators, performance engineers,
test engineers, IT consultants, data center managers, DevOps, IT technical managers, and software development engineers.
This course does not assume any prior experience with performance analysis methods, but a working knowledge of computer systems
and high school algebra is helpful.
6 Course Outline
Class typically begins at 9am and the instructor is generally available until 9pm each day.
Many class discussions have been known to continue over dinner.
A serviced morning break of half an hour occurs around 10:30am.
Seated lunch service is provided from noon until 1pm.
A serviced afternoon break of half an hour occurs around 3:00pm.
A large number of practical exercises (with solutions in R) will be given and discussed throughout
the five days. You are encouraged to bring a laptop computer.
6.1 GDAT Day 1
- How to Detect Bad Data
  - All data is wrong by definition
  - Broken performance tools
  - The power of good statistical models
- Introduction to R
  - Why R is de RigueuR on Wall St and elsewhere
  - My special 911.r script
  - R commands
  - R language
  - R graphics
  - Installing R
- Expressing Measurement Error
  - Measurement is a process not a number
  - Confidence intervals and sigma levels
  - Confidence bands and QQ plots
  - How to express errors
6.2 GDAT Day 2
- Review of Elementary Statistics
  - Descriptive statistics
  - Measures of central tendency: mean, median and mode
  - Meaning of the means: arithmetic, geometric, harmonic
  - Measures of dispersion: stdev, variance, stderr, percentiles
  - Summarizing data and its statistics
- Histograms and Distributions
  - Review of Uniform, Normal, Poisson, Exponential distributions
  - How to determine normal distributions
  - How to determine exponential/Poisson distributions
  - Weighted multi-class workloads
6.3 GDAT Day 3
- Regression Analysis
  - Linear regression done right
  - Hubble's bubble & the most famous scatter plot
  - Fitting and projecting
  - VAMOOS: Visualize, Analyze, Modelize, Over and Over, until Satisfied
- Multivariate and Nonlinear Regression
  - Multivariate regression
  - ANOVA: Analysis of Variance
  - Nonlinear regression
  - Moving averages
6.4 GDAT Day 4
- Application Scalability Analysis
  - Load test data and QA analysis
  - Universal scalability law (USL)
  - Applying USL to production data
  - Analyzing data for scalability zones
- Applying Regression Analysis to Web Traffic
  - Web server scalability
  - Web traffic profiles and time zones
6.5 GDAT Day 5
- Taming the Data Torrent
  - Principal component analysis
  - Reducing the number of monitored metrics
  - Case studies: PerfViz, Apdex, Barry
- Machine Learning for CaP
  - Machine learning algorithms
  - Support Vector Machines
  - Supervised learning
  - The SVM package in R
  - Detecting performance patterns and defining exceptions
- Wild (Not Mild) Distributions
  - Power law data and distributions
  - Case studies: SQL access patterns, web traffic, data recovery
  - Data validation using qqplots, log-linear plots and log-log plots
- Review and Class Discussion
7 Registration and Materials
7.1 Registration
All registration is now done online.
Please consult the Guerrilla Training Schedule for current pricing and conditions.
7.2 Textbook
There is no separate Guerrilla book for the GDAT course, but this
reading list includes books that are referred to during the class.
7.3 Location
See the Guerrilla Training Schedule for details about the hotel location and room reservations.
The city of Pleasanton is right next door to Castro Valley.
7.4 Meals
Lunch is provided each day.