
RESEARCH DATA SERVICES (RDS) @ Georgia State University Library

R Series



Get a GSU Data Ready! Badge Micro-Credential for completing these tutorials to show others your commitment to learning data skills! Learn more at lib.gsu.edu/data-ready

 



The R series introduces participants to the fundamentals of using the R programming language and associated tools to perform common data analysis tasks. R is free to use and is extremely popular among researchers in academia, business, and non-profits. It is especially useful for conducting statistical analysis.

This series consists of four tutorials. For individuals who are new to R, coding, or data analysis, it is highly recommended that the tutorials be attended in sequential order. Additionally, while these tutorials are taught exclusively using code (i.e. there are no point-and-click methods), attendees do not need to have any prior experience with programming, coding, or scripting. All are welcome.

Skill Requirements: None.

Software Requirements for Hands-on Participation:

Participants who wish to follow along with the “hands-on” portion of the tutorial should see the directions at the following URL: https://research.library.gsu.edu/R/workshop


R 1: Getting Started with R and RStudio

Topics:

  • Using RStudio to work with R
  • R syntax, commands, functions, and packages
  • Opening, viewing, and exploring data
  • Generating basic descriptive statistics from data


R 2: Tidyverse and Manipulating Data

Topics:

  • Introduction to Tidyverse packages (emphasis on dplyr)
  • Transforming and generating variables
  • Handling data with missing values
  • “Piping” data and data processes

R 3: Data Visualization and Mapping

Topics:

  • Creating statistical plots using ggplot2
  • Customizing plot colors, themes, labels, etc.
  • Working with GIS data to create maps
  • Modifying maps with overlays, custom aesthetics, and additional data

R 4: Statistical Modelling

Topics:

  • Basic analysis, descriptive statistics, t-tests
  • Creating linear models (multiple linear regression & logistic regression)
  • Evaluating linear models and generating predictions
  • Creating simple machine learning models (time permitting)

SPSS Series



Get a GSU Data Ready! Badge Micro-Credential for completing these tutorials to show others your commitment to learning data skills! Learn more at lib.gsu.edu/data-ready



The SPSS 1 and SPSS 2 tutorials in this two-part series focus on the point-and-click method of using SPSS; the syntax/code method is introduced only briefly.


SPSS 1: Getting Started

This tutorial is the first of a two-part series on SPSS, a statistical software package that is widely used by scientists throughout the social sciences for analysis of quantitative data.

Please note: This tutorial focuses on the point-and-click method of using SPSS; the syntax/code method is introduced only briefly.

Topics

  • Navigating SPSS
  • Entering and importing data from different formats (such as text and Excel files)
  • Defining variables (defining and labeling codes, selecting appropriate levels of measurement)
  • Manipulating and transforming data (selecting cases and splitting files; recoding and computing variables)
  • Running descriptive statistics
  • Generating simple graphs

Prerequisites: None.


SPSS 2: Analyzing Data

This tutorial is the second of a two-part series on SPSS, a statistical software package that is widely used by scientists throughout the social sciences for analysis of quantitative data.

Please note: This tutorial focuses on the point-and-click method of using SPSS; the syntax/code method is introduced only briefly.

Topics

  • Cross-tabulation and Chi-Square tests
  • Analysis of Variance (ANOVA)
  • T-tests
  • Correlation analysis
  • Multiple regression analysis

Prerequisites: Attendance at SPSS 1 preferred, or completion of Parts 1-6 of the Lynda.com "SPSS Statistics Essential Training" tutorial.

SAS Series



Get a GSU Data Ready! Badge Micro-Credential for completing these tutorials to show others your commitment to learning data skills! Learn more at lib.gsu.edu/data-ready



This series completes all analysis using code. No previous knowledge of coding is required. This series is for the Windows version of SAS.


SAS 1: SAS Basics

This is the first SAS tutorial in a two-part series. This interactive tutorial introduces users to the SAS system through applied, hands-on examples using real data. NOTE: This tutorial is aimed at people who have no experience with the SAS system. Those who have used SAS before may find this tutorial too foundational and are encouraged to attend our forthcoming advanced SAS sessions.

Topics:

  • Reading data into SAS
  • Conducting basic data cleaning and recoding
  • Using basic SAS procedures (e.g., PROC CONTENTS, PROC PRINT, PROC FREQ) to view and understand data
  • Using PROC FREQ, PROC UNIVARIATE, and PROC MEANS to generate basic descriptive statistics
  • An introduction to bivariate statistics in SAS

Prerequisites: No prior experience with SAS is required. Basic understanding of univariate and bivariate statistics is helpful but not required.


SAS 2: Data Analysis

This is the second SAS tutorial in a two-part series. In this interactive tutorial, SAS users will go beyond the basics to develop comfort with more advanced statistical analyses using the SAS system. Applied, hands-on examples using real data will be used. NOTE: Basic knowledge of the SAS system will be helpful for those who want to participate in the applied portion of the tutorial.

Topics:

  • Conducting bivariate and multivariable analyses using SAS procedures such as PROC FREQ, PROC TTEST, PROC ANOVA, PROC GLM, PROC LOGISTIC, and PROC REG
  • Best practices for checking statistical assumptions, selecting appropriate statistical procedures, and reporting and visualizing results

Prerequisites: Basic knowledge of the SAS system will be helpful for those who want to participate in the applied portion of the tutorial. Basic understanding of bivariate and multivariable statistics is helpful.

Stata Series



Get a GSU Data Ready! Badge Micro-Credential for completing these tutorials to show others your commitment to learning data skills! Learn more at lib.gsu.edu/data-ready



This series completes all analysis using code. No previous knowledge of coding is required. This series is for the Windows version of Stata. See the Stata research guide here.


Stata 1: Introduction to Stata

This tutorial is the first in a three-part series on Stata, a statistical software package that is widely used by scientists throughout the social sciences for analysis of quantitative data, ranging from simple descriptive analysis to complex statistical modeling.

Please note: This tutorial completes all analysis using code. No previous knowledge of coding is required.

Topics:

  • Opening data
  • Generating variables (basic)
  • Frequency distributions
  • Analysis: summary statistics

Prerequisites: None.


Stata 2: Basic Data Analysis

This tutorial is the second in a three-part series on Stata, a statistical software package that is widely used by scientists throughout the social sciences for analysis of quantitative data, ranging from simple descriptive analysis to complex statistical modeling.

Please note: This tutorial completes all analysis using code. No previous knowledge of coding is required.

Topics:

  • Generating variables (advanced)
  • Analysis: Chi-square, ANOVA, regression
  • Graphs
  • Navigating help features

Prerequisites: Stata 1 or basic knowledge of Stata.


Stata 3: Advanced Data Analysis

This tutorial is the third in a three-part series on Stata, a statistical software package that is widely used by scientists throughout the social sciences for analysis of quantitative data, ranging from simple descriptive analysis to complex statistical modeling.

Please note: This tutorial completes all analysis using code. No previous knowledge of coding is required.

Topics:

  • Troubleshooting code
  • Generating scales
  • Three-, four-, and five-way cross-tabulations

Prerequisites: Stata 1 and Stata 2 or moderate knowledge of Stata.

Python & Data Series



Get a GSU Data Ready! Badge Micro-Credential for completing these tutorials to show others your commitment to learning data skills! Learn more at lib.gsu.edu/data-ready



The Python & Data series introduces participants to the fundamentals of using the Python programming language and associated tools to perform common data analysis tasks. Python is an extremely popular programming language used by analysts, researchers, and scientists in many different disciplines.

This series consists of three tutorials. For individuals who are new to Python, coding, or data analysis, it is highly recommended that the tutorials be attended in sequential order. Additionally, while these tutorials are taught exclusively using code (i.e. there are no point-and-click methods), attendees do not need to have any prior experience with programming, coding, or scripting. All are welcome.

Skill Requirements: None.

Software Requirements for Hands-on Participation:

  • Participants will need a Google / Gmail account in order to access Google Colab
  • No software installation is required.

Python & Data 0: Google Colab

This tutorial provides a short, high-level overview of Google Colab and how it relates to the other Python tutorials. 

Topics:

  • Brief overview of Google Colab
  • Uploading, managing, and saving data in Google Colab environment

Python & Data 1: Getting Started with Python

Topics:

  • Using Google Colab and Jupyter to work with Python
  • Python syntax, commands, functions, and packages/modules
  • Opening, viewing, and exploring data
  • Generating basic descriptive statistics from data


Python & Data 2: Manipulating & Transforming Data

Topics:

  • Selecting, sub-setting, and manipulating data
  • Transforming and generating variables
  • Handling data with missing values
  • Generating crosstabs / contingency tables

Python & Data 3: Visualizing Data & Creating Models

Topics:

  • Plotting and visualizing data using Matplotlib and Seaborn
  • Defining statistical models using both formulas and matrices
  • Fitting and inspecting statistical models (e.g., ANOVA, linear regression)

Python for Machine Learning (ML) Series



Get a GSU Data Ready! Badge Micro-Credential for completing these tutorials to show others your commitment to learning data skills! Learn more at lib.gsu.edu/data-ready



This applied Machine Learning (ML) series introduces participants to the fundamentals of supervised learning and provides experience in applying several ML algorithms in Python. Participants will gain experience in regression modeling and in assessing model adequacy, prediction precision, and computational performance, and will learn several tools for visualizing each step of the process.

This series consists of three (3) tutorials. For individuals who are new to Python and/or Google Colab, it is highly recommended that you first complete the prerequisite Python & Data Series 0-3 tutorials. For those who are new to Machine Learning, it is highly recommended that the tutorials in this series be attended in sequential order. While these tutorials are taught exclusively using code (i.e., there are no point-and-click methods), attendees do not need to have any prior experience with programming, coding, or scripting. All are welcome.

Software Requirements for Hands-on Participation:

For participants wishing to follow along with the “hands-on” portion of the tutorial, please see the directions here.


Python for Machine Learning (ML) 1: Univariate Linear Regression

Fundamentals of supervised learning in Python; applying a rudimentary ML model using univariate linear regression (i.e., one feature).

Topics:

  • Overview: “What is Machine Learning?”
  • Univariate Linear Regression Model
  • Mean-Squared Error Cost Function
  • Gradient Descent Algorithm for Linear Regression

Prerequisites: Python & Data Series 0-3: https://lib.gsu.edu/rds-recordings
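
For a rough sense of how the topics in this tutorial fit together, the sketch below fits a one-feature linear model by gradient descent on a mean-squared error cost. It is not taken from the tutorial materials: the toy data, the learning rate `alpha`, the step count, and all function names are assumptions chosen for illustration, and it uses only the Python standard library.

```python
# Illustrative sketch only. Data, alpha, steps, and names are assumptions,
# not part of the tutorial materials.

def predict(x, w, b):
    """Univariate linear model: one feature x, weight w, intercept b."""
    return w * x + b

def mse_cost(xs, ys, w, b):
    """Mean-squared error, halved so the gradient carries no factor of 2."""
    m = len(xs)
    return sum((predict(x, w, b) - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)

def gradient_descent(xs, ys, alpha=0.05, steps=5000):
    """Repeatedly step w and b against the gradient of the cost."""
    w, b = 0.0, 0.0
    m = len(xs)
    for _ in range(steps):
        errors = [predict(x, w, b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / m
        grad_b = sum(errors) / m
        w -= alpha * grad_w
        b -= alpha * grad_b
    return w, b

# Toy data generated from y = 2x + 1, so the fit should approach w = 2, b = 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
w, b = gradient_descent(xs, ys)
print("w =", round(w, 3), "b =", round(b, 3), "cost =", mse_cost(xs, ys, w, b))
```

A learning rate that is too large makes the updates diverge, and one that is too small makes them slow; the value used here was picked only because it works for this toy data.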


Python for Machine Learning (ML) 2: Multivariate Linear Regression

Fundamentals of supervised learning in Python; applying an ML model using multivariate regression (i.e., multiple features).

Topics:

  • Multivariate Regression Model
  • Vectorization
  • Feature Scaling
  • Feature Engineering

Prerequisites: Python & Data Series 0-3: https://lib.gsu.edu/rds-recordings and Python for Machine Learning (ML) 1: Univariate Linear Regression.
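
As an informal illustration of two of the topics above, the sketch below applies z-score feature scaling and then fits a two-feature linear model by gradient descent. It is not drawn from the tutorial materials: the toy data, `alpha`, the step count, and every function name are assumptions. It uses plain Python loops from the standard library; a vectorized version would typically replace these loops with array operations from a numerical library, which is not shown here.

```python
# Illustrative sketch only. Data, alpha, steps, and names are assumptions,
# not part of the tutorial materials.
from statistics import mean, pstdev

def zscore_columns(rows):
    """Scale each feature column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sigmas = [pstdev(c) for c in cols]
    scaled = [[(v - mu) / s for v, mu, s in zip(row, mus, sigmas)] for row in rows]
    return scaled, mus, sigmas

def fit_linear(rows, ys, alpha=0.1, steps=2000):
    """Gradient descent for a linear model with several features."""
    m, n = len(rows), len(rows[0])
    w, b = [0.0] * n, 0.0
    for _ in range(steps):
        errors = [sum(wj * xj for wj, xj in zip(w, row)) + b - y
                  for row, y in zip(rows, ys)]
        grad_w = [sum(e * row[j] for e, row in zip(errors, rows)) / m
                  for j in range(n)]
        grad_b = sum(errors) / m
        w = [wj - alpha * g for wj, g in zip(w, grad_w)]
        b -= alpha * grad_b
    return w, b

def predict_raw(row, w, b, mus, sigmas):
    """Scale a raw feature row with the training statistics, then predict."""
    scaled = [(v - mu) / s for v, mu, s in zip(row, mus, sigmas)]
    return sum(wj * xj for wj, xj in zip(w, scaled)) + b

# Toy data: two features on very different scales, with
# y = 0.1 * feature_1 + 20 * feature_2 + 50 exactly.
rows = [[1000, 1], [1500, 3], [2000, 2], [2500, 5], [3000, 4]]
ys = [170.0, 260.0, 290.0, 400.0, 430.0]

scaled, mus, sigmas = zscore_columns(rows)
w, b = fit_linear(scaled, ys)
print(predict_raw([2000, 3], w, b, mus, sigmas))
```

Scaling matters here because the first feature is hundreds of times larger than the second; without it, a single learning rate would generally be too large for one weight or too small for the other.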


Python for Machine Learning (ML) 3: Logistic Regression

Fundamentals of supervised learning in Python; applying an ML model using logistic regression (e.g., classification prediction).

Topics:

  • Logistic Regression Model
  • Cost Function for Logistic Regression
  • Gradient Descent for Logistic Regression
  • Overfitting & Model Adequacy

Prerequisites: Python & Data Series 0-3: https://lib.gsu.edu/rds-recordings, Python for Machine Learning (ML) 1: Univariate Linear Regression tutorial, and Python for Machine Learning (ML) 2: Multivariate Linear Regression tutorial.
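
To give a flavor of the first three topics above, the sketch below fits a one-feature logistic regression classifier by gradient descent on the log-loss (cross-entropy) cost. It is not taken from the tutorial materials: the toy data, `alpha`, the step count, and the function names are assumptions for illustration, and it relies only on the Python standard library.

```python
# Illustrative sketch only. Data, alpha, steps, and names are assumptions,
# not part of the tutorial materials.
import math

def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def log_loss(xs, ys, w, b):
    """Average cross-entropy cost for binary labels (0 or 1)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(xs)

def fit_logistic(xs, ys, alpha=0.1, steps=5000):
    """Gradient descent; the update has the same form as in linear regression,
    but predictions pass through the sigmoid."""
    w, b = 0.0, 0.0
    m = len(xs)
    for _ in range(steps):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        w -= alpha * sum(e * x for e, x in zip(errors, xs)) / m
        b -= alpha * sum(errors) / m
    return w, b

# Toy data: the classes overlap around x = 4 to 5, so they are not perfectly
# separable and the fitted weights stay finite.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

initial_cost = log_loss(xs, ys, 0.0, 0.0)
w, b = fit_logistic(xs, ys)
final_cost = log_loss(xs, ys, w, b)
print("cost before:", round(initial_cost, 3), "cost after:", round(final_cost, 3))
print("P(y=1 | x=1):", round(sigmoid(w * 1.0 + b), 3))
print("P(y=1 | x=8):", round(sigmoid(w * 8.0 + b), 3))
```

The sketch does not address overfitting or model adequacy; with a single feature and no regularization it is meant only to show the model, cost, and update step side by side.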