Welcome to Data Science Discovery!
Data Science Discovery is an open-source data science resource created by The University of Illinois with support from The Discovery Partners Institute, the College of Liberal Arts and Sciences, and The Grainger College of Engineering. The aim is to make basic data science literacy accessible to all through clear, understandable lessons, real-world examples, and support.
At The University of Illinois, this material is the basis for STAT 107: Data Science Discovery and several other courses. It is divided into six "badges", each with many sections that explore individual topics:
Basics of Data Science with Python
- What is Data Science?
- Types of Data
- Experimental Design and Blocking
- Python for Data Science: Introduction to DataFrames
- Row Selection with DataFrames
- Observational Studies, Confounders, and Stratification
- Simpson's Paradox
- DataFrames with Conditionals
- Software Version Control with git
Exploratory Data Analysis
- Exploratory Data Analysis Overview
- Descriptive Statistics
- Grouping Data in Python
- Histograms
- Quartiles and Box Plots
- Basic Data Visualization in Python
Prediction and Probability
- Probability Introduction w/ The Monty Hall Problem
- Random Numbers in Python
- Multi-event Probability: Multiplication Rule
- Multi-event Probability: Addition Rule
- Conditional Probability
- Bayes' Theorem
Simulation and Distributions
- Overview of Simulation
- For-Loops in Python
- Simple Simulations in Python
- Sample Space
- Conditionals in Python
- Functions in Python
- Normal Distribution
- Law of Large Numbers
Polling, Confidence Intervals, and Hypothesis Testing
- Random Variables
- Bernoulli & Binomial Random Variables
- Python Functions for Random Distributions
- Central Limit Theorem
- Polling and Sampling
- Confidence Intervals
- Hypothesis Testing