cells:
- markdown: |
# Introduction to Data Analysis with Python
### Prabhu Ramachandran
### The FOSSEE Python group &
### Department of Aerospace Engineering
### IIT Bombay
metadata:
slideshow:
slide_type: slide
- markdown: |
## Introduction
- A world of data!
- Can we use data to drive decisions and form opinions?
metadata:
slideshow:
slide_type: slide
- markdown: |
## Real data is not perfect
- Partial information
- Uncertainty
- Errors
metadata:
slideshow:
slide_type: subslide
- markdown: |
- Important to check and clean data
metadata:
slideshow:
slide_type: fragment
- markdown: |
## Statistical approach
- Data collection
metadata:
slideshow:
slide_type: subslide
- markdown: |
- Visualization
- Inference
- Modeling
- Prediction
metadata:
slideshow:
slide_type: fragment
- markdown: |
## Importance of computers
- Datasets are large
- Easy to process on the computer
- Simulation!
metadata:
slideshow:
slide_type: subslide
- markdown: |
## This course
- Use Python for data analysis
- Exposes you to the basic tools available
- Does not teach you statistics!
- Will point out resources for learning statistics
metadata:
slideshow:
slide_type: slide
- markdown: |
## Pre-requisites
- Basic Python programming
- `numpy`
- Python 3.x with Jupyter, `scipy`, `matplotlib`, `pandas`, `statsmodels`
- Mathematics (12th grade)
- Introduction to statistics
metadata:
slideshow:
slide_type: slide
- markdown: |
## Tools and Topics
- Simple statistics with `numpy`
- Statistical plots with `matplotlib`
- Random variables with `scipy.stats`
- Using `pandas` for data ingestion and analysis
- Introduction to `statsmodels` for regression
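
As a small taste of the first topic, here is a minimal sketch of simple statistics with `numpy`. The `temps` array and its values are made up purely for illustration and are not part of the course material.

```python
import numpy as np

# Hypothetical sample: seven daily temperatures in degrees Celsius (made-up values)
temps = np.array([21.5, 23.0, 22.1, 19.8, 24.6, 22.9, 20.7])

mean = np.mean(temps)           # arithmetic mean
median = np.median(temps)       # middle value of the sorted data
std = np.std(temps, ddof=1)     # sample standard deviation (divides by n - 1)

print("mean:", mean)
print("median:", median)
print("sample std:", std)
```

Note that `np.std` computes the population standard deviation by default; passing `ddof=1` gives the sample version.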
metadata:
slideshow:
slide_type: slide
- markdown: |
## Resources for learning
- [Khan Academy Statistics and Probability](https://www.khanacademy.org/math/statistics-probability)
- [A Concrete Introduction to Probability](http://nbviewer.jupyter.org/url/norvig.com/ipython/Probability.ipynb) by Peter Norvig
- [Penn State Stat 414 course](https://onlinecourses.science.psu.edu/stat414)
- [Computational and Inferential Thinking](https://www.inferentialthinking.com/) by Ani Adhikari and John DeNero
- [Think Stats, 2nd edition](http://greenteapress.com/wp/think-stats-2e/) by Allen B. Downey
metadata:
slideshow:
slide_type: slide
- markdown: |
## Summary
- Introduction to data analysis
- Pre-requisites for this course
- Tools covered
- Resources for statistics and probability
metadata:
slideshow:
slide_type: slide