cells:
- markdown: |
    # Introduction to Data Analysis with Python

    ### Prabhu Ramachandran
    ### The FOSSEE Python group &
    ### Department of Aerospace Engineering
    ### IIT Bombay
  metadata:
    slideshow:
      slide_type: slide
- markdown: |
    ## Introduction

    - A world of data!
    - Can we use data to drive decisions and form opinions?
  metadata:
    slideshow:
      slide_type: slide
- markdown: |
    ## Real data is not perfect

    - Partial information
    - Uncertainty
    - Errors
  metadata:
    slideshow:
      slide_type: subslide
- markdown: |
    - Important to check and clean data
  metadata:
    slideshow:
      slide_type: fragment
- markdown: |
    ## Statistical approach

    - Data collection
  metadata:
    slideshow:
      slide_type: subslide
- markdown: |
    - Visualization
    - Inference
    - Modeling
    - Prediction
  metadata:
    slideshow:
      slide_type: fragment
- markdown: |
    ## Importance of computers

    - Datasets are large
    - Easy to process on the computer
    - Simulation!
  metadata:
    slideshow:
      slide_type: subslide
- markdown: |
    ## This course

    - Uses Python for data analysis
    - Exposes you to the basic tools available
    - Does not teach you statistics!
    - Points you to resources for learning statistics
  metadata:
    slideshow:
      slide_type: slide
- markdown: |
    ## Pre-requisites

    - Basic Python programming
    - `numpy`
    - Python 3.x, Jupyter, `scipy`, `matplotlib`, `pandas`, `statsmodels`
    - Mathematics (12th grade)
    - Introduction to statistics
  metadata:
    slideshow:
      slide_type: slide
- markdown: |
    ## Tools and Topics

    - Simple statistics with `numpy`
    - Statistical plots with `matplotlib`
    - Random variables with `scipy.stats`
    - Using `pandas` for data ingestion and analysis
    - Introduction to `statsmodels` for regression
  metadata:
    slideshow:
      slide_type: slide
- markdown: |
    ## Resources for learning

    - [Khan Academy Statistics and Probability](https://www.khanacademy.org/math/statistics-probability)
    - [A Concrete Introduction to Probability](http://nbviewer.jupyter.org/url/norvig.com/ipython/Probability.ipynb) by Peter Norvig
    - [Penn State Stat 414 course](https://onlinecourses.science.psu.edu/stat414)
    - [Computational and Inferential Thinking](https://www.inferentialthinking.com/) by Ani Adhikari and John DeNero
    - [Think Stats, 2nd edition](http://greenteapress.com/wp/think-stats-2e/) by Allen B. Downey
  metadata:
    slideshow:
      slide_type: slide
- markdown: |
    ## Summary

    - Introduction to data analysis
    - Pre-requisites for this course
    - Tools covered
    - Resources for statistics and probability
  metadata:
    slideshow:
      slide_type: slide