summaryrefslogtreecommitdiff
path: root/parsing_data/slides.org
blob: 90bbeaeb571f16b7c225d23428f6b2319c3bf757 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
#+LaTeX_CLASS: beamer
#+LaTeX_CLASS_OPTIONS: [presentation]
#+BEAMER_FRAME_LEVEL: 1

#+BEAMER_HEADER_EXTRA: \usetheme{Warsaw}\usecolortheme{default}\useoutertheme{infolines}\setbeamercovered{transparent}
#+COLUMNS: %45ITEM %10BEAMER_env(Env) %10BEAMER_envargs(Env Args) %4BEAMER_col(Col) %8BEAMER_extra(Extra)
#+PROPERTY: BEAMER_col_ALL 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 :ETC

#+LaTeX_CLASS: beamer
#+LaTeX_CLASS_OPTIONS: [presentation]

#+LaTeX_HEADER: \usepackage[english]{babel} \usepackage{ae,aecompl}
#+LaTeX_HEADER: \usepackage{mathpazo,courier,euler} \usepackage[scaled=.95]{helvet}

#+LaTeX_HEADER: \usepackage{listings}

#+LaTeX_HEADER:\lstset{language=Python, basicstyle=\ttfamily\bfseries,
#+LaTeX_HEADER:  commentstyle=\color{red}\itshape, stringstyle=\color{darkgreen},
#+LaTeX_HEADER:  showstringspaces=false, keywordstyle=\color{blue}\bfseries}

#+TITLE:    
#+AUTHOR:    FOSSEE
#+EMAIL:     
#+DATE:    

#+DESCRIPTION: 
#+KEYWORDS: 
#+LANGUAGE:  en
#+OPTIONS:   H:3 num:nil toc:nil \n:nil @:t ::t |:t ^:t -:t f:t *:t <:t
#+OPTIONS:   TeX:t LaTeX:nil skip:nil d:nil todo:nil pri:nil tags:not-in-toc

* 
#+begin_latex
\begin{center}
\vspace{12pt}
\textcolor{blue}{\huge Parsing Data}
\end{center}
\vspace{18pt}
\begin{center}
\vspace{10pt}
\includegraphics[scale=0.95]{../images/fossee-logo.png}\\
\vspace{5pt}
\scriptsize Developed by FOSSEE Team, IIT-Bombay. \\ 
\scriptsize Funded by National Mission on Education through ICT\\
\scriptsize  MHRD,Govt. of India\\
\includegraphics[scale=0.30]{../images/iitb-logo.png}\\
\end{center}
#+end_latex
* Objectives
  At the end of this tutorial, you will be able to,

  - Split a string using a delimiter.
  - Remove the whitespace around the string.
  - Convert the datatypes of variables from one type to other. 
* Exercise 1
  Split the variable line using a space as argument. Is it same as
  splitting without an argument ?
* Solution 1
  We see that when we split on space, multiple whitespaces are not
  clubbed as one and there is an empty string everytime there are two
  consecutive spaces.
* Exercise 2
  What happens to the white space inside the sentence when it is
  stripped? 
* Solution 2
  #+begin_src python
    In []: a_str = "     white      space     "
    In []: a_str.strip()
  #+end_src
* Exercise 3
  What happens if you do =int("1.25")=
* Solution 3
  It raises an error since converting a float string into integer
  directly is not possible. It involves an intermediate step of
  converting to float.
  #+begin_src python
    In []: dcml_str = "1.25"
    In []: flt = float(dcml_str)
    In []: flt
    In []: number = int(flt)
    In []: number
  #+end_src
* Summary
  + How to tokenize a string using various delimiters
  + How to get rid of extra white space around
  + How to convert from one type to another
  + How to parse input data and perform computations on it
* Evaluation
  1. How do you split the string "Guido;Rossum;Python" to get the words.

  2. How will you remove the extra whitespace in this sentence
     "      Hello    World    " 

  3. What does int("20.0") produce

     - 20
     - 20.0
     - Error
     - "20"
* Solutions
  1. line.split(';')

  2. "      Hello    World    ".strip()

  3. Error 
* 
#+begin_latex
 \begin{block}{}
  \begin{center}
  \textcolor{blue}{\Large THANK YOU!} 
  \end{center}
  \end{block}
\begin{block}{}
  \begin{center}
    For more Information, visit our website\\
    \url{http://fossee.in/}
  \end{center}  
  \end{block}
#+end_latex