advanced_python/more_arrays.rst


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310

More Arrays
===========

In this section, we shall explore arrays a bit more, since they are the
fundamental building block of all the computation work we wish to do. 

Arrays were introduced, stating that they are very fast as compared to
lists, owing to their homogeneity. They are definitely a lot more
convenient than a list of lists for doing element-wise operations. Let's
now look at the speed difference, quantitatively. 

Arrays vs. Lists
----------------

Let us create a large array with, say a million elements and see how long
it takes to get a new sequence that contains the ``sin`` of each of these
elements.

::

    a = linspace(0, 2*pi, 1000000)

    sin(a)

    b = []
    for i in a: b.append(sin(i))

There is a visible difference in the run-time of the two ways of doing it.
But, let's look at it quantitatively using the ``%timeit`` magic command of
IPython to get a quantitative difference between doing is using a ``for``
loop and using the array way of doing it.

::

    %timeit sin(a)

    b = []
    %timeit for i in a: b.append(sin(i))

The first method turns out to be around 70 times faster on my machine. So,
as you can see, the homogeneity. The speed gains due to homogeneity are
because, the loops are run in C, rather than in Python and Python loops are
comparatively much slower than C loops. 

Fancy Indexing
--------------

We have looked at slicing and striding to get pieces of arrays. But they
only allow us to do so much. 

For instance, let's say, we have an array a of length 40 with some random
elements. We wish to get a new array which is obtained by dropping every
element that is at a position that is a multiple of 8, i.e., we wish to
drop the elements 0, 8, 16, 24, 32. How do we go about doing this?
Essentially, we wish to drop the elements a[::8]. We shall use something
called Boolean arrays to solve this problem. 

Let's look at a small example before solving this problem. 

::

    p = array([1, 2, 3])
    q = array([True, False, True])

    p[q]

``q`` is called a Boolean array and can be used to index another array.
Only the elements at positions corresponding to ``True`` values in q, will
be returned. 

Now, to solve the problem of dropping the elements a[::8] from a, we need
to create a Boolean array, with those elements as ``False`` and the rest as
``True``. There could be multiple ways of doing that. Here's one

::

    b = arange(len(a))
    b%8!=0
    
    a[b%8!=0]

Checking if the elements of ``b`` are not multiples of 8, returns a Boolean
array, which we are then using to index ``a``. 

Numpy arrays can also be indexed using arrays of indices. Let's solve the
same problem using arrays with indices. Let's look at the same small
example that we looked at, for Boolean arrays, before solving the actual
problem. 

::

    p = array([1, 2, 3])
    q = array([0, 2])

    p[q]

The array ``q`` has the indices of the elements that we wish to pick out of
the array ``p``. The order in which the indices are present, doesn't
matter. The returned array, will contain the elements of ``p`` in the order
in which they have been indexed. They can even be repeated, if we wish to
get the same element out of ``p``.

::

    r = array([2, 0])
    p[r]
    
    s = array([1, 1])
    p[s]

Now, to solve the problem of dropping all the elements of ``p`` whose
indices are a multiple of 8, we need to generate an array, which doesn't
have these elements. 

::

    q = array([i for i in range(40) if i%8])
    p[q]

Copies and Views
----------------

Simple assignment of mutable objects, do not make copies in Python. For
instance, 

::

    a = array([1, 2, 3])
    b = a

Here, ``b`` and ``a`` are not copies of each other, but different names for
the same object. If one of them is changed, it reflects in the other one,
as well. By the way, if you recall, this behaviour is similar with lists as
well. 

::

    a[0] = 1
    print a, b


Slicing an array, returns a view of it. Assigning the slice of an object to
a new variable, means the new variable references a portion of the data of
the original object. Hence, changing the data of the new variable, changes
the original variable, as well. 

::

    a = arange(10)
    b = a[5:]
    
    b[:] = 5
    print b, a 

If we require a complete (or deep) copy of the data of an array, we should
use the ``copy()`` method.

::

    a = arange(10)
    b = a[5:].copy()

    b[:] = 5
    print b, a 

Broadcasting
------------

Let's now look at a more advanced concept called Broadcasting. The
broadcasting rules help us understand how Numpy deals with arrays of
different dimensions. 

We know that array operations are element-wise, and for them to work, the
two arrays should be of the same shape. 

For instance,

::

    a = array([1, 2, 3, 4])
    b = array([2, 4, 6, 8])
    
    a * b

works, but, 

::

    a = array([1, 2, 3, 4])
    c = array([6, 8, 10])
    
    a * c

does not work, which is expected. But, surprisingly (or may be not so
surprisingly, since we have seen it before) multiplication of a vector with
a scalar works. 

::

    a = array([1, 2, 3, 4])
    c = 6
    
    a * c

The above multiplication works, thanks to Broadcasting. Let us look at
another example of Broadcasting before looking at the general rules of
broadcasting. 

::

    x = array([1, 2, 3])
    y = array([4, 5, 6])
    x+y

    y.shape = 3, 1
    print x, y
    x+y

When we are operating on two arrays, Numpy broadcasts the arrays, so that
the shape of the two arrays becomes the same. Then the required operation
is performed on the two arrays. 

The procedure of broadcasting is as follows. 

1. When operating on two arrays, the arrays with the smaller number of
   dimensions, has 1s prepended to it's shape, so that the number of
   dimensions of both the arrays becomes the same.
2. The size in each dimension of the output shape is the maximum of all the
   input shapes in that dimension.
3. An input can be used in the calculation if its shape in a particular
   dimension either matches the output shape or has value exactly 1.
4. If an input has a dimension size of 1 in its shape, the first data entry
   in that dimension will be used for all

Let's look at the example of adding ``x`` and ``y`` to each other, step by
step. 

::

    x.shape
    y.shape

``x.shape`` is (3,) and ``y.shape`` is (3,1). Now, ``x`` has the smaller
dimensions of the two. 1 is prepended to the shape of x, until both the
arrays have the same dimension. ``x.shape`` becomes (1, 3). 

Now, by rule 2, we know that the output shape is the maximum of all the
input shapes, in each of the dimensions. So, the shape of the output is
expect to be (3, 3)

The condition 3 satisfies, for both ``x`` and ``y``. So, this is a valid
operation on ``x`` and ``y``.

To see how the last step works, we use the ``broadcast_arrays`` command. 

::

    broadcast_arrays(x, y)

The values of ``x`` are broadcasted or copied, in the dimension where it's
size was 1. Similarly, for ``y``. 

Let's look at another example, before ending our discussion on
Broadcasting. Let's say we have the co-ordinates of a set of points, and we
wish to calculate the distance of a new point from each of these points. 

::

    x = array([[ 2.,  2.],
               [ 6.,  9.],
               [ 6.,  6.],
               [ 9.,  0.]])

    y = array([4., 3.])

We shall use broadcasting to calculate the difference between ``y`` and each of
the points in ``x``. 

::

    x - y

The array ``y`` has been broadcast and the difference has been obtained.
The following commands, will now give us the required distances. 

::

    (x-y)**2
    sqrt(sum((x-y)**2, axis=1))

As you can see, broadcasting makes the code much simpler. Also, the
operation of subtracting ``y`` from each of the elements of ``x`` is
performed using iterations in the underlying C language, rather than us
writing ``for`` loops in Python. This turns out to be much faster, as long
as the array sizes are small. 

But this may turn out to be slower, when the number of objects gets larger.
This discussion is not in the scope of this course. Look at
`EricsBroadcastingDoc <http://www.scipy.org/EricsBroadcastingDoc>`_ for more
detail.


.. 
   Local Variables:
   mode: rst
   indent-tabs-mode: nil
   sentence-end-double-space: nil
   fill-column: 75
   End: