Intro to Python and Numpy
Authors
In this notebook, you will learn how to represent different kinds of data in Python. You will get a first look at creating arrays in Numpy and also analyze some real neuroscience data. Finally, you are going to explore the differences in performance between Numpy and built-in Python functions.
Setup
Import Libraries
import owncloud
from pathlib import Path
Download Data
Path('data').mkdir(exist_ok=True, parents=True)
owncloud.Client.from_public_link('https://uni-bonn.sciebo.de/s/3bRwjQ3p7S3f7Wi').get_file('/', 'spikes.npy')
True
Section 1: Storing Data in Variables
In the first section, you are going to learn how to represent different
kinds of data and store them in variables. You will encounter
four basic data types: integers, floating-point numbers, Boolean values
and text strings. You are also going to use lists which are collections
of data. Data can be assigned to a variable using the = operator which
takes the value on the right and assigns it to the variable on the left.
In this sense, a variable is simply a container that we can use to store
and access data. The data type of a variable can be determined with the
type() function. We can also convert variables from one type to
another - for example, the int() function will try to convert a
variable to an integer. Finally, Python provides operators for the
arithmetic operations like addition +, subtraction -, multiplication
* and division /. Let’s test how this works!
| Code | Description |
|---|---|
| `x = 3.14` | Assign the floating-point number 3.14 to the variable x |
| `x = True` | Assign the boolean value True to the variable x |
| `x = "hello"` | Assign the string "hello" to the variable x |
| `x = [1,2,3]` | Assign the list of integers [1,2,3] to the variable x |
| `type(x)` | Get the data type of variable x |
| `int(x)` | Convert the variable x to an integer, if possible |
| `+`, `-`, `*`, `/` | Add, subtract, multiply, divide values |
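The conversion and arithmetic rules in the table can be tried out in a few lines. This is a minimal sketch; the variable names truncated, parsed and quotient are illustrative and not part of the notebook.

```python
# A variable can be reassigned to a value of a different type at any time
x = 3.14
truncated = int(x)      # int() drops the fractional part of a float
print(type(x), truncated)

x = "42"
parsed = int(x)         # a string containing only digits can be converted
print(type(x), parsed)

x = "hello"
try:
    int(x)              # text that is not a number cannot be converted
except ValueError as error:
    print("Conversion failed:", error)

quotient = 7 / 2        # the / operator always returns a float
print(quotient)
```

Note that int() only converts "if possible": when the value has no sensible integer form, Python raises an error instead of guessing.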
Exercises
Example: Assign the integer value 1 to a variable called one and print its type().
one = 1
type(one)
int
Exercise: Subtract 0.5 from the variable one.
Solution
one - 0.5
0.5
Exercise: Assign the floating-point value 0.001 to a variable called small and print its type.
Solution
small = 0.001
type(small)
float
Exercise: Assign the Boolean value False to a variable called this_is_false and convert it to an integer.
Solution
this_is_false = False
int(this_is_false)
0
Exercise: Assign the Boolean value True to a variable called this_is_true and convert it to an integer.
Solution
this_is_true = True
int(this_is_true)
1
Exercise: Assign the string value "goodbye" to a variable called goodbye and print its type.
Solution
goodbye = 'goodbye'
type(goodbye)
str
Exercise: Add the string "hello" to the variable goodbye.
Solution
goodbye = goodbye + 'hello'
goodbye
'goodbyehello'
Exercise: Create a list with the numbers 1 through 6, assign it to a variable called dice and print its type.
Solution
dice = [1,2,3,4,5,6]
type(dice)
list
Exercise: Multiply the list dice by 2. What happens?
Solution
dice*2
[1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6]
Exercise: Try to add 1 to the list. What error message do you observe?
Solution
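As a minimal sketch of what to expect: for lists, + means concatenation, so Python refuses to add an integer to a list and raises a TypeError. The names message and extended are illustrative.

```python
dice = [1, 2, 3, 4, 5, 6]

try:
    dice + 1            # the right-hand side must also be a list
except TypeError as error:
    message = str(error)
    print(message)

# Concatenating with another list works as expected
extended = dice + [7]
print(extended)
```

This also explains the previous exercise: multiplying a list by 2 repeats the list rather than doubling each number.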
Section 2: Analyzing Neural Spiking Data with Numpy
Numpy offers many useful functions for data analysis - let’s test them
on real neuroscience data! In this section, you will load
and analyze the spiking of a neuron in the primary visual cortex of a
mouse. The spikes are represented as a sorted list of time points where
spikes were observed. For example, [0.05, 0.24, 1.5] indicates that a
spike was observed 50, 240 and 1500 milliseconds after the start of the
recording. Using the functions below, we can answer some interesting
questions about the firing behavior of a given neuron.
| Code | Description |
|---|---|
| `import numpy as np` | Import the module numpy under the alias np |
| `x = np.load("data.npy")` | Load the file "data.npy" into an array and assign it to the variable x |
| `np.size(x)` or `x.size` | Get the total number of elements stored in the array x |
| `np.min(x)` or `x.min()` | Get the minimum value of the array x |
| `np.max(x)` or `x.max()` | Get the maximum value of the array x |
| `np.sum(x)` or `x.sum()` | Compute the sum of all values in the array x |
| `np.mean(x)` or `x.mean()` | Compute the mean of all values in the array x |
| `np.std(x)` or `x.std()` | Compute the standard deviation of all values in the array x |
| `np.diff(x)` | Compute the difference between consecutive elements in the array x |
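Before working with the real recording, the functions in the table can be tried on the toy example from the text, [0.05, 0.24, 1.5]. This is a minimal sketch; the variable names are illustrative, and the duration is assumed to end with the last spike.

```python
import numpy as np

# Toy spike times in seconds, taken from the example in the text
spike_times = np.array([0.05, 0.24, 1.5])

n_spikes = spike_times.size          # number of recorded spikes
duration = spike_times.max()         # assumes the recording ends at the last spike
firing_rate = n_spikes / duration    # spikes per second

# Inter-spike intervals: time between each spike and the next one
isi = np.diff(spike_times)

print("spikes:", n_spikes)
print("duration (s):", duration)
print("firing rate (Hz):", firing_rate)
print("inter-spike intervals (s):", isi)
```

Because np.diff() compares neighbouring elements, an array with n spikes yields n - 1 intervals.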
Exercise: Import the Numpy module under the alias np.
Solution
import numpy as np
Exercise: Load the file "spikes.npy" into a Numpy array.
Solution
spikes = np.load('spikes.npy')
Exercise: What is the total number of spikes in this recording?
Solution
np.size(spikes)
721
Exercise: What is the duration of the recording (assuming the recording stopped after the last spike was recorded)?
Solution
spikes.max()
298.4843451836275
Exercise: Compute the neuron’s average firing rate (the total number of spikes divided by the duration of the recording).
Solution
np.size(spikes)/spikes.max()
2.415537067970653
Exercise: Compute the inter-spike intervals (i.e. the time differences between subsequent spikes).
Solution
np.diff(spikes)
array([0.05456682, 0.15250043, 0.26966743, 0.03310009, 0.27270077, ...,
0.05906683, 0.27290077, 0.2008339 , 0.30426752, 0.17040048])
Exercise: What is the average inter-spike interval for this neuron?
Solution
isi = np.diff(spikes)
isi.mean()
0.4144865856420776
Exercise: What is the standard deviation of inter-spike intervals for this neuron?
Solution
np.diff(spikes).std()
0.47663480650055273
Exercise: What is the shortest time between two spikes?
Solution
np.diff(spikes).min()
0.0005666682648097776
Section 3: Creating Arrays in Numpy
Numpy also offers many functions for generating arrays. The simplest way
to create an array is to convert a list but there are other functions
for specific purposes like generating arrays of random numbers or
numbers within a certain range. Like variables, Numpy arrays can have
different data types. The type of an array is stored in the .dtype
attribute. In this section, you will create and explore different
kinds of arrays.
| Code | Description |
|---|---|
| `x = np.array([2,5,3])` | Create an array from the list [2,5,3] and assign it to the variable x |
| `x = np.random.randn(100)` | Create an array with 100 normally-distributed random numbers and assign it to the variable x |
| `x = np.arange(2,7)` | Create an array with all integers from 2 up to (but not including) 7 and assign it to the variable x |
| `x = np.arange(2,7,0.3)` | Create an array with evenly spaced values from 2 up to (but not including) 7 with a step size of 0.3 and assign it to the variable x |
| `x = np.linspace(2,3,10)` | Create an array with 10 evenly spaced values between 2 and 3 (both included) and assign it to the variable x |
| `x.dtype` | Get the data type of the numpy array x |
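The sketch below illustrates two points from this section: the .dtype attribute depends on what the array contains, and np.arange() and np.linspace() treat their end points differently. The variable names are illustrative, and the exact integer dtype (for example int64) is an assumption that depends on the platform.

```python
import numpy as np

ints = np.array([2, 5, 3])
floats = np.array([0.1, 0.2, 0.3])
mixed = np.array([1, True, "a"])     # mixed input: every element is stored as a string

print(ints.dtype, floats.dtype, mixed.dtype)

# np.arange excludes the stop value ...
half_open = np.arange(2, 7)
# ... while np.linspace includes both end points by default
closed = np.linspace(2, 3, 10)

print(half_open)
print(closed[0], closed[-1], closed.size)
```

Unlike a list, an array has a single data type for all of its elements, which is why the mixed input is converted to strings.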
Exercises
Example: Create an array from the list [1, 2, 3], assign it to the variable a and display its type.
a = np.array([1,2,3])
type(a)
numpy.ndarray
Exercise: Multiply the array a by 2 and add 1 to it.
Solution
a * 2 + 1
array([3, 5, 7])
Exercise: Create an array from the list [0.1, 0.2, 0.3], assign it to the variable b and display its type.
Solution
b = np.array([0.1,0.2,0.3])
type(b)
numpy.ndarray
Exercise: Create an array from the list [1, True, "a"], assign it to the variable c and display its type.
Solution
c = np.array([1, True, "a"])
type(c[0])
numpy.str_
Exercise: Try to add 1 to the variable c. What error message do you observe?
Solution
Exercise: Make an array containing the integers from 1 to 15.
Solution
array_of_numbers = np.arange(1,16,1)
array_of_numbers
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15])
Exercise: Create an array that contains all even numbers up to and including 100.
Solution
np.arange(0,100+2,2)
array([ 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24,
26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50,
52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76,
78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100])
Exercise: Make an array of only 6 evenly-spaced numbers between 1 and 10.
Solution
np.linspace(1,10,6)
array([ 1. , 2.8, 4.6, 6.4, 8.2, 10. ])
Exercise: Create an array of 10 normally-distributed random numbers and compute its mean and standard deviation.
Solution
x = np.random.randn(10)
x, x.mean(), x.std()
(array([ 0.27299568, -0.64684942, -1.30570342, -1.95991131, -2.08556602,
-0.37904938, -0.45618923, -0.62556584, -0.64754801, 0.39586097]),
-0.7437525972608371,
0.7858819849146609)
Exercise: Now, create arrays with 100 and 1000 normally-distributed random numbers and compute their means and standard deviations.
Solution
x = np.random.randn(100)
x.mean(), x.std()
(-0.06324362027843711, 0.9665345203945966)
x = np.random.randn(1000)
x.mean(), x.std()
(-0.017241949802111974, 0.9806111510322726)
Section 4: Quantifying Numpy’s Performance
One of the key advantages of Numpy is that it is a lot faster than basic Python. How much faster? Let’s find out! The code below creates an array of ten thousand random numbers as well as a list with exactly the same data. We can use these to test how Numpy compares to basic Python with respect to performance.
Exercises
my_array = np.random.randn(10000)
my_list = list(my_array)
sum(my_list)
4.249007775753211
np.sum(my_array)
4.249007775753224
To time our code, we are going to use the %%timeit command. Adding
%%timeit at the top of a cell reports how long the code in that cell
takes to run. By default, the code is executed many times in a loop
(the number of loops is chosen automatically, depending on how fast the
code is) and the duration is averaged over all loops.
This procedure is repeated seven times so that we get one average
duration for each run. The reported numbers are the mean of these seven
durations and their standard deviation.
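Outside a notebook, a similar measurement can be approximated with Python's standard-library timeit module. This is a minimal sketch and not how %%timeit is implemented: the helper name time_per_loop is hypothetical, and the loop count is fixed at 100 here rather than chosen automatically.

```python
import timeit

import numpy as np

my_array = np.random.randn(10000)
my_list = list(my_array)


def time_per_loop(func, number=100, repeat=7):
    """Return mean and standard deviation of the per-loop duration in seconds."""
    # Each entry in totals is the time needed for `number` calls of func
    totals = timeit.repeat(func, number=number, repeat=repeat)
    per_loop = np.array(totals) / number
    return per_loop.mean(), per_loop.std()


list_mean, list_std = time_per_loop(lambda: sum(my_list))
array_mean, array_std = time_per_loop(lambda: np.sum(my_array))

print(f"sum(my_list):     {list_mean * 1e6:.1f} ± {list_std * 1e6:.1f} μs per loop")
print(f"np.sum(my_array): {array_mean * 1e6:.1f} ± {array_std * 1e6:.1f} μs per loop")
```

The absolute numbers depend on the machine, so only the ratio between the two measurements is meaningful for the comparison.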
Example: Estimate the time for computing the sum of my_list using Python’s built-in sum() function with %%timeit.
%%timeit
sum(my_list)
782 μs ± 9.73 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
Exercise: Use %%timeit to estimate how long it takes to compute np.sum() of my_array.
Solution
%%timeit
np.sum(my_array)
9.4 μs ± 126 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
Exercise: Use %%timeit to estimate how long it takes for Python’s built-in max() function to find the maximum of my_list.
Solution
%%timeit
max(my_list)
297 μs ± 3.42 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
Exercise: Use %%timeit to estimate how long it takes for the np.max() function to find the maximum of my_array.
Solution
%%timeit
np.max(my_array)
9.69 μs ± 259 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
Exercise: The code below estimates the time it takes to multiply every element of my_list by 2. Use %%timeit to test how long it takes to multiply my_array by 2 (Hint: use the * operator).
Solution
%%timeit
[item*2 for item in my_list]
1.18 ms ± 27.4 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
%%timeit
my_array*2
7.07 μs ± 55.4 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
Exercise: What is faster: multiplying an array by 2 or adding the array to itself?
Solution
%%timeit
my_array+my_array
12 μs ± 422 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
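A timing comparison is only fair if both expressions compute the same thing. The short sketch below checks this for the two operations timed above; the names doubled, summed and same are illustrative.

```python
import numpy as np

my_array = np.random.randn(10000)

doubled = my_array * 2          # multiply every element by 2
summed = my_array + my_array    # add the array to itself, element by element

# Doubling a floating-point number is exact, so both results match exactly
same = np.array_equal(doubled, summed)
print(same)
```

Both operations work on whole arrays at once, without an explicit Python loop, which is where most of Numpy's speed advantage over the list comprehension comes from.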