From 4d6a25def092c38096f80b955bb2d54cc0efefc5 Mon Sep 17 00:00:00 2001 From: AdityaNikhil Date: Tue, 9 Feb 2021 17:36:32 +0530 Subject: [PATCH] Added NumPy notebooks --- NumPy/Lesson 1 - Introduction.ipynb | 497 +++++++ NumPy/Lesson 2 - Dimensions.ipynb | 634 +++++++++ NumPy/Lesson 3 - Basic Operations.ipynb | 1279 +++++++++++++++++ NumPy/Lesson 4 - Indexing and Slicing.ipynb | 1383 +++++++++++++++++++ NumPy/Lesson 5 - Universal Functions.ipynb | 325 +++++ NumPy/Lesson 6 - Statistical Methods.ipynb | 672 +++++++++ NumPy/Lesson 7 - Linear Algebra.ipynb | 500 +++++++ NumPy/Lesson 8 - Arrays in Files.ipynb | 237 ++++ 8 files changed, 5527 insertions(+) create mode 100644 NumPy/Lesson 1 - Introduction.ipynb create mode 100644 NumPy/Lesson 2 - Dimensions.ipynb create mode 100644 NumPy/Lesson 3 - Basic Operations.ipynb create mode 100644 NumPy/Lesson 4 - Indexing and Slicing.ipynb create mode 100644 NumPy/Lesson 5 - Universal Functions.ipynb create mode 100644 NumPy/Lesson 6 - Statistical Methods.ipynb create mode 100644 NumPy/Lesson 7 - Linear Algebra.ipynb create mode 100644 NumPy/Lesson 8 - Arrays in Files.ipynb diff --git a/NumPy/Lesson 1 - Introduction.ipynb b/NumPy/Lesson 1 - Introduction.ipynb new file mode 100644 index 0000000..c2e1e8b --- /dev/null +++ b/NumPy/Lesson 1 - Introduction.ipynb @@ -0,0 +1,497 @@ +{ + "cells": [ + { + "attachments": { + "image-2.png": { + "image/png": "" + } + }, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "![NumPy Logo](attachment:image-2.png)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "NumPy simply stands for **Numerical Python**.\n", + "\n", + "It is a Python library for fast operations on arrays. It is widely used in Python for linear algebra operations, Fourier transforms and random number generation. 
For large numerical workloads it is often many times faster than plain Python lists (figures around 50x are commonly quoted), because its core is implemented in C.\n", + "\n", + "NumPy GitHub link: https://github.com/numpy/numpy\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Installation " + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Requirement already satisfied: numpy in /opt/conda/lib/python3.8/site-packages (1.19.5)\n", + "Note: you may need to restart the kernel to use updated packages.\n" + ] + } + ], + "source": [ + "%pip install numpy" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Import\n", + "\n", + "You can import the NumPy package using the code below. It is common practice to import it as `np`. You may use a different name, but following the convention makes your code easier for others to read. " + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [], + "source": [ + "import numpy as np" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can now use the name `np` to call the NumPy functions.\n", + "\n", + "To get started, let us create a random matrix with NumPy. We will mainly use NumPy to handle matrices and perform operations on them. " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# NumPy is Important. Why?\n", + "\n", + "In many machine learning programs, we have to deal with matrices. In fact, most of the data we have is in the form of a matrix.\n", + "\n", + "Consider, for example, a simple sales register. This sales register could come from an Excel document. The data in that Excel document is arranged in the form of rows and columns. A row and column format is considered a 2 dimensional matrix. 
This means if you have an Excel sheet with 5 columns and 100 rows, then you have a matrix of size 100 x 5, or as we would write it in NumPy, (100, 5).\n", + "\n", + "Here, the first number denotes the number of rows, and the second number denotes the number of columns.\n", + "\n", + "As a matter of fact, even image data and video data are in the form of matrices. We represent images as 3 dimensional matrices. If we have to pass input to machine learning models, and handle the output from machine learning models, we can almost never do it without some form of matrix data. This is why understanding how to work with matrices in Python is very important. \n", + "\n", + "Keep in mind that NumPy is not your only option for handling matrix information. You could use plain Python lists (nested lists for more dimensions). However, NumPy was specifically written to handle large volumes of data efficiently, and to perform operations much faster than the same operations on plain lists. \n", + "\n", + "In most scenarios, data is going to be very large. Using a package like NumPy is going to save you a ton of execution time. " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Creating your first matrix" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Before we create our first array, let's understand what a **matrix** is.
A **matrix** is a collection of numbers arranged into a fixed number of rows and columns. \n", + "\n", + "![](https://chortle.ccsu.edu/VectorLessons/vmch13/mtrx1.gif)\n", + "\n", + "This matrix is a 3x3 matrix because it has three rows and three columns. In describing matrices, the format is:\n", + "**rows X columns** \n", + "\n", + "Each number that makes up a matrix is called an element of the matrix.\n", + "\n", + "Let us create our first matrix using a random number based matrix generator. NumPy provides several inbuilt functions to quickly create matrices. This includes creating a matrix with random values. " + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[-0.14155633, -1.38513525, 1.12584434],\n", + " [-2.01933397, 1.07991499, -0.62370323],\n", + " [-0.07299109, -0.18702077, -0.74727163]])" + ] + }, + "execution_count": 3, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "data = np.random.randn(3, 3)\n", + "data" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "There we go. We have successfully created a `3x3` matrix. This is a square matrix, filled with random numbers. \n", + "\n", + "Let us now try creating a matrix with 2 rows and 3 columns. Or, as we would call it a `2x3` matrix." + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[-0.5848593 , -0.54368605, 1.28138602],\n", + " [-0.39456216, 0.21461109, -0.51766601]])" + ] + }, + "execution_count": 4, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "np.random.randn(2, 3)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can also quickly try 3 rows and 2 columns, as shown below." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[-0.21009713, 0.71181626],\n", + " [ 0.03790045, 0.46613502],\n", + " [-0.30579528, -0.90247215]])" + ] + }, + "execution_count": 5, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "np.random.randn(3, 2)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Performing mathematical operations\n", + "\n", + "In NumPy it is really easy to perform certain mathematical operations on the data. Let's try a few on randomly generated matrices. " + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Before [[-0.41243102 -0.52198758 -0.43280414]\n", + " [-1.24399283 -1.01050165 -0.01265889]]\n" + ] + } + ], + "source": [ + "data = np.random.randn(2, 3)\n", + "\n", + "print('Before', data)" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "After [[ -4.12431021 -5.21987576 -4.32804135]\n", + " [-12.43992834 -10.1050165 -0.12658892]]\n" + ] + } + ], + "source": [ + "print('After', data * 10)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Keep in mind, the above multiplication operation will not actually change the original matrix. It returns a new matrix with values of the previous matrix multiplied by 10. We can check this by printing the values of `data`. 
" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[-0.41243102, -0.52198758, -0.43280414],\n", + " [-1.24399283, -1.01050165, -0.01265889]])" + ] + }, + "execution_count": 8, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "data" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "There are several different types of mathematical operations that we can perform on such matrices. NumPy provides a simple, readable syntax for such operations. Let's try a few more.\n", + "\n", + "## Add matrix to itself" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[-0.82486204, -1.04397515, -0.86560827],\n", + " [-2.48798567, -2.0210033 , -0.02531778]])" + ] + }, + "execution_count": 9, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "data + data" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Add 1 to each element of the matrix" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[ 0.58756898, 0.47801242, 0.56719586],\n", + " [-0.24399283, -0.01050165, 0.98734111]])" + ] + }, + "execution_count": 10, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "data + 1" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Divide each element by 2" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[-0.20621551, -0.26099379, -0.21640207],\n", + " [-0.62199642, -0.50525083, -0.00632945]])" + ] + }, + "execution_count": 11, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "data / 2" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + 
"source": [ + "## Add 2 different matrices" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Data 1 => [[-0.41243102 -0.52198758 -0.43280414]\n", + " [-1.24399283 -1.01050165 -0.01265889]]\n", + "Data 2 => [[ 0.23872176 1.87540312 0.25237765]\n", + " [-0.3875943 0.03512294 -0.00854803]]\n", + "Data 1 + Data 2 => [[-0.17370926 1.35341554 -0.18042649]\n", + " [-1.63158713 -0.97537871 -0.02120692]]\n" + ] + } + ], + "source": [ + "data2 = np.random.randn(2, 3)\n", + "\n", + "print('Data 1 =>', data)\n", + "print('Data 2 =>', data2)\n", + "print('Data 1 + Data 2 =>', data + data2)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "There are some restrictions, of course. To be added element by element, the two matrices must have the same shape, or shapes that NumPy can broadcast together (as with the single numbers used above). Adding arrays with incompatible shapes, such as `(2, 3)` and `(3, 2)`, raises an error." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# NumPy is really fast\n", + "\n", + "The biggest advantage of NumPy is that it is really fast. That is the main reason why we use it. \n", + "\n", + "Using the code below, let us compare a NumPy array to a Python list. We will make the array and the list the same size and perform exactly the same operation on both. We will then measure the time each takes to execute and compare the two. 
" + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": { + "scrolled": true + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "CPU times: user 0 ns, sys: 3.77 ms, total: 3.77 ms\n", + "Wall time: 3.43 ms\n" + ] + } + ], + "source": [ + "import numpy as np\n", + "arr = np.arange(100000) # Numpy array\n", + "lis = list(range(100000))# Python List\n", + "%time for _ in range(10): arr2 = arr * 2" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We have the result for NumPy above. Now let us try performing the same operation using a list in Python. The list should ideally take longer." + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "CPU times: user 80.7 ms, sys: 11.8 ms, total: 92.5 ms\n", + "Wall time: 90 ms\n" + ] + } + ], + "source": [ + "%time for _ in range(10): my_list2 = [x * 2 for x in lis]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "There we go. The execution time for the list in `ms` (milli-seconds) is substantially larger.\n", + "\n", + "We have taken a large array, so that the time difference is noticeable. If we take a small array, the time to perform the operation would be really small, and the time in both cases could be very similar. You can try reducing the array size, and confirm that the time taken is almost the same. \n", + "\n", + "When handling large data, NumPy-based algorithms are generally 10 to 100 times faster (or more) than their\n", + "pure Python counterparts and they also use significantly less memory. When handling large data, the CPU time is not your only problem. You might actually run out of RAM. NumPy is optimised to save RAM. It can perform the same operations faster, and with a lesser memory footprint. 
" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.6" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/NumPy/Lesson 2 - Dimensions.ipynb b/NumPy/Lesson 2 - Dimensions.ipynb new file mode 100644 index 0000000..61bcd98 --- /dev/null +++ b/NumPy/Lesson 2 - Dimensions.ipynb @@ -0,0 +1,634 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Dimensions & Shapes\n", + "\n", + "First things first. Let us understand what a matrix really is within Python. Is it a list, is it a tuple? Let's find out.\n", + "\n", + "To start, lets start by creating a list. " + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[1, 2, 3, 4, 5]\n" + ] + } + ], + "source": [ + "list1 = [1,2,3,4,5]\n", + "print(list1)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now that we have a list, let us convert this to an ndarray." + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[1 2 3 4 5]\n" + ] + } + ], + "source": [ + "import numpy as np\n", + "\n", + "mat1 = np.array(list1)\n", + "print(mat1)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "What changed? Nothing actually. `mat1` is exactly same as `list1`. Well, not really. A lot has changed. They both appear the same when you print them, but they are actually completely different objects. \n", + "\n", + "`list1` is of type `list`, where as `mat1` is of type `ndarray`. 
The `ndarray` is a class within the NumPy package. \n", + "\n", + "Let us confirm this once by looking at the specification of the two variables." + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [], + "source": [ + "list1?" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [], + "source": [ + "mat1?" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "OK, so what we created from a list is actually a 1D array in NumPy. Let us understand more about 1D arrays in Python." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# 1D Array \n", + "\n", + "So, how do we define a one dimensional array, or 1D array? \n", + "\n", + "*An array is said to be 1D (one-dimensional) if either the number of columns or the number of rows is 1.* \n", + "\n", + "Yes, our `mat1` is a 1D array, because it has one row and 5 columns. \n", + "\n", + "In linear algebra, a 1D array is simply called a **vector**. We might interchangeably use the terms `vector` and `1D array`, so get used to it. Both mean the same thing. \n", + "\n", + "Now, what if we wanted to create a 1D array that has 1 column and 4 rows? How would we do that? We are going to do that using a `list` first and then using `ndarray`.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": { + "scrolled": true + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[[1]\n", + " [2]\n", + " [3]\n", + " [4]]\n" + ] + } + ], + "source": [ + "list2 = [[1],[2],[3],[4]]\n", + "mat2 = np.array(list2)\n", + "print(mat2)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Interesting, huh? What just happened? In a literal sense, the line `list2 = [[1],[2],[3],[4]]` actually created 5 independent lists. \n", + "\n", + "1. [1] -> first list\n", + "2. [2] -> second list\n", + "3. [3] -> third list\n", + "4. [4] -> fourth list\n", + "5. [....] -> fifth list (the outer main list)\n", + "\n", + "\n", + "To create a vertical 1D array, or as you can say a column vector, we basically added lists within the primary list. We created the main list, but instead of adding plain numbers `1`, `2`, `3`, `4`, to the list, we put the numbers in a list of their own, each of size 1, and then added each of the lists `[1]`, `[2]`, `[3]` and `[4]` to the primary list to create a 1D column array. \n", + "\n", + "Let us spend some more time to understand the datatype of each item here. We have to very clearly understand what is actually going on. To slice a matrix, find its transpose, or invert it, the fundamentals of how the data is arranged in a matrix have to be very clear. \n", + "\n", + "Let us look at the type of `list2`" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [], + "source": [ + "list2?" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "As expected, `list2` is of type `list`.\n", + "\n", + "Now what about the type of the element within the list? " + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [], + "source": [ + "list2item1 = list2[0]\n", + "list2item1?" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "`list2item1`, which represents the first element of `list2`, is also of type `list`. Just to be sure, how does this compare to `list1item1`? " + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": {}, + "outputs": [], + "source": [ + "list1item1 = list1[0]\n", + "list1item1?" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "And there we go. `list1item1` is actually of type `int` and not `list`. This clearly explains that to create a row array we created a single list, but to create a column array, we created 5 lists. 
Some developers coming from C/C++ might say that a vertical 1D matrix is actually a 2D matrix. They would not be entirely incorrect in saying so.\n", + "\n", + "Let us see how to create a 2D matrix" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# 2D Array \n", + "\n", + "An array that has **1D arrays as its elements** is called a 2D array. This is the same as the column array that we saw above, but the inner arrays typically have more than one element each. Below is an example." + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[[1 1]\n", + " [2 2]\n", + " [3 3]\n", + " [4 4]]\n" + ] + } + ], + "source": [ + "list2d = [[1,1],[2,2],[3,3],[4,4]]\n", + "\n", + "mat2d = np.array(list2d)\n", + "\n", + "print(mat2d)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We first created a 2 dimensional list, and then converted that list into an `ndarray`. As a result, we have an array with 4 rows, where each row has 2 columns. \n", + "\n", + "Keep in mind, each row is actually an independent list. \n", + "\n", + "``` Python\n", + "list2d = [[1,1],[2,2],[3,3],[4]]\n", + "mat2d = np.array(list2d)\n", + "```\n", + "\n", + "Do you think we could use an uneven combination like the one shown above? The last list has just 1 element in it instead of 2. The answer is, you cannot build a regular matrix this way. We must pass inner lists of equal length for NumPy to be able to make a matrix from them. Recent NumPy versions (1.24 and later) raise a `ValueError` for such ragged input, while older versions only emit a deprecation warning and fall back to a 1D array of Python list objects." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Why use 2D arrays?\n", + "\n", + "It is simple actually. Most of the datasets you use will be in 2D form. If you read data from Excel, it will be in 2D. You can easily represent any tabular data (row & column) in 2D, and most of the structured data you encounter will be in 2D form. 
Excel files and CSV files, are the most commonly encountered forms of 2D data. \n", + "\n", + "Here are some example datasets:
\n", + "1) [Titanic dataset](https://www.kaggle.com/c/titanic)
\n", + "2) [Auto-MPG dataset](https://www.kaggle.com/uciml/autompg-dataset)
\n", + "3) [Credit Card Fraud Detection dataset](https://www.kaggle.com/mlg-ulb/creditcardfraud)
\n", + "\n", + "We can represent all of these datasets as 2D matrices. " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# 3D Array \n", + "\n", + "An array that has **2D arrays as its elements** is called a 3D array. As you can see in the below code block, There are two 2D arrays stacked up on top of each other as elements. The process to create a 3D array is same as that of creating a 2D array. Let's look at an example." + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[[[1 1]\n", + " [2 2]\n", + " [3 3]\n", + " [4 4]]\n", + "\n", + " [[5 5]\n", + " [6 6]\n", + " [7 7]\n", + " [8 8]]]\n" + ] + } + ], + "source": [ + "list2d_1 = [[1,1],[2,2],[3,3],[4,4]]\n", + "list2d_2 = [[5,5],[6,6],[7,7],[8,8]]\n", + "\n", + "list3d = [list2d_1, list2d_2]\n", + "\n", + "mat3d = np.array(list3d)\n", + "print(mat3d)" + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[[[ 1 2 3]\n", + " [ 4 5 6]]\n", + "\n", + " [[ 7 8 9]\n", + " [10 11 12]]]\n" + ] + } + ], + "source": [ + "arr3d = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])\n", + "print(arr3d)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We have basically combined 2 different 2D arrays to make a 3D array. Again, all lists must be appropriately sized to make a full formed 3D matrix; else we will receive an error.\n", + "\n", + "## Why use 3D arrays?\n", + "\n", + "Well there is lot of content that often comes in 3D form. We basically have rows and columns of data, but then each cell in this 2D matrix has more than one property associated with it; and that makes up the 3rd dimension. \n", + "\n", + "An example use case of a 3D array is image data. An image consists of a 2 dimensional space of pixels, and then each pixel has an RGB value. 
The RGB stands for Red, Green and Blue colors. Combinations of the RGB colors make the effective color of the pixel. \n", + "\n", + "Let's take a look at an image, and how the data corresponding to it would appear.\n", + "\n", + "An **image** is simply a matrix of pixels. (Checkout the below image)\n", + "![3D](./images/Numpy_lsn2.jpg)\n", + "\n", + "Each element in the above matrix is a pixel value in that image.\n", + "\n", + "
In the shape,
1) 500 - Height of the image (number of rows).
2) 800 - Width of the image (number of columns).
3) 3 - Color channel of the image.
 \n", + "\n", + "For each pixel there are three channel values (RGB: Red, Green, Blue) that carry its colour information. A pixel coordinate (row and column) combined with the channel axis therefore gives three axes, which is why colour images are 3 dimensional matrices.\n", + "\n", + "## How RGB color codes work\n", + "\n", + "Combinations of these 3 colors can reproduce most of the colors that we see. Computer displays and television sets render all colors as a combination of RGB values. The value of each color is a number from 0 to 255. \n", + "\n", + "So R = 0 would mean the color R is fully absent, and R = 255 would mean the color R is fully present. \n", + "\n", + "(R = 255, G = 255, B = 255) would mean that Red, Green and Blue are all present at full intensity. The effective color produced by mixing all three at full intensity is White.\n", + "\n", + "(R = 0, G = 0, B = 0) would mean no color is mixed. If all colors are absent, there is no light at all, which effectively gives us Black. \n", + "\n", + "(R = 127, G = 127, B = 127): what would this mean? All colors are again mixed equally, so would we get White? The answer is yes and no. Equal amounts of all three give a neutral color with no tint, like white. But the value of each color also represents brightness, so this is a white that is only half as bright as (255,255,255). What does a less bright white make? The answer is grey. We would thereby have a 50% grey shade. (255,255,255) would similarly be 100% grey, but 100% grey is called white. 
 \n", + "\n", + "Let's take a few more examples: \n", + "\n", + "> (R = 255, G = 0, B = 0) would make solid Red color\n", + "\n", + "> (R = 0, G = 255, B = 0) would make solid Green color\n", + "\n", + "> (R = 0, G = 0, B = 255) would make solid Blue color\n", + "\n", + "> (R = 127, G = 0, B = 0) would make dark Red color\n", + "\n", + "> (R = 50, G = 0, B = 0) would make an even darker shade of Red\n", + "\n", + "\n", + "## Converting to Greyscale\n", + "\n", + "In many ML projects, you will convert RGB images to greyscale images. Greyscale images are what we usually call black and white images. They only use shades of the color grey. 0% grey would mean black, and 100% grey would mean white. Between these exist several other shades of grey, with a total of 256 grey shades possible.\n", + "\n", + "Since the color grey always uses equal proportions of Red, Green and Blue, we can represent each pixel of a greyscale image by a single number.\n", + "\n", + "50% grey, which is (127,127,127), can be represented by just the number 127. This is the standard convention. For greyscale images we specify a single value for the amount of grey, in the range of 0 to 255, without specifying the individual RGB values. \n", + "\n", + "\n", + "## Greyscale images are 2D\n", + "\n", + "Since we don't need RGB values for greyscale images, we can fit the pixel data of a greyscale image in a 2D matrix. The rows and columns represent pixels, and each pixel holds an integer value in the range of 0 to 255.\n", + "\n", + "```\n", + "[[0,0],[0,0]]\n", + "```\n", + "\n", + "The above list can represent an image of size 2x2. It is a greyscale image. There are 4 pixels in the image, and all pixels are black. So what we have here is a 2x2 square black image. \n", + "\n", + "\n", + "```\n", + "[[255,255], [255,255]]\n", + "```\n", + "\n", + "The above list represents a similar 2x2 greyscale image, but this one is a square white image. 
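
The RGB and greyscale ideas above can be sketched in NumPy as follows. The pixel values are made up for illustration, and averaging the three channels is only one simple way to get a grey value (an assumption made here; image libraries typically use a weighted sum instead).

```python
import numpy as np

# A hypothetical 2x2 RGB image: the axes are (rows, columns, channels).
rgb = np.array([
    [[255, 0, 0], [0, 255, 0]],        # red pixel, green pixel
    [[0, 0, 255], [127, 127, 127]],    # blue pixel, 50% grey pixel
], dtype=np.uint8)

# Simplest possible greyscale conversion: average the three channels.
# This is a sketch, not how any particular image library does it.
grey = rgb.mean(axis=2).astype(np.uint8)

print(rgb.shape)   # three axes: rows, columns, channels
print(grey.shape)  # two axes: rows, columns
print(grey)
```

The RGB array has three axes while the greyscale result has only two, which is why a greyscale image fits in a 2D matrix.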
" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**From above,** \n", + "\n", + "1) In 1 dimensional matrix(mat_1), **(6,)** denotes that the matrix is one-dimensional and there are 6 rows in it.
 \n", + "2) In 3 dimensional matrix(mat_3), **(2,2,3)** denotes that there are 2 matrices of shape (2,3). \n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# More than 3 dimensions\n", + "\n", + "As the complexity of the data increases, the number of dimensions keeps increasing; in general we speak of **N-Dimensions**. We can use `ndarray` to make arrays of any number of dimensions. \n", + "\n", + "To make a 4 dimensional array, we would simply put 3D arrays into a list; and so on and so forth to make any number of dimensions. " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Determining Shape & Dimensions\n", + "\n", + "The number of dimensions and items in an array is defined by its shape, which is a tuple of N non-negative integers that specify the sizes of each dimension. The type of items in the array is specified by a separate data-type object (dtype), one of which is associated with each n-dimensional array(ndarray).\n", + "\n", + "Let's take a look." + ] + }, + { + "cell_type": "code", + "execution_count": 19, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "(4, 2)\n", + "2\n" + ] + } + ], + "source": [ + "list2d = [[1,1],[2,2],[3,3],[4,4]]\n", + "mat2d = np.array(list2d)\n", + "\n", + "print(mat2d.shape)\n", + "print(mat2d.ndim)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The shape is returned in the form of a tuple. We use the `shape` attribute to fetch the shape of a matrix (it is an attribute, not a function, so there are no parentheses after it). Here 4 is the number of rows, and 2 is the number of columns. We are basically looking at a 4x2 matrix. \n", + "\n", + "We also use the `ndim` attribute to determine the number of dimensions of the matrix. Here it is `2`, which means this is a 2D matrix.\n", + "\n", + "Let us try something interesting. Getting the shape of a 1D matrix. 
" + ] + }, + { + "cell_type": "code", + "execution_count": 20, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[1 2 3 4]\n", + "(4,)\n", + "1\n" + ] + } + ], + "source": [ + "mat1 = np.array([1,2,3,4])\n", + "print(mat1)\n", + "print(mat1.shape)\n", + "print(mat1.ndim)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "What we got is very interesting. We would have assumed this matrix to have 1 row and 4 columns. This is not actually true. The shape is `(4,)`: a single axis holding 4 elements, with no second dimension at all. It prints like a row, but it is not the same thing as a `(1, 4)` matrix with 1 row and 4 columns.\n", + "\n", + "We can confirm that this is a 1D matrix, as the `ndim` attribute has the value `1`\n", + "\n", + "Let's try a vertical matrix." + ] + }, + { + "cell_type": "code", + "execution_count": 21, + "metadata": { + "scrolled": true + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[[1]\n", + " [2]\n", + " [3]\n", + " [4]]\n", + "(4, 1)\n", + "2\n" + ] + } + ], + "source": [ + "mat2 = np.array([[1],[2],[3],[4]])\n", + "print(mat2)\n", + "print(mat2.shape)\n", + "print(mat2.ndim)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now we see that NumPy officially considers this a 2D matrix. The shape is `(4,1)`, which means 4 rows and 1 column. The `(4,)` and `(4,1)` shapes describe very different arrangements of data. We need to understand this fully and keep it in mind at all times. The `ndim` attribute also has the value `2`, confirming that this is a 2D matrix.\n", + "\n", + "Let's now try to get the shape of a 3D matrix." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 15, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[[[1]\n", + " [2]]\n", + "\n", + " [[1]\n", + " [2]]]\n", + "(2, 2, 1)\n", + "3\n" + ] + } + ], + "source": [ + "mat3 = np.array([[[1],[2]],[[1],[2]]])\n", + "print(mat3)\n", + "print(mat3.shape)\n", + "print(mat3.ndim)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The above example clearly denotes, that this is a 3D matrix; where in there are 2 individual matrices each of shape `(2,1)`. \n", + "\n", + "So out of curiosity, how would we create a `(2,2,2)` matrix?" + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[[[1 1]\n", + " [2 2]]\n", + "\n", + " [[1 1]\n", + " [2 2]]]\n", + "(2, 2, 2)\n" + ] + } + ], + "source": [ + "mat3 = np.array([[[1,1],[2,2]],[[1,1],[2,2]]])\n", + "print(mat3)\n", + "print(mat3.shape)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "There you go. We have a (2,2,2) matrix. This is also the equivalent of a Cube. 
" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.6" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/NumPy/Lesson 3 - Basic Operations.ipynb b/NumPy/Lesson 3 - Basic Operations.ipynb new file mode 100644 index 0000000..90e47f6 --- /dev/null +++ b/NumPy/Lesson 3 - Basic Operations.ipynb @@ -0,0 +1,1279 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Creating ndarray" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "NumPy offers several array creation functions. We have already tried the `.array` function to create an array from a List. There are several other built-in functions offered by NumPy for quick use. " + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([1, 2, 3, 4, 6])" + ] + }, + "execution_count": 1, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "import numpy as np\n", + "\n", + "a = [1,2,3,4,6] # This is a list\n", + "arr = np.array(a) # Passing the list, converting to numpy array\n", + "\n", + "arr" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This is the default function we have used so far to create an array. \n", + "\n", + "Let's try something else. " + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[0 1 2 3 4]\n" + ] + } + ], + "source": [ + "a = np.arange(5)\n", + "print(a)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Nice uh? 
We have quickly created a 1D array of 5 integers, starting at 0 and stopping just before the value passed to `arange`. \n", + "\n", + "Let's try again. " + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[0 1 2 3 4 5 6 7 8 9]\n" + ] + } + ], + "source": [ + "a = np.arange(10)\n", + "print(a)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Since we specified `10` as input to `arange`, we got an array with 10 elements in it.\n", + "\n", + "Let's try some more examples." + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[0. 0. 0. 0. 0.]\n" + ] + } + ], + "source": [ + "a = np.zeros(5)\n", + "print(a)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Here you go. We now have an array containing 5 elements, but each element has the value `0`. It is interesting to note that the value `0` is actually of type float. We can spot it because the value reads `0.` and not `0`. The `0.` is the same as the value `0.0`\n", + "\n", + "So how do we produce an array with integer zeros? We do so by explicitly specifying the datatype. " + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": { + "scrolled": true + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[0 0 0 0 0]\n" + ] + } + ], + "source": [ + "a = np.zeros(5, dtype=np.int32)\n", + "print(a)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Many array creation functions in NumPy accept an extra data type parameter. Here `dtype=np.int32` explicitly specifies that we need the elements to be of type `int32`. 
We can see that the result now contains `0` and not `0.`" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Array Creation Functions\n", + "\n", + "The below table shows an exhaustive list of array creation functions found in NumPy. You can give them a try and see how each of them works. We strongly encourage you to try them and play around with them. \n", + "\n", + "![](./images/array_creation.JPG)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Data Types\n", + "\n", + "The below table shows an exhaustive list of data types. Keep in mind that you can also create 1D, 2D and 3D arrays with string data. You are not restricted to numerical data. In fact, elements of the array can be of any type, including objects. \n", + "\n", + "Do play around and create arrays using several of these data types. \n", + "\n", + "![](./images/data_types.JPG)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Datatype Conversions\n", + "\n", + "We can convert the datatype of elements in an array. NumPy provides a convenient `astype` method to obtain an array in the desired datatype. For the operation to succeed, it must be possible to convert all elements to the specified datatype. " + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "['1' '2' '3']\n", + "[1 2 3]\n" + ] + } + ], + "source": [ + "a = np.array(['1', '2', '3'])\n", + "print(a)\n", + "b = a.astype(np.int32)\n", + "print(b)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In the above example, we started with an array of type String. The array `a` has all elements as strings, and this can be seen in the corresponding print. \n", + "\n", + "We then used the `astype` method to create a new array of type int. All elements within the array were converted to int and returned. 
This is represented by the array `b`\n", + "\n", + "Keep in mind that calling the `astype` function creates a new array. The original array remains unchanged." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Arithmetic Operations\n", + "\n", + "Let's perform some basic arithmetic operations like **addition, subtraction, multiplication and division** on below 2 arrays using numpy. " + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [], + "source": [ + "import numpy as np\n", + "\n", + "a = np.array([[1,1],[1,1]])\n", + "b = np.array([[2,2],[2,2]])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Addition" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[3, 3],\n", + " [3, 3]])" + ] + }, + "execution_count": 8, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "a+b" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Subtraction" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[-1, -1],\n", + " [-1, -1]])" + ] + }, + "execution_count": 9, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "a-b" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Multiplication" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[2, 2],\n", + " [2, 2]])" + ] + }, + "execution_count": 10, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "a*b" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Division" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[0.5, 0.5],\n", + " [0.5, 0.5]])" 
+ ] + }, + "execution_count": 11, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "a/b" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Raising powers\n", + "\n", + "We can also raise array values to a power." + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[1, 1],\n", + " [1, 1]])" + ] + }, + "execution_count": 12, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Array 'a' values raised to the power of 2.\n", + "np.power(a,2)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's try the same thing with the `b` matrix and see if the power raise works. " + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[4, 4],\n", + " [4, 4]])" + ] + }, + "execution_count": 13, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Array 'b' values raised to the power of 2.\n", + "np.power(b,2)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Reshaping a matrix\n", + "\n", + "We can also reshape our matrix using the **reshape** function.\n", + "\n", + "## Reshaping a 1D matrix\n", + "Let's reshape a matrix with 1 dimension from shape (3,) to (3,1).
\n", + "**Example 1**" + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([1, 2, 3])" + ] + }, + "execution_count": 14, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "a = np.array([1,2,3])\n", + "a" + ] + }, + { + "cell_type": "code", + "execution_count": 15, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[1],\n", + " [2],\n", + " [3]])" + ] + }, + "execution_count": 15, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "a = np.reshape(a, (3,1))\n", + "a" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Example 2**
 \n", + "Reshaping a matrix from shape (8,) to (8,1)." + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([ 4, 5, 6, 12, 76, 26, 67, 73])" + ] + }, + "execution_count": 16, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "a = np.array([4,5,6,12,76,26,67,73])\n", + "a\n", + "# Shape : (8,)" + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[ 4],\n", + " [ 5],\n", + " [ 6],\n", + " [12],\n", + " [76],\n", + " [26],\n", + " [67],\n", + " [73]])" + ] + }, + "execution_count": 17, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "np.reshape(a, (8,1))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Reshaping a 2D matrix\n", + "Let's reshape a matrix with 2 dimensions from shape (2,3) to (3,2).\n", + "\n", + "**Example 1**" + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[1, 2, 3],\n", + " [4, 5, 6]])" + ] + }, + "execution_count": 18, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "b = np.array([[1,2,3],\n", + " [4,5,6]])\n", + "b" + ] + }, + { + "cell_type": "code", + "execution_count": 19, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[1, 2],\n", + " [3, 4],\n", + " [5, 6]])" + ] + }, + "execution_count": 19, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "b = np.reshape(b, (3,2))\n", + "b" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "It is important to closely observe how the matrix was actually reshaped. See how 3 and 4 ended up on the same row: `reshape` refills the values row by row in their original order, it does not transpose the matrix." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Example 2**
\n", + "Reshaping a matrix from shape (3,5) to (5,3)." + ] + }, + { + "cell_type": "code", + "execution_count": 20, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[ 56, 76, 31, 44, 55],\n", + " [ 33, 78, 123, 33, 28],\n", + " [ 12, 23, 43, 1, 2]])" + ] + }, + "execution_count": 20, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "b = np.array([[56,76,31,44,55],\n", + " [33,78,123,33,28],\n", + " [12,23,43,1,2]])\n", + "b\n", + "# Shape : (3,5)" + ] + }, + { + "cell_type": "code", + "execution_count": 21, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[ 56, 76, 31],\n", + " [ 44, 55, 33],\n", + " [ 78, 123, 33],\n", + " [ 28, 12, 23],\n", + " [ 43, 1, 2]])" + ] + }, + "execution_count": 21, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "np.reshape(b, (5,3))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Reshaping a 3D matrix\n", + "\n", + "Let's reshape a 3D matrix from shape (2,2,3) to (3,2,2).\n", + "\n", + "**Example 1**" + ] + }, + { + "cell_type": "code", + "execution_count": 22, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[[ 1, 2, 3],\n", + " [ 4, 5, 6]],\n", + "\n", + " [[ 7, 8, 9],\n", + " [10, 11, 12]]])" + ] + }, + "execution_count": 22, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "c = np.array([[[1,2,3],\n", + " [4,5,6]],\n", + " \n", + " [[7,8,9],\n", + " [10,11,12]]])\n", + "\n", + "c" + ] + }, + { + "cell_type": "code", + "execution_count": 23, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[[ 1, 2],\n", + " [ 3, 4]],\n", + "\n", + " [[ 5, 6],\n", + " [ 7, 8]],\n", + "\n", + " [[ 9, 10],\n", + " [11, 12]]])" + ] + }, + "execution_count": 23, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "np.reshape(c, (3,2,2))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + 
"source": [ + "**Example 2**
\n", + "Reshaping a matrix from shape (1,2,3) to (1,3,2). This should effectively just reshape the inside 2D matrix. Let's give it a try. " + ] + }, + { + "cell_type": "code", + "execution_count": 24, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[[1, 2, 3],\n", + " [4, 5, 6]]])" + ] + }, + "execution_count": 24, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "c = np.array([[[1,2,3],\n", + " [4,5,6]]])\n", + "c" + ] + }, + { + "cell_type": "code", + "execution_count": 25, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[[1, 2],\n", + " [3, 4],\n", + " [5, 6]]])" + ] + }, + "execution_count": 25, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "np.reshape(c, (1,3,2))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**NOTE** : You can reshape the 2D matrices present inside a 3D matrix. Consider the above example, we reshaped a 3D matrix from shape (1,2,3) to (1,3,2). That is, we simply reshaped the inside 2D matrix (2,3) to (3,2). " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "# Joining\n", + "\n", + "Out of different join functions available in numpy, we mostly use `concatenate` to join 2 or more arrays. 
Make sure the arrays have the same size along every axis except the one being joined.\n", + "\n", + "**Use Case** This is one of the most useful functions which can be used to combine two or more datasets to work on a single dataset.\n", + "\n", + "**Note** : In numpy,\n", + "\n", + "`axis=0` is the first axis and runs down the rows (vertically)\n", + "`axis=1` is the second axis and runs across the columns (horizontally)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Joining two 1D arrays\n" + ] + }, + { + "cell_type": "code", + "execution_count": 26, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([ 1, 2, 3, 4, 5, 6, 12, 76, 26, 67, 73])" + ] + }, + "execution_count": 26, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "a = np.array([1,2,3])\n", + "b = np.array([4,5,6,12,76,26,67,73])\n", + "\n", + "np.concatenate([a,b])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The concatenate function produces a new matrix. The original matrices `a` and `b` remain unchanged after the concatenate operation." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Joining two 2D arrays\n" + ] + }, + { + "cell_type": "code", + "execution_count": 27, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[ 1, 2, 3],\n", + " [ 4, 5, 6],\n", + " [ 56, 76, 31],\n", + " [ 44, 55, 33],\n", + " [ 78, 123, 33],\n", + " [ 28, 12, 23],\n", + " [ 43, 1, 2]])" + ] + }, + "execution_count": 27, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "a = np.array([[1,2,3],\n", + " [4,5,6]])\n", + "b = np.array([[ 56, 76, 31],\n", + " [ 44, 55, 33],\n", + " [ 78, 123, 33],\n", + " [ 28, 12, 23],\n", + " [ 43, 1, 2]])\n", + "\n", + "np.concatenate([a,b], axis=0)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "When we specify `axis=0`, the arrays are joined along the first axis. We can see that the rows of `b` got added below the rows of `a`, so each column received the additional values. 
\n", + "\n", + "Let us try performing a concatenation for `axis=1`" + ] + }, + { + "cell_type": "code", + "execution_count": 28, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[ 1, 2, 3, 56, 76, 31],\n", + " [ 4, 5, 6, 44, 55, 33]])" + ] + }, + "execution_count": 28, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "a = np.array([[1,2,3],\n", + " [4,5,6]])\n", + "b = np.array([[ 56, 76, 31],\n", + " [ 44, 55, 33]])\n", + "\n", + "np.concatenate([a,b], axis=1)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Joining two 3D arrays" + ] + }, + { + "cell_type": "code", + "execution_count": 29, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[[ 1, 2, 3],\n", + " [ 4, 5, 6],\n", + " [ 99, 199, 299]],\n", + "\n", + " [[ 7, 8, 9],\n", + " [ 10, 11, 12],\n", + " [ 88, 188, 288]]])" + ] + }, + "execution_count": 29, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "a = np.array([[[1,2,3],\n", + " [4,5,6]],\n", + " \n", + " [[7,8,9],\n", + " [10,11,12]]]) # Shape : (2,2,3)\n", + "\n", + "b = np.array([[[99,199,299]],[[88,188,288]]]) # Shape : (2,1,3)\n", + "\n", + "np.concatenate([a,b],axis=1)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Splitting\n", + "We use `array_split` to split an array into `n` number of parts.\n", + "\n", + "**Use Case** Splitting can be used to divide the dataset, hence working or experimenting on various parts of data to understand it's behaviour." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 30, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[array([1, 2]), array([3, 4]), array([5, 6])]\n" + ] + } + ], + "source": [ + "a = np.array([1, 2, 3, 4, 5, 6])\n", + "newarr = np.array_split(a, 3)\n", + "\n", + "print(newarr)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "As you can see above, a new array is created, splitting `a` into `3` different arrays. \n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Random\n", + "\n", + "The numpy.random module is responsible for generating arrays with random numbers. Each time you run the below cells, different numbers are generated." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## np.random.randn\n", + "\n", + "The **randn** generates an array of random float numbers whose shape is to be given and it draws the numbers from a **standard normal distribution.**" + ] + }, + { + "cell_type": "code", + "execution_count": 31, + "metadata": { + "scrolled": true + }, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[ 0.34291505, 0.15846929],\n", + " [-0.28041854, 1.37424051],\n", + " [-1.00608324, -1.23477004]])" + ] + }, + "execution_count": 31, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "import numpy as np\n", + "\n", + "# Below (3,2) represents the shape of the array.\n", + "np.random.randn(3,2)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## np.random.randint\n", + "The **randint** generates an array of random integers.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 32, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[4, 7],\n", + " [3, 6],\n", + " [2, 9]])" + ] + }, + "execution_count": 32, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Below, an array of size (3,2) is generated 
with random integers from 1 to 9 (the upper bound 10 is exclusive).\n", + "np.random.randint(1,10, size=(3,2))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## np.random.rand\n", + "The **rand** function generates an array of random floats of the given shape, drawn from a **uniform distribution** over [0, 1)." + ] + }, + { + "cell_type": "code", + "execution_count": 33, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[0.61846814, 0.38662633, 0.25613095, 0.5793173 ],\n", + " [0.31091698, 0.79434679, 0.02547833, 0.11601057]])" + ] + }, + "execution_count": 33, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Below (2,4) represents the shape of the array.\n", + "np.random.rand(2,4)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## np.random.seed\n", + "\n", + "The **seed** function takes a number that fixes the starting state of the random number generator. Every **np.random** call made after it then produces the same sequence, so no matter how many times the cell is run, the generated numbers do not change.
Simply put, every random array generated is determined by the seed value.\n", + "

In the below code block, \n", + "1. First we set the seed value to 5. \n", + "2. Next, we will generate an array of random integers from 1 to 9 (the upper bound 10 is exclusive).\n", + "3. Now run the code block at least 2 times and you'll notice that the values are unchanged.\n", + "4. Now set a different seed value and run the code block at least 2 times to generate new random values." + ] + }, + { + "cell_type": "code", + "execution_count": 34, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[4, 7],\n", + " [7, 1],\n", + " [9, 5]])" + ] + }, + "execution_count": 34, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "np.random.seed(seed=5)\n", + "\n", + "np.random.randint(1,10, size=(3,2))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You can try running the above code multiple times, and you will always get the same matrix. This is because we have explicitly specified a seed value. This is why a random number generator in computer programming is called a pseudo-random number generator. It does not generate numbers that are truly random. " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "# Broadcasting\n", + "\n", + "This is one of the most essential topics in numpy. The term broadcasting describes how NumPy treats arrays with different shapes during arithmetic operations.\n", + "\n", + "NumPy operations are usually done element-by-element, which requires two arrays to have exactly the same shape. However, broadcasting relaxes this requirement when the shapes are compatible. Let's say we want to multiply the elements of an array by the number 2. How would we do it? 
" + ] + }, + { + "cell_type": "code", + "execution_count": 35, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[2 4 6]\n" + ] + } + ], + "source": [ + "a = np.array([1, 2, 3])\n", + "b = 2\n", + "\n", + "print(a * b)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "From the above result, we can infer that the scalar `b` is stretched during the arithmetic operation into an array with the same shape as `a` so the shapes are compatible for element-by-element multiplication.\n", + "![Array Broadcasting image](https://numpy.org/doc/stable/_images/theory.broadcast_1.gif \"Array Broadcasting image\") [source](https://numpy.org/doc/stable/user/theory.broadcasting.html#array-broadcasting-in-numpy)\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "An array of shape (3,) and the single number 2 do not have the same shape, so this is not strictly an element-by-element operation. This is why we say that NumPy conceptually stretches the `2` into an array, and then performs the element-by-element multiplication.\n", + "\n", + "## The Broadcasting Rule\n", + "\n", + "> In order to broadcast, the size of the axes of both arrays in an operation must either be the same size or one of them must be one.\n", + "\n", + "\n", + "Consider an example below,\n", + "\n", + "Suppose there's a matrix `a` and you would like to scale `1st row by 3 times, 2nd row by 4 times and 3rd row by 5 times`.\n", + "\n", + " a = [[1,2,3], \n", + " [4,5,6],\n", + " [7,8,9]]\n", + " \n", + " Shape of a = (3,3)\n", + " \n", + "Now in order to scale the matrix `a` by given values, we'll make another matrix `b`,\n", + " \n", + " b = [[3],\n", + " [4],\n", + " [5]]\n", + " \n", + " Shape of b = (3,1)\n", + " \n", + "As the broadcasting rule says,\n", + "1. The first axes of both arrays have the same size, (i.e) first axis of `a` = 3, first axis of `b` = 3.\n", + "2. For the second axis, one of the sizes is `1`. 
(i.e.) the second axis of array `b` has size 1\n", + "\n", + "\n", + "Now since the **Broadcasting Rule** is met, the broadcasting can happen.\n", + "\n", + " Output = [[3,6,9],\n", + " [16,20,24],\n", + " [35,40,45]]" + ] + }, + { + "cell_type": "code", + "execution_count": 36, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[[ 3 6 9]\n", + " [16 20 24]\n", + " [35 40 45]]\n" + ] + } + ], + "source": [ + "a = np.array([[1,2,3],\n", + " [4,5,6],\n", + " [7,8,9]])\n", + "\n", + "b = np.array([[3],[4],[5]]) # Shape : (3,1), one scale factor per row\n", + "\n", + "print(a*b)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "**Case when broadcast fails**
\n", + "When the trailing dimensions of the arrays are unequal, broadcasting fails because it is impossible to align the values in the rows of the 1st array with the elements of the 2nd arrays for element-by-element addition.\n", + "![Broadcast Fail](https://numpy.org/doc/stable/_images/theory.broadcast_3.gif \"Broadcast Fail\")\n", + "\n", + "[source](https://numpy.org/doc/stable/user/theory.broadcasting.html#array-broadcasting-in-numpy)" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.6" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/NumPy/Lesson 4 - Indexing and Slicing.ipynb b/NumPy/Lesson 4 - Indexing and Slicing.ipynb new file mode 100644 index 0000000..3a0e649 --- /dev/null +++ b/NumPy/Lesson 4 - Indexing and Slicing.ipynb @@ -0,0 +1,1383 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Indexing and Slicing\n", + "\n", + "NumPy array indexing is a rich topic, as there are many ways you may want to select\n", + "a subset of your data or individual elements. So far we have studied the various types of arrays, and how to make them. Now let us look at how we can work with these arrays. \n", + "\n", + "Once you have created an array, you most definitely want to fetch data from it. There are various ways in which one can fetch data from an array. This chapter will cover some of the most useful operations you can perform on arrays.\n", + "\n", + "Let us start with a 1D array as shown in the below example." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "![](./images/1.JPG)" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [], + "source": [ + "import numpy as np\n", + "arr = np.array([1,2,3,4,5,6,7,8])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The first element of the array is marked by index `0` and the last element of the array is marked by index `7`. This is shown in the above representation for reference. \n", + "\n", + "We can directly query a single cell of the array by specifying the index we want to select. Let us try a few examples." + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "8" + ] + }, + "execution_count": 2, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "arr[7]" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "2" + ] + }, + "execution_count": 3, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "arr[1]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now what if we want to query a range of the array. Basically fetch out a slice from the primary array. We can do so by specifying an index range in the form `start:stop`. The `start` refers to the start index (inclusive) and the `stop` refers to the last index upto which we want to fetch (exclusive). This means element at `start` will be included in the slice, but element at `stop` will not be included in the slice." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "![](./images/2.JPG)" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([3, 4, 5])" + ] + }, + "execution_count": 4, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "arr[2:5]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "As can be seen in this example, we fetched the values from index 2 up to, but not including, index 5. The value at index 5 was not included in the slice, as the `stop` value is exclusive, while the `start` value is inclusive. " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Values fetched are by reference\n", + "\n", + "There is one very important concept we need to understand here. The slice fetched is not actually a new array, but a view that references the primary array. This means any change in the primary array is reflected in the slice, and vice versa." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "![](./images/3.JPG)" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [], + "source": [ + "arr_slice = arr[2:5]" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [], + "source": [ + "arr_slice[:] = 25" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([ 1, 2, 25, 25, 25, 6, 7, 8])" + ] + }, + "execution_count": 7, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "arr" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can see that the 3 corresponding values within the array also got set to 25. It is important to note that we made the change to `arr_slice`, but the change also showed up inside `arr`. 
" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Copying Values\n", + "\n", + "Let us say we did not want to pick up the slice by reference, but actually create a copy of the slice. We can do so by using the `copy()` function. Let us take a look. " + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": {}, + "outputs": [], + "source": [ + "arr = np.array([1,2,3,4,5,6,7,8])\n", + "arr_slice = arr[2:5].copy()\n", + "arr_slice[:] = 25" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[1 2 3 4 5 6 7 8]\n" + ] + } + ], + "source": [ + "print(arr)" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[25 25 25]\n" + ] + } + ], + "source": [ + "print(arr_slice)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "As we can see in the above example, the elements within `arr` did not change. The `arr_slice` however got the updated value of 25.\n", + "\n", + "Keep in mind that the `copy()` function is literally creating a new array from the slice. This operation could take a while to perform if the array is very large. It is always recommended to not use the copy and just get a view of the original array. " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Broadcasting\n", + "\n", + "``` Python\n", + "arr[2:5] = 25\n", + "```\n", + "\n", + "This line is an example of broadcasting. Over here we can say that the value 25 is progated to the entire selection, or in other words we can say that the value 25 is broadcasted to the entire selection. \n", + "\n", + "In data science, Broadcasting is very commonly used in several data operations. Python makes it really easy for us to change the values of several elements, in a single command. 
" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# 2D Array\n", + "\n", + "Now that we say how a 1D array looks, let us perform some quick operations on 2D arrays." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "![](./images/4.2.jpg)" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[ 1, 2, 3, 4],\n", + " [ 5, 6, 7, 8],\n", + " [ 9, 10, 11, 12]])" + ] + }, + "execution_count": 11, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "arr2d = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])\n", + "arr2d" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "![](./images/4.1.jpg)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Fetching a single element\n", + "\n", + "Let us use the row and column indexes of the array, to fetch a single element. It is important to the remember here, that this 2D array is actually made of 3 independent arrays that have the cell values, and another outer array that holds the 3 rows. Keep this in mind when you are fetching the indexes. " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\"drawing\"" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": { + "scrolled": true + }, + "outputs": [ + { + "data": { + "text/plain": [ + "4" + ] + }, + "execution_count": 12, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "arr2d[0][3]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We have successfully fetched row = 0 and column = 3. The cell value of `4` gets returned. \n", + "\n", + "Let us try to fetch an entire row." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([5, 6, 7, 8])" + ] + }, + "execution_count": 13, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "arr2d[1]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can see that we fetched the slice of an entire row. This is row #2 or rather the row at index 1." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# 3D Array\n", + "\n", + "Let us take a look at a 3D array, and some of the operations we can perform on the same. " + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[[ 1, 2, 3],\n", + " [ 4, 5, 6]],\n", + "\n", + " [[ 7, 8, 9],\n", + " [10, 11, 12]]])" + ] + }, + "execution_count": 14, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "arr3d = np.array([[[1, 2, 3],\n", + " [4, 5, 6]], \n", + " \n", + " [[7, 8, 9], \n", + " [10, 11, 12]]])\n", + "\n", + "arr3d" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\"drawing\"" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The 3D array shown above, is actually 2 arrays, each of shape `(2,3)`." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 15, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "(2, 2, 3)" + ] + }, + "execution_count": 15, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "arr3d.shape" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Extracting a 2D Slice\n", + "\n", + "Let us try to extract one of the 2D arrays as a slice of the 3D array." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\"drawing\"" + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "metadata": { + "scrolled": true + }, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[1, 2, 3],\n", + " [4, 5, 6]])" + ] + }, + "execution_count": 16, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "arr3d[0]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can see that `arr3d[0]` got us the element at index 0 along the first axis. That element is itself a 2D array, so `arr3d[0]` represents a 2D array."
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Copy / Broadcasting" + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[[31, 31, 31],\n", + " [31, 31, 31]],\n", + "\n", + " [[ 7, 8, 9],\n", + " [10, 11, 12]]])" + ] + }, + "execution_count": 17, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "vals = arr3d[0].copy() # Saving a copy of the first 2D array\n", + "arr3d[0] = 31 # Replacing the array elements of the first 2D array\n", + "\n", + "arr3d" + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[[ 1, 2, 3],\n", + " [ 4, 5, 6]],\n", + "\n", + " [[ 7, 8, 9],\n", + " [10, 11, 12]]])" + ] + }, + "execution_count": 18, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "arr3d[0] = vals # Replacing the elements of the first 2D array with the copy we made earlier\n", + "\n", + "arr3d" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "By default we always get a view of the same primary array, unless we use the `copy()` function to create a new array. The broadcasting operation can be performed on the entire 2D array slice, wherein all elements in the 2D array, that is located within the 3D array, got set to value `31`." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Advanced Operations" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\"drawing\"" + ] + }, + { + "cell_type": "code", + "execution_count": 19, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([1, 2, 3])" + ] + }, + "execution_count": 19, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "arr3d = np.array([[[1, 2, 3],\n", + " [4, 5, 6]], \n", + " \n", + " [[7, 8, 9], \n", + " [10, 11, 12]]])\n", + "arr3d[0][0]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The above example shows how to fetch the first row of the first 2D array that is present within the 3D array. Once we extract `arr3d[0]`, anything we do after that operates the same way as a 2D array. We should remember this for ease of understanding. " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Indexing with Slices" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let us start with a 2D array. Here we will try and use some advanced slicing operations." + ] + }, + { + "cell_type": "code", + "execution_count": 20, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[ 1, 2, 3, 4],\n", + " [ 5, 6, 7, 8],\n", + " [ 9, 10, 11, 12]])" + ] + }, + "execution_count": 20, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "arr2d = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])\n", + "arr2d" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now let us pick out the first 2 rows from the 2D array." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 21, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[1, 2, 3, 4],\n", + " [5, 6, 7, 8]])" + ] + }, + "execution_count": 21, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "arr2d[:2]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You can read this expression as\n", + "\n", + "> select the first 2 rows of arr2d\n", + "\n", + "We must become fully comfortable with this syntax and how it operates. We use such slicing operations extensively in machine learning problems. \n", + "\n", + "We can also pass the slice ranges for each dimension. " + ] + }, + { + "cell_type": "code", + "execution_count": 22, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[2, 3, 4],\n", + " [6, 7, 8]])" + ] + }, + "execution_count": 22, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "arr2d[:2, 1:]" + ] + }, + { + "cell_type": "code", + "execution_count": 23, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[1],\n", + " [5]])" + ] + }, + "execution_count": 23, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "arr2d[:2, :1]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In both examples we sliced across both dimensions. But what is the difference between `:1` and `1:`? \n", + "\n", + "It is simple actually. \n", + "\n", + "**:1**\n", + "\n", + "The `:` before the index means get everything before the specified index. Data at the specified index *IS NOT* included. This means `:0` is valid but selects nothing: it returns an empty array. \n", + "\n", + "**1:**\n", + "\n", + "The `:` after the index means get everything from the specified index up to the end of the array. Data at the specified index *IS* included. 
\n", + "\n", + "## Few more examples" + ] + }, + { + "cell_type": "code", + "execution_count": 24, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[1, 2, 3],\n", + " [4, 5, 6],\n", + " [7, 8, 9]])" + ] + }, + "execution_count": 24, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "arr2d = np.array([[1,2,3],[4,5,6],[7,8,9]])\n", + "arr2d" + ] + }, + { + "cell_type": "code", + "execution_count": 25, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([7, 8, 9])" + ] + }, + "execution_count": 25, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "arr2d[2, :] # same as arr2d[2]" + ] + }, + { + "cell_type": "code", + "execution_count": 26, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[7, 8, 9]])" + ] + }, + "execution_count": 26, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "arr2d[2:, :] # same values as above, but the result stays 2D with shape (1, 3)" + ] + }, + { + "cell_type": "code", + "execution_count": 27, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([4, 5])" + ] + }, + "execution_count": 27, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "arr2d[1, :2]" + ] + }, + { + "cell_type": "code", + "execution_count": 28, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[4, 5]])" + ] + }, + "execution_count": 28, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "arr2d[1:2, :2] # same values as above, but the result stays 2D with shape (1, 2)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Boolean Indexing\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": 29, + "metadata": {}, + "outputs": [], + "source": [ + "names = np.array(['Tom', 'Amy', 'Will', 'Tom', 'Will', 'Will', 'Amy'])\n", + "data = np.array([[ 1, 2 , 3, 4],\n", + " [ 5, 6, 7, 8],\n", + " [ 9, 10, 11, 12],\n", + " [ 13, 14, 15, 16],\n", + " [ 17, 18, 19, 20],\n", + " [ 21, 22, 
23, 24],\n", + " [ 25, 26, 27, 28]])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let us take 2 arrays. The `names` array holds the names of persons, whereas the `data` array holds some activity data for those persons. \n", + "\n", + "We have a total of 7 names inside the `names` array, and one row of data per name, so there are 7 rows inside the `data` array. " + ] + }, + { + "cell_type": "code", + "execution_count": 30, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array(['Tom', 'Amy', 'Will', 'Tom', 'Will', 'Will', 'Amy'], dtype=' The boolean array must be of the same length as the array axis it’s indexing.\n", + "\n", + "Let us try some more examples. For instance, get only the columns from index 2 onward for the rows where the name is `Amy`." + ] + }, + { + "cell_type": "code", + "execution_count": 37, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[ 7, 8],\n", + " [27, 28]])" + ] + }, + "execution_count": 37, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "data[names == 'Amy', 2:]\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can also use a not-equals comparison to get all rows corresponding to people who are not named `Amy`." + ] + }, + { + "cell_type": "code", + "execution_count": 38, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[ 1, 2, 3, 4],\n", + " [ 9, 10, 11, 12],\n", + " [13, 14, 15, 16],\n", + " [17, 18, 19, 20],\n", + " [21, 22, 23, 24]])" + ] + }, + "execution_count": 38, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "data[names != 'Amy']" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Multiple conditions can be combined with `|` (or) and `&` (and). Each condition must be wrapped in parentheses." + ] + }, + { + "cell_type": "code", + "execution_count": 39, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[ 5, 6, 7, 8],\n", + " [ 9, 10, 11, 12],\n", + " [17, 
18, 19, 20],\n", + " [21, 22, 23, 24],\n", + " [25, 26, 27, 28]])" + ] + }, + "execution_count": 39, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "data[(names == 'Amy') | (names == 'Will')]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Setting values conditionally\n", + "\n", + "We can also use Boolean indexes when setting values in an array. Every element for which the Boolean condition is `True` is set to the given value. \n", + "\n", + "Let us try to set all even numbers in our data array to `0`. We can do this in a single line of code." + ] + }, + { + "cell_type": "code", + "execution_count": 40, + "metadata": {}, + "outputs": [], + "source": [ + "data[data % 2 == 0] = 0" + ] + }, + { + "cell_type": "code", + "execution_count": 41, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[ 1, 0, 3, 0],\n", + " [ 5, 0, 7, 0],\n", + " [ 9, 0, 11, 0],\n", + " [13, 0, 15, 0],\n", + " [17, 0, 19, 0],\n", + " [21, 0, 23, 0],\n", + " [25, 0, 27, 0]])" + ] + }, + "execution_count": 41, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "data" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Fancy Indexing\n", + "\n", + "**Fancy indexing** simply means passing an array of indices to access multiple array elements at once. If we know that we want to pick up elements at specific indices, then we can simply specify those indices rather than passing a Boolean array with `True` / `False` values. " + ] + }, + { + "cell_type": "code", + "execution_count": 42, + "metadata": {}, + "outputs": [], + "source": [ + "arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Suppose we want to access three different elements. 
We could do it like this:" + ] + }, + { + "cell_type": "code", + "execution_count": 43, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "[4, 6, 8]" + ] + }, + "execution_count": 43, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "[arr[3], arr[5], arr[7]]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Alternatively, we can pass a single list or array of indices to obtain the same result:" + ] + }, + { + "cell_type": "code", + "execution_count": 44, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([4, 6, 8])" + ] + }, + "execution_count": 44, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "ind = [3, 5, 7]\n", + "arr[ind]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "When using fancy indexing, the shape of the result reflects the shape of the index arrays rather than the shape of the array being indexed:" + ] + }, + { + "cell_type": "code", + "execution_count": 45, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[4, 8],\n", + " [5, 6]])" + ] + }, + "execution_count": 45, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "ind = np.array([[3, 7],\n", + " [4, 5]])\n", + "arr[ind]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Negative indices\n", + "\n", + "We can use a negative index to count from the end of the array. A value of -1 means the last index of the array, -2 means the second-to-last index, and so on. 
" + ] + }, + { + "cell_type": "code", + "execution_count": 46, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([10, 9, 8])" + ] + }, + "execution_count": 46, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "arr[[-1, -2, -3]]\n" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.6" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/NumPy/Lesson 5 - Universal Functions.ipynb b/NumPy/Lesson 5 - Universal Functions.ipynb new file mode 100644 index 0000000..c55d272 --- /dev/null +++ b/NumPy/Lesson 5 - Universal Functions.ipynb @@ -0,0 +1,325 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Universal Functions\n", + "\n", + "NumPy provides some useful out of the box functions to perform operations on matrices. These functions are divided into 2 different segments:\n", + "1. Unary Functions\n", + "2. Binary Functions\n", + "\n", + "A Unary function operates on a single matrix by itself, while a binary function requires 2 matrices. " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Unary Functions\n", + "\n", + "## np.sqrt\n", + "This function simply returns square root of all the elements in an array." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[2., 3., 4.],\n", + " [5., 6., 7.]])" + ] + }, + "execution_count": 1, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "import numpy as np\n", + "a = np.array([[4,9,16],\n", + " [25,36,49]])\n", + "\n", + "np.sqrt(a)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## np.exp\n", + "This function computes the exponential (e raised to the power of the element) of each element in an array." + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[5.45981500e+01, 8.10308393e+03, 8.88611052e+06],\n", + " [7.20048993e+10, 4.31123155e+15, 1.90734657e+21]])" + ] + }, + "execution_count": 2, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "np.exp(a)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## np.log\n", + "This function will compute the natural logarithm (base e) of each element in an array. " + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[1.38629436, 2.19722458, 2.77258872],\n", + " [3.21887582, 3.58351894, 3.8918203 ]])" + ] + }, + "execution_count": 3, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "np.log(a)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Similarly, we can also use **log10, log2, log1p**." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## np.rint\n", + "This function rounds elements to their nearest integer. 
The function `rint` can be read as `Round to Integer`" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": { + "scrolled": true + }, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[1., 2., 3.],\n", + " [3., 4., 4.]])" + ] + }, + "execution_count": 4, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "a = np.array([[1.38629436, 2.19722458, 2.77258872],\n", + " [3.21887582, 3.58351894, 3.8918203 ]])\n", + "np.rint(a)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "![](./images/ufuncs.JPG)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Binary Functions\n", + "\n", + "Let's start with 2 matrices and perform some basic arithmetic operations on them. " + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [], + "source": [ + "import numpy as np\n", + "a = np.array([[1,2,3],\n", + " [4,5,6]])\n", + "\n", + "b = np.array([[2,4,8],\n", + " [16,32,64]])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Addition" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[ 3, 6, 11],\n", + " [20, 37, 70]])" + ] + }, + "execution_count": 6, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "a+b" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Subtraction" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[ -1, -2, -5],\n", + " [-12, -27, -58]])" + ] + }, + "execution_count": 7, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "a-b" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Multiplication" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + 
"array([[ 2, 8, 24],\n", + " [ 64, 160, 384]])" + ] + }, + "execution_count": 8, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "a*b" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Division" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[0.5 , 0.5 , 0.375 ],\n", + " [0.25 , 0.15625, 0.09375]])" + ] + }, + "execution_count": 9, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "a/b" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "There are several binary functions that one can choose from. The table below shows some of the binary function operations available.\n", + "\n", + "> Keep in mind that the matrices have to always be of appropriate size for the binary operation to work. We cannot add a (2,2) matrix to a (3x3) matrix. If you can perform the matrix operation mathematically, then the system can also perform it for you. But it will fail on attempting an invalid matrix operation.\n", + "\n", + "\n", + "![](./images/binary_funcs.JPG)" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.6" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/NumPy/Lesson 6 - Statistical Methods.ipynb b/NumPy/Lesson 6 - Statistical Methods.ipynb new file mode 100644 index 0000000..609ef90 --- /dev/null +++ b/NumPy/Lesson 6 - Statistical Methods.ipynb @@ -0,0 +1,672 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Statistical Methods\n", + "\n", + "NumPy provides several statistical functions. 
We use basic statistics very often in daily life, for example when computing averages and sums. \n", + "\n", + "In Machine Learning, we often have to perform several statistical operations. Even plotting charts requires several statistical functions. Let's take a look at some of the statistical functions provided by NumPy." + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [], + "source": [ + "import numpy as np" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [], + "source": [ + "data_list = np.random.rand(4,4) # produces a random (4,4) 2D matrix" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[0.72686874, 0.19474382, 0.05061276, 0.35455887],\n", + " [0.61115868, 0.55553073, 0.44741632, 0.92858649],\n", + " [0.02789257, 0.69665108, 0.31917997, 0.48274819],\n", + " [0.81314283, 0.11493034, 0.7677079 , 0.53254535]])" + ] + }, + "execution_count": 3, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "data_list" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Sum\n", + "\n", + "Used to fetch the summation of all values within the matrix. " + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "7.624274629565662" + ] + }, + "execution_count": 4, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "data_list.sum()" + ] + }, + { + "attachments": { + "image.png": { + "image/png": "" + } + }, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Mean\n", + "\n", + "The mean is simply the average: we compute the sum of all the numbers and then divide by how many numbers we summed. 
\n", + "\n", + "![image.png](attachment:image.png)\n", + "\n", + "`np.mean(array)` - returns mean of array-list." + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "0.47651716434785385" + ] + }, + "execution_count": 5, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "np.mean(data_list)" + ] + }, + { + "attachments": { + "image.png": { + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAfcAAABlCAYAAABQrKrbAAAgAElEQVR4Ae3dd7RFz1UX8LFXVGIPAooYBQQJJQYMJJQQAkKoUhJKAKXXkAARSMBAiIKREhCQEhSVEqKQAEoCJGAgVqQKliUBBNdyLVH4X9cHZv+yf5M5955z3y3vvrdnrfPOfefMmfI95Tu7zJ7WKhUChUAhUAgUAoVAIVAIFAK3EIHfcgvbVE26DgR+a2vtta+jqdXKQqAQKATuDwLv0lr7h621978/Xa6erkDgt+3IYzD4Z1tr79dae0Fr7Udba797R/46VQgUAoVAIXAmBP5Ya+2rW2v/p7X2/1pr33ymequa24/A72ytfWVr7fcuNPV3tda+prX291trP9Va+5Ei9wWk6nAhUAgUAmdE4O1aa/++kzpi/6bW2p86Y/1V1e1G4Pe01v5ta+0P7mgmyZ60Tmovct8BVJ0qBAqBQuAcCDwpSeu/2lp7yjkqPaCO39Fai+2Ay1ddgqCiDrbja0m/PbV7l7+EPv3t1tpTe8ce16XtZ7XWdqnd15C7IreS+9p2X8t9qHYWAoVAIXArEHhiInYSO6K/jelPtta+obX2dd0fYJfDFvPCx7fWHt47wofA/2+9p2OI5tm9jq9trX30nvy35fRrJWy+sbX2mIWGIf2vaq19eNfQvLATO3xI2l/eWlsa0FDH75PcVRvk/ooVanl5n9Pxdl8fu9DuOlwIFAKFQCGwAYHHD8T+RRuu3ZL1D7TW/kxr7fW6qv91+x5hR/pD/bxzr9N//+k42Vp7g9baz7XWPraTLklylth+X9Jae17vGzJjC/64Tmgk1aWE2N6ntfaRrbUXt9YQ/KkTWzZs9DvwYQ55/VQxYo08sJHXeddKBjMGZp/UWvvrrbU37MfHHZU608tDe37SuqScf9m3KBPx/r60/ZHW2r9rrbln+fh4H4Lc1zjUGUxx2vyo1tp3d9x7k2pXCBQChUAhcAgCPvQ/1D/yiOGHW2u//5CC9lzjY/+tqR51xfbLrbU/3J20Xp6Ox3n7IHjkjph2Seya8lattX/cScP1n97bR7r9hdbal+1pb5w2IPgH8c+J9iTp5y70W9uRvQEHqTZjEr95pktB7kh/X4LDB/TyYhBg8AWbTPaf150rOVjG9qo+UIr/7an4mTEiBbnTBIzEH3nGPZLXR4OqSoVAIVAIFAI3QICUFyRh/543KGvfpQiFxPklqU7kkYkawbysS9uf3Vp7005aUfZacidZknypmHltq1v6c73uv9b/37c7hNyR1NaEDLVZu+J+GFRkrQaSRKTO80Z/VJe+w7a+hdy1zwAHNjGYow5X9iNaazQfkjrdk9j+eJfcYRvH7KOMftkDavki90Ck9oVAIVAInAkBZEDFGmRCas5Ee6pmqPeXer3f1VrLZPiE1tr/7WrxWf1ryd21iAm5sD+HkxgHMlP89DPXO6vLsa3k/l6ttc9vrQXhLpW7dJyq+6Udm59prTFTRHqj1hpHx8+NA8N+C7mrJ7CJtsIJ2VO903rIM6a
tNndq+ZLcRxTr/0KgECgETojA2yRiR/CI7FyJVKpORIu0pL/aSX+X9mALuVP1q+Nv9PKRzL/owVUe0lXJ+wh+K7mzG3/9Doe03pSdu09O98VgR6Jx+OkdxC7PFnL/o70O7ZWQNiKm6fig7lDYTz1ot4/cDRTeqbX2vq21n2it/WTXRjgWg4gHFZj+cS9KLZ8AqZ+FQCFQCByCwOckEkGCnNTOlcynV6ftGa21N+5E/6F7GrCF3NmilR82ZYTuf2TPG36Nav4S5I6kf7G39UU94hspHk670hZy/4vdcS5L1Y/uTohfsaOSfeSOoGkWODAqxybojWct2+RnVRS5z1CpY4VAIVAIbECAN7TQskGw9iFBbyjm4KxstN/f60dctiV1c65kC7l/YGvt29NULJLjl7bW/lFr7Wm50B2/t5I7ZzDS59JUsh1VPehUaDao4ZkukHyYFh6UMf2zhdxdFh7xqYgHbO35WP7NFu+5GW3sOc/s9z6p3TVF7jPk6lghUAgUAhsQ4MD1/IHcxQQ/Z/qUVP/fWVnxFnJX5EhgSGYLMW0l92Oo5bWbFB0DL4OgsR8zuLaS+6yMNceQ8ClSkfspUK0yC4FC4F4hQL06Su7suudMHLeCwDhw7ZNMtW0rud+0P5cid4Ovf9PxYWvfFe41+nguco/6jr0vcj82olVeIVAI3DsEkPu/SuSKZM9J7j7k35lsyxzr8rSvpRuyj9z/RGvtzQ/YluqbkXtI/zQA4/Yx3Ts/poeN59cMYLSFiSIW7nFvlqLN5XbvI3dtOQSbY1/D3j9LRe4zVOpYIVAIFAIbEDDN6T8O5H5OtTybMslUpLiQ3o9hczcNLcrbskcsszQjd3Pz/0MPpiOgTt7+Ux+w5GP5999aoaEwv185fyUNfv7Jiuv2kfs7H4jNFhzX5P1fM6DL5r6ASh0uBAqBQmADAjNyD6/yDcUclBWJIy9R50iT4Vj3yhRsZqngfZL7h7TWvveAbckBbkbuNAzqmW0c9oRQ5fU/O/+XljrWj5sGKAbAI/v/4Vi3RrOxj9xJ4Idgc+xrvmUBg5LcF4Cpw4VAIVAIrEXgUuRuShRizyaAHCXvHfZ0YB+577l88+kZue8q5CYOdTHP3z7SOGUwjs/2+8h9ds1tOlbkfpvuRrWlECgErhKBGbmfWi1vMRbR5wTPyUlI05jXvc+x7q6S+3t0bES4y2nUbLDlL6Ui9yVk6nghUAgUAvcEgRm5I85jpnA8Ew2Nuppqmap6TOZOW8aVvVYe6mPe4rO50YeQu2vMeRe0Zqvp4ZSSuwAysCHt/3xrbTYdkLnAojdhyxaYx3UzM8IpyV19loQVvc4Kbhwyj52uUXK3St+Htda+sK/G535WKgQKgULgYgiQCE/tUCegi7W/s9e3aV3vlnodq9JZaSwITECbH2itPTnli59byd18cY57wqr+s96Wp0RhK/anInfq9h9MMfb1HQafltpkgGOhGCu1BTby6A/CHz3vT0XuiF2kOcF/4CikrFC1ZiYcM12a3E3NtCTwX17RKe+PBZBC4/Sz/R79574s74oijpZF5L8nttZof2YD4qNVVAUVAoXA7Ufg1OTuI0NKZ2O3tvhH9O3v9pXhAiESoOljpFf5DAgsH4rUZlL2FnLXR056T++VCQTzgk7wloRdk05F7qTgLx6wsdLbw1KjtBcuMLHBEJ7Cub57yhc/T0Xu79r9JGKq4tt2IhP9L1aPizbcZH9Jcofdf+39YjoazSO5XwY71r034DIIffu02I9jQu+eK3lGXtjbom7anyL4c6Ff9RQCtxCBU5P7qbq8hdxj4Rgf61jtTlx5H8GPXtnAreRugGLhmFGqXlndjbKditw/o2NGekccBmRWlBMaN69ad6PGX3gqHIndcxEbLcXM9KGPTEzy/VprzSDRs/XD6VqDtHMRrAGwdkS7rfL40JveiLq+ECgErheB+0Du7o7VyTjyRaL29iFEwmvSVnJ/3f7BX1P2sfOcitypqz+
zrzqnzcwFVPNU0msi563t5yUldzEFMknO/B/043Vaa/+jP0NWGAzNBe2GoFAvaa1Z6/5ciWnkxxK5a8MxB1zn6kfVUwgUAkdCALnnjwLC4xx029MWyX3siw8xiRMp7fI6z9dtJfd87bl/n4rcx36QVj0vzxpP3PD/S5K7pjN9fFNrjeloSWq3emFIyd889NczdQpHw6Ga1/j3LVtr2vK1rbXXf42zdaAQKATuFQLXLLmLDmfaHrX7lkRq59w3TsWblQEf5VOxCyRzDSnI/c1625HlsRNPfQOk7ziy6QExav83dpI9drvXlrdvgR6hc4PcRQ28LclgZGlAclvaWO0oBAqBMyDwWmfwlj9FN9gY2dDZFn98g8f2O3Xnuof3j+Aum3g4KQkbS8o3Te8aUpC7dhsAmTZ3zMSOjNThgUj2rc++tm4DKTMZtFt0Pg6WtzW9SSJ3Cy9VKgQKgULgViFwreSOUJgPhK61XyOtWHSF13yoLNlHn7TnbiBKddgesifvbTmNfF8vYUPKPmb6gtYa5zrJfSC5ipdw06TdPPEDb8/mORMNB98BvgVs2PwmRru1wSCVe5gkSO//tB9zfBfWMPIMeaY4u2Xzl76z44vYuHVqoWthRcMkEJSylbWU5FWPaZiP6jMztM11ZkBsSTQtZnaIeyAuBU0aX4wxwdazEgNBONoMoPMAm8ks8sV+Vl6Uz6+Bn4RojgZcS/EF9I/D4z7s4VepELgTCFwruW8FH2H86+4dz7nORkoUiKXSegTMq39xd06kETCv2poAxyD39a04TU429ljcR1wG2ppvG6oSAEn0RFuo5WlHSO82+Wfk7NhPdE2TsmMZX8RonQHXmhuvTI56TB67puHlZplOyW9G22myhHXm6GiQMibPuzUPstOg39/TWhNXwnVr/QUQqrJo0AILv61B8OdTxcozs4CpxcafITC0/5SeN/I5z3cgMLUGwYip/587xIfQhp/r5RnwRDJgzNjDX14DnMDedRn7J8TFtS8ErhWB+0Du+ujljg9Q3ufY9td6D8/V7g9YwPD7ugR2rnacqp7nDaTnOUE+OX1sItF4jgwCkKrNc5Yl8riWA+h/m+Anqt0v9yl0PPOVb858lG0Qui+JI/Er6RrXGnCM5C46o3NI3CwRz/4b9Todc05AqzXk/uyen+8KTQ6JnRklFn8yQCHJS96/F02IWH0GAwZVkS+udy62/9m1OT3bb2il/ks//0Ottce31h7R4wpEoCwOsJG0LWIXRJn2H94HUmYWwF77M/Z5dk2UVftC4GoQuA/kTr33wT30rA9cbI6dwtnsam7+xobyxg7s8t6H9S4kKmaqeKsVBgmM5E7tThX8jikPad11ju9Sh1N7c8QjYUf5IjUit0yob50GGSTpfYlqW/2PTeWO5E7dHPVSoY8JkZHg15D7Z6V6njoUJKqfwY7+mSLo+yIxGcDmU9O1SJXmIved6lzwH9fTQBhQZkxJ4SFl60+W6PlsfFkq38qKkWDPwdQALLDn1wH7rPb3LIcmggmv0g0QyOqTGxRTlx6IgA/aOBXOSLdSIXBfESDVBgGM5B6YvGnKw+dgjc+Ha+Wjbo7ykXAQYJTt//xO5miFkWe2R4IIUdkjuf+FVKdIemNiq3bNPnJn3or5/Uh2tFGzmQvHG/0bIyjKH+RMyp5FiIzBA3PDmJBxlC3fmDjKRphm0QMj/oB8bPsZe4OP0ZzkewgHdSD5WXTMsc76f4KAUbCRH9XQpdLjNkQpu1QbT1lvkfsp0a2yrxGBU5I7PES9C4ISYnlMI7mbU78mIfewJ4/kboAQdf5UdyQl7Xv/IwlpLEjRroFK1mqwi8+EM6GSoy6hlceUCXo8z4FRlL+XTwIj0YzEwEL5s0EPST40FCRzfYw0kru+jAkecb06LslNY9su+v/sRi81yIjJyMko7pIAUi8aoY0P2VK779rxGbmzD9408Tq2IEpthcExn4GIa3/T53PX9RGa2Mf92JK7ejO5f/y
kIcg9L+ZElb8mIfdQO4/kTvXtexuka4/82Ky1xyJOWcqd1acMznJRxj/vCxdxsszbM1IeA4BxsEB1H2Ug0uzhznvfuedMGmBBnLiOdP7ZQ73Rhoj3Ly+NRSTtyNjzbxjT+D3kgb8zURV8QmvNjcwbA77pCIcknqoap7zY+/1JrbVjfJx3tckIyBQidXGsCO/HtR6GHl7emWw8b7Gron6OzfSTez/1FW5G1zYvoo1tJg8uPCTwkDfOR17xo3MAE84UHgQPy31L48MMh2Oo5b2w8SLWvrA41jNw6m+b9993Itq7htzlGQls13ckE8xsbYMxsBQTwJq0i9xdj+jCES36l/cWntkVvIfdPFTWriOYGYTMNvlsvq2jX4v+ZQLOqnuBovDC7D7jm2gvlf6sXseYNNQt/kV2bsQPWS3vPo9pHFjtJfe/mRoVjYs9T0k3ZUuK1ZiijHE/G5FsKX9fXnYTwI31xpSGXdcbGMTD/Ym7MqZz+aaOdcb/vEVj/jLCesWkfZHXnp1MWyTmAR6/HhhTe+5TmpH77MXaiglsfUxqKwyO+QwY6J86ZbXyErmPEeoOJXfCx5hoNbPNfS25cxpbUstHHRzWLGgTdu/8TfRbcKIlgjdgD82AvMIO44JdW3ZWizbY52/6d/bBER7EhzQC44DANfgi2qufprHtqttgJCfkHtyjnKWBVR7A7NWa8DIludtIuQo2OomGriW5aKjRza+n65XzzA4Yb8RjSF5R12zv5n9oX3Yy1i/WhjXkTuKXlzpodGaY1eUYJw5l65sHIXCzVz8VzjhlhOeoecyRl+rdaJGmQTl5RKcON5E9R76Zk8dS2679OPV5/pDAC96VCoH7isAayZ3ndXxbtjjUwZSqOq6dEYzvIgk08uwlmH6j9knueCG0m8wbhESCZ5ai1Wlq2SwZWCHeaBei3KfKn5XjmCBLYT9nHiBQmBao7Jk3v2sEq4m67d2DrSljP5PcR63JXsk9NyC8LAXUiIayheTpADn/+JtTASl1nA8IrEskThjRj33kbt5ljIoEQDgkjU4Vgj0srUhFla9tAkYg7KURabTDSyq//X1JJPe4J/ruhduqSbovWB27n7Qb1ICciPY9m8euu8pbRuCU5D5Kj0uSeyb3tZL7LnKnPfF+M1WOCWlrh/M23/SlFCZM+WbBZfJ1vvdjhL98PhPt53XtKY2CKHKzRDMbAwL1Z3X+mB/Oo8Od9y2r5WcDq5Hc1w6sfqP+IPcvTNI7aZGTwZpk4j2pnaQeN8P+UuSe1z/eR+6hijG/EYiHJgEKct9nI00+CdaZ5nUZ6vp99T2yS+7ux32ZAsGbNL8wP5rMFfvwqvOHI+D58qGk5jT49KEjuYSp6PCS68qbIrDVW969W6uWly+0t75hM4I5VHKnlg+1uQF7DmKDpNTHrj5TeWfz3K7vuMA08b1gxuRPNktInel2l3TPiS++494BvGYa3a5kxbu45qt3ZAxeys7asH9+uv5kkruVqaz5Gw3d1yn9CNsw1QgVf1xrfylyz3aQXQ8Fe01MMXj6jpuy5hQSzn0X2zknUrqIS1s98UlPoaIyXeM+pPBODTytflbptAhQQfpA+jgxUyF5H0r3wMerCP60+O8rPUvuS1o89t54Z3x/1t4zEmWWWJfIPfs0rZUeSe5LNnckp72es3eeAOCZjJC4vNJ3pdCIKs/iQbMZDMELu6Rr9n/T3gJHQtW+gEimukWY3lf1wD1jW9nhcc1LB7MBcs/Yn4zcdZ7dOjqGiPKUgLHB/o/IRcINcpqLa+0vRe7UPNGOXeSepzGM6pJZX3cdM7INElY3m01I514EdnhbhD/cVdZ4jse8MgWDuA/q6QgYEffw0SMg9f9REaCxMjj/9kHaE/M77sGu9+iojanCHkAAORto+S5n6Q7huR80piReUc/kEao27pdvhfnhvsm+zTOid98JdL4vEerV9aR45blOPHZ18SGKQCzyiNgmj29o2MwfaHhrjU8Xwv381Ka
fb62Zksa/yPcyJHfliYqH4JEyid33mHOccwiRELkr6R9HQ/ltBqQEKvzlO0xDaxDB52mfRsM8/yiHeXpf3dpFU8vxznUi+D2tL3pDU2H2lZlY+DR8z/QRriP27nNgz2wBwy9KA23lB/bMyDPsH4RTqOUBgDxCxaGgHC7vQRf1f8zFNpXB6CXUDgHMWnIHNhv1Whv/rB35GNCiDX4vpVClUxeNXoxL1+w6ngcV6ucYEtoBI0BLex6SstOG33c5GTlnezs7303MJXcZq2P1jeo03hcLX0SiNQpnUYRS9yGQOc8eAQruEvdm3DuHPF6wI49r5FPWmEJyHsuN/12Xv6VxPO/VPZstQBOa842/DRqC3F+WnPUQoPc/nKIRew76MvZh/B8x+tZGfbQG+Ix6nYS8JsXSzcqgjV6bONONAl5ohu2zUzSnuGjjbA97g6DZuTgG+5k540HtDXJXmJRtCF7updGBB4tXcwTE36qWZwMxOjRSZJcx5cGUCHaPfYnTj/n4Hj42FA4X4nL7AOUHconcDSZ4xwPK9cdIM8c6Zg4Pm8A0hyYDLg+9tu5yLDm0/Nt0nbgE8fDab3m5blM/rqktBuHxQaTOzR+MGADzFaFarHQ+BAg9/HTY2xGHjeqdTdlKcDYSK4nX98Vxm++5PYmRmlcZM2mVVzmBxHdUfuWLzWHanTppbszgIf0zleU8ztvMAJrxA+JWt+u0J9qkLnWqm0Dle0+owwW0xp43JCxSmynAa6Tm8Y4gZ7xC8PRtJ4mv9R9Tlv4YnGjLVqdSfdFnErd+2PitZF8DdZiOB4cZrnHPmCUIzHF/YJjvz8yUMWLxG8D7kJLcJRXGBzarl/vpB3ZsF/JF8JUtank3IEY1iAvBixoV9QJ29tConMon20WM9GKFHY5xEeRfWUvknr0c2RePlfLASP1UQR7amyQPQowIqU+3PnA3qfuc15IuYsAFO32tdB4EfAy9c6Otkg3evfAdGD9Q52lZ1XKXETj1t2w2qFmL502uVccSf62t/yj5jAi8wCG5G0Vl8qReniWjfGTKfiCtldypJMIBAWnFKjokVOH6tMU2k1JJdmE20EajU6pvZZgjOYYzjAFLb+IDu1AJqeeY0mFWoSubNuIYKabE0XAcKkFRhRl9H2uD/U1fgIxNXtzBYO3QfuYy6/d6BMZ76b0OW+wuL+D1NVTOQqAQOCsCQe5UCZFyMH4EPNrbSL4cCDLx53mJiG1mc6cO51zgPFUgoskJUX93P09ayIFd2F5CPY3YZ3MPqeuz9LckuWctg0HBsRI1Uh4Y6QN1/U0T9Q7MlDfDdV/57KrWIFbGMTdOO8dIVHxUv9pm4EezU+myCIRzLYKvNeYvey+q9kLgIASC3ENyVwivvhyxzlSvnEjEyD2T7xrJPaLB+YiTsmehALMzBruERKoIgnNtHO+nH7TL+ZbIneNFkBwp/lhpxEAdx9AMPCW199APLfuYQc2+zVQYA7Vxc3zcsn32UAzdiyB2Ent+pg4ts667GQK0PKb1GIAz01UqBAqBK0QgyN20iEhjWL88x5ptlDTP2S5PsRiJbZQwETQ7apCq+Yg0AONmikfkEb1HGgOb7ArBx4kirl8id32NPKYcHCNxAGFj5zsQpgN1GMSMmo+t9dGqRHt39X1ruZfOz+kl+sWEESaaS7frPtdP+yVaJYn9WO/Gfcaz+l4IXAyBGblrDCeb+PBSh8eHl+chqX6cJreP3McVfKLsXXtSOOlwnDqwS7rL5L4kNbPnR728Em+aeJYidssPslWOjnXhdHhoPdnkMYt+d2i5l7zOnH/3gHRovudsSs0l23cf647pb+ZJ5/fiGKal+4hn9bkQuCgCS+SOzMPG7SNMpS5xfDK3fbR57yP3HIpQeRabod5f2pgJQiWY59CzO++a+5jJfUlyV3aQOxXkTRK/AVIOW3F4FCPzKN+eB/9NEoyivEMC4dyk7lNdm2cBuKeC1xTBnwrt/eXSrH1XD/SRtW7MVvw
1jhELYn8rKkchUAgcDYElcleBeYhBKlTqCBqxz0LT7iN3znTZ2ezvbeiBaDzRDirv/PEZi8nknv0Icr7c1puouWFHyjEIeuNUgfmO2XP/po512YxwqI8A7/PRBHKT/4+xUhsth1kXcW+Zeo5hx0+3on6uRMDaEJ5ZUjoiN1C1mcFilgZ1faVCoBC4IgR2kXuWQKlPOVT5EAs7O6ZMmPLMCDh/yM1tX/vBEI4wCMAeKS2lTO4k3lkS8CbK0/9DEokfqSP30etfedlpT13+PzTlQdYhDnW0LBELIPp9jP1HHtqh4bo8DU7gnyL4AaAT/xshjtnaX9las1iPTVhQz4nQnnVPTnwTqvhC4NgIBLnzyB4TR7Asgf737kw3C2e4htyzs9wPLAwAtIGK0JJ7yhQMYFTp55V1xjZncs9Ogjlf9NmHi/f41kSyMWXPgGdctz3KIgFFvGH13MSxLuImG0wcYv+Ep2mLsDnmtsv3IXBYsx8dODlsVjoPAjQw+wZ6fEgqFQKFwJUhQJ3s5V6yT+epafItScNryJ262oc7PiYk6FkShQ5xCpofKXuM5+Nx3t5A4BWp/KW2+qCFR/uS6j6Xm39zKHxRr2NfWNkscevzIc5wiNmCB65n1mDeuA3p2BGY+HhETGl9/dLb0Ml70AYDdRoYA+/Z9uSa534PnoLq4p1BQLAVEeEsZxeLxVPBiV1rjntEndNhKvAgQvusEqdWZ8tVVg584+NMVe24j3aeMkdqFshfHgHyx+X8XEMqZucjsUfyEaIydB3P9MfFib6nZfiyfl4em75p7+gMJG8MMnjk70ryklDF+P2CHkhG2YiIaj73LcpByEJ5Cs4SbbEXaY4zIEwyxnHdbE+lHqF697V1dv01HcuOg+6xuNiVCoFCoBAoBFYiwPs8e8JnAiIxf0Vad5aEFmphNvNICEw+K+9YNSqX4bdynLOyl0FDThZAiNW/BDAhmfqwC3OJNHmez7zCkX3MlffxJ8ELtM9mS6pVpyX3clvk46g1EnzYeYXQXbL9O/7CXm4uM35Tu5N2xiRyG1t85Mv7wMW87pl5YyzLXOO4PkcRHPPdhf8NiCI0sT5/X3oO70L/qg+FQCFQCJwUAWo4H072V9Jg7P1GOlSiOai/mOmIPa+ug9x5vCO/uN4ggLSsHJv1aZHte096wxMXmSPX0Az4sJNud3liW1GI/dh1YdM2UNBu6vxv6WRoLj7p3zlTfEZ1tnn6CATZLtmOH9LV/OrSR06A+mdvMMEEMAtfy8RhSb7ICwu/aQt+vA94rFa0JoW5QztvOm1vTX2XzmPAFoMZ+0MdHi/dj7tQ/0wrdRf6VX0oBO4sAjxf2b8RuM3/8dv8YwQ6JurpMbkml8MxKpelHNLpLvss6ZiqGnFRya9NruORz3ucaSDmSFv4hmmBOp762yBiJiGTEmNQsWT71ynzdRMAABKlSURBVBYmDBtcAiP9ckzfZ0l/5Rm3uMZ1BkdrUtjtBchZqm9NOdeSJ097RO45OuK19OHa2+k55x/C1FZe8td+N6v9hcA9RID0jEDGULq3BYrs+GdFvPuQREbL6xq89D50+kR93CJ5GwwLocw8Fg6czGoGpJUKgUKgELgqBCJmPZU3P4Dblqw1b/DBBLDkF3Db2nzT9tC8hD+GvvMNsVpgpW0I0B6Z3vpuKy8zJZQmy1z3F/fnrsh9JXiVrRAoBG4fAkGgbOq3KWUv+UOm0N2mvmxpC9+IHxvs7ks+EVvKvW95Se1f16e5rek7E5JBFBPWRxW5r4Gs8hQChcBtRoAvgQAzs+l1l2x3ePNzxrtPyRTBkdzzIibXggWSfO20zXxWTtmXreSe23IIubPN5/76vdavJNddvwuBQqAQOBoCEUqWdz0HvEsnDn4GG9+zYT78pdt8rPpn5M5B8poSO7VAR8wLsfl/H8GbwSEa3If1ziJHvhaiNZpZ8pgNICDbLZJ7LvoQcreKpGmv0V+zQipOQUa1fhcChcBFEGBzN7W
OM9ElPYQtDGOOvAV67qM6ekbup8DBPV6axTE77hhpeCaNjsf4R9A+GCya2mmKqABLu2Y7iAXBv8C0Tb4GYhpYOdGx57bWhGt2/ONXvh3nJvd379NQ9fX7e1utRVHpMgiYDWQKM18KM1D2DSwv08rrqNXsEfiZGs1calbXIRwR5dBqRTlbnF6vA61b2koS/A/1l+ISTfTAiIL3bZMldS/RnkvUOSP3Y0vupGFxG5g8kK/Nb8QUMQn4YngZJeT9xT2+gRgHEfMgrrV//57XLshdPr9N09z1ElPh/2Bfge2DOjFy8tTGiM/w6H7c7IE1HuzHIHfOdaaArkkGP/pp+5gi9zWQrc6DWAgfOTro0sXugzgbnCENBmMTk2PX4HKpvJsc1xbhzQXiutYk+JnZVDRSBusCtonHIvrolkQjbOAu2qhy3J9fWQjWtqXcyrsBAaOqSyYf5XO/hJfs71j3qckdUVNxx0dvaS84U5AoKUggo6W8jiPlSJnco4w4N9vT1vBU59AWYZyROCe3SAYk6iHB5zL1xyBk3OShlvcRGs/F/1H2uD9ELZ/LEDVSW0tyz6gc9huxx3oZyOCRe4pBRLD/mdbae7XWvj09t2sDZ+2pYvXp8BsyUBW18xoTrSE8x20rufMbGsvw/2xF0WvEqdpcCOxFADGy3eYX4diSO0mZFCQAkiA5uS4fRxoc7chJoKSvTHlFGyShKmOUqLaSuwEdtTYyj4/zGNb4Gb3ur0mNog34rK6+p8IfN2sxkBbG4/5H/D7+sxTkTkrJA4lZ3tmxIvcZKocdY4bJzyfNEol4lmJgKD+VPDVyrEvhmCiZowlpVs4xjiEzpB5tJ616h64xCZYGT9J79Gcrueu3ch41lFPkfo1PRLX5IARm5H5Kb3nEHJEKvbhmTozEriM+lC/vjo7P3GNz20ruARTJ/Zf6ByQv7WswEnbsTMg+8pzsPmKyCTFtvrpQ0kvnl8I8F7nHHbn83qAvCMVexMpZylNnPcMxKHufTvA0T8ceJM/aEcc8v3mdCObO2+CsHO07ZM/8FvfiEHKPOpnbopwi90Cl9ncegRm5n/qjFCF+44UzSs+JhBwS9WfkEwu/DyX3t+8vPVt3HmC8ST/+i/0DSa2+zxZ+DJv7oUFsSnJfeDAOPGwtDRI4UsimmlxcLN3tGSbd58SsdIlAUBz5BOCi5crrkuS2XdNv/jbxjbgJuedyityv6Qmott4IgUuQOzKPl9bedLSceKs7zsluTTqU3EP1Tm2e0+f0+n3cqVVJ4+yZS+pZ1xa5ZwSv/zdfoFg/Y9YbfhvxDJMwb0viQ3NXPPWZNQLjm5A7R9sop8j9tjyp1Y6TI3AJcvcBosqMF45qXGhi6dn9+JYFbA4hd9J4TIOLee7qJ6HHMsdP6hIYM8KoXfjN1r76703InSMcLNhJfZh3DSJeXeOrf5Xk/moszvUrtDvuGxKqdHwEityPj2mVeI8QQCandqibwcmbN8jd/hN7MBm/ea7vmso2lncIuZtyw8NZ8KKHpgKRdNjoqDkRvaA2+9JWcn+b7ngXyzDr9//uKl7Htni+F7nvuzvrz7uP7OcGoPw+lqRgSyPH82tK59pkUKlMazqMDm+0BVT6S3XuqoOGycDUu6Bsz/dSktfccc6ptvA3cR1fgq1Jew3OTcHTpyVTxq5y4e095O8S6RCbu3JguFROSe6Bbu3vPAJezEuEnx0d6wwwfrXbC/OLueYGHELuPoQv6VPhRknZ0sXsqOyu5jCvSVvJ3RSrb2it8ca3IXR7Jgobwl6bitzXIrU736e21p7ftUoxz/rLh0veuwcPylPeRAdERLYXLBCrgYJBo/Jj7rXrTP1CqM9prb2sa2+8j8qn+l+TxGTwvH5H9wxXPq3UbJrxW3RNg2BNMTgRTIyULPbDj6zwL4k2IVLT/WjhlKE8bTetNU9VjfyzvQG0wRFfAe1W1lf1GTFbyD3K4Ujo3imHbw+P+VxOkfvsLtSxO4kAcr+E5A7
MkJDjI+Ol9BHcmg4hd3UYRIzEHnWHhBX/79sj968/INjGvnLXnC9yX4PS/jyCWcWzGPtRKn/6JE/kjf3DJlWZ7hnn855TJ1JDtt4HA76f7nl/vbX2dpOyxkOWDc5l+u1dGr3lEaDlnX+ytfasXvZ7tNa+sGuwXOdbsCbuBykdIbuGA5/IfGaMGKQwY9GIGbB6L2aJdG8wa0CvDGRs6isiFi2Uqc7UUudsZpTMkkH6UjlmD2hLLufNZ4XUsULgLiJwSXK3PGq8vPZG7IekQ8n9kLp2XeNDHL4Du/Id+1yR+3EQRYZIK3tXj+TuWUPUecocad11js+IPVpHBS7+v3DX8dwjIBqkTMQkcSQsD0l6X0Jw6qdlinJHcifFk2qdn0nVyNm5NeSurph2xzdmNKG9ZY8sp7xvmTTewDn8WgwC1O2YZLCtrePAf6bJMkDI5YiDEU6QygltSWBiX5J7B7p2dx+BS5K7j2Ge8+43W+DWdFvIfWu7j5W/yP1YSP5mOTyzgxBGco+ass2dtLk2WI182VGM1D5qq+J51gaBadau9cB2HurxkdwNOqJPs6h7bOXaso/cSeJBqAg+D0oCG/s8cH9iPtH9a6IttAazZMBg3ZHINyP3rLEQXGqWtJeZIMopcp+hVMfuJAKXInfOOxGkJl48ex7qW1N8DE15iWAiW8u45vxF7se9e4Gn5/HY5E6izFOzqKLH5HlGzvFe8MxfkwyMI8b9SO7ZLCCuw9t2CTmkboMOwaKo1ZdU6dogqFO0CzZLZi0+NeJEyKtN/FikfNy5pZgaJHDlR10juQsIlQWDpQEQjQB/hCiHz0GlQuBeIMDufG6bO5seOx2phJd8fkkd30rQa8ndYjPCwF7jthTdzkMaZLTFw/5ePNwHdjLwRAjHJndNyuT+cZM2ep7zO7nWsQ65m06p3SO50w7kAYM8JHVaBNNP32rHKoy5iZw9gyiXovfJL4hPtEV+QX8k4WDjeueD9PvpB3b7yD1rBvRjnHkQBRW5BxK1v3cIIPdzesuTFEKtF/PLBZGJF54N7hEb78JachcUJ+q5tv1b78AkyKjIfQdIG04Fnp6RU5B7liTZicc0atOOQe7q4OnPSW/27BtoI+slSdz1MSiP6w3Ml9JI7rF0clalG2wsTdkbyX0MYmOJ5miH79do2oh2jeReDnWBTO3vPAIzyf0NT9jrWDiGt26kx6cX1QtrudctaS25cwTiCHWNWyxFO8MlyKjIfYbO9mMR69+zuETuOUIdsl5rc9eafeQ+vpPs+2uS+d1Lavm4nnqeZ3peZCZI0l7blgielB3ly2vhnKU0knvkzQNs5G463SyN5D4+2zmEtXIMiGZpJPdSy89QqmN3EoHxQ+KlPRW5x4s9krcXnP09PjI/OwSW2Qf8WnLfV861ni9yP+6dCzw9j0vk/mbped3iUKelmdxH0nJ+fCe32NxDFY7wRmc35C8xe3nHP6RPVzP1LN49mrOluPSu+76UVz+W0jgQCPMDaT/q0sbZXHxl7iP3rAEgufPhmaWR3EcNwOyaOlYI3AkExg+JF+8U5G4xDmWb+zpLcT5efNNj1qZLkzupDY5LEs/afhyaL8hoRhSHlnmfrws8PYvHJnfPSLa5z9Ty4zu5VnLfZXPncCZozgdObixiDI2aPgvms5S8v/GO6sdS4igX+ezfqGfMNne28nEAEuWN5D6SsmmnUb5BwpJ6fyT3pfnyUW/tC4E7g4DRuJcjXhT7JQ/WQzv95B7Uwqh/KTTlGLGOY92Sqm1sx6XInf8AvwE+A+YqczayLvxSH8d2H+v/IKMi9+MgGnh6F45N7gaCeR69usZ0CnKPVexML5slBI9s9fmzZxn6sXdNAW8EsVmSmEn/8U3xfQmbeH7PmQaWHEVHch+fbdfFcs20DSLRzdJI7jO8Z9fVsULg6hHgJJPn3XohBds4RvLiW1FNFCofjqUPQdSV1ZXaIf76mnQJckfsyFxAEuYGBB+RxYTxPCfBBxm
NH8A12FWe10RgzTz3rJY3AIgpZa9Z2oOPkNwzuc/umXcye8uvldzFZg+bOELNUnEsdIMISc9jEnwpVPofPJ4c/hcRzvuJnN9hOOdffYzwvPK8y5DnExLxf8FwLv7l/KetMUAYJXf5sop/aWno9xn8C0pyD4Rrf+cREN0qf2y8TG9wg16bUsMe5oMXkoAykaAPmRH5mKgMvXQifcXLbE/Sp7b8gB1TZpR1CXLXR74BWWLwwf+F3gdkfy41fZH7+ERt/9+9ekIPH/y89ByK6ua5fd/uNPeYnieWBfacIlTPrw0pze47CdJ9Ym4y+IvnXF2Iy3UIVh7kF8+RfOpyXN2zxFFUGZ+Wyn1Vfw9d553LDoCeW//zdTGnnTrf82wQblCxbxBOsxC2dwNa0rwyHNcHg90IK2ua3ZhyIBz5ntbfYflMaeNgGwPlwOl7O3bm2Qe++hUzb5TDru9bkMsZHQeV45vyjqmcfkntCoG7hQAJcyT3QyX3HCgjXspxP0oFzAJWgRvzjf9/6w6P5HOTO8xiuViDGH2IRKUZbffBO0fyAVfnTAo8R/13oQ7ENBJK3Ef7n+qS8DgAzXkin7LGxOY85s3/K5+tOx8bf6t7NjjeF+/eexnkbiVEUeyotKnVY8EYdSF2g/M1icmM/T3I04DBYD5iVvh/fNdzuQY7HGvjepoGNvxY0Eb42RdP8IATzUYk5fCcj3IMxpQT4XGFv33RpBxY5vc2yqt9IXCnEPCS5Q+JqWKHJKT36V3yMA/V3FajaXvSiNjXs+hXRupUbK6RjxSRf7suHHJm7To3uZuWJsoXzHxUciAOEfYCS5LgOVKR+81RplbnL+E5zM9t/G9RFMRKzesZddzm+Y7f9sqYqeg9o09N+fN7oTx+KVTn3h/1R9nx7sij7pBac4+Fk/W+KJNmINpkr05EbOU5cerZqoWptZobqZdvC29/Dq1LTmm5rvG3d9eAFhkjTCp7bc3LKI/X5P9J/SR9vgC0Aa73/UHaTHpWutM+JjukDacZvtrhO6ZPyhrL0bZcjoBWs3Jy2+p3IXD1CHiZgpDsvWzXlM5N7rB5bGvtO/uHNGMljnZgWeSekanfl0bA3POcDLSZ5Y6VskS9tUxkPrbF/7YtMQRm5ZDQCR5bytna/spfCNxKBIyyQx2GmAS5uMmLeu5OXoLc9XGm1jMwgqG42tmh6ZSYlOR+SnSr7EKgECgErhQBqr7RY35rCNhLdv1S5D72meozHKGYFs6VitzPhXTVUwgUAoXAlSHA+StL77uiT922rt0Gcqfyi6l8zzgzQEXuZwa8qisECoFC4JoQ4AgT9mJ7U9CuId0Gco8Ie+cmdvenyP0antJqYyFQCBQCF0KAev5LEsGzG7/lhdqypdogd9NeeByLsBdxtLeUc2je9+w2dh7Okcwhnk1bivM33XM00k/bZ/Z7VlPhbopqXV8IFAKFwB1G4LmJ4M39nUW0uk3dR+7iZmetg3bvC8ZxjD5YV9qc4ewZHxHGLMxxqjSbEz2LU36q+qvcQqAQKAQKgStEgNd3BIYgwVPRz+ao34auaZfoYE9JG1W1aTGnTMLjCtrxfj06l/nEBhqk6V/bETf7GG16eI/sFX0mvYtUVqkQKAQKgUKgENiJgMAQllMMifg5O3Pfr5MP61HLkLuANrG9srUmChhy5z1fqRAoBAqBQqAQuHUIUG1zFhMectdqUbeu4SdsEOncKnAx6Jntkb1IdpUKgUKgECgECoFbi4AlG2+rWv7coFH3P7PH1hZfe7Z97kKo0HO3teorBAqBQqAQKAQKgUKgECgECoF9CPx/WhvFCQvhE9MAAAAASUVORK5CYII=" + } + }, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Median\n", + "\n", + "Median is that data point of given data, which have half of 
values lower than it and other half higher than it.\n", + "\n", + "![image.png](attachment:image.png)\n", + "\n", + "`np.median(array)` - returns median/middle value of array-list." + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "0.5076467722525979" + ] + }, + "execution_count": 6, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "np.median(data_list)" + ] + }, + { + "attachments": { + "image.png": { + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAQwAAABmCAYAAAA+lK8bAAAVUklEQVR4Ae2dB6w9RRXGP3vvXeyiIhZEsbeogL1AxIJdMbEXDEoEe8EKGoMdJXaxK1HsXVGwEoKioAYbMSRiA0XU/PzPgcO+2T577957z0n27X27s7Mz386cmTltpKBAIBAIBAKBQCAQCAQCgUAgEAgEAoENQ2BnSU+NIzBYsTaw14b101lU94aSTpP03zgCgxVrAz+YRQ/asELApWEWh0t6YRyBwQq1gX02rK/OoroHSfqLpOvOojRRiEAgEJgtAheQ9FFJx0q63GxLGQULBAKBWSBwhcQsPikJ5hEUCAQCgUAtAteX9FdJr65NETcCgUAgEEgI3DMJPB8eiAQCgUAg0IbAgZLOkrRDW8K4HwgEAoHAByX9TNJVA4pAIBAIBJoQuLSkb0s6StLFmhLGvUBgIgTQzG0n6YqSLjHROyLbQghcS9IfJb21UH6RTSDQFYELSnqxJCw1MRr8naQvS7pz1wwi3eIRuG36WPiQBAUCi0IA9f2HEoO4n6S7SHptaot/lvSIRRUk3tMPgWemj3TLfo9F6kBgFAJ7Sjpd0m6VXN6d2iNWxztW7sW/M0DgnWkqeM0ZlCWKsDkIvMPNJm7uqn3TdJ0lyuPc9fg5AwQuLulLkr4lCeFnUCCwKAQOSYwBg0GYhNGVHcN4u12M8zwQuJqkkyShVl0nOt86VWZN63J+SU+QdIdK/dCWWIiFF1Tuxb9LRuBmks6WhOFWF0I4hcxjX0nPTmd+PycdXLPf/vysmus+TfX3fpIu0qVQlTSo5Xh2+8r1+HdaBLAWLuFm/vTEMEKGMe33GpT7o9PHuU+Hpy+T7DWM+y/iDIPqQ5eU9HlJv5R0lT4PRtrRCOBWQCcf44902SRPo209Y3SJIoPiCLCOZA2J81kXYhShURizoGPeWtL10kEsDfvN+Topb0b7G6RRH/Pzm0i6vaSnSDpA0qck/cHla/mjdutKMLTPpgZ3i64PRbqiCLw+fUNmoX2JZeT70/OxFOmL3gLSX0jSZyT9UBLu7V2JaFzWoTmXcImnLMhTaGiUx/L/vSQMy9rowpI+np4Le5I2tKa7z3LwC0ld2teRkYGD7951eTxdLSLnLAKY4f40je5Y3HWli0r6ouvUfGQ+dimiXEe4/LswgCen9Ji3o/kJWh4CqEiZtZ4iqattz4MSk3mgKzYzV2ahQTNBgGXImcm6rm+REJb+3HVqDHCwGC1JNr1FJgGTqiPUcD9KZblvXaK4vlAEWEoykGBr0UbIqTAJv79LiAaFCHBvcNfi55IReED6qI8cWA5Md23pwJmlBLODUoTp8HuTzKRJ42FTWRhLOC6VQn9cPqhKaRO/TTKtutzulWRXj5J0DXcg48J7+rF1D8b1xSOALIIYGDce8WoMazzTKG3PcTtJf5OEcDZHuOOb4xLquKB5IIA6HF
kGbYOZYo7Y1uL4SvvxbYnfd809GNeWgwCd+8SRMTBQg2El6j/0YwpXZ1dJd5PENLVKzI54N2vmpllI9bn4f3oE0HTwbVi6ItD2hEakOtj4NmS/r+4fit/LQ8BsKhgFxsbAQLDlVa2sRxel1vxAapQwrYh2vrz2lHvzndxAUrXmJD2yJ6+Cr/6+tqSw2M0hu4RrqCpPTVy+xOvNOs9GBjrwpUpk3JDH5d1yBFlHH7pj0uzwHA5OPtIYzA7jI+KD3LtPpmuQFovc19XM5qx6LCXozG2EMyNqcdpEqErb0Jr5fRNKYThVghgJWOIYw+Bct3Yt8T7yYERiKcK7uhr6IEg17QtBg36RnueMkRlCNjQ+P0n3mDk9sVSBZ56PhTkAT9TUOTL1NYygbbnAzBVBNPlhmNdHdZ97d1xbIgKMJHzI3FRxaLEwxUYmQr4cdLaHDM2sw3PMEuxddQ3cZ4MMBEtCGAI+L4yAGKyZdSFMg3sHp+u23Bkr5/FlmOtvDN8wwDM8c8GgLZQjaWAYaDXayNSraNCu1JY47s8XgcOSOquLFWWfWhAMxcszEHjhfTgFPc018C6aHpZNlK0qlMUwzDqKqWZNYGedo8sUfIo6LipPvpEtH2CcyBeqZHvXgAmGe022MfasMWPaQcRbMVRW7IytAnETCfw7RQyMquk4JtuYfpcmP4VuYxjIKJgp5KbGb3YMgzyZiSC/MCbyvobyo/ZliZPT4JSu75T5YfBm9f10ze535qhIOuQcXYh4nZZvV3+lLvlGmgUiwNqTGBh9HLv6FG9q03ErC43WGiOObU10I0nHZfT6mJFjTm75GOOBwcE0XtMgqUfmwRKGZ6uzlqayVO9RNso/9sgtI6rvqvvfYmlSl+fXJLIIWaTp6kVsS1+eiXB7NcDO/TJm3f/pISgcUh8iKFVNx+kYJek9rqN32XE+t4ZmSWZTcczL+1iqoqU5RhKyDmQAQ+glrg50qjEHAmAsJ/uSF07y/pz/BvY230jlA6+uSzQTkpKvD8PXt4yRfokIMBryAbuOEkOLireidYC3Dc2k4Tk/3W2bYdRlg/+ClXHIjIuO1GUtX/d+ND0+EBHBiIYe5DNkaUTnN6ZZJ+AFX9NIYbtD3JEu5JeNO3V5INLMDwEces7oEQNjaA0enKbsCBLHGoflyuAb49DpOEsOYxgIUZdByzZOYuAwDNCU5JiODTKk68P8iaJmeceSZBmta+Q7mTofmbw7+8TA6PtajJ9wHkLinlsK9M0vlx4bEmuMJnvIpbNrCHhRBVoHZZQ0fwfyyZmWY3peXe5gT8DanDV9aQ9dK+siz14WVGfP4m1szFkRA669WgqKwNi+0dBZYMsr4vaUCNB5LQbGFJoLyo5KDocwZBhTrlsJK2iNsW15haAX4eYJktDiQDACUwFjJ+CtPblvO9o/flvy///F8IugQ19L8gs23FnlXboQ+ppxFVjm7GaQ1Vj4APBC28EyDGb74xaTfGM0U6rX3eeJn6URYBRlOYJUfApi6UFDomG1jT5j309dbF3N8qSOmGJ7CT/MgdmVOa7RUT6RCbzD6MgMyTMSVIswQ2YnTM151jOUujLM9bqXX1AXmGSVWFpyjwOTf+Q2MA2+cdP2mjBXcOW5qVT41bLG/4URIKIRH5D4A1MQmyKRf5coWWPf7z1lmxouPi3fTOWibEy7GVm9ly0+JTRwCAZj5uPedJ5n0BRgAIbzHrMV8sPJalXJyy+oyxsrFbmVqyf3YaLg87zErJuWZD6wEQZcQSuIAJoFVKp+45hS1TDryKlmL7ly2syBWQ0duo5MuAnjYC8MLBUZIekgnFlagM2TUtAeOserMpkRSg4ZCO72pPn+ApzsMsUodsnLL6gPAW8QiuOQ98oUCYuNrgiuzP1jk/yG8HswD5MH5QqEXIlnOCICeA6hFbhGrExUZ9X4BGOLvnfSiOQsKcfkzVr5TcmYyGYAPj+zPkQt2GR6jKDyw24Jc7Sk3VNGyEL4n0A9LH
G+3rIRMJ3E/ExMHuLLtCq/WT762KwwAG87Q0R4vExZfuEnBOOwJSDalDbbEzPdhyGXtsFZFYxXupxMo7+TZAxNo3HfSu6SmMUUMSnMZLkuVBuGV8TfYBRr21uFjs6aHek+WHjif2Qi2EW0aY/oPIzEdASEqXScKdTGvnxT/EZj5O0vqBf1AR86eNWfhDaD7IJ7XdoPyzy+C0LVLumnqGPkOQIBOgsu3UzjSxHTTkYlDuIwliTyRjVLo8YMu45M3jDE8Kouz6brTK/pCPhcoGnCuxVMm6bnTfkt654xY+rCzDA3gxtaNr6XMSMfCXxofvHcEhDA5JfGUSr2JXYNrG0ZaZuEX0OqyuiGNsMac5MKmFkBjZOZRmnv22rZmU2YsxrLIVSOvHsVVavef2T/akVH/m95IzOaOpDSyKLG43UIWFQs4kiMJSTljLA4XxE9vBQhs8BcGzkCzIKji2OXqTgRbk5JzCJsRoO6EQxKztimLLvPmyC9Xn6B120pYmljzH7TIpaVwnAW+RB0ldGQpclYshFkzD6alAEGwWwC+QGNy1tewiy6lpeRntGM2Q4b4ExJGL8hPMYe40UruBQBG+JfmICTzl3SGtfaRqhSp2yFE+dNDAyETwgmqwK/vq82vwKsHTGYsoMZDOt7DiTk/I89Br+xZyAdDlJ0MiTyCMWIlUGDpaPbjMKf+2hdbHMlGMcUcT48Tmhdpn6Hf1/p36jVDWcGklJk2jK+wdh2VqpMkc8ABJB+o04dKxgk5L/FgLAGN+UZOUEfukdS/cXo1owaAk6WcTCLUip2GDZCatSxyJVmQ0xjUe8Q6MNGKhoIBzES7LedS0zB51J54gvAxfuSGdGMid4M7ocm/wF8CHIHPge560Oufa9nfArDhJgQjHA5M2dLE+dtCJTS7DDjom2xVPOm9EvH+TbOV6HryNZkNrz0CvUoANJmOhFmuX0JU3DwYnvEMYSWgHIwHWfKaQdm2k0H6fx9e86fydMfY7Y9bLOlGINBPJtHAE1W1zgZ+RwKX2VkrVvr5pgHVml0MAxRuhI6Y9bWJY89ur68JZ1ZNVK2voQJMHggXAwKBNYeAfaGsLUz016YB0FTMBZiKmT32E+CWAwsQ9DJ9xFMMXqa81SOAXW9hokxgjw7MCMeawnISIzHH2X4as9Nh6kXwkOw6ROCbu0bVVRwPRFATWYzCyT9VSENa2vTw9Ohxuy1QceG2Qw5YFD+wLeBY8zU2r6ozS6o38k914mozJAhEMehyQDK3hXnQGBlEaADm089U2r28cyR31uiy6Y2uTzmeo09Qy0AKwzj7DS76lpeIh2d2bADelM+MGs8OuMIDObUBvCkzZIJ6+goTRZ15r5MurboS9kXzfgidg//SJ2e+nE8rEd5EXTyTBeLyWq2fmZj747zNjwDh+XhcFq1odr/MAn7MHUSfnTKmOiSjlkIKtd1ITQL301xOM0yj3rWcthMxYnzwKwEPXlfYmn20DgCg5m1AeKSbCFcYs1UGMcijI9yhN6XUGp0JASiY1RqMBuCqJQ8+mhpqvUj/BvLCWwKvO3/x6oJa/5Hz04cCIxpqrKfmkficiCwmgjAIIjwAyPApLnO680Hgx1jc4G8BEtI3lfyYFvCIfEAqC/xKwhWgrDycFcuZh1dNEDIP9CuwGyGlGE1W06UeiMRQJVocRexPcCarEp43dkshNnFWLUhz+9c+KgGIqnWoen//SRhrAZhsGWMjAAuXXbPJi4BMTBQFwcFAmuNAC7UzBjoJKgFYSBVokNxH9nFkG3iqvnN+f9qwFYCtLYRAWrBZxFBedvKEvcDgckRQI0KM8Awi633mJqzLmeqTdBZgrdio0HQ0nUnZCFoS2yWsU+HCuMdSvrcfpkdHo8kgcDqIUCgEoSeMA2EfTicsfygI7Bk6TLSrl6tt5YY4S6bEBnDOGRrki1X3pWWJFNHotry4g2+wIDG7NiOLqHvSBtUEAGYAt6nyCs4sOzERgO146YQQksL9w
7TaIvDiCMQAleEnrnl3Kbgtqh6EskMoTm7feEOYAfX6kwCKBvfibQMhN7bmt/I7mjrIbAe8BXh3ABXwtR6wOtn8QjxC2yGgQaJpVkdIRQ9qUAMjLr84/p5Eaju8WHfiTNaqjotn23J6NP738ymS0bEOm+p47+1RsBkEjSoP7WE29spGWzVbbC71kAtoXIMZgSOQdZkwnrf8ZviW7LPK0tvopjxDIFoiFJGrM2xmr8lQBGvnAsCRJTyjbApWjc7fJGWcPJBi0OAJQbL5l9JOtV9L5YmzJKbiGUJ32y3pkRxLxDoigCWqGiGjGlgkVpHOAthJTqrcGl1hV2j67YnB7IMHy4BbV6T2wKGeEQKw2qZyNtBgcBoBFjLmvUrTIN9MHOEyz/+NWNN5XN5x7VmBMzZj6DHLCmMuXMmonYdwdhhKny3LpqVunzieiBwDgJYtqIdsUZIjAsC5FSJEQoVbJ+o29U84v9hCNi+JQR3wt3A+wARVKlu9sD+LnzXA4a9Np4KBPIIsDmxMQwaYE4ohuDtDElTb+yTL+HmXkV+QZAnvou5BPgQDXy3uo2fTMtSclOhzf0SUfNzECBAkDEM4gIgYa8SEvemxllNH/+XQcDLL8wYi2UkDMS+GQLRqmmACUqRXxijKVOiyGXjEUAzYo2P810yiLxc0lmS2F4gaHEIePmFf6sP8MQ3q4aRNEaD/MIYjX8+fgcCgxEwD1RjGtXNlRGYfSTp8me1R8TgGq/Og8YYdqwUmVkg2z7aN0OD4on9ZLmHoDQoECiKAPuIHu0aH7uJe+I+2yscVSBauc83fjcjYPIL9lzNWWb6WCv4RsH4jQ5O37PKaOx+nAOBwQjgsYsTno1WR0pCe2KEeg4r0LfYhTgvBAFbVjB7yBlo4TFs34zzQalUxmjCBHwhn2kzX8Ku6db4iBXiwxLePd1rMuraTNSmrXWd/MLeaozBvhuCUL4byxVCOHSxBLW84hwI9EKA+B/W8P5eseZ8brq3S68cI/FYBEx+0SRoZid7+26c2WnP7C/Y9T4oEJgEAcII+oaHj4kRbtG/6RjCz56J8zgEbPbg7S9yOfqA1Xw/VKyHpW/ZxGhyecW1QKAzAgTEOcUxDbxYIfwRvpK8HuvcqVPSOBVEoE1+4V9lMxEYBqbgJyQT/pwBnn8ufgcCgxEgcJAFSKbhEWQFYmvGX6fAK+lSnBaAgMVbZcnRRuxEB6PwM0TkF0GBwGQIIIX3arrPScLhjK0NaYj7T/bmyDiHgAU26mLWjWGWbbxlTKNqS5N7R1wLBEYh8Ao3Sh2fdP9oRmiEu4/KOR7ugwBLCWQX7EzntVVNeRBMx5gF5x2aEse9QKAEAnu7RoemZHtJhyYVXc6/pMQ7I49tCDCbY1uLA1MMC+v8LA3Zw7ZNHuH33MH+oi194B4IjEaAEHz/dkxjjxQk+BhJWHsGTYeAD4xjzMKfCVjdFtOCvWJ45ojpihk5BwLnIrCdpJMdw3iZJJYmWIG2NdZzc4lfQxDYNfl9YDuB/wfnfdOZ7R+qzmW5d2Cd+9IUAzR3P64FAkURQPfPNgI2ssEo/iUJT9WgQCAQCAS2IMC+FcYwzC5jzy2p4kIgEAgEAknoZgyDM1G20PMHBQKBQCCwBQF8ETzDOK7GtXrLg3EhEAgENg8BZhP/dEwDgyDc34MCgUAgENiCAPEf8UWwWUaXDZq3ZBIXAoFAYDMQIIw9Ho/GMDAaCgoEAoFAoBYBc48+WxJ7YQQFAoFAIFCLAIZDzDBOlBRBf2thihuBQCAAAlgdwjDwWGWJEhQIBAKBQC0CBP09XRJxPnPBZ2sfjBuBQCCwmQgguwh16mZ++6h1IBAIBAKBQB8E/ge+f+4lMFEelAAAAABJRU5ErkJggg==" + } + }, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Standard Deviation\n", + "\n", + 
"Standard deviation indicates how spread out the data is.\n", + "\n", + "![image.png](attachment:image.png)\n", + "\n", + "A low standard deviation means that most of the data points lie close to the mean of the data. A high standard deviation means that the data points are spread out over a wider range around the mean.
\n", + "\n", + "`np.std(array)` - returns the standard deviation." + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "0.2709284253978271" + ] + }, + "execution_count": 7, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "np.std(data_list)" + ] + }, + { + "attachments": { + "image.png": { + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAT4AAACfCAYAAABgD7XPAAASCUlEQVR4Ae2dLe8sNRSH7zdAkitRBIFEIElQWCwGicIg+ADkahQSicDhCB6FxqAxOEJCSDBLnr0cbv/DvLQ709lt+zSZ7Nt0pn1O59fT087ss4tJAhKQwGAEng1WX6srAQlI4KLw2QgkIIHhCCh8w5ncCktAAgqfbUACEhiOgMI3nMmtsAQkoPDZBiQggeEIKHzDmdwKS0ACCp9tQAISGI6Awjecya2wBCSg8NkGJCCB4QgofMOZ3ApLQAIKn21AAhIYjoDCN5zJrbAEJKDw2QYkIIHhCCh8w5ncCktAAgqfbUACEhiOgMI3nMmtsAQkoPDZBiQggeEIKHzDmdwKS0ACCp9tQAISGI6Awjecya2wBCSg8NkGJCCB4QgofMOZ3ApLQAIKn21AAhIYjoDCN5zJrbAEJKDw2QYkIIHhCCh8w5ncCktAAgqfbUACEhiOgMI3nMmtsAQkoPDZBiQggeEIKHzDmdwKS0ACCp9tQAISGI6Awjecya2wBCSg8NkGJCCB4QgofMOZ3ApLQAIKn21AAhIYjoDCN5zJrbAEJKDw2QYkIIHhCCh8w5ncCktAAgqfbUACEhiOgMI3nMmtsATWCfz111+X33777fL777+v79jwrwpfw8az6BI4ksCvv/56+eyzzy4ffPDBdXv77bevr99+++0FMewpKXw9WdO6SOBGAj///PMFofvhhx+unh5ChxB+8cUXl2fPnl0+/PDDrjxAhe/GhnLvbDRKGuuZ2y+//LKr2n/88ceFzVSPAHz//PPPohMgcu+8885lyb6ffPLJVfx47SUpfI1a8qOPPro2RnrjMzdiP7ckhBqPYuniuuWY5vk/ATrCd9999xqj+/+v89/89NNP1zb0+eefX/7+++//7cQxo41hxx6SwteoFX/88cf/GiON8ptvvrk2dhrmrRsXAMdheMPQJhp7+srvpYnycAyGUab6BL7//vvLa6+9li1+33333X+2nuuY6OyiDSCCPSSFr2Erfvnll/81SLypo2fhaPBff/311VOLhk/guyQx7Hr99dcvX331VUk2991JAN5vvvnmrAc3PTTDY7w92tNcio6LNqDwzRHyu1MJMCxJPbOPP/64yvm5MFKRnfMKlk5Mmd57772ln/2+IgGGvJ9++unuM9D5IXocb24ovPsEdziAHt8doB95SnpjhjXhkdFIayWGUJxnyTOYnpehLfuXCOX0GH6+nUDE5giL3JoYRbzxxhtXO+45zq3nr5VP4atF9sTjpjEahIZYXa1EjI+h61bPz+8Ici0vtFb9ejsu/Bny3ppiRpe1fD0lha8TaxKjCa+PIUnNZSPvv//+Zav3RyApTy8xoVabSczY0jmWpljD1+OklMJX2hoedH+EDsEL8WMFfq3ExbQ1WYGXscfTqFX2EY+LHVinV5LoSMnXa5hC4StpDQ++b8R0Qvxu6eWPqGJ4GbmxwCPO6TGWCbx48aLI+6bTnLtTA7veuo5zuXT3+UXhuw/3ameNISbiR4ztHj12DJG4UEz3JxBrPnM6IkRvabRAvLCX
0IXCd/92eXgJIiCN+LHu7uwbzN96662rh7E1ARIVp3wINsMrZqVvubiY3e4hUQ8mErZuO2O2NZcvYRDaAqGQtYTgIW6UgY1OMzbEk+FyzdjxWtmO/k3hO5roAxyP4QjxmRjyMtQ5K8UqfxZU5yREj30RvFj+QrnxGnMTQsEtfD0kxIX6r4UpIqTBcDQ3RXuYW+SOgCJ6nDddGhXtJ16Z1OolKXy9WHJSjxjeRKM9a2Yuzpt7QzueXjoES4fqrBvcSpwPT2bLQ9o6ziP8jncV9lrzeiOUUCJEeHIcey78EB3O8+fPL0sbgohH3ktS+Hqx5Ew9EJS4kFiEekZgGu+Lc+Z6mXh86ZCN96wT5BhbQzPEAU/mjHrN4D38q3Q9ZspkeqLw3tIOY7rP9HOI5VwHyLmww9pGx7JWpun5Hv2zwvfoFtpRPhpqekvbGcPBENs9d5CQNwR7yfMhBoV3co/Jmx0mWc3K7WXUe81ODFWDDd5ubjrCLrnnamE/ha8FK+0oIwKRxm221t/tONU1a3gWDFlvTenFPRfrw8PDg50btt16zkfIF7eGrXUaMSxF/PDQclN0JrmeeO5xW91P4WvVcgXljntsw1OoKRgRJN97i1M8b5Bhb5q42JkAmBuypfttvYcJ3vCRW4kHNi0fHVTYZ8nLJU90LKUPfojYaU9xuinDks8KXwmthveNC4aLi4usVgrhW5uVzDl3KtYh1AzdWZ6zx5uMcyOcTMAwvDxi41hRzjhHyWvN+B7lUPieWqNp4WPIQy9LI6bR9bLG6KmJjvkUMZ49XklOSUJg93p8BNPDA4rhGTOTJQH9nPI+yj4R36OOS4n2HkxKRdah7lOqTQofwx1cdtx9Xtlipov3CuBTI4f3tBY7eprj9k8hsEd4ZbEEg3V+iB/i0GvKie+FHRG/0hlWYrvkqx3jbcU+zQlfzFRiwNT4vA9vA0FU/F42QeJFNHiGoGekGFId4ZnF0hjKv+YJnVGvmufIje/RqcMifQo2nnFOvDOujb0hiJoczjx2c8KH4dZWrLOok8ZxxIV3piFqnIvZUW4foyMomQHcUxYuQvgf4Z2ls7tHeJB76lUzbxrfo85LKbzCtG2TF1HbShF7LR0ibx231d+bE764D3XpwopYBotfz7rYH9X4MTNaczJjWvfwXrYWH0/zzX0mHomIspXcpTB3rEf+LkSJei61WQQrWKRxWjzh9PNSPeNWuF4Wey/VM/f75oQvGgleXzrUjQpH78kyiB5uY4p6lb7G0OYePXysG5yzT249GKLj4cTwjou+14s2HuqwVEfCNizWDuGLjgy+fLeVEFP2Iw5ueklgm9qDkULMELeli4AgOEYmDrLnwnuwahcVJ8T/XsPD8MrX1qOtVSi8Rmyc3r9KvXpLUdcQtWksFoZ0JLyGBx93q9C5pcPeJTbk5fh0IqaXBJoTvjXD0bOFS793OcXaeR75tzMaOYK0FlCPON8ts8h4N1yk1CNSeDtrsd3Yt7XXmKlF3GizeGV4unQezGZT5/DwYEOHzr685k74xIxuzpC4NX63lrcr4QsD5zaIW6E9aj4EiSF+bW8Xz2FN1GJoVXp3AR46F/30Ao1hO4K45Ok/qk22yhWhm7TNUkfEbmllAr+XcMAOiKXpFYFuhI+LhQuD4cCoQ9zwBkouildNIe9dDD23zhEhh/BWco7OBTo3PE8D+7158hHfW+tIctgt7RP2cv3eU0JdCF9cGLHC/2kVx/gUkwDpEPHomrPUgqEYnctWiqUouTbhmGvxqljKccRs8VbZz/o9je/Vslt4lCNP9M3Zs3nhC0+vx8D3nMHmvotFwzUZ4DnE3TG554nQw9raNOqDaC8tT4r6xrHw6u8xUx3lOPIVjtSHrcYoBa+cY8950UfWo8VjNS18IXrT3pIYEz3dCL1ceLs5i1hvaaCIVgxb4yLdErL0PAzl1kQNQWOIvpVi0oMypPtzcec+5n7rHGf/Hl56Gt87sgwctycP+Ug2zQof
M4fM4M7FmhhC9DgDODV89OjMAB6VEDU6EryEWD4Rgsdr6bliODedsKC82JAhbG6K2WLKgdBTziMeUZV7/qP3Q7CpS434XswWz10fR9ejxeM1KXxxAeAtEOzmIk03Llh6054TQyMmA7hw6NV5z+stG14Zs34ca2vjgipN4ZVOJzq4KPHOSxJ5EAo8erapt19yrHvuy2gkWE+57C1XTGi0ymZv/XPyNyd8IXrRaJZea/SiOUDP2ieGSUv1r/X90hKLrXpHWEIP5BUpBO9oHuFh9xIHfUXr2Hd3ET7EiyFTzNRtXaSpEfHyiBlFjz/3yu/0ej0nBIiL5sytJLY3x56L8mjvZu48I38n4zzrnyp8uN65YpeKYSp8edVyLwlIQALLBE4Tvgi2Imh4ZAx96J0QNT6H0PF96sWUxoCWq5r/C/EXZg4JnB+5EcyuNfuaXzv3lIAEThG+iO8gbkveG0Nffn+ESQkmDhCouWH0nu8QeNdUedFJ4P4Eqgsf3lvqzS1VOSYteO5ajcWcS+f1ewlIYDwC1YUvZh/xdtYS8T8EkrsDEEvTPAE8Rlju8TzN+3IpjBxu50AbzL2DZ74l3/fbqsKXentba4rC4yMOtnf28L5I656dizU8aF+31x3KqB6j3Puw614Rtx29qvCxlo6GxwTB1vA17sVMb0e6rUrmkoAEJLBOoKrw4Q4jfFvDXIrIfYXs+wiTG5SH2WTWyh29rZvDXyUggTMIVBM+PDy8N8Rs6xlqJUPiM6Aw1KbcNTY92jMs6DkksE6gmvBxWh4UgHgsLWGJosWQOOc5b5Gn9itLcIg7Hr1txTpr18vjS0ACl0tV4YsZXURkKeHt8bh0BNLbmZYo+b0EJHAkgarCh6eHoC3drcBwOLxCPCuTBCQggTMIVBU+KhBe3/ShAXh3EQPcGgqfAcJzSEAC4xCoLnyg5L8UwvOLp6vEZ9fsjdPYzqopI4m52fjc+77Jy74cJ7aY5c89xll19Ty3EThF+CgaAkesj4cV4OGN8Fj420xirr0ECK3QsU437gra6mgjPBN5pw9oZQG5qX0Cpwlf+6isQSsE8NjiuXRTEdxaWoWHF08Niidcs9qATptwjR12K61gvZwK3zoff22cAIvnI86MF4eY5Sbuiy7ZP/e47nd/Agrf/W1gCSoRwDvj3m88tVg9gPjlTqYhmD5GrJJx7nxYhe/OBvD09QgwZI1/cUv/wzb37xx5RNp0NUK90nrkMwkofGfS9lynEuCOoPg7TLy/9G8PEMW1xO/8Yx3xQlN/BBS+/mxqjf4lgOilf4cZy6oY7vJ+LTGZkesZrh3H3x6TgML3mHaxVDsJsN6O/wtOh6p4cbFMhd/WvDnjezsN8ODZFb4HN5DFu41AGt9LjxD/7YIArj1B2PheSq2/9wpffza1RpfLdTY24nspEIaw4fUtPSIs4nuu2UvJ9fVe4evLntbmXwKs30vjewGGBcqxMBkBnHtMGA/MML4XxPp8Vfj6tOvQtSK+F+v35kCwNi+8vrlb0IzvzVHr6zuFry97WpvL5XrLWazfmwPC/bpxDy6vDG3ThEeYToqkv/m+DwIKXx92tBYJATy6ufhessv1GZHh9fHEoEiIIH+OtTbjG/v62i4Bha9d21nyBQJL8b10dzy6ED6ELh43RXzvkf4CIS2z748joPAdx9IjPQCBrfheWsT4Zz8EMJ4AbnwvJdTve4WvX9sOWbOl9XtzMBC78PrCy3P93hyp/r5T+Pqz6dA1yonvBSC8Q4a5IX4IIRMbrt8LQv2+Knz92nbImuXE91Iw8demiB9LYMhv6p+Awte/jYepYUl8L6Dw96axtAXxW7uNLfL42j4Bha99G1qDfwkwU/v8+fNiHkxoxHDX9XvF+JrMoPA1aTYLPSXAomSesszC5dIYHbetIXw8fy+WtUyP7+e+CCh8fdlzqNowg8tCZWZk0+EqsTq+J36Xm3hgwdai59xjud/jE1D4Ht9GlnCBAMLHv6hx5wUi
xz+oMavL5xcvXhTF6/gfjrkHFiyc2q8bJ6DwNW5Aiy8BCZQTUPjKmZlDAhJonIDC17gBLb4EJFBOQOErZ2YOCUigcQIKX+MGtPgSkEA5AYWvnJk5JCCBxgkofI0b0OJLQALlBBS+cmbmkIAEGieg8DVuQIsvAQmUE1D4ypmZQwISaJyAwte4AS2+BCRQTkDhK2dmDglIoHECCl/jBrT4EpBAOQGFr5yZOSQggcYJKHyNG9DiS0AC5QQUvnJm5pCABBonoPA1bkCLLwEJlBNQ+MqZmUMCEmicgMLXuAEtvgQkUE5A4StnZg4JSKBxAgpf4wa0+BKQQDkBha+cmTkkIIHGCSh8jRvQ4ktAAuUEFL5yZuaQgAQaJ6DwNW5Aiy8BCZQTUPjKmZlDAhJonIDC17gBLb4EJFBOQOErZ2YOCUigcQIKX+MGtPgSkEA5AYWvnJk5JCCBxgkofI0b0OJLQALlBBS+cmbmkIAEGieg8DVuQIsvAQmUE1D4ypmZQwISaJyAwte4AS2+BCRQTkDhK2dmDglIoHECCl/jBrT4EpBAOQGFr5yZOSQggcYJKHyNG9DiS0AC5QQUvnJm5pCABBonoPA1bkCLLwEJlBP4B8ZqmYMbLSBDAAAAAElFTkSuQmCC" + } + }, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Variance\n", + "\n", + "Variance is another number which represents spreadness of data points. It's formula is square of standard deviation.\n", + "\n", + "![image.png](attachment:image.png)\n", + "\n", + "`np.var(array)` - returns the variance of array-list." + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "0.07340221168854594" + ] + }, + "execution_count": 8, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "np.var(data_list)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Min/Max\n", + "\n", + "Returns the minimum/maximum value of array. Let's look at an example. 
" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "0.02789257109099408" + ] + }, + "execution_count": 9, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "np.min(data_list)" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "0.9285864855860084" + ] + }, + "execution_count": 10, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "np.max(data_list)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Argmin/Argmax\n", + "\n", + "To find the index of the element with the minimum or maximum value, use the `argmin` or `argmax` function." + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "7" + ] + }, + "execution_count": 11, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "np.argmax(data_list)" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "8" + ] + }, + "execution_count": 12, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "np.argmin(data_list)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Cumsum\n", + "\n", + "Returns the cumulative sum of the array: each element of the result is the sum of all input elements up to and including that position."
 + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([0.72686874, 0.92161255, 0.97222531, 1.32678418, 1.93794286,\n", + " 2.49347359, 2.94088991, 3.8694764 , 3.89736897, 4.59402004,\n", + " 4.91320002, 5.39594821, 6.20909104, 6.32402138, 7.09172928,\n", + " 7.62427463])" + ] + }, + "execution_count": 13, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "np.cumsum(data_list)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Cumprod\n", + "\n", + "Returns the cumulative product of the array: each element of the result is the product of all input elements up to and including that position." + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([7.26868735e-01, 1.41553191e-01, 7.16439728e-03, 2.54020062e-03,\n", + " 1.55246566e-03, 8.62442384e-04, 3.85870794e-04, 3.58314404e-04,\n", + " 9.99431000e-06, 6.96254683e-06, 2.22230550e-06, 1.07281396e-06,\n", + " 8.72350983e-07, 1.00259595e-07, 7.69700829e-08, 4.09900599e-08])" + ] + }, + "execution_count": 14, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "np.cumprod(data_list)" + ] + }, + { + "attachments": { + "image-2.png": { + "image/png": "" + } + }, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Some more common methods\n", + "\n", + "We encourage you to try out these functions. They will come in handy in many of the real-world data science problems you encounter. 
\n", + "\n", + "![image-2.png](attachment:image-2.png)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Array and Set operations\n", + "\n", + "## Unique\n", + "\n", + "`np.unique(array)` - returns the unique values of the array in sorted order" + ] + }, + { + "cell_type": "code", + "execution_count": 15, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array(['apple', 'banana', 'mango', 'orange', 'kiwi', 'mango', 'lemon'],\n", + " dtype='\n" + ] + }, + { + "cell_type": "code", + "execution_count": 19, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array(['apple', 'banana', 'capsicum', 'kiwi', 'lemon', 'mango', 'orange',\n", + " 'spinach', 'tomato'], dtype='\n", + "Returns the sum of all diagonal elements." + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "16" + ] + }, + "execution_count": 7, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "np.trace(Mat1)" + ] + }, + { + "attachments": { + "image.png": { + "image/png": "" + } + }, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## det (determinant)\n", + "![image.png](attachment:image.png)\n", + "Returns the determinant of the matrix." + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "23.999999999999993" + ] + }, + "execution_count": 8, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "linalg.det(Mat1)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### eig (eigenvalues and eigenvectors):\n", + "\n", + "Returns the eigenvalues and eigenvectors of the matrix."
+ ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "(array([15.90625326+0.j , 0.04687337+1.22745405j,\n", + " 0.04687337-1.22745405j]),\n", + " array([[ 0.23498454+0.j , 0.25910807-0.26656981j,\n", + " 0.25910807+0.26656981j],\n", + " [ 0.56518307+0.j , -0.74337498+0.j ,\n", + " -0.74337498-0.j ],\n", + " [ 0.79079097+0.j , 0.52232979+0.19070601j,\n", + " 0.52232979-0.19070601j]]))" + ] + }, + "execution_count": 9, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "linalg.eig(Mat1)" + ] + }, + { + "attachments": { + "image.png": { + "image/png": "" + } + }, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## inv (inverse)\n", + "![image.png](attachment:image.png)\n", + "\n", + "Used to calculate inverse of a matrix.
" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[ 0.58333333, 0.25 , -0.33333333],\n", + " [-1.79166667, -0.125 , 0.66666667],\n", + " [ 1.33333333, 0. , -0.33333333]])" + ] + }, + "execution_count": 10, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "linalg.inv(Mat1)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## pinv(Pseudo-inverse)\n", + "\n", + "Used to calculate inverse of a matrix, using singular-value decomposition. " + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[ 5.83333333e-01, 2.50000000e-01, -3.33333333e-01],\n", + " [-1.79166667e+00, -1.25000000e-01, 6.66666667e-01],\n", + " [ 1.33333333e+00, 1.06466550e-15, -3.33333333e-01]])" + ] + }, + "execution_count": 11, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "linalg.pinv(Mat1)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### qr (QR decomposition):\n", + "Decomposting a single matrix into two Q and R matrix.
\n", + "Q is an orthogonal matrix, which means its transpose equals its inverse, so the determinant of Q is +1 or -1.
\n", + "R is an upper triangular matrix, which means all elements below the diagonal are 0." + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": {}, + "outputs": [], + "source": [ + "Q, R = linalg.qr(Mat1)" + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Q is :\n", + "[[-1.23091491e-01 2.08978502e-01 -9.70142500e-01]\n", + " [-8.61640437e-01 -5.07519219e-01 -5.88802401e-16]\n", + " [-4.92365964e-01 8.35914008e-01 2.42535625e-01]]\n", + "R is :\n", + "[[-8.1240384 -9.35495331 -9.10877033]\n", + " [ 0. 4.06015375 5.61256548]\n", + " [ 0. 0. -0.72760688]]\n" + ] + } + ], + "source": [ + "print(\"Q is :\")\n", + "print(Q)\n", + "print(\"R is :\")\n", + "print(R)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "QR decomposition is used for solving linear least-squares problems (regression problems). " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## svd (Singular-value decomposition)\n", + "In SVD, a matrix is decomposed into 3 matrices.
\n", + "U is a Unitary array(s).
\n", + "S is a vector containing the singular values, in descending order.
\n", + "Vh is a Unitary array(s)." + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": {}, + "outputs": [], + "source": [ + "U, S, Vh = linalg.svd(Mat1)" + ] + }, + { + "cell_type": "code", + "execution_count": 15, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "U is\n", + "[[-0.21895023 0.25194679 -0.94264713]\n", + " [-0.6126121 -0.78743835 -0.0681708 ]\n", + " [-0.75945192 0.56255102 0.32675546]]\n", + "S is\n", + "[16.49711482 3.56065557 0.4085764 ]\n", + "Vh is\n", + "[[-0.45735528 -0.61763457 -0.63980754]\n", + " [-0.84532681 0.07854498 0.5284442 ]\n", + " [-0.27613174 0.78253321 -0.55802602]]\n" + ] + } + ], + "source": [ + "print(\"U is\")\n", + "print(U)\n", + "print(\"S is\")\n", + "print(S)\n", + "print(\"Vh is\")\n", + "print(Vh)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## solve\n", + "\n", + "The `solve` function is used to solve a linear matrix equation\n", + "\n", + "> ax = b\n", + "\n", + "Here, matrix `a` and matrix `b` are known, and `x` is the unknown: the equation is solved to find `x` such that `ax = b` is satisfied." + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([ 10.37728873, -29.73002839, 21.85996095])" + ] + }, + "execution_count": 16, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "linalg.solve(Mat1,S)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The linear algebra module within NumPy provides several ready-to-use functions. You can find the complete list of such functions on the NumPy website. 
\n", + "\n", + "https://numpy.org/doc/stable/reference/routines.linalg.html" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.6" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/NumPy/Lesson 8 - Arrays in Files.ipynb b/NumPy/Lesson 8 - Arrays in Files.ipynb new file mode 100644 index 0000000..3113e56 --- /dev/null +++ b/NumPy/Lesson 8 - Arrays in Files.ipynb @@ -0,0 +1,237 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# File Input and Output with Arrays\n", + "\n", + "We will use a lot of arrays in our data science projects. Often we have too many arrays, and we need to save them to a file, so that we can work on them later without having to recompute them. \n", + "\n", + "NumPy provides some really convenient methods for saving arrays to files. Let's take a look at these." + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])" + ] + }, + "execution_count": 1, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "import numpy as np\n", + "arr = np.arange(10)\n", + "arr" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Save" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [], + "source": [ + "np.save('my_array', arr)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The above line of code saves the entire array to a file named `my_array`. The file is created in the same location where this Python program is running. 
In other words, the file location can be described as `./my_array`. \n", + "\n", + "If you check the folder, you will see that a new file has been created with the name `my_array.npy`. `.npy` is the file extension NumPy uses for the files it creates. " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Load" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])" + ] + }, + "execution_count": 3, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "arr2 = np.load('my_array.npy')\n", + "arr2" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "There we go. We got the previously saved array back from the file. \n", + "\n", + "Keep in mind that you need not specify the file extension when using the `save` function. However, you must specify the complete file name, that is, the name with the extension, when using the `load` function. This is why we call `save` with `my_array`, but we have to call `load` with `my_array.npy`." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Saving Multiple Arrays\n", + "\n", + "We can save multiple arrays in a single file. The arrays are just passed on as additional keyword arguments. Let's take a look at how this is done. " + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [], + "source": [ + "arr1 = np.arange(10)\n", + "arr2 = np.arange(5)\n", + "\n", + "np.savez('my_arrays', a=arr1, b=arr2)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Note: We use a different function here. We are using `savez` instead of `save`. The `savez` function creates an uncompressed archive of the arrays. 
" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [], + "source": [ + "arrays = np.load('my_arrays.npz')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Note, the extension of the saved file is changed to `npz` instead of the previous `npy`. The `load` method in this case loads both the arrays. \n", + "\n", + "So how do we access the individual arrays that we had saved? We can do this by using the keys we specified while saving them. " + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "First: [0 1 2 3 4 5 6 7 8 9]\n", + "Second: [0 1 2 3 4]\n" + ] + } + ], + "source": [ + "first = arrays['a']\n", + "second = arrays['b']\n", + "print('First:', first)\n", + "print('Second:', second)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Compressed Data\n", + "\n", + "NumPy provides an extra function called `savez_compressed`. This works same as the `savez` function, but the data is first compressed and then saved to the file. \n", + "\n", + "If you want to save disk space, and your data is in the nature for it to get compressed well, then you can use the `savez_compressed` function to save on disk space. However, compressing and uncompressing can take time in case of large files. So time to save and load the arrays might take longer when using the `savez_compressed`, but can help with saving disk space. 
" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [], + "source": [ + "np.savez_compressed('my_arrays', a=arr1, b=arr2)" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": {}, + "outputs": [], + "source": [ + "arrays = np.load('my_arrays.npz')" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "First: [0 1 2 3 4 5 6 7 8 9]\n", + "Second: [0 1 2 3 4]\n" + ] + } + ], + "source": [ + "first = arrays['a']\n", + "second = arrays['b']\n", + "print('First:', first)\n", + "print('Second:', second)" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.6" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +}