{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Интеллектуальный анализ данных. Введение в анализ данных\n",
    "\n",
    "## Семинар 3. Библиотека Numpy\n",
    "\n",
    "### 0. Полезная информация\n",
    " - [Страница курса на вики](http://wiki.cs.hse.ru/Майнор_Интеллектуальный_анализ_данных/Введение_в_анализ_данных)\n",
    " - [Страница семинаров на вики](http://wiki.cs.hse.ru/Майнор_Интеллектуальный_анализ_данных/Введение_в_анализ_данных/ИАД-11,12)\n",
    " - [Таблица с оценками](https://docs.google.com/spreadsheets/d/1jZL_-ELf0Ogj2XHa6VVbkg8vrInycv2-Z9UR5keLDfM/edit?usp=sharing)\n",
    " - Почта курса *hse.minor.dm@gmail.com* (Формат темы: \"[ИАД-NN] - Вопрос - Фамилия Имя Отчество\")\n",
    " - Виртуальная машина для майнора (подробности см. на странице семинара)\n",
    " - **Подписаться на рассылку**: написать пустое письмо на hse-minor-datamining-2+subscribe@googlegroups.com\n",
    " - Первое ДЗ!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 1. Что было в прошлый раз\n",
    "\n",
    "**Типы ответов**\n",
    " - вещественные ответы (регрессия)\n",
    " - конечное число ответов (классификация: бинарная/многоклассовая/пересекающиеся классы)\n",
    " - временной ряд\n",
    " - ранжирование\n",
    " - отсутствие ответа (кластеризация)\n",
    " \n",
    "**Типы признаков**\n",
    " - бинарный ({0, 1})\n",
    " - вещественный\n",
    " - категориальный (неупорядоченное конечное множество)\n",
    " - порядковый (упорядоченное конечное множество)\n",
    " - множественные признаки\n",
    " \n",
    "**Обобщающая способность**\n",
    " - недообучение\n",
    " - переобучение\n",
    " \n",
    "**Задачи анализа данных**\n",
    " - медицинская диагностика: объект — пациент, ответ — диагноз, классификация с пересекающимися классами; признаки: бинарные (пол), порядковые (тяжесть состояния), вещественные (вес)\n",
    " - кредитный скоринг\n",
    " - предсказание оттока клиентов\n",
    " - стоимость недвижимости\n",
    " - прогнозированние продаж (временные ряды)\n",
    " - рекомендательные системы"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Numpy \n",
    "### Полезная информация\n",
    " - [Все numpy функции](http://docs.scipy.org/doc/numpy-1.10.0/reference/)\n",
    " - [100 заданий по Numpy](http://www.labri.fr/perso/nrougier/teaching/numpy.100/)\n",
    " - Попробуйте в ipython notebook:\n",
    "  - **'np.ar' + [Tab]** (autocompletion)\n",
    "  - **'np.arange()' + [Shift+Tab]** (docstring)\n",
    "  - **?np.arange** (object description)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "import numpy as np"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Создание веторов:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(4,)\n",
      "(1, 4)\n",
      "(3, 2, 2)\n"
     ]
    }
   ],
   "source": [
    "X = np.array([1, 2, 3, 4])\n",
    "print X.shape\n",
    "\n",
    "X = np.array([[1, 2, 3, 4]])\n",
    "print X.shape\n",
    "\n",
    "X = np.array([ [[1, 2],\n",
    "               [3, 4]], \n",
    "              [[1, 2],\n",
    "               [3, 4]],\n",
    "              [[1, 2],\n",
    "               [3, 4]] ])\n",
    "print X.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "A = np.arange(15)\n",
    "A"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0,  1,  2,  3,  4],\n",
       "       [ 5,  6,  7,  8,  9],\n",
       "       [10, 11, 12, 13, 14]])"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "A = np.arange(15).reshape(3, 5)\n",
    "A"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Пустой массив, массив без инициализации"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(0,)"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "X = np.array([])\n",
    "X.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "ename": "ValueError",
     "evalue": "total size of new array must be unchanged",
     "output_type": "error",
     "traceback": [
      "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[1;31mValueError\u001b[0m                                Traceback (most recent call last)",
      "\u001b[1;32m<ipython-input-27-6a2129fececd>\u001b[0m in \u001b[0;36m<module>\u001b[1;34m()\u001b[0m\n\u001b[1;32m----> 1\u001b[1;33m \u001b[0mX\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mnp\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0marray\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mreshape\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;36m5\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;36m5\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m",
      "\u001b[1;31mValueError\u001b[0m: total size of new array must be unchanged"
     ]
    }
   ],
   "source": [
    "X = np.array([]).reshape(2, 2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[  4.94065646e-324,   9.88131292e-324],\n",
       "       [  1.48219694e-323,   1.97626258e-323]])"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "X = np.empty((2, 2))\n",
    "X"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Что еще: \n",
    "- *np.ndarray.dtype*, *np.astype()* [data type conversion]\n",
    "- *np.asarray()*, *np.ndarray.tolist()* [object type conversion]\n",
    "-  **Syntax{np.arange()}**: *np.linspace()*, *np.logspace()*;  **Syntax{np.empty()}**: *np.ones()*, *np.eye()*; *np.diag()* [creation]\n",
    "- *np.empty_like()*, *np.zeros_like()*, *np.ones_like()*, ***np.copy()*** [create_like, **deep copy** *vs* **view**]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Изменение размерностей"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0,  5, 10],\n",
       "       [ 1,  6, 11],\n",
       "       [ 2,  7, 12],\n",
       "       [ 3,  8, 13],\n",
       "       [ 4,  9, 14]])"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "A.T"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[0, 1, 2],\n",
       "       [3, 4, 5]])"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "B = np.arange(6).reshape(2, 3)\n",
    "B"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Конкатенация:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0,  5, 10],\n",
       "       [ 1,  6, 11],\n",
       "       [ 2,  7, 12],\n",
       "       [ 3,  8, 13],\n",
       "       [ 4,  9, 14],\n",
       "       [ 0,  1,  2],\n",
       "       [ 3,  4,  5]])"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.concatenate((A.T, B), axis=0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0,  1,  2,  3,  4,  0,  3],\n",
       "       [ 5,  6,  7,  8,  9,  1,  4],\n",
       "       [10, 11, 12, 13, 14,  2,  5]])"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.concatenate((A, B.T), axis=1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "В чем разница между *concatenate* и *vstack*/*hstack*? [Stackoverflow!](http://stackoverflow.com/questions/33356442/when-should-i-use-hstack-vstack-vs-append-vs-concatenate-vs-column-stack)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Изменение размера матрицы (количество элементов должно оставаться тем же)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[0, 1, 2],\n",
       "       [3, 4, 5]])"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.arange(6).reshape(2, 3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Если задать один один из параметров равным -1, то он будет вычислен автоматически"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[0, 1, 2],\n",
       "       [3, 4, 5]])"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.arange(6).reshape(2, -1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Вытягивание в вектор"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.ravel(A)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 0,  5, 10,  1,  6, 11,  2,  7, 12,  3,  8, 13,  4,  9, 14])"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.ravel(A, order='F')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "?np.ravel"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[flatten vs ravel](http://stackoverflow.com/questions/28930465/what-is-the-difference-between-flatten-and-ravel-functions-in-numpy)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Что еще: \n",
    "- *np.swapaxes()*, *np.transpose()* [reshape]\n",
    "- *np.newaxis* [broadcasting], *np.split()*, *np.tile()*, *np.repeat()* [broadcast]\n",
    "- **Syntax{np.concatenate()}**: *np.vstack()*, *np.hstack()*; *np.append()* [concatenate]\n",
    "- *np.insert()*, *np.delete()*, *np.resize()*\n",
    "- *np.unique()*"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Выборки элементов"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 138,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[0, 1, 2, 3, 4],\n",
       "       [5, 6, 7, 8, 9]])"
      ]
     },
     "execution_count": 138,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "A = np.arange(10).reshape(2, 5)\n",
    "A"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Применить одно условие достаточно просто"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 135,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([5, 6, 7, 8, 9])"
      ]
     },
     "execution_count": 135,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "A[A > 4]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Если нужно сделать несколько условий?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "ename": "ValueError",
     "evalue": "The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mValueError\u001b[0m                                Traceback (most recent call last)",
      "\u001b[0;32m<ipython-input-44-2852d2c8b339>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mA\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mA\u001b[0m \u001b[0;34m>\u001b[0m \u001b[0;36m4\u001b[0m \u001b[0;32mand\u001b[0m \u001b[0mA\u001b[0m \u001b[0;34m<\u001b[0m \u001b[0;36m6\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
      "\u001b[0;31mValueError\u001b[0m: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()"
     ]
    }
   ],
   "source": [
    "A[A > 4 and A < 6]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([5])"
      ]
     },
     "execution_count": 45,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "A[np.logical_and(A > 4, A < 8)]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Чтобы найти все ненулевые элементы матрицы:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 136,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(array([0, 0, 0, 0, 1, 1, 1, 1, 1]), array([1, 2, 3, 4, 0, 1, 2, 3, 4]))"
      ]
     },
     "execution_count": 136,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "A.nonzero()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([1, 2, 3, 4, 5])"
      ]
     },
     "execution_count": 48,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "A[A.nonzero()]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0, 10, 10, 10, 10],\n",
       "       [10, 10, 10, 10, 10]])"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "A[A.nonzero()] = 10\n",
    "A"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "A = np.arange(10).reshape(2, 5)\n",
    "B = np.arange(9, -1, -1).reshape(2, 5) # на заметку slicing через аргументы: start = 9, stop=-1, step=-1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[9, 8, 7, 6, 5],\n",
       "       [4, 3, 2, 1, 0]])"
      ]
     },
     "execution_count": 46,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "B"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[9, 8, 7, 6, 5],\n",
       "       [5, 6, 7, 8, 9]])"
      ]
     },
     "execution_count": 47,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.where(A>B, A, B)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Использование для индексации"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "ind = np.where(A>5)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([6, 7, 8, 9])"
      ]
     },
     "execution_count": 49,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "A[ind[0], ind[1]]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0,  1,  2,  3,  4],\n",
       "       [ 5, -3, -3, -3, -3]])"
      ]
     },
     "execution_count": 50,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "A[ind[0], ind[1]] = -3\n",
    "A"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Что еще: \n",
    "- **Syntax{np.logical_and()}**: *np.logical_and()*, *np.logical_not()* [&&, ||, ~]\n",
    "- *np.all*, *np.any()* [bool array check]\n",
    "- *np.isnan()*, *np.isinf()* [NaNs, Infs check]\n",
    "- *np.allclose()*, *np.isclose()*, *np.equal()* **[float comparison]**"
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {},
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Matrix\n",
    "Ещё один тип данных в NumPy — matrix. Является производным классом от ndarray, в связи с чем можно использовать все методы и функции, применимые к array. Однако:\n",
    " - matrix — строго 2мерные;\n",
    " - матричное умножение осуществляется через * (в отличие от dot для array)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "A = np.arange(0, 4).reshape(2, 2)\n",
    "B = np.arange(3, 7).reshape(2, 2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0,  4],\n",
       "       [10, 18]])"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "A * B"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "a = np.matrix(A)\n",
    "b = np.matrix(B)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "matrix([[ 5,  6],\n",
       "        [21, 26]])"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a * b"
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {},
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. Цель\n",
    "\n",
    "**Зачем** нам нужен NumPy, если есть вложенные списки/кортежи и циклы?\n",
    "\n",
    "Причина заключается в скорости работы:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 56,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import time"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Примечание: нет необходимости пользоваться модулем **random**, все есть в **numpy.random**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 57,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "A_quick_arr = np.random.normal(size = (100000,))\n",
    "B_quick_arr = np.random.normal(size = (100000,))\n",
    "\n",
    "A_slow_list, B_slow_list = list(A_quick_arr), B_quick_arr.tolist()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 68,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "N_repeat = 100"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 73,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.0429953\n"
     ]
    }
   ],
   "source": [
    "start = time.clock()\n",
    "for i in range(N_repeat):\n",
    "    ans = 0\n",
    "    for i in range(100000):\n",
    "        ans += A_slow_list[i] * B_slow_list[i]\n",
    "print (time.clock() - start) / N_repeat # время выполнения в секундах"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 74,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.03926768\n"
     ]
    }
   ],
   "source": [
    "start = time.clock()\n",
    "for i in range(N_repeat):\n",
    "    ans = sum([A_slow_list[i] * B_slow_list[i] for i in range(100000)])\n",
    "print (time.clock() - start)/ N_repeat"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 75,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.0005213\n"
     ]
    }
   ],
   "source": [
    "start = time.clock()\n",
    "for i in range(N_repeat):\n",
    "    ans = np.sum(A_quick_arr * B_quick_arr)\n",
    "print (time.clock() - start)/ N_repeat"
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {},
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4. Дома посмотреть:\n",
    "- Все \"Что еще\"\n",
    "- Модули **np.random**, **time** (Также полезно знать **%timeit**)\n",
    "- **Indexing** ([basic](http://docs.scipy.org/doc/numpy-1.10.1/user/basics.indexing.html), [advanced](http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html))\n",
    "- Statistic funcs: *amin()*, *amax()*, *mean()*, *std()*\n",
    "- Math funcs: *sum()*, *prod()*, *cumsum()*, *cumprod()*, *diff()*\n",
    "- Можно короче? Да: *a.min()* и *np.min(a)*\n",
    "- *nansum()*, *nanmin()*, *nanmax()*, *nanmean()*\n",
    "- **Читайте Docstrings**: See Also раздел, Notes раздел\n",
    "- **[Numpy Performance tricks](http://nbviewer.jupyter.org/gist/rossant/4645217)** **(!)**"
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {},
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 2",
   "language": "python",
   "name": "python2"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.11"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}