{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Group numeric values into categories (bins)\n", "\n", "We often have a column with many numeric values that we want to group into bins or buckets, such as age groups or value tiers. In Excel we can do this using any of the following options: \n", "\n", "- Data Menu -> Data Analysis -> Histogram or Rank and Percentile\n", "- _LOOKUP_ (`=LOOKUP(A1,{0,7,14,31,90,180,360},{\"0-6\",\"7-13\",\"14-30\",\"31-89\",\"90-179\",\"180-359\",\">360\"})`, for example)\n", "- _IF_ (`=IF(A1>30,\"large\",IF(A1>20,\"medium\",IF(A1>=10,\"small\",IF(A1<10,\"tiny\",\"\"))))`, for example), or\n", "- _INDEX_ (`=INDEX({\"Small\",\"Medium\",\"Large\"},LARGE(IF(A1>{0,11,21},{1,2,3}),1))`, for example)\n", "\n", "\n", "[![Open In Studio Lab](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/aiola-lab/from-excel-to-pandas/blob/master/notebooks/02.08_group_values.ipynb)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Cut Function\n", "\n", "Since this is a common need, Pandas has built-in functions _cut_ and _qcut_ that make binning more flexible and accurate. We start by importing pandas." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import pandas as pd" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Real-life example\n", "\n", "Let's analyze data on the impact of weather on the number of bike rentals. We know that when it is too cold or too hot, people rent fewer bikes. But what is the cutoff temperature value? When we hear the weather forecast for the next 10 days, how should we prepare our bike rental fleet to meet the expected demand?\n", "\n", "### Loading Data\n", "\n", "As usual, let's load a dataset to work on. We will use the dataset that we used before, \"Bike Share\". This is a data set about the demand for a bike share service in Seoul. 
Please note that we need to modify the default encoding of _read_csv_ from 'UTF-8' to 'latin1'." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>Date</th><th>Rented Bike Count</th><th>Hour</th><th>Temperature(°C)</th><th>Humidity(%)</th><th>Wind speed (m/s)</th><th>Visibility (10m)</th><th>Dew point temperature(°C)</th><th>Solar Radiation (MJ/m2)</th><th>Rainfall(mm)</th><th>Snowfall (cm)</th><th>Seasons</th><th>Holiday</th><th>Functioning Day</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>01/12/2017</td><td>254</td><td>0</td><td>-5.2</td><td>37</td><td>2.2</td><td>2000</td><td>-17.6</td><td>0.0</td><td>0.0</td><td>0.0</td><td>Winter</td><td>No Holiday</td><td>Yes</td></tr>\n",
"    <tr><th>1</th><td>01/12/2017</td><td>204</td><td>1</td><td>-5.5</td><td>38</td><td>0.8</td><td>2000</td><td>-17.6</td><td>0.0</td><td>0.0</td><td>0.0</td><td>Winter</td><td>No Holiday</td><td>Yes</td></tr>\n",
"    <tr><th>2</th><td>01/12/2017</td><td>173</td><td>2</td><td>-6.0</td><td>39</td><td>1.0</td><td>2000</td><td>-17.7</td><td>0.0</td><td>0.0</td><td>0.0</td><td>Winter</td><td>No Holiday</td><td>Yes</td></tr>\n",
"    <tr><th>3</th><td>01/12/2017</td><td>107</td><td>3</td><td>-6.2</td><td>40</td><td>0.9</td><td>2000</td><td>-17.6</td><td>0.0</td><td>0.0</td><td>0.0</td><td>Winter</td><td>No Holiday</td><td>Yes</td></tr>\n",
"    <tr><th>4</th><td>01/12/2017</td><td>78</td><td>4</td><td>-6.0</td><td>36</td><td>2.3</td><td>2000</td><td>-18.6</td><td>0.0</td><td>0.0</td><td>0.0</td><td>Winter</td><td>No Holiday</td><td>Yes</td></tr>\n",
"    <tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
"    <tr><th>8755</th><td>30/11/2018</td><td>1003</td><td>19</td><td>4.2</td><td>34</td><td>2.6</td><td>1894</td><td>-10.3</td><td>0.0</td><td>0.0</td><td>0.0</td><td>Autumn</td><td>No Holiday</td><td>Yes</td></tr>\n",
"    <tr><th>8756</th><td>30/11/2018</td><td>764</td><td>20</td><td>3.4</td><td>37</td><td>2.3</td><td>2000</td><td>-9.9</td><td>0.0</td><td>0.0</td><td>0.0</td><td>Autumn</td><td>No Holiday</td><td>Yes</td></tr>\n",
"    <tr><th>8757</th><td>30/11/2018</td><td>694</td><td>21</td><td>2.6</td><td>39</td><td>0.3</td><td>1968</td><td>-9.9</td><td>0.0</td><td>0.0</td><td>0.0</td><td>Autumn</td><td>No Holiday</td><td>Yes</td></tr>\n",
"    <tr><th>8758</th><td>30/11/2018</td><td>712</td><td>22</td><td>2.1</td><td>41</td><td>1.0</td><td>1859</td><td>-9.8</td><td>0.0</td><td>0.0</td><td>0.0</td><td>Autumn</td><td>No Holiday</td><td>Yes</td></tr>\n",
"    <tr><th>8759</th><td>30/11/2018</td><td>584</td><td>23</td><td>1.9</td><td>43</td><td>1.3</td><td>1909</td><td>-9.3</td><td>0.0</td><td>0.0</td><td>0.0</td><td>Autumn</td><td>No Holiday</td><td>Yes</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"<p>8760 rows × 14 columns</p>\n",
"</div>
" ], "text/plain": [ " Date Rented Bike Count Hour Temperature(°C) Humidity(%) \\\n", "0 01/12/2017 254 0 -5.2 37 \n", "1 01/12/2017 204 1 -5.5 38 \n", "2 01/12/2017 173 2 -6.0 39 \n", "3 01/12/2017 107 3 -6.2 40 \n", "4 01/12/2017 78 4 -6.0 36 \n", "... ... ... ... ... ... \n", "8755 30/11/2018 1003 19 4.2 34 \n", "8756 30/11/2018 764 20 3.4 37 \n", "8757 30/11/2018 694 21 2.6 39 \n", "8758 30/11/2018 712 22 2.1 41 \n", "8759 30/11/2018 584 23 1.9 43 \n", "\n", " Wind speed (m/s) Visibility (10m) Dew point temperature(°C) \\\n", "0 2.2 2000 -17.6 \n", "1 0.8 2000 -17.6 \n", "2 1.0 2000 -17.7 \n", "3 0.9 2000 -17.6 \n", "4 2.3 2000 -18.6 \n", "... ... ... ... \n", "8755 2.6 1894 -10.3 \n", "8756 2.3 2000 -9.9 \n", "8757 0.3 1968 -9.9 \n", "8758 1.0 1859 -9.8 \n", "8759 1.3 1909 -9.3 \n", "\n", " Solar Radiation (MJ/m2) Rainfall(mm) Snowfall (cm) Seasons \\\n", "0 0.0 0.0 0.0 Winter \n", "1 0.0 0.0 0.0 Winter \n", "2 0.0 0.0 0.0 Winter \n", "3 0.0 0.0 0.0 Winter \n", "4 0.0 0.0 0.0 Winter \n", "... ... ... ... ... \n", "8755 0.0 0.0 0.0 Autumn \n", "8756 0.0 0.0 0.0 Autumn \n", "8757 0.0 0.0 0.0 Autumn \n", "8758 0.0 0.0 0.0 Autumn \n", "8759 0.0 0.0 0.0 Autumn \n", "\n", " Holiday Functioning Day \n", "0 No Holiday Yes \n", "1 No Holiday Yes \n", "2 No Holiday Yes \n", "3 No Holiday Yes \n", "4 No Holiday Yes \n", "... ... ... 
\n", "8755 No Holiday Yes \n", "8756 No Holiday Yes \n", "8757 No Holiday Yes \n", "8758 No Holiday Yes \n", "8759 No Holiday Yes \n", "\n", "[8760 rows x 14 columns]" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "bike_share_data = (\n", "    pd\n", "    .read_csv(\n", "        'https://archive.ics.uci.edu/ml/machine-learning-databases/00560/SeoulBikeData.csv', \n", "        encoding='latin1'\n", "    )\n", ")\n", "bike_share_data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Simple Data Visualizations\n", "\n", "* Start with the table above\n", "* Create histograms for numeric columns\n", "* Plot the histograms" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "(\n", "    bike_share_data\n", "    [['Rented Bike Count']]\n", "    .plot(\n", "        kind='hist', \n", "        alpha=0.5, \n", "        title='Rented Bike Count'\n", "    )\n", ");" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "(\n", "    bike_share_data\n", "    [['Temperature(°C)']]\n", "    .plot(\n", "        kind='hist', \n", "        alpha=0.5, \n", "        title='Temperature(°C)'\n", "    )\n", ");" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Creating bins with _Cut_\n", "\n", "* Start with the table above\n", "* Focus on the temperature column\n", "* Create 5 bins" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 (-6.36, 5.08]\n", "1 (-6.36, 5.08]\n", "2 (-6.36, 5.08]\n", "3 (-6.36, 5.08]\n", "4 (-6.36, 5.08]\n", " ... \n", "8755 (-6.36, 5.08]\n", "8756 (-6.36, 5.08]\n", "8757 (-6.36, 5.08]\n", "8758 (-6.36, 5.08]\n", "8759 (-6.36, 5.08]\n", "Name: Temperature(°C), Length: 8760, dtype: category\n", "Categories (5, interval[float64, right]): [(-17.857, -6.36] < (-6.36, 5.08] < (5.08, 16.52] < (16.52, 27.96] < (27.96, 39.4]]" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(\n", "    pd\n", "    .cut(\n", "        bike_share_data\n", "        ['Temperature(°C)'], \n", "        5\n", "    ) \n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We see the interval for each of the bins:\n", "`[(-17.857, -6.36] < (-6.36, 5.08] < (5.08, 16.52] < (16.52, 27.96] < (27.96, 39.4]]`. The bins are ordered and have equal width on the temperature scale (about 11.4 °C each; the lowest edge is nudged down slightly so that the minimum value is included). 
Let's see how well they split the data:\n", "\n", "* Split the temperature value into 5 bins\n", "* Count the number of records in each bin\n", "* Sort the values by the temperature range (the index of the series)\n" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(-17.857, -6.36] 529\n", "(-6.36, 5.08] 1989\n", "(5.08, 16.52] 2428\n", "(16.52, 27.96] 2923\n", "(27.96, 39.4] 891\n", "Name: Temperature(°C), dtype: int64" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(\n", "    pd\n", "    .cut(\n", "        bike_share_data\n", "        ['Temperature(°C)'], \n", "        5\n", "    )\n", "    .value_counts()\n", "    .sort_index()\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, we see that sorting puts the bins in order of temperature range rather than alphabetical order. More importantly, the bins do not hold equal numbers of records (the counts range from 529 to 2,923), and the split is not based on any other meaningful criterion.\n", "\n", "### Setting the bins' limits\n", "\n", "We can make the bins more meaningful by setting the limits explicitly. As experts in bicycles, we might know the temperature ranges that are suitable for different accessories and clothing. 
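Explicit edges have two consequences worth checking first: intervals are closed on the right by default, and values outside the outer edges become missing. A minimal sketch on made-up temperatures (the `temps` values are illustrative and not taken from the dataset):

```python
import pandas as pd

# Illustrative values chosen to sit on, inside, and outside the bin edges.
temps = pd.Series([-18.0, -5.0, 0.0, 0.1, 23.9, 24.0, 41.0])
edges = [-18, 0, 8, 16, 24, 40]

# Default: right-closed intervals such as (-18, 0], so -18.0 itself is excluded.
default_bins = pd.cut(temps, bins=edges)

# include_lowest=True makes the first interval include its left edge as well.
inclusive_bins = pd.cut(temps, bins=edges, include_lowest=True)

comparison = pd.DataFrame({
    'temp': temps,
    'default': default_bins,
    'include_lowest': inclusive_bins,
})
print(comparison)
```

A value equal to an inner edge, such as 0.0 or 24.0, lands in the interval that ends at that edge, while 41.0 is left without a bin in both variants. In the bike data the lower edge of -18 lies below the coldest reading, so no record is dropped with the default settings.
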
Let's fix the ranges based on this domain knowledge:\n", "\n", "* Split the temperature value into 5 bins based on the ranges used by bicycle professionals\n", "* Count the number of records in each bin\n", "* Sort the values by the temperature range (the index of the series)\n", "\n" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(-18, 0] 1454\n", "(0, 8] 1731\n", "(8, 16] 1651\n", "(16, 24] 2148\n", "(24, 40] 1776\n", "Name: Temperature(°C), dtype: int64" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(\n", "    pd\n", "    .cut(\n", "        bike_share_data\n", "        ['Temperature(°C)'], \n", "        bins=[-18, 0, 8, 16, 24, 40]\n", "    )\n", "    .value_counts()\n", "    .sort_index()\n", ")\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Adding meaningful labels\n", "\n", "* Create a new column in the table for the temperature ranges\n", "* Split the temperature value into 5 bins based on the ranges used by bicycle professionals\n", "* Add human-readable labels to the bins\n", "* Count the number of records in each bin\n", "* Sort the values by the temperature range (the index of the series)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Below Freezing 1454\n", "Freezing 1731\n", "Cold 1651\n", "Warm 2148\n", "Sizzling 1776\n", "Name: Temperature Range, dtype: int64" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "bike_share_data['Temperature Range'] = (\n", "    pd\n", "    .cut(\n", "        bike_share_data\n", "        ['Temperature(°C)'], \n", "        bins=[-18, 0, 8, 16, 24, 40],\n", "        labels=['Below Freezing', 'Freezing', 'Cold', 'Warm', 'Sizzling']\n", "    )\n", "    \n", ")\n", "(\n", "    bike_share_data\n", "    ['Temperature Range']\n", "    .value_counts()\n", "    .sort_index()\n", ")\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We see that the labels keep the order of the bins rather than alphabetical order, 
and the split is more balanced in the number of records.\n", "\n", "### Boxplot for each bin\n", "\n", "Now we can draw a box plot for each group of records, showing the median and the quartiles of the rental counts in each temperature range." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/opt/conda/lib/python3.7/site-packages/numpy/core/_asarray.py:102: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.\n", " return array(a, dtype, copy=False, order=order)\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "(\n", "    bike_share_data\n", "    [['Rented Bike Count','Temperature Range']]\n", "    .boxplot(by='Temperature Range')\n", ");" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## What is wrong with Histograms\n", "\n", "As popular and simple to plot as histograms are, they have several limitations:\n", "- They depend (too much) on the number of bins.\n", "- They depend (too much) on the variable’s maximum and minimum.\n", "- They make it hard to detect relevant values.\n", "- They make it hard to tell continuous variables from discrete ones.\n", "- They make it hard to compare distributions.\n", "\n", "We will use a couple of other plot options to see the data distributions more accurately.\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Kernel Density Estimator (KDE)\n", "\n", "This method estimates the probability density of the data and plots it as a smooth curve. It is part of the Pandas built-in plotting functions:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "(\n", "    bike_share_data\n", "    ['Rented Bike Count']\n", "    .plot\n", "    .kde(\n", "        title='Rented Bike Count',\n", "        grid=True\n", "    )\n", ");" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "(\n", "    bike_share_data\n", "    ['Temperature(°C)']\n", "    .plot\n", "    .kde(\n", "        title='Temperature(°C)', \n", "        grid=True\n", "    )\n", ");" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Cumulative Distribution Function (CDF)\n", "\n", "The second option is the empirical cumulative distribution function, which shows for each value the fraction of data points at or below it. A cumulative plot makes it easy to read off the value of each percentile and to calculate the percentage of data points between any two given values." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/opt/conda/lib/python3.7/site-packages/secretstorage/dhcrypto.py:16: CryptographyDeprecationWarning: int_from_bytes is deprecated, use int.from_bytes instead\n", " from cryptography.utils import int_from_bytes\n", "/opt/conda/lib/python3.7/site-packages/secretstorage/util.py:25: CryptographyDeprecationWarning: int_from_bytes is deprecated, use int.from_bytes instead\n", " from cryptography.utils import int_from_bytes\n", "Requirement already satisfied: statsmodels in /opt/conda/lib/python3.7/site-packages (0.11.0)\n", "Requirement already satisfied: scipy>=1.0 in /opt/conda/lib/python3.7/site-packages (from statsmodels) (1.4.1)\n", "Requirement already satisfied: numpy>=1.14 in /opt/conda/lib/python3.7/site-packages (from statsmodels) (1.20.3)\n", "Requirement already satisfied: pandas>=0.21 in /opt/conda/lib/python3.7/site-packages (from statsmodels) (1.3.5)\n", "Requirement already satisfied: patsy>=0.5 in /opt/conda/lib/python3.7/site-packages (from statsmodels) (0.5.1)\n", "Requirement already satisfied: python-dateutil>=2.7.3 in /opt/conda/lib/python3.7/site-packages (from pandas>=0.21->statsmodels) (2.8.1)\n", "Requirement already satisfied: pytz>=2017.3 in /opt/conda/lib/python3.7/site-packages (from pandas>=0.21->statsmodels) (2019.3)\n", 
"Requirement already satisfied: six in /opt/conda/lib/python3.7/site-packages (from patsy>=0.5->statsmodels) (1.14.0)\n", "\u001b[33mWARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv\u001b[0m\n", "Note: you may need to restart the kernel to use updated packages.\n" ] } ], "source": [ "%pip install statsmodels" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "from statsmodels.distributions.empirical_distribution import ECDF\n", "import matplotlib.pyplot as plt\n" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "ecdf = ECDF(bike_share_data['Temperature(°C)'])\n", "plt.plot(ecdf.x, ecdf.y)\n", "plt.grid(True)\n", "plt.title('Temperature(°C)'); " ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXQAAAEICAYAAABPgw/pAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjMsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+AADFEAAAgAElEQVR4nO3deXxU9b3/8dcnGwECYQ/7KooooARBa6viilqX1qXUSt21C+2vt5v23l6vte1trbe3t61ebbWKbVVcWi1arFsBtVzZZBFk37cQtoQEss/n98cc7BiTEMLMnMzk/Xw85sFZvnPmPSfhkzPfc+Z8zd0REZHUlxF2ABERiQ8VdBGRNKGCLiKSJlTQRUTShAq6iEiaUEEXEUkTKuiSdszsHDPbdgzPf8XMbgimbzSzd+KXTiRxVNClRcxsk5lVmFm5mRWZ2TQzy4vTtqeZ2Y/isa1Gtu9mdjDIvsfMnjazLofXu/vF7v5EAl43x8zuMbO1wetvMrPHzGxwvF+r3use0x84SR0q6HIsLnP3POAU4FTgeyHnORpjguxDga7APUl4zeeBy4HrgHxgDLAIOC8Jry1tgAq6HDN3LwJeJVrYATCzdmb2X2a2xcx2mdnDZtY+WHeOmW0zs2+ZWbGZ7TSzm4J1twNfAL4bHEG/FCzva2Z/MrPdZrbRzL4e81rtg6P6/Wb2AXDaUWQ/AMwARsZsb7aZ3dpQezO738zeMbP8YP5mM1sZvParZjaokeedD1wAXOHuC9y91t1L3f1Bd/9dzHucYWb7zGydmd0W8/yPfGqpf9QdHO1/28yWmVmpmT1jZrlm1hF4Begb7M9yM+vb3P0jqUUFXY6ZmfUHLgbWxSy+DzieaJE/DugH3B2zvjfRo9R+wC3Ag2bW1d1/CzwJ/Mzd89z9MjPLAF4ClgbtzwO+YWYXBdv6D2BY8LgIuOEosncFrgTePUK7DDN7BBgNXOjupWZ2JfCvwGeBnsDbwNONbOJ8YL67b23iZZ4GtgF9gauB/zSzozl6vxaYBAwJct7o7geJ/mx2BPszz913HMU2JYWooMuxeNHMyoCtQDHRwoqZGXAb8C/uvs/dy4D/BCbHPLcGuNfda9x9JlAOnNDI65wG9HT3e9292t03AI/EbO9a4MfBa20FftWM7O+ZWQmwBxgI/KaJttlEi203ot1Mh4LldwA/cfeV7l4bvMdTGjlK7w7sbOwFzGwA8EngTnevdPclwKPAlGa8l8N+5e473H0f0T+ApxzpCZJessIOICntSnd/w8zOBp4CegAlRI9WOwCLorUdAAMyY567NyiChx0CGjupOohol0FJzLJMokfEED2ijT3y3dyM7GPdfZ2ZZQNfAd42s5HuXtlA2+OI9nePd/fqerl+aWY/j1lmRD9F1M+wl+gnlsb0BQ7/8Yt9H+Oa8V4OK4qZPhRsU9oQHaHLMXP3OcA04L+CRXuACuAkd+8SPPKDk5DN2mS9+a3AxphtdXH3Tu5+SbB+JzAgpv3Ao8heQ/RIeAhwciPNVgI3Aa+YWeyniK3AHfVytXf3uQ1s4w1gfNA91ZAdQDcz61TvfWwPpg8S/SN5WO8m39hH6ZaqbYQKusTL/wAXmNkp7h4h2iXyCzPrBWBm/WL6vI9kF9GrTw6
bDxwwszuDE6CZZnaymR0++fks8D0z6xoUzK81N7SZZRIt1hXAhsbaufvTRPvL3zCzYcHih4PXPSnYVr6ZXdPI898AXgdeMLNCM8sys05m9iUzuznoKpoL/CQ4mTma6LmFJ4NNLAEuMbNuZtYb+EZz3yPR/dn98IlcSV8q6BIX7r4b+D3w78GiO4meJH3XzA4QPUJtrI+8vt8BI82sxMxedPc64DKifcIbiX4CeJToSVWAHxDtntgIvAb8oRmvsdTMyoH9RE+ifiboe27qPT4B3Av83cwGu/sLRE/+Tg/e43KiJyAbczUwE3gGKA3ajyO6bwA+DwwmerT+AvAf7v56sO4PRE8Kbwre4zPNeI+Hc68ieg5gQ7BP1RWTpkwDXIiIpAcdoYuIpAkVdBGRNKGCLiKSJlTQRUTSRGhfLOrRo4cPHjy4Rc89ePAgHTt2jG+gBFHWxEiVrKmSE5Q1UeKdddGiRXvcvWeDK909lEdhYaG31KxZs1r83GRT1sRIlaypktNdWRMl3lmBhd5IXVWXi4hImlBBFxFJEyroIiJpQgVdRCRNqKCLiKSJIxb0YBDbYjNb3sh6M7NfBUNmLTOzsfGPKSIiR9KcI/RpRIe1aszFwPDgcTvw0LHHEhGRo3XELxa5+1tmNriJJlcAvw+uj3zXzLqYWR93b3S4LRGRRHJ3auqcmroItXVOdV2E2kiEmlqnJhL5yPKa2gi1kaBNnVMXic7X1X94I8vqHAdib1zrwZgi7rBpUzXv1azhwwXAeScWMGZAl7i/72bdPjco6C+7+8dGdDGzl4Gfuvs7wfybRMdFXNhA29uJHsVTUFBQOH369BaFLi8vJy+vuYPfhEtZEyNVsqZKTmh51rqIUx2Bqjqnug6q6qA6mK6JODURqIlAbcSpjUTXV9VFp+siUOse/PvR+TrnwzZ1Hm1/uE11XR1ORnS9RzMcXlfn0UdrZcCUkTmcOzC7Rc+fOHHiIndvcGjCeHz13xpY1uDu9OiI7r8FGDdunJ9zzjktesHZs2fT0ucmm7ImRqpkbe05a+si7D1YTVFpJYvmLqR710HsKquivLKWipo6KmrqqKyu+3C6orqOypo6DgXLKmvqqDmG6pmdaWRnZpCVYeRkZZCVkUF2VnRZdkYG2TlGVkYGHTIzyAraZmcaJfv20qd3QfT5wXOyMjKCbUTbxU5/+Dox09nBNnOC18/KzCAnM4PMDCM708jIMDLNyMyIPrIyYpZlfnRdRjB2bmwxPDyc7pw5c5L2OxCPgr6Nj47n2J/oiCsiEjJ3Z095NVv3H2LtrjI27DnIjpJKdpZUsGjLfj72AX3xB2RlGHm5WbTPzqR9Tmb03+xM8tpl0SOvHe2zM+mQk0luvfW5OZl0iFnWLjuD3OxMcjIzaJcVLbA5WRnkZmWSl5tFVoYRM4j4UYn+oTz12HdQmolHQZ8BTDWz6cAEoFT95yLhOFBZw5ItJSzeUsJ7W/bz/vZS9h2s/nB9TmYGvfNz6ZOfyxVj+tIuK5OT++fTu3Mu29cu59PnfZJuHXLIyGhZoZVwHbGgm9nTwDlADzPbBvwHkA3g7g8THSPxEqLjRx4iOuCuiCRYJOJ8sPMA727Yy9JtpazdVcba4nLqIo4ZHN+rE+ef2IsT+3RmYLcODOuZx8BuHRot1rOLV9Ijr12S34XEU3Oucvn8EdY78NW4JRKRBpVX1bJ13yHmb9zH3PV7mLdxHyWHagDo16U9xxfkccHIAk4f2p3R/fPplNuyk26SukK7H7qINMzdWb+7nIWb9vPelv2s332QDbvL2R8Ub4D+Xdtz4cgCzhjWnTOG9qB3fm6IiaW1UEEXCVlNXYT3t5cyf+M+5m3Yy+KtJR8eeXftkM0JvTtx4cjeDOnZkT75uYwd2JUB3TqEnFpaIxV0kSQrOVTNqyuKWFVUxqqdZSzZWkJFTR0Aw3p25KKRvRnVP59PDOvOkB4dW3wliLQ9KugiCVZVW8fc9XtZuGkfizbv590
N+z5cN2ZAFz532gDGD+nGaYO70bOTTkpKy6mgiyRAZU0d76zdw6NLK/narDcoq6olM8M4qW9nbjpzMJeM6kPhwK66PFDiSgVdJE52llbwxspiZq8q5h/r91BZE6FjNlw6pj8XjuzNJ47rTocc/ZeTxNFvl8gxOFRdy9tr9zDz/Z38ddlOaiPOgG7t+dy4AZwzohc121Zw4Xljwo4pbYQKushRcncWby3hodnreWvNbqpqI3TOzeILEwZy/emDOK5X3ocnMmfv/CDktNKWqKCLNENVbR3zN+7jzZXF/H1VMVv2HSI3O4OJJ/RiyhmDGD+kG9mZGgBMwqWCLtKIHSUVzFmzm7fW7OadtXsoq6qlXVYGZx7Xgy+dPYzLxvTRtzGlVVFBFwmUV9WyeMt+5qzezZw1u1lbXA5A3/xcLh3dhwtGFvCJYT1on5MZclKRhqmgS5sViUT7wl9auoMlW0tYtq2EiEfvSDhhaDc+d9oAzj6+50f6xEVaMxV0aXM27jnI6x8U8dS8LWzae4isDOOUAV346sTjGDuoKxOGdNPlhZKS9FsrbUJRaSVPzdvMy8t2smHPQQBOGdCFn587nAtOKqCz+sIlDaigS9rauu8Qb67cxYtLol0qAKcP7cYNnxjMuSN66QZXknZU0CWtVNdGeGHxNqbN3czKnQcAGNqzI984fzhXnNKPIT06hpxQJHFU0CUtrN9dzuP/2MjM94vYd7Ca43rl8f1LT+T8EwsYrCIubYQKuqQsd+e9LSU8MXcTM5ZGxyUf1S+fn187hnOO76krU6TNUUGXlLP/YDV/WVfNL5b/g6XbSsnJyuCGMwbx1YnH0auzRu6RtksFXVJGUWkl0xds4Xdvb6SsqpbuHSu457KRXFXYX9/YFEEFXVJA8YFK7n91NS8s3k5txDnr+J6c1bWMWz9zXtjRRFoVFXRptXYdqOTnr63m+UXbcOC68QO59VNDGdKjI7Nnzw47nkiro4Iurc47a/fw3KKtvLK8iEjEuf70QXzxjMEc1ysv7GgirZoKurQK0StW9vPA39cxa/VuOuVmce24/tzyyaG6dlykmVTQJVSRiPPMwq38/LU17CmvIjPD+M5FJ3DLJ4eQm627GoocDRV0CcXBqlpeWrqDR97ewPrdBz8cPPn6CYPI76ArVkRaQgVdkqrkUDX/O3s9T83bQnlVLYO7d+C+q0Zx7bgB+iKQyDFSQZekcHcenrOBh2avo6yqlotP7s1NZw5h3KCuKuQicaKCLgl3oLKGO59fxivLizj7+J58d9IJnNQ3P+xYImlHBV0Sxt15duFWfvXmOnaUVvD184bzjfOGk5GhI3KRRFBBl4TYW17F3TNW8NdlOxnVL5+fXjWKTw3vGXYskbTWrIJuZpOAXwKZwKPu/tN66wcCTwBdgjZ3ufvMOGeVFODu/GXJDn7w0goOVNbyzQuO52vnHqd+cpEkOGJBN7NM4EHgAmAbsMDMZrj7BzHNvg886+4PmdlIYCYwOAF5pRWrqK7jy08uYvbq3YwZ0IUfXXEyo/qrr1wkWZpzhD4eWOfuGwDMbDpwBRBb0B3oHEznAzviGVJav017DvKlPy5iVVEZN585hH+9ZARZmRlhxxJpU8zdm25gdjUwyd1vDeanABPcfWpMmz7Aa0BXoCNwvrsvamBbtwO3AxQUFBROnz69RaHLy8vJy0uN+3q0haybSuv470VV1Llz88ntKCxI/KmZVNmvqZITlDVR4p114sSJi9x9XIMr3b3JB3AN0X7zw/NTgF/Xa/NN4FvB9BlEj94zmtpuYWGht9SsWbNa/NxkS/esL7y3zU/4/kwf/+PXfU3RgfiHakSq7NdUyemurIkS76zAQm+krjbnUGobMCBmvj8f71K5BZgU/IH4PzPLBXoAxc3YvqQgD74odN/fVjGmfz4PXV9I3y7tw44l0qY1p6AvAIab2RBgOzAZuK5emy3AecA0MzsRyAV2xzOotB6b9x7k7r+sYM6a3Zw3ohc
PXDeW9jm6kZZI2I5Y0N291symAq8SvSTxMXdfYWb3Ej30nwF8C3jEzP6F6AnSG4OPBpJmZq0q5itPvocZ3HXxCO44a6guSRRpJZp19sqj15TPrLfs7pjpD4Az4xtNWpt5G/bylSffY1D3Djx6wzj6d+0QdiQRiaFvikqzPDF3E/e+/AF9u+Ty+5vH06tzbtiRRKQeFXRpUm1dhB/9dSXT5m7ipL6defzG01TMRVopFXRp0k9fWcW0uZuYfNoAfnDFSbTL0slPkdZKBV0a9cs31vLoOxuZfNoAfnrV6LDjiMgRqKDLxxyqruU7zy3jr+/v5JJRvfnhlSeHHUlEmkEFXT6isqaOGx9bwPxN+7jj7KF896IRZOr+5SIpQQVdPlRd59zxh0XM37SPn352FJPHDww7kogcBRV0AWB3WRX3L6hkbckh7rlspIq5SApSQRf2lFfx+UfeZVNphP++dgyfHds/7Egi0gK6YXUbV1ZZw02PL2Dz3oN8fWw7FXORFKYj9DasorqOW55YyPvbS3ngulPJ27cm7Egicgx0hN5GVddG+NrTi5m/cR8/uvJkPj26b9iRROQY6Qi9Ddq67xBfe3oxS7aW8J2LTuD60weFHUlE4kAFvY0prajhhsfms62kgvuuGsXnTtPVLCLpQgW9DamsqePLf1zEln2HmHbTeD45vEfYkUQkjlTQ24hIxPn2c0uZu34v9101SsVcJA3ppGgbUBdx7vrzMl5etpOpE49TN4tImtIReppbufMA//7ichZu3s9NZw7mWxceH3YkEUkQFfQ0NmPpDr77/FJyszP58WdO5gsTdDWLSDpTQU9D1bURfv76an4zZwNj+ufzyA3j6NVJowyJpDsV9DRTUxfhtt8vZM6a3Xz21H7cd/VosjN1qkSkLVBBTyPlVbV89cn3mLNmN/92yYncdtbQsCOJSBKpoKeJHSUV3DxtAat3lfGDy0/ihk8MDjuSiCSZCnoa2LTnIF98bD5FpZX8cvKpXD5G92URaYtU0FPc6qIyrnpoLgb8/pbxnD60e9iRRCQkKugpbHdZFTdPW0BOVgZ/+vInGNKjY9iRRCREuvwhRVXW1HHr7xey60Alv5lSqGIuIjpCT0Xuzl1/WsbSrSX8z+dO4bTB3cKOJCKtgI7QU9Dv3tnIi0t28NWJw7jy1H5hxxGRVkJH6CnE3fnJK6v47VsbOHdEL/7lfN2XRUT+qVlH6GY2ycxWm9k6M7urkTbXmtkHZrbCzJ6Kb0ypqYvwrWeX8tu3NnDpqD48dP1YsvQNUBGJccQjdDPLBB4ELgC2AQvMbIa7fxDTZjjwPeBMd99vZr0SFbituvNPy/jz4u3ccfZQ7rxoBBkZFnYkEWllmnOINx5Y5+4b3L0amA5cUa/NbcCD7r4fwN2L4xuzbXtp6Q7+/N52vnLOML538Ykq5iLSIHP3phuYXQ1Mcvdbg/kpwAR3nxrT5kVgDXAmkAnc4+5/a2BbtwO3AxQUFBROnz69RaHLy8vJy8tr0XOT7VizLtpVy/8uqWJIfgZ3js8lO4HFvC3t12RJlZygrIkS76wTJ05c5O7jGlzp7k0+gGuAR2PmpwC/rtfmZeAFIBsYQrRrpktT2y0sLPSWmjVrVoufm2zHkvW9zfv8hO/P9It+McdLK6rjF6oRbWW/JlOq5HRX1kSJd1ZgoTdSV5vT5bINGBAz3x/Y0UCbv7h7jbtvBFYDw5v150YatGnPQab8bj757bP5zZRCOudmhx1JRFq55hT0BcBwMxtiZjnAZGBGvTYvAhMBzKwHcDywIZ5B25Kq2jq+Pn0xBjx7xxkM6q5vgYrIkR2xoLt7LTAVeBVYCTzr7ivM7F4zuzxo9iqw18w+AGYB33H3vYkKnc4iEee7zy9j2bZS7rn8JBVzEWm2Zn2xyN1nAjPrLbs7ZtqBbwYPOQYPzVnPX4JvgV5V2D/sOCKSQvTNlFZk1qpi7n91NRedVMC3Lzwh7DgikmJU0FuJt9bs5qt
Pvceg7h24/5oxmOlacxE5OirorcDSrSXc+sRCBnbrwLN3nKErWkSkRVTQQ7Z2Vxk3T1tA147ZPH7TaRR0zg07koikKN1tMUSri8q4/IF3qI04M7/+Kfrktw87koikMBX0kJRW1HDT4/Npl5XBH244jRN6dwo7koikOBX0ELg7339xOTsPVDL9ttMZP0QjDonIsVMfegh++eZaXlq6g6kTj2PC0O5hxxGRNKGCnmSvf7CL/3ljLZNO6s03L9CIQyISPyroSbRxz0G+/dxShvboyM+uGa1rzUUkrlTQk+Qf6/Zw5YP/IBJxfvX5U3WtuYjEnU6KJsHqojK+8uR7dOuYw+M3nsbgHrrhlojEn47QE2z3oQjXPfIuEXcevWGcirmIJIyO0BNoy95D/HR+JdVk8vRtpzOsZ2oMmSUiqUlH6AmyrriMyb/9Pw7WONNuGs/J/fLDjiQiaU4FPQGKSiu5adoCKmrquHN8LoWDuoYdSUTaABX0ONu89yCXPfAOe8qqefj6QobkZ4YdSUTaCBX0OCqrrOGmxxdQUV3Hk7dN0LdARSSpdFI0Tiqq65j61GI27DnII18cx9iB6mYRkeTSEXocuDv/+sL7zFmzmx9eeTIXjCwIO5KItEEq6HEwbe4mXli8ndvPGsqU0weFHUdE2igV9GO0qugAP5m5irOO78l3L9LAziISHhX0Y1BWWcPUpxbToV0m9189mqxM7U4RCY9Oih6D7z6/jPW7y3n8Ro0FKiLh0yFlC83bsJdXlhdx4ycGc84JvcKOIyKigt4SpRU1fOf5ZXTvmMM3ztMgFSLSOqjLpQV+8foatuw7xB9vmUB+B93XXERaBx2hH6V1xWX88d3NXDqqD58c3iPsOCIiH1JBPwruzk9mriIjw/jeJSPCjiMi8hEq6Efhkbc38OaqYr51wfH079oh7DgiIh+hgt5Muw5Ucv+rqzn7+J7c9qmhYccREfmYZhV0M5tkZqvNbJ2Z3dVEu6vNzM1sXPwitg73v7qamjrn+5eeSEaGhR1HRORjjljQzSwTeBC4GBgJfN7MRjbQrhPwdWBevEOGbfn2Up5ftI1rx/VneEGnsOOIiDSoOUfo44F17r7B3auB6cAVDbT7IfAzoDKO+VqF/5y5ks65Wdw5SSdCRaT1MndvuoHZ1cAkd781mJ8CTHD3qTFtTgW+7+5Xmdls4NvuvrCBbd0O3A5QUFBQOH369BaFLi8vJy8vOQMur9xbx30LKrn2hGwuGZJz1M9PZtZjpazxlyo5QVkTJd5ZJ06cuMjdG+7WdvcmH8A1wKMx81OAX8fMZwCzgcHB/Gxg3JG2W1hY6C01a9asFj/3aEQiEb/moble+MPX/GBVTYu2kays8aCs8ZcqOd2VNVHinRVY6I3U1eZ0uWwDBsTM9wd2xMx3Ak4GZpvZJuB0YEY6nBidvXo38zft40tnD6NDjr5UKyKtW3MK+gJguJkNMbMcYDIw4/BKdy919x7uPtjdBwPvApd7A10uqcTdue9vqxjYrQPXa9AKEUkBRyzo7l4LTAVeBVYCz7r7CjO718wuT3TAsCzYtJ9VRWXc+qkh5GZnhh1HROSImtWP4O4zgZn1lt3dSNtzjj1W+H771gY652bx2bH9w44iItIs+qZoA1bsKOWNlbu4/vRB5LVT37mIpAYV9AY88Pd15LXL4o6zhoUdRUSk2VTQ69leUsHfVhRxdWF/3etcRFKKCno90/6xEQNu+eSQsKOIiBwVFfQYFdV1PL9oGxNP6MWAbro9roikFhX0GC8t28H+QzVcf4auOxeR1KOCHuPZBVsZ2qMj5xzfM+woIiJHTQU9sOtAJQs37+fTo/tgpvudi0jqUUEPPL9oGwCXju4bchIRkZZRQQciEeepeVuYMKQbJ/TWABYikppU0IG31+1he0kFk8cPOHJjEZFWSgUdePLdzbTPzuTCkb3DjiIi0mJtvqCXVtQwe/Vurjy1Hx113xYRSWFtvqA/u2Ar1XURrhs
/MOwoIiLHpE0XdHfnj/M2M6pfPqP654cdR0TkmLTpgr62uJzNew/xmVP7hR1FROSYtemC/sr7RQBcMLIg5CQiIseuzRZ0d+fPi7cxun++bsQlImmhzRb0VUVlbN57iGvH6dpzEUkPbbagv7I82t0ycUSvkJOIiMRHmyzolTV1PLNgC2ce151+XdqHHUdEJC7aZEF/btE2dh2o0qhEIpJW2mRBf2reFob3ymPiCepuEZH00eYK+rriclbuPMBnx/bXfc9FJK20uYL+9trdAFx4kq49F5H00uYK+hsrdzG4eweG9cwLO4qISFy1qYK+o6SCf6zbq2+GikhaalMF/YXF2wH4zKn9Q04iIhJ/baqgv7aiiJF9OjOyb+ewo4iIxF2bKeglh6pZtr2UiSN6hh1FRCQhmlXQzWySma02s3VmdlcD679pZh+Y2TIze9PMBsU/6rGZs2Y37ujacxFJW0cs6GaWCTwIXAyMBD5vZiPrNVsMjHP30cDzwM/iHfRYvf7BLvLbZzNmQJewo4iIJERzjtDHA+vcfYO7VwPTgStiG7j7LHc/FMy+C7Sqs477Dlbz2opdXDamD9mZbaaXSUTaGHP3phuYXQ1Mcvdbg/kpwAR3n9pI+weAInf/UQPrbgduBygoKCicPn16i0KXl5eTl9f868j/trGG6aurueeMXAbnZ7boNVvqaLOGSVnjL1VygrImSryzTpw4cZG7j2twpbs3+QCuAR6NmZ8C/LqRttcTPUJvd6TtFhYWekvNmjXrqNpf8/Bcn/hfR/eceDnarGFS1vhLlZzuypoo8c4KLPRG6mpz+h+2AbGjQPQHdtRvZGbnA/8GXO7uVc39a5NoxWWVzN+4j0tO7hN2FBGRhGpOQV8ADDezIWaWA0wGZsQ2MLNTgd8QLebF8Y/Zci8v3QnAp8eooItIejtiQXf3WmAq8CqwEnjW3VeY2b1mdnnQ7H4gD3jOzJaY2YxGNpdUkYjz1PwtDOvZkRG99WUiEUlvWc1p5O4zgZn1lt0dM31+nHPFxVtrd7OuuJwff+bksKOIiCRcWl/D98d3N9O9Yw5XF7aqqyhFRBIibQt6eVUtb6ws5tLRfWiXldxLFUVEwpC2Bf2peZsBuOKUviEnERFJjrQs6JGI88TczZzUtzOFg7qFHUdEJCnSsqC/u3Ev20squPVTQ8KOIiKSNGlZ0F9etpPc7AwuGNk77CgiIkmTdgX9YFUtLy3ZwbkjepHXrllXZYqIpIW0K+ivLC+irKqWL54xOOwoIiJJlVYFvS7iPPr2Bnp3zmX8YJ0MFZG2Ja0K+svLdrCqqIxvXnA8GRkWdhwRkaRKq4L+12U76dYxh6v0zVARaYPSpqAXH6hk1upiLh/Tl0wdnYtIG5Q2Bf2lZTupqXM+d9qAIzcWEUlDaVHQ3Z3nFm7lxD6dGdG7U9hxRERCkRYFfV1xOauKyvj06D6Yqe8Wv3cAAAeCSURBVLtFRNqmtCjof1teBOhGXCLStqV8QY9EnGcWbuWUAV3o37VD2HFEREKT8gV93sZ9bNtfwXXjB4YdRUQkVClf0Ge+v5OcrAwuHqUbcYlI25bSBb2mLsKMpTu44MQCOuVmhx1HRCRUKV3Q567fS2lFDZeM6hN2FBGR0KV0QX9u4Vby22dz3om9wo4iIhK6lC3oBypreHNlMRedVEButgaBFhFJ2YL+3MJtVNTUcXWhvuovIgIpXdC3MqxnR8YP0X3PRUQgRQv6st21rCoq4wsTBoUdRUSk1Ui5gl5cVsm0FdX069Ke6yboy0QiIoel3CjKM5bsYF+l88BnR+hkqIhIjJQ7Qq+LOADnjtCliiIisVKuoIuISMNU0EVE0kSzCrqZTTKz1Wa2zszuamB9OzN7Jlg/z8wGxzuoiIg07YgF3cwygQeBi4GRwOfNbGS9ZrcA+939OOAXwH3xDioiIk1rzhH6eGCdu29w92pgOnBFvTZXAE8E088D55nGghMRSSpz96YbmF0NTHL3W4P
5KcAEd58a02Z50GZbML8+aLOn3rZuB24HKCgoKJw+ffpRB35vVy1vb6nky2M7kpPZ+v9mlJeXk5eXF3aMZlHW+EuVnKCsiRLvrBMnTlzk7uMaXOnuTT6Aa4BHY+anAL+u12YF0D9mfj3QvantFhYWekvNmjWrxc9NNmVNjFTJmio53ZU1UeKdFVjojdTV5nS5bANi74DVH9jRWBszywLygX3N+WsjIiLx0ZyCvgAYbmZDzCwHmAzMqNdmBnBDMH018PfgL4mIiCTJEb/67+61ZjYVeBXIBB5z9xVmdi/RQ/8ZwO+AP5jZOqJH5pMTGVpERD6uWfdycfeZwMx6y+6Oma4k2tcuIiIh0TdFRUTShAq6iEiaUEEXEUkTKugiImniiN8UTdgLm+0GNrfw6T2APUds1Tooa2KkStZUyQnKmijxzjrI3Xs2tCK0gn4szGyhN/bV11ZGWRMjVbKmSk5Q1kRJZlZ1uYiIpAkVdBGRNJGqBf23YQc4CsqaGKmSNVVygrImStKypmQfuoiIfFyqHqGLiEg9KugiImki5Qr6kQasDiHPJjN738yWmNnCYFk3M3vdzNYG/3YNlpuZ/SrIvszMxiY422NmVhyMKHV42VFnM7MbgvZrzeyGhl4rQVnvMbPtwb5dYmaXxKz7XpB1tZldFLM84b8fZjbAzGaZ2UozW2Fm/y9Y3qr2bRM5W91+NbNcM5tvZkuDrD8Ilg+x6MDzay06EH1OsLzRgekbew9JyDrNzDbG7NdTguXJ+/k3NvJFa3wQvX3vemAokAMsBUaGnGkT0KPesp8BdwXTdwH3BdOXAK8ABpwOzEtwtrOAscDylmYDugEbgn+7BtNdk5T1HuDbDbQdGfzs2wFDgt+JzGT9fgB9gLHBdCdgTZCpVe3bJnK2uv0a7Ju8YDobmBfsq2eBycHyh4EvB9NfAR4OpicDzzT1HpKUdRpwdQPtk/bzT7Uj9OYMWN0axA6a/QRwZczy33vUu0AXM+uTqBDu/hYfHznqaLNdBLzu7vvcfT/wOjApSVkbcwUw3d2r3H0jsI7o70ZSfj/cfae7vxdMlwErgX60sn3bRM7GhLZfg31THsxmBw8HziU68Dx8fJ82NDB9Y+8hGVkbk7Sff6oV9H7A1pj5bTT9C5oMDrxmZossOgg2QIG774TofyqgV7C8NeQ/2mxhZ54afEx97HAXRhOZkp41+Kh/KtGjtFa7b+vlhFa4X80s08yWAMVEi9t6oMTdaxt43Q8zBetLge5hZXX3w/v1x8F+/YWZtauftV6muGdNtYJuDSwL+7rLM919LHAx8FUzO6uJtq0x/2GNZQsz80PAMOAUYCfw82B5q8hqZnnAn4BvuPuBppo2sCxpeRvI2Sr3q7vXufspRMctHg+c2MTrtqqsZnYy8D1gBHAa0W6UO5OdNdUKenMGrE4qd98R/FsMvED0F3HX4a6U4N/ioHlryH+02ULL7O67gv84EeAR/vnROfSsZpZNtEg+6e5/Dha3un3bUM7WvF+DfCXAbKL9zV0sOvB8/ddtbGD6sLJOCrq43N2rgMcJYb+mWkFvzoDVSWNmHc2s0+Fp4EJgOR8dNPsG4C/B9Azgi8FZ79OB0sMf0ZPoaLO9ClxoZl2Dj+YXBssSrt75hc8Q3beHs04OrnQYAgwH5pOk34+gr/Z3wEp3/++YVa1q3zaWszXuVzPraWZdgun2wPlE+/xnER14Hj6+TxsamL6x95DorKti/pgb0b7+2P2anJ//sZxRDeNB9IzxGqL9a/8WcpahRM+oLwVWHM5DtC/vTWBt8G83/+fZ8QeD7O8D4xKc72miH6lriB4N3NKSbMDNRE8urQNuSmLWPwRZlgX/KfrEtP+3IOtq4OJk/n4AnyT60XgZsCR4XNLa9m0TOVvdfgVGA4uDTMuBu2P+j80P9s9zQLtgeW4wvy5YP/RI7yEJWf8e7NflwB/555UwSfv566v/IiJpItW6XEREpBEq6CIiaUIFXUQkTai
gi4ikCRV0EZE0oYIuIpImVNBFRNLE/wfwh0WcCaYKjAAAAABJRU5ErkJggg==\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "ecdf = ECDF(bike_share_data['Rented Bike Count'])\n", "plt.plot(ecdf.x, ecdf.y)\n", "plt.grid(True)\n", "plt.title('Rented Bike Count'); " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Group by Quantiles using _qcut_\n", "\n", "When binning a target score such as grades or number of rentals, it often makes sense to split by the quantiles of the scores, so that each group holds roughly the same number of records. " ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Low 2194\n", "Medium 2186\n", "High 2190\n", "Very High 2190\n", "Name: Usage Level, dtype: int64" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "bike_share_data['Usage Level'] = (\n", "    pd\n", "    .qcut(\n", "        bike_share_data['Rented Bike Count'], \n", "        q=4,\n", "        labels=['Low', 'Medium', 'High', 'Very High']\n", "    )\n", ")\n", "(\n", "    bike_share_data\n", "    ['Usage Level']\n", "    .value_counts()\n", "    .sort_index()\n", ")" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/opt/conda/lib/python3.7/site-packages/numpy/core/_asarray.py:102: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.\n", " return array(a, dtype, copy=False, order=order)\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "(\n", "    bike_share_data\n", "    [['Temperature(°C)','Usage Level']]\n", "    .boxplot(by='Usage Level')\n", ");" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Heat map with the new value group\n", "\n", "Now that we have the new usage categories, we can create a heat map showing which hours of the day are the peak hours. These hours could be assigned higher price tiers." ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [], "source": [ "import seaborn as sns" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "sns.heatmap(\n", " bike_share_data\n", " [['Usage Level','Hour','Date']]\n", " .pivot_table(\n", " index='Hour', \n", " columns='Usage Level',\n", " aggfunc='count'\n", " )\n", " .droplevel(0, axis='columns'),\n", " cmap=\"YlOrRd\"\n", ");" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "instance_type": "ml.t3.medium", "kernelspec": { "display_name": "python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.10" } }, "nbformat": 4, "nbformat_minor": 4 }