{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Linear Programming (Optimization) \n", "\n", "In this example, we will use an common analysis that is used to optimize decision based on constrains, and it is based on the great blog post [How to Solve Optimization Problems with Python](https://towardsdatascience.com/how-to-solve-optimization-problems-with-python-9088bf8d48e5)\n", "\n", "[](https://studiolab.sagemaker.aws/import/github/aiola-lab/from-excel-to-pandas/blob/master/notebooks/08.03_Linear_optimization.ipynb)" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "import warnings\n", "warnings.filterwarnings('ignore')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Loading the data\n", "\n", "We will get the data from HTML table that are available from [Rotoguru](http://rotoguru.net), that is gather stats data from fantasy league. We will focus on the data of the NBA players. To keep the consistency of the notebook, we will check the data for a specific week ('2021-03-02')" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "year = 2021\n", "month = 3\n", "day = 2" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "data_url = f'http://rotoguru1.com/cgi-bin/hyday.pl?mon={month}&day={day}&year={year}&game=fd'\n", "dfs = (\n", " pd\n", " .read_html(data_url)\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There are a few HTML table in the page, and after some scorlling we can see that the table with the players data is the 6th one (index=5):\n", "* Start with the dataframe at index 5 of the list of table from the HTML page above\n", "* Rename the column to be: Position, Name,FD Points, Salary, Team, Opp., Score, Min, Stats\n", "* Filter out rows with Position string longer than 2 characters (header lines)\n", "* Add a column called _Points_ with the float value of the points to be used as the target of the optimization \n", "* Remove the dollar sign and commas (\\[$,\\]) from the Salary column and convert it to its integer value\n" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", " | Position | \n", "Name | \n", "FD Points | \n", "Salary | \n", "Team | \n", "Opp. | \n", "Score | \n", "Min | \n", "Stats | \n", "position_length | \n", "Points | \n", "
---|---|---|---|---|---|---|---|---|---|---|---|
2 | \n", "PG | \n", "Morant, Ja^ | \n", "55 | \n", "7100 | \n", "mem | \n", "@ was | \n", "125-111 | \n", "33:02 | \n", "35pt 5rb 10as 1st 4to 2trey 11-18fg 11-14ft | \n", "2 | \n", "55.0 | \n", "
3 | \n", "PG | \n", "Westbrook, Russell^ | \n", "53.7 | \n", "9600 | \n", "was | \n", "v mem | \n", "111-125 | \n", "35:31 | \n", "23pt 6rb 15as 3st 8to 3trey 8-16fg 4-9ft | \n", "2 | \n", "53.7 | \n", "
4 | \n", "SG | \n", "George, Paul^ | \n", "45 | \n", "7900 | \n", "lac | \n", "@ bos | \n", "112-117 | \n", "38:39 | \n", "32pt 5rb 4as 1st 2to 5trey 12-26fg 3-3ft | \n", "2 | \n", "45.0 | \n", "
5 | \n", "SG | \n", "DeRozan, DeMar^ | \n", "43.3 | \n", "8200 | \n", "sas | \n", "v nyk | \n", "119-93 | \n", "29:44 | \n", "10pt 4rb 11as 4st 2-5fg 6-6ft | \n", "2 | \n", "43.3 | \n", "
6 | \n", "PG | \n", "Murray, Jamal^ | \n", "41.6 | \n", "8500 | \n", "den | \n", "@ mil | \n", "128-97 | \n", "37:59 | \n", "24pt 3rb 6as 2st 1to 1trey 10-17fg 3-3ft | \n", "2 | \n", "41.6 | \n", "
... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "
199 | \n", "C | \n", "Robinson, Mitchell | \n", "0 | \n", "4500 | \n", "nyk | \n", "@ sas | \n", "93-119 | \n", "NaN | \n", "NaN | \n", "1 | \n", "0.0 | \n", "
200 | \n", "C | \n", "Bryant, Thomas | \n", "0 | \n", "6200 | \n", "was | \n", "v mem | \n", "111-125 | \n", "NaN | \n", "NaN | \n", "1 | \n", "0.0 | \n", "
201 | \n", "C | \n", "Leonard, Meyers | \n", "0 | \n", "3500 | \n", "mia | \n", "v atl | \n", "80-94 | \n", "NaN | \n", "NaN | \n", "1 | \n", "0.0 | \n", "
202 | \n", "C | \n", "Gasol, Marc | \n", "0 | \n", "4800 | \n", "lal | \n", "v pho | \n", "104-114 | \n", "NaN | \n", "NaN | \n", "1 | \n", "0.0 | \n", "
203 | \n", "C | \n", "Oturu, Daniel | \n", "0 | \n", "3500 | \n", "lac | \n", "@ bos | \n", "112-117 | \n", "NaN | \n", "NaN | \n", "1 | \n", "0.0 | \n", "
198 rows × 11 columns
\n", "\n", " | \n", " | \n", " | Salary | \n", "
---|---|---|---|
Name | \n", "Position | \n", "FD Points | \n", "\n", " |
Barton, Will^ | \n", "SF | \n", "35.1 | \n", "4900 | \n", "
George, Paul^ | \n", "SG | \n", "45 | \n", "7900 | \n", "
James, LeBron^ | \n", "SF | \n", "59 | \n", "10800 | \n", "
Jokic, Nikola^ | \n", "C | \n", "69.5 | \n", "11000 | \n", "
Lyles, Trey^ | \n", "PF | \n", "33 | \n", "4000 | \n", "
Melton, De'Anthony | \n", "PG | \n", "38.3 | \n", "4600 | \n", "
Morant, Ja^ | \n", "PG | \n", "55 | \n", "7100 | \n", "
Quickley, Immanuel | \n", "SG | \n", "37.8 | \n", "5100 | \n", "
Saric, Dario | \n", "PF | \n", "32.5 | \n", "4100 | \n", "
Total | \n", "\n", " | \n", " | 59500 | \n", "