Welcome to the tutorial “from-Excel-to-Pandas” for data analysis

This is a guide for data analysts who are fluent in Excel and want to move their analyses to Python, mainly using the Pandas library. This tutorial does NOT teach programming in Python; it focuses on the functions to apply in sequence to reach the result of an analysis. This simple multi-step flow of functions makes an analysis easier to understand and follow, and easier to rerun consistently in the future and across the organization.
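
As a rough illustration of such a multi-step flow, the sketch below chains a few Pandas functions, one per line. The data and the column names (`region`, `units`, `unit_price`) are made up for this example and are not taken from the tutorial's notebooks.

```python
import pandas as pd

# Hypothetical sales data, invented for illustration only
sales = pd.DataFrame({
    "region": ["North", "South", "North", "East", "South"],
    "units": [10, 4, 7, 3, 8],
    "unit_price": [2.5, 4.0, 2.5, 10.0, 4.0],
})

# Each line is one step of the analysis, read from top to bottom
summary = (
    sales
    .assign(revenue=lambda df: df["units"] * df["unit_price"])  # add a calculated column
    .query("units >= 4")                                        # keep rows with at least 4 units
    .groupby("region", as_index=False)["revenue"].sum()         # subtotal revenue per region
    .sort_values("revenue", ascending=False)                    # largest revenue first
    .reset_index(drop=True)                                     # renumber rows after sorting
)

print(summary)
```

Each step plays the role of an action you might otherwise perform by hand in a spreadsheet (adding a formula column, filtering, subtotaling, sorting), but the whole sequence is written down and can be rerun unchanged on new data.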

The tutorial is based on a set of Jupyter notebooks that demonstrate how to implement the main functions available in Microsoft Excel with their Pandas equivalents. You can jump directly to a specific section by searching for the Excel function name, or read through the tutorial to learn the basic and advanced topics. The chapters are ordered by increasing complexity, and later chapters assume the knowledge covered in earlier ones.
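
To give a flavor of what an "Excel function with its Pandas equivalent" can look like, here is a minimal sketch. The pairings (VLOOKUP with `merge`, SUMIF and COUNTIF with a boolean filter) are loose analogies chosen for this example, and the tables and column names are hypothetical; the notebooks may present these functions differently.

```python
import pandas as pd

# Hypothetical tables, invented for illustration only
orders = pd.DataFrame({
    "product_id": [1, 2, 1, 3],
    "quantity": [5, 2, 1, 4],
})
products = pd.DataFrame({
    "product_id": [1, 2, 3],
    "product_name": ["Widget", "Gadget", "Gizmo"],
})

# VLOOKUP-like: look up product_name for every order row
orders_named = orders.merge(products, on="product_id", how="left")

# SUMIF-like: total quantity where product_name is "Widget"
is_widget = orders_named["product_name"] == "Widget"
widget_quantity = orders_named.loc[is_widget, "quantity"].sum()

# COUNTIF-like: number of orders with quantity of 4 or more
large_orders = (orders_named["quantity"] >= 4).sum()

print(orders_named)
print(widget_quantity, large_orders)
```

A left merge keeps every row of `orders`, which is close to how a VLOOKUP column behaves when it is filled down next to the original data.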

Tip

Run the notebooks in an interactive environment to learn more effectively.

“Tell me and I forget, teach me and I may remember, involve me and I learn.” - Xunzi

Main Chapters

  • Getting Started (if you also want to experiment with the code)

    • Installing locally (Python, Jupyter, Git)

    • Using Amazon SageMaker

    • Using Google Colab

    • Using Binder

  • Loading Data

    • From Excel and CSV files

    • From HTML sources

    • From APIs

  • Table Summary

    • Statistics (describe, info, head)

    • Totals and Unique values

  • Adding Columns to Tables

    • Using built-in functions

    • Using custom functions

  • Grouping and Pivot Tables

    • Using pivot_table

    • Using groupby

  • Joining and Merging Tables and Lists

    • Using merge

  • Adding Charts

    • Using plot

    • Using plotly and seaborn

    • Extending plots with Matplotlib

  • Industries

    • Financial Analysis (Stocks, for example)

    • Manufacturing (Market Share, for example)

    • Logistics (Scheduling, for example)

    • Planning (Linear Programming, for example)