3.2.2.3. Excel#

Microsoft Excel is one of the oldest and most widely used spreadsheet tools. While Excel files may resemble CSV files visually, they are far more powerful in functionality.

Unlike CSVs, Excel files are binary file formats (not plain text) and typically have extensions like .xls or .xlsx. An Excel file can:

  • Contain multiple sheets

  • Support formulas, charts, and formatting

  • Include macros (VBA) for automation

Due to its widespread use in non-technical and business environments, datasets are often shared in Excel format.

We can read and write Excel files using Pandas. To do so, you’ll need to install an Excel engine like openpyxl:

pip install openpyxl

Reading an Excel File#

Suppose we have an Excel file named example.xlsx and want to read a sheet called "Sheet1":

import pandas as pd

# Read a specific sheet from the Excel file
df = pd.read_excel("example.xlsx", sheet_name="Sheet1")
df.head()
Name Age Department Salary
0 Alice 30 Engineering 85000
1 Bob 25 Marketing 62000
2 Charlie 28 Sales 70000
3 Diana 35 Engineering 92000
4 Ethan 40 HR 78000

Writing a DataFrame to Excel#

You can also save a DataFrame to an Excel file with a custom sheet name:

# Write the DataFrame to an Excel file
df.to_excel("output.xlsx", sheet_name="Sheet2", index=False)

# Verify the Excel file
df2 = pd.read_excel("output.xlsx", sheet_name="Sheet2")
df2.head()
Name Age Department Salary
0 Alice 30 Engineering 85000
1 Bob 25 Marketing 62000
2 Charlie 28 Sales 70000
3 Diana 35 Engineering 92000
4 Ethan 40 HR 78000

This allows for easy integration with business tools and workflows that rely on Excel.