3.2.2.3. Excel#
Microsoft Excel is one of the oldest and most widely used spreadsheet tools. While Excel files may resemble CSV files visually, they are far more powerful in functionality.
Unlike CSVs, Excel files are binary file formats (not plain text) and typically have extensions like .xls or .xlsx. An Excel file can:
Contain multiple sheets
Support formulas, charts, and formatting
Include macros (VBA) for automation
Due to its widespread use in non-technical and business environments, datasets are often shared in Excel format.
We can read and write Excel files using Pandas. To do so, you’ll need to install an Excel engine like openpyxl:
pip install openpyxl
Reading an Excel File#
Suppose we have an Excel file named example.xlsx and want to read a sheet called "Sheet1":
import pandas as pd
# Read a specific sheet from the Excel file
df = pd.read_excel("example.xlsx", sheet_name="Sheet1")
df.head()
| Name | Age | Department | Salary | |
|---|---|---|---|---|
| 0 | Alice | 30 | Engineering | 85000 |
| 1 | Bob | 25 | Marketing | 62000 |
| 2 | Charlie | 28 | Sales | 70000 |
| 3 | Diana | 35 | Engineering | 92000 |
| 4 | Ethan | 40 | HR | 78000 |
Writing a DataFrame to Excel#
You can also save a DataFrame to an Excel file with a custom sheet name:
# Write the DataFrame to an Excel file
df.to_excel("output.xlsx", sheet_name="Sheet2", index=False)
# Verify the Excel file
df2 = pd.read_excel("output.xlsx", sheet_name="Sheet2")
df2.head()
| Name | Age | Department | Salary | |
|---|---|---|---|---|
| 0 | Alice | 30 | Engineering | 85000 |
| 1 | Bob | 25 | Marketing | 62000 |
| 2 | Charlie | 28 | Sales | 70000 |
| 3 | Diana | 35 | Engineering | 92000 |
| 4 | Ethan | 40 | HR | 78000 |
This allows for easy integration with business tools and workflows that rely on Excel.