3.2.2.1. CSV

CSV stands for Comma-Separated Values and is one of the simplest and most widely used formats for storing tabular data. In a CSV file:

  • Each row represents a record.

  • Columns (features) are separated by commas (",").

  • The first row often contains a header that defines the column names.

CSV files are human-readable and supported by nearly all data tools and spreadsheet software, making them a common choice for data exchange and storage.
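To make the structure described above concrete, here is a minimal sketch that parses a small piece of CSV text held in a string. The variable names (csv_text, df_demo) are illustrative assumptions, and the rows simply mirror the example data used in this section; io.StringIO is used only so that no file on disk is needed.

```python
import io

import pandas as pd

# Hypothetical CSV content: a header row followed by one record per line,
# with fields separated by commas.
csv_text = """Name,Age,Department,Salary
Alice,30,Engineering,85000
Bob,25,Marketing,62000
Charlie,28,Sales,70000
"""

# io.StringIO wraps the string so read_csv can treat it like an open file.
df_demo = pd.read_csv(io.StringIO(csv_text))

# The header row becomes the column names; it is not counted as a record.
print(df_demo.columns.tolist())
print(len(df_demo))
```

Each remaining line of the text becomes one row of the DataFrame, and numeric-looking fields such as Age are parsed as numbers rather than strings.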

Reading a CSV File into a DataFrame

We can use the pandas.read_csv() function to load a CSV file into a DataFrame:

import pandas as pd

# Read CSV into a DataFrame
df = pd.read_csv("example.csv")
df.head()

      Name  Age   Department  Salary
0    Alice   30  Engineering   85000
1      Bob   25    Marketing   62000
2  Charlie   28        Sales   70000
3    Diana   35  Engineering   92000
4    Ethan   40           HR   78000
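Since the first row only often contains a header, it can help to see what happens when it does not. The sketch below is an illustration under assumed data: the headerless text and the names headerless_text, df_wrong, and df_named are hypothetical, while header and names are standard read_csv parameters.

```python
import io

import pandas as pd

# Hypothetical CSV content with no header row.
headerless_text = """Alice,30,Engineering,85000
Bob,25,Marketing,62000
"""

# With the defaults, read_csv treats the first line as the header,
# so the first record is consumed as column names.
df_wrong = pd.read_csv(io.StringIO(headerless_text))

# header=None says there is no header row; names supplies the column names.
df_named = pd.read_csv(
    io.StringIO(headerless_text),
    header=None,
    names=["Name", "Age", "Department", "Salary"],
)

print(df_wrong.columns.tolist())  # first record mistaken for the header
print(df_named.columns.tolist())  # the names we supplied
```

Comparing the row counts of df_wrong and df_named shows that the default call silently loses one record when the file has no header.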

Writing a DataFrame to a CSV File

You can save a DataFrame back to a CSV file using the to_csv() method:

# Save DataFrame to CSV
df.to_csv("new_file.csv", index=False)

# Read it back in to verify
df2 = pd.read_csv("new_file.csv")
df2.head()

      Name  Age   Department  Salary
0    Alice   30  Engineering   85000
1      Bob   25    Marketing   62000
2  Charlie   28        Sales   70000
3    Diana   35  Engineering   92000
4    Ethan   40           HR   78000

By default, to_csv() writes the DataFrame's row index as an extra first column. Setting index=False omits it, so the file contains only the data columns.
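The effect of the index argument can be checked directly. The following is a small sketch rather than part of the original example: df_small and the other variable names are assumptions, and it relies on to_csv() returning the CSV text as a string when no file path is given.

```python
import io

import pandas as pd

# Small hypothetical DataFrame for illustration.
df_small = pd.DataFrame({"Name": ["Alice", "Bob"], "Age": [30, 25]})

# Without a path argument, to_csv() returns the CSV content as a string.
with_index = df_small.to_csv()
without_index = df_small.to_csv(index=False)

# Compare the header lines of the two outputs.
header_with = with_index.splitlines()[0]
header_without = without_index.splitlines()[0]
print(repr(header_with))
print(repr(header_without))

# Reading back the version that kept the index yields an extra column.
round_trip = pd.read_csv(io.StringIO(with_index))
print(round_trip.columns.tolist())
```

When the index is written, its header cell is empty, so reading the file back produces an additional column that pandas labels "Unnamed: 0". Writing with index=False avoids that extra column on the round trip.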