Pandas Read CSV in Python

Last Updated : 28 Apr, 2026

CSV (Comma-Separated Values) files store tabular data as plain text.

  • To read data from a CSV file, use the Pandas function read_csv(), which returns the data as a DataFrame.
  • First import the Pandas library, then use it to load the data into a DataFrame.

In the code below, we are working with a CSV file named people.csv which contains people data.

Python
import pandas as pd

df = pd.read_csv("people.csv")
df

Output

[Image: Pandas Read CSV in Python]

read_csv() function

The read_csv() function in Pandas reads data from CSV files into a Pandas DataFrame. A DataFrame is a data structure for manipulating and analyzing tabular data efficiently. CSV files are plain-text files in which each row represents a record and columns are separated by commas (or other delimiters).

Syntax

pd.read_csv(filepath_or_buffer, sep=',', header='infer', index_col=None, usecols=None, engine=None, skiprows=None, nrows=None)

Parameters:

  • filepath_or_buffer: Location of the CSV file. It accepts a string path, a URL or a file-like object.
  • sep: The separator (delimiter) between fields; the default is ','.
  • header: Accepts an int or a list of ints giving the row number(s) to use as column names; the data starts after that row. The default, 'infer', uses the first row as the header when no names are passed. With header=None, no row is treated as a header and the columns are labeled 0, 1, 2 and so on.
  • usecols: Reads only the selected columns from the CSV file.
  • nrows: Number of rows to read from the file.
  • index_col: Column(s) to use as the row labels of the DataFrame. If set to None (the default), Pandas assigns a default integer index (0, 1, 2, ...).
  • engine: Parser engine to use, such as 'c' or 'python'.
  • skiprows: Line numbers to skip (0-indexed), or the number of lines to skip at the start of the file.
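
To make the header behavior described above concrete, here is a minimal sketch. It uses a small in-memory CSV with made-up values (not people.csv) so that it runs without any file on disk; the column names passed to names are hypothetical.

```python
import io
import pandas as pd

# Hypothetical CSV text with no header row (values made up for illustration).
raw = io.StringIO("Asha,31,Pune\nBen,45,Leeds\nChen,28,Taipei\n")

# header=None: the first line is treated as data, so columns are labeled 0, 1, 2.
no_header = pd.read_csv(raw, header=None)
print(no_header.columns.tolist())

# Rewind the buffer and supply column names explicitly instead.
raw.seek(0)
named = pd.read_csv(raw, header=None, names=["name", "age", "city"])
print(named)
```

Passing names together with header=None is one common way to label a file that ships without a header row.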

Features in Pandas read_csv

1. Read specific columns using read_csv

The usecols parameter lets you load only specific columns from a CSV file. This reduces memory usage and processing time because only the required data is imported.

Python
df = pd.read_csv("people.csv", usecols=["First Name", "Email"])
print(df)

Output

[Image: Specific columns using read_csv]
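
Besides a list of column names, usecols also accepts column positions or a callable that is applied to each column name. The sketch below shows all three forms on a small in-memory CSV; the sample rows are made up for illustration and are not taken from people.csv.

```python
import io
import pandas as pd

# Hypothetical sample data, held in a string so the example is self-contained.
csv_text = (
    "First Name,Last Name,Email,Job Title\n"
    "Asha,Rao,asha@example.com,Engineer\n"
    "Ben,Cole,ben@example.com,Analyst\n"
)

# Select columns by name.
by_name = pd.read_csv(io.StringIO(csv_text), usecols=["First Name", "Email"])

# Select columns by zero-based position.
by_position = pd.read_csv(io.StringIO(csv_text), usecols=[0, 2])

# Select columns with a callable: keep names ending in "Name".
by_rule = pd.read_csv(io.StringIO(csv_text), usecols=lambda col: col.endswith("Name"))

print(by_name.columns.tolist())
print(by_position.columns.tolist())
print(by_rule.columns.tolist())
```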

2. Setting an Index Column (index_col)

The index_col parameter sets one or more columns as the DataFrame index, making the specified column(s) act as row labels for easier data referencing.

Python
df = pd.read_csv("people.csv", index_col="First Name")
print(df)

Output

[Image: Setting columns as the DataFrame index]
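
Once a column is set as the index, rows can be looked up by label with .loc. The following is a small sketch on made-up in-memory data (an assumption, not the contents of people.csv) that shows the kind of referencing the index enables.

```python
import io
import pandas as pd

# Hypothetical sample data for illustration only.
csv_text = (
    "First Name,Email,Job Title\n"
    "Asha,asha@example.com,Engineer\n"
    "Ben,ben@example.com,Analyst\n"
)

# Use the "First Name" column as row labels.
df = pd.read_csv(io.StringIO(csv_text), index_col="First Name")

# Look up a single value by row label and column name.
email = df.loc["Ben", "Email"]
print(email)
```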

3. Handling Missing Values Using read_csv

The na_values parameter replaces specified strings (e.g., "N/A", "Unknown") with NaN, enabling consistent handling of missing or incomplete data during analysis.

Python
df = pd.read_csv("people.csv", na_values=["N/A", "Unknown"])

na_values only specifies which values should be treated as NaN; it does not guarantee that the dataset has no missing values.
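
To see the effect, it can help to count the missing values after loading. The sketch below uses a small in-memory CSV with made-up rows (an assumption for illustration) and checks each column with isna().

```python
import io
import pandas as pd

# Hypothetical sample data containing two placeholder strings.
csv_text = (
    "name,city,score\n"
    "Asha,Pune,81\n"
    "Ben,Unknown,N/A\n"
    "Chen,Taipei,77\n"
)

# Treat "N/A" and "Unknown" as missing while reading.
df = pd.read_csv(io.StringIO(csv_text), na_values=["N/A", "Unknown"])

# Count NaN entries per column.
missing_per_column = df.isna().sum()
print(missing_per_column)
```

Because the placeholder in score becomes NaN, Pandas can read that column as numeric instead of as text.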

4. Reading CSV Files with Different Delimiters

In this example, we create a CSV file that mixes several special characters as separators to see how the sep parameter works.

Python
import pandas as pd

data = """totalbill_tip, sex:smoker, day_time, size
16.99, 1.01:Female|No, Sun, Dinner, 2
10.34, 1.66, Male, No|Sun:Dinner, 3
21.01:3.5_Male, No:Sun, Dinner, 3
23.68, 3.31, Male|No, Sun_Dinner, 2
24.59:3.61, Female_No, Sun, Dinner, 4
25.29, 4.71|Male, No:Sun, Dinner, 4"""

with open("sample.csv", "w") as file:
    file.write(data)
print(data)

Output
totalbill_tip, sex:smoker, day_time, size
16.99, 1.01:Female|No, Sun, Dinner, 2
10.34, 1.66, Male, No|Sun:Dinner, 3
21.01:3.5_Male, No:Sun, Dinner, 3
23.68, 3.31, Male|No, Sun_Dinner, 2
24.59:3.61, Fe...

The sample data is stored in a multi-line string for demonstration purposes.

  • Separator (sep): The sep='[:, |_]' argument is a regular expression that treats any one of the characters :, comma, space, | or _ as a delimiter.
  • Engine: The engine='python' argument is used because the default C engine does not support regular expressions for delimiters.

Python
df = pd.read_csv('sample.csv',
                 sep='[:, |_]',  
                 engine='python')  
df

Output

[Image: CSV files with different delimiters]
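
The character class in the example above also matches a single space, so a comma followed by a space is read as two separators with an empty field between them. As a possible variation (not part of the original example), the sketch below lets the pattern absorb any whitespace around each delimiter, so every line of the same sample data splits into seven fields. The regular expression is an assumption chosen for this particular data.

```python
import io
import pandas as pd

# Same sample data as above, kept in memory instead of written to sample.csv.
data = """totalbill_tip, sex:smoker, day_time, size
16.99, 1.01:Female|No, Sun, Dinner, 2
10.34, 1.66, Male, No|Sun:Dinner, 3
21.01:3.5_Male, No:Sun, Dinner, 3
23.68, 3.31, Male|No, Sun_Dinner, 2
24.59:3.61, Female_No, Sun, Dinner, 4
25.29, 4.71|Male, No:Sun, Dinner, 4"""

# One delimiter character (comma, colon, pipe or underscore) plus any
# surrounding whitespace counts as a single separator.
df = pd.read_csv(io.StringIO(data), sep=r"\s*[,:|_]\s*", engine="python")

print(df.columns.tolist())
print(df)
```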

5. Using nrows in read_csv()

The nrows parameter limits the number of rows read from a file, enabling quick previews or partial data loading for large datasets. Here, we read only the first 3 rows using the nrows parameter.

Python
df = pd.read_csv('people.csv', nrows=3)
df

Output

[Image: Data after using nrows]

6. Using skiprows in read_csv()

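The skiprows parameter skips rows while reading a file, which is useful for ignoring metadata or extra headers that are not part of the dataset. An integer skips that many lines at the start of the file, while a list skips the given 0-indexed line numbers. Here, skiprows=[4, 5] drops lines 4 and 5 of the file, where line 0 is the header row.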

Python
df = pd.read_csv("people.csv")
print("Previous Dataset: ")
print(df)

df = pd.read_csv("people.csv", skiprows=[4, 5])
print("Dataset After skipping rows: ")
print(df)

Output

[Image: Data after using skiprows]
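
The difference between passing an integer and passing a list to skiprows can be sketched on a small in-memory file. The two leading metadata lines and the data rows below are made up for illustration.

```python
import io
import pandas as pd

# Hypothetical file: two metadata lines, then a header and three data rows.
csv_text = (
    "report export\n"
    "internal use only\n"
    "name,age\n"
    "Asha,31\n"
    "Ben,45\n"
    "Chen,28\n"
)

# Integer: skip the first 2 lines, so "name,age" becomes the header.
skip_top = pd.read_csv(io.StringIO(csv_text), skiprows=2)

# List: skip lines 0 and 1 (metadata) and line 4 (the "Ben" row).
# Line numbers are counted from 0 over the raw file.
skip_listed = pd.read_csv(io.StringIO(csv_text), skiprows=[0, 1, 4])

print(skip_top)
print(skip_listed)
```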

7. Parsing Dates (parse_dates)

The parse_dates parameter converts date columns into datetime objects, simplifying operations like filtering, sorting or time-based analysis.

Python
df = pd.read_csv("people.csv", parse_dates=["Date of birth"])
df.info()  # info() prints its summary directly and returns None

Output

[Image: Parsing dates]
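
Once a column holds datetime values, filtering and date-part extraction work directly on it. The sketch below illustrates this on a small in-memory CSV; the names and dates are made up, and only the column name "Date of birth" is taken from the example above.

```python
import io
import pandas as pd

# Hypothetical sample data with ISO-formatted dates.
csv_text = (
    "name,Date of birth\n"
    "Asha,1993-04-12\n"
    "Ben,1979-11-30\n"
    "Chen,1996-01-05\n"
)

# Parse the date column while reading.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date of birth"])
print(df.dtypes)

# Filter rows by date and extract the year through the .dt accessor.
born_after_1990 = df[df["Date of birth"] >= "1990-01-01"]
birth_years = df["Date of birth"].dt.year.tolist()

print(born_after_1990)
print(birth_years)
```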

Loading CSV Data from a URL

Pandas allows you to directly read a CSV file hosted on the internet using the file's URL. This can be incredibly useful when working with datasets shared on websites, cloud storage or public repositories like GitHub.

Python
url = "https://media.geeksforgeeks.org/wp-content/uploads/20241121154629307916/people_data.csv"
df = pd.read_csv(url)
df

Output

[Image: CSV data from a URL]