Every AI project starts with data, and in school practicals that data usually arrives as a CSV file. Whether you are reading the CBSE-suggested rainfall.csv, a dataset downloaded from Kaggle, or your own student marks file, reading CSVs is the skill that unlocks everything else.
This tutorial covers all the CSV reading methods you need for Class 10 (Unit 7), Class 11 (Unit 3 Level 2 and Unit 5), and Class 12 (Unit 1) practicals.
What You’ll Learn
- What a CSV file is and how Python reads it
- All essential pd.read_csv() options with practical examples
- How to explore, filter, and extract data from a CSV
- How to handle common problems — missing values, wrong separators, encoding errors
- How to write data back to CSV after cleaning
What Is a CSV File?
CSV stands for Comma-Separated Values. It is a plain text file where each row is a line and each column is separated by a comma.
A CSV file looks like this in a text editor:
Name,Marks,Grade,City
Arjun,85,B,Delhi
Priya,92,A,Mumbai
Kiran,78,C,Delhi
And it looks like this when opened in Excel:
| Name | Marks | Grade | City |
|---|---|---|---|
| Arjun | 85 | B | Delhi |
| Priya | 92 | A | Mumbai |
| Kiran | 78 | C | Delhi |
Python’s Pandas library reads CSV files and converts them into a DataFrame — a table you can query, filter, and analyse with code.
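To see that a CSV really is just text, here is a minimal sketch that reads the same four lines from a string instead of a file. Using io.StringIO is only a convenience for this demonstration; in your practicals you would pass a file name.

```python
import io
import pandas as pd

# The same text shown above, held in a string instead of a file.
csv_text = """Name,Marks,Grade,City
Arjun,85,B,Delhi
Priya,92,A,Mumbai
Kiran,78,C,Delhi
"""

# io.StringIO lets read_csv treat the string like an open file.
df = pd.read_csv(io.StringIO(csv_text))

print(df)         # the table, with row index numbers on the left
print(df.dtypes)  # Marks is inferred as a number, the rest as text
```

The first line of the text becomes the column names, and every later line becomes one row.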
Part 1 — Basic CSV Reading
The Standard Way
python
# Program to read a CSV file and display its contents
import pandas as pd
df = pd.read_csv("data.csv")
print("First 5 rows:")
print(df.head())
Before running: Save your CSV file in the same folder as your Jupyter Notebook. If the file is elsewhere, you must provide the full path: pd.read_csv("C:/Users/Arjun/Documents/data.csv").
Read and Display 10 Rows ✅ CBSE Class 10 Suggested Program
python
# Program to read the csv file saved in your system and display 10 rows
import pandas as pd
df = pd.read_csv("data.csv")
print(df.head(10))
Expected Output: The first 10 rows of your CSV displayed as a formatted table with column headers and row index numbers on the left.
Read and Display Information ✅ CBSE Class 10 Suggested Program
python
# Program to read csv file saved in your system and display its information
import pandas as pd
df = pd.read_csv("data.csv")
print("Shape:", df.shape)
print("\nColumn Names:", df.columns.tolist())
print("\nDataset Info:")
df.info()  # info() prints its report itself and returns None, so no print() is needed
print("\nStatistical Summary:")
print(df.describe())
Expected Output (example for a 50-row, 4-column dataset):
Shape: (50, 4)
Column Names: ['Name', 'Marks', 'Grade', 'City']
Dataset Info:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 50 entries, 0 to 49
Data columns (total 4 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Name 50 non-null object
1 Marks 47 non-null float64
2 Grade 50 non-null object
3 City 50 non-null object
dtypes: float64(1), object(3)
Statistical Summary:
Marks
count 47.000000
mean 78.340426
...
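What df.info() reveals that df.describe() doesn’t: The Non-Null Count column covers every column, text as well as numeric, so you can see exactly which ones have missing data. By default, df.describe() reports a count only for numeric columns. In the example above, Marks shows 47 non-null out of 50 rows, meaning 3 values are missing.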
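The arithmetic behind the Non-Null Count can be checked by hand. The sketch below uses a made-up three-row dataset (not the 50-row example) with one blank Marks value, and subtracts the non-null count from the total number of rows.

```python
import io
import pandas as pd

# Made-up sample: Priya's Marks value is left blank.
csv_text = """Name,Marks,Grade
Arjun,85,B
Priya,,A
Kiran,78,C
"""
df = pd.read_csv(io.StringIO(csv_text))

non_null = df.count()         # non-null values in each column
missing = len(df) - non_null  # total rows minus non-null values

print(non_null)
print(missing)
```

For this sample, Marks has 2 non-null values out of 3 rows, so 1 value is missing.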
Part 2 — Useful read_csv() Options
pd.read_csv() has many parameters. These are the ones you will actually use in CBSE practicals:
Specifying a Different Separator
Some CSV files use semicolons or tabs instead of commas:
python
# Reading a semicolon-separated file
df = pd.read_csv("data.csv", sep=";")
# Reading a tab-separated file (.tsv)
df = pd.read_csv("data.tsv", sep="\t")
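A quick way to spot a wrong separator is to look at the shape after reading. This is a small sketch with made-up semicolon-separated text: with the default comma separator everything collapses into one column.

```python
import io
import pandas as pd

# Made-up semicolon-separated data
csv_text = "Name;Marks;City\nArjun;85;Delhi\nPriya;92;Mumbai\n"

wrong = pd.read_csv(io.StringIO(csv_text))           # default sep=","
right = pd.read_csv(io.StringIO(csv_text), sep=";")  # correct separator

print("Default separator:", wrong.shape)  # a single column: separator not recognised
print("With sep=';':", right.shape)       # three proper columns
```

If df.shape shows only one column for a file that should have several, the separator is the first thing to check.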
Reading Only Specific Columns
python
# Read only Name and Marks columns — ignore the rest
df = pd.read_csv("data.csv", usecols=["Name", "Marks"])
print(df.head())
Skipping Rows at the Top
python
# Skip the first 2 rows (useful when CSV has header comments)
df = pd.read_csv("data.csv", skiprows=2)
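As an illustration, the sketch below builds a small made-up file whose first two lines are notes rather than data. With skiprows=2, the third line is treated as the header row.

```python
import io
import pandas as pd

# Made-up file: two note lines, then the real header and data.
csv_text = """Report generated by the school office
Academic year 2024-25
Name,Marks
Arjun,85
Priya,92
"""

df = pd.read_csv(io.StringIO(csv_text), skiprows=2)

print(df.columns.tolist())
print(df)
```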
Setting a Column as the Index
python
# Use the Name column as the row label instead of 0, 1, 2...
df = pd.read_csv("data.csv", index_col="Name")
print(df.head())
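One reason to set an index is that rows can then be looked up by label. A minimal sketch, assuming the sample data from the start of this tutorial:

```python
import io
import pandas as pd

csv_text = """Name,Marks,Grade,City
Arjun,85,B,Delhi
Priya,92,A,Mumbai
Kiran,78,C,Delhi
"""

df = pd.read_csv(io.StringIO(csv_text), index_col="Name")

print(df.loc["Priya"])           # the whole row for Priya
print(df.loc["Priya", "Marks"])  # a single value from that row
```

Note that Name is no longer a regular column once it becomes the index, so df["Name"] will not work on this DataFrame.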
Handling Encoding Issues
Some CSV files saved in Indian languages or with special characters cause UnicodeDecodeError:
python
# Try utf-8 first (the pandas default)
df = pd.read_csv("data.csv", encoding="utf-8")
# Only if the line above raises UnicodeDecodeError, run this one instead
# df = pd.read_csv("data.csv", encoding="latin-1")
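The "try one encoding, then the next" idea can also be automated with try/except. The helper below, read_csv_with_fallback, is a hypothetical function written for this sketch (it is not part of Pandas), and the demo file is created only so the example runs on its own.

```python
import os
import tempfile
import pandas as pd

def read_csv_with_fallback(path, encodings=("utf-8", "utf-8-sig", "latin-1")):
    """Hypothetical helper: try each encoding until one decodes the file."""
    for enc in encodings:
        try:
            return pd.read_csv(path, encoding=enc), enc
        except UnicodeDecodeError:
            continue  # this encoding failed, move on to the next one
    raise ValueError(f"None of {encodings} could decode {path}")

# Demo file written in latin-1, so decoding it as utf-8 fails.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.csv")
with open(demo_path, "w", encoding="latin-1") as f:
    f.write("Name,Marks\nJosé,85\n")

df, used = read_csv_with_fallback(demo_path)
print("Decoded with:", used)
print(df)
```

Keep latin-1 last in the list: it accepts any byte, so it never raises an error even when the resulting text is wrong. Always look at the output before trusting it.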
Part 3 — Exploring the Data After Reading
Once you have read the CSV, always explore it before doing anything else. This sequence is standard in every data science workflow:
python
# Program: Complete CSV exploration workflow
import pandas as pd
df = pd.read_csv("rainfall.csv")
# Step 1: Check shape
print("Rows, Columns:", df.shape)
# Step 2: Preview data
print("\nFirst 5 rows:")
print(df.head())
# Step 3: Check data types
print("\nData types:")
print(df.dtypes)
# Step 4: Missing values
print("\nMissing values per column:")
print(df.isnull().sum())
# Step 5: Basic statistics
print("\nStatistical summary:")
print(df.describe())
# Step 6: Unique values in a column (useful for categories)
# print(df["Grade"].unique())
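After exploring, the usual next step is to filter rows and extract columns. The following is a minimal sketch using the sample marks data from the start of this tutorial; the variable names are just illustrative.

```python
import io
import pandas as pd

csv_text = """Name,Marks,Grade,City
Arjun,85,B,Delhi
Priya,92,A,Mumbai
Kiran,78,C,Delhi
"""
df = pd.read_csv(io.StringIO(csv_text))

# Extract only some columns (note the double square brackets)
names_marks = df[["Name", "Marks"]]

# Filter rows by one condition
above_80 = df[df["Marks"] > 80]

# Combine two conditions with & (each condition in its own parentheses)
delhi_above_80 = df[(df["City"] == "Delhi") & (df["Marks"] > 80)]

# Count how many rows fall in each category
city_counts = df["City"].value_counts()

print(names_marks)
print(above_80)
print(delhi_above_80)
print(city_counts)
```

Each filter returns a new DataFrame and leaves df unchanged, so you can try several conditions on the same data.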
Part 4 — Reading, Cleaning, and Saving
This is the complete workflow for Class 11 Unit 5 (Data Literacy — Data Pre-processing) and Class 12 Unit 1:
python
# Program to read a CSV, perform statistical analysis,
# check and fill missing values, then save cleaned data
import pandas as pd
# Step 1: Read
df = pd.read_csv("data.csv")
print("Original shape:", df.shape)
print("Missing values:\n", df.isnull().sum())
# Step 2: Statistical analysis
print("\nMean of numeric columns:")
print(df.mean(numeric_only=True))
# Step 3: Fill missing numeric values with column mean
df.fillna(df.mean(numeric_only=True), inplace=True)
# Step 4: Verify
print("\nMissing values after cleaning:")
print(df.isnull().sum())
# Step 5: Save cleaned data
df.to_csv("data_cleaned.csv", index=False)
print("\nCleaned file saved as data_cleaned.csv")
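To see exactly what the mean fill does, here is a small worked example on made-up data with one missing mark and one missing city. The mean of the available marks is (85 + 75) / 2 = 80, so the blank mark becomes 80. The missing city stays missing, because a mean exists only for numeric columns.

```python
import io
import pandas as pd

# Made-up sample: Priya has no Marks, Kiran has no City.
csv_text = """Name,Marks,City
Arjun,85,Delhi
Priya,,Mumbai
Kiran,75,
"""
df = pd.read_csv(io.StringIO(csv_text))

col_means = df.mean(numeric_only=True)  # only Marks has a mean
df = df.fillna(col_means)               # fills Marks, leaves City alone

print(df)
print("Still missing:\n", df.isnull().sum())

# to_csv() with no file name returns the CSV text instead of saving it
cleaned_text = df.to_csv(index=False)
print(cleaned_text)
```

Text columns need a separate decision, for example filling them with a fixed label of your choice.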
Part 5 — Common Problems and Fixes
Problem: FileNotFoundError: [Errno 2] No such file or directory: 'data.csv'
The most common error. Cause: the file name is misspelled, or your CSV is not in the folder Jupyter is running from.
Fix:
python
import os
print(os.getcwd()) # Shows which folder Jupyter is looking in
Move your CSV to that folder, or copy the full file path into read_csv().
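The same check can be built into the reading step. The function load_or_explain below is a hypothetical helper written for this sketch: if the file is missing, it prints the current folder and the CSV files it can see there instead of stopping with an error.

```python
import glob
import os
import pandas as pd

def load_or_explain(filename):
    """Hypothetical helper: read a CSV, or explain where Python looked."""
    try:
        return pd.read_csv(filename)
    except FileNotFoundError:
        print("Could not find", filename)
        print("Python is looking in:", os.getcwd())
        print("CSV files in that folder:", glob.glob("*.csv"))
        return None

df = load_or_explain("data.csv")
if df is not None:
    print("Loaded", df.shape[0], "rows")
```

The printed list often reveals a simple spelling difference, such as Data.csv or data (1).csv.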
Problem: Extra unnamed column appears (usually called Unnamed: 0)
Cause: The CSV was saved with index=True (the default), adding row numbers as an extra column.
Fix:
python
df = pd.read_csv("data.csv", index_col=0) # Treats first column as index
Or, when saving: df.to_csv("data.csv", index=False)
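The problem and both fixes can be reproduced in a few lines. This sketch keeps the CSV text in memory rather than writing files: calling to_csv() without a file name returns the text as a string.

```python
import io
import pandas as pd

df = pd.DataFrame({"Name": ["Arjun", "Priya"], "Marks": [85, 92]})

with_index = df.to_csv()                # index=True is the default
without_index = df.to_csv(index=False)  # row numbers are not written

# Reading the first version back creates the extra column
reread_bad = pd.read_csv(io.StringIO(with_index))
# Fix 1: treat the first column as the index while reading
reread_fixed = pd.read_csv(io.StringIO(with_index), index_col=0)
# Fix 2: the file saved with index=False needs no special handling
reread_clean = pd.read_csv(io.StringIO(without_index))

print(reread_bad.columns.tolist())
print(reread_fixed.columns.tolist())
print(reread_clean.columns.tolist())
```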
Problem: Numbers reading as text (object dtype instead of int/float)
python
# Check: df["Marks"].dtype → shows 'object' instead of 'int64'
# Fix: convert after reading
df["Marks"] = pd.to_numeric(df["Marks"], errors="coerce")
errors="coerce" converts non-numeric values to NaN instead of crashing.
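Here is a small worked example on made-up data where one student is recorded as "absent". That single word makes Pandas read the whole Marks column as text until it is converted.

```python
import io
import pandas as pd

# Made-up sample: one Marks entry is a word, not a number.
csv_text = """Name,Marks
Arjun,85
Priya,absent
Kiran,78
"""
df = pd.read_csv(io.StringIO(csv_text))
print("Before:", df["Marks"].dtype)  # a text dtype, not a number

df["Marks"] = pd.to_numeric(df["Marks"], errors="coerce")
print("After:", df["Marks"].dtype)   # float64
print(df)

# The coerced value is now NaN, so it is skipped in calculations
print("Mean marks:", df["Marks"].mean())
```

After conversion, "absent" shows up as a missing value, so it can be counted with isnull().sum() and handled like any other gap.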
Quick Revision Box
| Function / Parameter | What It Does |
|---|---|
| pd.read_csv("file.csv") | Reads a CSV file into a DataFrame |
| df.head(n) | Shows first n rows (default 5) |
| df.tail(n) | Shows last n rows |
| df.info() | Shows column names, types, and null counts |
| df.describe() | Statistical summary of numeric columns |
| df.shape | Returns (rows, columns) as a tuple |
| df.isnull().sum() | Counts missing values per column |
| df.fillna(value) | Replaces missing values |
| df.to_csv("file.csv", index=False) | Saves DataFrame to CSV |
| sep="," | Separator character (default comma) |
| usecols=["col1","col2"] | Read only specified columns |
| encoding="utf-8" | Character encoding for special characters |
Practice Questions
Q1 (2 marks): Write Python code to read a CSV file called marks.csv and display its shape, column names, and the number of missing values in each column.
Model Answer:
python
import pandas as pd
df = pd.read_csv("marks.csv")
print("Shape:", df.shape)
print("Columns:", df.columns.tolist())
print("Missing values:\n", df.isnull().sum())
Q2 (MCQ): Which parameter in pd.read_csv() prevents an extra index column from appearing when the file was saved without index=False?
a) header=False b) index_col=0 c) skiprows=1 d) usecols=None
Answer: b) index_col=0 — treats the first column as the DataFrame index instead of loading it as a data column.
Frequently Asked Questions
Q1: What CSV file should I use for CBSE AI practice programs? The Class 11 AI syllabus specifically references rainfall.csv for Unit 5 programs. Ask your teacher for this file. For additional practice, download datasets from: Kaggle (kaggle.com/datasets), data.gov.in (Indian government open data), or the CBSE-linked AI activities spreadsheet mentioned in the Class 10 syllabus. Any properly formatted CSV works — the programs don’t depend on specific data.
Q2: My CSV file has Indian characters (Hindi names, etc.) and shows garbled text. How do I fix it? This is an encoding problem. Try these in order:
python
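# Run one line at a time and stop at the first that displays the text correctly
df = pd.read_csv("data.csv", encoding="utf-8") # Try first
# df = pd.read_csv("data.csv", encoding="utf-8-sig") # If saved from Excel
# df = pd.read_csv("data.csv", encoding="latin-1") # Last resort: never raises an error, but Hindi text may still look wrong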
Q3: Can I read an Excel file instead of CSV using Pandas? Yes: df = pd.read_excel("file.xlsx"). You need the openpyxl library: pip install openpyxl. However, for CBSE AI practicals, CSV is the standard format — use CSV files to stay aligned with the syllabus.