1 What is File Handling?
File Handling in Python is the process of creating, reading, writing, updating, and deleting files stored on a storage device (HDD, SSD, etc.).
Key Concept:
Files allow programs to store data persistently, unlike variables which lose data when the program stops. This is essential for real-world applications.
Python provides built-in functions and methods to handle files easily and safely with proper error handling mechanisms.
2 Why Do We Need File Handling?
Primary Reasons
- Data Persistence: Store data beyond program execution
- Configuration: Read settings and configuration files
- Logging: Maintain application logs and audit trails
- Data Exchange: Share data between different programs
- Backup: Create backups of important data
Real-World Applications
- User authentication systems storing credentials
- E-commerce websites saving order history
- Data analysis pipelines reading datasets
- Web servers serving HTML/CSS/JS files
- Games saving player progress and scores
3 File Types in Python
Text vs Binary Files
Text Files
Human-readable, store characters using encoding (UTF-8, ASCII).
Examples: .txt, .csv, .json, .xml, .html
Binary Files
Store raw bytes (0s and 1s), not human-readable.
Examples: .jpg, .png, .mp3, .pdf, .exe
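A minimal sketch of the difference, assuming a hypothetical file named `sample.txt` created in a throwaway temporary directory: the same bytes on disk come back as `str` in text mode and as `bytes` in binary mode.

```python
import os
import tempfile

# Hypothetical file, created in a temporary directory for illustration only.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "sample.txt")

    # Text mode write: the str is encoded to bytes using the given encoding.
    with open(path, "w", encoding="utf-8") as f:
        f.write("caf\u00e9")  # "café", written with an escape for clarity

    # Text mode read: bytes are decoded back into a str.
    with open(path, "r", encoding="utf-8") as f:
        as_text = f.read()

    # Binary mode read: the raw bytes are returned untouched.
    with open(path, "rb") as f:
        as_bytes = f.read()

print(type(as_text).__name__, len(as_text))    # str 4
print(type(as_bytes).__name__, len(as_bytes))  # bytes 5
```

The lengths differ because the accented character is one character but two bytes in UTF-8.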
| File Type | Extension | Description | Python Module |
|---|---|---|---|
| Text File | .txt | Simple readable text data | open() |
| CSV File | .csv | Comma-separated values, tabular data | csv |
| JSON File | .json | JavaScript Object Notation, API data | json |
| Excel File | .xlsx, .xls | Excel spreadsheet with multiple sheets | openpyxl, pandas |
| Binary File | .bin, .dat | Images, audio, video, executables | open() with 'b' mode |
| Log File | .log | Application logs & debugging | logging |
📊 CSV vs Excel Files: Key Differences
| Aspect | CSV Files | Excel Files (XLSX) |
|---|---|---|
| Format | Plain text, comma-separated values | ZIP archive of XML files (Office Open XML) with metadata |
| Size | Smaller, efficient for large datasets | Larger due to formatting and metadata |
| Sheets | Single sheet only | Multiple sheets per file |
| Formatting | No formatting (just data) | Supports fonts, colors, formulas |
| Python Handling | csv module or pandas | openpyxl or pandas (xlrd only for legacy .xls files) |
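As a rough illustration of the CSV side of this comparison, the sketch below writes and re-reads a small table with the standard `csv` module. The file name `scores.csv` and the sample rows are made up for the example.

```python
import csv
import os
import tempfile

# Sample data, invented for illustration.
rows = [["name", "score"], ["Alice", "90"], ["Bob", "85"]]

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "scores.csv")

    # newline="" lets the csv module manage line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)

    with open(path, "r", newline="", encoding="utf-8") as f:
        loaded = list(csv.reader(f))

print(loaded)
```

Note that every value comes back as a string; CSV carries no type information, which is one reason it stays small and simple compared with Excel files.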
4 The open() Function
The open() function is the gateway to file operations in Python. It returns a file object (also called a file handle).
file_object = open("data.txt", "r") # Basic syntax
Syntax Parameters
- filename: Path to the file (string)
- mode: Access mode (string, optional, default='r')
- encoding: Text encoding (string, optional)
- errors: How to handle encoding errors (optional)
- newline: How newlines are handled (optional)
Full Syntax
open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)
Most commonly used: open(filename, mode, encoding='utf-8')
⚠️ Important Note:
Always ensure files are properly closed after operations to prevent resource leaks and data corruption. Use the with statement for automatic closing.
5 File Modes Explained
Basic Modes
- "r" - Read (Beginner): Default mode. Opens for reading. File must exist.
- "w" - Write (Beginner): Creates a new file or overwrites an existing one.
- "a" - Append (Beginner): Opens for writing at the end of the file; creates the file if it does not exist.
Extended Modes
- "x" - Exclusive Creation (Intermediate): Creates a new file; fails with FileExistsError if the file already exists.
- "r+" - Read and Write (Intermediate): Opens for both reading and writing. File must exist.
- "w+" - Write and Read (Intermediate): Creates or overwrites the file, then allows reading.
- "a+" - Append and Read (Intermediate): Opens for appending and reading; creates the file if it does not exist.
Binary Mode Modifiers
- "b" - Binary Mode (Advanced): Opens the file in binary mode (e.g., "rb", "wb", "ab"). Use case: images, videos, executables.
- "t" - Text Mode (Beginner): Default. Opens the file in text mode (e.g., "rt", "wt"). Use case: text files, CSV, JSON.
Mode Combination Reference
| Mode | Read | Write | Creates File | Truncates File | Cursor Position |
|---|---|---|---|---|---|
| r | ✓ | ✗ | ✗ | ✗ | Beginning |
| r+ | ✓ | ✓ | ✗ | ✗ | Beginning |
| w | ✗ | ✓ | ✓ | ✓ | Beginning |
| w+ | ✓ | ✓ | ✓ | ✓ | Beginning |
| a | ✗ | ✓ | ✓ | ✗ | End |
| a+ | ✓ | ✓ | ✓ | ✗ | End |
| x | ✗ | ✓ | ✓ | ✗ | Beginning |
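The table can be checked with a short experiment. This is only a sketch, using a hypothetical file `modes_demo.txt` in a temporary directory, that exercises the truncating, appending, and exclusive-creation behaviors.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "modes_demo.txt")

    with open(path, "w") as f:      # "w" creates the file
        f.write("first\n")
    with open(path, "w") as f:      # "w" again truncates: "first" is gone
        f.write("second\n")
    with open(path, "a") as f:      # "a" keeps existing content and adds to the end
        f.write("third\n")

    with open(path, "r") as f:
        content = f.read()

    try:
        open(path, "x")             # the file already exists, so "x" refuses
        x_failed = False
    except FileExistsError:
        x_failed = True

print(repr(content))  # 'second\nthird\n'
print(x_failed)       # True
```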
6 Reading Files - Complete Guide
read() - Read Entire File
with open("data.txt", "r") as file:
    content = file.read() # Reads entire file as string
print(content)
Optional parameter: size - maximum bytes/chars to read
content = file.read(100) # Reads first 100 characters
readline() - Read Single Line
with open("data.txt", "r") as file:
    line1 = file.readline() # Reads first line
    line2 = file.readline() # Reads second line
    # Use in a loop to read the remaining lines:
    while True:
        line = file.readline()
        if not line: # Empty string means end of file
            break
        print(line.strip())
readlines() - Read All Lines as List
with open("data.txt", "r") as file:
    lines = file.readlines() # Returns list of strings
# enumerate() numbers the lines correctly even when some lines are identical
for number, line in enumerate(lines, start=1):
    print(f"Line {number}: {line.strip()}")
Pro Tip: For large files, use iteration instead of readlines() to save memory:
for line in file: # Memory efficient
    process(line)
Iterating Over File Object
# Most efficient way for large files
with open("large_data.txt", "r") as file:
    for line in file:
        # Process each line without loading all into memory
        print(line.strip())
7 Writing to Files
write() - Basic Writing
# Creates new file or overwrites existing
with open("output.txt", "w") as file:
    file.write("Hello, World!\n")
    file.write("This is a second line.")
⚠️ Warning: 'w' mode truncates (erases) existing files! Use 'a' to append or 'x' to avoid overwriting.
writelines() - Write Multiple Lines
lines = ["Line 1\n", "Line 2\n", "Line 3\n"]
with open("data.txt", "w") as file:
    file.writelines(lines) # Writes list of strings
Note: You need to include newline characters (\n) manually in each string.
Writing Formatted Data
# Writing formatted strings (f-strings)
name = "Alice"
age = 30
with open("user_data.txt", "w") as file:
    file.write(f"Name: {name}\n")
    file.write(f"Age: {age}\n")
    file.write(f"Next year: {age + 1}")
8 Appending Data
"a" Mode - Add to End
# File content before: "Line 1"
with open("log.txt", "a") as file:
    file.write("\nNew log entry at end")
# File content after: "Line 1\nNew log entry at end"
a+ Mode - Append and Read
with open("data.txt", "a+") as file:
    # Write (appends to end)
    file.write("\nNew data appended")
    # Move to beginning to read
    file.seek(0)
    content = file.read()
print("File content:", content)
Note: In 'a' and 'a+' modes, writes go to the end of the file regardless of the seek position on most systems; seek() only changes where the next read starts.
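The note above can be tried out with a small sketch. The file name `append_demo.txt` is hypothetical, and the result shown is what is typically observed on Linux, macOS, and Windows; the Python documentation describes this append behavior as platform-dependent.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "append_demo.txt")

    with open(path, "w") as f:
        f.write("start")

    with open(path, "a+") as f:
        f.seek(0)          # move the cursor to the beginning
        f.write("-end")    # the write still lands at the end of the file
        f.seek(0)          # rewind again, this time to read
        result = f.read()

print(result)
```

If the write had honored the seek, the file would begin with "-end"; instead the original text is preserved and the new text follows it.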
9 Closing Files Properly
Why Close Files?
- Resource Management: Frees system resources (file descriptors)
- Data Integrity: Ensures all data is written to disk
- File Locking: Releases file locks for other processes
- Prevent Corruption: Reduces risk of data corruption
Manual Closing
file = open("data.txt", "r")
content = file.read()
# ... do something with content ...
file.close() # ✅ Always close manually
⚠️ Problem: If an error occurs before close(), file may remain open.
Automatic Closing (try-finally)
file = open("data.txt", "r")
try:
    content = file.read()
    # Process content
finally:
    file.close() # ✅ Always executes
✅ Better: Ensures file is closed even if errors occur.
10 Using with Statement (Best Practice)
The with statement (context manager) is the recommended way to handle files in Python. It automatically closes the file when the block exits, even if an exception occurs.
with open("data.txt", "r") as file:
    content = file.read()
# File automatically closed here
print("File is now closed")
# You can still use 'content' variable
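To illustrate the "even if an exception occurs" part, here is a small sketch that deliberately triggers a ValueError inside the block and then inspects the file object's `closed` attribute. The file contents and the variable names (`handle`, `closed_after_error`) are invented for the demonstration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "data.txt")
    with open(path, "w") as f:
        f.write("42 not-a-number")

    handle = None
    try:
        with open(path, "r") as f:
            handle = f  # keep a reference so we can inspect it afterwards
            # int("not-a-number") raises ValueError inside the with block
            total = sum(int(token) for token in f.read().split())
    except ValueError:
        pass

    closed_after_error = handle.closed

print(closed_after_error)  # True
```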
Multiple Files with 'with'
# Reading from one file and writing to another
with open("source.txt", "r") as src, open("destination.txt", "w") as dest:
    content = src.read()
    dest.write(content.upper()) # Convert to uppercase
# Both files automatically closed
11 Complete File Object Methods
| Method | Description | Parameters | Returns | Example |
|---|---|---|---|---|
| read(size) | Reads at most size characters/bytes | size (optional) | String or bytes | file.read(100) |
| readline(size) | Reads one entire line | size (optional max chars) | String or bytes | file.readline() |
| readlines() | Reads all lines into a list | None | List of strings/bytes | lines = file.readlines() |
| write(string) | Writes a string to the file | string to write | Number of chars/bytes written | file.write("text") |
| writelines(list) | Writes a list of strings | List of strings | None | file.writelines(lines) |
| seek(offset, whence) | Changes file cursor position | offset, whence (0=start, 1=current, 2=end) | New position | file.seek(0) (rewind) |
| tell() | Returns current cursor position | None | Integer position | pos = file.tell() |
| flush() | Flushes Python's write buffer to the operating system | None | None | file.flush() |
| truncate(size) | Resizes file to given size (default: current position) | size (optional) | New file size | file.truncate(100) |
| close() | Closes the file | None | None | file.close() |
seek() Method - Detailed Explanation
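The seek() method moves the file cursor to a specific position:
file.seek(offset, whence)
- whence=0 (default): Offset from beginning of file
- whence=1: Offset from current position
- whence=2: Offset from end of file
In text mode, only seek(0), seek(0, 2), and offsets previously returned by tell() are supported. Open the file in binary mode (e.g., "rb") for relative seeks such as the last example below.
Examples:
file.seek(0) # Go to beginning
file.seek(10) # Go to byte offset 10 from start
file.seek(-5, 2) # Go to 5 bytes before the end (binary mode only)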
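The three whence values can be seen side by side in a short sketch. It assumes a hypothetical 16-byte file `seek_demo.bin` opened in binary mode, and uses the `os.SEEK_CUR` and `os.SEEK_END` constants as readable names for 1 and 2.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "seek_demo.bin")
    with open(path, "wb") as f:
        f.write(b"0123456789ABCDEF")   # 16 bytes

    with open(path, "rb") as f:
        f.seek(10)                     # whence=0: absolute offset 10
        at_ten = f.read(2)             # b"AB"
        pos_after_read = f.tell()      # 12

        f.seek(-5, os.SEEK_END)        # whence=2: 5 bytes before the end
        last_five = f.read()           # b"BCDEF"

        f.seek(-3, os.SEEK_CUR)        # whence=1: 3 bytes back from current position
        back_three = f.read()          # b"DEF"

print(at_ten, pos_after_read, last_five, back_three)
```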
12 Binary File Handling
Binary files store data in bytes (0s and 1s) rather than human-readable text. They're used for images, audio, video, executables, and any non-text data.
# Reading a binary file (image)
with open("image.jpg", "rb") as file:
    image_data = file.read()
print(f"Image size: {len(image_data)} bytes")
# Writing binary data
with open("copy.jpg", "wb") as file:
    file.write(image_data)
Common Binary Operations
- File Copy: Read binary, write binary
- Image Processing: Read image bytes, modify, save
- Serialization: Save Python objects with pickle
- Database Files: Read/write SQLite databases
- Network Data: Handle binary network packets
Binary Mode Flags
- "rb" - Read binary
- "wb" - Write binary (overwrites)
- "ab" - Append binary
- "rb+" - Read/write binary
- "wb+" - Write/read binary (overwrites)
- "ab+" - Append/read binary
Working with Binary Data
# Binary file manipulation example
with open("data.bin", "wb") as file:
    # Write bytes object
    file.write(b'\x00\x01\x02\x03\x04')
    # Write integer as bytes
    file.write((1024).to_bytes(4, 'big'))
# Reading back
with open("data.bin", "rb") as file:
    first_five = file.read(5)
    print(f"First 5 bytes: {first_five}")
    int_bytes = file.read(4)
    value = int.from_bytes(int_bytes, 'big')
    print(f"Integer value: {value}")
13 File Operations: Delete, Rename, Check
Deleting Files
import os
# Delete a file
os.remove("data.txt")
# Delete with error handling
try:
    os.remove("data.txt")
    print("File deleted successfully")
except FileNotFoundError:
    print("File does not exist")
except PermissionError:
    print("Permission denied")
Renaming Files
import os
# Rename a file
os.rename("old_name.txt", "new_name.txt")
# Rename with path
os.rename("/path/to/old", "/path/to/new")
Checking File Existence
import os
# Check if file exists
if os.path.exists("data.txt"):
    print("File exists")
else:
    print("File does not exist")
# Check if it's a file (not directory)
if os.path.isfile("data.txt"):
    print("It's a file")
⚠️ Important Security Note:
Always validate file paths before operations to prevent directory traversal attacks. Never use user input directly in file operations without sanitization.
# BAD: user_input = "../../etc/passwd"
# GOOD: basename = os.path.basename(user_input)
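One possible way to do this validation, when subdirectories must be allowed and `os.path.basename` is too restrictive, is sketched below. The helper name `resolve_inside` and the use of the system temp directory as the base are assumptions made for this example, not part of any library.

```python
import os
import tempfile

def resolve_inside(base_dir, user_path):
    """Hypothetical helper: return an absolute path inside base_dir or raise ValueError."""
    base = os.path.realpath(base_dir)
    candidate = os.path.realpath(os.path.join(base, user_path))
    # commonpath equals base only when candidate is base itself or lives under it.
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError(f"path escapes base directory: {user_path!r}")
    return candidate

# Illustration only: treat the system temp directory as the allowed base.
base = os.path.realpath(tempfile.gettempdir())

ok = resolve_inside(base, "reports/summary.txt")

try:
    resolve_inside(base, "../../etc/passwd")
    blocked = False
except ValueError:
    blocked = True

print(ok)
print(blocked)
```

Resolving both paths with `os.path.realpath` before comparing them means `..` segments and symbolic links are taken into account rather than compared as raw strings.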
14 Advanced File Handling Concepts
Buffering in File Operations
Python uses buffering to improve I/O performance. Data is read/written in chunks rather than byte-by-byte.
# Control buffer size (in bytes)
with open("largefile.txt", "r", buffering=8192) as file:
    content = file.read()
# Buffer sizes:
# 0 = no buffering (binary mode only)
# 1 = line buffering (text mode)
# >1 = buffer size in bytes
# -1 = default buffer size (usually 4096 or 8192)
File Encoding and Unicode
Text files use character encodings. UTF-8 is the standard for modern applications.
# Specify encoding when opening files
with open("data.txt", "r", encoding="utf-8") as file:
    content = file.read()
# Common encodings:
# utf-8 (recommended), utf-16, ascii, latin-1
# cp1252 (Windows), iso-8859-1
# Handling encoding errors
with open("data.txt", "r", encoding="utf-8", errors="ignore") as file:
    content = file.read() # Silently drops bytes that cannot be decoded
# errors can be: 'strict' (default), 'ignore', 'replace', 'backslashreplace'
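The effect of the different `errors` values is easier to see on a concrete file. The sketch below assumes a hypothetical file `latin1.txt` that was saved as Latin-1 and is then read back as UTF-8 with each handler.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "latin1.txt")

    # "café" encoded as Latin-1: the final byte 0xE9 is not valid UTF-8 here.
    with open(path, "wb") as f:
        f.write("caf\u00e9".encode("latin-1"))

    try:
        with open(path, "r", encoding="utf-8") as f:   # errors="strict" is the default
            f.read()
        strict_failed = False
    except UnicodeDecodeError:
        strict_failed = True

    with open(path, "r", encoding="utf-8", errors="replace") as f:
        replaced = f.read()      # bad byte becomes U+FFFD (replacement character)

    with open(path, "r", encoding="utf-8", errors="ignore") as f:
        ignored = f.read()       # bad byte is dropped without warning

    with open(path, "r", encoding="latin-1") as f:
        correct = f.read()       # the right encoding recovers the text

print(strict_failed, repr(replaced), repr(ignored), repr(correct))
```

Both `ignore` and `replace` lose information, so they are best treated as a fallback when the real encoding cannot be determined.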
Working with Large Files
For files too large to fit in memory, use chunking or line-by-line processing.
# Process large file in chunks
chunk_size = 1024 * 1024 # About 1 million characters per read in text mode
with open("huge_file.txt", "r") as file:
    while True:
        chunk = file.read(chunk_size)
        if not chunk:
            break
        # Process chunk (process_chunk is a placeholder for your own function)
        process_chunk(chunk)
# Count lines in huge file efficiently
line_count = 0
with open("large_log.txt", "r") as file:
    for line in file:
        line_count += 1
print(f"Total lines: {line_count}")
Temporary Files
Use the tempfile module for temporary files. By default they are deleted automatically when closed.
import os
import tempfile
# Create a temporary file that survives closing (delete=False)
with tempfile.NamedTemporaryFile(mode='w', delete=False) as tmp:
    tmp.write("Temporary data")
    tmp_path = tmp.name # Get the file path
print(f"Temporary file: {tmp_path}")
# With delete=False the file is NOT removed on close, so remove it yourself
os.remove(tmp_path)
# Create temporary directory
with tempfile.TemporaryDirectory() as tmpdir:
    print(f"Temporary directory: {tmpdir}")
# Directory and contents deleted automatically
15 Real-World Use Cases & Examples
Common Applications
- User Registration & Login Systems: Store user credentials, preferences, and activity logs in files or databases.
- Configuration Management: Read settings from JSON/YAML/INI config files for applications.
- Data Analysis Pipelines: Read CSV/Excel datasets, process, and output results to reports.
- Web Server Logs: Log HTTP requests, errors, and performance metrics to rotating log files.
- Backup Systems: Create incremental backups by comparing and copying modified files.
Industry Examples
- E-commerce Platforms: Save order history, product catalogs, customer data in files/databases.
- Scientific Computing: Process large datasets from sensors, simulations, or experiments.
- Game Development: Save game state, player progress, high scores, and configuration.
- IoT Devices: Log sensor data, device status, and error reports to local storage.
- Financial Software: Process transaction records, generate reports, and maintain audit trails.
Complete Example: User Management System
import csv
import json
import os
from datetime import datetime  # needed for datetime.now() in add_user

class UserManager:
    def __init__(self, filename="users.json"):
        self.filename = filename
        self.users = self.load_users()

    def load_users(self):
        """Load users from JSON file"""
        if os.path.exists(self.filename):
            with open(self.filename, 'r') as file:
                return json.load(file)
        return {}

    def save_users(self):
        """Save users to JSON file"""
        with open(self.filename, 'w') as file:
            json.dump(self.users, file, indent=2)

    def add_user(self, username, email):
        """Add a new user"""
        self.users[username] = {
            "email": email,
            "created": datetime.now().isoformat()
        }
        self.save_users()
        print(f"User '{username}' added successfully")

    def export_to_csv(self, csv_filename):
        """Export users to CSV file"""
        with open(csv_filename, 'w', newline='') as csvfile:
            writer = csv.writer(csvfile)
            writer.writerow(['Username', 'Email', 'Created'])
            for username, data in self.users.items():
                writer.writerow([username, data['email'], data['created']])
        print(f"Users exported to {csv_filename}")

# Usage
manager = UserManager()
manager.add_user("alice", "alice@example.com")
manager.add_user("bob", "bob@example.com")
manager.export_to_csv("users_export.csv")
🎯 Summary & Best Practices
Key Takeaways
- Always use the with statement for automatic resource management
- Choose the right file mode for your use case
- Handle exceptions (FileNotFoundError, PermissionError)
- Use appropriate encoding (UTF-8 for text files)
- Close files properly to prevent resource leaks
- Validate file paths to prevent security issues
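Several of these takeaways (the with statement, explicit encoding, exception handling) can be combined in one small helper. This is only a sketch; the function name `read_text_or_default` and the demo file name are hypothetical.

```python
def read_text_or_default(path, default=""):
    """Return the file's text, or `default` if the file cannot be opened."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            return f.read()
    except FileNotFoundError:
        print(f"{path}: not found, using default")
    except PermissionError:
        print(f"{path}: permission denied, using default")
    return default

# The file below is assumed not to exist, so the default is returned.
settings = read_text_or_default("no_such_file_for_demo.txt", default="{}")
print(settings)
```

Catching the specific exceptions rather than a bare `except` keeps unrelated bugs, such as a typo inside the block, from being silently swallowed.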
Performance Tips
- Use buffering for better I/O performance
- Process large files line-by-line or in chunks
- Use binary mode for non-text files
- Consider memory-mapped files for very large files
- Use appropriate data structures (JSON, CSV, etc.)
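For the memory-mapped option, the standard `mmap` module lets a file be searched and sliced like a bytes object without reading it all into memory first. The sketch below uses a tiny hypothetical log file `big.log` as a stand-in; in practice the benefit shows up with much larger files.

```python
import mmap
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "big.log")
    with open(path, "wb") as f:
        f.write(b"INFO start\nERROR disk full\nINFO done\n")

    with open(path, "rb") as f:
        # Length 0 maps the whole file; ACCESS_READ makes the mapping read-only.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            offset = mm.find(b"ERROR")            # byte offset of the first match
            line_end = mm.find(b"\n", offset)     # end of that line
            error_line = mm[offset:line_end].decode("ascii")

print(offset, error_line)
```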
Quick Reference Cheat Sheet
Python file handling is simple, powerful, and essential for real-world applications.
Mastering file operations will make you a more effective Python developer.