Module #
import csv
Reading csv #
with open('<file>.csv', 'r') as csv_file:
csv_reader = csv.reader(csv_file)
# print each line
for line in csv_reader:
print(line)
Reading specific column, specify based on column index:
with open('<file>.csv', 'r') as csv_file:
csv_reader = csv.reader(csv_file)
# print each line, by index
for line in csv_reader:
print(line[2])
To skip first line (usually the field name), add a next()
:
with open('<file>.csv', 'r') as csv_file:
csv_reader = csv.reader(csv_file)
next(csv_reader)
# print each line, by index
for line in csv_reader:
print(line[2])
Specify a delimiter other than a comma:
with open('<file>.csv', 'r') as csv_file:
csv_reader = csv.reader(csv_file, delimiter='\t')
...
Writing csv #
Let’s say we’re reading a file, and replacing delimiter with tabs, and writing to another csv.
with open('<file>.csv', 'r') as csv_file:
csv_reader = csv.reader(csv_file)
with open('<target>.csv', 'w') as new_file:
csv_writer = csv.writer(new_file, delimiter='\t')
# each line in source, write it out
for line in csv_reader:
csv_writer.writerow(line)
Once run, <target>.csv
should exist.
The cool thing about delimiter
– if the specified delimiter appears as a value in the source file, then it gets wrapped in double quotes to avoid confusion.
Read csv with dictionary reader #
Creates a dictionary per record. Makes it easier to parse information by being able to specify by key.
with open('<file>.csv', 'r') as csv_file:
csv_reader = csv.DictReader(csv_file)
# print specific field
for line in csv_reader:
print(line['<key>'])
Write csv with dictionary writer #
With dictionary writer, need to specify field names.
If there are extra fields, specify extrasaction='ignore'
to ignore them.
with open('<file>.csv', 'r') as csv_file:
csv_reader = csv.reader(csv_file)
with open('<target>.csv', 'w') as new_file:
fieldnames = ['<key1>', '<key2>']
csv_writer = csv.DictWriter(new_file, fieldnames=fieldnames, delimiter='\t', extrasaction='ignore')
csv_writer.writeheader() # explicitly specify to write headers
for line in csv_reader:
csv_writer.writerow(line)
Modifying content #
If there’s anything that needs to be changed record by record, it should happen before the .writerow(line)
call.
For instance, if we want to fill a column with a set of 0’s to achieve a desired character length for a string:
with open('<file>.csv', 'r') as csv_file:
csv_reader = csv.reader(csv_file)
with open('<target>.csv', 'w') as new_file:
fieldnames = ['<key1>', '<key2>']
csv_writer = csv.DictWriter(new_file, fieldnames=fieldnames, delimiter='\t', extrasaction='ignore')
csv_writer.writeheader() # explicitly specify to write headers
for line in csv_reader:
line[fieldnames[0]] = line[fieldnames[0]].zfill(10)
csv_writer.writerow(line)
Pandas write to csv #
Alternatively, Pandas can be used to write a dataframe to csv.
import pandas as pd
some_dataframe("<path to file>.csv", index=False)
The index=False
piece removes the index.