How do I read and write CSV files with Python?
Here are some minimal, complete examples of how to read and write CSV files with Python.
Python 3: Reading a CSV file
Pure Python
import csv
# Define data
data = [
    (1, "A towel,", 1.0),
    (42, " it says, ", 2.0),
    (1337, "is about the most ", -1),
    (0, "massively useful thing ", 123),
    (-2, "an interstellar hitchhiker can have.", 3),
]
# Write CSV file (newline="" lets the csv module control the line endings)
with open("test.csv", "w", newline="") as fp:
    writer = csv.writer(fp, delimiter=",")
    # writer.writerow(["your", "header", "foo"])  # write header
    writer.writerows(data)
# Read CSV file
with open("test.csv", newline="") as fp:
    reader = csv.reader(fp, delimiter=",", quotechar='"')
    # next(reader, None)  # skip the headers
    data_read = [row for row in reader]
print(data_read)
After that, the contents of data_read are:
[['1', 'A towel,', '1.0'],
['42', ' it says, ', '2.0'],
['1337', 'is about the most ', '-1'],
['0', 'massively useful thing ', '123'],
['-2', 'an interstellar hitchhiker can have.', '3']]
Please note that the csv module reads every field as a string. You need to convert the values to their column types manually.
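As a minimal sketch of that manual conversion: the column types below (int, str, float) are an assumption about what the example data is meant to hold, and `convert_row` is a hypothetical helper, not part of the csv module. An in-memory file stands in for test.csv so the snippet runs on its own.

```python
import csv
import io

# Assumed column types for the three columns of the example data.
COLUMN_TYPES = (int, str, float)


def convert_row(row, types=COLUMN_TYPES):
    """Apply one converter per column to a row of strings."""
    return tuple(convert(value) for convert, value in zip(types, row))


# io.StringIO stands in for open("test.csv", newline="").
sample = io.StringIO('1,"A towel,",1.0\r\n42," it says, ",2.0\r\n')
typed_rows = [convert_row(row) for row in csv.reader(sample)]
print(typed_rows)  # [(1, 'A towel,', 1.0), (42, ' it says, ', 2.0)]
```

A converter raises ValueError on a field it cannot parse, so real data may need a try/except around the conversion.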
A Python 2+3 version was here before (link), but Python 2 support is dropped. Removing the Python 2 stuff massively simplified this answer.
Related
- How do I write data into csv format as string (not file)?
- How can I use io.StringIO() with the csv module?: This is interesting if you want to serve a CSV on-the-fly with Flask, without actually storing the CSV on the server.
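For the string-instead-of-file case, one possible sketch is to hand csv.writer an io.StringIO buffer. The helper name `rows_to_csv_string` is made up for illustration; a web framework would then return the resulting string as the response body.

```python
import csv
import io


def rows_to_csv_string(rows):
    """Serialize an iterable of rows to CSV text without touching the disk."""
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


csv_text = rows_to_csv_string([(1, "A towel,", 1.0), (42, " it says, ", 2.0)])
print(repr(csv_text))  # '1,"A towel,",1.0\r\n42," it says, ",2.0\r\n'
```

The rows end in \r\n because that is the csv module's default line terminator.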
mpu
Have a look at my utility package mpu for a super simple and easy-to-remember interface:
import mpu.io
data = mpu.io.read('example.csv', delimiter=',', quotechar='"', skiprows=None)
mpu.io.write('example.csv', data)
Pandas
import pandas as pd
# Read the CSV into a pandas data frame (df)
# With a df you can do many things
# most important: visualize data with Seaborn
df = pd.read_csv('myfile.csv', sep=',')
print(df)
# Or export it in many ways, e.g. a list of tuples
tuples = [tuple(x) for x in df.values]
# or export it as a list of dicts (one dict per row)
dicts = df.to_dict(orient="records")
See the read_csv docs for more information. Please note that pandas treats the first line as the header by default; pass header=None if the file has no header line, or set header to the row number that holds the column names.
If you haven't heard of Seaborn, I recommend having a look at it.
Other
Reading CSV files is supported by a bunch of other libraries, for example:
dask.dataframe.read_csv
spark.read.csv
Created CSV file
1,"A towel,",1.0
42," it says, ",2.0
1337,is about the most ,-1
0,massively useful thing ,123
-2,an interstellar hitchhiker can have.,3
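Only the first two rows are quoted because the writer's default quoting mode, csv.QUOTE_MINIMAL, quotes a field only when it contains a special character such as the delimiter. A small sketch of that behaviour follows; `to_csv_line` is a hypothetical helper for illustration.

```python
import csv
import io


def to_csv_line(row, quoting=csv.QUOTE_MINIMAL):
    """Format a single row as one CSV line (without the line terminator)."""
    buffer = io.StringIO()
    csv.writer(buffer, quoting=quoting, lineterminator="\n").writerow(row)
    return buffer.getvalue().rstrip("\n")


# The text field contains a comma, so it must be quoted.
print(to_csv_line((1, "A towel,", 1.0)))              # 1,"A towel,",1.0
# No special characters, so nothing is quoted.
print(to_csv_line((1337, "is about the most ", -1)))  # 1337,is about the most ,-1
# QUOTE_ALL quotes every field, including the numbers.
print(to_csv_line((1337, "is about the most ", -1), quoting=csv.QUOTE_ALL))
# "1337","is about the most ","-1"
```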
Common file endings
.csv
Working with the data
After reading the CSV file into a list of tuples / dicts or a pandas DataFrame, you are simply working with that kind of data. Nothing CSV-specific.
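To illustrate, here is a small sketch that reads rows into dicts with csv.DictReader and then filters and sorts them with ordinary Python. The column names (id, text, value) and the sample rows are invented for this example.

```python
import csv
import io

# Made-up sample with a header line; a real file would be opened with
# open("some.csv", newline="") instead.
sample = io.StringIO("id,text,value\n1,towel,1.0\n42,says,2.0\n1337,most,-1\n")

records = list(csv.DictReader(sample))  # one dict per row, keyed by header

# From here on it is plain list/dict handling.
positive = [r for r in records if float(r["value"]) > 0]
positive.sort(key=lambda r: float(r["value"]), reverse=True)
print([r["id"] for r in positive])  # ['42', '1']
```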
Alternatives
- JSON: Nice for writing human-readable data; VERY commonly used (read & write)
- CSV: Super simple format (read & write)
- YAML: Nice to read, similar to JSON (read & write)
- pickle: A Python serialization format (read & write)
- MessagePack (Python package): More compact representation (read & write)
- HDF5 (Python package): Nice for matrices (read & write)
- XML: exists too *sigh* (read & write)
For your application, the following might be important:
- Support by other programming languages
- Reading / writing performance
- Compactness (file size)
See also: Comparison of data serialization formats
In case you are rather looking for a way to make configuration files, you might want to read my short article Configuration files in Python
How to read and write to CSV Files in Python
Here is a simple example:
data.csv is a CSV file with one column and multiple rows.
results.csv contains the mean and median of the input; it is a CSV file with 1 row and 2 columns (mean in the 1st column, median in the 2nd).
Example:
import csv
import numpy as np
import pandas as pd
# Load the data (the file has no header row)
data = pd.read_csv("data.csv", header=None)
# Calculate the statistics for the first column, which holds the data
mean = np.mean(data.loc[:, 0])
median = np.median(data.loc[:, 0])
# One output row: mean in the 1st column, median in the 2nd
row = [mean, median]
# Python 3: open in text mode with newline="" (binary mode "wb" raises a TypeError)
with open("results.csv", "w", newline="") as file:
    writer = csv.writer(file)
    writer.writerow(row)
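If pulling in numpy and pandas is more than the task needs, the same idea can be sketched with the standard library only. This swaps in the statistics module and is not the approach used above; `summarize` is a hypothetical helper, and in-memory files stand in for data.csv and results.csv.

```python
import csv
import io
import statistics


def summarize(in_fp, out_fp):
    """Read one numeric column and write a single row: mean, median."""
    values = [float(row[0]) for row in csv.reader(in_fp) if row]
    csv.writer(out_fp).writerow(
        [statistics.mean(values), statistics.median(values)]
    )


source = io.StringIO("1\n2\n3\n10\n")  # stands in for data.csv
target = io.StringIO()                 # stands in for results.csv
summarize(source, target)
print(repr(target.getvalue()))  # '4.0,2.5\r\n'
```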
Reading from a sensor and writing to a CSV file
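Following Pranav Hosangadi's advice, I ended up handling SIGINT and SIGTERM manually:
import serial
import signal
interrupted = False
def signal_handler(signum, frame):
    # Only set a flag; the main loop checks it and exits cleanly,
    # so the with block can close the file
    global interrupted
    interrupted = True
signal.signal(signal.SIGINT, signal_handler)
signal.signal(signal.SIGTERM, signal_handler)
if __name__ == '__main__':
    ser = serial.Serial('COM4')
    # filename must be defined elsewhere; this snippet does not show it
    with open(filename, 'w', newline='') as f:
        while True:
            if ser.in_waiting > 0:
                # readline() returns bytes; decode before writing to a text file
                temp = ser.readline()
                f.write(temp.decode())
            if interrupted:
                break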
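The write loop can also be sketched without any hardware attached. The version below assumes the sensor sends comma-separated ASCII lines, which the original does not state; `log_readings` and `should_stop` are hypothetical names, and a list of byte strings stands in for the serial port.

```python
import csv
import io


def log_readings(lines, out_fp, should_stop=lambda: False):
    """Write each raw byte line as one CSV row until should_stop() is true."""
    writer = csv.writer(out_fp)
    count = 0
    for raw in lines:
        if should_stop():
            break
        # Serial reads yield bytes; decode and drop the trailing line ending.
        text = raw.decode("ascii", errors="replace").strip()
        if text:  # skip empty lines
            writer.writerow(text.split(","))
            count += 1
    return count


fake_sensor = [b"21.5,40\r\n", b"\r\n", b"21.7,41\r\n"]  # invented readings
out = io.StringIO()
written = log_readings(fake_sensor, out)
print(written, repr(out.getvalue()))
```

In the real program, should_stop could simply return the flag that the signal handler sets.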
Writing to a CSV file with only one header line
There are two issues with your code. First, you are enclosing a list in another list: [header] is the same as [['cost', 'tax', 'percentage_of_pay']], which is a list of lists.
Second, you would normally write the header first and then write the data in a loop, one call per data row.
You probably want something like:
import csv
with open('Sales-and-cost.csv', 'w', newline='') as f:
    writer = csv.writer(f, delimiter='\t', lineterminator='\n')
    writer.writerow(header)  # the header is written exactly once
    for row in rows:
        writer.writerow(row)
Where rows is a list of lists containing the output data, i.e.
rows = [[price_1, taxes_1, pay_percentage_1],[price_2, taxes_2, pay_percentage_2],[price_3, taxes_3, pay_percentage_3]]
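Put together, a self-contained sketch might look like this. The numbers in rows are made up, `write_table` is a hypothetical helper, and an in-memory buffer replaces the output file so the result can be inspected directly.

```python
import csv
import io

header = ["cost", "tax", "percentage_of_pay"]
rows = [[100, 19, 0.1], [250, 47.5, 0.25]]  # made-up values


def write_table(fp, header, rows):
    """Write the header once, then one line per data row."""
    writer = csv.writer(fp, delimiter="\t", lineterminator="\n")
    writer.writerow(header)  # a flat list, not [header]
    writer.writerows(rows)   # equivalent to looping over writerow


buffer = io.StringIO()
write_table(buffer, header, rows)
print(buffer.getvalue())
```

Passing [header] to writerow instead would write the whole list as a single field, which is the first issue described above.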