Inline CSV File Editing with Python

Inline CSV File Editing with Python

No, you should not attempt to write to the file you are currently reading from. You can do it if you keep seeking back after reading a row but it is not advisable, especially if you are writing back more data than you read.

The canonical method is to write to a new, temporary file and move that into place over the old file you read from.

from tempfile import NamedTemporaryFile
import shutil
import csv

filename = 'tmpEmployeeDatabase.csv'
tempfile = NamedTemporaryFile('w+t', newline='', delete=False)

with open(filename, 'r', newline='') as csvFile, tempfile:
reader = csv.reader(csvFile, delimiter=',', quotechar='"')
writer = csv.writer(tempfile, delimiter=',', quotechar='"')

for row in reader:
row[1] = row[1].title()
writer.writerow(row)

shutil.move(tempfile.name, filename)

I've made use of the tempfile and shutil libraries here to make the task easier.

Editing an open csv file

You can build a generator to allow editing the csv file on the fly. This does not actually edit the file, it simply modifies what the csv reader sees.

Code:

def edit_csv_on_fly(csv_file):
for line in csv_file.readlines():
# replacing comma with semicolon
yield line.replace(",", ";")

Test Code:

import csv
with open('myfile.csv', 'rU') as csvfile:
csv_read = csv.reader(
edit_csv_on_fly(csvfile), delimiter=";", quotechar='"')
for row in csv_read:
print(row)

Test Data:

"COLUMN A","COLUMN B","COLUMN C","COLUMN D","COLUMN E"
"COLUMN A","COLUMN B","COLUMN C","COLUMN D","COLUMN E"
"COLUMN A","COLUMN B","COLUMN C","COLUMN D";"COLUMN E"

Results:

['COLUMN A', 'COLUMN B', 'COLUMN C', 'COLUMN D', 'COLUMN E']
['COLUMN A', 'COLUMN B', 'COLUMN C', 'COLUMN D', 'COLUMN E']
['COLUMN A', 'COLUMN B', 'COLUMN C', 'COLUMN D', 'COLUMN E']

Reading, editing and writing data from a csv file

You have to change modify in this loop.

   for line in csv_reader:
new_line = modify(line)
csv_writer.writerow(new_line)

This will work because here the line is a list of strings. So if your string is in the first column then it can be accessed using line[0]. This is for every row. To get the current row index, enumerate can be used. Then the code becomes

   for row_index, line in enumerate(csv_reader):
new_line = modify(row_index, line)
csv_writer.writerow(new_line)

Note I added a modify function. Now If we implement the modify function with your logic then the problem can be solved easily.

def modify(row_index, line):
title_names = ["N", "O", "P", "Q", "R", "S", "T"]
if row_index == 0:
line[11] = "New Title L"
line[12] = "New Title M"

# Also add 7 more columns
if row_index == 0:
line.extend(["New Title %s" % n for n in title_names])
else:
# Add empty columns to make sure column size is always same.
line.extend([""] * 7)
return line

How to update rows in a CSV file

With the csv module you can iterate over the rows and access each one as a dict. As also noted here, the preferred way to update a file is by using temporary file.

from tempfile import NamedTemporaryFile
import shutil
import csv

filename = 'my.csv'
tempfile = NamedTemporaryFile(mode='w', delete=False)

fields = ['ID', 'Name', 'Course', 'Year']

with open(filename, 'r') as csvfile, tempfile:
reader = csv.DictReader(csvfile, fieldnames=fields)
writer = csv.DictWriter(tempfile, fieldnames=fields)
for row in reader:
if row['ID'] == str(stud_ID):
print('updating row', row['ID'])
row['Name'], row['Course'], row['Year'] = stud_name, stud_course, stud_year
row = {'ID': row['ID'], 'Name': row['Name'], 'Course': row['Course'], 'Year': row['Year']}
writer.writerow(row)

shutil.move(tempfile.name, filename)

If that's still not working you might try one of these encodings:

with open(filename, 'r', encoding='utf8') as csvfile, tempfile:
with open(filename, 'r', encoding='ascii') as csvfile, tempfile:

Edit: added str, print and encodings

Copy and modify a row in csv file

You can use a Dictreader to read each row with the headers. Then, split() the DealerID column on commas, and replicate the row with the matching ID.

So for an input file file.csv like:

Dealership|Address|Zipcode|DealerID
Charleston Kia|123 Bowman rd|29412|12345,21456,32145

The following code:

import csv

with open("file.csv") as file, open("new.csv", 'w', newline='') as out_file:
reader = csv.DictReader(file, delimiter='|')
writer = csv.DictWriter(out_file, fieldnames=reader.fieldnames)
writer.writeheader()
for row in reader:
for _id in row["DealerID"].split(','):
writer.writerow({**row, "DealerID": _id})

Will create a new file, new.csv with contents:

Dealership,Address,Zipcode,DealerID
Charleston Kia,123 Bowman rd,29412,12345
Charleston Kia,123 Bowman rd,29412,21456
Charleston Kia,123 Bowman rd,29412,32145

Open a csv file in 'rb' mode and iterate over it

You are not lucky here. Answers from Martijn Pieters are always high quality ones. This one was but was targetted at Python 2 while you use Python 3. And the csv semantics have changed... You should no longer use binary mode in Python 3 and do what you did for the initial csv file:

...
tempfile = NamedTemporaryFile(delete=False, newline='', mode='w')

with open(filename, mode='r', newline='') as csvFile, tempfile:
reader = csv.reader(csvFile)
writer = csv.writer(tempfile)
...

Python - Replacing a specific value in a CSV file while keeping the rest

This is how it would look like using pandas

import pandas as pd
from io import StringIO

# Things you'll get from a user
globalUsername = "Den1994"
field = 'Fav Artist'
new_value = 'Linkin Park'

# Things you'll probably get from a data file
data = """
Username,Password,Name,DOB,Fav Artist,Fav Genre
Den1994,Denis1994,Denis,01/02/1994,Eminem,Pop
Joh1997,John1997,John,03/04/1997,Daft Punk,House
"""

# Load your data (e.g. from a CSV file)
df = pd.read_csv(StringIO(data)).set_index('Username')

print(df)

# Now change something
df.loc[globalUsername][field] = new_value

print(df)

Here df.loc[] allows you to access a row by the index. In this case Username is set as index. Then, [field] selects the column in that row.

Also, consider this:

df.loc[globalUsername][['Fav Artist', 'Fav Genre']] = 'Linkin Park', 'Nu Metal'

In case you have a my-data.csv file you can load it with:

df = pd.read_csv('my-data.csv') 

The code above will return

           Password   Name         DOB Fav Artist Fav Genre
Username
Den1994 Denis1994 Denis 01/02/1994 Eminem Pop
Joh1997 John1997 John 03/04/1997 Daft Punk House

and

           Password   Name         DOB   Fav Artist Fav Genre
Username
Den1994 Denis1994 Denis 01/02/1994 Linkin Park Pop
Joh1997 John1997 John 03/04/1997 Daft Punk House

Python CSV edit fields with numbers greater than a specific number

import csv
import tempfile
import shutil

filename = 'distMatrix.csv'
with open(filename, 'rb') as f, tempfile.NamedTemporaryFile(mode='wb', delete=False) as g:
writer = csv.writer(g)
for row in csv.reader(f):
row = [element if int(element) < 10**6 else 0 for element in row]
writer.writerow(row)

shutil.move(g.name, filename)


Related Topics



Leave a reply



Submit