Why Can't I Repeat the 'For' Loop for CSV.Reader


The csv reader is an iterator over the file. Once you have gone through it, you have read to the end of the file, so there is nothing more to read. If you need to go through it again, you can seek back to the beginning of the file:

fh.seek(0)

This will reset the file to the beginning so you can read it again. Depending on the code, it may also be necessary to skip the field name header:

next(fh)

This is necessary for your code, since the DictReader consumed that line the first time around to determine the field names, and it's not going to do that again. It may not be necessary for other uses of csv.
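Putting the two steps together, here is a minimal sketch of the rewind-and-skip-header pattern (using an in-memory file in place of a real one; the data is made up for illustration):

```python
import csv
import io

# Simulated CSV file; a real file object opened with open() behaves the same.
fh = io.StringIO("name,qty\napple,3\npear,5\n")
reader = csv.DictReader(fh)

first_pass = [row["name"] for row in reader]   # consumes the whole file

fh.seek(0)        # rewind the file to the beginning
next(fh)          # skip the header line; DictReader already recorded the field names

second_pass = [row["name"] for row in reader]  # iterating works again
```

The same DictReader object can be reused because its fieldnames are cached after the first pass; only the underlying file position needs resetting.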

If the file isn't too big and you need to do several things with the data, you could also just read the whole thing into a list:

data = list(read)

Then you can do what you want with data.
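For example (an illustrative sketch with made-up data; read mirrors the reader name used above), the cached list can be iterated as many times as you like:

```python
import csv
import io

# Simulated file standing in for the real CSV.
f = io.StringIO("a,b\n1,2\n3,4\n")
read = csv.DictReader(f)

data = list(read)                     # read the whole file once

first = [row["a"] for row in data]    # first pass over the cached rows
second = [row["a"] for row in data]   # a second pass works too, unlike the reader
```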

Python CSV needs to be re-read in before each dict operation

The reason it doesn't work the second time is that the file pointer is already at the end of the file, so there is nothing to read; your reader object has already reached the end of the file.

By the way, don't use dict as a variable name; it shadows the built-in dict type.
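To see why the shadowing matters, a small illustration (the names here are hypothetical):

```python
dict = {"a": 1}              # shadows the built-in dict type

try:
    dict(b=2)                # now calls the variable, not the type
    shadowed_ok = True
except TypeError:
    shadowed_ok = False      # TypeError: 'dict' object is not callable

del dict                     # remove the shadow; the built-in is visible again
restored = dict(b=2)
```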

You can store all the rows in a list, so that you don't have to read the file again:

with open('minimal.csv', 'r') as readin:
    reader = csv.DictReader(readin, delimiter=',', quotechar="\"")
    rows = list(reader)

for row in rows:
    print(row)

Can't run while loop through csv files using python

The line:

x = str(n)

is in the wrong place; it needs to be inside the loop at the top.

You only set it to the initial value of n, and then it stays that way.

You do not actually even need it. Just put the str(n) directly in the string creation like this:

"C:\\Desktop\\server_" + str(n) + ".csv"

I would also recommend you use a different loop structure. The while format you are using does not make sense for the way you are using it. You should be using a for loop, like this:

import pandas as pd
import csv

for n in range(1, 9):
    with open("C:\\Desktop\\server_" + str(n) + ".csv", 'r', newline='') as infile, \
         open("C:\\Desktop\\server_" + str(n) + "_out.csv", 'w', newline='') as outfile:
        reader = csv.reader(infile)
        writer = csv.writer(outfile)
        for row in reader:
            writer.writerow(item.replace(",", "") for item in row)

This fixes a few things. It simplifies the code, but there were also some other issues: the increment of n was in the wrong place, which this corrects, and the import csv was inside the loop. That probably wouldn't break anything, but it is at the very least inefficient, since it would attempt to reload the csv module on every pass through the outer loop.

Loop over rows of csv.DictReader more than once

You read the entire file the first time you iterated, so there is nothing left to read the second time. Since you don't appear to be using the csv data the second time, it would be simpler to count the number of rows and just iterate over that range the second time.

import csv

with open('MySpreadsheet.csv', newline='') as f:
    reader = csv.DictReader(f, dialect=csv.excel)
    row_count = 0

    for row in reader:
        row_count += 1
        print(row)

for i in range(row_count):
    print('Stack Overflow')

If you need to iterate over the raw csv data again, it's simple to open the file again. Most likely, you should be iterating over some data you stored the first time, rather than reading the file again.

with open('MySpreadsheet.csv', newline='') as f:
    reader = csv.DictReader(f, dialect=csv.excel)

    for row in reader:
        print(row)

with open('MySpreadsheet.csv', newline='') as f:
    reader = csv.DictReader(f, dialect=csv.excel)

    for row in reader:
        print('Stack Overflow')

If you don't want to open the file again, you can seek to the beginning, skip the header, and iterate again.

with open('MySpreadsheet.csv', newline='') as f:
    reader = csv.DictReader(f, dialect=csv.excel)

    for row in reader:
        print(row)

    f.seek(0)
    next(reader)

    for row in reader:
        print('Stack Overflow')

Nested for loop doesn't work in Python while reading the same csv file

For sake of example, let's say I have a CSV file which looks like this:

foods.csv

beef,stew,apple,sauce
apple,pie,potato,salami
tomato,cherry,pie,bacon

And the following code, which is meant to simulate the structure of your current code:

def main():
    import csv

    keywords = ["apple", "pie"]

    with open("foods.csv", "r") as file:
        reader = csv.reader(file)

        for keyword in keywords:
            for row in reader:
                if keyword in row:
                    print(f"{keyword} was in {row}")

    print("Done")

main()

The desired result is that, for every keyword in my list of keywords, if that keyword exists in one of the lines in my CSV file, I will print a string to the screen - indicating in which row the keyword has occurred.

However, here is the actual output:

apple was in ['beef', 'stew', 'apple', 'sauce']
apple was in ['apple', 'pie', 'potato', 'salami']
Done
>>>

It was able to find both instances of the keyword apple in the file, but it didn't find pie! So, what gives?

The problem

The file handle (in your case csvfile) yields its contents once, and then they are consumed. Our reader object wraps around the file-handle and consumes its contents until they are exhausted, at which point there will be no rows left to read from the file (the internal file pointer has advanced to the end), and the inner for-loop will not execute a second time.

The solution

Either move the internal file pointer to the beginning using seek after each iteration of the outer for-loop, or read the contents of the file into a list (or a similar collection) once, and then iterate over the list instead:

Updated code:

def main():
    import csv

    keywords = ["apple", "pie"]

    with open("foods.csv", "r") as file:
        contents = list(csv.reader(file))

    for keyword in keywords:
        for row in contents:
            if keyword in row:
                print(f"{keyword} was in {row}")

    print("Done")

main()

New output:

apple was in ['beef', 'stew', 'apple', 'sauce']
apple was in ['apple', 'pie', 'potato', 'salami']
pie was in ['apple', 'pie', 'potato', 'salami']
pie was in ['tomato', 'cherry', 'pie', 'bacon']
Done
>>>

Python: Speed up FOR loops while searching through csv files

If the part file is small enough to fit in memory, you can speed this up by loading it into a dictionary (an efficient, fast-access data structure). When you loop through file2, you're looking for a line where row[2] == partnumber, and then (presumably) using row[4], so a dictionary with row[2] as the key and row[4] as the value makes the lookup really fast:

parts = {}
with [however you open CSV 2] as f:
    for row in f:
        parts[row[2]] = row[4]

Then instead of re-opening that file every time, just do:

data = parts[partnumber]
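A concrete sketch of the whole pattern, with an in-memory file and made-up part numbers standing in for the real "CSV 2" (the column positions follow the row[2]/row[4] usage above):

```python
import csv
import io

# Assumed stand-in for the real part file from the question.
csv2 = io.StringIO(
    "x,y,PN-1,z,widget\n"
    "x,y,PN-2,z,gadget\n"
)

parts = {}
for row in csv.reader(csv2):
    parts[row[2]] = row[4]            # part number -> value from column 4

# Each lookup is now O(1); .get avoids a KeyError for unknown part numbers.
data = parts["PN-2"]
missing = parts.get("PN-404", "not found")
```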

EDIT: There are also a bunch of other things you can do to make this code better:

  • Consider following PEP8, since it will make it easier for other people to read your code. Having a bunch of variables starting with upper-case letters is tricking the syntax highlighting on this site into thinking they're classes.
  • Use True and False for booleans, rather than the strings "Y" and "N".

    part_exists = False
    if some_condition:
        part_exists = True
    if part_exists:
        selected_object = "X" # not clear what this does so I'm not messing with it
  • When you're unpacking a row into several variables, you can do that much more easily:

    for row4 in r4:
        file1, file1Del, file1titles, file1PartNumber = row4
  • You repeat a lot of code to handle the case of one row, two rows, three rows, and four rows. Consider using loops and lists here. It would also let you get rid of that eval.

This may seem like pointless nitpicking, but code that doesn't repeat so much is much easier to improve.
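As an illustration of that last point (with hypothetical names and data, since the original repeated blocks aren't shown), a list plus one loop replaces both the near-duplicate blocks and the eval:

```python
# Instead of separate variables row1, row2, row3 selected via
# eval("row" + str(i)), keep the rows in one list.
rows = [
    ["a", "b"],
    ["c", "d"],
    ["e", "f"],
]

# One loop covers any number of rows, however many there turn out to be.
for i, row in enumerate(rows, start=1):
    print(f"row {i}: {row}")

chosen = rows[2]    # plain indexing replaces eval("row3")
```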


