Why can't I repeat the 'for' loop for csv.Reader?
The csv reader is an iterator over the file. Once you have iterated through it, you have read to the end of the file, so there is nothing more to read. If you need to go through it again, you can seek back to the beginning of the file:
fh.seek(0)
This will reset the file to the beginning so you can read it again. Depending on the code, it may also be necessary to skip the field name header:
next(fh)
This is necessary for your code, since the DictReader consumed that line the first time around to determine the field names, and it's not going to do that again. It may not be necessary for other uses of csv.
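A minimal sketch of the rewind-and-skip pattern, using an in-memory stream with made-up data in place of a real file:

```python
import csv
import io

# stand-in for an open CSV file; contents are invented for illustration
fh = io.StringIO("name,qty\napple,3\npear,5\n")

reader = csv.DictReader(fh)
first_pass = [row["name"] for row in reader]   # consumes the stream

fh.seek(0)   # rewind the underlying file object
next(fh)     # skip the header line; DictReader kept the field names
second_pass = [row["name"] for row in reader]

print(first_pass, second_pass)  # ['apple', 'pear'] ['apple', 'pear']
```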
If the file isn't too big and you need to do several things with the data, you could also just read the whole thing into a list:
data = list(read)
Then you can do what you want with data.
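A sketch of that list approach, with an in-memory stream and invented rows, showing two separate uses of the stored data:

```python
import csv
import io

# stand-in for the open file; rows are invented for illustration
read = csv.reader(io.StringIO("a,1\nb,2\n"))
data = list(read)          # read everything into memory once

names = [row[0] for row in data]           # first use
total = sum(int(row[1]) for row in data)   # second use, no re-read needed
print(names, total)  # ['a', 'b'] 3
```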
Python CSV needs to be re-read in before each dict operation
The reason it doesn't work the second time is that the file pointer is already at the end of the file; your reader object has nothing left to read.
By the way, don't use dict as a variable name; it shadows the built-in dict() function.
You can store all the rows in a list, so that you don't have to read the file again:
with open('minimal.csv', 'r') as readin:
    reader = csv.DictReader(readin, delimiter=',', quotechar='"')
    rows = list(reader)

for row in rows:
    print(row)
Can't run while loop through csv files using python
The line:
x = str(n)
is in the wrong place; it needs to be inside the loop at the top. You only set it to the initial value of n, and then it stays that way.
You don't actually even need it. Just put str(n) directly in the string creation, like this:
"C:\\Desktop\\server_" + str(n) + ".csv"
I would also recommend a different loop structure. The while format you are using does not make sense for this task; you should be using a for loop, like this:
import pandas as pd
import csv

for n in range(1, 9):
    with open("C:\\Desktop\\server_" + str(n) + ".csv", 'r', newline='') as infile, \
         open("C:\\Desktop\\server_" + str(n) + "_out.csv", 'w', newline='') as outfile:
        reader = csv.reader(infile)
        writer = csv.writer(outfile)
        for row in reader:
            writer.writerow(item.replace(",", "") for item in row)
This fixes a few things. Beyond simplifying the code, it corrects the placement of the increment of n, which you had in the wrong place. Also, you were doing the import csv inside the loop, which would probably not break anything but would at the very least be inefficient, since it would attempt to reload the csv module on every pass through the outer loop.
Loop over rows of csv.DictReader more than once
You read the entire file the first time you iterated, so there is nothing left to read the second time. Since you don't appear to be using the csv data the second time, it would be simpler to count the number of rows and just iterate over that range the second time.
import csv

with open('MySpreadsheet.csv', 'rU') as f:
    reader = csv.DictReader(f, dialect=csv.excel)
    row_count = 0
    for row in reader:
        row_count += 1
        print(row)

for i in range(row_count):
    print('Stack Overflow')
If you need to iterate over the raw csv data again, it's simple to open the file again. Most likely, you should be iterating over some data you stored the first time, rather than reading the file again.
with open('MySpreadsheet.csv', 'rU') as f:
    reader = csv.DictReader(f, dialect=csv.excel)
    for row in reader:
        print(row)

with open('MySpreadsheet.csv', 'rU') as f:
    reader = csv.DictReader(f, dialect=csv.excel)
    for row in reader:
        print('Stack Overflow')
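The store-it-once alternative mentioned above can be sketched like this (the sample file contents are invented so the example runs end to end):

```python
import csv

# create a tiny stand-in for MySpreadsheet.csv, purely for illustration
with open('MySpreadsheet.csv', 'w', newline='') as f:
    f.write('city,team\nBoston,Red Sox\nChicago,Cubs\n')

# read the file once, store the rows, then iterate the list as often as needed
with open('MySpreadsheet.csv', newline='') as f:
    rows = list(csv.DictReader(f, dialect=csv.excel))

for row in rows:
    print(row['city'])
for row in rows:
    print('Stack Overflow')
```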
If you don't want to open the file again, you can seek to the beginning, skip the header, and iterate again.
with open('MySpreadsheet.csv', 'rU') as f:
    reader = csv.DictReader(f, dialect=csv.excel)
    for row in reader:
        print(row)
    f.seek(0)
    next(reader)
    for row in reader:
        print('Stack Overflow')
Nested for loop doesn't work in Python while reading the same csv file
For sake of example, let's say I have a CSV file which looks like this:
foods.csv
beef,stew,apple,sauce
apple,pie,potato,salami
tomato,cherry,pie,bacon
And the following code, which is meant to simulate the structure of your current code:
def main():
    import csv

    keywords = ["apple", "pie"]

    with open("foods.csv", "r") as file:
        reader = csv.reader(file)
        for keyword in keywords:
            for row in reader:
                if keyword in row:
                    print(f"{keyword} was in {row}")

    print("Done")

main()
The desired result is that, for every keyword in my list of keywords, if that keyword exists in one of the lines in my CSV file, I will print a string to the screen - indicating in which row the keyword has occurred.
However, here is the actual output:
apple was in ['beef', 'stew', 'apple', 'sauce']
apple was in ['apple', 'pie', 'potato', 'salami']
Done
>>>
It was able to find both instances of the keyword apple in the file, but it didn't find pie! So, what gives?
The problem
The file handle (in your case csvfile) yields its contents once, and then they are consumed. Our reader object wraps around the file handle and consumes its contents until they are exhausted, at which point there are no rows left to read from the file (the internal file pointer has advanced to the end), and the inner for-loop will not execute a second time.
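This exhaustion behaviour is easy to see in isolation with an in-memory stream:

```python
import csv
import io

reader = csv.reader(io.StringIO("a,b\nc,d\n"))

first = list(reader)    # drains the underlying stream
second = list(reader)   # nothing left to yield

print(first)    # [['a', 'b'], ['c', 'd']]
print(second)   # []
```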
The solution
Either move the internal file pointer to the beginning using seek after each iteration of the outer for-loop, or read the contents of the file once into a list or similar collection, and then iterate over the list instead:
Updated code:
def main():
    import csv

    keywords = ["apple", "pie"]

    with open("foods.csv", "r") as file:
        contents = list(csv.reader(file))

    for keyword in keywords:
        for row in contents:
            if keyword in row:
                print(f"{keyword} was in {row}")

    print("Done")

main()
New output:
apple was in ['beef', 'stew', 'apple', 'sauce']
apple was in ['apple', 'pie', 'potato', 'salami']
pie was in ['apple', 'pie', 'potato', 'salami']
pie was in ['tomato', 'cherry', 'pie', 'bacon']
Done
>>>
Python: Speed up FOR loops while searching through csv files
If the part file is small enough to fit in memory, you can speed this up by loading it into a dictionary (an efficient, fast-access data structure). When you loop through file2, you're looking for a line where row[2] == partnumber, and then (presumably) using row[4], so a dictionary with row[2] as the key and row[4] as the value would make the lookup really fast:
parts = {}
with [however you open CSV 2] as f:
    for row in csv.reader(f):
        parts[row[2]] = row[4]
Then instead of re-opening that file every time, just do:
data = parts[partnumber]
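With made-up part numbers, the lookup looks like this; dict.get is a safe variant if a part number might be missing:

```python
# hypothetical mapping built as above: part number -> the row[4] value
parts = {"P-100": "bracket", "P-200": "hinge"}

partnumber = "P-100"
data = parts[partnumber]                 # constant-time lookup
missing = parts.get("P-999", "unknown")  # .get avoids a KeyError
print(data, missing)  # bracket unknown
```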
EDIT: There are also a bunch of other things you can do to make this code better:
- Consider following PEP 8, since it will make it easier for other people to read your code. Having a bunch of variables starting with upper-case letters is tricking the syntax highlighting on this site into thinking they're classes.
- Use True and False for booleans, rather than the strings "Y" and "N":
part_exists = False
if some_condition:
    part_exists = True
if part_exists:
    selected_object = "X"  # not clear what this does so I'm not messing with it
- When you're splitting an array into variables, you can do that much more easily:
for row in r4:
    file1, file1Del, file1titles, file1PartNumber = row
- You repeat a lot of code to handle the case of one row, two rows, three rows, and four rows. Consider using loops and lists here. It would also let you get rid of that eval.
This may seem like pointless nitpicking, but code that doesn't repeat so much is much easier to improve.
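Those last two suggestions combine naturally; a sketch with invented row data and names:

```python
# invented rows standing in for the four near-identical blocks
r4 = [
    ["fileA.csv", False, "titles A", "P-1"],
    ["fileB.csv", True, "titles B", "P-2"],
]

kept = []
for file1, file1Del, file1titles, file1PartNumber in r4:  # unpack, no indexing
    if not file1Del:
        kept.append((file1, file1PartNumber))
print(kept)  # [('fileA.csv', 'P-1')]
```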