Why does Python have a limit on the number of file handles?
The number of open files is limited by the operating system, not by Python itself. On Linux you can type
ulimit -n
to see what the limit is. If you are root, you can type
ulimit -n 2048
Now your program, started from that same shell, will run OK (as root), since you have raised the limit to 2048 open files.
Python: Which command increases the number of open files on Windows?
Try using win32file from pywin32:
import win32file
print(win32file._getmaxstdio())  # 512
win32file._setmaxstdio(1024)     # raise the limit for this process
print(win32file._getmaxstdio())  # 1024
Downsides of keeping file handles open?
You didn't say just how many files the process would end up holding open. If it's not so many that it creates a problem, then this could be a good approach. I doubt you can really know without trying it out with your data and in your execution environment.
In my experience, open() is relatively slow, so avoiding unnecessary calls is definitely worth thinking about. You also avoid setting up all the associated buffers, populating them, flushing them every time you close the file, and garbage-collecting them. Since you ask, file objects do come with large buffers. On OS X, the default buffer size is 8192 bytes (8 KB), and there is additional overhead for the object itself, as with all Python objects. So if you have hundreds or thousands of files and little RAM, it can add up. You can specify less buffering or no buffering at all, but that could defeat any efficiency gained from avoiding repeated opens.
Edit: For just 35 distinct files (or any two-digit number), you have nothing to worry about: the space that 35 output buffers will need (at 8 KB per buffer for the actual buffering) will not even be the biggest part of your memory footprint. So just go ahead and do it the way you proposed. You should see a dramatic speed improvement over opening and closing the file for each XML node.
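To make the "open each file once and keep it open" idea concrete, here is a minimal sketch. It assumes the input can be treated as (key, line) pairs with one output file per key; the function name write_grouped, the sample records, and the .txt file naming are all hypothetical, not taken from the question.

```python
import contextlib
import os
import tempfile


def write_grouped(records, out_dir):
    """Write (key, line) pairs to one file per key, opening each file only once."""
    handles = {}
    # ExitStack closes every file it has been handed, even if an error occurs.
    with contextlib.ExitStack() as stack:
        for key, line in records:
            handle = handles.get(key)
            if handle is None:
                path = os.path.join(out_dir, key + ".txt")
                handle = stack.enter_context(open(path, "w"))
                handles[key] = handle
            handle.write(line + "\n")
    return sorted(handles)


# Hypothetical sample data standing in for the parsed XML nodes.
out_dir = tempfile.mkdtemp()
records = [("a", "first"), ("b", "second"), ("a", "third")]
keys = write_grouped(records, out_dir)
```

With a few dozen distinct keys this stays far below typical per-process limits. With thousands of keys, the dictionary of handles is the place where you would add a cap or an eviction policy.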
PS. The default buffer size is given by io.DEFAULT_BUFFER_SIZE.
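As a rough illustration of the buffering options mentioned above, the following sketch prints the default buffer size and opens a file with a smaller buffer and with no buffer at all. The temporary path and the 4096-byte size are arbitrary choices for the example. Note that buffering=0 is only accepted in binary mode.

```python
import io
import os
import tempfile

# Default size used for buffered file objects on this interpreter.
print(io.DEFAULT_BUFFER_SIZE)

path = os.path.join(tempfile.mkdtemp(), "example.txt")

# Smaller-than-default buffer (4096 bytes is an arbitrary example value).
with open(path, "w", buffering=4096) as small_buffered:
    small_buffered.write("hello\n")

# Unbuffered output: every write goes straight to the OS. Binary mode only.
with open(path, "ab", buffering=0) as unbuffered:
    unbuffered.write(b"world\n")
```

Whether a smaller buffer actually helps depends on how many files are open and how much is written to each, so it is worth measuring before changing the default.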
IOError: [Errno 24] Too many open files -Python, Windows
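Apparently 512 is the default maximum on Windows. The limit comes from the C runtime's stdio layer rather than from Python itself, and it can be raised.
I found the solution here: https://stackoverflow.com/a/28212496/8875017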
import win32file
win32file._setmaxstdio(2048)