where to put freeze_support() in a Python script?
On Windows, all of your multiprocessing-using code must be guarded by if __name__ == "__main__":
So to be safe, I would put all of the code currently at the top level of your script into a main() function, and then just do this at the top level:
if __name__ == "__main__":
    main()
See the "Safe importing of main module" sub-section of the multiprocessing documentation for an explanation of why this is necessary. You probably don't need to call freeze_support() at all, though it won't hurt anything to include it.
Note that it's a best practice to use the if __name__ == "__main__" guard for scripts anyway, so that code isn't unexpectedly executed if you find you need to import your script into another script at some point in the future.
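Putting the advice above together, the restructured script looks roughly like this (worker() and its inputs are hypothetical placeholders for whatever your script actually does):

```python
import multiprocessing

def worker(x):
    # hypothetical worker; any picklable top-level function works here
    return x * x

def main():
    # everything that used to sit at the top level of the script goes here
    with multiprocessing.Pool(2) as pool:
        results = pool.map(worker, [1, 2, 3])
    return results

if __name__ == "__main__":
    # harmless when running normally; required when the script is frozen
    # into a Windows executable
    multiprocessing.freeze_support()
    print(main())
```

Because the child processes re-import this file, only the function definitions run on import; the Pool is created only in the original process.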
Multiprocessing in a Python function: where to put freeze_support()?
On Windows, and by default on macOS, the way a new process is "spawn"ed basically amounts to: start a new Python process, import all the same modules, import the 'main' file as a library, then use pickle to exchange which function to call and what the arguments are. The alternative on *nix systems is "fork", where the process memory is copied and the new process starts from the same point.
The important implication here is that when using "spawn", the "main" file you're running must not spawn more child processes when it is imported. If it did, the first children would spawn grandchildren when they import __main__, which would then spawn great-grandchildren when they import __main__, and so on, creating infinitely recursive child processes. This is obviously a problem, so Python raises an error if you attempt to create new processes in a child process during this import phase.
The solution is to prevent any code which spawns child processes from executing outside of the main process (this is also useful for things like running tests on a library when it's run as a main process rather than imported as a library).
if __name__ == "__main__":
    multiprocessing.freeze_support()
    trainTestResults = GGSCrossVal(data, 25, [10, 1, 0.1, 0.01, 0.001, 0.0001], [], False)
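A minimal, self-contained version of that pattern might look like this (evaluate() and the penalty list are hypothetical stand-ins for GGSCrossVal and its arguments):

```python
import multiprocessing

def evaluate(penalty):
    # hypothetical stand-in for one cross-validation run
    return penalty * 2

if __name__ == "__main__":
    # first statement under the guard, before any processes are created
    multiprocessing.freeze_support()
    penalties = [10, 1, 0.1]
    with multiprocessing.Pool() as pool:
        train_test_results = pool.map(evaluate, penalties)
    # child processes re-import this file, but because the Pool creation is
    # under the __main__ guard, importing it spawns nothing further
```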
The reason is the lack of fork() on Windows (which is not entirely true). Because of this, on Windows the fork is simulated by creating a new process in which the code that, on Linux, would run in the child process is run instead. As that code runs in a technically unrelated process, it has to be delivered there before it can be run: it is first pickled and then sent through a pipe from the original process to the new one. In addition, the new process is informed that it has to run the code passed through the pipe by being given the --multiprocessing-fork command-line argument. If you take a look at the implementation of the freeze_support() function, its task is to check whether the process it's running in is supposed to run code passed by pipe or not.
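The check it performs can be sketched roughly like this (a simplified illustration, not the real implementation, which lives in multiprocessing.spawn):

```python
def sketch_freeze_support(argv):
    # Simplified illustration: in a frozen child process the command line
    # carries --multiprocessing-fork, and the real freeze_support() would
    # then run the code received over the pipe and exit rather than
    # continuing into the application's own code.
    if len(argv) >= 2 and argv[1] == "--multiprocessing-fork":
        return "run code from pipe"
    return "continue as parent"
```

In a normal (non-frozen, parent) run the flag is absent, so freeze_support() returns immediately and your script continues, which is why including the call is harmless.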
freeze_support bug in using scikit-learn in the Anaconda python distro?
This probably means that you are on Windows and you have forgotten to use the proper idiom in the main module:
if __name__ == '__main__':
    freeze_support()