Understanding TensorFlow inter/intra parallelism threads
When both parameters are set to 1, there will be 1 thread running on 1 of the 4 cores. The core on which it runs might change but it will always be 1 at a time.
When running something in parallel, there is always a trade-off between the time lost on communication and the time gained through parallelization. The speedup depends on the hardware used and on the specific task (such as the size of the matrices). Sometimes running something in parallel is even slower than using one core.
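The trade-off can be made concrete with a toy cost model. This is only a sketch: the function `estimated_time` and the numbers are assumptions chosen for illustration, not measurements of TensorFlow. It assumes the work splits evenly across threads and that every additional thread adds a fixed coordination cost.

```python
def estimated_time(work, threads, overhead_per_thread):
    """Toy model: perfectly divisible work plus a fixed cost per extra thread."""
    return work / threads + overhead_per_thread * (threads - 1)

# Small task: coordination cost dominates, so more threads are slower.
small = [estimated_time(2.0, t, 2.0) for t in (1, 2, 4)]
print(small)  # [2.0, 3.0, 6.5]

# Large task: the same overhead is negligible, so more threads help.
large = [estimated_time(400.0, t, 2.0) for t in (1, 2, 4)]
print(large)  # [400.0, 202.0, 106.0]
```

Real workloads rarely split this cleanly, so measuring on the actual hardware remains the only reliable guide.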
For example, when using floats on a CPU, `(a + b) + c` is not always equal to `a + (b + c)` because of floating-point precision. Using multiple parallel threads means that operations like `a + b + c` will not always be computed in the same order, which can lead to slightly different results on each run. However, those differences are extremely small and will not affect the overall result in most cases. Completely reproducible results are usually only needed for debugging. Enforcing complete reproducibility would slow down multi-threading a lot.
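The effect of evaluation order can be seen in plain Python, without any threads involved. A minimal sketch, using the values 0.1, 0.2 and 0.3 purely as an illustration:

```python
# The same three numbers, added in two different orders.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)

print(left == right)       # False
print(abs(left - right))   # a difference far below 1e-15
```

A multi-threaded reduction may pick either grouping depending on which partial result is ready first, which is where the run-to-run variation comes from.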
Should I set `inter_op_parallelism_threads` and `intra_op_parallelism_threads` to 1 when I use Ray to create an actor?
It depends on how many resources you want the actor to use. If there is a dedicated machine for a given actor, and it's OK for the actor to use all of the resources on that machine, then use TensorFlow's default settings. If you are creating something more like one actor per core, then setting `inter_op_parallelism_threads` and `intra_op_parallelism_threads` to small values like 1 or 2 is a good idea.
In general, you can try both approaches and see which is faster.
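As a starting point for that experiment, the thread count per actor can be derived from the number of cores and actors. This is a rough sketch: `threads_per_actor` is a hypothetical helper, not part of Ray or TensorFlow, and it assumes the actors should share the cores evenly.

```python
import os

def threads_per_actor(num_actors, total_cores=None):
    """Split the available cores evenly between actors, never going below 1."""
    if num_actors < 1:
        raise ValueError("num_actors must be at least 1")
    if total_cores is None:
        # os.cpu_count() can return None, so fall back to a single core.
        total_cores = os.cpu_count() or 1
    return max(1, total_cores // num_actors)

print(threads_per_actor(num_actors=4, total_cores=8))  # 2
print(threads_per_actor(num_actors=8, total_cores=8))  # 1
```

The returned value would then be used for both thread settings inside each actor, and compared against the TensorFlow defaults by timing both variants.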
What is the difference between Keras backend + TensorFlow and Keras from TensorFlow using CPU (in TensorFlow 2.x)
Not exactly; it's not as simple as that. As per the official documentation:
intra_op_parallelism_threads - Certain operations like matrix multiplication and reductions can utilize parallel threads for speedups. A value of 0 means the system picks an appropriate number.
inter_op_parallelism_threads - Determines the number of parallel threads used by independent non-blocking operations. 0 means the system picks an appropriate number.
So technically you cannot limit the number of CPUs, only the number of parallel threads, which is sufficient for the sake of limiting resource consumption.
Regarding the methods you are using:
The third approach allows you to set the environment variables directly using the `os` library. Set them before TensorFlow is imported so that they are in place when the runtime starts.
import os
os.environ['TF_NUM_INTRAOP_THREADS'] = '2'
os.environ['TF_NUM_INTEROP_THREADS'] = '4'
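The second approach uses the `tf.config.threading` API in TF2. It has the same effect of limiting the thread counts, but it configures the TensorFlow runtime directly rather than setting environment variables. Call these functions before TensorFlow runs any operation. Keras is packaged into TF2 now, so it is imported from `tensorflow`.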
import tensorflow as tf
from tensorflow import keras
tf.config.threading.set_intra_op_parallelism_threads(2)
tf.config.threading.set_inter_op_parallelism_threads(4)
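The first approach is for standalone Keras. It works if Keras is set to use the TensorFlow backend with TensorFlow 1.x, since `tf.ConfigProto` and `tf.Session` are TF1 APIs. Again, it applies the same thread limits, this time through the session configuration rather than environment variables.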
from keras import backend as K
import tensorflow as tf
config = tf.ConfigProto(
    intra_op_parallelism_threads=2,
    inter_op_parallelism_threads=4,
    allow_soft_placement=True,
    device_count={'CPU': 1},
)
session = tf.Session(config=config)
K.set_session(session)
If you still have doubts, you can run each of the 3 approaches independently and then check a specific environment variable using `os`. Only a variable that was actually set in the process environment is returned; otherwise the result is `None`:
print(os.environ.get('KEY_THAT_MIGHT_EXIST'))
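A slightly fuller version of that check is sketched below. `read_thread_env` is a hypothetical helper written for this example; the variable names are the ones used in the third approach above. One variable is set and the other is removed on purpose, to show both outcomes.

```python
import os

def read_thread_env(names=("TF_NUM_INTRAOP_THREADS", "TF_NUM_INTEROP_THREADS")):
    """Return the current value of each variable, or None if it is not set."""
    return {name: os.environ.get(name) for name in names}

# Set one variable and make sure the other is absent, for demonstration only.
os.environ["TF_NUM_INTRAOP_THREADS"] = "2"
os.environ.pop("TF_NUM_INTEROP_THREADS", None)

report = read_thread_env()
print(report["TF_NUM_INTRAOP_THREADS"])   # 2
print(report["TF_NUM_INTEROP_THREADS"])   # None
```

Environment variables are always strings, so the value comes back as `'2'` rather than the integer `2`.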
For a better understanding of the topic, you can check this link that details it out quite well.
TL;DR: Use the second or third approach if you are working with TF2. Otherwise, use the first or third approach if you are using standalone Keras with the TensorFlow backend.