Tensorflow: Different results with the same random seed
I suggest checking whether your TensorFlow graph contains nondeterministic operations. For example, reduce_sum before TensorFlow 1.2 was one such operation. These operations are nondeterministic because floating-point addition and multiplication are nonassociative (the order in which floating-point numbers are added or multiplied affects the result) and because such operations don't guarantee their inputs are added or multiplied in the same order every time. See also this question.
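The nonassociativity is easy to see in plain Python, independent of TensorFlow. A minimal sketch (the variable names are only illustrative):

```python
# Floating-point addition is not associative: the grouping changes the rounding.
left_first = (0.1 + 0.2) + 0.3
right_first = 0.1 + (0.2 + 0.3)

print(left_first == right_first)  # False
print(left_first, right_first)
```

A parallel reduction that groups its inputs differently from run to run can therefore return slightly different sums for identical inputs.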
EDIT (Sep. 20, 2020): The GitHub repository framework-determinism has more information about sources of nondeterminism in machine learning frameworks, particularly TensorFlow.
Unable to reproduce tensorflow results even after setting random seed
If you're running code on a GPU, it is likely due to the non-deterministic behavior of cuDNN (see this thread for more details). The order in which some operations are executed on a GPU can vary between runs due to performance optimizations. This means that rounding errors also occur in a different order, which leads to small differences in the result of these operations. In your case, these small differences add up over the course of training, which leads to significantly different behavior after only a few training steps.
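To get a feeling for how a rounding-sized difference can grow, here is a toy sketch. It does not model a neural network; the logistic map is just a stand-in for an iterative process that is sensitive to its current state, and the function name and constants are my own choices:

```python
def iterate(x, steps, r=3.9):
    """Apply the logistic map x -> r * x * (1 - x) and record every value."""
    trajectory = []
    for _ in range(steps):
        x = r * x * (1.0 - x)
        trajectory.append(x)
    return trajectory

run_a = iterate(0.4, 200)
run_b = iterate(0.4 + 1e-12, 200)  # same start, perturbed by a tiny error

# Absolute difference between the two runs at every step.
gaps = [abs(a - b) for a, b in zip(run_a, run_b)]
print(gaps[0], gaps[50], max(gaps))
```

The two runs start out practically identical and end up unrelated, which is roughly what happens when two training runs diverge because of differently ordered rounding errors.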
The order of magnitude of the rounding errors depends on the floating point precision used by the GPU. With float64, the rounding errors take a lot longer to add up noticeably than with float32.
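The effect of the precision can be sketched on a CPU with the standard library alone. This is an illustration under my own assumptions: float32 is emulated by rounding through struct, and the helper names are hypothetical, not part of any framework:

```python
import math
import random
import struct

def to_float32(x):
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def sequential_sum(values, round_fn):
    """Add values one by one, rounding the running total after every step."""
    total = 0.0
    for v in values:
        total = round_fn(total + v)
    return total

rng = random.Random(0)
# Inputs are rounded to float32 first so both sums see identical values.
values = [to_float32(rng.random()) for _ in range(100_000)]
exact = math.fsum(values)  # correctly rounded reference sum

error_float32 = abs(sequential_sum(values, to_float32) - exact)
error_float64 = abs(sequential_sum(values, float) - exact)
print(error_float32, error_float64)
```

The float32 accumulator ends up much further from the reference than the float64 one, so order-dependent differences become visible sooner in float32.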
On a CPU, this non-deterministic behavior should not occur when Python's, numpy's and tensorflow's random seeds are fixed (and op parallelism is deactivated, more info here). So, if you run your code on a CPU, you should get the same results for every run (but that of course takes a lot longer).
Keras getting different results with set seed
This was solved by setting PYTHONHASHSEED at the OS level, in addition to the other seeds (taken from Reproducible results using Keras with TensorFlow backend):
# Seed value (can actually be different for each attribution step)
seed_value = 0
# 1. Set `PYTHONHASHSEED` environment variable at a fixed value
import os
os.environ['PYTHONHASHSEED'] = str(seed_value)
# 2. Set `python` built-in pseudo-random generator at a fixed value
import random
random.seed(seed_value)
# 3. Set `numpy` pseudo-random generator at a fixed value
import numpy as np
np.random.seed(seed_value)
# 4. Set `tensorflow` pseudo-random generator at a fixed value (TF 1.x API; in TF 2.x the call is tf.random.set_seed)
import tensorflow as tf
tf.set_random_seed(seed_value)
# 5. Configure a new global `tensorflow` session
from keras import backend as K
session_conf = tf.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1)
sess = tf.Session(graph=tf.get_default_graph(), config=session_conf)
K.set_session(sess)
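One caveat worth checking: as far as I know, Python reads PYTHONHASHSEED only when the interpreter starts, so assigning it through os.environ inside a running script affects child processes rather than the current one. The sketch below (the helper name is hypothetical) starts fresh interpreters with the variable already set and compares a string hash:

```python
import os
import subprocess
import sys

def string_hash_in_new_interpreter(hash_seed):
    """Start a fresh Python process and return hash('abc') as seen there."""
    env = dict(os.environ, PYTHONHASHSEED=str(hash_seed))
    result = subprocess.run(
        [sys.executable, "-c", "print(hash('abc'))"],
        env=env, capture_output=True, text=True, check=True,
    )
    return int(result.stdout)

first = string_hash_in_new_interpreter(0)
second = string_hash_in_new_interpreter(0)
print(first == second)  # True: a fixed PYTHONHASHSEED gives stable string hashes
```

In practice this means exporting the variable in the shell, for example running the script as PYTHONHASHSEED=0 python train.py, where train.py is a placeholder for your own script.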
TensorFlow: Non-repeatable results
You need to set an operation-level seed in addition to the graph-level seed, i.e.:
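import tensorflow as tf  # TensorFlow 1.x API
tf.reset_default_graph()
a = tf.constant([1, 1, 1, 1, 1], dtype=tf.float32)
graph_level_seed = 1
operation_level_seed = 1
# Graph-level seed: default seed for random ops created in this graph
tf.set_random_seed(graph_level_seed)
# Operation-level seed: passed directly to the random op (0.5 is keep_prob)
b = tf.nn.dropout(a, 0.5, seed=operation_level_seed)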