How to Get Reproducible Results in Keras

How to Get Reproducible Results (Keras, TensorFlow)?

For reference, from the TensorFlow documentation for tf.random.set_seed:

Operations that rely on a random seed actually derive it from two seeds: the global and operation-level seeds. tf.random.set_seed sets the global seed.

Its interactions with operation-level seeds are as follows:

  1. If neither the global seed nor the operation seed is set: A randomly picked seed is used for this op.
  2. If the operation seed is not set but the global seed is set: The system picks an operation seed from a stream of seeds determined by the global seed.
  3. If the operation seed is set, but the global seed is not set: A default global seed and the specified operation seed are used to determine the random sequence.
  4. If both the global and the operation seed are set: Both seeds are used in conjunction to determine the random sequence.

1st Scenario

By default, a random seed is picked for the op. This is easy to notice in the results: the values are different every time you re-run the program or call the code multiple times.

x_train = tf.random.normal((10,1), 1, 1, dtype=tf.float32)
print(x_train)
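
A quick way to see this (a minimal sketch, assuming TF 2.x eager execution; the variable name y is just for illustration): even a second call within the same run differs, since each op draws a fresh seed.

# With no seeds set, every op draws a fresh random seed, so even two
# calls in the same run differ, and so do re-runs of the program.
y = tf.random.normal((10,1), 1, 1, dtype=tf.float32)
print(y)  # differs from x_train above, and changes on every run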

2nd Scenario

The global seed is set, but the operation seed is not.
The two tensors below differ from each other, because each op draws its own seed from the stream determined by the global seed. However, if you re-run or restart the program, both calls generate the same results over and over again.

tf.random.set_seed(2)
first = tf.random.normal((10,1), 1, 1, dtype=tf.float32)
print(first)
sec = tf.random.normal((10,1), 1, 1, dtype=tf.float32)
print(sec)
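
To verify this behavior (a minimal sketch, assuming TF 2.x eager execution): calling tf.random.set_seed again with the same value restarts the stream of operation seeds, so the sequence repeats within the same run.

tf.random.set_seed(2)
again = tf.random.normal((10,1), 1, 1, dtype=tf.float32)
# `again` reproduces `first`, because resetting the global seed
# restarted the stream of operation seeds.
print(tf.reduce_all(tf.equal(first, again)))  # tf.Tensor(True, shape=(), dtype=bool)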

3rd Scenario

In this scenario the operation seed is set, but the global seed is not.
If you re-run the code it will give you different results, but if you restart the runtime it will give you the same sequence of results as the previous run.

x_train = tf.random.normal((10,1), 1, 1, dtype=tf.float32, seed=2)
print(x_train)
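
This can be checked within a single run (a sketch, assuming TF 2.x eager execution; a and b are illustrative names): calls with the same operation seed still differ from each other, yet the whole sequence repeats after a runtime restart.

a = tf.random.normal((10,1), 1, 1, dtype=tf.float32, seed=2)
b = tf.random.normal((10,1), 1, 1, dtype=tf.float32, seed=2)
print(a)  # differs from b within this run...
print(b)  # ...but the whole a, b sequence repeats after a restart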

4th Scenario

Both seeds are used in conjunction to determine the random sequence.
Changing the global or the operation seed gives different results, but restarting the runtime with the same seeds still gives the same results.

tf.random.set_seed(3)
x_train = tf.random.normal((10,1), 1, 1, dtype=tf.float32, seed=1)
print(x_train)
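
As a sketch of the interaction (assuming TF 2.x; x1 and x2 are illustrative names): resetting the global seed resets TensorFlow's internal counters, so restoring both seeds reproduces the values, while changing either seed changes them.

tf.random.set_seed(3)
x1 = tf.random.normal((10,1), 1, 1, dtype=tf.float32, seed=1)  # same seed pair as x_train above
tf.random.set_seed(3)
x2 = tf.random.normal((10,1), 1, 1, dtype=tf.float32, seed=2)  # only the op seed changed
print(tf.reduce_all(tf.equal(x_train, x1)))  # tf.Tensor(True, shape=(), dtype=bool)
print(x2)  # differs from x1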

Below is a reproducible example for reference.
By setting the global seed, it always gives the same results.

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

## GLOBAL SEED ##
tf.random.set_seed(3)
x_train = tf.random.normal((10,1), 1, 1, dtype=tf.float32)
y_train = tf.math.sin(x_train)
x_test = tf.random.normal((10,1), 2, 3, dtype=tf.float32)
y_test = tf.math.sin(x_test)

model = Sequential()
model.add(Dense(1200, input_shape=(1,), activation='relu'))
model.add(Dense(150, activation='relu'))
model.add(Dense(80, activation='relu'))
model.add(Dense(10, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

loss="binary_crossentropy"
optimizer=tf.keras.optimizers.Adam(learning_rate=0.01)
metrics=['mse']
epochs = 5
batch_size = 32
verbose = 1

model.compile(loss=loss,
              optimizer=optimizer,
              metrics=metrics)
history = model.fit(x_train, y_train, epochs=epochs, batch_size=batch_size, verbose=verbose)
predictions = model.predict(x_test)
print(predictions)

Note: If you are using TensorFlow 2 or higher, Keras is already included in the API, so you should use tf.keras rather than the standalone Keras package.
All of these examples were run on Google Colab.

Why can't I get reproducible results in Keras even though I set the random seeds?

You can find the answer in the Keras docs: https://keras.io/getting-started/faq/#how-can-i-obtain-reproducible-results-using-keras-during-development.

In short, to be absolutely sure that you will get reproducible results with your Python script on one computer's/laptop's CPU, you will have to do the following:

  1. Set PYTHONHASHSEED environment variable at a fixed value
  2. Set python built-in pseudo-random generator at a fixed value
  3. Set numpy pseudo-random generator at a fixed value
  4. Set tensorflow pseudo-random generator at a fixed value
  5. Configure a new global tensorflow session

Following the Keras link at the top, the source code I am using is the following:

# Seed value
# Apparently you may use different seed values at each stage
seed_value= 0

# 1. Set `PYTHONHASHSEED` environment variable at a fixed value
import os
os.environ['PYTHONHASHSEED']=str(seed_value)

# 2. Set `python` built-in pseudo-random generator at a fixed value
import random
random.seed(seed_value)

# 3. Set `numpy` pseudo-random generator at a fixed value
import numpy as np
np.random.seed(seed_value)

# 4. Set the `tensorflow` pseudo-random generator at a fixed value
import tensorflow as tf
tf.set_random_seed(seed_value)
# for later versions:
# tf.compat.v1.set_random_seed(seed_value)

# 5. Configure a new global `tensorflow` session
from keras import backend as K
session_conf = tf.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1)
sess = tf.Session(graph=tf.get_default_graph(), config=session_conf)
K.set_session(sess)
# for later versions:
# session_conf = tf.compat.v1.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1)
# sess = tf.compat.v1.Session(graph=tf.compat.v1.get_default_graph(), config=session_conf)
# tf.compat.v1.keras.backend.set_session(sess)

Needless to say, you do not have to specify any seed or random_state in the numpy, scikit-learn, or tensorflow/keras functions that you use in your Python script, precisely because the source code above sets their pseudo-random generators globally at a fixed value.
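
For example (a minimal sketch; the exact values printed depend on the seed), once the block above has run, even unseeded draws become reproducible from one run of the script to the next:

# Assumes the seeding block above has already run in this process.
print(random.random())    # same value on every run of the script
print(np.random.rand(3))  # same values on every run of the script

from sklearn.model_selection import train_test_split
X = np.arange(10).reshape(5, 2)
X_train, X_test = train_test_split(X, test_size=0.4)  # no random_state needed
print(X_train)  # same split on every run of the script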

How to get reproducible weights initialization in Keras?

tf.keras.initializers objects have a seed argument for reproducible initialization.

import tensorflow as tf
import numpy as np

initializer = tf.keras.initializers.GlorotUniform(seed=42)

for _ in range(10):
    print(np.round(initializer((4,)), 3))

[-0.377 -0.003  0.373 -0.831]
[-0.377 -0.003  0.373 -0.831]
[-0.377 -0.003  0.373 -0.831]
[-0.377 -0.003  0.373 -0.831]
[-0.377 -0.003  0.373 -0.831]
[-0.377 -0.003  0.373 -0.831]
[-0.377 -0.003  0.373 -0.831]
[-0.377 -0.003  0.373 -0.831]
[-0.377 -0.003  0.373 -0.831]
[-0.377 -0.003  0.373 -0.831]

In a Keras layer, you can use it like this:

tf.keras.layers.Dense(1024,
                      activation='relu',
                      input_dim=2048,
                      kernel_initializer=tf.keras.initializers.GlorotUniform(seed=42))
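
As a quick check (a minimal sketch; layer_a and layer_b are illustrative names): two layers built with identically seeded initializers start from the same weights.

layer_a = tf.keras.layers.Dense(4, kernel_initializer=tf.keras.initializers.GlorotUniform(seed=42))
layer_b = tf.keras.layers.Dense(4, kernel_initializer=tf.keras.initializers.GlorotUniform(seed=42))
layer_a.build((None, 8))  # build() creates the weight tensors
layer_b.build((None, 8))
print(np.allclose(layer_a.kernel.numpy(), layer_b.kernel.numpy()))  # True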

