Getting Gradient of Model Output W.R.T Weights Using Keras

Getting gradient of model output w.r.t weights using Keras

To get the gradients of the model output with respect to the weights using Keras, you have to use the Keras backend module. I created this simple example to illustrate exactly what to do:

import numpy as np
import tensorflow as tf
from keras.models import Sequential
from keras.layers import Dense
from keras import backend as k

model = Sequential()
model.add(Dense(12, input_dim=8, kernel_initializer='uniform', activation='relu'))
model.add(Dense(8, kernel_initializer='uniform', activation='relu'))
model.add(Dense(1, kernel_initializer='uniform', activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

To calculate the gradients, we first need to find the output tensor. For the output of the model (what my initial question asked about), we simply call model.output. We can also find the gradients of other layers' outputs by calling model.layers[index].output.

outputTensor = model.output #Or model.layers[index].output

Then we need to choose the variables with respect to which we will take the gradient.

listOfVariableTensors = model.trainable_weights
# or variableTensors = model.trainable_weights[0]

We can now calculate the gradients. It is as easy as the following:

gradients = k.gradients(outputTensor, listOfVariableTensors)

To actually evaluate the gradients given an input, we need to use a bit of TensorFlow.

trainingExample = np.random.random((1, 8))
sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())
evaluated_gradients = sess.run(gradients, feed_dict={model.input: trainingExample})

And that's it!
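Note that k.gradients and tf.InteractiveSession are TensorFlow 1.x constructs. On TensorFlow 2, a rough equivalent of the same computation would use tf.GradientTape; here is a sketch, not part of the original answer, assuming the model above is built with tf.keras:

import numpy as np
import tensorflow as tf

trainingExample = np.random.random((1, 8)).astype(np.float32)

with tf.GradientTape() as tape:
    output = model(trainingExample)  # the forward pass is recorded by the tape

# One gradient tensor per trainable weight; as with k.gradients, a
# non-scalar output is summed before differentiation.
evaluated_gradients = tape.gradient(output, model.trainable_weights)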

How do I get the gradient of a Keras model with respect to its inputs?

I ended up getting this to work with a variant of the answer to this question: Get Gradients with Keras Tensorflow 2.0

import tensorflow as tf

x_tensor = tf.convert_to_tensor(input_data, dtype=tf.float32)
with tf.GradientTape() as t:
    t.watch(x_tensor)
    output = model(x_tensor)

result = output
gradients = t.gradient(output, x_tensor)

This allows me to obtain both the output and the gradient without redundant computation.

How to obtain the gradients in Keras?

You need to create a symbolic Keras function, taking the input/output as inputs and returning the gradients.
Here is a working example:

import numpy as np
import keras
from keras import backend as K

model = keras.Sequential()
model.add(keras.layers.Dense(20, input_shape = (10, )))
model.add(keras.layers.Dense(5))
model.compile('adam', 'mse')

dummy_in = np.ones((4, 10))
dummy_out = np.ones((4, 5))
dummy_loss = model.train_on_batch(dummy_in, dummy_out)

def get_weight_grad(model, inputs, outputs):
    """ Gets gradient of model for given inputs and outputs for all weights"""
    grads = model.optimizer.get_gradients(model.total_loss, model.trainable_weights)
    symb_inputs = (model._feed_inputs + model._feed_targets + model._feed_sample_weights)
    f = K.function(symb_inputs, grads)
    x, y, sample_weight = model._standardize_user_data(inputs, outputs)
    output_grad = f(x + y + sample_weight)
    return output_grad

def get_layer_output_grad(model, inputs, outputs, layer=-1):
    """ Gets gradient of a layer output for given inputs and outputs"""
    grads = model.optimizer.get_gradients(model.total_loss, model.layers[layer].output)
    symb_inputs = (model._feed_inputs + model._feed_targets + model._feed_sample_weights)
    f = K.function(symb_inputs, grads)
    x, y, sample_weight = model._standardize_user_data(inputs, outputs)
    output_grad = f(x + y + sample_weight)
    return output_grad

weight_grads = get_weight_grad(model, dummy_in, dummy_out)
output_grad = get_layer_output_grad(model, dummy_in, dummy_out)

The first function I wrote returns all the gradients in the model, and it wouldn't be difficult to extend it to support layer indexing. However, that would be somewhat dangerous: any layer without weights is skipped by such indexing, so the layer indices in the model and in the gradient list would no longer match. A sketch of one safe way to group the gradients by layer follows.
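If you do want per-layer weight gradients, one way to keep the indexing honest is to group the flat gradient list by each layer's own weight count. This is a sketch building on get_weight_grad above, and it assumes model.trainable_weights lists the weights in layer order, which holds for Sequential models:

def get_weight_grads_by_layer(model, inputs, outputs):
    """ Sketch: maps each layer name to the gradients of its own weights,
    so layers without weights simply map to empty lists."""
    flat_grads = get_weight_grad(model, inputs, outputs)
    grads_by_layer, i = {}, 0
    for layer in model.layers:
        n = len(layer.trainable_weights)
        grads_by_layer[layer.name] = flat_grads[i:i + n]
        i += n
    return grads_by_layer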

The second function I wrote returns the gradient at a given layer's output; there, the indexing matches the model's, so it's safe to use.

Note: this works with Keras 2.2.0 and later, not earlier versions, as this release included a major refactoring of keras.engine.

Keras GradientTape: Calculating gradients with respect to the output node

Well, after some research I found the answer myself: it is possible to extract the trainable variables of a given layer based on the layer name. We can then apply tape.gradient and optimizer.apply_gradients to the extracted set of trainable variables. My current solution is pretty slow, but it works; I just need to figure out how to improve its runtime.
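The answer describes the approach without showing code. A minimal sketch of what it might look like in TensorFlow 2, where target_layer_name, loss_fn, x_batch and y_batch are hypothetical placeholders:

import tensorflow as tf

# Sketch only: extract the trainable variables of the layer(s) whose name
# matches 'target_layer_name' (a placeholder, not from the original answer).
target_vars = [v for layer in model.layers
               if layer.name == target_layer_name
               for v in layer.trainable_variables]

optimizer = tf.keras.optimizers.Adam()

with tf.GradientTape() as tape:
    preds = model(x_batch, training=True)
    loss = loss_fn(y_batch, preds)  # any differentiable loss

# Gradients w.r.t. only the extracted variables, then a partial update
grads = tape.gradient(loss, target_vars)
optimizer.apply_gradients(zip(grads, target_vars))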

How to compute gradient of output wrt input in Tensorflow 2.0

This should work in TF2:

import numpy as np
import tensorflow as tf

inp = tf.Variable(np.random.normal(size=(25, 120)), dtype=tf.float32)

with tf.GradientTape() as tape:
    preds = model(inp)

grads = tape.gradient(preds, inp)

Basically you do it the same way as in TF1, but using GradientTape.
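One caveat worth adding: when preds is not a scalar, tape.gradient sums the contributions of all output elements into a single gradient. If you need a separate gradient per output element, tf.GradientTape.jacobian returns the full Jacobian instead (a sketch reusing inp and model from above):

with tf.GradientTape() as tape:
    preds = model(inp)

# Shape: preds.shape + inp.shape, i.e. one gradient per output element
jac = tape.jacobian(preds, inp)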

Calculating gradient norm w.r.t. weights with Keras

There are several placeholders related to the gradient computation process in Keras:

  1. Input x
  2. Target y
  3. Sample weights: even if you don't provide it in model.fit(), Keras still generates a placeholder for sample weights and feeds np.ones((y.shape[0],), dtype=K.floatx()) into the graph during training.
  4. Learning phase: this placeholder will be connected to the gradient tensor only if there's any layer using it (e.g. Dropout).

So, in your provided example, in order to compute the gradients, you need to feed x, y and sample_weights into the graph. That's the underlying reason for the error.

Inside Model._make_train_function() there are the following lines showing how to construct the necessary inputs to K.function() in this case:

inputs = self._feed_inputs + self._feed_targets + self._feed_sample_weights
if self.uses_learning_phase and not isinstance(K.learning_phase(), int):
    inputs += [K.learning_phase()]

with K.name_scope('training'):
    ...
    self.train_function = K.function(inputs,
                                     [self.total_loss] + self.metrics_tensors,
                                     updates=updates,
                                     name='train_function',
                                     **self._function_kwargs)

By mimicking this function, you should be able to get the norm value:

import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras import backend as K

def get_gradient_norm_func(model):
    grads = K.gradients(model.total_loss, model.trainable_weights)
    summed_squares = [K.sum(K.square(g)) for g in grads]
    norm = K.sqrt(sum(summed_squares))
    inputs = model.model._feed_inputs + model.model._feed_targets + model.model._feed_sample_weights
    func = K.function(inputs, [norm])
    return func

def main():
    x = np.random.random((128,)).reshape((-1, 1))
    y = 2 * x
    model = Sequential(layers=[Dense(2, input_shape=(1,)),
                               Dense(1)])
    model.compile(loss='mse', optimizer='rmsprop')
    get_gradient = get_gradient_norm_func(model)
    history = model.fit(x, y, epochs=1)
    print(get_gradient([x, y, np.ones(len(y))]))

if __name__ == '__main__':
    main()

Execution output:

Epoch 1/1
128/128 [==============================] - 0s - loss: 2.0073
[4.4091368]

Note that since you're using Sequential instead of Model, model.model._feed_* is required instead of model._feed_*.
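For completeness, on TensorFlow 2 the same norm can be computed eagerly, without touching the private _feed_* attributes. A sketch, assuming a tf.keras model trained on MSE as in the example (not part of the original answer):

import tensorflow as tf

def gradient_norm(model, x, y):
    """ Sketch: global L2 norm of the loss gradients w.r.t. all trainable weights."""
    with tf.GradientTape() as tape:
        loss = tf.reduce_mean(tf.square(model(x) - y))  # MSE, matching the example
    grads = tape.gradient(loss, model.trainable_weights)
    return tf.linalg.global_norm(grads)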


