How to apply gradient clipping in TensorFlow?
Gradient clipping needs to happen after computing the gradients, but before applying them to update the model's parameters. In your example, both of those steps are handled by the AdamOptimizer.minimize() method.

In order to clip your gradients you'll need to explicitly compute, clip, and apply them, as described in TensorFlow's API documentation for tf.train.Optimizer. Specifically, you'll need to substitute the call to the minimize() method with something like the following:
import tensorflow as tf

optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
# Compute (gradient, variable) pairs, clip each gradient to [-1, 1], then apply.
gvs = optimizer.compute_gradients(cost)
capped_gvs = [(tf.clip_by_value(grad, -1., 1.), var) for grad, var in gvs]
train_op = optimizer.apply_gradients(capped_gvs)
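One caveat, not in the answer above: if cost does not depend on some trainable variable, compute_gradients returns a (None, var) pair for it, and tf.clip_by_value will raise on the None. A guarded variant that simply skips those pairs:

capped_gvs = [(tf.clip_by_value(grad, -1., 1.), var)
              for grad, var in gvs if grad is not None]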
How to clip the gradient norm on the grad_and_var tuple in tensorflow-r1.0?
One possible approach that I have seen is to zip the clipped gradients with your variables and pass the zipped list to opt.apply_gradients, like in the code below (taken from here, lines 78-83):
tvars = tf.trainable_variables()
grads, _ = tf.clip_by_global_norm(tf.gradients(self.cost, tvars),
                                  args.grad_clip)
with tf.name_scope('optimizer'):
    optimizer = tf.train.AdamOptimizer(self.lr)
self.train_op = optimizer.apply_gradients(zip(grads, tvars))
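If you are on TF 2.x rather than r1.0, the same pattern is usually written with a GradientTape. The sketch below is a translation under assumptions: model, loss_fn, x, y, optimizer, and the clip value 5.0 are placeholders, not from the quoted code:

with tf.GradientTape() as tape:
    loss = loss_fn(model(x), y)
grads = tape.gradient(loss, model.trainable_variables)
# Clip by global norm, as tf.clip_by_global_norm does in the 1.x snippet.
grads, _ = tf.clip_by_global_norm(grads, 5.0)
optimizer.apply_gradients(zip(grads, model.trainable_variables))

In recent releases, Keras optimizers also accept clipnorm and global_clipnorm constructor arguments that apply per-variable or global norm clipping automatically.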
Can't apply gradients on tf.Variable
The Python zip function expects iterable objects, such as a list or a tuple. In your calls to tape.gradient or optimizer.apply_gradients, you can put your Variable in a list to solve the issue:
with tf.GradientTape() as tape:
    loss_value = ...  # compute the loss while the tape is recording
gradients = tape.gradient(loss_value, [self.trainable_variables])
# Apply gradients via optimizer
self.optimizer.apply_gradients(zip(gradients, [self.trainable_variables]))
tape.gradient respects the structure of the sources object you pass it, so if you feed it a list, you will get a list back. This is stated in the documentation:

Returns: a list or nested structure of Tensors (or IndexedSlices, or None), one for each element in sources. Returned structure is the same as the structure of sources.
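To see that structure-matching behaviour concretely, here is a small self-contained check (toy values, not from the question; the tape is made persistent only so gradient can be called twice):

import tensorflow as tf

v = tf.Variable(3.0)
with tf.GradientTape(persistent=True) as tape:
    loss = v * v

print(tape.gradient(loss, v))    # a single Tensor: 6.0
print(tape.gradient(loss, [v]))  # a one-element list: [6.0]
del tape  # release the persistent tape's resources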
How to manipulate client gradients in tensorflow federated sgd
build_federated_sgd_process is fully canned; it is really designed to serve as a reference implementation, not as a point of extensibility.

I believe what you are looking for is the function that build_federated_sgd_process calls under the hood, tff.learning.framework.build_model_delta_optimizer_process. This function allows you to supply your own mapping from a model function (i.e., a zero-arg callable that returns a tff.learning.Model) to a tff.learning.framework.ClientDeltaFn.
Your ClientDeltaFn would look something like:
@tf.function
def _clip_and_noise(grads):
    return ...

class ClippedGradClientDeltaFn(tff.learning.framework.ClientDeltaFn):

    def __init__(self, model, ...):
        self._model = model
        ...

    @tf.function
    def __call__(self, dataset, weights):
        # Compute gradients grads from dataset and weights here.
        return _clip_and_noise(grads)
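For concreteness, a minimal body for _clip_and_noise, assuming grads is a flat list of tensors and using hypothetical clip_norm and noise_stddev parameters (the original sketch leaves this elided), might be:

@tf.function
def _clip_and_noise(grads, clip_norm=1.0, noise_stddev=0.1):
    # Clip the whole gradient list by its global norm, then add Gaussian noise.
    clipped, _ = tf.clip_by_global_norm(grads, clip_norm)
    return [g + tf.random.normal(tf.shape(g), stddev=noise_stddev)
            for g in clipped]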
And you would be able to construct a tff.templates.IterativeProcess by calling:
def clipped_sgd(model_fn: Callable[[], model_lib.Model]) -> ClippedGradClientDeltaFn:
    return ClippedGradClientDeltaFn(
        model_fn(),
        ...)

iterproc = optimizer_utils.build_model_delta_optimizer_process(
    model_fn, model_to_client_delta_fn=clipped_sgd, ...)
more or less as is done in the body of build_federated_sgd_process.
It sounds to me like you are interested in differential privacy; TFF is actually designed to compose with differential privacy through its aggregation processes rather than by writing different client updates, though this is certainly one approach. See the pointers in the "TFF for research" documentation for idiomatic ways to wire differential privacy into TFF.
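For reference, the aggregation-based route looks roughly like the following. This is only a sketch against the tff.aggregators API; the noise and clipping values are placeholders, and exact names and signatures may differ across TFF releases:

dp_factory = tff.aggregators.DifferentiallyPrivateFactory.gaussian_fixed(
    noise_multiplier=1.0, clients_per_round=100, clip=1.0)
iterproc = tff.learning.build_federated_averaging_process(
    model_fn,
    client_optimizer_fn=lambda: tf.keras.optimizers.SGD(learning_rate=0.1),
    model_update_aggregation_factory=dp_factory)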