TensorFlow Estimator logits and labels must have the same shape
The answer was to create a one-hot encoding with tf.one_hot first and then squeeze() the resulting tensor. This worked for the loss calculation.
loss = tf.losses.sigmoid_cross_entropy(
    multi_class_labels=tf.squeeze(tf.one_hot(labels, depth=2), axis=1),
    logits=logits)
sigmoid_cross_entropy internally calls sigmoid_cross_entropy_with_logits. During my search I ended up switching to the version posted above purely out of laziness.
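The shape fix can be illustrated with plain NumPy (the shapes below are hypothetical, chosen to mirror a typical Estimator input pipeline):

```python
import numpy as np

# Hypothetical shapes: labels come out of the input pipeline as (batch, 1),
# while the logits have shape (batch, 2).
depth = 2
labels = np.array([[0], [1], [1], [0]])          # shape (4, 1)

# One-hot encoding keeps the singleton axis, giving (4, 1, 2) ...
one_hot = np.eye(depth)[labels]                  # shape (4, 1, 2)

# ... so axis 1 must be squeezed away to match the (4, 2) logits.
multi_class_labels = np.squeeze(one_hot, axis=1) # shape (4, 2)

print(one_hot.shape)             # (4, 1, 2)
print(multi_class_labels.shape)  # (4, 2)
```

This is why the one_hot alone is not enough: without the squeeze, the labels would carry an extra singleton dimension that the loss rejects.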
ValueError: Dimensions must be equal, but are 512 and 256
The error says that inside the LSTM of the decoder (decoding/decoder/while/BasicDecoderStep/decoder/multi_rnn_cell/cell_0/cell_0/basic_lstm_cell/mul) there is a dimension mismatch during a multiplication (Mul).
My guess is that, for your implementation, you need twice as many units in the decoder LSTM as in the encoder LSTM, because you are using a bidirectional encoder. If you have a bidirectional encoder with an LSTM of 256 units, the result will have 512 units (since you concatenate the outputs of the forward and backward LSTMs). Currently the decoder seems to expect an input of 256 units.
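The doubling can be seen directly in the shapes. A minimal NumPy sketch (sizes here are illustrative stand-ins for the bidirectional encoder outputs):

```python
import numpy as np

# Hypothetical sizes: an encoder LSTM with 256 units run in both directions.
num_units, batch_size, time_steps = 256, 8, 10
fw_outputs = np.zeros((batch_size, time_steps, num_units))  # forward pass
bw_outputs = np.zeros((batch_size, time_steps, num_units))  # backward pass

# Concatenating along the feature axis doubles the dimension, so the
# decoder must be built with 2 * num_units = 512 units to consume this.
encoder_outputs = np.concatenate([fw_outputs, bw_outputs], axis=-1)
print(encoder_outputs.shape)  # (8, 10, 512)
```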
using LSTMs Decoder without teacher forcing - Tensorflow
There are several Helpers, all of which inherit from the same class; you can find more information in the documentation. As you said, TrainingHelper requires predefined true inputs, which are expected to be the decoder's outputs, and these true inputs are fed in as the next steps' inputs (instead of feeding back the output of the previous step). According to some research, this approach speeds up training of the decoder.
In your case, you are looking for GreedyEmbeddingHelper. Simply use it in place of TrainingHelper:
training_helper = tf.contrib.seq2seq.GreedyEmbeddingHelper(
    embedding=embedding,
    start_tokens=tf.tile([GO_SYMBOL], [batch_size]),
    end_token=END_SYMBOL)
Just substitute the embedding tensor and the variables you use in your problem. This helper automatically takes the output of each step, applies the embedding, and feeds the result as input to the next step. For the first step, the start_tokens are used.
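The feedback loop that GreedyEmbeddingHelper implements can be sketched in plain NumPy. All names here are illustrative, not TensorFlow API; decoder_step is a stand-in for one decoder step returning logits over the vocabulary:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, embed_dim, max_steps = 5, 8, 4
GO_SYMBOL, END_SYMBOL = 1, 2

# Hypothetical embedding table mapping token ids to vectors.
embedding = rng.normal(size=(vocab_size, embed_dim))

def decoder_step(x):
    # Stand-in for one decoder step: returns logits over the vocabulary.
    return rng.normal(size=vocab_size)

token = GO_SYMBOL          # the first step uses the start token
outputs = []
for _ in range(max_steps):
    logits = decoder_step(embedding[token])  # embed previous output, run a step
    token = int(np.argmax(logits))           # greedy choice, fed back next step
    outputs.append(token)
    if token == END_SYMBOL:                  # decoding stops at the end token
        break
```

The key point is that the loop's input at each step is the embedded argmax of the previous step's logits, not a ground-truth target, which is exactly the difference from TrainingHelper.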
The output produced with GreedyEmbeddingHelper doesn't have to match the length of the expected output, so you have to use padding to match their shapes. TensorFlow provides the function tf.pad(). Also, tf.contrib.seq2seq.dynamic_decode returns a tuple containing (final_outputs, final_state, final_sequence_lengths), so you can use the value of final_sequence_lengths for padding.
logits_pad = tf.pad(
    logits,
    [[0, tf.maximum(expected_length - tf.reduce_max(final_seq_lengths), 0)],
     [0, 0]],
    constant_values=PAD_VALUE,
    mode='CONSTANT')

targets_pad = tf.pad(
    targets,
    [[0, tf.maximum(tf.reduce_max(final_seq_lengths) - expected_length, 0)]],
    constant_values=PAD_VALUE,
    mode='CONSTANT')
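The padding arithmetic can be checked with a small NumPy sketch (lengths and vocabulary size are hypothetical):

```python
import numpy as np

# Hypothetical lengths: the decoder produced 6 steps, the targets expect 9.
expected_length, decoded_length, vocab = 9, 6, 4
logits = np.zeros((decoded_length, vocab))

# Pad only at the end of the time axis, mirroring the tf.pad call above;
# max(..., 0) ensures nothing is padded when the output is already long enough.
pad_amount = max(expected_length - decoded_length, 0)
logits_pad = np.pad(logits, [(0, pad_amount), (0, 0)], constant_values=0)
print(logits_pad.shape)  # (9, 4)
```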
You may have to adjust the padding a little depending on the shapes of your inputs. Also, you don't have to pad the targets if you set the maximum_iterations parameter to match the targets' shape.