Tensorflow: why does softmax output [1, 0, 0, …, 0]?

Time: 2022-12-10 18:22:34

I have a neural net model; its last layer is a fully connected layer with 9 output neurons.
To train my network correctly, I'm using softmax_cross_entropy_with_logits.
It trains okay, but when I evaluate the model I also want probabilities.
So I take an evaluation sample and feed it to the network.
After that I apply softmax to the output and get
[[ 0. 0. 0. 0. 0. 1. 0. 0. 0.]]


Here are the unnormalized probabilities (logits) as well:


[[ -2710.10620117  -2914.37866211  -5045.04443359  -4361.91601562
-459.57000732   8843.65820312  -1871.62756348   5447.12451172
-10947.22949219]]

So I get a probability of 1 for one class and zeros for the rest. Could anyone please help me handle this issue?

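For reference, that one-hot output follows directly from the scale of the logits above: when the values differ by thousands, every exp() except the largest underflows to zero in float32, so softmax returns an exact 0/1 vector. A minimal NumPy sketch (my own, not from the original post) reproduces this:

import numpy as np

# Logits copied from the question; any values this far apart behave the same way.
logits = np.array([-2710.106, -2914.379, -5045.044, -4361.916, -459.570,
                   8843.658, -1871.628, 5447.125, -10947.229], dtype=np.float32)

def softmax(x):
    z = x - x.max()          # numerically stable softmax: shift by the max first
    e = np.exp(z)
    return e / e.sum()

print(softmax(logits))
# -> [0. 0. 0. 0. 0. 1. 0. 0. 0.]  every term except the largest underflows to 0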

EDIT: Input images are of shape 64 x 160.
All activation functions are ReLU.
All max poolings are 2x2.
In conv_plus_max_pool_layer(x_image, 5, 1, 96), 5 is the kernel size.
Here is the network layout:


# conv_plus_max_pool_layer(input, kernel_size, in_channels, out_channels)
hidden_block_1 = conv_plus_max_pool_layer(x_image, 5, 1, 96)
hidden_block_2 = conv_plus_max_pool_layer(hidden_block_1, 5, 96, 256)
hidden_block_3 = conv_plus_max_pool_layer(hidden_block_2, 3, 256, 384)
hidden_block_4 = conv_plus_max_pool_layer(hidden_block_3, 3, 384, 512)

# fully connected layers with dropout; the final layer outputs class_num (= 9) logits
fc1 = dropout_plus_fc(4 * 10 * 512, 512, hidden_block_4, keep_prob_drop1)
output = dropout_plus_fc(512, model_net10_train.class_num, fc1, keep_prob_drop2)
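
As a quick sanity check (my own arithmetic, not part of the original post): with 64 x 160 inputs and four 2x2 max pools, each halving the height and width, the feature map entering fc1 is 4 x 10, which matches the 4 * 10 * 512 flattened size used above:

h, w = 64, 160
for _ in range(4):           # four 2x2 max-pool layers, each halving both dimensions
    h, w = h // 2, w // 2
print(h, w, h * w * 512)     # -> 4 10 20480, i.e. 4 * 10 * 512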

1 Answer

#1

Looks like your network is pretty sure about the output ;)


In this case, I don't think we can do a lot for you without your network layout... Some gut feelings from my side: the layer leading up to your output layer has too many nodes (hence these huge numbers), and I suspect you aren't using nonlinearities such as ReLU or tanh. Other things you might want to check are the initial values of the weights (they might be too big) and the learning rate you are using (it might be too high).

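To make those last two suggestions concrete, here is a rough TF1-style sketch (placeholder names, not the asker's actual code) of a final layer with deliberately small initial weights and a conservative learning rate:

import tensorflow as tf

# Small initial weights keep the initial logits small, so softmax doesn't saturate right away.
def weight_variable(shape):
    return tf.Variable(tf.truncated_normal(shape, stddev=0.01))

x = tf.placeholder(tf.float32, [None, 512])   # features from the last hidden layer
y_ = tf.placeholder(tf.float32, [None, 9])    # one-hot labels for the 9 classes

W = weight_variable([512, 9])
b = tf.Variable(tf.zeros([9]))
logits = tf.matmul(x, W) + b

cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=logits))
# A conservative learning rate; 1e-4 is a common starting point with Adam.
train_step = tf.train.AdamOptimizer(learning_rate=1e-4).minimize(cross_entropy)

probs = tf.nn.softmax(logits)                 # probabilities at evaluation time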
