Poor accuracy with LSTM text classification in Keras

Time: 2023-02-11 13:57:00

I'm going crazy with this project. This is multi-label text classification with an LSTM in Keras. My model is this:

from keras.models import Sequential
from keras.layers import Embedding, Dropout, LSTM, Dense, Activation
from keras.optimizers import Adam

model = Sequential()

# Pretrained embeddings; mask_zero=True lets downstream layers skip padding.
model.add(Embedding(max_features, embeddings_dim, input_length=max_sent_len,
                    mask_zero=True, weights=[embedding_weights]))
model.add(Dropout(0.25))
model.add(LSTM(units=embeddings_dim, activation='sigmoid',
               recurrent_activation='hard_sigmoid', return_sequences=True))
model.add(Dropout(0.25))
model.add(LSTM(units=embeddings_dim, activation='sigmoid',
               recurrent_activation='hard_sigmoid', return_sequences=False))
model.add(Dropout(0.25))
model.add(Dense(num_classes))
model.add(Activation('sigmoid'))

adam = Adam(learning_rate=0.04)
model.compile(optimizer=adam, loss='categorical_crossentropy', metrics=['accuracy'])

The problem is that my accuracy is too low. With binary_crossentropy I get good accuracy, but the predictions are wrong! Switching to categorical_crossentropy, I get very low accuracy. Do you have any suggestions?

Here is my code: GitHubProject - Multi-Label-Text-Classification

2 Solutions

#1 (score: 2)

In the last layer, the activation function you are using is sigmoid, so binary_crossentropy should be used. In case you want to use categorical_crossentropy, then use softmax as the activation function in the last layer.
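To make the pairing concrete, here is a minimal sketch of the two valid combinations; num_classes and feature_dim are placeholders, not names from the original post:

from keras.models import Sequential
from keras.layers import Dense

# Multi-label: each sample can carry several labels, so use
# independent sigmoid outputs with binary_crossentropy.
multi_label = Sequential()
multi_label.add(Dense(num_classes, input_dim=feature_dim, activation='sigmoid'))
multi_label.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Single-label multi-class: exactly one label per sample, so use
# a softmax output with categorical_crossentropy.
single_label = Sequential()
single_label.add(Dense(num_classes, input_dim=feature_dim, activation='softmax'))
single_label.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])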

Now, coming to the other part of your model: since you are working with text, I would suggest tanh as the activation function in the LSTM layers.

You can also try the LSTM's built-in dropout options, dropout and recurrent_dropout:

LSTM(units, dropout=0.2, recurrent_dropout=0.2, activation='tanh')

You can set units to 64 or 128. Start from a small number, and after testing, scale it up toward 1024.

You can also try adding a convolutional layer to extract features, or use a Bidirectional LSTM, though Bidirectional-based models take longer to train.
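A minimal sketch of both ideas, reusing the placeholder names from the question (max_features, embeddings_dim, max_sent_len, num_classes); this is one possible layout, not the original poster's code:

from keras.models import Sequential
from keras.layers import Embedding, Conv1D, MaxPooling1D, Bidirectional, LSTM, Dense

model = Sequential()
model.add(Embedding(max_features, embeddings_dim, input_length=max_sent_len))
# Conv1D extracts local n-gram features before the recurrent layer.
model.add(Conv1D(filters=64, kernel_size=3, activation='relu'))
model.add(MaxPooling1D(pool_size=2))
# A bidirectional LSTM reads the sequence in both directions (slower to train).
model.add(Bidirectional(LSTM(64, dropout=0.2, recurrent_dropout=0.2, activation='tanh')))
model.add(Dense(num_classes, activation='sigmoid'))  # multi-label output head
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])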

Moreover, since you are working with text, preprocessing and the size of the training data always play a much bigger role than expected.

Edited

Add class weights in the fit parameters:

import numpy as np
from sklearn.utils import class_weight

# 'labels' are the raw class labels; 'le' is a fitted LabelEncoder.
class_weights = class_weight.compute_class_weight('balanced',
                                                  classes=np.unique(labels),
                                                  y=labels)
class_weights_dict = dict(zip(le.transform(le.classes_),
                              class_weights))

model.fit(x_train, y_train, validation_split=0.1,  # e.g. hold out 10% for validation
          class_weight=class_weights_dict)

#2 (score: 1)

Change:

model.add(Activation('sigmoid'))

to:

model.add(Activation('softmax'))
