How do LSTM units contain spatial or sequential information?

Time: 2022-10-30 09:32:29

As I understand it, LSTM units are linked in sequence: each unit produces an output, and each LSTM unit passes that output to the next LSTM unit in the chain.

However, don't you feed your entire input into every LSTM unit? I don't see how this chain reflects the structure of data in which the sequential order matters.

Can someone explain where I go wrong? I'm particularly interested in the version of LSTM implemented in Keras, but every answer is very welcome!

1 solution

#1



No, LSTM units are all parallel.


The sequence exists only in the data itself, when you separate a dimension to be what they call the time steps. Data passed to an LSTM is shaped as (Batch Size, Time Steps, Data Size).

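A concrete sketch of that shape convention, assuming TensorFlow's bundled Keras:

```python
import numpy as np
import tensorflow as tf

# A toy batch: 4 samples, each a sequence of 10 time steps with
# 8 features per step — i.e. (Batch Size, Time Steps, Data Size).
x = np.random.rand(4, 10, 8).astype("float32")

# A single LSTM layer with 32 units; "units" sets the output size,
# not the number of sequence steps.
y = tf.keras.layers.LSTM(32)(x)
print(y.shape)  # (4, 32)
```

The layer walks the 10 time steps internally; the 32 units are just the width of each step's output.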

The sequence occurs in "time steps", but all units work in parallel.


Even an LSTM with only one unit will still process the sequence over the time steps.

What happens in LSTMs is that they have a "state". It is an internal matrix that acts as the layer's memory. At each sequence step there are "gates" (other matrices) that decide, based on that step's input, whether that step changes the state and by how much. There is also a "forget gate" that decides whether the old state is kept or discarded.
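That recurrence can be sketched in plain NumPy. This is illustrative only — Keras's actual implementation differs in details such as weight layout and initialization — but it shows the state `(h, c)` being carried from step to step:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One time step of a plain LSTM cell (illustrative sketch)."""
    units = h_prev.shape[-1]
    # All four gate pre-activations, computed from the current input
    # and the previous step's output.
    z = x_t @ W + h_prev @ U + b
    i = sigmoid(z[:, 0*units:1*units])   # input gate: how much new info enters
    f = sigmoid(z[:, 1*units:2*units])   # forget gate: keep or drop the old state
    g = np.tanh(z[:, 2*units:3*units])   # candidate update to the state
    o = sigmoid(z[:, 3*units:4*units])   # output gate
    c_t = f * c_prev + i * g             # the internal state ("memory")
    h_t = o * np.tanh(c_t)               # the step's output
    return h_t, c_t

# The layer walks the time steps sequentially, carrying (h, c) forward:
batch, steps, features, units = 4, 10, 8, 32
rng = np.random.default_rng(0)
x = rng.standard_normal((batch, steps, features))
W = rng.standard_normal((features, 4 * units)) * 0.1
U = rng.standard_normal((units, 4 * units)) * 0.1
b = np.zeros(4 * units)
h = np.zeros((batch, units))
c = np.zeros((batch, units))
for t in range(steps):
    h, c = lstm_step(x[:, t, :], h, c, W, U, b)
print(h.shape)  # (4, 32)
```

The sequential order matters precisely because each step's `h` and `c` depend on the previous step's values.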


In Keras, you can set the argument return_sequences to True or False.

If True, the result will carry the outputs of every time step.
If False, only the final step's output is returned.

In both cases, units is just a "size" of the result. (Much like the units in a Dense layer or the filters in a convolutional layer: more units means more power and more features, but not more steps.)

The output with return_sequences=False will have only the units as its size: (Batch Size, Units)
The output with return_sequences=True will keep the time steps: (Batch Size, Time Steps, Units)
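The two output shapes can be compared side by side (again assuming TensorFlow's Keras):

```python
import numpy as np
import tensorflow as tf

# (Batch Size, Time Steps, Data Size)
x = np.random.rand(4, 10, 8).astype("float32")

# Default return_sequences=False: only the last step's output survives.
last_only = tf.keras.layers.LSTM(32)(x)
print(last_only.shape)  # (4, 32)

# return_sequences=True: one output vector per time step.
all_steps = tf.keras.layers.LSTM(32, return_sequences=True)(x)
print(all_steps.shape)  # (4, 10, 32)
```

return_sequences=True is what you want when stacking LSTM layers, since the next LSTM layer also expects a (Batch, Time Steps, Features) input.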
