deeplearning.ai - Recurrent Neural Networks

Date: 2022-12-14 17:20:59

Sequence Models
Andrew Ng

Why sequence models

Examples

Speech recognition, Music generation, Sentiment classification, DNA sequence analysis, Machine translation, Video activity recognition, Named entity recognition

Notation

  • $x^{(i)\langle t \rangle}$: the t-th element of the i-th input example

  • $T_x^{(i)}$: the length of the i-th input sequence

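  • Build a dictionary (a column vector of words) and represent each word as a one-hot vector marking its position in the dictionary

  • UNK: unknown word, a token standing in for any word not in the dictionary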
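The one-hot encoding with an UNK fallback can be sketched as follows. This is a minimal illustration, not code from the course: the toy vocabulary, the `<UNK>` spelling, and the helper name `one_hot` are all assumptions.

```python
import numpy as np

# Hypothetical toy vocabulary; a real dictionary would hold many thousands of words.
vocab = ["a", "and", "harry", "potter", "<UNK>"]
word_to_index = {word: i for i, word in enumerate(vocab)}

def one_hot(word):
    """Return a one-hot column vector of shape (len(vocab), 1) for `word`."""
    # Words missing from the dictionary fall back to the <UNK> entry.
    index = word_to_index.get(word.lower(), word_to_index["<UNK>"])
    vector = np.zeros((len(vocab), 1))
    vector[index, 0] = 1.0
    return vector

x1 = one_hot("Harry")      # known word: the 1 sits at index 2
x2 = one_hot("Hermione")   # unknown word: the 1 sits at the <UNK> index 4
```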

Recurrent Neural Network Model

  • Inputs and outputs can be different lengths in different examples

    In other words, the input and output sizes are not fixed from one example to the next

  • at each time-step, RNN passes on activation to the next time-step

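  • The network scans through the input sequence from left to right

  • The same parameters $W_{ax}$, $W_{aa}$, $W_{ya}$ are shared across all time steps

  • A prediction at time $t$ uses only information from earlier in the sequence

  • A BRNN (bidirectional recurrent neural network) also uses information from later in the sequence

  • $a^{\langle 0 \rangle} = 0$, $a^{\langle 1 \rangle} = g_1(W_{aa} a^{\langle 0 \rangle} + W_{ax} x^{\langle 1 \rangle} + b_a)$, $\hat{y}^{\langle 1 \rangle} = g_2(W_{ya} a^{\langle 1 \rangle} + b_y)$

  • Activation functions: $g_1$ is usually tanh; $g_2$ is usually sigmoid or softmax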
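The forward recurrence above can be sketched in NumPy. This is a minimal illustration under stated assumptions: the layer sizes and random weights are made up, and `rnn_forward` is a hypothetical helper name, with tanh as $g_1$ and softmax as $g_2$.

```python
import numpy as np

def softmax(z):
    """Column-wise softmax, shifted by the max for numerical stability."""
    e = np.exp(z - z.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)

def rnn_forward(xs, W_aa, W_ax, W_ya, b_a, b_y):
    """Run a basic RNN over a list of column vectors xs (one per time step)."""
    a = np.zeros((W_aa.shape[0], 1))              # a<0> = 0
    activations, predictions = [], []
    for x in xs:                                  # scan left to right
        a = np.tanh(W_aa @ a + W_ax @ x + b_a)    # g1 = tanh
        y_hat = softmax(W_ya @ a + b_y)           # g2 = softmax
        activations.append(a)
        predictions.append(y_hat)
    return activations, predictions

# Hypothetical sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
n_a, n_x, n_y, T_x = 4, 3, 2, 5
W_aa = rng.normal(size=(n_a, n_a))
W_ax = rng.normal(size=(n_a, n_x))
W_ya = rng.normal(size=(n_y, n_a))
b_a = np.zeros((n_a, 1))
b_y = np.zeros((n_y, 1))
xs = [rng.normal(size=(n_x, 1)) for _ in range(T_x)]

activations, predictions = rnn_forward(xs, W_aa, W_ax, W_ya, b_a, b_y)
```

The same weight matrices are reused inside the loop, which is the parameter sharing across time steps noted in the list.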

Backpropagation through time

Different types of RNNs

Language model and sequence generation

  • Terminology: corpus, tokenize, End Of Sentence (EOS) token

  • $\hat{y}^{\langle 1 \rangle}$ outputs, for each word in the dictionary, the probability that the first word is that word

  • Given the preceding words, the model predicts what the next word is
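Predicting each word from the words before it amounts to factoring the probability of a whole sentence with the chain rule. The formula below is the standard factorization, added here as a reading aid rather than quoted from the notes; $T_y$ is assumed to denote the output length.

```latex
P\big(y^{\langle 1 \rangle}, \dots, y^{\langle T_y \rangle}\big)
  = \prod_{t=1}^{T_y} P\big(y^{\langle t \rangle} \mid y^{\langle 1 \rangle}, \dots, y^{\langle t-1 \rangle}\big)
```

Each factor on the right is the softmax output $\hat{y}^{\langle t \rangle}$ evaluated at the word that actually appears at position $t$.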

Sampling novel sequences

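  • After training a sequence model, an informal way to see what it has learned is to sample novel sequences from it

  • character-level language model, word-level language model

  • Word-level language models can capture long-range dependencies; character-level language models are somewhat worse at this and are more expensive to train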
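Sampling can be sketched as a loop that draws a token from the softmax output and feeds it back in as the next input. This is a minimal sketch with untrained random weights, so the output is meaningless; `sample_sequence`, the vocabulary size, and the choice of index 0 for EOS are assumptions made for illustration.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)

def sample_sequence(params, eos_index, max_len, rng):
    """Sample token indices one step at a time until EOS or max_len."""
    W_aa, W_ax, W_ya, b_a, b_y = params
    n_a, vocab_size = W_aa.shape[0], W_ya.shape[0]
    a = np.zeros((n_a, 1))            # a<0> = 0
    x = np.zeros((vocab_size, 1))     # x<1> = 0, nothing generated yet
    indices = []
    while len(indices) < max_len:
        a = np.tanh(W_aa @ a + W_ax @ x + b_a)
        p = softmax(W_ya @ a + b_y).ravel()
        idx = int(rng.choice(vocab_size, p=p))   # draw from the predicted distribution
        indices.append(idx)
        if idx == eos_index:
            break
        x = np.zeros((vocab_size, 1))            # feed the sampled token back in
        x[idx, 0] = 1.0
    return indices

# Hypothetical untrained model, for illustration only.
rng = np.random.default_rng(0)
n_a, vocab_size, eos_index = 4, 6, 0
params = (
    rng.normal(size=(n_a, n_a)),
    rng.normal(size=(n_a, vocab_size)),
    rng.normal(size=(vocab_size, n_a)),
    np.zeros((n_a, 1)),
    np.zeros((vocab_size, 1)),
)
sample = sample_sequence(params, eos_index, max_len=10, rng=rng)
```

The `max_len` cap is a practical safeguard, since an untrained or poorly trained model may never emit EOS.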

Vanishing gradients with RNNs

  • The basic RNN models are not good at capturing very long-term dependency.
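  • local influences: each output is affected mainly by nearby inputs
  • gradient clipping: a remedy for exploding gradients; when the gradient exceeds some threshold, it is rescaled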
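One common way to rescale is clipping by global norm, sketched below. The notes do not say which variant the course uses (element-wise clipping to a fixed range is another option), so treat `clip_by_norm` and the gradient names as illustrative assumptions.

```python
import numpy as np

def clip_by_norm(gradients, max_norm):
    """Rescale all gradients together if their global L2 norm exceeds max_norm."""
    total_norm = np.sqrt(sum(np.sum(g ** 2) for g in gradients.values()))
    if total_norm <= max_norm:
        return dict(gradients)              # small enough, leave unchanged
    scale = max_norm / total_norm
    return {name: g * scale for name, g in gradients.items()}

# Hypothetical gradients whose global norm is sqrt(3^2 + 4^2) = 5.
grads = {"dW_aa": np.array([[3.0, 0.0]]), "db_a": np.array([[4.0]])}
clipped = clip_by_norm(grads, max_norm=1.0)
clipped_norm = np.sqrt(sum(np.sum(g ** 2) for g in clipped.values()))
```

Rescaling by a single factor keeps the direction of the update and only shortens its length.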

Gated Recurrent Unit (GRU)

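  • c, memory cell: $\tilde{c}^{\langle t \rangle} = \tanh(W_c [c^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_c)$, and in the GRU $c^{\langle t \rangle} = a^{\langle t \rangle}$

  • $\Gamma_u = \sigma(W_u [c^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_u)$, update gate; this gate value is between 0 and 1

  • The gate decides when to update c: $c^{\langle t \rangle} = \Gamma_u * \tilde{c}^{\langle t \rangle} + (1 - \Gamma_u) * c^{\langle t-1 \rangle}$, where $*$ is element-wise multiplication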
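The simplified GRU update in this list (update gate only) can be sketched in NumPy. The sizes, random weights, and the helper name `gru_step` are assumptions for illustration; the demo uses a strongly negative gate bias to show the memory cell being carried over unchanged.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(c_prev, x, W_c, W_u, b_c, b_u):
    """One step of the simplified GRU: candidate, update gate, gated blend."""
    concat = np.vstack([c_prev, x])                  # [c<t-1>, x<t>]
    c_tilde = np.tanh(W_c @ concat + b_c)            # candidate memory value
    gamma_u = sigmoid(W_u @ concat + b_u)            # update gate in (0, 1)
    c = gamma_u * c_tilde + (1 - gamma_u) * c_prev   # element-wise blend
    return c, gamma_u

# Hypothetical sizes and weights, for illustration only.
rng = np.random.default_rng(0)
n_c, n_x = 3, 2
W_c = rng.normal(size=(n_c, n_c + n_x))
W_u = rng.normal(size=(n_c, n_c + n_x))
b_c = np.zeros((n_c, 1))
b_u = np.full((n_c, 1), -100.0)   # strongly negative bias pushes the gate toward 0
c_prev = rng.normal(size=(n_c, 1))
x = rng.normal(size=(n_x, 1))

c, gamma_u = gru_step(c_prev, x, W_c, W_u, b_c, b_u)
```

With the gate near 0 the cell keeps its previous value almost exactly, which is how a GRU can hold information across many time steps and ease the vanishing gradient problem.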

Long Short Term Memory (LSTM)

  • Three gates: update, forget, output

  • peephole connection: the gates also take $c^{\langle t-1 \rangle}$ as input
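A single LSTM step with the three gates can be sketched as below. The equations follow the commonly taught formulation (separate update and forget gates, activation $a = \Gamma_o * \tanh(c)$), which these notes do not spell out, so treat them as an assumption; the peephole variant is not included, and the parameter names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(a_prev, c_prev, x, params):
    """One LSTM step without peephole connections."""
    concat = np.vstack([a_prev, x])                             # [a<t-1>, x<t>]
    c_tilde = np.tanh(params["W_c"] @ concat + params["b_c"])   # candidate value
    gamma_u = sigmoid(params["W_u"] @ concat + params["b_u"])   # update gate
    gamma_f = sigmoid(params["W_f"] @ concat + params["b_f"])   # forget gate
    gamma_o = sigmoid(params["W_o"] @ concat + params["b_o"])   # output gate
    c = gamma_u * c_tilde + gamma_f * c_prev                    # new memory cell
    a = gamma_o * np.tanh(c)                                    # new activation
    return a, c

# Hypothetical sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
n_a, n_x = 3, 2
params = {}
for gate in ("c", "u", "f", "o"):
    params[f"W_{gate}"] = rng.normal(size=(n_a, n_a + n_x))
    params[f"b_{gate}"] = np.zeros((n_a, 1))

a, c = lstm_step(np.zeros((n_a, 1)), np.zeros((n_a, 1)),
                 rng.normal(size=(n_x, 1)), params)
```

Unlike the GRU, the forget gate is independent of the update gate, so the cell can keep old content and add new content at the same time.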

Bidirectional RNN

  • combine information from the past, the present and the future

  • Part of the forward propagation is computed from left to right and part from right to left

  • For many natural language processing problems, a bidirectional RNN with LSTM units is the most commonly used model

  • need the entire sequence of data before making predictions
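The two directions can be sketched with plain tanh RNN cells (not LSTM units) to keep the example short. The helper names, sizes, weights, and the sigmoid output layer are assumptions for illustration. The last lines check that changing the final input changes the first prediction, which a unidirectional RNN could not do.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def run_rnn(xs, W_a, b_a):
    """Run a tanh RNN over xs in the given order and return all hidden states."""
    a = np.zeros((W_a.shape[0], 1))
    states = []
    for x in xs:
        a = np.tanh(W_a @ np.vstack([a, x]) + b_a)
        states.append(a)
    return states

def brnn_forward(xs, W_fwd, b_fwd, W_bwd, b_bwd, W_y, b_y):
    """Combine a left-to-right pass and a right-to-left pass at every time step."""
    a_fwd = run_rnn(xs, W_fwd, b_fwd)                # past and present
    a_bwd = run_rnn(xs[::-1], W_bwd, b_bwd)[::-1]    # future, re-aligned to time order
    return [sigmoid(W_y @ np.vstack([f, b]) + b_y) for f, b in zip(a_fwd, a_bwd)]

# Hypothetical sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
n_a, n_x, T_x = 3, 2, 4
W_fwd = 0.5 * rng.normal(size=(n_a, n_a + n_x))
W_bwd = 0.5 * rng.normal(size=(n_a, n_a + n_x))
b_fwd = np.zeros((n_a, 1))
b_bwd = np.zeros((n_a, 1))
W_y = rng.normal(size=(1, 2 * n_a))
b_y = np.zeros((1, 1))
xs = [rng.normal(size=(n_x, 1)) for _ in range(T_x)]

y_hat = brnn_forward(xs, W_fwd, b_fwd, W_bwd, b_bwd, W_y, b_y)

# Perturb only the last input and recompute.
xs_changed = xs[:-1] + [xs[-1] + 1.0]
y_hat_changed = brnn_forward(xs_changed, W_fwd, b_fwd, W_bwd, b_bwd, W_y, b_y)
first_output_changed = not np.array_equal(y_hat[0], y_hat_changed[0])
```

The backward pass cannot start until the last input is available, which is why the whole sequence is needed before any prediction is made.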

Deep RNNs

$a^{[l]\langle t \rangle}$: activation value of layer $l$ at time step $t$
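In the usual formulation of a deep RNN, each layer's activation depends on the same layer at the previous time step and on the layer below at the current time step. The equation below is that standard form, not taken from these notes; the names $W_a^{[l]}$, $b_a^{[l]}$ and $g$ are assumed, with $a^{[0]\langle t \rangle} = x^{\langle t \rangle}$.

```latex
a^{[l]\langle t \rangle}
  = g\big(W_a^{[l]} \, [\, a^{[l]\langle t-1 \rangle},\; a^{[l-1]\langle t \rangle} \,] + b_a^{[l]}\big)
```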
