[Paper Notes] Dynamic Routing Between Capsules

Time: 2022-11-17 19:38:06

Dynamic Routing Between Capsules

2018-09-16 20:18:30

Paper: https://arxiv.org/pdf/1710.09829.pdf

PyTorch Implementation: https://github.com/timomernick/pytorch-capsule

Abstract

The experiments in this paper show that a capsule network recognizes overlapping digits better than a conventional CNN ("we show that a discriminatively trained, multi-layer capsule system achieves state-of-the-art performance on MNIST and is considerably better than a convolutional net at recognizing highly overlapping digits"). A capsule network can also reach good results with less training data.

How the vector inputs and outputs of a capsule are computed

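We want the length of a capsule's output vector to represent the probability that the entity is present. The paper therefore uses a non-linear "squashing" function, which ensures that short vectors are shrunk to a length close to zero and long vectors to a length close to 1. The activation function is: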

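$$v_j = \frac{\|s_j\|^2}{1 + \|s_j\|^2} \, \frac{s_j}{\|s_j\|}$$

where $v_j$ is the vector output of capsule $j$ and $s_j$ is its total input.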
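The behaviour described above can be checked numerically. The following is a minimal sketch in plain Python, assuming the squashing form given in the paper, $v_j = \frac{\|s_j\|^2}{1 + \|s_j\|^2} \frac{s_j}{\|s_j\|}$. The names `squash` and `length` and the small `eps` term that guards against division by zero are illustrative assumptions, not part of the paper or of the linked PyTorch code.

```python
import math

def length(v):
    """Euclidean length of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))

def squash(s, eps=1e-9):
    """Rescale s so that its length lies in [0, 1) and its direction is kept.

    eps is an assumption added here to avoid dividing by zero for s = 0.
    """
    sq_norm = sum(x * x for x in s)
    norm = math.sqrt(sq_norm)
    scale = sq_norm / (1.0 + sq_norm) / (norm + eps)
    return [scale * x for x in s]

# A short input vector ends up with a length near 0,
# a long input vector ends up with a length near 1.
short_out = squash([0.01, 0.02])
long_out = squash([30.0, 40.0])
print(length(short_out), length(long_out))
```

The direction of the input is preserved; only the length changes, which is what lets the length be read as a probability.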

For every layer of capsules except the first, the total input to a capsule, $s_j$, is a weighted sum over all "prediction vectors" $\hat{u}_{j|i}$ from the capsules in the layer below. Each prediction vector is produced by multiplying the output $u_i$ of a capsule in the layer below by a weight matrix $W_{ij}$:

$$s_j = \sum_i c_{ij} \, \hat{u}_{j|i}, \qquad \hat{u}_{j|i} = W_{ij} u_i$$
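To make the two steps concrete, here is a small sketch in plain Python that computes the prediction vectors $\hat{u}_{j|i} = W_{ij} u_i$ and then the weighted sum $s_j = \sum_i c_{ij} \hat{u}_{j|i}$ for a single capsule $j$. The helper names `matvec` and `total_input`, the toy dimensions, and all numeric values are assumptions chosen for illustration; in a real network $W_{ij}$ is learned and $c_{ij}$ comes from routing.

```python
def matvec(W, u):
    """Multiply matrix W (list of rows) by vector u."""
    return [sum(w * x for w, x in zip(row, u)) for row in W]

def total_input(W_j, u, c_j):
    """Return (s_j, u_hat) for one capsule j.

    W_j[i] is the weight matrix W_ij, u[i] is the output of lower
    capsule i, and c_j[i] is the coupling coefficient c_ij.
    """
    # Prediction vectors: u_hat[i] = W_ij @ u_i
    u_hat = [matvec(W_ij, u_i) for W_ij, u_i in zip(W_j, u)]
    dim = len(u_hat[0])
    # Weighted sum over the lower capsules i
    s_j = [sum(c * uh[d] for c, uh in zip(c_j, u_hat)) for d in range(dim)]
    return s_j, u_hat

# Toy setup: two lower capsules with 2-D outputs, one upper capsule with a 3-D input.
u = [[1.0, 0.0], [0.0, 2.0]]
W_j = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],  # W_0j, shape 3x2
    [[0.5, 0.0], [0.0, 0.5], [0.0, 0.0]],  # W_1j, shape 3x2
]
c_j = [0.25, 0.75]  # hypothetical values of c_0j and c_1j

s_j, u_hat = total_input(W_j, u, c_j)
print(u_hat, s_j)
```

Note that `c_j` holds the coefficients from different lower capsules into the same capsule $j$, so these values are not required to sum to 1; the normalization applies per lower capsule $i$ across all $j$.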

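Here the $c_{ij}$ are coupling coefficients that are determined by the iterative dynamic routing process.

The coupling coefficients between capsule $i$ and all capsules in the layer above (the layer that contains capsule $j$) sum to 1 and are determined by a "routing softmax":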

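$$c_{ij} = \frac{\exp(b_{ij})}{\sum_k \exp(b_{ik})}$$

where, in the paper, the initial logits $b_{ij}$ are the log prior probabilities that capsule $i$ should be coupled to capsule $j$.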
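The normalization property of the routing softmax can be seen in a short sketch. It assumes the form from the paper, $c_{ij} = \exp(b_{ij}) / \sum_k \exp(b_{ik})$, where $b_{ij}$ are the routing logits. The function name `routing_softmax` and the example logits are illustrative assumptions; subtracting the maximum logit is a common numerical-stability choice made here and does not change the result.

```python
import math

def routing_softmax(b_i):
    """Coupling coefficients c_ij for one lower capsule i.

    b_i[j] is the logit b_ij toward capsule j in the layer above.
    """
    m = max(b_i)  # shift by the max logit for numerical stability
    exps = [math.exp(b - m) for b in b_i]
    total = sum(exps)
    return [e / total for e in exps]

# With all logits equal (the paper starts routing from b_ij = 0),
# capsule i is coupled equally to every capsule in the layer above.
c_uniform = routing_softmax([0.0, 0.0, 0.0, 0.0])

# A larger logit toward one capsule gives that capsule a larger share.
c_skewed = routing_softmax([2.0, 0.0, 0.0])
print(c_uniform, c_skewed)
```

In both cases the coefficients for capsule $i$ sum to 1, so routing only redistributes a fixed amount of "vote" from each lower capsule among the capsules above.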

