Is Akka.IO Tcp a bottleneck for bidirectional communication?

Date: 2022-08-26 18:40:36

EDIT: This is a duplicate of "Does Akka Tcp support full-duplex communication?" (Please don't ask the same question multiple times; the same goes for duplicating on mailing lists. This wastes the time of those who volunteer their help, reducing your chances of getting answers in the future.)



I've modified the echo server from https://github.com/akka/akka/blob/master/akka-docs/rst/scala/code/docs/io/EchoServer.scala#L96:


case Received(data) =>
  connection ! Write(data, Ack(currentOffset)) // echo back, requesting an ack for flow control
  log.debug("same {}", sender.eq(connection))  // true: reads and writes go through the same connection actor
  buffer(data)

That means incoming and outgoing messages are handled by the same actor, so a single worker thread (the one taking messages from that actor's mailbox) processes both read and write operations. That looks like a potential bottleneck.


In the "classical" world I can create one thread to read from a socket and another to write to it, and get truly simultaneous communication.

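The classical two-thread model can be sketched in plain Java (rather than Scala/Akka, so it runs standalone); the in-process echo server and port are purely for demonstration. The reader thread drains the socket while the writer thread pushes data, so neither direction ever waits for the other:

```java
import java.io.*;
import java.net.*;

public class FullDuplexDemo {
    public static void main(String[] args) throws Exception {
        // In-process echo server on an ephemeral port, for demonstration only.
        ServerSocket server = new ServerSocket(0);
        Thread echo = new Thread(() -> {
            try (Socket s = server.accept()) {
                s.getInputStream().transferTo(s.getOutputStream());
            } catch (IOException ignored) { }
        });
        echo.start();

        try (Socket socket = new Socket("localhost", server.getLocalPort())) {
            // Reader thread: consumes incoming data independently of the writer.
            Thread reader = new Thread(() -> {
                try (BufferedReader in = new BufferedReader(
                        new InputStreamReader(socket.getInputStream()))) {
                    String line;
                    while ((line = in.readLine()) != null)
                        System.out.println("received: " + line);
                } catch (IOException ignored) { }
            });
            // Writer thread: sends data without waiting for replies.
            Thread writer = new Thread(() -> {
                try {
                    PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
                    for (int i = 0; i < 3; i++) out.println("message " + i);
                    socket.shutdownOutput(); // signal end of the write direction
                } catch (IOException ignored) { }
            });
            reader.start();
            writer.start();
            writer.join();
            reader.join();
        }
        echo.join();
        server.close();
    }
}
```

Note that even here the kernel serializes access to the socket's send and receive buffers; the two threads only overlap the user-space waiting, which is exactly what the answer below argues the single-actor design already avoids paying much for.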

Update: discussion in the akka-dev Google group: https://groups.google.com/forum/#!topic/akka-dev/mcs5eLKiAVQ


1 Answer

#1


While a single Actor either reads or writes at any given point in time, each of these operations takes very few cycles, since it runs only when there is data to be read or buffer space available to write into. With a system call overhead of ~1µs and the default buffer size of 128kiB, you should be able to transfer up to 100GiB/s in total. That is indeed a ceiling, but probably not one you will hit today in practice (it roughly coincides with typical CPU memory bandwidth, so a higher data rate is currently impossible anyway). Once this changes we can split the reading and writing responsibilities between different selectors and wake up different Actors, but before doing that we'll need to verify that there actually is a measurable effect.

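As a sanity check on that estimate (assuming, as the answer does, ~1µs per system call and one full 128kiB buffer moved per call):

```java
public class ThroughputEstimate {
    public static void main(String[] args) {
        double bufferBytes = 128 * 1024;   // default buffer size: 128 KiB
        double syscallSeconds = 1e-6;      // ~1 microsecond per system call
        // One full buffer moved per system call gives the upper bound:
        double gibPerSecond = (bufferBytes / syscallSeconds) / (1024.0 * 1024 * 1024);
        System.out.printf("upper bound: %.0f GiB/s%n", gibPerSecond); // prints "upper bound: 122 GiB/s"
    }
}
```

So "up to 100GiB/s" is a slightly conservative rounding of roughly 122GiB/s under these assumptions.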

The other question that needs answering is which operating system kernels actually allow concurrent operations on a single socket from multiple threads. I have not researched this yet, but I would not be surprised to find that fully independent locking will be hard to do and there might not (yet) be a reason to expend that effort.

