TCP Delayed ACK: similarities and differences between the Linux and Windows implementations

Time: 2022-06-01 16:41:34

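I won't spend time on the concept of TCP Delayed ACK itself. Copying and pasting standards documents and other people's articles would only make this post longer and lower the odds of anyone reading it to the end. Besides, this is not a research paper, and hardly anyone reads appended references, so I have skipped all of that.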

A gentle breeze can dry a pair of leather shoes, but can it blow away sorrow?

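A note added after finishing: I wrote this paragraph after the rest of the post was done. I had meant to keep this post under 2,000 words, or better yet within 800 to 1,000, but I ended up saying too much anyway... I am in a very good mood today, because in a dream I actually sorted out the details of the War of the Three Henrys. I feel my understanding of Western history has moved a step forward, and I would call today a milestone! I originally wanted to write a post about the history of European dynasties, but I don't have much time, so that has to wait. Then I remembered that I actually work in IT and am not a trained historian, so I went back to my proper trade and wrote one about TCP instead.
        When someone often drinks light wine alone and never gets drunk, what they are drinking is no longer wine but sentiment!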

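Preface
When Delayed ACK comes up, many people assume that, since it is a TCP feature, there must be a switch that turns it on or off at will. Some natural assumptions:
1. The system has a switch, such as a sysctl or a Windows registry entry, that enables TCP Delayed ACK for the whole machine;
2. The system's programming API provides a socket option, so setsockopt can enable or disable TCP Delayed ACK for an individual connection;
3. Some operating systems, or some versions of them, may not support a TCP Delayed ACK option at all.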

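There is nothing wrong with these assumptions. But if you look at Delayed ACK from another angle, you may arrive at a different view, and Linux embodies exactly that other view. That, of course, is the subject of this post.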
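For reference, the per-connection socket option from assumption 2 does exist on Linux under the name TCP_QUICKACK, and the Linux server later in this post uses it. Below is a minimal sketch, assuming a Linux system; the helper name set_quickack is hypothetical and does not come from any library. According to the tcp(7) manual page, the flag is not permanent: the stack may enter or leave quickack mode again on its own, which already hints that this is not a simple on/off switch.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper (not from the original post): ask Linux to enter
 * (on != 0) or leave (on == 0) quickack mode on one TCP socket.
 * Returns 0 on success, -1 on error with errno set, like setsockopt(). */
int set_quickack(int fd, int on)
{
    assert(fd >= 0); /* caller must pass an open socket descriptor */

    int v = on ? 1 : 0;
    return setsockopt(fd, IPPROTO_TCP, TCP_QUICKACK, &v, sizeof(v));
}
```

A caller would use it as set_quickack(sock, 1) right after accept() or connect(). Because the flag is not sticky, code that wants immediate ACKs throughout a connection would probably need to set it again after each recv(); treat that as an assumption to verify on your kernel.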

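The essence of Delayed ACK
What is Delayed ACK? That is a pseudo-question! Explaining a concept is usually easy, but it does nothing for understanding the problem. We should ask instead: why delay the ACK for data that has already been received? Some say it is to reduce ACK traffic on the network. Others say it is to give the sender a chance to burst: ACKs are held back and accumulated, and during that delay the sender may gather enough data to send, which helps throughput on long fat pipes.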
...
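You might expect me to launch into a long discussion of the Nagle algorithm and the Write-Write-Read pattern. I won't! All we need to know is that Delayed ACK and the Nagle algorithm target specific scenarios. And not only Delayed ACK and Nagle: every TCP algorithm, congestion control algorithms included, targets specific scenarios. There is no TCP algorithm that works everywhere!
        When a TCP connection starts, nobody can predict its later interaction pattern or its sequence of sends (unless you are running a replay experiment!!). So if you enable Delayed ACK and then happen to hit a scenario that does not suit it, such as Write-Write-Read, won't you lose out? That leaves us facing one question:
Is it better to enable Delayed ACK, or to disable it?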

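Main text
Keep in mind the question that closed the previous section; this post revolves around it. I will not analyze the various scenarios in which Delayed ACK causes problems (such as Write-Write-Read). Instead I start from a single scenario, which I constructed myself in order to dissect two classic implementations of Delayed ACK.
        Let's look at the scenario. But first, a statement.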
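A statement on what triggers Delayed ACK
Generally speaking, once a TCP receiver is holding more than one MSS of unacknowledged data, it replies with an ACK immediately no matter what; in a stream of full-sized segments, that means the ACK goes out by the second segment. This 2-MSS threshold balances latency against throughput. I do not know why 2 MSS was chosen: it may be an empirical value, or a number that some guru (or some idiot) pulled out of thin air. So, to keep things simple, in everything that follows the amount of data sent each time (for readers who know the Linux kernel: in each transmitted skb) never exceeds 1000 bytes, and the MTU in all my experiments is 1500. In other words, each transmission is never larger than 1 MSS, so the relationship between Delayed ACK and the MSS can be left out of the picture.
Therefore, in every case below, under the standard understanding of Delayed ACK, every transmission triggers a delayed ACK at the receiver!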
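The arithmetic behind that statement can be sketched as follows. This is a simplified model of the textbook rule only (RFC 1122 asks for an ACK for at least every second full-sized segment); it is not how either the Linux or the Windows stack decides, and it ignores the delayed ACK timer entirely. The function names and the 40-byte header figure (IPv4 plus TCP, no options) are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: 20-byte IPv4 header + 20-byte TCP header, no options. */
enum { IP_TCP_HEADER_BYTES = 40 };

/* Largest TCP payload that fits in one packet of the given MTU. */
size_t mss_from_mtu(size_t mtu)
{
    assert(mtu > IP_TCP_HEADER_BYTES);
    return mtu - IP_TCP_HEADER_BYTES;
}

/* Simplified textbook rule: ACK at once when at least two full-sized
 * segments' worth of data is still unacknowledged; otherwise the ACK
 * may be delayed. Real stacks add timers and further heuristics. */
bool must_ack_now(size_t unacked_bytes, size_t mss)
{
    return unacked_bytes >= 2 * mss;
}
```

With an MTU of 1500 the model gives an MSS of 1460, so a single 700-, 300- or 1000-byte send stays below both one MSS and the 2-MSS threshold, which is why each send in the experiment is a candidate for a delayed ACK.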
A scenario
I deployed three pieces of code on three machines:
1. Server_Windows.c, deployed on a Windows 7 machine
#undef UNICODE
#define WIN32_LEAN_AND_MEAN

#include <windows.h>
#include <winsock2.h>
#include <ws2tcpip.h>
#include <stdlib.h>
#include <stdio.h>

// Need to link with Ws2_32.lib
#pragma comment (lib, "Ws2_32.lib")

#define DEFAULT_PORT "8080"

int __cdecl main(void)
{
    WSADATA wsaData;
    int iResult;
    SOCKET ListenSocket = INVALID_SOCKET;
    SOCKET ClientSocket = INVALID_SOCKET;
    struct addrinfo *result = NULL;
    struct addrinfo hints;
    int iSendResult;

    // Initialize Winsock
    iResult = WSAStartup(MAKEWORD(2,2), &wsaData);
    if (iResult != 0) {
        printf("WSAStartup failed with error: %d\n", iResult);
        return 1;
    }

    ZeroMemory(&hints, sizeof(hints));
    hints.ai_family = AF_INET;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_protocol = IPPROTO_TCP;
    hints.ai_flags = AI_PASSIVE;

    // Resolve the server address and port
    iResult = getaddrinfo(NULL, DEFAULT_PORT, &hints, &result);
    if (iResult != 0) {
        printf("getaddrinfo failed with error: %d\n", iResult);
        WSACleanup();
        return 1;
    }

    // Create a SOCKET for the server to listen on
    ListenSocket = socket(result->ai_family, result->ai_socktype, result->ai_protocol);
    if (ListenSocket == INVALID_SOCKET) {
        printf("socket failed with error: %d\n", WSAGetLastError());
        freeaddrinfo(result);
        WSACleanup();
        return 1;
    }

    // Setup the TCP listening socket
    iResult = bind(ListenSocket, result->ai_addr, (int)result->ai_addrlen);
    if (iResult == SOCKET_ERROR) {
        printf("bind failed with error: %d\n", WSAGetLastError());
        freeaddrinfo(result);
        closesocket(ListenSocket);
        WSACleanup();
        return 1;
    }

    freeaddrinfo(result);

    iResult = listen(ListenSocket, SOMAXCONN);
    if (iResult == SOCKET_ERROR) {
        printf("listen failed with error: %d\n", WSAGetLastError());
        closesocket(ListenSocket);
        WSACleanup();
        return 1;
    }

    // Accept a client socket
    ClientSocket = accept(ListenSocket, NULL, NULL);
    if (ClientSocket == INVALID_SOCKET) {
        printf("accept failed with error: %d\n", WSAGetLastError());
        closesocket(ListenSocket);
        WSACleanup();
        return 1;
    }

    // Not applied in the Windows build:
    int v = 1;
    //setsockopt(ClientSocket, IPPROTO_TCP, TCP_QUICKACK, &v, 4);

    // No longer need server socket
    closesocket(ListenSocket);

    // Fixed send/receive sequence used in the experiment
    do {
        char buff1[1] = {0};
        char buff2[999] = {0};
        char buff3[1000] = {0};
        char buff4[2000] = {0};

        // Round 1: send 1 byte, then read 700 + 300 + 1000
        iSendResult = send(ClientSocket, buff1, sizeof(buff1), 0);
        iResult = recv(ClientSocket, buff4, 700, 0);
        iResult = recv(ClientSocket, buff4, 300, 0);
        iResult = recv(ClientSocket, buff4, 1000, 0);

        // Round 2: send 999 bytes, then read (700 + 300) twice
        iSendResult = send(ClientSocket, buff2, sizeof(buff2), 0);
        iResult = recv(ClientSocket, buff4, 700, 0);
        iResult = recv(ClientSocket, buff4, 300, 0);
        iResult = recv(ClientSocket, buff4, 700, 0);
        iResult = recv(ClientSocket, buff4, 300, 0);

        // Round 3: send 1000 bytes, then read (700 + 300) four times
        iSendResult = send(ClientSocket, buff3, sizeof(buff3), 0);
        iResult = recv(ClientSocket, buff4, 700, 0);
        iResult = recv(ClientSocket, buff4, 300, 0);
        iResult = recv(ClientSocket, buff4, 700, 0);
        iResult = recv(ClientSocket, buff4, 300, 0);
        iResult = recv(ClientSocket, buff4, 700, 0);
        iResult = recv(ClientSocket, buff4, 300, 0);
        iResult = recv(ClientSocket, buff4, 700, 0);
        iResult = recv(ClientSocket, buff4, 300, 0);
    } while (0);

    Sleep(10 * 1000); // Windows Sleep() takes milliseconds

    // shutdown the connection since we're done
    iResult = shutdown(ClientSocket, SD_SEND);
    if (iResult == SOCKET_ERROR) {
        printf("shutdown failed with error: %d\n", WSAGetLastError());
        closesocket(ClientSocket);
        WSACleanup();
        return 1;
    }

    // cleanup
    closesocket(ClientSocket);
    WSACleanup();
    return 0;
}
I compiled it to Server.exe with Dev-C++. The code comes from MSDN; I only changed the sequence of sends and receives.
2. Server_Linux.c, deployed on a Linux CentOS machine
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>      /* close(), sleep() */
#include <sys/types.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <netinet/tcp.h> /* TCP_QUICKACK */

#define DEFAULT_PORT 8080

int main(void)
{
    int iResult;
    int ListenSocket;
    int ClientSocket;
    struct sockaddr_in server;
    int iSendResult;

    memset(&server, 0, sizeof(server));
    server.sin_family = AF_INET;
    server.sin_addr.s_addr = INADDR_ANY;
    server.sin_port = htons(DEFAULT_PORT);

    // Create a socket for the server to listen on
    ListenSocket = socket(AF_INET, SOCK_STREAM, 0);
    if (ListenSocket == -1) {
        return 1;
    }

    // Setup the TCP listening socket
    iResult = bind(ListenSocket, (struct sockaddr *)&server, sizeof(server));
    if (iResult < 0) {
        perror("bind");
        close(ListenSocket);
        return 1;
    }

    iResult = listen(ListenSocket, 3);
    if (iResult == -1) {
        close(ListenSocket);
        return 1;
    }

    // Accept a client socket
    ClientSocket = accept(ListenSocket, NULL, NULL);
    if (ClientSocket == -1) {
        close(ListenSocket);
        return 1;
    }

    // Clear quickack mode on the accepted connection
    int v = 0;
    setsockopt(ClientSocket, IPPROTO_TCP, TCP_QUICKACK, &v, sizeof(v));

    // No longer need server socket
    close(ListenSocket);

    // Fixed send/receive sequence used in the experiment
    do {
        char buff1[1] = {0};
        char buff2[999] = {0};
        char buff3[1000] = {0};
        char buff4[2000] = {0};

        // Round 1: send 1 byte, then read 700 + 300 + 1000
        iSendResult = send(ClientSocket, buff1, sizeof(buff1), 0);
        iResult = recv(ClientSocket, buff4, 700, 0);
        iResult = recv(ClientSocket, buff4, 300, 0);
        iResult = recv(ClientSocket, buff4, 1000, 0);

        // Round 2: send 999 bytes, then read (700 + 300) twice
        iSendResult = send(ClientSocket, buff2, sizeof(buff2), 0);
        iResult = recv(ClientSocket, buff4, 700, 0);
        iResult = recv(ClientSocket, buff4, 300, 0);
        iResult = recv(ClientSocket, buff4, 700, 0);
        iResult = recv(ClientSocket, buff4, 300, 0);

        // Round 3: send 1000 bytes, then read (700 + 300) four times
        iSendResult = send(ClientSocket, buff3, sizeof(buff3), 0);
        iResult = recv(ClientSocket, buff4, 700, 0);
        iResult = recv(ClientSocket, buff4, 300, 0);
        iResult = recv(ClientSocket, buff4, 700, 0);
        iResult = recv(ClientSocket, buff4, 300, 0);
        iResult = recv(ClientSocket, buff4, 700, 0);
        iResult = recv(ClientSocket, buff4, 300, 0);
        iResult = recv(ClientSocket, buff4, 700, 0);
        iResult = recv(ClientSocket, buff4, 300, 0);
    } while (0);

    sleep(10);

    // cleanup
    close(ClientSocket);
    return 0;
}
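On Linux it is compiled to Server with gcc. Note that this code is derived from Server_Windows.c above; that is, its logic is identical to the MSDN-derived code, and only the API calls are adapted for Linux. Its program semantics are therefore exactly the same as those of Server.exe built from the Windows code, so any difference you observe must come from differences between the two protocol stack implementations.
3. Client.c, deployed on a Linux Debian machine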
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

#define DEFAULT_PORT 8080

int main(int argc, char **argv)
{
    int cPort = DEFAULT_PORT;
    int cClient = 0;
    int cLen = 0;
    struct sockaddr_in cli;
    char cbuf[4096] = {0};
    char buff1[700] = {0};
    char buff2[300] = {0};
    char buff3[1000] = {0};

    if (argc < 2) {
        printf("Usage: client [server IP address]\n");
        return -1;
    }

    memset(cbuf, 0, sizeof(cbuf));
    memset(&cli, 0, sizeof(cli));
    cli.sin_family = AF_INET;
    cli.sin_port = htons(cPort);
    cli.sin_addr.s_addr = inet_addr(argv[1]);

    cClient = socket(AF_INET, SOCK_STREAM, 0);
    if (cClient < 0) {
        printf("socket() failure!\n");
        return -1;
    }

    if (connect(cClient, (struct sockaddr *)&cli, sizeof(cli)) < 0) {
        printf("connect() failure!\n");
        return -1;
    }

    // Round 1: read 1 byte, send 700, pause, then send 300 + 1000
    cLen = recv(cClient, cbuf, 1, 0);
    cLen = send(cClient, buff1, sizeof(buff1), 0);
    sleep(1);
    cLen = send(cClient, buff2, sizeof(buff2), 0);
    cLen = send(cClient, buff3, sizeof(buff3), 0);

    // Round 2: read 999 bytes, send 700 + 300, pause, send 700 + 300
    cLen = recv(cClient, cbuf, 999, 0);
    cLen = send(cClient, buff1, sizeof(buff1), 0);
    cLen = send(cClient, buff2, sizeof(buff2), 0);
    sleep(1);
    cLen = send(cClient, buff1, sizeof(buff1), 0);
    cLen = send(cClient, buff2, sizeof(buff2), 0);

    // Round 3: read 1000 bytes, then send 700 + 300 four times,
    // pausing after the first and second pairs
    cLen = recv(cClient, cbuf, 1000, 0);
    cLen = send(cClient, buff1, sizeof(buff1), 0);
    cLen = send(cClient, buff2, sizeof(buff2), 0);
    sleep(1);
    cLen = send(cClient, buff1, sizeof(buff1), 0);
    cLen = send(cClient, buff2, sizeof(buff2), 0);
    sleep(1);
    cLen = send(cClient, buff1, sizeof(buff1), 0);
    cLen = send(cClient, buff2, sizeof(buff2), 0);
    cLen = send(cClient, buff1, sizeof(buff1), 0);
    cLen = send(cClient, buff2, sizeof(buff2), 0);

    close(cClient);
    return 0;
}
This is compiled to Client. The code is very simple: it just connects to either of the two nearly identical servers above.


Here is the connection topology for this scenario: