HTTP 1.1 persistent connections using sockets in Java

Date: 2023-02-11 03:50:26

Let's say I have a java program that makes an HTTP request on a server using HTTP 1.1 and doesn't close the connection. I make one request, and read all data returned from the input stream I have bound to the socket. However, upon making a second request, I get no response from the server (or there's a problem with the stream - it doesn't provide any more input). If I make the requests in order (Request, request, read) it works fine, but (request, read, request, read) doesn't.

Could someone shed some light on why this might be happening? (Code snippets follow.) No matter what I do, the second read loop's isr_reader.read() only ever returns -1.

try {
    connection = new Socket("SomeServer", port);
    con_out = connection.getOutputStream();
    con_in  = connection.getInputStream();
    PrintWriter out_writer = new PrintWriter(con_out, false);
    out_writer.print("GET http://somesite HTTP/1.1\r\n");
    out_writer.print("Host: thehost\r\n");
    //out_writer.print("Content-Length: 0\r\n");
    out_writer.print("\r\n");
    out_writer.flush();

    // If we were not interpreting this data as a character stream, we might need to adjust byte ordering here.
    InputStreamReader isr_reader = new InputStreamReader(con_in);
    char[] streamBuf = new char[8192];
    int amountRead;
    StringBuilder receivedData = new StringBuilder();
    // Read the first response until read() returns -1 (end of stream).
    while ((amountRead = isr_reader.read(streamBuf)) > 0) {
        receivedData.append(streamBuf, 0, amountRead);
    }

    // Response is processed here.

    if (connection != null && !connection.isClosed()) {
        //System.out.println("Connection Still Open...");

        out_writer.print("GET http://someSite2 HTTP/1.1\r\n");
        out_writer.print("Host: somehost\r\n");
        out_writer.print("Connection: close\r\n");
        out_writer.print("\r\n");
        out_writer.flush();

        streamBuf = new char[8192];
        receivedData.setLength(0);
        // Second read loop: read() only ever returns -1 here.
        while ((amountRead = isr_reader.read(streamBuf)) > 0) {
            receivedData.append(streamBuf, 0, amountRead);
        }
    }
    // Process response here
} catch (IOException e) {
    e.printStackTrace();
}

Responses to questions: Yes, I'm receiving chunked responses from the server. I'm using raw sockets because of an outside restriction.

Apologies for the mess of code - I was rewriting it from memory and seem to have introduced a few bugs.

So the consensus is that I have to either do (request, request, read) and let the server close the stream once I hit the end, or, if I do (request, read, request, read), stop reading at the end of each response rather than at the end of the stream, so that the connection isn't closed.

5 Answers

#1 (score: 5)

According to your code, the only time you'll even reach the statements dealing with sending the second request is when the server closes the output stream (your input stream) after receiving/responding to the first request.

The reason for that is that your code that is supposed to read only the first response

while((amountRead = isr_reader.read(streamBuf)) > 0) {
  receivedData.append(streamBuf, 0, amountRead);
}

will block until the server closes the output stream (i.e., when read returns -1) or until the read timeout on the socket elapses. In the case of the read timeout, an exception will be thrown and you won't even get to sending the second request.

The problem with HTTP responses is that they don't tell you how many bytes to read from the stream until the end of the response. This is not a big deal for HTTP 1.0 responses, because the server simply closes the connection after the response thus enabling you to obtain the response (status line + headers + body) by simply reading everything until the end of the stream.

With HTTP 1.1 persistent connections you can no longer simply read everything until the end of the stream. You first need to read the status line and the headers, line by line, and then, based on the status code and the headers (such as Content-Length) decide how many bytes to read to obtain the response body (if it's present at all). If you do the above properly, your read operations will complete before the connection is closed or a timeout happens, and you will have read exactly the response the server sent. This will enable you to send the next request and then read the second response in exactly the same manner as the first one.
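
To make this concrete, here is a minimal sketch of that approach (my own illustration, not a complete parser: it assumes the response carries either a Content-Length header or Transfer-Encoding: chunked, and it ignores trailers and bodiless responses such as 204 or replies to HEAD):

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class ResponseReaderSketch {

    // Reads exactly one HTTP/1.1 response from 'in' and returns the raw body bytes,
    // leaving the stream positioned at the start of the next response.
    static byte[] readOneResponse(InputStream in) throws IOException {
        readLine(in);                                      // status line, e.g. "HTTP/1.1 200 OK"
        int contentLength = -1;
        boolean chunked = false;
        String line;
        while (!(line = readLine(in)).isEmpty()) {         // headers end at an empty line
            String lower = line.toLowerCase();
            if (lower.startsWith("content-length:")) {
                contentLength = Integer.parseInt(line.substring(15).trim());
            } else if (lower.startsWith("transfer-encoding:") && lower.contains("chunked")) {
                chunked = true;
            }
        }

        ByteArrayOutputStream body = new ByteArrayOutputStream();
        if (chunked) {
            // Each chunk: a hex size line, <size> bytes of data, then a CRLF; a zero-size chunk ends the body.
            while (true) {
                int size = Integer.parseInt(readLine(in).split(";")[0].trim(), 16);
                if (size == 0) {
                    readLine(in);                          // final empty line (trailers are ignored here)
                    break;
                }
                copyExactly(in, body, size);
                readLine(in);                              // CRLF that follows the chunk data
            }
        } else if (contentLength >= 0) {
            copyExactly(in, body, contentLength);
        }
        // A response with neither header would have to be read to end-of-stream, HTTP/1.0 style.
        return body.toByteArray();
    }

    // Reads one CRLF-terminated line (headers are ASCII) without the CRLF.
    static String readLine(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int b;
        while ((b = in.read()) != -1 && b != '\n') {
            if (b != '\r') sb.append((char) b);
        }
        return sb.toString();
    }

    // Copies exactly 'count' bytes from 'in' to 'out'.
    static void copyExactly(InputStream in, OutputStream out, int count) throws IOException {
        byte[] buf = new byte[8192];
        int remaining = count;
        while (remaining > 0) {
            int n = in.read(buf, 0, Math.min(buf.length, remaining));
            if (n == -1) {
                throw new IOException("stream ended before the full body was read");
            }
            out.write(buf, 0, n);
            remaining -= n;
        }
    }
}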

P.S. Request, request, read might be "working" in the sense that your server supports request pipelining and thus receives and processes both requests, and you, as a result, read both responses into one buffer as your "first" response.

P.P.S Make sure your PrintWriter is using the US-ASCII encoding. Otherwise, depending on your system encoding, the request line and headers of your HTTP requests might be malformed (wrong encoding).
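
For example, a small sketch that wraps the question's con_out in a writer with an explicit charset (only the construction changes; the print/flush calls stay the same):

import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;

// con_out is the socket's OutputStream from the question's code.
PrintWriter out_writer = new PrintWriter(
        new OutputStreamWriter(con_out, StandardCharsets.US_ASCII), false);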

#2 (score: 3)

Writing a simple HTTP/1.1 client that respects the RFC is not such a difficult task. To solve the problem of blocking I/O access when reading a socket in Java, you must use the java.nio classes. SocketChannel gives you the ability to perform non-blocking I/O.

This is necessary to send HTTP requests over a persistent connection.
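
For illustration, a minimal sketch of a non-blocking request/read cycle with SocketChannel and a Selector (the host "somesite" and port 80 are placeholders, and a real client would still parse each response as described in #1 instead of waiting for end-of-stream):

import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;

public class NioHttpSketch {
    public static void main(String[] args) throws IOException {
        SocketChannel channel = SocketChannel.open(new InetSocketAddress("somesite", 80));
        channel.configureBlocking(false);

        ByteBuffer request = ByteBuffer.wrap(
                "GET / HTTP/1.1\r\nHost: somesite\r\n\r\n".getBytes(StandardCharsets.US_ASCII));
        while (request.hasRemaining()) {
            channel.write(request);            // non-blocking write; loop until the buffer is drained
        }

        Selector selector = Selector.open();
        channel.register(selector, SelectionKey.OP_READ);

        ByteBuffer buf = ByteBuffer.allocate(8192);
        boolean open = true;
        while (open) {
            selector.select();                 // wait until the channel is readable
            selector.selectedKeys().clear();
            int n;
            while ((n = channel.read(buf)) > 0) {
                buf.flip();
                System.out.print(StandardCharsets.US_ASCII.decode(buf));
                buf.clear();
            }
            // A real client would parse the headers here (Content-Length / chunked)
            // and stop after one complete response, keeping the connection open
            // for the next request instead of waiting for n == -1.
            if (n == -1) {
                open = false;                  // server closed the connection
            }
        }
        selector.close();
        channel.close();
    }
}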

Furthermore, the NIO classes give better performance.

My stress tests gave the following results:

  • HTTP/1.0 (java.io) -> HTTP/1.0 (java.nio) = +20% faster

  • HTTP/1.0 (java.io) -> HTTP/1.1 (java.nio with persistent connection) = +110% faster

#3 (score: 0)

Make sure you have a Connection: keep-alive in your request. This may be a moot point though.

What kind of response is the server returning? Are you using chunked transfer? If the server doesn't know the size of the response body, it can't provide a Content-Length header and has to close the connection at the end of the response body to indicate to the client that the content has ended. In this case, the keep-alive won't work. If you're generating content on-the-fly with PHP, JSP etc., you can enable output buffering, check the size of the accumulated body, push the Content-Length header and flush the output buffer.
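
If the server side happened to be a Java servlet (the question doesn't say what it is), one sketch of this buffering idea looks like the following; the javax.servlet API and the generated body are assumptions for illustration only:

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class BufferedLengthServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        // Accumulate the whole body in memory first so its size is known.
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        body.write("<html><body>generated on the fly</body></html>"
                .getBytes(StandardCharsets.US_ASCII));

        resp.setContentType("text/html");
        resp.setContentLength(body.size());   // known length lets the connection stay keep-alive
        body.writeTo(resp.getOutputStream());
    }
}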

#4 (score: 0)

Is there a particular reason you're using raw sockets and not Java's URLConnection or Commons HttpClient?

HTTP isn't easy to get right. I know Commons HTTP Client can re-use connections like you're trying to do.

If there isn't a specific reason for you to use raw sockets, this is what I would recommend :)
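
For reference, a minimal sketch with the JDK's built-in HttpURLConnection (the URLs are placeholders); reading each response fully and closing the stream, rather than calling disconnect(), lets the JDK keep the underlying HTTP/1.1 connection alive and reuse it for the next request:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class UrlConnectionSketch {
    public static void main(String[] args) throws IOException {
        // Two sequential requests to the same host; the JDK reuses the
        // persistent connection from its keep-alive cache when it can.
        for (String path : new String[] {"/first", "/second"}) {
            HttpURLConnection conn =
                    (HttpURLConnection) new URL("http://somesite" + path).openConnection();
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(conn.getInputStream(), "US-ASCII"))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    System.out.println(line);
                }
            }
            // No disconnect() here: fully reading and closing the stream returns
            // the connection to the pool instead of closing it.
        }
    }
}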

#5 (score: 0)

Writing your own correct client HTTP/1.1 implementation is nontrivial; historically, most people I've seen attempt it have gotten it wrong. Their implementations usually ignore the spec and just do what appears to work with one particular test server - in particular, they usually ignore the requirement to be able to handle chunked responses.

Writing your own HTTP client is probably a bad idea, unless you have some VERY strange requirements.
