Java URL encoding of query string parameters

Time: 2022-08-24 12:20:52

Say I have a URL

http://example.com/query?q=

and I have a query entered by the user such as:

random word £500 bank $

I want the result to be a properly encoded URL:

http://example.com/query?q=random%20word%20%A3500%20bank%20%24

What's the best way to achieve this? I tried URLEncoder and creating URI/URL objects but none of them come out quite right.

9 answers

#1


894  

URLEncoder is the way to go. Just remember to encode only the individual query string parameter names and/or values, not the entire URL, and certainly not the parameter separator character & or the name-value separator character =.

String q = "random word £500 bank $";
String url = "http://example.com/query?q=" + URLEncoder.encode(q, "UTF-8");

Note that spaces in query parameters are represented by +, not %20, which is legitimately valid. The %20 is usually used to represent spaces in the URI itself (the part before the URI-query string separator character ?), not in the query string (the part after ?).
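
As a rough illustration of the point above, here is a minimal sketch that encodes the question's query both ways. The class and method names are made up for this example, and it assumes Java 10 or later for the encode(String, Charset) overload. The pound sign is written as a Unicode escape so the source file stays pure ASCII.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, named for illustration only.
final class QueryParamEncoding {

    // Form-style encoding of one parameter name or value: spaces become '+'.
    static String encodeForm(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8); // Charset overload, Java 10+
    }

    // Same encoding, but with spaces as %20. A literal '+' in the input has
    // already become %2B, so every '+' left in the output stands for a space.
    static String encodePercent20(String value) {
        return encodeForm(value).replace("+", "%20");
    }

    public static void main(String[] args) {
        String q = "random word \u00A3500 bank $"; // U+00A3 is the pound sign
        // prints http://example.com/query?q=random+word+%C2%A3500+bank+%24
        System.out.println("http://example.com/query?q=" + encodeForm(q));
        // prints http://example.com/query?q=random%20word%20%C2%A3500%20bank%20%24
        System.out.println("http://example.com/query?q=" + encodePercent20(q));
    }
}
```

Both results should decode to the same value on a server that follows form-encoding rules; the %20 variant only matters if the receiving side insists on it.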

Also note that there are two encode() methods: one without a charset argument and one with. The one without a charset argument is deprecated. Never use it; always specify the charset argument. The javadoc even explicitly recommends using the UTF-8 encoding, as mandated by RFC 3986 and W3C:

All other characters are unsafe and are first converted into one or more bytes using some encoding scheme. Then each byte is represented by the 3-character string "%xy", where xy is the two-digit hexadecimal representation of the byte. The recommended encoding scheme to use is UTF-8. However, for compatibility reasons, if an encoding is not specified, then the default encoding of the platform is used.
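
To make the byte-level rule in that quote concrete, here is a small sketch that percent-encodes every byte of a string under a given charset. The helper is hypothetical and, unlike URLEncoder, it does not leave safe characters untouched. It also suggests where the %A3 in the question's expected URL comes from: that is the pound sign in ISO-8859-1, whereas UTF-8 yields %C2%A3.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: encodes every byte, safe or not.
final class PercentBytes {

    static String percentEncodeAll(String s, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(charset)) {
            // One "%xy" triplet per byte, xy being two uppercase hex digits.
            sb.append(String.format("%%%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String pound = "\u00A3"; // the pound sign
        System.out.println(percentEncodeAll(pound, StandardCharsets.UTF_8));      // %C2%A3
        System.out.println(percentEncodeAll(pound, StandardCharsets.ISO_8859_1)); // %A3
    }
}
```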

#2


122  

I would not use URLEncoder. Besides being incorrectly named (URLEncoder has nothing to do with URLs) and inefficient (it uses a StringBuffer instead of a StringBuilder and does a couple of other things that are slow), it's also way too easy to screw up.

Instead I would use URIBuilder, Spring's org.springframework.web.util.UriUtils.encodeQuery, or Apache Commons HttpClient. The reason is that you have to escape the query parameter name (i.e. q in BalusC's answer) differently from the parameter value.

The only downside to the above (which I found out painfully) is that URLs are not a true subset of URIs.

Sample code:

import org.apache.http.client.utils.URIBuilder;

URIBuilder ub = new URIBuilder("http://example.com/query");
ub.addParameter("q", "random word £500 bank $");
String url = ub.toString();

// Result: http://example.com/query?q=random+word+%C2%A3500+bank+%24

Since I'm just linking to other answers I marked this as a community wiki. Feel free to edit.

#3


80  

You need to first create a URI like:

    String urlStr = "http://www.example.com/CEREC® Materials & Accessories/IPS Empress® CAD.pdf";
    URL url= new URL(urlStr);
    URI uri = new URI(url.getProtocol(), url.getUserInfo(), url.getHost(), url.getPort(), url.getPath(), url.getQuery(), url.getRef());

Then convert that URI to an ASCII string:

    urlStr=uri.toASCIIString();

Now your URL string is completely encoded: first we did simple URL encoding, and then we converted it to an ASCII string to make sure no characters outside US-ASCII remain in the string. This is exactly how browsers do it.
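
One caveat worth checking before relying on this for user input: the multi-argument URI constructors quote only characters that are illegal in a given component, so a & inside a parameter value is left alone and still acts as a separator. The sketch below uses a hypothetical class name and fixes the host to example.com purely for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical demo class; scheme and host are fixed for brevity.
final class UriConstructorCaveat {

    // Assembles a URI from raw parts and returns its US-ASCII form.
    static String build(String path, String query) {
        try {
            URI uri = new URI("http", null, "example.com", -1, path, query, null);
            return uri.toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Spaces and non-ASCII characters are escaped, '$' is legal and kept:
        // http://example.com/query?q=random%20word%20%C2%A3500%20bank%20$
        System.out.println(build("/query", "q=random word \u00A3500 bank $"));
        // The '&' inside the intended value is NOT escaped:
        // http://example.com/query?q=fish%20&%20chips
        System.out.println(build("/query", "q=fish & chips"));
    }
}
```

In the second call a server would see two parameters rather than one value containing an ampersand, so this approach suits whole URLs better than individual user-supplied values.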

#4


27  

Guava 15 has now added a set of straightforward URL escapers.

#5


5  

The Apache HttpComponents library provides a neat option for building and encoding query params:

With HttpComponents 4.x, use URLEncodedUtils.

For HttpClient 3.x, use EncodingUtil.

#6


4  

Here's a method you can use in your code to convert a url string and map of parameters to a valid encoded url string containing the query parameters.

String addQueryStringToUrlString(String url, final Map<Object, Object> parameters) throws UnsupportedEncodingException {
    if (parameters == null) {
        return url;
    }

    for (Map.Entry<Object, Object> parameter : parameters.entrySet()) {

        final String encodedKey = URLEncoder.encode(parameter.getKey().toString(), "UTF-8");
        final String encodedValue = URLEncoder.encode(parameter.getValue().toString(), "UTF-8");

        if (!url.contains("?")) {
            url += "?" + encodedKey + "=" + encodedValue;
        } else {
            url += "&" + encodedKey + "=" + encodedValue;
        }
    }

    return url;
}
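
For comparison, the same idea can be sketched with streams and String-typed parameters. This is only one possible variant, not a replacement for the method above: the class and method names are invented here, and it assumes Java 10+ for the Charset overload of URLEncoder.encode. A LinkedHashMap keeps the parameters in insertion order.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper class, named for illustration only.
final class QueryStringBuilder {

    // Appends the encoded parameters to baseUrl, using '?' or '&' as needed.
    static String withQuery(String baseUrl, Map<String, String> params) {
        if (params == null || params.isEmpty()) {
            return baseUrl;
        }
        String query = params.entrySet().stream()
                .map(e -> enc(e.getKey()) + "=" + enc(e.getValue()))
                .collect(Collectors.joining("&"));
        return baseUrl + (baseUrl.contains("?") ? "&" : "?") + query;
    }

    private static String enc(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "random word \u00A3500 bank $"); // U+00A3 is the pound sign
        params.put("lang", "en");
        // prints http://example.com/query?q=random+word+%C2%A3500+bank+%24&lang=en
        System.out.println(withQuery("http://example.com/query", params));
    }
}
```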

#7


2  

I would use this code:

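// android.net.Uri (Android only); appendQueryParameter encodes the key and value itself, so pass the raw query
Uri myUI = Uri.parse("http://example.com/query").buildUpon().appendQueryParameter("q", "random word £500 bank $").build();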

#8


0  

1. Split the URL into its structural parts. Use java.net.URL for it.

2. Encode each structural part properly!

3. Use IDN.toASCII(putDomainNameHere) to Punycode-encode the host name!

4. Use java.net.URI.toASCIIString() to percent-encode NFC-normalized Unicode (NFKC would be better!). For more info see: How to encode properly this URL

URL url= new URL("http://example.com/query?q=random word £500 bank $");
URI uri = new URI(url.getProtocol(), url.getUserInfo(), IDN.toASCII(url.getHost()), url.getPort(), url.getPath(), url.getQuery(), url.getRef());
String correctEncodedURL=uri.toASCIIString(); 
System.out.println(correctEncodedURL);

Prints

http://example.com/query?q=random%20word%20%C2%A3500%20bank%20$
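
The host in the example above is already ASCII, so the IDN step has no visible effect there. A minimal sketch with a non-ASCII host shows what it does; the class name and the domain are made up for this illustration.

```java
import java.net.IDN;
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical demo class, named for illustration only.
final class IdnHostExample {

    // Punycode-encodes the host, then lets URI escape the path and query.
    static String encode(String host, String path, String query) {
        try {
            URI uri = new URI("http", null, IDN.toASCII(host), -1, path, query, null);
            return uri.toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // U+00FC is u with diaeresis, so the host reads "buecher" in German spelling.
        // prints http://xn--bcher-kva.example/query?q=random%20word
        System.out.println(encode("b\u00FCcher.example", "/query", "q=random word"));
    }
}
```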

#9


-2  

  1. Use this: URLEncoder.encode(query, StandardCharsets.UTF_8.displayName()); or this: URLEncoder.encode(query, "UTF-8");
  2. You can use the following code.

    String encodedUrl1 = UriUtils.encodeQuery(query, "UTF-8"); // not changed
    String encodedUrl2 = URLEncoder.encode(query, "UTF-8"); // changed
    String encodedUrl3 = URLEncoder.encode(query, StandardCharsets.UTF_8.displayName()); // changed

    System.out.println("url1=" + encodedUrl1 + "\n" + "url2=" + encodedUrl2 + "\n" + "url3=" + encodedUrl3);
