Arabic characters in JSON decoding [duplicate]

Date: 2022-10-17 13:15:39

$test = json_encode('بسم الله');
echo $test;

This code outputs "\u0628\u0633\u0645 \u0627\u0644\u0644\u0647", whereas I expected something like "بسم الله". Arabic characters are escaped when they are JSON-encoded, but the YouTube API does not do this: http://gdata.youtube.com/feeds/api/videos/RqMxTnTZeNE?v=2&alt=json

You can see that YouTube displays the Arabic characters properly. What could be my mistake?

HINT: I'm working on an API; the example is just for the sake of clarification.

4 Answers

#1


20  

"\u0628\u0633\u0645 \u0627\u0644\u0644\u0647" and "بسم الله" are equivalent in JSON.

PHP just defaults to using Unicode escapes instead of literals for multibyte characters.

You can specify otherwise with JSON_UNESCAPED_UNICODE (provided you are using PHP 5.4 or later).

json_encode('بسم الله', JSON_UNESCAPED_UNICODE);
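
The equivalence claim can be checked directly by decoding both forms and comparing the results. This is only a minimal sketch; the variable names are illustrative, and the escaped literal is written in single quotes so that PHP passes the backslashes through to json_decode() unchanged.

```php
<?php
// Escaped form, exactly as json_encode() emits it by default.
$escaped = '"\u0628\u0633\u0645 \u0627\u0644\u0644\u0647"';
// Literal form, as produced with JSON_UNESCAPED_UNICODE (PHP 5.4+).
$literal = json_encode('بسم الله', JSON_UNESCAPED_UNICODE);

// Both JSON documents decode to the same PHP string.
var_dump(json_decode($escaped) === json_decode($literal)); // bool(true)
var_dump(json_decode($escaped) === 'بسم الله');            // bool(true)
```

Any conforming JSON parser on the client side should treat the two bodies the same way, so the choice mainly affects how readable the raw response is.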

#2


2  

That is the correct JSON-encoded version of the UTF-8 string. There is no need to change it; it represents the correct string. Characters in JSON can be escaped this way.

JSON can also represent UTF-8 characters directly if you want. Since PHP 5.4 you can set the JSON_UNESCAPED_UNICODE flag to produce raw UTF-8 strings:

json_encode($string, JSON_UNESCAPED_UNICODE);

But that is only a preference; it is not necessary.

#3


2  

Both formats are valid, equivalent JSON strings, as the JSON grammar for a string character shows:

char
    any-Unicode-character-
        except-"-or-\-or-
        control-character
    \"
    \\
    \/
    \b
    \f
    \n
    \r
    \t
    \u four-hex-digits
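
Because each `\u` escape carries only four hex digits, a character outside the Basic Multilingual Plane is written as a surrogate pair of two escapes. The sketch below illustrates this with json_encode(); it assumes PHP 7.0 or later for the "\u{...}" string syntax, and the variable names are illustrative.

```php
<?php
// U+0628 (Arabic letter beh) fits in a single four-hex-digit escape.
echo json_encode("\u{0628}"), "\n";       // "\u0628"

// U+1F600 lies outside the BMP, so json_encode() emits a surrogate pair.
$emoji = "\u{1F600}";
$json = json_encode($emoji);
echo $json, "\n";                         // "\ud83d\ude00"

// Decoding restores the original single character.
var_dump(json_decode($json) === $emoji);  // bool(true)
```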

If you prefer the unescaped version, simply add the JSON_UNESCAPED_UNICODE flag:

<?php

$test = json_encode('بسم الله', JSON_UNESCAPED_UNICODE);
echo $test;

This flag requires PHP 5.4.0 or greater.

#4


2  

Well, as mentioned before, it doesn't matter, since both strings are equivalent. What you SHOULD do, however, is make sure that the encoded string is decoded before it is sent to an output.

echo json_decode($test);

Or, since the JSON most likely contains more than just a single string:

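$obj = [];
$obj['arabic'] = 'بسم الله';
$obj['latin'] = 'abcdef';
$obj['integer'] = 12345;

$test = json_encode($obj);

// Pass true to get an associative array; by default json_decode() returns an stdClass object.
$testobject = json_decode($test, true);
echo $testobject['arabic'];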
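
Since the question concerns an API, the whole round trip can be sketched as follows: the server encodes an array, and the client decodes it before using the values. This is only an illustration under stated assumptions: the helper name `build_response_body` and its `$readable` parameter are hypothetical, and the type declarations assume PHP 7.0 or later.

```php
<?php
// Hypothetical helper: encodes a payload for an API response body.
function build_response_body(array $payload, bool $readable = false): string
{
    // JSON_UNESCAPED_UNICODE only changes how the text looks on the wire.
    $flags = $readable ? JSON_UNESCAPED_UNICODE : 0;
    $json = json_encode($payload, $flags);
    if ($json === false) {
        // json_last_error_msg() is available from PHP 5.5.
        throw new RuntimeException(json_last_error_msg());
    }
    return $json;
}

$payload = ['arabic' => 'بسم الله', 'latin' => 'abcdef', 'integer' => 12345];

$escaped  = build_response_body($payload);        // uses \uXXXX escapes
$readable = build_response_body($payload, true);  // uses literal UTF-8

// A client gets identical data from either body.
var_dump(json_decode($escaped, true) === json_decode($readable, true)); // bool(true)
echo json_decode($escaped, true)['arabic'], "\n"; // بسم الله
```

Either body is valid JSON, so the flag can be chosen purely for the readability of the raw response.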
