Converting ASCII to UTF-8 encoding

Time: 2023-01-06 21:46:40

How to convert ASCII encoding to UTF8 in PHP

6 solutions

#1


45  

ASCII is a subset of UTF-8, so if a document is ASCII then it is already UTF-8.
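
The claim above can be checked directly. The following is a minimal sketch, assuming the mbstring extension is available; isPureAscii() is a hypothetical helper name, not part of PHP.

```php
<?php
// Hypothetical helper: true when every byte is in the ASCII range 0x00-0x7F.
function isPureAscii(string $s): bool
{
    return preg_match('/[^\x00-\x7F]/', $s) === 0;
}

$ascii = "Hello, world!";
var_dump(isPureAscii($ascii));                 // bool(true)
var_dump(mb_check_encoding($ascii, 'UTF-8'));  // bool(true): same bytes are valid UTF-8

$accented = "caf\xC3\xA9";                     // "café" encoded as UTF-8
var_dump(isPureAscii($accented));              // bool(false): contains bytes above 0x7F
```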

#2


19  

If you know for sure that your current encoding is pure ASCII, then you don't have to do anything, because pure ASCII is already valid UTF-8.

But if you still want to convert, just to be sure that the result is UTF-8, then you can use iconv:

$string = iconv('ASCII', 'UTF-8//IGNORE', $string);

The //IGNORE suffix discards any invalid characters, in case some of the input was not valid ASCII.
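
How iconv treats illegal input can vary with the underlying iconv implementation, so here is a sketch of an alternative that strips non-ASCII bytes explicitly instead of relying on //IGNORE. stripNonAscii() is a hypothetical helper name introduced only for illustration.

```php
<?php
// Hypothetical helper: remove every byte outside 0x00-0x7F, so the result is
// pure ASCII and therefore also valid UTF-8.
function stripNonAscii(string $s): string
{
    // preg_replace() returns null on error; fall back to an empty string.
    return preg_replace('/[^\x00-\x7F]/', '', $s) ?? '';
}

$dirty = "abc\xFFdef";          // 0xFF is not a valid ASCII byte
$clean = stripNonAscii($dirty);
echo $clean, "\n";              // abcdef
```

Note that this drops data rather than converting it, which is only appropriate when the input is supposed to be ASCII in the first place.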

#3


4  

Use utf8_encode()

The manual page can be found here: http://php.net/manual/en/function.utf8-encode.php

Also read this article from Joel on Software. It provides an excellent explanation of what Unicode is and how it works: http://www.joelonsoftware.com/articles/Unicode.html
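
One caveat on utf8_encode(): it treats its input as ISO-8859-1, and it is deprecated as of PHP 8.2. A minimal sketch of the same conversion with an explicit source encoding, assuming the mbstring extension is available:

```php
<?php
$latin1 = "caf\xE9";   // "café" in ISO-8859-1: é is the single byte 0xE9
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');
echo bin2hex($utf8), "\n";   // 636166c3a9 (é became the two bytes C3 A9)

// Pure ASCII input comes out byte-for-byte unchanged.
$ascii = "cafe";
var_dump(mb_convert_encoding($ascii, 'UTF-8', 'ISO-8859-1') === $ascii); // bool(true)
```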

#4


2  

"ASCII is a subset of UTF-8, so..." - so UTF-8 is a set? :)

In other words: any string built from code points in the range 0x00 to 0x7F has identical representations (byte sequences) in ASCII and UTF-8. Converting such a string is pointless.
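
The byte-level identity can be seen by dumping the bytes before and after a conversion. This is a small sketch, assuming the mbstring extension is available; the sample string is arbitrary.

```php
<?php
$ascii = "PHP 8";
$converted = mb_convert_encoding($ascii, 'UTF-8', 'ASCII');

echo bin2hex($ascii), "\n";       // 5048502038
echo bin2hex($converted), "\n";   // 5048502038
var_dump($ascii === $converted);  // bool(true)
```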

#5


2  

Use mb_convert_encoding to convert an ASCII string to UTF-8.

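// Sample string containing non-ASCII characters
$string = "chárêctërs";
print(mb_detect_encoding($string));

// With no source encoding given, mb_convert_encoding() assumes the internal encoding
$string = mb_convert_encoding($string, "UTF-8");
print(mb_detect_encoding($string));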
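
Converting a string that is already UTF-8 as if it were a single-byte encoding would double-encode it, so a guard can help. The sketch below assumes the mbstring extension and assumes ISO-8859-1 as the fallback source encoding; toUtf8() is a hypothetical helper name.

```php
<?php
// Hypothetical helper: convert to UTF-8 only when the input is not already
// valid UTF-8. The ISO-8859-1 fallback is an assumption for illustration.
function toUtf8(string $s, string $fallback = 'ISO-8859-1'): string
{
    if (mb_check_encoding($s, 'UTF-8')) {
        return $s;   // pure ASCII and valid UTF-8 pass through untouched
    }
    return mb_convert_encoding($s, 'UTF-8', $fallback);
}

echo bin2hex(toUtf8("caf\xC3\xA9")), "\n";  // 636166c3a9 (already UTF-8, unchanged)
echo bin2hex(toUtf8("caf\xE9")), "\n";      // 636166c3a9 (converted from ISO-8859-1)
```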

#6


-1  

Using iconv looks like the best solution, but in my case I get a notice from this function: "Detected an illegal character in input string in" (without //IGNORE). I use two functions to handle ASCII strings: one converts the string to an array of ASCII codes and serializes it as JSON, and the other reverses that:

// Convert a string to a JSON array of its byte values (ASCII codes).
public static function ToAscii($string) {
    $strlen = strlen($string);
    $charCode = array();
    for ($i = 0; $i < $strlen; $i++) {
        $charCode[] = ord($string[$i]);
    }
    return json_encode($charCode);
}

// Rebuild the original string from the JSON array produced by ToAscii().
public static function fromAscii($string) {
    $charCode = json_decode($string);
    $result = '';
    foreach ($charCode as $code) {
        $result .= chr($code);
    }
    return $result;
}
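
The same round trip can also be sketched with PHP's built-in unpack() and pack(), which is a different technique from the loops above. The function names bytesToJson() and jsonToBytes() are hypothetical.

```php
<?php
// Hypothetical helpers mirroring the two methods above, using unpack()/pack().
function bytesToJson(string $s): string
{
    // unpack('C*') yields one unsigned byte value per input byte (1-indexed),
    // so array_values() re-indexes it to get a plain JSON list.
    return json_encode(array_values(unpack('C*', $s)));
}

function jsonToBytes(string $json): string
{
    // pack('C*', ...) turns the list of byte values back into a string.
    return pack('C*', ...json_decode($json, true));
}

$json = bytesToJson("Hi!");
echo $json, "\n";                 // [72,105,33]
echo jsonToBytes($json), "\n";    // Hi!
```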
