How do I convert a UTF-8 byte[] to a string?

Time: 2023-01-22 20:15:20

I have a byte[] array that is loaded from a file that I happen to know contains UTF-8. In some debugging code, I need to convert it to a string. Is there a one-liner that will do this?

Under the covers it should be just an allocation and a memcopy, so even if it is not implemented, it should be possible.

12 solutions

#1


1179  

string result = System.Text.Encoding.UTF8.GetString(byteArray);
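
Since this is for debugging, it may help to find out when the bytes are not actually valid UTF-8. A minimal sketch, assuming a hypothetical helper named DecodeStrict (not part of the answer above): it uses a UTF8Encoding constructed with throwOnInvalidBytes set to true, so malformed input raises a DecoderFallbackException instead of being silently replaced with U+FFFD.

```csharp
using System;
using System.Text;

static class Utf8Debug
{
    // Hypothetical helper for illustration only.
    // throwOnInvalidBytes: true makes decoding fail loudly on malformed UTF-8.
    static readonly UTF8Encoding Strict =
        new UTF8Encoding(encoderShouldEmitUTF8Identifier: false, throwOnInvalidBytes: true);

    public static string DecodeStrict(byte[] bytes)
    {
        return Strict.GetString(bytes);
    }
}
```

The default Encoding.UTF8 instance is the lenient variant, which is usually what you want for display; the strict variant is only useful when you would rather fail than see replacement characters.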

#2


268  

There are at least four different ways to do this conversion.

  1. Encoding's GetString
    This gives you readable text, but you won't be able to get the original bytes back if they contain sequences that are not valid in that encoding (arbitrary binary data usually does; well-formed UTF-8 text round-trips fine).

  2. BitConverter.ToString
    The output is a "-" delimited hex string, but there's no .NET built-in method to convert that string back to a byte array.

  3. Convert.ToBase64String
    You can easily convert the output string back to a byte array by using Convert.FromBase64String.
    Note: The output string could contain '+', '/' and '='. If you want to use the string in a URL, you need to explicitly encode it.

  4. HttpServerUtility.UrlTokenEncode
    You can easily convert the output string back to a byte array by using HttpServerUtility.UrlTokenDecode. The output string is already URL friendly! The downside is that it needs the System.Web assembly if your project is not a web project.

A full example:

byte[] bytes = { 130, 200, 234, 23 }; // Arbitrary binary data; these bytes are not valid UTF-8

string s1 = Encoding.UTF8.GetString(bytes); // ��� followed by the control character U+0017
byte[] decBytes1 = Encoding.UTF8.GetBytes(s1);  // decBytes1.Length == 10 !!
// decBytes1 is not the same as bytes: each invalid byte became U+FFFD (3 bytes in UTF-8)
// Most other Encoding objects lose data in a similar way

string s2 = BitConverter.ToString(bytes);   // 82-C8-EA-17
String[] tempAry = s2.Split('-');
byte[] decBytes2 = new byte[tempAry.Length];
for (int i = 0; i < tempAry.Length; i++)
    decBytes2[i] = Convert.ToByte(tempAry[i], 16);
// decBytes2 same as bytes

string s3 = Convert.ToBase64String(bytes);  // gsjqFw==
byte[] decByte3 = Convert.FromBase64String(s3);
// decByte3 same as bytes

string s4 = HttpServerUtility.UrlTokenEncode(bytes);    // gsjqFw2
byte[] decBytes4 = HttpServerUtility.UrlTokenDecode(s4);
// decBytes4 same as bytes
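
On .NET 5 or later, a hex round trip needs no manual loop. A sketch, assuming that runtime is available (Convert.ToHexString and Convert.FromHexString do not exist in .NET Framework); the class name HexRoundTrip is just for illustration. These methods use no separators, so dash-delimited BitConverter output has its dashes stripped before decoding.

```csharp
using System;

static class HexRoundTrip
{
    // Encodes bytes as uppercase hex without separators (requires .NET 5+).
    public static string Encode(byte[] bytes) => Convert.ToHexString(bytes);

    // Accepts plain hex or BitConverter-style "82-C8-EA-17" input.
    public static byte[] Decode(string hex) => Convert.FromHexString(hex.Replace("-", ""));
}
```

For the sample bytes above, HexRoundTrip.Encode(bytes) gives "82C8EA17", and HexRoundTrip.Decode(BitConverter.ToString(bytes)) returns the original four bytes.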

#3


20  

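A general solution to convert from a byte array to a string when you don't know the encoding. Note that StreamReader only detects the encoding from a byte order mark (BOM) at the start of the data and falls back to UTF-8 when there is none: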

static string BytesToStringConverted(byte[] bytes)
{
    using (var stream = new MemoryStream(bytes))
    {
        using (var streamReader = new StreamReader(stream))
        {
            return streamReader.ReadToEnd();
        }
    }
}
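
To illustrate what StreamReader can and cannot detect, here is a sketch that assumes the input starts with a byte order mark. The class and method names are hypothetical. Without a BOM, StreamReader decodes as UTF-8, so this is not a general encoding detector.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

static class BomDemo
{
    // Same approach as the answer above: let StreamReader pick the encoding.
    public static string Decode(byte[] bytes)
    {
        using (var stream = new MemoryStream(bytes))
        using (var reader = new StreamReader(stream))
        {
            return reader.ReadToEnd();
        }
    }

    // Builds UTF-16 LE bytes prefixed with the UTF-16 LE BOM (FF FE).
    public static byte[] Utf16WithBom(string text)
    {
        return Encoding.Unicode.GetPreamble()
            .Concat(Encoding.Unicode.GetBytes(text))
            .ToArray();
    }
}
```

BomDemo.Decode(BomDemo.Utf16WithBom("hi")) yields "hi", while the same UTF-16 bytes without the BOM are read as UTF-8 and come back with embedded '\0' characters.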

#4


11  

Definition:

public static string ConvertByteToString(this byte[] source)
{
    return source != null ? System.Text.Encoding.UTF8.GetString(source) : null;
}

Usage:

string result = input.ConvertByteToString();

#5


8  

Converting a byte[] to a string seems simple, but an encoding can alter any bytes it considers invalid. This little function maps each byte directly to the character with the same code point (in effect, ISO-8859-1 decoding), so no byte is lost:

private string ToString(byte[] bytes)
{
    // Map each byte to the char with the same code point (0-255).
    var response = new System.Text.StringBuilder(bytes.Length);

    foreach (byte b in bytes)
        response.Append((char)b);

    return response.ToString();
}
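
Casting each byte to char is equivalent to decoding the bytes as ISO-8859-1 (Latin-1), where every byte value from 0 to 255 maps to the code point with the same number. A sketch of the same idea using the built-in encoding, which also gives a lossless way back to the original bytes; the class name is hypothetical. Keep in mind that the result is only readable text if the data really is Latin-1 or ASCII; UTF-8 multi-byte sequences show up as several separate characters.

```csharp
using System;
using System.Text;

static class Latin1RoundTrip
{
    // Code page 28591 is ISO-8859-1; each byte maps to the same code point.
    static readonly Encoding Latin1 = Encoding.GetEncoding(28591);

    public static string ToText(byte[] bytes) => Latin1.GetString(bytes);

    public static byte[] ToBytes(string text) => Latin1.GetBytes(text);
}
```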

#6


7  

Using b.ToString("x2") on each byte produces lowercase hex output such as b4b5dfe475e58b67.

public static class Ext {

    public static string ToHexString(this byte[] hex)
    {
        if (hex == null) return null;
        if (hex.Length == 0) return string.Empty;

        var s = new StringBuilder();
        foreach (byte b in hex) {
            s.Append(b.ToString("x2"));
        }
        return s.ToString();
    }

    public static byte[] ToHexBytes(this string hex)
    {
        if (hex == null) return null;
        if (hex.Length == 0) return new byte[0];

        int l = hex.Length / 2;
        var b = new byte[l];
        for (int i = 0; i < l; ++i) {
            b[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);
        }
        return b;
    }

    public static bool EqualsTo(this byte[] bytes, byte[] bytesToCompare)
    {
        if (bytes == null && bytesToCompare == null) return true; // ?
        if (bytes == null || bytesToCompare == null) return false;
        if (object.ReferenceEquals(bytes, bytesToCompare)) return true;

        if (bytes.Length != bytesToCompare.Length) return false;

        for (int i = 0; i < bytes.Length; ++i) {
            if (bytes[i] != bytesToCompare[i]) return false;
        }
        return true;
    }

}

#7


4  

There is also the UnicodeEncoding class (which handles UTF-16 rather than UTF-8); it is quite simple to use:

var ByteConverter = new UnicodeEncoding();
string stringDataForEncoding = "My Secret Data!";
byte[] dataEncoded = ByteConverter.GetBytes(stringDataForEncoding);

Console.WriteLine("Data after decoding: {0}", ByteConverter.GetString(dataEncoded));

#8


1  

Alternatively:

 var byteStr = Convert.ToBase64String(bytes);

#9


1  

A LINQ one-liner for converting a byte array byteArrFilename, read from a file, to a pure ASCII C-style zero-terminated string would be the following. It is handy for reading things like file index tables in old archive formats.

String filename = new String(byteArrFilename.TakeWhile(x => x != 0)
                              .Select(x => x < 128 ? (Char)x : '?').ToArray());

I use '?' as the default char for anything that is not pure ASCII here, but that can be changed, of course. If you want to be sure you can detect it, just use '\0' instead, since the TakeWhile at the start ensures that a string built this way cannot possibly contain '\0' values from the input source.

#10


1  

The BitConverter class can be used to convert a byte[] to a string of dash-separated hexadecimal pairs.

var convertedString = BitConverter.ToString(byteArray);

Documentation for the BitConverter class can be found on MSDN.

#11


1  

To my knowledge, none of the given answers guarantees correct behavior with null termination. Until someone shows me otherwise, I wrote my own static class for handling this, with the following methods:

// Mimics the functionality of strlen() in C/C++.
// Needed because neither StringBuilder nor Encoding.*.GetString() stops at \0.
static int StringLength(byte[] buffer, int startIndex = 0)
{
    int strlen = 0;
    while
    (
        (startIndex + strlen) < buffer.Length  // Stay inside the buffer, including its last byte
        && buffer[startIndex + strlen] != 0    // The typical null termination check
    )
    {
        ++strlen;
    }
    return strlen;
}

// This is messy, but I haven't found a built-in way in C# that guarantees null termination
public static string ParseBytes(byte[] buffer, out int strlen, int startIndex = 0)
{
    strlen = StringLength(buffer, startIndex);
    // GetString(bytes, index, count) decodes the slice without a temporary copy
    return Encoding.UTF8.GetString(buffer, startIndex, strlen);
}

The startIndex parameter exists because, in the example I was working on, I needed to parse a byte[] as an array of null-terminated strings. It can be safely ignored in the simple case.
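
For the multi-string case, a sketch of how a buffer of consecutive null-terminated UTF-8 strings might be split, using Array.IndexOf to find each terminator. The class and method names are hypothetical, and it assumes that a final string without a terminator should still be returned.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class CStrings
{
    // Splits a buffer such as "ab\0cd\0" into ["ab", "cd"].
    public static List<string> Split(byte[] buffer)
    {
        var result = new List<string>();
        int start = 0;
        while (start < buffer.Length)
        {
            // Find the next 0 byte at or after start; -1 means no terminator is left.
            int end = Array.IndexOf(buffer, (byte)0, start);
            if (end < 0) end = buffer.Length;

            result.Add(Encoding.UTF8.GetString(buffer, start, end - start));
            start = end + 1; // Skip past the terminator
        }
        return result;
    }
}
```

Two adjacent zero bytes produce an empty string in the result; whether that is desirable depends on the file format being parsed.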

#12


0  

Try this:

string myresult = System.Text.Encoding.UTF8.GetString(byteArray);
