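Hex Encoding

How the Encoding Works

Hex encoding represents each 8-bit byte as two hexadecimal digits. To encode, the 8 bits are split into two 4-bit groups, and each group is placed in the low 4 bits of its own byte with the high 4 bits set to 0: one byte carries the high 4 bits of the original byte, the other carries the low 4 bits. The hexadecimal digits for these two values, high group first, form the encoding. The encoded output is therefore twice as long as the source data. The Hex encoding table is: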

 0 0     1 1     2 2     3 3

 4 4     5 5     6 6     7 7

 8 8     9 9    10 a    11 b

12 c    13 d    14 e    15 f

For example, the ASCII character A is encoded as follows:

ASCII code:  A (65)

Binary:      0100_0001

Regrouped:   0000_0100 0000_0001

Hexadecimal:         4         1

Hex encoded: 41

Likewise, the character 丁 occupies three bytes in UTF-8 (0xE4 0xB8 0x81), so its Hex encoding is e4b881.
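The regrouping described above can be sketched in a few lines of Java. This is a minimal illustration, not code from either library discussed here; the class name HexNibbleSketch and its encode method are hypothetical names chosen for this example.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of any library.
final class HexNibbleSketch {

    // Encoding table from the text: values 0-15 map to '0'-'9' and 'a'-'f'.
    private static final char[] TABLE = "0123456789abcdef".toCharArray();

    static String encode(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            int v = b & 0xff;            // read the byte as an unsigned value 0..255
            sb.append(TABLE[v >>> 4]);   // high 4 bits become the first hex digit
            sb.append(TABLE[v & 0x0f]);  // low 4 bits become the second hex digit
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "A" is the single byte 0x41.
        System.out.println(encode("A".getBytes(StandardCharsets.UTF_8)));      // 41
        // \u4e01 is the character 丁, three bytes in UTF-8.
        System.out.println(encode("\u4e01".getBytes(StandardCharsets.UTF_8))); // e4b881
    }
}
```

The `& 0xff` mask matters because Java bytes are signed: without it, a byte such as 0xE4 would be sign-extended to a negative int and the shift would produce an out-of-range table index.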

Implementation

Using Bouncy Castle

The code below uses the open-source library Bouncy Castle, version 1.56, to perform Hex encoding and decoding.

import java.io.UnsupportedEncodingException;

import org.bouncycastle.util.encoders.Hex;

public class HexTestBC {

    public static void main(String[] args)

            throws UnsupportedEncodingException {

        // Encode

        byte[] data = "A".getBytes("UTF-8");

        byte[] encodeData = Hex.encode(data);

        String encodeStr = Hex.toHexString(data);

        System.out.println(new String(encodeData, "UTF-8"));

        System.out.println(encodeStr);

        // Decode

        byte[] decodeData = Hex.decode(encodeData);

        byte[] decodeData2 = Hex.decode(encodeStr);

        System.out.println(new String(decodeData, "UTF-8"));

        System.out.println(new String(decodeData2, "UTF-8"));

    }

}

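Program output: 41 is printed twice (once from the encoded byte array and once from the encoded string), followed by A twice (the two decoded results).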

Using Apache Commons Codec

The code below uses the open-source library Apache Commons Codec, version 1.10, to perform Hex encoding and decoding.

import java.io.UnsupportedEncodingException;

import org.apache.commons.codec.DecoderException;

import org.apache.commons.codec.binary.Hex;

public class HexTestCC {

    public static void main(String[] args)

            throws UnsupportedEncodingException,

                DecoderException {

        // Encode

        byte[] data = "A".getBytes("UTF-8");

        char[] encodeData = Hex.encodeHex(data);

        String encodeStr = Hex.encodeHexString(data);

        System.out.println(new String(encodeData));

        System.out.println(encodeStr);

        // Decode

        byte[] decodeData = Hex.decodeHex(encodeData);

        System.out.println(new String(decodeData, "UTF-8"));

    }

}

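Source Code Analysis

Bouncy Castle Source Analysis

In Bouncy Castle, Hex encoding and decoding are implemented by the org.bouncycastle.util.encoders.HexEncoder class. For encoding, it first defines an encoding table: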

protected final byte[] encodingTable =

{

    (byte)'0', (byte)'1', (byte)'2', (byte)'3',

    (byte)'4', (byte)'5', (byte)'6', (byte)'7',

    (byte)'8', (byte)'9', (byte)'a', (byte)'b',

    (byte)'c', (byte)'d', (byte)'e', (byte)'f'

};

The encoding code is then:

public int encode(

    byte[]                data,

    int                    off,

    int                    length,

    OutputStream    out)

    throws IOException

{

    for (int i = off; i < (off + length); i++)

    {

        int    v = data[i] & 0xff;

        out.write(encodingTable[(v >>> 4)]);

        out.write(encodingTable[v & 0xf]);

    }

    return length * 2;

}

Decoding is slightly more involved. The HexEncoder constructor calls initialiseDecodingTable to build the decoding table:

protected final byte[] decodingTable = new byte[128];

protected void initialiseDecodingTable()

{

    for (int i = 0; i < decodingTable.length; i++)

    {

        decodingTable[i] = (byte)0xff;

    }

    for (int i = 0; i < encodingTable.length; i++)

    {

        decodingTable[encodingTable[i]] = (byte)i;

    }

    decodingTable['A'] = decodingTable['a'];

    decodingTable['B'] = decodingTable['b'];

    decodingTable['C'] = decodingTable['c'];

    decodingTable['D'] = decodingTable['d'];

    decodingTable['E'] = decodingTable['e'];

    decodingTable['F'] = decodingTable['f'];

}

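The decoding table is a 128-element byte array. Each index corresponds to an ASCII code, and the value stored there is the number that the character decodes to. For Hex, positions 48-57 (the ASCII characters 0-9) hold the values 0-9, positions 65-70 (A-F) hold 10-15, and positions 97-102 (a-f) also hold 10-15. Every other position holds 0xff, which is -1 as a signed byte and marks an invalid character. For example, decodingTable[65], the entry for A, is 10. The full table is: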

  -1      -1      -1      -1      -1      -1      -1      -1

  -1      -1      -1      -1      -1      -1      -1      -1

  -1      -1      -1      -1      -1      -1      -1      -1

  -1      -1      -1      -1      -1      -1      -1      -1

  -1    ! -1    " -1    # -1    $ -1    % -1    & -1    ' -1

( -1    ) -1    * -1    + -1    , -1    - -1    . -1    / -1

0  0    1  1    2  2    3  3    4  4    5  5    6  6    7  7

8  8    9  9    : -1    ; -1    < -1    = -1    > -1    ? -1

@ -1    A 10    B 11    C 12    D 13    E 14    F 15    G -1

H -1    I -1    J -1    K -1    L -1    M -1    N -1    O -1

P -1    Q -1    R -1    S -1    T -1    U -1    V -1    W -1

X -1    Y -1    Z -1    [ -1    \ -1    ] -1    ^ -1    _ -1

` -1    a 10    b 11    c 12    d 13    e 14    f 15    g -1

h -1    i -1    j -1    k -1    l -1    m -1    n -1    o -1

p -1    q -1    r -1    s -1    t -1    u -1    v -1    w -1

x -1    y -1    z -1    { -1    | -1    } -1    ~ -1      -1
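The table above can be reproduced with a short sketch. This is not Bouncy Castle's code: it fills the array directly with loops rather than deriving it from the encoding table, and the class name DecodingTableSketch and method buildTable are hypothetical names used only for illustration.

```java
import java.util.Arrays;

// Hypothetical illustration of the 128-entry decoding table; not library code.
final class DecodingTableSketch {

    static byte[] buildTable() {
        byte[] table = new byte[128];
        Arrays.fill(table, (byte) -1);            // -1 (0xff) marks "not a hex digit"
        for (int i = 0; i < 10; i++) {
            table['0' + i] = (byte) i;            // positions 48-57 hold 0-9
        }
        for (int i = 0; i < 6; i++) {
            table['A' + i] = (byte) (10 + i);     // positions 65-70 hold 10-15
            table['a' + i] = (byte) (10 + i);     // positions 97-102 hold 10-15
        }
        return table;
    }

    public static void main(String[] args) {
        byte[] t = buildTable();
        System.out.println(t['A']); // 10
        System.out.println(t['f']); // 15
        System.out.println(t['g']); // -1
    }
}
```

Because valid entries are in the range 0-15 and invalid entries are -1, the bitwise OR of two looked-up values is negative exactly when at least one of them is invalid, which is what allows a single sign check to validate a pair of characters.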

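Decoding reads two consecutive bytes, looks up the value of each in the decoding table, and combines the two 4-bit values into one 8-bit byte, which is written to the output. The source is: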

public int decode(

    byte[]          data,

    int             off,

    int             length,

    OutputStream    out)

    throws IOException

{

    byte    b1, b2;

    int     outLen = 0;

    int     end = off + length;

    while (end > off)

    {

        if (!ignore((char)data[end - 1]))

        {

            break;

        }

        end--;

    }

    int i = off;

    while (i < end)

    {

        while (i < end && ignore((char)data[i]))

        {

            i++;

        }

        b1 = decodingTable[data[i++]];

        while (i < end && ignore((char)data[i]))

        {

            i++;

        }

        b2 = decodingTable[data[i++]];

        if ((b1 | b2) < 0)

        {

            throw new IOException("invalid characters encountered in Hex data");

        }

        out.write((b1 << 4) | b2);

        outLen++;

    }

    return outLen;

}

The ignore method is shown below. Decoding skips whitespace at the start, at the end, and between characters.

private static boolean ignore(

    char    c)

{

    return c == '\n' || c =='\r' || c == '\t' || c == ' ';

}
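To see the effect of skipping whitespace, here is a minimal standalone sketch of a whitespace-tolerant decoder. It is an assumption-laden simplification rather than the Bouncy Castle implementation: it takes a String instead of a byte range, uses range checks instead of a lookup table, and rejects an odd number of digits explicitly. The names LenientHexDecodeSketch, isIgnored, digitValue, and decode are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of whitespace-tolerant Hex decoding; not library code.
final class LenientHexDecodeSketch {

    // Same whitespace set as the ignore method above.
    static boolean isIgnored(char c) {
        return c == '\n' || c == '\r' || c == '\t' || c == ' ';
    }

    // Returns 0-15 for a hex digit, or -1 for any other character.
    static int digitValue(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }

    static byte[] decode(String hex) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int high = -1; // pending high 4 bits, or -1 when no digit is pending
        for (int i = 0; i < hex.length(); i++) {
            char c = hex.charAt(i);
            if (isIgnored(c)) {
                continue; // skip whitespace wherever it appears
            }
            int digit = digitValue(c);
            if (digit < 0) {
                throw new IllegalArgumentException("invalid hex character at index " + i);
            }
            if (high < 0) {
                high = digit;                     // first digit of a pair
            } else {
                out.write((high << 4) | digit);   // second digit completes the byte
                high = -1;
            }
        }
        if (high >= 0) {
            throw new IllegalArgumentException("odd number of hex digits");
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(new String(decode(" 4 1\n"), StandardCharsets.US_ASCII)); // A
    }
}
```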

The Hex utility class used in the example code holds a HexEncoder instance and uses ByteArrayOutputStream to work with byte arrays; the rest is not covered in detail here.

public class Hex

{

    private static final Encoder encoder = new HexEncoder();

    public static byte[] encode(

        byte[]    data,

        int       off,

        int       length)

    {

        ByteArrayOutputStream    bOut = new ByteArrayOutputStream();

        try

        {

            encoder.encode(data, off, length, bOut);

        }

        catch (Exception e)

        {

            throw new EncoderException("exception encoding Hex string: "

                      + e.getMessage(), e);

        }

        return bOut.toByteArray();

    }

......

}

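Apache Commons Codec Source Analysis

To encode, Apache Commons Codec allocates a character array twice the length of the source data, then converts each source byte into two characters and stores them in that array. The caller can choose whether the output uses lowercase or uppercase digits.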

private static final char[] DIGITS_LOWER =

    {'0', '1', '2', '3', '4', '5', '6', '7',

     '8', '9', 'a', 'b', 'c', 'd', 'e', 'f'};

private static final char[] DIGITS_UPPER =

    {'0', '1', '2', '3', '4', '5', '6', '7',

     '8', '9', 'A', 'B', 'C', 'D', 'E', 'F'};

public static char[] encodeHex(final byte[] data) {

    return encodeHex(data, true);

}

public static char[] encodeHex(final byte[] data,

                               final boolean toLowerCase) {

        return encodeHex(data,

                toLowerCase ? DIGITS_LOWER : DIGITS_UPPER);

}

protected static char[] encodeHex(final byte[] data,

                                  final char[] toDigits) {

    final int l = data.length;

    final char[] out = new char[l << 1];

    // two characters form the hex value.

    for (int i = 0, j = 0; i < l; i++) {

        out[j++] = toDigits[(0xF0 & data[i]) >>> 4];

        out[j++] = toDigits[0x0F & data[i]];

    }

    return out;

}

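To decode, Apache Commons Codec first allocates a byte array half the length of the input, then converts each pair of consecutive hexadecimal digits into one byte, using the JDK's Character.digit method for the conversion.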

public static byte[] decodeHex(final char[] data)

           throws DecoderException {

    final int len = data.length;

    if ((len & 0x01) != 0) {

        throw new DecoderException("Odd number of characters.");

    }

    final byte[] out = new byte[len >> 1];

    // two characters form the hex value.

    for (int i = 0, j = 0; j < len; i++) {

        int f = toDigit(data[j], j) << 4;

        j++;

        f = f | toDigit(data[j], j);

        j++;

        out[i] = (byte) (f & 0xFF);

    }

    return out;

}

protected static int toDigit(final char ch, final int index)

        throws DecoderException {

    final int digit = Character.digit(ch, 16);

    if (digit == -1) {

        throw new DecoderException(""

                + "Illegal hexadecimal character "

                + ch + " at index " + index);

    }

    return digit;

}
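The behaviour of Character.digit with radix 16, and the way two digit values are merged into one signed Java byte, can be checked with a small sketch. The class CharacterDigitSketch and the method pairToByte are hypothetical names for this illustration and are not part of Apache Commons Codec.

```java
// Hypothetical demonstration of Character.digit-based decoding; not library code.
final class CharacterDigitSketch {

    // Combines two hex characters into one byte, high digit first.
    static byte pairToByte(char high, char low) {
        int h = Character.digit(high, 16);
        int l = Character.digit(low, 16);
        if (h == -1 || l == -1) {
            throw new IllegalArgumentException("Illegal hexadecimal character");
        }
        return (byte) ((h << 4) | l); // the cast keeps only the low 8 bits
    }

    public static void main(String[] args) {
        System.out.println(Character.digit('a', 16)); // 10
        System.out.println(Character.digit('F', 16)); // 15
        System.out.println(Character.digit('g', 16)); // -1
        System.out.println(pairToByte('4', '1'));     // 65
        System.out.println(pairToByte('e', '4'));     // -28
    }
}
```

The last line prints -28 rather than 228 because Java bytes are signed; the bit pattern is still 0xE4. Note also that Character.digit accepts both cases, so no separate upper/lower handling is needed when decoding.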


Original article: https://www.jianshu.com/p/57c4e8d3f035

posted @ 2019-06-12 16:49 星朝