Do typed arrays help the JIT optimize better?

Time: 2022-12-25 17:19:42

My question is the following:

It's usual for Java code to have generic collections implemented like this:

public class GenericCollection<T> {
    private Object[] data;

    public GenericCollection () {
        // Backing array is a plain object array.
        this.data = new Object[10];
    }

    @SuppressWarnings( "unchecked" )
    public T get(int index) {
        // And we just cast to appropriate type when needed.
        return (T) this.data[index];
    }
}

And used like this, for example:

for (MyObject obj : genericCollection) {
    obj.myObjectMethod();
}

Since the generic type of genericCollection is erased, the JVM doesn't seem to have a way to know that the 'data' array inside genericCollection really holds only MyObject instances. The actual type of the array is Object[], so there could be a String in it, and calling 'myObjectMethod' on it would raise an exception.

So I'm assuming the JVM has to do some runtime checking gymnastics to know what is really inside that GenericCollection instance.
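
To make the concern concrete, here is a minimal sketch of how a wrong element can end up in an Object[]-backed container and where the failure then surfaces. The names ErasureCastDemo, Box and pollute are hypothetical and only serve the illustration; Box mirrors the GenericCollection above.

```java
// Hypothetical demo class; none of these names come from the question.
class ErasureCastDemo {

    // Minimal Object[]-backed container, mirroring the first GenericCollection.
    static class Box<T> {
        private final Object[] data = new Object[1];

        void set(int index, T value) {
            data[index] = value;
        }

        @SuppressWarnings("unchecked")
        T get(int index) {
            return (T) data[index]; // T is erased, so nothing is checked here
        }
    }

    // Stores a String in a Box<Integer> through a raw reference (heap pollution)
    // and reports where the problem is detected.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static String pollute() {
        Box<Integer> ints = new Box<>();
        Box raw = ints;      // raw type: the compiler no longer checks T
        raw.set(0, "oops");  // accepted, because the backing array is Object[]
        try {
            Integer i = ints.get(0); // javac inserts a cast to Integer here
            return "no exception, got " + i;
        } catch (ClassCastException e) {
            return "ClassCastException at the call site";
        }
    }

    public static void main(String[] args) {
        System.out.println(pollute()); // prints "ClassCastException at the call site"
    }
}
```

The check that catches the String is the cast javac places where the result is used as an Integer, not anything inside get.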

Now look at this implementation:

import java.lang.reflect.Array;

public class GenericCollection<T> {
    private T[] data;

    @SuppressWarnings( "unchecked" )
    public GenericCollection ( Class<T> type ) {
        // Create a type specific array.
        this.data = (T[]) Array.newInstance( type, 10 );
    }

    public T get ( int index ) {
        // No unsafe casts needed.
        return this.data[index];
    }
}

In this case we create a type-specific array through reflection, so the JVM could infer that there can only be T objects inside that array in a given context, making the unsafe casts and possibly expensive type checks redundant.

My question is: given the things HotSpot can do, would it help in any way, performance-wise, to implement generic collections with a "proper" type-specific backing array?

For example, does it help HotSpot remove unnecessary type checks or casts? Might it enable HotSpot to inline methods more easily, given that it knows the backing array is of a specific type?

2 Answers

#1


6  

Not in this particular case.

A generic array T[] is erased to Object[] in the bytecode. The array getter for Object[] always returns Object, so it does not need to check the actual type of the array. Hence there is no benefit in having T[] instead of Object[] for the array get operation. In both cases there is an aaload instruction followed by a checkcast where the result is used as the concrete type, and it works the same way.
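
The erasure can be observed with plain reflection. The sketch below is only an illustration: ErasedFieldDemo and its helper methods are made-up names, and GenericArray is a trimmed copy of the class used in the benchmark further down.

```java
import java.lang.reflect.Array;

// Hypothetical demo class; the helper names are illustrative only.
class ErasedFieldDemo {

    static class GenericArray<T> {
        T[] data;

        @SuppressWarnings("unchecked")
        GenericArray(Class<T> type, int length) {
            data = (T[]) Array.newInstance(type, length);
        }

        T get(int index) {
            return data[index];
        }
    }

    // Declared (erased) type of the field 'data' as recorded in the class file.
    static String declaredFieldType() {
        try {
            return GenericArray.class.getDeclaredField("data").getType().getSimpleName();
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    // Declared (erased) return type of get(int).
    static String declaredReturnType() {
        try {
            return GenericArray.class.getDeclaredMethod("get", int.class)
                    .getReturnType().getSimpleName();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    // Actual runtime class of the array object created through reflection.
    static String runtimeArrayType() {
        GenericArray<String> strings = new GenericArray<>(String.class, 1);
        return strings.data.getClass().getSimpleName();
    }

    public static void main(String[] args) {
        System.out.println(declaredFieldType());  // Object[]
        System.out.println(declaredReturnType()); // Object
        System.out.println(runtimeArrayType());   // String[]
    }
}
```

The array object itself is a String[], but the field and the getter are declared in terms of Object, so the bytecode of get looks the same as for a plain Object[] field.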

Meanwhile, the array setter will perform worse for a typed array than for Object[], because aastore must check that the value matches the actual array component type.
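
The store check is easy to see in isolation. In this small sketch (ArrayStoreDemo and tryStore are hypothetical names), the same store statement is run against a plain Object[] and against an array created with Array.newInstance(String.class, ...); only the typed array can reject a value.

```java
import java.lang.reflect.Array;

// Hypothetical demo class; tryStore is an illustrative helper.
class ArrayStoreDemo {

    // Stores 'value' into array[0] and reports whether the runtime store check passed.
    static String tryStore(Object[] array, Object value) {
        try {
            array[0] = value; // compiled to aastore, which checks the component type
            return "stored";
        } catch (ArrayStoreException e) {
            return "ArrayStoreException";
        }
    }

    public static void main(String[] args) {
        Object[] plain = new Object[1];
        Object[] typed = (Object[]) Array.newInstance(String.class, 1); // really a String[]

        System.out.println(tryStore(plain, 42));   // stored
        System.out.println(tryStore(typed, 42));   // ArrayStoreException
        System.out.println(tryStore(typed, "ok")); // stored
    }
}
```

This shows the semantics of the check only; how much it costs after JIT compilation is what the benchmark measures.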

That is, your proposed modification performs the same for get, but worse for set. This can be confirmed by the following JMH benchmark.

package bench;

import org.openjdk.jmh.annotations.*;

import java.lang.reflect.Array;

@State(Scope.Benchmark)
public class Generics {
    private ObjectArray<String> objectArray;
    private GenericArray<String> genericArray;
    private StringArray stringArray;
    private int index;

    @Param("100000")
    private int length;

    @Setup
    public void setup() {
        genericArray = new GenericArray<>(String.class, length);
        objectArray = new ObjectArray<>(length);
        stringArray = new StringArray(length);

        for (int i = 0; i < length; i++) {
            String s = Integer.toString(i);
            objectArray.set(i, s);
            genericArray.set(i, s);
            stringArray.set(i, s);
        }
    }

    @Benchmark
    public String getGenericArray() {
        return genericArray.get(nextIndex());
    }

    @Benchmark
    public String getObjectArray() {
        return objectArray.get(nextIndex());
    }

    @Benchmark
    public String getStringArray() {
        return stringArray.get(nextIndex());
    }

    @Benchmark
    public void setGenericArray() {
        genericArray.set(nextIndex(), "value");
    }

    @Benchmark
    public void setObjectArray() {
        objectArray.set(nextIndex(), "value");
    }

    @Benchmark
    public void setStringArray() {
        stringArray.set(nextIndex(), "value");
    }

    private int nextIndex() {
        if (++index == length) index = 0;
        return index;
    }

    static class GenericArray<T> {
        private T[] data;

        @SuppressWarnings("unchecked")
        public GenericArray(Class<T> type, int length) {
            this.data = (T[]) Array.newInstance(type, length);
        }

        public T get(int index) {
            return data[index];
        }

        public void set(int index, T value) {
            data[index] = value;
        }
    }

    static class ObjectArray<T> {
        private Object[] data;

        public ObjectArray(int length) {
            this.data = new Object[length];
        }

        @SuppressWarnings("unchecked")
        public T get(int index) {
            return (T) data[index];
        }

        public void set(int index, T value) {
            data[index] = value;
        }
    }

    static class StringArray {
        private String[] data;

        public StringArray(int length) {
            this.data = new String[length];
        }

        public String get(int index) {
            return data[index];
        }

        public void set(int index, String value) {
            data[index] = value;
        }
    }
}

And the results:

Benchmark                 (length)  Mode  Cnt  Score   Error  Units
Generics.getGenericArray    100000  avgt   40  5.212 ± 0.038  ns/op  <- equal
Generics.getObjectArray     100000  avgt   40  5.224 ± 0.043  ns/op  <-
Generics.getStringArray     100000  avgt   40  4.557 ± 0.051  ns/op
Generics.setGenericArray    100000  avgt   40  3.299 ± 0.032  ns/op  <- worse
Generics.setObjectArray     100000  avgt   40  2.456 ± 0.007  ns/op  <-
Generics.setStringArray     100000  avgt   40  2.138 ± 0.008  ns/op

#2


2  

No. The Java Tutorial on type erasure explains:

Generics were introduced to the Java language to provide tighter type checks at compile time and to support generic programming. To implement generics, the Java compiler applies type erasure to:

  • Replace all type parameters in generic types with their bounds or Object if the type parameters are unbounded. The produced bytecode, therefore, contains only ordinary classes, interfaces, and methods.

  • Insert type casts if necessary to preserve type safety.

  • Generate bridge methods to preserve polymorphism in extended generic types.

Thus, after compilation, unbounded generic type parameters are replaced by Object.
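
As a small illustration of the first rule quoted above (the class and method names here are invented for the example), reflection shows that differently parameterized lists share one runtime class, and that a type parameter erases to its bound, or to Object when it has none.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; Bounded and Unbounded exist only for this example.
class ErasureDemo {

    static class Bounded<T extends Number> {
        T value;
    }

    static class Unbounded<T> {
        T value;
    }

    // Both lists are instances of the very same runtime class.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> ints = new ArrayList<>();
        return strings.getClass() == ints.getClass();
    }

    // Erased type of the field 'value' in the given class.
    static String erasedFieldType(Class<?> owner) {
        try {
            return owner.getDeclaredField("value").getType().getSimpleName();
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());               // true
        System.out.println(erasedFieldType(Bounded.class));   // Number
        System.out.println(erasedFieldType(Unbounded.class)); // Object
    }
}
```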
