Generics are the best we've ever had

November 05, 2007

Several times during the NFJS conference talks, Ted Neward expressed his discontent, bordering on utter disgust, with Java 5 Generics. As he explained it, generics are not a guarantee. Bruce Eckel has given a talk and written a blog entry on the topic, dating back to 2004. If it gives you any indication, the title of Bruce's blog entry is Generics Aren't. What Ted helped to clarify is that by using reflection, it is possible to violate the restrictions that generics are designed to uphold. There isn't much to argue about there. However, I will make the point that while generics aren't, they do provide some intrinsic value. Let's start by defining the problem.

Java 5 Generics work by way of "type erasure". This basically means that the information (read: effort) that you put into your Java code in the form of angle brackets (no, not XML) is used by the compiler for type checking, then promptly put down the garbage disposal. Thus, at runtime, the type guarantees are absent. It's not all bad though, because the declared type information can still be recovered using reflection. But reflection is a double-edged sword, since it is also what allows the type constraints to be violated.
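
To make the erasure concrete, here is a minimal sketch (the class name ErasureDemo is just for illustration) showing that two lists declared with different type parameters share the exact same runtime class:

import java.util.ArrayList;
import java.util.List;

public class ErasureDemo {
    public static void main(String[] args) {
        List<Integer> integers = new ArrayList<Integer>();
        List<String> strings = new ArrayList<String>();

        // both print "class java.util.ArrayList"; the type parameters are gone at runtime
        System.out.println(integers.getClass());
        System.out.println(strings.getClass());

        // true: there is only one ArrayList class, no matter the type parameter
        System.out.println(integers.getClass() == strings.getClass());
    }
}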

I am going to build on the example that Ted gave to show how type erasure allows reflection to get around constraints laid down by generics. Assume that we have a class containing a single property that holds a generic list of integers:

import java.util.ArrayList;
import java.util.List;

public class TypeErasureBean {
    private List<Integer> integers = new ArrayList<Integer>();

    public List<Integer> getIntegers() {
        return integers;
    }

    public void setIntegers(List<Integer> integers) {
        this.integers = integers;
    }
}

If we were to use this class "normally", meaning without fancy reflection techniques, it is impossible to put anything other than an Integer into the integers collection:

TypeErasureBean bean = new TypeErasureBean();
bean.getIntegers().add(1);
bean.getIntegers().add(2);
bean.getIntegers().add("three"); // invalid!

The compiler is happy as long as you play nice. It rewards you by making casting unnecessary when pulling items out of the collection:

for (Integer i : bean.getIntegers()) {
    System.out.println("integer value: " + i);
}

The trouble is, the compiler allows you to be mean and nasty by unleashing reflection on the collection. Let's sidestep the type safety and put a String into the collection using a reflective method invocation:

TypeErasureBean bean = new TypeErasureBean();
bean.getIntegers().add(1);
bean.getIntegers().add(2);
// Method comes from java.lang.reflect; these reflective calls also throw checked exceptions
Class listClass = bean.getIntegers().getClass();
Method method = listClass.getMethod("add", Object.class);
method.invoke(bean.getIntegers(), "three");

Now the collection contains a String value, but it doesn't realize it. Actually, it's that the type-erased list just doesn't care. Worse, because we got past the compiler's checks, the compiler has no problem with us continuing to assume that the collection contains only Integer values. Thus, when we run the for loop this time, we are scolded with a ClassCastException:

integer value: 1
integer value: 2
Exception in thread "main" java.lang.ClassCastException:
  java.lang.String cannot be cast to java.lang.Integer
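
The exception comes from a cast we never wrote. When the compiler desugars the enhanced for loop over a List<Integer>, it inserts the cast on our behalf. Roughly (a hand-written sketch, not the exact generated code, and requiring java.util.Iterator):

// roughly what the enhanced for loop compiles down to
for (Iterator it = bean.getIntegers().iterator(); it.hasNext();) {
    Integer i = (Integer) it.next(); // compiler-inserted cast fails on "three"
    System.out.println("integer value: " + i);
}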

Clearly, this sucks. We get a false sense of type safety. Things get even uglier when using a language like Groovy. Such an environment doesn't even require us to go as far as using reflection, since Groovy will graciously step in that mud for us:

TypeErasureBean bean = new TypeErasureBean()
bean.integers = [1, 3, "three"]
bean.integers.each {
    Integer i = it // ClassCastException on third entry
}

Perhaps we were better off before when we had to double-check our assumptions using the instanceof operator when pulling objects out of a collection. At least then we knew the collection was potentially wild-west.
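
For comparison, here is roughly what that pre-generics defensive style looked like, assuming a raw List obtained from a hypothetical source (getItemsFromSomewhere is not from the example above, just an illustration):

// pre-generics defensive iteration over a raw List (requires java.util.Iterator)
List items = getItemsFromSomewhere(); // hypothetical source of a raw, untyped List
for (Iterator it = items.iterator(); it.hasNext();) {
    Object item = it.next();
    if (item instanceof Integer) {
        Integer i = (Integer) item;
        System.out.println("integer value: " + i);
    }
}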

So why would I say that this is the best we have ever had? The reasons are simple. First and foremost, generics document the item type that belongs in a collection. Strong typing is a much better way to state intent than relying on the descriptiveness of a property name. Granted, as we just saw, the declared type can be violated, but 99% of the time it won't be, and even then, proper use of unit testing can get us to 100%.
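
If we want to back that assumption up with a test, a simple one will do. This is a hedged sketch, assuming JUnit 4 is available; the test class name is hypothetical, and a real test would exercise whatever code actually populates the bean:

import static org.junit.Assert.assertTrue;

import org.junit.Test;

public class TypeErasureBeanTest {

    @Test
    public void integersPropertyShouldHoldOnlyIntegers() {
        TypeErasureBean bean = new TypeErasureBean();
        // in a real test, exercise the code that populates the bean here
        bean.getIntegers().add(1);
        bean.getIntegers().add(2);

        for (Object item : bean.getIntegers()) {
            assertTrue("expected only Integer values", item instanceof Integer);
        }
    }
}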

The other reason is tool support. Without generics, a bean framework such as Spring or Seam would be lost when attempting to convert values from an XML descriptor so that they can be added to a collection property. The framework has to behave itself, of course, by honoring the type information provided, but at least the necessary information is there. Here is an example of how to assign values to the integers property using Seam's components.xml:

<component name="typeErasureBean" class="TypeErasureBean">
  <property name="integers">
    <value>1</value>
    <value>2</value>
    <value>three</value>
  </property>
</component>

In this case, Seam throws a NumberFormatException during component initialization when it gets to the last value (i.e. "three"), because that value cannot be converted to an Integer. But how does Seam know the value is supposed to be an Integer if the type of the collection is erased at compile time? Well, "type erasure" turns out not to be total: the generic types declared in field and method signatures are still recorded in the class bytecode for the purpose of introspection; it is only the runtime instances that carry no type parameters. Here is how Seam determines the generic type of the items in a list:

// requires java.lang.reflect.Method, java.lang.reflect.ParameterizedType, java.lang.reflect.Type
// the bean class, property name, and base collection type are passed in by the parser
Class beanClass = TypeErasureBean.class;
String propertyName = "integers";
Class baseCollection = List.class;

String setMethodName = "set" +
    propertyName.substring(0, 1).toUpperCase() +
    propertyName.substring(1);
// throws java.lang.NoSuchMethodException if the setter does not exist
Method setMethod = beanClass.getDeclaredMethod(
    setMethodName, baseCollection);

Type collectionType = setMethod.getGenericParameterTypes()[0];
if (!(collectionType instanceof ParameterizedType)) {
    throw new IllegalArgumentException(
        "collection type not parameterized");
}

Type[] typeArguments =
    ((ParameterizedType) collectionType).getActualTypeArguments();
if (typeArguments.length == 0) {
    throw new IllegalArgumentException(
        "no type arguments for collection type");
}

Type typeArgument = typeArguments[0];
if (!(typeArgument instanceof Class)) {
    throw new IllegalArgumentException(
        "type argument not a class");
}

Class genericTypeOfList = (Class) typeArgument;

Once the generic type has been determined, the value can be converted and added to the collection being accumulated. Finally, that collection can be assigned to the property of the instantiated bean.
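
To close the loop, here is a hedged sketch of what that conversion and assignment might look like, continuing from the genericTypeOfList and setMethod determined above. The values array and the use of Integer.valueOf are my own illustration, not Seam's actual code:

// values pulled from the XML descriptor (illustrative)
String[] values = { "1", "2", "three" };

List<Object> converted = new ArrayList<Object>();
for (String value : values) {
    if (Integer.class.equals(genericTypeOfList)) {
        // Integer.valueOf throws a NumberFormatException on "three"
        converted.add(Integer.valueOf(value));
    } else {
        converted.add(value);
    }
}

// assign the accumulated collection to the property of the instantiated bean
TypeErasureBean bean = new TypeErasureBean();
setMethod.invoke(bean, converted);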

Yikes! It is really difficult to play nice! Worse, if the type cannot be detected, you can either bail, as Seam does, or throw a stone and hope for the best. By the way, Spring uses similar logic to populate collection properties.

While type erasure clearly sucks, generics do have benefits that we never had before. They get us away from the use of vague collections. Oh, precious time, I feel you coming back to me! Beyond the human reader, generics enable bean creation frameworks, like Spring and Seam, to map configuration properties onto collections gracefully. The downside is that this benefit comes at a very high cost in terms of runtime introspection. But hey, that's why we use the framework and don't try to code it ourselves! The framework also protects us from invalid types getting into the collections, in the absence of JVM enforcement. And people ask why you need frameworks. Ha!

Generics may not have been done right, but they are the best we've ever had.

Posted at 01:16 AM in Java | Permalink

5 Comments from the Peanut Gallery

1 | Posted by Ricky Clarkson on November 05, 2007 at 04:22 PM EST

It seems strange to jump to the conclusion that erasure sucks, instead of that reflection sucks, and that Groovy sucks. Surely without reflection or Groovy the type safety would be complete.

2 | Posted by Dan Allen on November 12, 2007 at 02:14 PM EST

I am saying that erasure sucks. I most certainly would not agree that either reflection or Groovy sucks. They are stuck with the information they have. If it weren't for erasure, there wouldn't be a need to have them tiptoe around the problem.

3 | Posted by David Sachdev on December 03, 2007 at 08:01 PM EST

I think that you have done a great job of showing the problem...and then explaining how the Groovy user is affected. It is clear that erasure wasn't the way to go here - and I'm glad that we got that explanation from Ted, because without it I wouldn't have considered this problem.

4 | Posted by Joe Campbell on December 31, 2007 at 09:36 AM EST

A few people have mentioned that type erasure shouldn't have been done - if that were the case, however, we would have had a much less graceful path for upgrading our code from the 1.4 to the 1.5 JVM. Type erasure allows a 'one at a time' class upgrade without sacrificing much. Like most things, it was a compromise to make things easier overall.

5 | Posted by Dan Allen on December 31, 2007 at 11:17 AM EST

Yeah, I agree it was probably the right move given the circumstances, which is why I say it was better than nothing. My hope is that it can be implemented correctly in a future JVM in a way that is compatible with the Java 5 syntax but corrects the type erasure. Obviously, Java 1.4 would have to be deprecated, so the time has to be right. We cannot live in the past forever.