A new default method CharSequence.isEmpty() was added in the just-released JDK 15. This broke the Eclipse Collections project. Fortunately, the EC developers were testing the JDK 15 early access builds. They noticed the incompatibility, and they were able to ship a patch release (Eclipse Collections 10.4.0) before JDK 15 shipped. They also reported this to the OpenJDK Quality Outreach program. As a result, we were able to document this change in a release note for JDK 15.
Kudos to Nikhil Nanivadekar and Don Raab and the Eclipse Collections team for getting on top of this issue!
What’s the story here? Aren’t new JDK releases supposed to be compatible? In general, yes, we try really hard to keep everything compatible. But sometimes incompatibilities are unavoidable, and sometimes we just miss stuff. To understand what happened, we need to discuss two distinct concepts: source incompatibility and binary incompatibility.
A source incompatible change is one where a source file compiles just fine on an earlier JDK release but fails to compile on a more recent JDK release. A binary incompatible change is one where a compiled class file runs fine on an earlier JDK release but fails at runtime on a more recent JDK release.
In development of the JDK, we put in quite a bit of effort to avoid binary incompatible changes, since it’s unreasonable to force people to recompile everything, and potentially maintain different artifacts, for different JDK releases. Ideally, we’d like to enable people to provide a single binary artifact (e.g., a jar file) that runs on all of the JDK releases that their project supports.
We are somewhat more tolerant of source incompatible changes. If you’re recompiling something, then presumably you have access to the source code in order to make a few minor adjustments. We’re willing to make minor source incompatible changes to the JDK if the change provides enough value to justify the incompatibility.
It turns out that adding a default method to an interface is potentially both a source and binary incompatible change. I was a bit surprised by this. What’s going on?
Let’s first set aside default methods on interfaces and look just at adding methods to classes. Making changes to a class potentially affects subclasses. In most cases, adding a method to a class is a binary compatible change, even if the subclass has methods that are apparently in conflict with the new method in the superclass. For example, consider this class compiled on JDK 8:
class MyInputStream extends InputStream {
    public String readAllBytes() { ... }
    ...
}
This works fine. However, a method was added to InputStream on JDK 9:
public byte[] readAllBytes()
Now there is a conflict between InputStream and MyInputStream, since they have methods with the same name, the same parameters (none), but different return types. Despite this conflict, this is a binary compatible change. Any already-compiled classes that invoke the readAllBytes() method on an instance of MyInputStream will do so using this bytecode:
invokevirtual #6 // Method MyInputStream.readAllBytes:()Ljava/lang/String;
(I determined this by compiling a program that uses MyInputStream on JDK 8, and then running the javap -c command on the resulting class file.) Roughly, this says “invoke the method named «readAllBytes» that takes no arguments and returns a String.” That method exists on MyInputStream and not on InputStream, so the method invocation works even on JDK 9.
However, this is a source incompatible change. When I try to recompile MyInputStream.java on JDK 9, the result is this:
MyInputStream.java:13: error: readAllBytes() in MyInputStream cannot override readAllBytes() in InputStream
public String readAllBytes() {
^
return type String is not compatible with byte[]
The compatibility analysis of adding methods to classes is fairly straightforward. There is only one path from the current class up the superclass chain to the root class, java.lang.Object. Any conflicts among methods can only occur on this path.
Analysis of adding default methods to interfaces is more complicated, because a class or interface can inherit from multiple interfaces. This means that, looking upward from the current class, instead of there being a linear chain of superclasses up to Object, there is a branching tree (actually a DAG) of interface inheritance. This gives rise to several inheritance possibilities that cannot occur with class-only inheritance.
Also, since default methods are a relatively recent feature, the Java community has less experience evolving APIs with them. Default methods were added in Java 8, which was released in 2014, so we have “only” six years of experience with them.
It was possible to have conflicts among interfaces, even before Java 8, for example, if two unrelated interfaces declared the same method but with different return types. Prior to Java 8, though, interfaces were essentially impossible to evolve, and so having such conflicts arise from interface evolution hardly occurred. Finally, in the pre-Java 8 world, interface methods were all abstract. If a class inherited the “same” method (same name, parameters, and return type) from different interfaces, that was OK, as both could be satisfied by a single implementation provided by the class or one of its superclasses.
With the addition of default methods in Java 8, a new problem arose: what if a default method were added to an interface somewhere, such that conflicts between method implementations might arise somewhere in the superclass and superinterface graph? More specifically, what if the superinterface graph contains two default implementations for the same method? The full rules are described in the Java Language Specification, sections 8.4.8 and 8.4.8.4, and there are lots of edge cases, but briefly, the rules are as follows:
- Methods inherited from the class hierarchy take precedence over default methods inherited from interfaces.
- Default methods in interfaces are allowed to override each other; the most specific override takes precedence.
- If multiple default methods are inherited from unrelated interfaces (that is, one doesn’t override the others), that’s a compile-time error.
Here are some examples of these rules in action:
class S {
    public void foo() { ... }
}
interface I {
    default void foo() { ... }
}
interface J extends I {
    default void foo() { ... }
}
interface K {
    default void foo() { ... }
}
Given this class and these interfaces, how do the inheritance rules work?
class C extends S implements I { }
// ok: class wins, S::foo inherited
class D implements I, J { }
// ok: overriding default method wins, J::foo inherited
class E implements I, K { }
// ERROR: types I and K are incompatible;
// class E inherits unrelated defaults for foo() from types I and K
So now we have to think harder about the compatibility impact of adding a default method. If a class already has the method, we’re OK. If there’s another interface that has a default method that overrides or is overridden by the default method we’re adding, that’s OK too. A problem can occur only if some class also inherits a default method with the same signature from another, unrelated interface somewhere in its interface graph.
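The three rules can be tried out in a single file. This is only a sketch: the types mirror S, I, J, K, C, D, and E from the example, but I've assumed a String return value (instead of void) so that the selected method is visible, and E is given an explicit foo() so that the file compiles.

```java
public class DefaultMethodRules {
    static class S {
        public String foo() { return "S.foo"; }
    }
    interface I {
        default String foo() { return "I.foo"; }
    }
    interface J extends I {
        @Override
        default String foo() { return "J.foo"; }
    }
    interface K {
        default String foo() { return "K.foo"; }
    }

    // Rule 1: the method inherited from the class hierarchy (S.foo) wins over I's default.
    static class C extends S implements I { }

    // Rule 2: J.foo overrides I.foo, so the more specific J.foo is inherited.
    static class D implements I, J { }

    // Rule 3: "class E implements I, K { }" alone would be a compile-time error.
    // Declaring foo() in the class resolves it; here E explicitly picks K's default.
    static class E implements I, K {
        @Override
        public String foo() { return K.super.foo(); }
    }

    public static void main(String[] args) {
        System.out.println(new C().foo()); // prints S.foo
        System.out.println(new D().foo()); // prints J.foo
        System.out.println(new E().foo()); // prints K.foo
    }
}
```

Removing the body of E brings back the "inherits unrelated defaults" error from javac.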
That’s what’s going on with source compatibility. If you run through the examples above, you can see the kind of compilation error that might arise. What about binary compatibility? It turns out that the rules for binary compatibility with default methods are actually quite similar to those for source compatibility.
Here’s what the Java Virtual Machine Specification says about how invokevirtual finds the method to call. It first talks about method selection:
A method is selected with respect to [the class] and the resolved method (§5.4.6).
Section 5.4.6 says:
The maximally-specific superinterface methods of [the receiver class] are determined (§5.4.3.3). If exactly one matches [the method]’s name and descriptor and is not abstract, then it is the selected method.
OK, what if there isn’t exactly one match? In particular, what if there are multiple matches? Back in the specification of invokevirtual, it says:
If no method is selected, and there are multiple maximally-specific superinterface methods of [the class] that match the resolved method’s name and descriptor and are not abstract, invokevirtual throws an IncompatibleClassChangeError.
Thus, the JVM has to do quite a bit of analysis at runtime. When a method is invoked on some class, the JVM not only has to search for that method up the class hierarchy; it also has to search the graph of interface inheritance to see whether a default method might have been inherited, and to check that there is exactly one such method. As a result, adding a default method to an interface can cause problems for existing, compiled classes: a binary incompatibility.
We always examine the JDK for incompatibilities and avoid them if possible. In addition, we look at popular non-JDK libraries to see if problems might occur with them. This kind of incompatibility can occur only if a non-JDK library has a signature-compatible default method in an interface that is unrelated to the JDK interface being modified. It also requires that there be some class that inherits both that interface and the JDK interface. That seems pretty rare, but it can happen.
In fact, this is exactly the case that came up in Eclipse Collections! The Eclipse Collections library has an interface PrimitiveIterable that implements a default method isEmpty, and it also has a class CharAdapter that implements PrimitiveIterable and CharSequence:
interface PrimitiveIterable {
    default boolean isEmpty() { ... }
}
class CharAdapter implements PrimitiveIterable, CharSequence {
    ...
}
This works perfectly fine in JDK 14 and earlier releases. Consider some code that calls CharAdapter.isEmpty(). The bytecode generated would be as follows:
invokevirtual #13 // Method org/eclipse/collections/impl/string/immutable/CharAdapter.isEmpty:()Z
This works on JDK 14, because invokevirtual searches all the superclasses and superinterfaces of CharAdapter, and it finds exactly one default method: the one in PrimitiveIterable.
On JDK 15, the situation is different. A new default method isEmpty() was added to CharSequence. Thus, when the same invokevirtual bytecode is executed, it searches the superclasses and superinterfaces of CharAdapter, but this time it finds two matching default methods: the one in PrimitiveIterable and the one in CharSequence. That’s an error according to the JVM Specification, and that’s exactly what happens:
java.lang.IncompatibleClassChangeError: Conflicting default methods: org/eclipse/collections/api/PrimitiveIterable.isEmpty java/lang/CharSequence.isEmpty
What’s to be done about this? Fortunately, the fix is pretty simple: just add an implementation of isEmpty() to the CharAdapter class. (A couple other classes, CodePointAdapter and CodePointList, are in a similar situation and were also fixed.) In this case the implementations of isEmpty() are so simple that the code this.length == 0 was just inlined. If for some reason it were necessary to have CharAdapter inherit the implementation from PrimitiveIterable, then the implementation in CharAdapter could have been written like this:
@Override
public boolean isEmpty()
{
    return PrimitiveIterable.super.isEmpty();
}
As mentioned above, this fix was delivered in Eclipse Collections 10.4.0, which was delivered in time for JDK 15. Again, thanks to the EC team for their quick work on this.
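To make the shape of the fix concrete, here is a minimal sketch that compiles both before and after JDK 15. SizedIterable and Chars are hypothetical stand-ins I made up for PrimitiveIterable and CharAdapter; they are not the actual Eclipse Collections code.

```java
public class IsEmptyFix {
    // Hypothetical stand-in for an interface like PrimitiveIterable.
    interface SizedIterable {
        int size();
        default boolean isEmpty() { return size() == 0; }
    }

    // Hypothetical stand-in for a class like CharAdapter.
    static final class Chars implements SizedIterable, CharSequence {
        private final String value;

        Chars(String value) { this.value = value; }

        @Override public int size() { return value.length(); }
        @Override public int length() { return value.length(); }
        @Override public char charAt(int index) { return value.charAt(index); }
        @Override public CharSequence subSequence(int start, int end) {
            return new Chars(value.substring(start, end));
        }
        @Override public String toString() { return value; }

        // Because the class declares isEmpty() itself, the class method takes
        // precedence over any default inherited from the superinterfaces, so
        // it does not matter whether CharSequence has a default isEmpty().
        @Override public boolean isEmpty() { return length() == 0; }
    }

    public static void main(String[] args) {
        System.out.println(new Chars("").isEmpty());    // prints true
        System.out.println(new Chars("abc").isEmpty()); // prints false
    }
}
```

Without the isEmpty() declaration in Chars, this file should compile on JDK 14 and fail to compile on JDK 15 with the "inherits unrelated defaults" error.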
❧
OK, that’s how the JVM behaves. Why does the JVM behave this way? That is, why does it throw an exception (really, an Error) if it detects multiple default methods among the superinterfaces? Couldn’t it, for example, remember what method was called on JDK 14 (the one on PrimitiveIterable), and then continue to call that method even on JDK 15?
The explanation requires understanding of some background about virtual methods. Consider a simple class hierarchy in a library:
class A {
}
class B extends A {
    void m() { }
}
class C extends B {
}
Suppose further that an application has this code:
void exampleCode(B b) {
    b.m();
}
What method is called? Clearly, this will invoke B::m. Now suppose that the library is modified as follows:
class A {
    void m() { } // method "promoted" from B
}
class B extends A {
}
class C extends B {
    void m() { } // a new overriding method
}
and the application is run again. Even though the code is invoking method m on B, we don’t know which method will actually be invoked. If the variable b is an instance of B, then A::m will be invoked. But if variable b is an instance of C, then C::m will be invoked.
The method that actually gets invoked depends on the class of the receiver object and the class hierarchy that has been loaded into this JVM. There is nothing written down anywhere that says that the application used to call B::m. In fact it would be a mistake for something to be written down that causes B::m to continue to be invoked. When an overriding method is added to class C, calls that used to end up at B::m should now be calling C::m. That’s what we want virtual method calls to do.
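Here is the modified library and the application call site put together as a runnable sketch. I've assumed that m() returns a String naming the method that ran, purely so the dispatch result can be observed.

```java
public class VirtualDispatch {
    static class A {
        String m() { return "A::m"; }   // method "promoted" from B
    }
    static class B extends A {
    }
    static class C extends B {
        @Override
        String m() { return "C::m"; }   // a new overriding method
    }

    // The application's call site: the static type of the receiver is B,
    // but the method that runs is chosen from the receiver's runtime class.
    static String exampleCode(B b) {
        return b.m();
    }

    public static void main(String[] args) {
        System.out.println(exampleCode(new B())); // prints A::m
        System.out.println(exampleCode(new C())); // prints C::m
    }
}
```

The same call site ends up in two different methods, and neither of them is B::m, which no longer exists.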
It’s similar with superinterfaces (though more complicated of course). The JVM needs to do a search at runtime to determine what method to call. If it finds two default methods, such as PrimitiveIterable::isEmpty and CharSequence::isEmpty, there is no information to tell the JVM that the code used to call PrimitiveIterable::isEmpty and that the CharSequence::isEmpty method was added in the most recent release. All the JVM knows is that it’s been asked to invoke a method, it found two, and it has no further information about which to call. Therefore, the only thing it can do is throw an error.
❧
Finally, could this problem have been avoided in the first place? The JDK team had done some analysis to determine whether adding CharSequence.isEmpty() would cause any incompatibilities. The analysis probably looked for no-arg methods with the same name but with a different return type. It might have looked for a method named isEmpty() with a non-public access level, another cause of incompatibilities. But these are both source incompatibilities. Or maybe the analysis missed Eclipse Collections entirely.
One thing that future analyses ought to look for is interfaces with a matching default method. That would have turned up PrimitiveIterable, which runs the risk of binary incompatibility. By itself this isn’t a problem, but it would cause a problem for any class that implements both interfaces. It turns out that CharAdapter (and related classes) do implement both, so that’s clearly a binary incompatibility.
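As a rough illustration of what such a search could look like, here is a reflection-based sketch. It is not the analysis the JDK team actually runs, and it ignores many edge cases (abstract redeclarations, generics, access levels). The names Text, Sized, Adapter, and FixedAdapter are hypothetical stand-ins: Text plays the role of the interface about to gain a default isEmpty(), and Sized plays the role of PrimitiveIterable.

```java
import java.lang.reflect.Method;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class DefaultConflictScan {
    interface Text { int length(); }                      // about to gain isEmpty()
    interface Sized {
        int size();
        default boolean isEmpty() { return size() == 0; }
    }
    static class Adapter implements Sized, Text {
        public int size() { return 0; }
        public int length() { return 0; }
    }
    static class FixedAdapter extends Adapter {
        @Override public boolean isEmpty() { return true; }
    }

    // All interfaces reachable from type via superclasses and superinterfaces.
    static Set<Class<?>> allInterfaces(Class<?> type) {
        Set<Class<?>> seen = new LinkedHashSet<>();
        Deque<Class<?>> work = new ArrayDeque<>();
        work.push(type);
        while (!work.isEmpty()) {
            Class<?> c = work.pop();
            if (c.getSuperclass() != null) work.push(c.getSuperclass());
            for (Class<?> i : c.getInterfaces()) {
                if (seen.add(i)) work.push(i);
            }
        }
        return seen;
    }

    // True if type or one of its superclasses declares a matching method.
    static boolean classDeclares(Class<?> type, String name, Class<?>... params) {
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            try {
                c.getDeclaredMethod(name, params);
                return true;
            } catch (NoSuchMethodException e) {
                // not declared here; keep walking up the superclass chain
            }
        }
        return false;
    }

    // Interfaces unrelated to target whose default method would conflict in
    // type if a default method name(params) were added to target.
    static List<Class<?>> conflictingInterfaces(
            Class<?> type, Class<?> target, String name, Class<?>... params) {
        List<Class<?>> result = new ArrayList<>();
        if (!target.isAssignableFrom(type) || classDeclares(type, name, params)) {
            return result; // type is unaffected, or a class method takes precedence
        }
        for (Class<?> i : allInterfaces(type)) {
            if (i.isAssignableFrom(target) || target.isAssignableFrom(i)) {
                continue; // target itself, or related to it by overriding
            }
            try {
                Method m = i.getDeclaredMethod(name, params);
                if (m.isDefault()) result.add(i);
            } catch (NoSuchMethodException e) {
                // this interface declares no matching method
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(conflictingInterfaces(Adapter.class, Text.class, "isEmpty"));
        System.out.println(conflictingInterfaces(FixedAdapter.class, Text.class, "isEmpty"));
    }
}
```

For Adapter the scan reports Sized as a conflicting interface; for FixedAdapter it reports nothing, because the class declares isEmpty() itself. A real analysis would presumably work on class files across a corpus of libraries rather than on loaded classes.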
Even if CharAdapter and friends didn’t exist (and even now after they’ve been fixed) there is still a possibility that further incompatibilities exist. Consider some application class that happens to implement both PrimitiveIterable and CharSequence. That class might work perfectly fine with Eclipse Collections 10.3.0 and JDK 14. But it will fail with JDK 15. The problem will persist even if the application upgrades to Eclipse Collections 10.4.0, since the incompatibility is with the application class, not with CharAdapter and friends. So, that application will have to be fixed, too.
Now that we’ve described the problem and the possibility of incompatibilities, does it mean that it was a mistake to have added CharSequence.isEmpty()? Not necessarily. Even if we had noticed the incompatibility in Eclipse Collections prior to the addition of the isEmpty() default method, we might have gone ahead with it anyway. The criterion isn’t to avoid incompatibility at all costs. Instead, it’s whether the value of adding the new default method outweighs the cost and risk of incompatibility. That said, it would have been better to have noticed the incompatibility earlier and discussed it before proceeding, instead of putting an external project like Eclipse Collections into the position of having to fix something in response to a change in the JDK.
In summary, adding a default method to an interface can result in source and binary incompatibilities. The possibility of the source incompatibility is perhaps obvious, but the binary incompatibility is quite subtle. Both of these have been possibilities since Java 8 was delivered in 2014. But to my knowledge this is the first time that the addition of a default method has resulted in a binary incompatibility with a real project (as opposed to a theoretical exercise or a toy program). It behooves us to do a more rigorous search for potentially conflicting methods the next time we decide to add a default method to an interface in the JDK.
This would require an addition to the binary format, of course, which would take up extra disk space, but could there not be a tag added to new functions that says “this was added in JDK 15” and then the JVM sees one method without that tag and one method with that tag and chooses the method without the tag, since it is older? A similar solution would exist after the tag is added, where if you have two methods that both have a tag, you choose the one with the older tag.
Actually, if the binaries include what version of the JDK they were compiled by, then it should be possible to have the JVM say “PrimitiveIterable::isEmpty” is from a binary compiled by JDK 14 and “CharSequence::isEmpty” is from a binary compiled by JDK 15 (actually from JDK 15 itself), so use the older one. You still have the source incompatibility that would need to be fixed, but you at least wouldn’t have the binary incompatibility.
More information could certainly be added to the class files to try to avoid problems like this. I think the fundamental difficulty is that virtual method dispatch is already quite complex. See the sections of the JLS and JVMS that I cited, for example. Adding version information would probably solve some problems, at the risk of creating more. It would also potentially exhibit apparently incomprehensible behaviors. For example, consider a class file compiled on an old JDK but run on a new JDK. If you took the same source file and compiled it on the new JDK and ran it on the new JDK, it might end up behaving differently from the old class file on the new JDK. We generally avoid having recompilation change behavior, and that’s the kind of thing I’d be worried about if versioning information were embedded into class files.
I’ve updated the article to replace mentions of the “PrimitiveIterator” interface, which doesn’t exist, with mentions of the correct interface, “PrimitiveIterable”.