
Sun → Oracle

And so it ends here. My small note of optimism from early last year is moot; the final chapter of Sun’s history has been completed, and it is filled with words like troubled, beleaguered, and embattled. Oracle closed its acquisition of Sun today. There seems to be a lot of sadness about this event. For example, see here and the nearly 1,000 comments that have been submitted since it was posted.

As I’ve been a Sun employee for over 23 years, one might think that I’d be sad about it. But I’m not. I suppose it is a little sad to see Sun disappear as an independent company. The acquisition needed to happen, and I’m glad it’s happened. How did Sun reach this point?

A lot has been made of Sun culture. You know, Scott McNealy’s “Kick butt and have fun.” The freewheeling engineering atmosphere. Is that still there? Sort of. There are still the long email flame wars, the gallows humor on our Skype chats, the occasional office prank, the gung-ho-take-no-prisoners-we-can-do-it attitude, etc. The courage to do something like inventing a new programming language or a new platform. Yes, that’s still there, at least a little bit.

But there are other aspects of Sun culture that have developed over the years that nobody seems to talk about.

  • Risk-aversion. A lot of internal processes have developed to reduce risk. Reducing risk also reduces reward, and it just slows everything down.
  • Decisions made by committee. I don’t think any individuals make any decisions: getting a decision seems to require the agreement of a roomful of people. You know how hard that can be. Sometimes it requires the agreement of people who aren’t even in the room. I’ve been in project meetings where nobody could make a decision — they all had to talk to their managers first.
  • Industry politics. Do I need to explain this one?
  • Internal politics. In a shrinking company, it often seems that more effort is expended defending one’s piece of shrinking turf than in moving projects forward.
  • Lack of innovation. What? Sun?! Yes. I’ve been on projects where developing innovative technology fell “below the line” in activities that would be staffed. What was above the line? (Hint: see “Industry politics.”)
  • Never canceling any projects. I’ve been on projects that have languished for a long time because nobody could push through the decision to cancel them. The one or two customers might get angry. A former colleague mentioned to me that he had been on a “zombie” project at Sun for four years. Four years!! Well, maybe only two years. But still, two years working on a zombie project is a long time. What a waste.

Now, I don’t claim these are the reasons for Sun’s downfall. I’ve only known whatever corner of the company I’ve worked in. Maybe there are other, bigger reasons. But I’ve seen all of the things I mention above, and it would be hard for me to believe that they haven’t contributed to Sun’s demise.

Am I sad or angry? Yes; at least, I was. That’s all in the past. All the wasted time, the zombie projects, the stupid decisions (or indecisions)… Now, I’m no longer a Sun employee, I’m an Oracle employee. I don’t feel any different.

Well, maybe I do. The nine months of limbo is over. Oracle has said that they (we?) will invest aggressively in Java and in JavaFX. (I work on the JavaFX project.) Today I interviewed someone from outside for an open position on our team. It was the first time I’ve interviewed someone in… in I don’t know how many years. There’s a sense of opportunity in the air.

In Jonathan Schwartz’s farewell message to Sun employees, he asked us to emotionally resign from Sun. Many took some offense at that. Me? To the extent that Sun embodies those bullet points I listed above, I’m outta there. Let’s go Oracle!

One door opens, another shuts behind
One sun sets and another sun she rises
– Richard Thompson

Samuelson, first edition

Oh cool, Greg Mankiw wrote about getting a copy of Samuelson’s first edition, which I mentioned previously. Mankiw had the good fortune to work with Samuelson on a regular basis and so he had Samuelson inscribe his copy. My copy isn’t inscribed by Samuelson, but it is inscribed by my Dad, which is pretty cool too.

Paul Samuelson, Nobel laureate and noted professor of economics, has died. See the NYT obituary and some comments from Paul Krugman.

Samuelson is famous for his economics textbook that has been used by college students for decades. The first edition was published in 1948. My father used this textbook in college, and a copy sits on my bookshelf. (Well, it sat there until this morning when I pulled it down to read it.) Who cares about an old economics textbook? According to another Krugman post, it’s still relevant. Pretty cool, I was able to find the passage Krugman cited and read it in print myself.

A note on the title page states, “The quality of the materials used in the manufacture of this book is governed by continued postwar shortages.” Despite this, the book isn’t in bad shape for being over 60 years old. The cloth cover is somewhat scuffed, the pages are a bit yellowed, but they are quite readable and don’t feel like they’re going to fall out. I’ve seen many younger books in much worse condition.

Somehow my father also ended up with a copy of the sixth edition. It was on the bookshelf right next to the first edition. This was published in the 1960s. Not sure where this might have come from; nobody in the family was in college during those years. I also have a copy of the 12th edition (co-authored with Nordhaus) somewhere. I used it when I was in college in the 1980s. The most recent edition was the 18th, published in 2004.

Paul Samuelson, R.I.P.

In my earlier post on this topic I hinted that we had found a resolution to the issue surrounding the warning message. I hinted further in some of my replies to comments, and I even left it as sort of a cliffhanger as to what the resolution was. So, here’s the resolution.

We’ve decided that when a node is added to a group, that node is automatically removed from the group that previously owned it, if any. (Let’s call this the “auto-remove” feature.) We’ve also decided to turn off the warning message by default, but to have it be enabled optionally, possibly via a system property, for debugging purposes. Finally, we’ve relaxed the enforcement of some scene graph invariants in cases where the group’s content sequence is bound.
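To make the auto-remove rule concrete, here is a minimal sketch of it in Python rather than JavaFX Script. The Node and Group classes and the warn_on_auto_remove flag are hypothetical stand-ins for illustration, not the actual runtime classes or the actual system property.

```python
import sys


class Node:
    """Hypothetical stand-in for a scene graph node; it has at most one parent."""

    def __init__(self, name):
        self.name = name
        self.parent = None


class Group(Node):
    """Hypothetical stand-in for a group that owns an ordered list of children."""

    warn_on_auto_remove = False  # off by default; a debugging aid when enabled

    def __init__(self, name):
        super().__init__(name)
        self.content = []

    def add(self, node):
        old = node.parent
        if old is not None:
            if Group.warn_on_auto_remove:
                print(f"WARNING: {node.name} moved out of {old.name}", file=sys.stderr)
            # Auto-remove: detach the node from the group that currently owns it.
            old.content.remove(node)
        self.content.append(node)
        node.parent = self


a, b = Group("a"), Group("b")
n = Node("n")
a.add(n)
b.add(n)  # n silently leaves a and joins b
print([c.name for c in a.content], [c.name for c in b.content])  # [] ['n']
```

With the flag left at its default the move is silent; setting Group.warn_on_auto_remove = True in this sketch prints a message each time a node changes owners.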

What’s the big deal? Well, there were a bunch of things pulling us in different directions. During the development of 1.2, we added code to prevent nodes from appearing more than once in the scene graph or creating cycles in the scene graph. One case in particular caused us a lot of trouble: adding a node to a group while the node was already a member of another group. We fully intended to disallow this case, and require that code remove a node from its old group before adding it to the new one. Unfortunately this caused a lot of our internal code to break. In versions 1.1 and prior, auto-remove was the specified behavior, and a surprising amount of code relied on this. Given the number of cases we ran across internally, we were sure that this would break a lot of external code. For this reason we decided to relax the restriction for this particular case, continue the auto-remove behavior temporarily, and issue a warning message instead. In a subsequent release, we were going to change the warning to an error and to remove auto-remove behavior.

During the development of our current release, we kept running into this issue. A couple of us wrote code that we thought was reasonable, yet it surprised us when the warning message came out! We had a few hallway conversations from time to time, but a clear-cut answer never emerged. Finally, we realized that we had to get the interested parties in a room and have a knock-down, drag-out meeting to resolve the issue. And so on October 7, 2009, Amy Fowler, Kevin Rushforth, Richard Bair, and I got into a conference room to decide the issue. Three hours later — with no breaks! — we had decided. Actually, it was a great meeting, without a lot of conflict. There were just a lot of issues to cover. Each of us came into the meeting with our initial opinions, but the issues were so close that I think each one of us switched sides at least once during the meeting.

Obviously I can’t reproduce the entire discussion here, but the gist of the arguments went something like this:

  1. It’s simpler, more efficient, and more consistent to disallow auto-remove.
  2. On the other hand, 1.1 allowed auto-remove, so this is incompatible.
  3. On the third hand, we expect people to set up scene graphs at initialization time and modify nodes in-place, instead of doing scene graph surgery.
  4. On the gripping hand, some applications really do need to move nodes around in the scene graph.
  5. Well, that’s not too difficult, just remove the node from the old group first.
  6. Sometimes (especially with bind) it’s difficult or even impossible to remove the node from the old group first.
  7. But most of the cases we’ve seen with bind are actually poor coding practices that we want to have emit a warning message.

And so on, back and forth, and around in circles.

Let’s start off with the topic of moving nodes around within the scene graph. What’s the problem? Suppose you have a node n that you wanted to insert into group g. If you do this:

insert n into g.content;

This might generate the warning message if n were already a member of another Group. To avoid this, you’d have to do:

// Detach n from its current Group, if any, before inserting it into g.
if (n.parent instanceof Group) {
    delete n from (n.parent as Group).content;
}
insert n into g.content;

This is a bit subtle but overall pretty straightforward. First we have to test that n.parent is a Group. (It might be a CustomNode instead.) Note also that if n has no parent, n.parent will be null and the instanceof test will fail. If the test succeeds, we can cast n.parent to a Group and then remove n from the content variable. This is a bit inconvenient but not too bad. We could wrap this up in a nice function and use it in a bunch of places. We even considered adding a utility function to the scene graph to handle this.

Things start to get hairy, though, when you add bind into the mix. What are people doing with bind that makes it difficult or impossible to remove the node from the old group first? Which uses of bind are “poor coding practices” and which are legitimate?

In order to answer these questions, we’ll need to review the semantics of bind. Consider the following:

var p = bind q + r;

If either q or r changes, the expression is recomputed and assigned to p. The rule is that only the subtree of the expression affected by the variable change is re-evaluated. So if q changes, r isn’t recomputed, and instead its saved value is added to q to get the result that’s then assigned to p. This is hard to see if q and r are simple variables, so let’s make the expression a bit more complicated:

var p = bind f(q) + g(r);

where f() and g() are functions. In this expression, if q changes, f() is called with the new value of q, as one would expect. However, the function g() is not called again. Instead, the saved value of the previous call to g(r) is used, added to the new value of f(q), giving the result assigned to p. You can see this by putting println() statements into f() and g() to see when they’re called. Try it!
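If you don’t have a JavaFX compiler handy, the same experiment can be modeled in Python. This is only a toy model of the rule described above, not how the compiler implements bind: the BoundSum class, its set_q and set_r methods, and the bodies of f and g are all made up for illustration.

```python
calls = {"f": 0, "g": 0}


def f(x):
    calls["f"] += 1
    return x * 2


def g(x):
    calls["g"] += 1
    return x + 100


class BoundSum:
    """Toy model of `bind f(q) + g(r)`: the value of each operand is saved."""

    def __init__(self, q, r):
        self._fq = f(q)
        self._gr = g(r)
        self.value = self._fq + self._gr

    def set_q(self, q):
        # Only the f(q) subtree is re-evaluated; g(r) comes from the saved value.
        self._fq = f(q)
        self.value = self._fq + self._gr

    def set_r(self, r):
        # Symmetric case: only g(r) is re-evaluated.
        self._gr = g(r)
        self.value = self._fq + self._gr


p = BoundSum(q=1, r=2)  # f and g each run once
p.set_q(5)              # f runs again, g does not
print(p.value, calls)   # 112 {'f': 2, 'g': 1}
```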

Now let’s throw an object literal into the expression:

var xval:Number = ...;
var yval:Number = ...;
var p = bind Point2D { x: f(xval) y: g(yval) };

What does this do? Initially it calls f(xval) and g(yval), then creates a new Point2D instance and initializes its x and y variables to the values obtained by calling the functions. Now suppose xval changes. Naturally f(xval) has to be called again; as before, the saved value of g(yval) is used and function g() isn’t called again. These values are then used to construct a new instance of a Point2D, which is then assigned to p. What happens to the old instance? Well, it still exists (probably) but if it’s no longer referenced, it’ll eventually get garbage collected.

The important point here is that an object literal is an expression that creates a new instance of an object, not unlike calling a constructor. It’s kind of similar to something like this:

var p = new Point2D(f(xval), g(yval));

except that Point2D, being a JavaFX class, doesn’t actually have a (Number, Number) constructor.

Usually we don’t want to create new instances of objects when the bind-expression is re-evaluated. In some sense we might prefer to do something like this:

var p = Point2D { x: bind f(xval) y: bind g(yval) };

Instead of creating a new Point2D object each time xval or yval changes, it would create one object and mutate its variables in-place. This doesn’t work though, since the x and y variables of Point2D are public-init. To bind to a variable initializer in an object literal, that variable must be declared public so that code outside the class can modify it. So if you want to bind something that has type Point2D, you always end up creating new instances.

Most scene graph objects have public variables that can be bound. Consider this:

var rect = Rectangle {
    x: bind xval
    y: bind yval
    width: bind wval
    height: bind hval
};

This works quite nicely. One Rectangle instance is created and assigned to rect. If any of xval, yval, wval, or hval change, the variables in that single Rectangle instance are mutated, and the effect is that the Rectangle moves or changes size in-place in the scene graph. In turn, if you hook these values up to a Timeline or to one of the Transition classes, that’s how you get animations.

Now instead of writing all those binds, couldn’t we just do this?

var rect = bind Rectangle {
    x: xval
    y: yval
    width: wval
    height: hval
};

We could write this, and it would work, in that we’d get a Rectangle animating in the proper way. But it’s enormously wasteful. Each time any of xval, yval, wval, or hval changes, a new Rectangle instance is created and the old one is thrown away.

It gets worse.

var g = Group {
    content: bind [
        Rectangle {
            x: xval
            y: yval
            width: wval
            height: hval
        }
    ]
};

Now, when any of xval, yval, wval, or hval changes, a new Rectangle is created. Then, because the content sequence of the Group is bound, the old Rectangle is removed from the scene graph and the new Rectangle is inserted in its place. It’s quite common for an operation to change both the width and height of the Rectangle, by changing wval and hval. Let’s say wval changes first. This creates a new Rectangle and replaces the old Rectangle. Then, hval changes, and another new Rectangle is created and replaces the one that had just been created. Furthermore, replacing things in the scene graph is more expensive than moving pointers around. Additional calculations occur, such as recomputing bounds, invalidating and caching transformations, etc. Some of this can be done lazily (indeed, in the next release, more of it will be). But it’s undeniable that creating new objects, generating lots of garbage, and doing scene graph surgery because of where a bind is placed, is very expensive. It would be a lot more efficient to rewrite the above code like so:

var g = Group {
    content: [
        Rectangle {
            x: bind xval
            y: bind yval
            width: bind wval
            height: bind hval
        }
    ]
};

This mutates the Rectangle in-place, doesn’t generate any garbage, and doesn’t do any scene graph surgery. We’ve improved the efficiency of some of our code quite significantly just by moving bind around.
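The difference in object churn can be counted with a small Python model. It is a sketch under simplifying assumptions: the Rect class and the two resize functions are hypothetical, and the model ignores the scene graph work (bounds, transforms) that makes the real cost higher.

```python
class Rect:
    """Hypothetical rectangle that counts how many instances get created."""

    created = 0

    def __init__(self, width, height):
        Rect.created += 1
        self.width = width
        self.height = height


def resize_by_rebuilding(width, height):
    # Models a bind placed outside the Rectangle literal:
    # each variable change builds a replacement object.
    rect = Rect(10, 10)
    rect = Rect(width, rect.height)  # wval changes
    rect = Rect(rect.width, height)  # then hval changes
    return rect


def resize_in_place(width, height):
    # Models binds placed on the individual variables:
    # one object whose variables are updated.
    rect = Rect(10, 10)
    rect.width = width
    rect.height = height
    return rect


Rect.created = 0
resize_by_rebuilding(20, 30)
rebuilt = Rect.created

Rect.created = 0
resize_in_place(20, 30)
in_place = Rect.created

print(rebuilt, in_place)  # 3 1
```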

Now, what does this have to do with the auto-remove behavior that we’ve been debating? Let’s take a look at this variation:

var g = bind Group {
    rotate: angle
    content: [
        Rectangle {
            x: xval
            y: yval
            width: wval
            height: hval
        }
    ]
};

Note that the bind is outside the group. This might look convenient, since the scene graph will automatically be updated if any of angle, xval, yval, wval, or hval is modified. But look carefully: suppose that the value of angle changes. Since it’s inside the Group object literal, a new Group object is created and its rotate variable is set to the new value of angle. There’s no need to create another Rectangle object, though, since none of its values have changed. Instead, the same Rectangle object is placed into the content variable of the new Group, which is then assigned to g. The old Group is unreferenced and will eventually be garbage collected. Now, who removes the Rectangle from the old Group? Aha!

This is where the auto-remove issue comes up. In the above code fragment, changing the value of angle causes the same Rectangle instance to be placed into the new Group, but it’s not removed from the old Group. As I mentioned previously, in release 1.1 auto-remove was the specified behavior, so placing the Rectangle into the new Group would automatically and silently remove it from the old Group. Furthermore, there’s really no way for application code to get in there and remove the Rectangle from the old Group first; the bind processing pretty much happens all at once. This would seem to be an argument in favor of auto-remove.
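Here is a rough Python model of that situation. Everything in it is hypothetical: the Group.strict switch stands in for the planned "refuse the request" behavior, and evaluate() stands in for one re-evaluation of the bound Group literal. It only illustrates why the re-evaluation has nowhere to put a manual remove step.

```python
class Node:
    def __init__(self):
        self.parent = None


class Group(Node):
    strict = False  # hypothetical switch: True models refusing the request

    def __init__(self, rotate, content):
        super().__init__()
        self.rotate = rotate
        self.content = []
        for child in content:
            if child.parent is not None:
                if Group.strict:
                    raise ValueError("node already belongs to another group")
                child.parent.content.remove(child)  # auto-remove
            child.parent = self
            self.content.append(child)


def evaluate(angle, rect):
    # Models one evaluation of `bind Group { rotate: angle content: [rect] }`.
    return Group(rotate=angle, content=[rect])


rect = Node()
g = evaluate(0, rect)
old = g
g = evaluate(45, rect)  # angle changed: new Group, same rect
print(len(old.content), len(g.content))  # 0 1

Group.strict = True
try:
    g = evaluate(90, rect)  # no chance to detach rect from the old Group first
    refused = False
except ValueError:
    refused = True
print(refused)  # True
```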

But wait, this code is pretty wasteful. It turns out that the auto-remove warning message was actually pretty useful, since it pointed out a bunch of places in our code where we were doing stuff like this: generating useless garbage and performing wasteful scene graph surgery. We ended up rewriting the code along these lines:

var g = Group {
    rotate: bind angle
    content: [
        Rectangle {
            x: bind xval
            y: bind yval
            width: bind wval
            height: bind hval
        }
    ]
};

True, we had to write bind five times. But this made things much more efficient, since it didn’t create any excess objects and it avoided doing any scene graph surgery. It also got rid of the warning message! So that convinced us to leave the warning message in, to get people to fix their bad code, and eventually to turn the warning into an error and effectively disallow auto-remove.

Or did it?

I’m attending a couple days of Oracle OpenWorld 2009. This won’t be quite live-blogging, but I’ll be updating this entry with more notes and observations as I get a chance to add them. Update: Sections are in reverse chronological order, but within sections the notes are in forward order. I’ve also added details on the Fusion application suite demo and on the Treasure Island party.

Customer Appreciation Event (evening Wed Oct 14)

As a speaker I received a full conference pass, including a pass for the big party on Treasure Island. This was quite a privilege, as apparently Oracle employees, even those who are speakers, don’t get to go to this party.

[7:30pm] Hundreds of people are lined up to take chartered buses to Treasure Island. There’s apparently no parking there, so buses and taxis are the only way for party guests to get on and off the island. We arrive and the place is big. Staggeringly big. Mind-bogglingly big. It’s like they’ve set up a county fair here for just one evening. There are two big tents each with several buffets, rows of booths with more food, more booths with an open bar, carnival midway games (ball tossing, whack-a-mole, darts, etc.) with stuffed animals as giveaways, and carnival rides like a Ferris wheel, a drop tower, swing carousel, etc. Oh, and then there was the music.

[9:15pm] Aerosmith arrives onstage at the outdoor theater. There are probably a couple thousand people standing in an open area in front of the stage, and there are a couple thousand more in bleacher seats around them. The bleachers are pretty tall, maybe 40 rows high. They even have luxury boxes at the top, for the conference sponsors of course.

There was more music than talk, but here are some choice quotes from banter with the audience:

We almost didn’t make it tonight, not because of the [B.S.] you heard about in the press, but because of some Mac event. But when we had to choose between Apples and Oracles we knew we made the right choice. — Steven Tyler

What’s the difference between Windows and viruses? Viruses keep getting better! — Steven Tyler [I think that joke would have worked better at the Apple party.]

This is the biggest frat party we’ve ever played. — Joe Perry

Not being a huge Aerosmith fan, I didn’t recognize several of the songs they played. But they did play their big hits. Here’s a partial setlist:

  • Dream On
  • Walkin’ the Dog
  • Love In An Elevator
  • Livin’ On The Edge
  • Walk This Way
  • Sweet Emotion (it looked like Perry played one note on a theremin)

– encore –

  • Joe Perry live duel against his Guitar Hero avatar
  • Train Kept A-Rollin’

[10:50pm] Aerosmith’s set ended, so I wandered over to where Roger Daltrey was playing. His stage was set up in a large indoor tent area. It wasn’t quite as large as the Aerosmith stage but there was probably space for a couple thousand people. Unfortunately the sets overlapped, so he was already in the middle of his set when I arrived. I did hear a couple songs: Ring Of Fire (Johnny Cash style) and Won’t Get Fooled Again.

Other bands up were the Wailers and Three Dog Night. It was getting pretty late at this point though so I decided to skip them and head home.

Keynote (afternoon Wed Oct 14)

[2:30pm] The Moscone North foyer (bottom of the escalators) is incredibly crowded. The Oracle logo is everywhere. Banners, carpets, every surface you can think of. It’s hard to believe this is the same place where JavaOne happens. It looks so different.

The shoe-shine stand is open. Hm, they didn’t have the shoe-shine stand at JavaOne. I wonder why? :-)

[2:50pm] Charles Phillips, Oracle President. Roger Daltrey stepped on stage and said a few words; promo for tonight’s “customer appreciation” concert.

[3:00pm] Kris Gopalakrishnan (founder InfoSys, a platinum sponsor) spoke on IT innovation in industry. Gap between intent and action in IT innovation: 78% of banks think innovation is important, but only 37% have an innovation plan. IT is about interconnection. We no longer should think about a value chain, but a value web. “No man is an island” — similarly, no enterprise is an island.

[3:35pm] Larry Ellison takes the stage.

  • Status update: Oracle Enterprise Linux and Virtual Machine. Very pleased with uptake. 65% of Linux installations running Oracle RDBMS run Oracle Enterprise Linux. Smaller percentages for Red Hat, SUSE, and others.
  • Exadata 2: Sun/Oracle Announcement. This is a Linux/Intel box. Different from the Sunday announcement, which set the TPC-C records, which is a SPARC Enterprise T5440 machine running Oracle Database 11g on Solaris. (IBM is challenging Oracle’s claims of 16x their performance; they say it’s really only 6x the IBM machine’s performance. “They might be right. IBM also forgot to point out that their machine consumes 6x the energy of ours.”) Exadata 2 shows really impressive numbers. For example, a single rack Exadata 2 will do 1 million random I/O operations per second. Two racks do 2m, etc. Exadata 2 is fault tolerant, while IBM is not; Exadata 2 costs about a quarter of the competing IBM product; Exadata 2 is modularly expandable, whereas IBM’s need to be replaced entirely to be upgraded. “The fastest business computer that has ever been built.”

[4:00pm] Special surprise guest: Arnold Schwarzenegger! “This conference is about pumping you up.” California is the world’s technology leader. Technology can help reduce errors in medical care; can help fight global warming; biotech and stem cell research can help Alzheimer’s sufferers; can grow algae for ship fuel; electric cars (Tesla); improved efficiency in the power grid (Smart Grid). Acknowledges two great California technology companies, two great success stories: Oracle and Sun, Larry Ellison and Scott McNealy [applause]. Employers of 16,000 people in CA and 150,000 people worldwide. Wishes the combined companies great success. California’s state IT infrastructure also improving. GIS being used to help firefighters’ helicopters drop fire retardant more accurately even if the area is obscured by smoke. People’s lives and homes at stake. Confident in the future: we face enormous challenges that can be overcome with technological innovation. Global warming negotiation going on in Copenhagen. We hope this goes well but it’s a political negotiation. The real work is done here, in technology. “When this conference is over, don’t go home. Stay in California and spend money! We need the revenue!”

[4:30pm] Larry Ellison returns to the stage.

  • New Product Support System. Proactive problem prevention. Keeps track of configurations; can alert customers to potential problems experienced by other customers with similar configurations. Unification of Enterprise Manager and MyOracle service. A couple demos of service management by Richard Sarwal, who has returned to Oracle from VMware.
  • Fusion Applications. Customers have made a huge investment in Siebel, JD Edwards, PeopleSoft, etc. Committed to support these product lines for a decade. [Mild applause.] Oracle has $3bn R&D budget. Can maintain software you’re running today, but also develop new software you can migrate to tomorrow, next year, or in ten years. Fusion is brand new. SOA. Easily connected to existing suites. Fusion coexists with existing applications, can replace individual apps as needed or augment with new ones. Not developed in isolation; developed in close collaboration with customers. Has a “modern UI.” Enormous project: 6000 tables, 20,000 views. Fusion v1 code complete, in test with customers, will be delivered next year. Only suite built on standards-based middleware. All standards-based, all Java-based Fusion middleware. No custom components. UI is business-intelligence driven; “exception-based”. Steve Miranda, Chris Mayo: demo of Fusion. The demo included fixing a problem with an Exadata V2 order from Stark Industries. “We really need to fix this problem, because you know what happens when Tony Stark gets mad.” I don’t think anybody got the joke though.

[5:15pm] Session ends. I now have some dead time until the evening event. This is totally not a geek conference. There are bean-bag chairs at the foot of the escalators in Moscone North (kind of like JavaOne) but they’re only half occupied. Power and network are easy to come by. There are still guys getting their shoes shined.

Oracle Bloggers Meetup (evening Tue Oct 13)

Pythian sponsored a bloggers meetup at a bar in the Metreon this evening. Thanks to Alex Gorbachev for arranging it. Richard and I both went. This was pretty cool. We met a few Oracle folks there. There were also some Sun folks there (including blogger extraordinaire Tim Bray).

It’s not like this was a substantial session or anything but some of the conversations got me thinking about Oracle’s strategy in buying Sun. One perspective is that Oracle is an Enterprise Software company. After all, that’s what they do today, and they have a $23bn business doing it. From this standpoint Oracle would mine out Sun’s software assets (primarily Java) and jettison the rest. But that rests on the assumption that Oracle is simply going to continue to be an Enterprise Software company. That’s not necessarily the case. So, where will Oracle go next?

Some clues are emerging at OpenWorld. Sun hardware has played quite a strong role here. First there were the Sunday announcements of the TPC-C record on Sun’s T5440. Then there’s the Exadata 2 announcement, which is based on Sun hardware. (I’m writing this in retrospect, but Ellison’s Wednesday keynote segment on the Exadata 2 was all about hitting IBM repeatedly and very, very hard.) So maybe Oracle is no longer going to be an enterprise software company, but instead an enterprise systems company.

What does this mean for JavaFX? Of course, nobody knows for sure. But any enterprise systems company is going to have to figure out what to do about the client side. Will Oracle settle for open technologies such as Ajax, or will they rely on other companies’ proprietary technologies like Flash and Silverlight? When Ellison spoke at JavaOne he talked about how important Java and JavaFX are. So maybe that’s a hint at the answer.

Our JavaFX Talk at Oracle Develop (afternoon Tue Oct 13)

Richard Bair and I gave our introduction to JavaFX talk in the “Develop” stream of Oracle OpenWorld. The talk went pretty well. We deviated from the slides a lot though… Rich spent a lot of time typing live code into NetBeans and demonstrating immediate results. This actually went quite well. I think the audience (which consisted entirely of Java developers) appreciated the live coding exercise much more than slideware.

I did my part with a live demo of the JavaFX Production Suite, exporting artwork from Adobe Illustrator using the Suite, and bringing it into a simple JavaFX app in NetBeans and displaying and rotating it.

We were in a fairly small room (capacity 100) and there were about 25 attendees. This seemed small to us compared to JavaOne, where a small room fits 500. But I think this was typical for the Develop talks. While OpenWorld is a huge conference (I think I heard 37,000 attendees), it’s a business conference, not a developer conference. Oracle Develop was relegated to the “ghetto” of the Hilton San Francisco. It was crowded, but the space was much smaller than Moscone. I think Oracle Develop might have had only on the order of a thousand attendees. Compared to JavaOne, a developer conference that usually attracts over 10,000 and has had up to 25,000, Oracle Develop is quite small.

Having a small audience had its advantages. There’s a lot less pressure, it’s more interactive, and while not quite intimate, it was easier to create personal connections with people in the audience. There were some good questions and a couple attendees hung around with us afterward and had in-depth conversations. I think they learned a lot from our talk, and in turn we learned a lot from talking to them. In that respect the talk was a success.

Have you ever run across an ugly and irritating warning message that looks like this?

WARNING * WARNING * WARNING * WARNING * WARNING
An attempt has been made to add node to a new group without
first removing it from its current group. See the class
documentation for javafx.scene.Node for further information.
This request will be granted temporarily but it will
be refused in the future. Please change your code now.

Maybe you were writing some code that seemed innocuous. Or, you had some code that seemed to work just fine under 1.1 and all of a sudden it emits this warning message under 1.2. What happened?

Since I’m the guy who added this warning message, I guess I need to explain it.

First, a bit of history. Even though the JavaFX graphics runtime has what we call a scene graph, it’s really a scene tree. Any node has at most one parent. It would be really cool if you could have a node or even a subtree of nodes in more than one place in the tree (I mean, graph). You could construct some complicated structure of nodes and then have it appear in multiple places in the graph, stamped out several times like a stencil. What’s more, you could make a change once to this subtree and then have the change be visible simultaneously in a bunch of different places. Sweet! Alas, this doesn’t actually work.

In principle one could build a graphics system this way, but it turns out that it’s hard to make such a system go fast. Things are potentially cached at each level of the tree in order to avoid recomputation. If a node could appear twice in the graph, for example, its coordinate system couldn’t be cached. It would in fact have two coordinate systems — which one is used might depend upon the path one takes from the root to this node. Input events also have to be rethought. If you get a mouse-clicked event on a node that occurs several times in the scene graph, in which one did it occur? All of them simultaneously?
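The coordinate-system point can be illustrated with a small Python model. It is a sketch with made-up names (Node, scene_x, scene_x_along) and a one-dimensional translation in place of a real transform; cache invalidation is left out.

```python
class Node:
    """Hypothetical node with an x translation relative to its parent."""

    def __init__(self, tx, parent=None):
        self.tx = tx
        self.parent = parent
        self._cached_x = None

    def scene_x(self):
        # With a single parent the walk to the root is unambiguous,
        # so the result can be cached on the node itself.
        if self._cached_x is None:
            base = self.parent.scene_x() if self.parent else 0
            self._cached_x = base + self.tx
        return self._cached_x


def scene_x_along(path):
    """Sum the translations along one particular root-to-node path."""
    return sum(node.tx for node in path)


root = Node(0)
left = Node(10, root)
right = Node(200, root)
leaf = Node(5, left)
print(leaf.scene_x())  # 15

# If one node could hang under both `left` and `right`, its position
# would depend on the path taken, so a single cached value cannot work.
shared = Node(5)
print(scene_x_along([root, left, shared]), scene_x_along([root, right, shared]))  # 15 205
```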

It might be possible to rearrange the implementation to allow multiple parents, and to redefine the semantics of mouse input, and a whole bunch of other stuff, but the rule stands: a node can occur at most once in the scene graph. Attempting to use a node in more than one place is an error, or at least, there has to be code in the scene graph to deal with the possibility of an application trying to put a node in more than one place.

When I first approached JavaFX during 1.0 development, I recall thinking some thoughts similar to what I wrote above. What happens when I put a node in several places in the scene graph? Nothing terribly bad happened. On the other hand, things didn’t behave as I expected either.

It turns out that one case was well-defined. If a node was in a group, and you added that node to another group, it would automatically be removed from the first group. But there were a bunch of other cases that weren’t defined. What if you returned a node from CustomNode’s create() function, and that node already belonged to another group? Or had already been returned by another CustomNode’s create() function? Or what if you tried to use a node as a clip of another node and also put it into a group?
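The one well-defined case, moving a node from one group to another, can be sketched in plain Java. This is only an illustration of the single-parent rule: SketchNode and SketchGroup are hypothetical stand-ins I made up for the example, not the JavaFX classes, and the real runtime does a good deal more.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration; these are not the JavaFX classes.
class SketchNode {
    final String name;
    SketchGroup parent; // at most one parent, so the structure is a tree

    SketchNode(String name) {
        this.name = name;
    }
}

class SketchGroup extends SketchNode {
    final List<SketchNode> content = new ArrayList<>();

    SketchGroup(String name) {
        super(name);
    }

    // Adding a node that already has a parent first removes it from that parent.
    void add(SketchNode child) {
        if (child.parent != null) {
            child.parent.content.remove(child);
        }
        child.parent = this;
        content.add(child);
    }
}

class SingleParentDemo {
    public static void main(String[] args) {
        SketchGroup g1 = new SketchGroup("g1");
        SketchGroup g2 = new SketchGroup("g2");
        SketchNode n = new SketchNode("n");
        g1.add(n);
        g2.add(n); // n moves; it never appears in both groups
        System.out.println("g1 has " + g1.content.size() + ", g2 has " + g2.content.size());
    }
}
```

Because each node records exactly one parent, a node can never be in two groups at once; the cost is that every add has to check for and undo an earlier placement.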

In JavaFX 1.1 and prior, most of these cases weren’t defined. Even though they were illegal, there was little code to check for or enforce these cases. It was even possible to introduce a circularity into the scene graph! If you did this, the system would go into infinite recursion doing painting or calculating bounds, until it hit a stack overflow. This is, in a word, bad.

During 1.2 development we decided to tackle this issue. There are three different places where nodes occur that affect the scene graph structure:

  1. in a Group’s content sequence
  2. returned from CustomNode.create()
  3. in the clip variable of a node

In each code path that alters one of these structural relationships, we added code to ensure that (a) the node wasn’t already being used elsewhere, and (b) the change didn’t introduce a circularity into the scene graph. For example, try this out:

var g1 = Group { };
var g2 = Group { content: g1 };
g1.clip = g2;

This introduces a circularity, since g2’s content includes g1, but g1’s clip is g2. This code will generate a nice IllegalArgumentException. Another example occurs with CustomNode. Consider the following code:

var r = Rectangle { width: 10 height: 10 };
class MyCustomNode extends CustomNode {
    override function create() { r }
}
MyCustomNode { }
MyCustomNode { }

This is also an error, since it creates two CustomNodes that are attempting to have the same node (the rectangle) as their child.
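For readers who think in Java, here is a rough sketch of what checks (a) and (b) might look like. Everything in it is an assumption made for illustration: CheckedNode, its owner field, and the claim() helper are invented names, content and clip are folded into one class for brevity, and using IllegalArgumentException for the "already in use" case is my choice for the sketch. It is not the code in the JavaFX runtime.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names and structure are not taken from the JavaFX runtime.
class CheckedNode {
    final String name;
    CheckedNode owner; // the one node that uses this node structurally
    CheckedNode clip;  // optional clip node
    final List<CheckedNode> content = new ArrayList<>();

    CheckedNode(String name) {
        this.name = name;
    }

    void addContent(CheckedNode child) {
        claim(child);
        content.add(child);
    }

    void setClip(CheckedNode newClip) {
        claim(newClip);
        clip = newClip;
    }

    // Shared by every code path that changes structure. Removal paths are omitted.
    private void claim(CheckedNode child) {
        // (a) the node must not already be in use elsewhere
        if (child.owner != null) {
            throw new IllegalArgumentException(
                child.name + " is already in use by " + child.owner.name);
        }
        // (b) the node must not be this node or one of its ancestors
        for (CheckedNode n = this; n != null; n = n.owner) {
            if (n == child) {
                throw new IllegalArgumentException("cycle through " + child.name);
            }
        }
        child.owner = this;
    }
}

class InvariantDemo {
    public static void main(String[] args) {
        CheckedNode g1 = new CheckedNode("g1");
        CheckedNode g2 = new CheckedNode("g2");
        g2.addContent(g1);
        try {
            g1.setClip(g2); // g2 is an ancestor of g1, so this closes a loop
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Walking up the owner chain is cheap because the structure is a tree: there is only one path from any node to the root.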

When we added the enforcement code, we were surprised that it uncovered a few bugs and quite a number of questionable coding practices in our code, both in our runtime library and in our samples. In one case a node from the scene graph was also being used as a clip. This was illegal and didn’t actually work, but nobody had noticed up to that point. As for questionable coding practices, the enforcement turned up quite a number of cases where a scene graph was being constructed with some initial structure, and some code later on would rearrange the nodes into a different structure for no good reason. This caused the scene graph machinery to do a lot of extra work. The fix was to rewrite the code to create the scene graph in the desired structure, avoiding the rearranging and any error messages that the old code had caused.

Enforcing these scene graph invariants was mostly uncontroversial, except for one case: what should happen if you added a node to a group, and that node was already a child of another group? There were basically two camps, which I’ll call the purist camp and the pragmatic camp. The purist camp said, you should always create the scene graph in the right structure in the first place, and rearranging it is usually an error or a performance problem. The pragmatic camp said, the 1.1 specification allowed moving nodes from group to group, and there’s a lot of code out there that does this, so we ought to allow it. (I’m not actually sure which camp I was in. I probably switched camps several times. In fact, I think everybody involved in the discussion switched camps at least once.)

The time came to ship 1.2, and the purist and pragmatic camps hadn’t reached a resolution, so we settled on a compromise. Instead of allowing or disallowing moving a node from one group to another, we decided to allow the behavior temporarily and also add a warning that the behavior would change in the future. That’s where the warning message came from, and that’s where things stand in 1.2.

By the way, if you do run across the need to move a node from one group to another, you can do it without generating the warning. You just have to add some code that “unparents” the node from its old group before adding it to the new one. Consider this code for example:

var n:Node = ...;
var g1 = Group { content: n };
// **
var g2 = Group { content: n };

As it stands, this code will generate the warning message, though n will end up as a child of g2 as the code intended. To avoid the warning message, replace the line marked ** with the following code:

delete n from (n.parent as Group).content;

(This is a bit inconvenient. The type of n.parent is Parent, so you have to cast it to Group first in order to get access to its content variable.)

The story doesn’t end here. There are some cases where it’s impossible to delete a node from its old Group before adding it to a new Group, and some of these cases are so useful that we want to preserve them. We also like the fact that the warning message flushes out bad code. And the argument rages on. I think we’ve come to a resolution, but that story will have to wait for another time.

Oracle OpenWorld 2009

I’m speaking at Oracle OpenWorld 2009 on Tuesday Oct 13th. This is something new. As part of the still-pending acquisition of Sun by Oracle, the companies have moved to partner more closely in a lot of areas. As part of this effort a handful of Sun talks were marshaled into Oracle OpenWorld kind-of at the last minute. I work on JavaFX, and that’s one of the technologies Oracle is interested in, so I was “invited” to give a technical session there. Pretty cool actually. It wasn’t really at the last minute; we heard about this a couple weeks back. But we bypassed all the usual submission and acceptance processes. This will be quite an adventure, as it’s been a long time since I’ve spoken at a conference other than JavaOne.

Richard Bair is my co-speaker and the talk is: Introduction to JavaFX: Amazing RIA Capabilities for Developers. (Sorry about the title; we didn’t write it, but if we had written the title, I’m sure it would have been much more boring.) The session ID is S312809 and it’s on Tuesday Oct 13th at 4pm in the Hilton Golden Gate 8 room. It’ll be a whirlwind 60-minute tour through the entire JavaFX platform. Come join us! It’ll be fun.

The Sunday Business section of the San Jose Mercury News had a nice article today about Rich Internet Application (RIA) platforms, including JavaFX. Also discussed were Adobe’s Flash/Flex/AIR and Microsoft’s Silverlight.

The best thing about the article was that it mentioned Sun Microsystems without using any of the words “troubled,” “beleaguered,” or “embattled.”

California Budget Crisis

Congratulations to Abel Maldonado for helping to break the logjam in California’s (latest) budget crisis. The budget crisis, which now seems to be an annual event, was especially bad this time, both fiscally and politically. The “gap” in the budget amounted to $41bn. The political situation is not much better.

The classic argument seems to be between the Democrats, who want to close the gap by raising taxes, and the Republicans, who want to close the gap by cutting spending. Of course, things are more complicated than this, but that’s essentially it. The problem is that the situation is so politicized — and so polarized — that nobody can find any common ground, and so progress is extremely difficult. The California Senate was locked in its chambers two nights in a row, and only after that was one senator (Maldonado) bribed enough to cross party lines to support the budget.

When I say “bribed” I don’t literally mean that they paid him off. The Democrats made a number of concessions to win his vote. Most notably they agreed to support open primaries, and they agreed to drop the proposed 12c/gallon gas tax from the budget. The Republicans are not likely to look kindly on Maldonado’s move. After all, earlier in the week, they ousted minority leader Dave Cogdill and replaced him with “anti-tax hard-liner” Dennis Hollingsworth. The reason for Cogdill’s ouster? Because he cooperated with the Democrats in putting together the current proposal! So Maldonado is likely to be viewed as a traitor. They’d draw him and quarter him if they could get away with it.

It’s a risk, but of course there’s something in it for Maldonado. Crossing party lines is likely to put him in jeopardy with the Republican party machine. But if there’s an open primary, he can get Democratic voters to vote for him. After all, he’s the hero who helped save the budget, right? Also, he earned a lot of publicity. Who had heard of Maldonado before the budget vote? Probably nobody outside his district. With the press coverage he’s getting now, it would set him up for another run for a statewide office. (He ran unsuccessfully for state controller in 2006.)

Unfortunately, at least one of the concessions to get his vote is terrible. I don’t really care that much about open primaries, but removing the gas tax (in preference to leaving the sales tax increase) is a mistake. With the current economic situation, raising the general sales tax is exactly the wrong thing to do. If you’re going to raise any tax, the gas tax is the one to raise. When gas was over $4/gallon last year, people really did change their behavior. They drove less, took public transit more, and stopped buying gigantic SUVs. Yes, it’s painful, but high gas prices will help improve air quality and reduce our dependence on foreign oil. So that’s why the California legislature made a move to keep gas cheap. Wonderful.

Unfortunately, it’s quite common for the legislature to do things that don’t make sense. The usual analysis of the oft-recurring budget problems concludes that the root causes of budget problems lie in Proposition 13 (from 1978) and the requirement of having a two-thirds majority in the legislature to pass a budget. I think these are indeed problems, but the analysis kind of misses the point. The real problem is that the legislature doesn’t know how to save money for the future. Remember the idea of saving money for a rainy day? It means, don’t spend it all when the sun is shining. But when the economy is booming and tax revenues are up, the legislature says “Great, we have all this money, let’s spend it on all those pet projects we’ve always wanted to do!” When the bust comes and tax revenues fall, we get a huge budget gap. I am slightly sympathetic to the Republicans when they say we don’t have a revenue problem, we have a spending problem. But I don’t hear them — or anyone — saying that we should save money (i.e., run a budget surplus) when times are good, so that we won’t have a problem when times are bad.

It’s politically difficult to do this, I know. But as Rahm Emanuel said recently, “You never want a serious crisis to go to waste.” What California needs to learn from this crisis is how to save money.

The other day I wrote an entry about using bind/trigger on a local variable and what can go wrong if you do this. But why would somebody want to do such a thing? Isn’t this just an obscure corner of the language with a curious behavior?

It turns out that this example came up in actual code, and it caused us quite a debugging headache.

Take a look at the HttpRequest class. It has a fairly complicated state machine. The current state of the object is visible both through state variables (started, connecting, doneConnect, etc.) and also through a series of callbacks (onDone, onConnecting, onDoneConnecting, etc.). Strictly speaking, these are redundant. An earlier version of this API didn’t have the callback interfaces. I’m not completely sure why, but I think that callbacks (like listeners) were viewed as a Java-like construct, and the designers of the API wanted an interface with more of a JavaFX-Script flavor. This is completely understandable; the shape of an API is intimately intertwined with the mechanisms and constructs available in the language.

In Java, writing a class with public fields is poor style. It allows uncontrolled writes to the field, and there is no way for anyone (neither the client nor the class’s implementation) to detect when such a field has been modified. Instead of exposing a field, you have to provide getter and setter methods. If you want a client to be notified when your object’s state changes, you have to set up a listener of some sort. It’s very common for classes to use listeners to notify clients of state changes. This has led to a proliferation of listener interfaces in the class library, which in turn has led to a proliferation of little listener methods in client code. Most listeners are very small. They usually just copy a value or call an update method. If you have a one-line listener, it requires half a dozen lines of inner class boilerplate and a fair number of confusing braces and parentheses. I think this has contributed quite a bit to Java’s reputation as a verbose language.

For example, in Java, if you have a Rectangle rect that you want always to be 50 pixels over and 100 pixels down relative to the location of otherRect, you’d do something like this. (This doesn’t correspond to any actual Java class, but you can see the point.)

otherRect.addStateChangeListener(
    new StateChangeListener() {
        public void stateChanged(Rectangle otherRect) {
            rect.setLocation(otherRect.getX() + 50, otherRect.getY() + 100);
        }
    }
);
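If you want to actually run something along those lines, here is a self-contained version. The ObservableRect class and the MoveListener interface are made up for this sketch (they are not from any real library), but they show how much scaffolding the listener style needs on both sides of the API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical types, invented for this sketch.
interface MoveListener {
    void moved(ObservableRect source);
}

class ObservableRect {
    private int x;
    private int y;
    private final List<MoveListener> listeners = new ArrayList<>();

    int getX() { return x; }
    int getY() { return y; }

    void addMoveListener(MoveListener listener) {
        listeners.add(listener);
    }

    // The setter is the only way to change state, so it can notify listeners.
    void setLocation(int newX, int newY) {
        x = newX;
        y = newY;
        for (MoveListener listener : listeners) {
            listener.moved(this);
        }
    }
}

class ListenerDemo {
    public static void main(String[] args) {
        final ObservableRect otherRect = new ObservableRect();
        final ObservableRect rect = new ObservableRect();

        // One line of real work wrapped in inner class boilerplate.
        otherRect.addMoveListener(new MoveListener() {
            @Override
            public void moved(ObservableRect source) {
                rect.setLocation(source.getX() + 50, source.getY() + 100);
            }
        });

        otherRect.setLocation(10, 20);
        System.out.println(rect.getX() + "," + rect.getY());
    }
}
```

Note that in this sketch rect only picks up its position when otherRect next moves; nothing copies the initial value across unless you write that code too.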

By contrast, in JavaFX Script, an object’s variables can be made read-only to the general public using the public-read access modifier. Furthermore, clients can detect changes to another object’s variables by using the bind mechanism. This leads to a style of object coupling where objects expose state via publicly-readable variables, and where clients bind to them in order to pick up state changes. Binding works great if your object’s state variables are updated as a function of some other object’s state variables. The rectangle location updating code would look like this in JavaFX Script:

var rect = Rectangle {
    x: bind otherRect.x + 50
    y: bind otherRect.y + 100
}

This is really cool. I think we’d all agree that the JavaFX Script example is much more concise, powerful, readable, and understandable. Great!

The problem is that, while bind works well when updating values as functions of other values, it doesn’t work so well when you want to take action (that is, perform a procedure) upon certain state changes. Let’s imagine that the HttpRequest object had no onInput callback (as was the case in the past). When the request body becomes available, the input field of the HttpRequest object is set to an InputStream from which the data can be read. In this style of API, instead of callbacks, clients of the HttpRequest class are expected to use bind to detect state changes. Let’s try to write some code that does this.

We want to bind to the request’s input field… but bind can only appear as the initializer of a variable declaration, or as the initializer within an object literal. So we’ll have to cook up another variable upon which to hang the bind:

var req = HttpRequest { ... };
var xyz = bind req.input;

This doesn’t do us much good; all it does is update the xyz variable when req.input changes from null to a valid InputStream. Recall that a bind expression causes re-evaluation of the portions of an expression that are affected by a change to a bound value, including function calls if a bound value is a parameter to a function. So we could try something like this:

var req = HttpRequest { ... };
var xyz = bind handleInput(req.input);

function handleInput(is: InputStream) {
    ...
}

This doesn’t really work, however. We have to return a value from the handleInput() function, and this has to match the type of the xyz variable. But this value is essentially unused and is merely a distraction:

var req = HttpRequest { ... };
var xyz: Boolean = bind handleInput(req.input);

function handleInput(is: InputStream): Boolean {
    ...
    return true;
}

You can leave off the Boolean type declarations on xyz and for the return value of handleInput(), because the compiler will infer the proper type. Still, it’s a bit clunky that you have to declare a useless variable and return a useless value from the handleInput() function.

Isn’t there a better way? There sure is. JavaFX Script has a trigger mechanism (which is spelled on replace) that allows some arbitrary code to be executed when a variable’s value changes. If we were to use a trigger, it would look something like this:

var req = HttpRequest { ... };
var input = bind req.input on replace {
    // read from input here
};

This is quite a bit better. We don’t have to cook up a function with a new name, and we don’t have to declare a useless variable and return a useless value from our function. The on-replace code is tied directly to the new variable, the one that’s bound to the variable we’re interested in. This is pretty concise and powerful.

I’m starting to see this idiom pop up in a lot of code. It’s useful under the following circumstances: a) you want to write code that’s triggered on a readable variable in another object, and b) that other object doesn’t provide a callback function or listener. Ideally, in some sense, you’d want to install a trigger on the variable in the other object. But you can’t do that: you can only install a trigger at the declaration of a new variable. So, you have to declare a new variable of your own, use bind to copy the value of the other object’s variable, and use an on replace trigger to have your code run when the other object’s variable is updated.

This technique reminds me of the Introduce Foreign Method refactoring, where you can’t add a method to another class, so you add it outside and treat it idiomatically as if it were a new method on that class. I’ll therefore call this technique the foreign trigger idiom.
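To make the shape of the idiom concrete for Java programmers, here is a loose analogue. WatchedVar is a hypothetical class I invented for this sketch, not anything in JavaFX or the JDK, and it simplifies heavily: in particular, it does not run the trigger for the initial value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical observable variable; not a JavaFX or JDK class.
class WatchedVar<T> {
    private T value;
    private final List<Consumer<T>> dependents = new ArrayList<>();

    WatchedVar(T initial) {
        value = initial;
    }

    T get() {
        return value;
    }

    void set(T newValue) {
        value = newValue;
        for (Consumer<T> dependent : dependents) {
            dependent.accept(newValue);
        }
    }

    // Rough analogue of: var copy = bind source on replace { ... }
    static <T> WatchedVar<T> boundTo(WatchedVar<T> source, Consumer<T> onReplace) {
        WatchedVar<T> copy = new WatchedVar<>(source.get());
        copy.dependents.add(onReplace);   // the trigger lives on our own variable
        source.dependents.add(copy::set); // the bind keeps our copy up to date
        return copy;
    }
}

class ForeignTriggerDemo {
    public static void main(String[] args) {
        WatchedVar<Integer> imgProgress = new WatchedVar<>(0); // owned by "the other object"
        WatchedVar<Integer> progress = WatchedVar.boundTo(imgProgress, p -> {
            if (p == 100) {
                System.out.println("done loading");
            }
        });
        imgProgress.set(50);
        imgProgress.set(100);
        System.out.println("copy now holds " + progress.get());
    }
}
```

The point to notice is that the trigger hangs off the copy, not off the other object's variable, which is exactly why the lifetime of the copy ends up mattering.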

This is all sort-of moot now, since my example is based on the older version of the HttpRequest API that didn’t have callbacks. As of 1.0, the HttpRequest has callbacks, so instead of using a foreign trigger you’d just supply a function as the value of the onInput variable of HttpRequest. But there’s still a need for foreign triggers in other parts of the API. Consider the Image class. This class allows images to be loaded in the background, by setting the backgroundLoading variable. How can you tell when the image is done loading? There’s no callback function, but the progress variable is updated and reaches 100 when the image is finished loading. So you could do something like this:

var img = Image {
    backgroundLoading: true
    ...
};
var progress = bind img.progress on replace {
    if (progress == 100) {
        // take action now that img is done loading
    }
};

You can see this idiom in use in various JavaFX samples, such as the one here.

All well and good. But what does this have to do with the stuff I was talking about earlier, regarding the lifetime of local variables?

If you’re writing a simple script, you typically tend to declare your variables at top level. These variables live as long as your script is running, and objects they refer to aren’t garbage collected for the lifetime of your application. So the foreign trigger idiom works perfectly well for these cases.

Now suppose you’re writing a program where HttpRequest operations are performed repeatedly. For example, you might want to fetch all the photos in a particular photo set, or you might want to fetch all the calendar entries for each day of the month. Clearly, you don’t want to declare separate variables for each of these requests. You’d want to wrap things up in a function, and have this function called repeatedly as often as necessary. The code would look something like this:

function getEntryForDate(date: Date) {
    var req = HttpRequest { ... };
    var input = bind req.input on replace {
        // process input and convert to a calendar entry
    }
}

BANG! Can you see the bug? If not, look again!

The problem is that the trigger was declared on a local variable, and this local variable is subject to garbage collection. This code sometimes works and sometimes doesn’t work. In fact, this is the most insidious kind of bug. You can take the code and isolate it into its own script (using script-level variables) and it will work perfectly. If you use the function as-is and call it from a simple test program, it will almost always work. That’s because in a simple test program, not much else is going on, and GC probably won’t occur. But put this into a big application, and call it 30 times to get all the appointments for a month. GC happens, and suddenly and randomly your triggers stop firing.
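The same class of bug can be reproduced in plain Java with weakly held listeners. To be clear, this is a model of the failure and not a description of how the JavaFX Script runtime is implemented: I am assuming, for the sake of the sketch, that the source holds its dependents weakly. WeakSource and LocalTriggerDemo are hypothetical names.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

// Hypothetical source that holds its triggers weakly. This is an assumed model
// of the failure, not the JavaFX Script runtime's actual implementation.
class WeakSource {
    private final List<WeakReference<Runnable>> triggers = new ArrayList<>();

    void addTrigger(Runnable trigger) {
        triggers.add(new WeakReference<>(trigger));
    }

    // Runs the triggers that are still alive and returns how many ran.
    int fire() {
        int ran = 0;
        for (WeakReference<Runnable> ref : triggers) {
            Runnable trigger = ref.get();
            if (trigger != null) {
                trigger.run();
                ran++;
            }
        }
        return ran;
    }
}

class LocalTriggerDemo {
    static final List<Runnable> keepAlive = new ArrayList<>();

    // Mirrors the buggy function: nothing retains the trigger after we return.
    static void attachLocal(WeakSource source, int[] counter) {
        Runnable trigger = () -> counter[0]++;
        source.addTrigger(trigger);
    }

    // One possible workaround: keep a strong reference for as long as needed.
    static void attachRetained(WeakSource source, int[] counter) {
        Runnable trigger = () -> counter[0]++;
        keepAlive.add(trigger);
        source.addTrigger(trigger);
    }

    public static void main(String[] args) {
        WeakSource source = new WeakSource();
        int[] counter = new int[1];
        attachLocal(source, counter);
        System.gc(); // only a hint; the trigger may or may not survive
        System.out.println("triggers that ran: " + source.fire());
    }
}
```

Whether the local version fires depends entirely on when the collector runs, which is what makes this kind of bug so hard to reproduce in a small test program.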

The consequences are fairly dire for HttpRequest, since it requires the InputStream to be closed to indicate that processing of the request has been completed. This processing is usually handled by a trigger, but if the trigger is GC’d, processing of the request never completes. The HttpRequest implementation has a limit on the number of outstanding requests. Eventually the pending request limit will be reached, no new requests will be issued, and the system will grind to a halt. We tore our hair out for about a week until we figured out what was going on.

Since the foreign trigger idiom is so common, and since it’s so easy and dangerous to use it on local variables, I’ve filed a bug (JFXC-2168) on this problem. The solution isn’t obvious. There’s some discussion about potential solutions in the bug report.

Moreover, the problem isn’t confined to local variables. If you use the foreign trigger idiom on an object’s variables, you have to make sure the object itself doesn’t get garbage collected. If you don’t keep a reference around to the object in question, it’s liable to get collected itself along with your trigger, and you end up with exactly the same problem. More on that later.
