Jeffrey Bosboom's Blog

[blog] [projects] [about]

PLDI 2013 trip report, part 5: colocated workshops

After PLDI 2013 proper, there were a handful of co-located workshops. In this post I’ll share a pair of papers I found interesting. (And with this, I’m finally done with my PLDI 2013 trip report.)

Towards Hinted Collection (Philip Reames, George Necula)

ACM DL, gratis PDF

Garbage collected languages free the programmer from freeing memory, but the unpredictable timing of garbage collection requires that other limited resources (e.g., file descriptors) must still be manually managed to ensure they are released promptly. This is done in both ad-hoc (calling close() or Dispose()) and scope-structured (Java’s try-with-resources or C#’s using) ways. Why not extend these constructs to optionally manage memory?

This paper shows that hinted collectors (garbage collectors given hints that a given object will be dead at the next collection) can be asymptotically more efficient when using multiple threads. Because the garbage collector already has a means to verify if an object is live, incorrect hints only lose the performance advantage rather than triggering a use-after-free bug. Importantly, hints can be added to a program on an as-needed basis at program points where large objects or data structures become dead to tune garbage collection performance. The authors prove the concept by running C and C++ programs under a garbage collector, interpreting free() and delete as deallocation hints, and report a large reduction in pause times.

Hinted collectors would be particularly useful in efficently implementing value semantics (as in C++) on a language runtime that only provides heap allocation (like the JVM). Escape analysis can convert some allocations to stack-local memory, but in production implementations (that I know of), any calls must fully inline for the escape analysis to demote the object to the stack, a serious limitation, especially when combined with virtual calls. Hinted collection would allow the compiler to provide useful lifetime information.

A Bloat-Aware Design for Big Data Applications (Yingyi Bu, Vinayak Borkar, Guoqing Xu, Michael J. Carey)

ACM DL, gratis PDF

This paper observes that because Java applications pay a large abstraction penalty when representing data as objects, traditional Java design principles do not scale to “big data”-processing applications due to heap exhaustion or high GC overhead. The authors propose reducing the abstraction penalty by combining objects with similar lifetimes and by manually managing data in untyped memory (e.g., java.nio.ByteBuffer).

This paper’s findings don’t surprise me at all (indeed, I’d seen similar work in these slides from a PLDI 2009 tutorial), but users of the studied data-processing systems certainly seem surprised at their scalability limits. After years of work on JVM JIT compilers, Java compute performance is pretty good, but memory performance is still poor. (And for those saying “just buy more memory”, note that CPU cache sizes have been flat for years now, and latency to main memory is improving very slowly, the latter especially relevant for Java where everything is a pointer (reference).)