How To Present

Posted by Nicholas Chen Thu, 26 Nov 2009 03:24:19 GMT
How To Present

I gave a talk on "How to Present" a couple of weeks ago as part of the Grad Academic Council at the University of Illinois. Here's the abstract:

Making The World A Better Place, One Presentation At A Time

You've been to your fair share of presentations; some of them were wonderful, some were pedestrian and some were just awful. Sooner or later you will have to give your own presentation, hopefully one that you and your audience will feel good about. So, what can you do to give your best presentation each time?

In this session, I'll discuss general guidelines on what to do before, during and after the presentation. And I'll illustrate my points by giving real examples (sources anonymized, of course, to protect the innocent) of mistakes that all of us have made from time to time. I'll also focus on specific techniques for CS presentations because we are a different breed of presenters with all our graphs, source code and UML diagrams.

If you are going to be making a major presentation for the first time, stop by to pick up some new techniques. And if you are a seasoned presenter, stop by to see if I've made any faux pas in my own presentation.

Here's the PDF version of the slides; and here's the Keynote version (you need to rename the file from to HowToPresent.zip to HowToPresent.key; if you use a browser that automatically unzips the file into a folder, just rename the folder from HowToPresent to HowToPresent.key). The slides are released under the Creative Commons Attribution 3.0 license. The pictures on the slides are from iStockphoto.com and I ask that you don't copy-and-paste them from the slides but purchase you own copies of them to support the artists.

The presentation was recorded and you can watch it here. The high-speed version shows me as I present (in addition to the slides) so you can see the gestures that I am making - my gestures help illustrate the points that I make.

I learned a lot from giving this presentation. And I definitely see plenty of ways to improve the next time I give it. The only way to get better at presenting is to practice by giving more presentations! And, of course, by listening to your audience. So, I value any feedback that you have. Please send them to me at nchen_AT_illinois.edu.


If you are watching the video, my presentation itself ended at around 40 minutes. After that I was taking questions but I forgot to repeat those questions into the microphone so you can't hear them in the video. I have summarized the important points for you below:
  • Roy Campbell mentioned how for certain job talks, the audience expect to see at least one equation on your slides. On my slides, I said that equations should be avoided unless really necessary. The case that Roy mentioned is an example of an exception. Again, it's all about what your audience expects. So do some investigation first on the expectation of the audience before preparing your slides.

  • The person sitting beside Roy made a comment about how much you should have on the slides. In the extreme case, you really don't want to have anything – that way, everyone's attention would be on you. However, for most CS presentations, it really helps to have some diagram/text. It's much easier to describe a system with UML diagrams. Additionally, sometimes you want filler images on your slides – these actually pique your audience's interest from time to time so that they don't get too bored with what you are saying.

  • Danny made a comment about making sure that the last slide that you leave on the screen is the conclusion slide which summarizes your ideas. You should leave this slide on the screen even if you are taking questions (as opposed to have a slide with the word "Questions?" on it). This gives anyone who comes in late the opportunity to at least see a summary of your work.

  • Danny also made an interesting comment about how you should use the first slide to introduce yourself – introduce where you are from, what's your background, etc. This gives the audience the ability to build some rapport with you. And more importantly, it gives them the chance to familiarize themselves with your accent and style before going into the technical parts of the presentation.

  • One person at the back made a very important comment about displaying graphs: you need to mention the units that you are using on the axes. This gives the audience an idea on what the data actually means.

Lessons From Parallelizing Matrix Multiplication

Posted by Nicholas Chen Tue, 24 Nov 2009 05:36:58 GMT

Before you dismiss this as yet another boring matrix multiplication example, consider that this, this, this and a lot of other tutorials on the web all use a naive version of matrix multiplication and then attempt to parallelize that version of it to demonstrate how to speed things up using some technology.

What do I mean by a naive version (since there are many naive ways of doing things depending on the system that we are talking about)? This is the basic C++ code for matrix multiplication used in most examples:

void matmult(int m, int n, int p, double **A, double **B, double **C) {
  for (int i = 0; i < m; i++) {
    for (int j = 0; j < n; j++) {
      for (int k = 0; k < p; k++) {
        C[i][j] += A[i][k] * B[k][j];
      }
    }
  }
}

And this is the smarter version which shouldn't surprise anyone who has taken a computer architecture course:

void matmult(int m, int n, int p, double **A, double **B, double **C) {
  for (int i = 0; i < m; i++) {
    for (int k = 0; k < p; k++) {
      for (int j = 0; j < n; j++) {
        C[i][j] += A[i][k] * B[k][j];
      }
    }
  }
}

What's the difference? The difference is pretty subtle so you might have missed it: the inner loops have been swapped i.e. the j and k iterations.

What can changing two lines do? After all, the results are the same right? Well, take a look at this output from the Shark profiler tool in OS X.

Profiling Matrix Multiplication in Shark
L2 Cache Misses Using Shark on OS X 10.5

The version on the left uses the naive algorithm and has 3 484 436 L2 cache misses for a 1024 x 1024 matrix multiplication. The version on the right uses the smarter algorithm and has 27 176 L2 cache misses. On my Core Duo MacBook Pro, the naive version runs in 52 133 ms and the smarter algorithm runs in 13 730 ms. That's nearly a 4x speedup and we didn't even parallelize anything yet!

Hopefully that last line caught your attention. That's the result on my MacBook Pro with a fairly old two-core processor. Let's run some experiments on a 24-core machine (6 cores per socket) using Intel Xeon CPU E7450 @ 2.40GHz running Linux 2.6.18. The code was compiled using g++-4.2 with -O3.

The naive version runs in 15 962 ms. The smarter version runs in 1 860 ms. Let's try parallelizing the naive version with OpenMP:

void matmult(int m, int n, int p, double **A, double **B, double **C) {
#pragma openmp parallel for
  for (int i = 0; i < m; i++) {
    for (int j = 0; j < n; j++) {
      for (int k = 0; k < p; k++) {
        C[i][j] += A[i][k] * B[k][j];
      }
    }
  }
}

Graph
Execution Time of Parallelized Matrix Multiplication.

Notice that we need to use ~9 threads to break even with the smarter version. Incidentally, if we use the smarter algorithm and parallelize it using 24 threads, it runs in 151 ms!

So what are the lessons from this?

  • Low level details matter for performance - I think it's important to mention those details when parallelizing. We needed quite a few threads (9 threads) before you can break even with the smarter algorithm. And most machines today don't have that many cores yet. And as long as the future manycore machines continue to use hierarchical memory (L1, L2, L3 caches), such memory optimization techniques will always be useful. In fact such optimizations help improve performance even if you are running Java or .NET on a virtual machine. However, ...
  • Low level optimizations can get tricky - this example is simple since we are only swapping the iterations to reduce cache misses. But some other optimizations e.g. tiling require more invasive transformations which is why ...
  • Always advice your users to use well-tuned libraries whenever possible. Matrix multiplication is a well-known algorithm and there are plenty of better ways to do it (look at BLAS and ATLAS). It's a disservice to give users the impression that they should write their own matrix multiplication routine when well-tuned ones are easily available. And that's why, we need to ...
  • Pick better and more interesting examples to parallelize. Matrix multiplication is embarrassingly parallel; the hard part is trying to wrestle with the memory hierarchy to ensure performance. For some other problems, it's hard to see what each thread should do and how to coordinate those threads. So instead, people should put in effort into documenting those more complicated parallelizing techniques into patterns for parallel programming. That's the real test of how good your parallel library/framework/language is.


OOPSLA '09: Onward!

Posted by Nicholas Chen Sat, 07 Nov 2009 19:16:00 GMT

Onward! this year was a bigger conference compared to previous years; it spanned the entire 5 days of OOPSLA. I had a hard time deciding whether to attend the research tracks at OOPSLA or the presentations at Onward! The research tracks at OOPSLA are dedicated to concrete ideas which have been implemented and evaluated; whereas the Onward! tracks are dedicated to newer, interesting and sometimes out-of-this-world ideas with prototypes and preliminary evaluation.

Pretty much everything was interesting at Onward! – yes, even the Harmony-Oriented Programming papers though I have yet to be convinced whether it's more of an deliberate (but interesting) juxtaposition of current programming ideas or something new and useful.

There are two papers that I found particularly relevant to my interests and I will summarize them here for you. Definitely take a look at the other papers too.

Software Evolution and the Moving Picture Metaphor

When we think of software evolution, we usually think of something like the image below ...

Merging in a typical version control system like Mercurial

... where we have a timeline that shows the different commits to the version control system. However, if you think about it, each of those commits are merely snapshots of the software as it evolves. What happens in between those snapshots is lost. We don't have an idea of the debugging sessions, the testing sessions and the trial-and-error sessions that a programmer goes through before committing the code.

Viewing the evolution of software through those snapshots is like watching a movie except that we only get to glimpse at snapshots of the movie at certain points. For the most part, we can definitely get a good idea of the story behind the movie – though we might miss some of the more subtle points.

It's the same with software evolution. Each commit in the version control system tells part of the story. And sometimes that might be enough for what we need since we only care about interesting chunks of software development. But sometimes we care a bit more about the things that happen between snapshots. Using our movie metaphor again, you can think of this as viewing a movie with the director's commentary.

I don't think anyone has explored the idea of capturing software evolution as a movie (instead of snapshots). There has been some work on visualizing software evolution but no one has tried to capture software evolution itself as a movie. People have tried recording entire development sessions – how people design the architecture on whiteboards, how requirements are gathered, how they communicate with one another, etc. For instance, there is the DOLLI project that filmed the entire software development process of a group of university students.

So if we were to create a movie of software development, what would it look like? Here are some interesting questions for an idea like this:

  • What to capture? - Do we capture only the code in the IDE, the debugging sessions, the testing sessions, etc? Or we capture more context like what Tasktop is doing?
  • How much to capture? - Capturing too much information will lead to very boring movies very quickly. We need to find out what's really important to the programmer and filter out everything else.
  • How would we do this non-intrusively? - The programmer just wants to continue programming so we would need to capture this information without affecting how she programs.
  • How to replay this information? - It will be very boring to play it from the beginning to the end. How would we edit the movie to make it interesting and useful?
  • Would something like this be useful? - Who would be interesting in using it? The programmer herself when she is reflecting on her program? Or someone learning the system for the first time?

Like all Onward! papers, this is a preliminary idea. And there is a lot work that needs to be done to see how useful this will be.

Pi: A Pattern Language

Pi: A Pattern Language

I missed the actual presentation but the authors have been nice enough to post a .mov file of their presentation online.

Short summary: You are trying to solve a problem. You know how to express your problem and solution but your programming language just doesn't let you express it succinctly. Nonetheless you persevere and manage to code it within 500 lines though you have now lost the essence of your solution amidst the lines of code. If only you could easily define some language constructs to capture the essence of the problem you are trying to solve and its solution.

This trend isn't new. The DSL folks are very happy to create custom languages. People in Lisp have been doing this for years. And with the rise of interest in intentional programming, language workbenches and language-oriented programming, we might just see more of such systems.

The paper itself does a good job of talking about related work so you can read it and have a good idea of how it compares to a system that you are familiar with. One related work that the authors missed is the K Framework developed at UIUC by the FSL group. The K Framework is worth mentioning because it is a nice and modular approach - using continuations - for solving the context problem while designing your own language constructs. As you design your own constructs there is always the risk of interference especially if you are trying to reuse the same symbols to mean different things for brevity.

To reflect on this trend, it might be good to read (or re-read) the popular essay by Paul Graham on the 100-year language. As languages evolve, they seem to gravitate toward Lisp.

Paul Graham, The Hundred-Year Language:

"The trend is not merely toward languages being developed as open-source projects rather than "research", but toward languages being designed by the application programmers who need to use them, rather than by compiler writers. This seems a good trend and I expect it to continue."

Now the next question to ask would be how to easily develop tools that can grow as your language grows as well. Imagine having a syntax highlighter that is smart enough to recognize new syntax as you define it. Imagine having refactoring tools (my colleague, Jeff, is working on something related) that can help you refactor from an older construct to a newer one. How would you test a system like this: you would have to test your language and also your program written in that language. How would you be able to debug something like this?

PS: The paper has been picked up by reddit and lambda-the-ultimate as well so it might be interesting to read the comments there.


OOPSLA '09: Workshop on Refactoring Tools (WRT)

Posted by Nicholas Chen Tue, 27 Oct 2009 15:56:00 GMT
Extracing Concurrency

There were two major themes at the WRT this year: concurrency and polyglot refactoring.

Concurrency

I actually prefer the term transforming instead of refactoring for converting sequential code to parallel code but I will use the term refactoring here since other people in the community have embraced it.

Two papers were presented that dealt with refactoring to concurrency in the X10 language:

Both papers presented some prototype refactorings. I really liked the presentation of the second paper since it relates directly with the work I am doing on optimization patterns for parallel programming. As you can see from the image above, the tiling operation is one such optimization. It's important to provide such a transformation since you may need to do it to get good performance. Such transformations cannot be done by the compiler since it needs more feedback from the programmer on the size of the tiles which depends on the algorithm and data structures for that particular problem that you are trying to solve. And sometimes tiling might not even solve the problem so you need to know when to tile and when not to tile.

We had a breakout session on the topic of concurrency refactorings and here were the interesting issues (read as future research questions) that came up.

  • How to discover what to refactor?
    Profilers and trace based executions can help guide you in this area but they aren't always that intuitive to use. And you don't want to introduce concurrency blindly since it only matters if you are optimizing the real bottlenecks.
  • What are some ways to suggest potential refactorings?
    Now that you know where the bottlenecks are, what are some refactorings that you can apply? I believe that parallel patterns are the way to go.
  • How do you check that your refactoring is correct?
    Refactoring engines for sequential code have bugs. And with concurrency refactoring you have to be even more careful to not introduce subtle concurrency bugs that can be hard to track down. How would you check the correctness of your transformations statically?
  • How do you fine-tune your refactorings?
    Unlike sequential refactorings, concurrency refactorings require more feedback from the user. For instance, the dimensions of the tiles (see picture above). It would help if the IDE can suggest some parameters and then run the profiler automatically to give more feedback.

The work on concurrency refactoring is definitely something that my advisor, Ralph Johnson, and his group at UIUC are interested in as part of our vision for an IDE that can help support parallel programming.

Polyglot Refactoring

Polyglot refactoring is not a standard term. It's one that I coined to describe how refactorings must be aware of all the different components that you use in your program - frameworks, libraries, etc. All these different components are described using different mini DSLs. For instance, the Hibernate configuration files are described in XML, your testing framework uses a set of APIs and conventions that you must conform to, and sometimes you use different languages (Groovy, JRuby, etc) that run on the JVM but share data.

It's important that as you perform the traditional refactoring such as rename, move, etc that you preserve the intended behavior in these other components as well. I think this is not an area that many people have looked at. Researchers focus on having new refactorings for the core language but forget that those other components need to be refactored too leaving a broken program that the user has to fix by hand.

There are three papers that deal with this issue:


OOPSLA '09: Dynamic Languages Symposium (DLS)

Posted by Nicholas Chen Tue, 27 Oct 2009 14:45:00 GMT

There were three interesting talks that I attended at the DLS this year:

Of Scripts and Programs: Tall Tales, Urban Legends, and Future Prospects

How People Use Dynamism in Javascript

(Unfortunately, I missed the first 10 minutes of the talk where Jan was talking about his experiences with the Pluto language).

Jan presented the investigation that he and his team did to measure how much dynamism programmers used in Javascript – the most ubiquitous scripting/programming language on the web today. He had a good sample size, much bigger than most experiments that I have seen. And definitely not like the paltry, non-representative data that most other research papers have in their evaluation sections. His team surveyed major web applications that utilized Javascript heavily including 280slides, GMail, Facebook, Flickr and Apple Me.

Some aspects of their evaluation include:

  • Size of code (in megabytes)
  • Prototype chain lengths – Almost like inheritance chains in Object-oriented languages
  • Variadity – Javascript allows you to call functions of n arguments with m arguments. So how apps out there use this feature?
  • eval(...) – which can used sanely to grab data from JSON or used dangerously to evaluate any string as code.

The conclusion? Some of those apps really use almost every dynamism that Javascript has to provide!

His solution? Yet another programming language (but, of course :-)). His language is called Thorn. I don't have the time to look at it yet, but it reminds me a lot of Strongtalk language with optional typing.

Jan's slides should be available from the DLS website soon.

I also had the chance to ask someone from Google about how they use Javascript. And he basically said that they used a more disciplined approach to it – certain features just aren't used or they are used very judiciously. Some people really like the dynamism of Javascript and others just use it almost like another Java.

Optimization of Dynamic Languages Using Hierarchical Layering of Virtual Machines

Lua on the Lua VM on the ActionScript VM

Here's a crazy idea: what if you wanted to run Lua on the Lua VM on the ActionScript VM?! Now, why would anyone do this? To test out their optimization technique for layering VMs, of course.

It's a rather circuitous way to evaluate this but it seems to be the easiest route. After all, you don't want to write a VM just to be able to try this out. You just want to use an existing one. And the Alchemy project provides an easy way to translate the original C-based Lua VM to ActionScript.

Of course, the caveat of such an evaluation is that the implementation will be so slow in the first place (the presenter was honest and mentioned this early) that it is relatively easy to get speedups no matter how trivial or naive your technique is.

Hosting an Object Heap on Manycore Hardware: An Exploration

This is a continuation of Dave Ungar's work which he presented at the Squeak BOF at OOPSLA 2008.

In a nutshell, he implemented a simple benchmark (baton passing) between different Squeak Smalltalk VMs running on the 56 (it has 64 cores but some of them are reserved) cores of the Tilera multicore processor.

His goal with this experiment was to investigate what kinds of problems we would run if we are going to use manycore (not just multicore) processors. With manycore processors, you have non-uniform memory access and it becomes infeasible to maintain true cache coherence between those cores. In other words, you do need to be aware of the location of your data (much like what the distributed computing folks have been doing).

This is in contrast to what Intel's multicore chips are doing. Even in Larrabee with 10+ cores, all the cores still maintain cache coherence.

I liked the talk because it presented a different view from all the other research papers that I have been reading where they try to solve the problem (or push the solution as far as it can go) for multicore systems and just implicitly assume that it will scale to manycore. People are trying to hold on to the shared memory model for as long as they can and trying very hard to not use a message passing system.


How Complex is "Personal Computing"? – by the Fine Folks from VPRI

Posted by Nicholas Chen Fri, 23 Oct 2009 03:46:00 GMT
Normal’ Considered Harmful (Or, How Much Like Frogs Are We Computerists?

Alan Kay, of Smalltalk fame, and a few of his colleagues from Viewpoints Research Institute gave two wonderful talks at the Computer Science department at UIUC today. It's been a while since I have heard such interesting and inspiring talks so I'd summarize them for you :-)

How Complex is "Personal Computing"?

The video is now available from here.

This was a joint talk by four people:

Alan Kay

Alan started the talk by asking a simple question: how large a t-shirt would we need to contain the essence of Computer Science?

Maxwell's Equations of Electromagnetism.

It's a serious question. The foundations of electromagnetism is contained in Maxwell's equations which can easily fit onto a t-shirt. What about Computer Science? Imagine trying to fit the source code for Microsoft Windows onto a t-shirt. How many t-shirts would we need?

With that in mind, the folks at VPRI are trying to find those fundamental equations for computer science as well. Three of his colleagues present their attempts at three different domains.

Dan Amelang

If you have seen the algorithms and code for doing antialised rendering, you'd know that it's a very complex component. However, it's also a very important component for computer graphics. Almost everything that we see on our screens now is antialised.

How would we describe the fundamentals of antialised rendering in simpler terms? Dan has come up with a set of basic equations to do that. And he even created a domain-specific language called Nile to make it easier to describe those operations. For a glimpse of Nile (and I don't claim to even understand most of it), you can peek at Dan's code on GitHub.

Nile is still work in-progress; to compile it into executable code builds upon the work done by ...

Alex Warth

Alex is the creator of OMeta – an object-oriented language for pattern matching. In short, it's a language for rapid prototyping of languages. OMeta allows you to easily model existing programming languages and also create new ones easily to enable you to experiment with new features instead of messing around with different lexers, parser, code generators, etc.

By applying Occam's razor to computer languages, Alex hopes to enable everyone to easily grasp the fundamentals of one of the most important components of Computer Science: expressing your ideas in ways that the computer can understand. OMeta is succint, you can express a good portion of javascript in less than 200 lines. And like all good metalanguages, OMeta can be used to describe and bootstrap itself.

You can try out the Javascript version of OMeta here. I also strongly recommend reading Alex's thesis (very readable and very interesting as far as Ph.D theses go) where he talks about OMeta and some of his other work.

Hesam Samini

Hesam is tackling the problem of program correctness using executable program specifications. What I really like about his work is the idea of using program specifications to fix program behavior whenever possible.

Most unit testing today try to test what can go wrong with your program and how your program would handle those aberrant behavior. However, it's very hard to specify everything that can go wrong. It's much easier to say what's the intended behavior of a program. And once you have that, if you ever violate what is means to behave correctly, you can use that correctness specification to guide your program back to correctness.

There are many ways to try to correct your program. Here's my example on this: take the case of a malformed string; perhaps the user accidentally puts a space i the string. A naive way would be to randomly just permute the string until you get a version that is right and fits the constraint of correct behavior. A smarter way might be to apply delta debugging to find the cause of the problem and then rectify it. Or you can try the techniques that Martin Rinard uses for rectifying bad input for email messages..

In either case, specifying the correct and intended behavior in your program allows people to understand it more easily. Compare this with the arduous task of having to read through lines and lines of if...else and exception handling and having a hard time understanding the crux of your program.

Normal Considered Harmful (Or, How Much Like Frogs Are We Computerists?)

This was the second talk by Alan which was intended for a more general audience. This talk was recorded and is available from here.

Anyway, here are the three main points that I liked from the talk:

Learning from History

Who am I?

According to Alan, if you show people the picture above, they would most likely know that it's a picture of Albert Einstein. Show them the picture below and many won't know that it's a picture of Doug Engelbart, the inventor of the computer mouse among other things. Why is it that Computer Scientists don't know the historical figures in their field?

Who am I?

And because we don't bother learning about our history, we miss a lot of grand ideas that came about in the 60s. Here, most of us are spending time reinventing the wheel and not even doing a very good job of it. The ideas from the 60s were revolutionary (pun intended); the ideas of today pale in comparison. We aren't daring enough to explore. And because of that ...

We Aren't Solving the Important Problems

Research today is very parochial. Look at the papers at most conferences. It's almost like they follow some cookie cutter template: problem(a cleverly invented one sometimes that no one really cares about), motivation, solution, case study, evaluation. Researches pat each other's back for solving related problems. However, look closer and you would see that the problems being solved are small and narrow. The big picture is missing.

People aren't exploring revolutionary ideas because the system (tenure, conference acceptance, funding, grants) doesn't welcome those ideas. Doing something revolutionary is risky. Of course it is. So people do the easy things and never explore beyond their fields. They become complacent. Instead they need to see that ...

All Understanding Begins With Our Not Accepting The World as it Appears – Susan Santog

We need to be bolder and explore more. We need to come up with ideas that border on the insane. Some insanity is required. After all, only 1% of the population comes up with such ideas; the other 99% like to keep things in order as they are now. So it's up to those 1% to come up with new ideas that the other 99% will eventually accept.


A Pattern Language for Screencasting

Posted by Nicholas Chen Thu, 09 Jul 2009 21:35:00 GMT
TV photo purchased from iStockPhoto

I have written a pattern language for screencasting based on my experiences producing and observing various screencasts on the internet. Through my experiences I have noticed various patterns that myself and other screencasters have used. Applying those patterns would certainly help the presentation of your screencast but, more importantly, I believe that the patterns enhance the teaching-learning experience of screencasts. And ultimately that is what matters most.

Interested readers may view the patterns here (older version can be found here). The pictures that I have included are based on software for the Mac since that is what I use for my screencasting.

At the moment, it's a rather complete version of what I had originally envisioned for such a document. It can definitely be improved but I would like to get a version out on the web for people to read in case anyone is interested in providing feedback or suggestions. This pattern language is in the shepherding stage of PLoP '09.

I do maintain copyright on the document.


ParaPLoP'09: First Workshop on Parallel Programming Patterns

Posted by Nicholas Chen Wed, 03 Jun 2009 03:08:00 GMT

I'll be attending ParaPLoP 2009 over the next few days. It's a patterns workshop along the traditions of PLoP but focusing exclusively on patterns for parallel programming.

I am the author and co-author of the following three patterns paper that you can find on the program page:

  • Barrier Synchronization
  • Patterns for Collective Communication
  • Patterns for Topology Aware Mapping

Most of the patterns submitted are part of the Our Pattern Language (OPL): A Design Pattern Language for Engineering (Parallel) Software catalog proposed by Kurt Keutzer (EECS UC Berkeley) and Tim Mattson (Intel). OPL is based on Tim's earlier book Patterns for Parallel Programming.

While we have already identified quite a few patterns, there's definitely still a lot of work that needs to be done to produce a sizable pattern language that covers most of the patterns that most programmers will encounter in their programming career. However, I think that the layered approach that the Berkeley folks are advocating has great potential. The layered approach allows programmers to focus on different patterns depending on their skill sets and contributions to their project. At the top layer, application programmers focus on understanding the high level patterns and take advantage of the parallelism by using libraries and frameworks. And at the bottom layer, platform programmers focus more on understanding the low level patterns and develop libraries and frameworks to be used by other programmers.

Writing good software is hard. And writing good parallel programs is even harder. Patterns help make the task easier by showing the best practice principles that novice parallel programmers can learn from. And it's great that both UIUC and UC Berkeley are working together to catalog such patterns.

Ralph Johnson is leading the patterns project as part of the UPCRC at UIUC. And one of his projects is to mine for patterns by examining how different algorithms are expressed in different languages and frameworks, in particular those developed at UIUC (e.g. Deterministic Parallel Java, Charm++, Actor Foundry). Kent Beck, Samira Tasharofi, Amin Shali and I will be contributing to this project and our preliminary results can be found here.


Newspeak Prototype

Posted by Nicholas Chen Sat, 28 Feb 2009 17:48:00 GMT

Room 101: Newspeak Prototype Escapes into the Wild:

"The Newspeak prototype is now available at http://newspeaklanguage.org/downloads/ . "
Newspeak, first boot on my machine.

Yes, the long awaited Newspeak is finally released! It's far from being complete but at least something that runs is out. I was in a class about Actor Programming Languages and Systems last semester and we were supposed to talk about Newspeak (and possibly do a group project on it if people were interested). Unfortunately, at that time, only the language specifications were released. And it's not as fun just reading about the specs and proving things about it.

The other good news? It runs on an Intel Mac. I was initially under the impression that it would only run on Windows.

Other initial impression: the UI is much improved compared to other Squeak implementations (which it is based on). Like the Pharo project, Newspeak is probably aiming at creating an professional, open-source Smalltalk platform. Unlike Pharo, it still has remnants of eToys though.

Looks like I might be canceling some initial reading plans to play around with this over the weekend.


Statistical Debugging: Using Statistics for Good not Evil

Posted by Nicholas Chen Sun, 01 Feb 2009 18:53:00 GMT

This article is a summary of these three papers about Statistical Debugging: Holmes: Effective Statistical Debugging via Efficient Path Profiling (technical report from Microsoft to appear in ICSE 2009), Scalable Statisical Bug Isolation and Bug Isolation via Remote Program Sampling.

Statistics is often used (and misused) as an effective way to finagle people into believing something that isn't really significant. There are many such cases (and you see them a lot on TV too). The one that I remember off the top of my head is a poster by Hi51 where they had a nice bar graph which showed that they were the "fastest growing social network in 2007" compared to other sites such as MySpace, Facebook, etc. You have to read their claim (in quotes) properly to actually see what they are trying to claim and why it isn't really significant. Hint: they needed to use the word "growth". So, if hi5 had 100 members in 2007 and grew to 100 000 members in 2008, then its growth is 1000-fold (pretty impressive). But if Facebook had 100 000 members in 2007 and grew to 1 000 000 in 2008, then its growth is only 10-fold. So hi5 wins in terms of "growth" but in terms of users, that is pretty insignificant (how much more can Facebook grow once almost everyone you know is already on it?) So, what exactly is hi5 trying to sell to the people who are reading the poster? Trying to hire undergraduate students to work for them? Trying to show that they are going to overtake Facebook soon?

Statistics: Good or Evil?

As the example shows, you can use statistics for your own personal gain. And still keep your conscience clean because you are not really lying at all. And it is up to the reader himself to sieve through those numbers and decide whether they even mean anything significant. I believe that inundating the reader with such insignificant numbers is a prime example of using statistics for evil.

So it is refreshing to read a paper where statistics is used as part of the fundamentals of the technique itself and presented up-front in a convincing and compelling manner; Statistical Debugging (SD) is the use of statistics for good where numbers can actually help programmers locate pernicious bugs that manifest themselves infrequently and can be hard to track down.

SD relies on the fundamental idea of sampling a program. There are many tools out there that can sample programs but they are mostly used in profiling the hotspots of a program; in other words, the parts that are frequently encountered. However, such sampling is not useful for debugging since pernicious bugs are often not in the hotspots of a program. Thus, for SD to work, you need a different kind of sampling – one where it would be possible to sample the infrequently encountered parts of the program itself. This sounds simple but it can be prohibitively expensive if you are trying to observe all those little facts.

Solving the issue of collecting useful debugging information is the focus of their first paper, Bug Isolation via Remote Program Sampling where the authors present a novel way of doing the sampling so that they can collect interesting properties of the system. They identify a set of predicates (statements that can be true or false for a particular execution of a program) that is useful for C/C++ programs2. And for each run of the program, they would sample each predicate by modeling a Bernoulli process (like flipping a coin whenever each predicate is encountered to see if its needs to be sample). Basically, this means that for each execution of a program, the predicate has a fair chance of being sampled. Contrast this to the typical way of doing a profile i.e. using a periodic timer. If the execution of a program is short, some of the important predicates might be missed. So using a Bernoulli process guarantees a fair sample for each predicate.

Instrumenting a program so that we can collect information about each of its predicate can be very expensive because they can easily be a million predicates3 for a typical program. And that is where the authors introduce a novel method for collecting the information which relies on the Law of Large Numbers. It is infeasible to collect all the results of all those predicates from just one person on one machine. Instead it would be better to collect the information from as many people as possible. And each version of the program that runs on a different machine is designed to monitor a subset of all the predicates. Given enough machines, it is possible to merge all the results and get a useful matrix.

A sample matrix might look like the following:

Executions Predicate #1 Predicate #2 Predicate #3 Predicate #4
Pass True False True Not observed
Fail False False True Not observed
Fail True True Not observed Not observed
Pass True True True False
.. .. .. .. ..

For each execution, we check the value of each predicate: true, false or not observed. And from that matrix (which should be much larger) we try to find a correlation between program failure and a particular predicate. The paper Scalable Statistical Bug Isolation discusses how such a correlation is found and how noise is reduced. This technique is usable even if they are multiple bugs in the same program; SD is able to generate a correlation for each different bug. And it's also usable when you have non-deterministic bugs. SD works best when you have a large set of data for each execution of the program.

It's important to note that SD does not tell you that a particular predicate is the root cause of a bug. All it tells you is that there is a high probability (80% or higher) that this predicate is the root cause of a particular bug. And the programmer can then start focusing his efforts on figuring out why that particular predicate is a point of failure. Sometimes the bug itself does not lie in that particular predicate, but it is usually within close proximity so examining the predicate itself is a good start.

This is useful enough because sometimes it is almost impossible to detect such bugs otherwise since they manifest themselves infrequently. And because the overhead of collecting such information is low (remember that each instrumented program is specialized to only monitor a subset of the predicates) it is possible to run the collection on a user machine where SD can capture the execution of the program under the actual environment of a real user (vs. a controlled environment on the developer's machine).

The latest paper in SD, Holmes: Effective Statistical Debugging via Efficient Path Profiling evaluates using a different set of predicates (path profiles) that might help programmers identify the bug more easily. The previous method of using predicate will report the location (file and line number) of the predicate that is believed to be the root cause. While the predicate is usually within close proximity to the actual bug, the programmer still has a lot of code to sieve through. Path profiles provide more information about the execution of the program and contains the actual execution path that the program went through. This is useful for identifying bugs that have complex control flows through the program.

In conclusion, SD is a novel idea in helping programmers find subtle bugs that might be hard to identify otherwise. Good unit tests can usually help eliminate a lot of bugs but it is hard write tests for everything (and the cost of maintaining such detailed tests is very expensive). SD is not perfect nor does it claim to be. Instead it is another useful tool in a programmer's arsenal for hunting down bugs when trying to manually locate them has proven to be unfruitful.

The papers also impress me because of their honesty in doing their own evaluations and experimentations; they explicitly state that their experiments are done under controlled environments. And they actually have an on-going project where they are evaluating SD in real applications.


  1. I think that it is fair to use this example since it was a public poster that was posted on the public walls of the Siebel Center for CS at UIUC. Moreover, I maintain disinterest in hi5.
  2. The set of predicates might be different if we were to conduct the experiment on programs written in different languages or programming styles (procedural vs. OOP).
  3. For a C/C++ program the set of predicates includes branches, integer return values and assignments. Integer return values are useful in C/C++ programs because the return value is usually used to indicate if the operation succeeded (return 0) or if it failed (return -1).