Wednesday, November 16, 2011

Please don't make me read my email during your presentation


When I go to a conference, I typically find the "hallway track" to be most rewarding:  technical conversations with colleagues whom I don't often get to see.  This is something that I can't get from reading the papers in the conference proceedings.

Nonetheless, I also look forward to attending the technical sessions.  Oftentimes the presentation offers a different spin on the material than the paper does.  The authors have had more time to think about their approach and how to explain it, or the time constraints may force them to focus on the most important and high-level ideas.  (A presentation cannot and should not try to convey all the detail that a technical paper does.)

Unfortunately, I am sometimes disappointed by the quality of the conference talk, and I end up zoning out, looking over the program to decide what talks to attend next, or even -- and I shudder to admit it -- reading my email, since my inbox always gets out of control while I am traveling.  But, I would much rather be paying attention to a talk that conveys insight in an engaging way!

(The audience's first responsibility is to the speaker.  An audience member who gets distracted also gets less from the talk.  Even a bad talk can give value if you devote your full attention to it.  But if you lose the thread, it is very hard to regain, and then the audience member has an even greater incentive to stay distracted.  If the beginning of the talk is good, or even mediocre, this negative spiral never starts.)

I am perplexed by why people don't spend more time preparing and giving excellent talks at conferences.  The rules for doing so are relatively simple, and are well-explained in a variety of locations, including my own article about giving a technical talk.

There's no question that it takes significant time to produce a quality talk.  For example, you have to think deeply about how to present the material, which differs from the best way to present it in a paper (though this thinking increases your impact).  Additionally, you have to do multiple practice talks (many more than you think you need!) to hone how you present your message.  But the results are well worth the effort, which is small compared to the time spent doing the research and writing the paper.  You are likely to understand your own work better after preparing a good talk.  And you have the attention of a lot of smart, interested people who want to hear about your work and may form their impression of you and your work based on your talk.

So, work hard on your talks.  Audiences will be grateful, and it will also pay off in other ways.

Tuesday, November 8, 2011

Verification Games work mentioned on Wired "Danger Room" blog

Formal verification is typically a tedious and costly affair performed by highly-trained and highly-paid engineers. We would like to change that, making it as fun as a game and accessible to people without any knowledge of computer science. We would like people to prove properties of programs while they wait for the bus, by playing a game on their phones.

To that end, we are creating a system, which we call Verification Games, to crowd-source program verification. Our system takes as input a program, and produces as output a game. When a person finishes a level of the game, then the final configuration of board elements can be translated into a proof of a property about the program. Then, the player can move on to a different level, which corresponds to a different property about the program, or a property about a different program.

Wired's Danger Room blog recently mentioned this work (see the end of the article). The DARPA slide has a screenshot of our game Pipe Jam. DARPA has announced a new Crowd-Sourced Formal Verification program that is inspired by our work.

Saturday, October 29, 2011

What is defensive programming?


(There is a summary of this article, in three short bullet points, at the end.)

The concept of defensive programming is misunderstood by many people, which is a shame, because it is an elementary and basic notion.

I was shocked to recently hear some graduate students in programming languages state that it's a matter of opinion whether a particular run-time check is defensive or not.  Some said (incorrectly) that handling expected error conditions, such as a web server being down, qualifies as defensive programming.

Understanding defensive programming requires understanding the concept of a specification.  A specification indicates the legal inputs to a component, and indicates the required behavior/output.

If an input is legal according to the specification, then an implementation must adhere to the specification.  Multiple implementation behaviors may be possible.  Consider a procedure that returns an approximation to the square root of its argument, accurate to within .001:  for each input, many different outputs are acceptable.  Likewise, an iterator over the contents of a set may return the elements in an arbitrary order.
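To make the square-root example concrete, here is one conforming implementation, sketched in Python.  This is a hypothetical illustration (the names are mine, not from any library), and it assumes inputs of moderate magnitude:

```python
import math

def approx_sqrt(x):
    """Specification: for x >= 0, return a value r such that
    abs(r - math.sqrt(x)) <= 0.001.  The behavior on negative
    inputs is left unspecified, so any behavior at all conforms."""
    # Bisection: maintain the invariant lo <= sqrt(x) <= hi.
    lo, hi = 0.0, max(x, 1.0)
    while hi - lo > 0.001:
        mid = (lo + hi) / 2
        if mid * mid < x:
            lo = mid
        else:
            hi = mid
    # At this point both lo and hi satisfy the specification;
    # returning either one is a conforming implementation, which
    # illustrates that many different outputs are acceptable.
    return lo
```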

If an input is illegal according to the specification, then an implementation is permitted to do anything at all.  The implementation may return a nonsensical output, or crash, or delete all your files.

The procedure's caller has no recourse to complain about this behavior:  the specification serves as a contract in which the caller is required to satisfy the input requirements and the callee is required to satisfy the output requirements.  If one party to a contract (the caller) breaks the contract, then the other party (the callee) is no longer bound by it.

When the input is illegal, the implementer of the procedure is free to do whatever is convenient.  However, the implementer may choose, out of the goodness of his/her heart, to go to extra effort to give a useful result (such as a useful error message) even when the client has violated the contract.  This is defensive programming. The client may find this useful, but can't depend on it.  The implementer may drop the extra effort at any time, and the implementation remains in conformance to the specification.

If it's important to the caller that — even when the caller has screwed up — the procedure behaves in a particular way, then the caller should choose a library with a different specification that gives that guarantee.  Such a specification would dictate behavior for all possible inputs instead of just specifying behavior for a subset of possible inputs.

Why don't all specifications dictate behavior for all possible inputs?  It may be inconvenient, or it may be inefficient, to do so.  Consider the example of binary search, which takes as an input a sorted array and a value, and reports whether the value appears in the array.  If the input is not sorted, the implementation is allowed to do anything, including reporting that a given value is not present (even though it does appear in the (unsorted) array), or crashing.  A sloppy client that sometimes incorrectly passes a non-sorted array to the binary search routine might desire to get an exception indicating the mistake, rather than other behaviors.  In this case, the client should either do the checking itself, or choose a different library whose specification guarantees throwing the exception.  Such a library will be much less efficient than one that does not check whether the array is sorted!  Even simple checks of inputs can impact performance, code size, and readability, and it is reasonable that implementers choose not to always implement these optional defensive checks.
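The binary search example can be sketched in code.  This is a hypothetical illustration (the function names are mine), contrasting the two specifications described above:

```python
def binary_search(a, x):
    """Specification: a must be sorted in ascending order.
    Returns True iff x appears in a.  If a is not sorted, the
    behavior is unspecified: any result, or a crash, conforms."""
    lo, hi = 0, len(a)
    while lo < hi:
        mid = (lo + hi) // 2
        if a[mid] < x:
            lo = mid + 1
        elif a[mid] > x:
            hi = mid
        else:
            return True
    return False

def checked_binary_search(a, x):
    """A different specification, defined on *all* inputs: raises
    ValueError if a is not sorted.  The O(n) sortedness check
    dwarfs the O(log n) cost of the search itself."""
    if any(a[i] > a[i + 1] for i in range(len(a) - 1)):
        raise ValueError("input array is not sorted")
    return binary_search(a, x)
```

A sloppy client that wants the exception must choose the second, slower specification; it cannot rely on the first implementation happening to notice the unsorted input.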

An interesting property is that you can't tell whether a given check is defensive programming just based on the code.  You have to examine the specification.

What if a given piece of code has no (written-down) specification?  In that case, it's rather odd to even argue about whether it is correct.  Asking whether the code is correct is like asking whether "42" is the correct answer to a question, without specifying what the question is.  If you have code that lacks an unambiguous written specification, then it makes no sense to point fingers in deciding who is at fault, because that question is subjective.  I assure you that no matter how "obvious" you think the intended specification is, reasonable people can and will interpret it differently.  So, in the absence of a specification, first write one.  You will reap generous rewards in better understanding of how your system works.

I should note that the benefits of writing a specification are quite distinct from the benefits of defensive programming.  You should always use the former, and use the latter whenever appropriate.

In summary:

  • Defensive programming changes the behavior of a module in the face of client errors, when a client does something that is explicitly forbidden by the module's contract.
  • Whether a given check is defensive programming depends on the module's specification.  If the specification mentions given behavior, then that behavior is never defensive programming — it is just programming.
  • If a client depends on certain behavior, then the client should choose a module whose specification guarantees that behavior, rather than hoping that the module happens to perform defensive checks that the client needs.

Conference acceptance rates are too low! No, they are too high!


What should the conference acceptance rate be?

People who are dissatisfied with conference reviewing periodically call for conferences to change their acceptance rate.

People who have had papers rejected fulminate against the short-sightedness of the program committee:  the committee applies an unreasonable standard that prevents innovative work from reaching an audience.  But, I have heard the very same people also complain about bad papers that appeared:  such papers degrade the conference and waste the time of the community (both at the conference and subsequently in reading and evaluating the proceedings).

These people want an exclusive and thus high-status venue, at which their own papers are uncritically accepted.

Those who propose a change to the rules of the game are wasting their breath with complaints about the unjustness of the universe.  They would do better to:  recognize that their opinion of their own papers is biased, deeply understand the referees' concerns (both explicit and implicit), and improve their research and its presentation.

My advice on writing a technical paper contains tips about benefiting from rejection.

Saturday, September 24, 2011

One of ten papers you should read


Michael Fogus's blog posting "10 Technical Papers Every Programmer Should Read (At Least Twice)" includes my 1998 paper "Predicate Dispatch: A Unified Theory of Dispatch".

This was joint work with Craig Kaplan and Craig Chambers (who has two papers on the list!).

Wednesday, September 14, 2011

PhD Working Group at ESEC/FSE 2011

The PhD Working Group at ESEC/FSE 2011 had an admirable goal: students would learn about current trends in software engineering research and summarize the results to the rest of the attendees, and throughout the process students would interact with more senior researchers. Unfortunately, it was organized in such a way as to benefit neither students nor researchers. On the plus side, none of my students took part in this fruitless exercise.

I'll summarize what I saw of the PhD Working Group and offer some suggestions that might have produced better results.

Students were split into 7 working groups, each with its own topic such as “Agile Development”. Each group was assigned to learn about this topic from conference attendees. So far, so good.

The students presented a multiple-choice survey form and collected the answers — or just asked attendees to take the survey online, which the attendees invariably forgot to do. So the poor students would circle like hungry sharks, asking for survey participants and even interrupting conversations, and I saw some attendees trying to avoid anyone who looked like they might ask survey questions. This made the relationship between junior and senior researchers antagonistic, which prevented rather than encouraged conversations (not to mention the time wasted on the surveys, which could have been spent on meaningful communication instead).

A multiple-choice survey taken by people both expert in and ignorant of the topic — and the students emphasized that they wanted both types of opinions — conveys nothing of value about current and future research directions. Popularity polls may be favored by the evening news when they do not want to do real reporting, but even there I don't see any value. I would rather understand the justification for a particular conclusion than just see that 27% of people agree with it. I hope none of the students came away thinking that a public poll is a valid methodology for learning about software engineering research. As was predictable, the student presentations in a plenary session were a waste of time.

Another serious problem was ambiguous and nonsensical questions on the survey form. I completed several surveys, but for one survey I gave up in the middle. It was full of questions with answers that were non sequiturs (they had nothing to do with the question), or that omitted choices that any expert would prefer, or that I couldn't interpret at all. For the most successful survey I took, the student interpreted the questions and I dictated my answers, rather than my working alone, ticking off the multiple-choice boxes. In fact, on several occasions the student changed my answers when I remarked that a question didn't make sense — the student said that my proposed alternative question was what they had meant to ask. So much for the answers meaning anything. I conclude that the only redeeming result of the entire exercise is that a bright and thoughtful student might have learned something about how not to do questionnaire design; but proper questionnaire design should have been taught from the beginning.

As I mentioned, the sentiment behind the PhD Working Group is a noble one. Here is a different way it could have been run instead, which would have avoided some of the pitfalls that befell it this time around. The organizers could have given each group a list of 5 or so researchers at the conference who were expert in that area. The students would interview those people for 30 minutes or so — no one would get interviewed on more than one topic — and the group would evaluate and synthesize the responses, including adding their own opinions or justifications. With this design, the students have meaningful interactions with senior researchers, the students learn something, they provide a summary from which others might learn something, and everyone spends less time, and is interrupted less, than with the present model. There may be flaws in this approach, too — feel free to discuss them, and how to correct them, in the comments to this blog posting.

Friday, August 12, 2011

Document mark-up and correction with voice recognition

I spend a lot of my time commenting on document drafts. Traditionally, I do this with a red pen, and I hand the marked-up copy back to the author.

The traditional approach works well, for several reasons:
  • Marking up with a pen gives great flexibility to draw pictures and to relate chunks of text with freehand arrows.
  • It is easy and natural to flip among pages and to amend previous comments.
  • There is no need to be connected to a computer, so it does not contribute to my hand and eye strain.
The traditional approach also has some disadvantages:
  • When my collaborator doesn't work in my building, I have to send the comments by postal mail, or else scan them in color and email the scan, but the scanned version is invariably much harder to read than the hardcopy.
  • Giving comments to multiple people on a collaborative project requires photocopies/scans, or else sharing a single hardcopy.
  • My handwriting is sometimes hard to read.
I still frequently use the traditional approach, but I also sometimes give back comments electronically, using voice recognition.

I load a PDF onto my computer, point to some text of interest, and speak my comments about that text. My comments are transcribed into text annotations in the PDF, which I can email back to the author.

Because I use a tablet computer and a stylus, I can do all this while reclining on my couch, which saves me the eye and hand strain of sitting at the computer and typing. For long comments, this approach is considerably faster than typing, even accounting for correcting occasional speech recognition mistakes. For shorter comments, it's about the same speed, but the greater comfort, and the convenience of the electronic form, makes it well worth doing. It has improved an activity that I spend many hours on each week.

If you haven't used voice recognition recently, you owe it to yourself to give it another try. I was really impressed with the accuracy, especially compared to even a few years ago.

There are three key components to my setup:

1. A tablet computer with a stylus

I use a ThinkPad X61s, though this is an older model that has since been replaced by newer ones.

You want a real computer with a decent CPU, not a “slate computer” or “tablet” such as the iPad and its rivals. The reason is that voice recognition software is extremely CPU-intensive, and your computer will be going all-out to provide you with accurate speech recognition.

My setup would work with any laptop/notebook computer, not just a tablet computer, but I love being able to get away from my desk and change my posture.

2. Dragon NaturallySpeaking

Dragon's product is so dominant — and so good! — that there isn't much competition in this product space. I tried to find a usable speech recognition program that would work under Linux, but there doesn't seem to be one.

Microsoft Windows 7 comes with built-in speech recognition that is pretty good. However, it is not quite as good as Dragon NaturallySpeaking. More importantly for me, the Windows 7 built-in speech recognition is not able to type into all text boxes in all applications, including pop-ups in Acrobat Professional.

Thanks to this new competition, NaturallySpeaking retails for only $200 ($100 for the “home” version, which I have not tried), though you can find it even cheaper and there is also a 50% academic discount. It's well worth the price.

Interestingly, NaturallySpeaking works better the faster you talk.  That is, it works best when you speak in full sentences, without pauses in the middle of a sentence.  The reason is that it uses context to determine what word you meant to say.  When marking up with a pen, I typically write one part of a sentence at a time and think about the best way to convey the rest of my thought.  This still works, but you may end up doing a bit more correction of the voice recognition than you would if you didn't use the pauses.

3. Adobe Acrobat Professional

When you select text, Acrobat lets you apply annotations/comments/markup to it, which highlights the text and associates a pop-up note with it. You can choose among different types of annotations: a comment, replacement text, crossing out, etc. Here are examples of how it looks:

After you have added your annotations, you can save the file and send it to your colleague. Your colleague can then click on each highlighted snippet of text to see your comments. This is a bit of a pain, because they have to be in front of a computer, have to click on each one individually, and can easily forget which ones they have already read. Some people I work with like this format, but most do not. Therefore, I always send my comments in two formats: both the original PDF, and also what Acrobat calls the “comments summary”. This is a format, suitable for printing, in which the original document is displayed along with a list of comments.

Acrobat 8 and 9 have a menu item “Comments > Summarize comments”, which offers 4 ways to summarize comments, including the one in the linked images (which is my favorite). Acrobat X has lesser functionality, and that functionality is harder to find: there is only one way to create the comments summary, and it appears as the “Summarize comments” button in the “File > Print” dialog box. Adobe has documentation about printing a comment summary.

I tried using free or cheaper products, such as those from Foxit, but their text annotation features are much more limited. They didn't have as many types of annotations, were less versatile, were visually uglier, and, most importantly, didn't seem to have functionality analogous to Acrobat's “summarize comments”.

Wednesday, June 29, 2011

The best exercise for mountain climbing

I used to teach mountain climbing. People would ask, "What exercise should I do to strengthen myself for climbing?" There was only one answer: climbing itself. Other exercises (including going to the climbing gym) can help some, but don't give anything like the same benefit.

The same advice applies to programming, research, or any other activity. Go do it, and thoughtfully learn from your successes and failures, and you will get better at it.