Friday, September 30, 2016

Applying for a graduate fellowship

It is the season to apply for graduate fellowships, with the first deadline for most students being the NSF Graduate Research Fellowship.

I maintain a webpage with advice about applying for a fellowship.  I have recently updated it, and you may find it useful when you apply.  Good luck!

Monday, September 26, 2016

JavaOne 2016 talks

I first spoke at JavaOne in 2008, and I have presented most years since.  It's a great way to communicate a message about practical software development tools.  An academic can even squeeze in a few nuggets of theory as enrichment, so long as the fundamentals of usefulness to practitioners are covered.

Last week, I gave three talks (together with Werner Dietl), spanning 4 hours, at JavaOne 2016.  The common theme was the use of Java 8's type annotations feature to improve code quality.  This was the most rewarding year yet:  with ever-increasing adoption of Java 8, there was lively interest and the sessions were very interactive with lots of questions and comments.

The slides are now available (though you always get more from being at the talk, hearing the spiel, and seeing the demos):

Tutorial TUT3422:  Preventing Errors Before They Happen

Conference talk CON5739: Disciplined Locking: No More Concurrency Errors

Birds-of-a-feather session BOF3427: Using Type Annotations to Improve Your Code

Thursday, September 22, 2016

Predicting the 2016 presidential election

My "data programming" class, UW CSE 160, is a popular introductory computer science class.  Its examples and assignments are taken from the real world and use datasets from science, engineering, business, etc. -- not from abstract math, puzzles, or programming itself, such as computing the Fibonacci sequence or implementing a linked list.  These real-world examples are more compelling to students, and they better prepare students for the programming they will do in the future.  The class's assignments can also be integrated into an existing class without fully adopting the methodology of real-world examples.

One particularly popular assignment asks students to predict the outcome of the 2012 election, based on polling data.  Preceding the 2012 election, many political pundits, working from their gut feel, predicted a Romney win or said the election was "too close to call".  Nate Silver of the website FiveThirtyEight had been predicting an Obama win for months, and he correctly predicted the outcome of every state.

The key to Silver's approach is to combine different polls using a weighted average. In a normal average, each data point contributes equally to the result. In a weighted average, some data points contribute more than others. Silver examined how well each polling organization had predicted previous elections, and then weighted their polls according to their accuracy: more biased pollsters had less effect on the weighted average.
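
The difference between the two kinds of average can be sketched in a few lines of Python. This is only an illustration: the function name `weighted_average`, the poll numbers, and the weights are made up for this example, and it is not the assignment's code or Silver's actual weighting scheme.

```python
def weighted_average(values, weights):
    """Return the weighted average of values, given a parallel list of weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    # Each value counts in proportion to its weight.
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical poll results (percent support for one candidate) from three
# pollsters, and hypothetical weights reflecting each pollster's past accuracy.
polls = [52.0, 48.0, 50.0]
weights = [3.0, 1.0, 1.0]

# Normal average: every poll contributes equally.
plain = sum(polls) / len(polls)

# Weighted average: the first pollster counts three times as much as the others.
weighted = weighted_average(polls, weights)

print(plain, weighted)
```

With equal weights, the weighted average reduces to the normal average. Here the heavily weighted first poll pulls the result above the plain average of 50.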

The concepts are simple enough that beginning programmers can complete the assignment successfully after just 3 or 4 weeks of instruction.  The assignment is also interesting enough to be assigned later in the term.  Since most of the code is provided and students just have to implement 10 functions, this assignment also gives students practice in the critical skill of reading code.

When this assignment was first handed out in January 2013, the 2012 election was a fresh memory.  Now it may seem dated to students. Therefore, you could update the assignment to use polling data for the 2016 election.

Doing so requires collecting and cleaning polling data.  You can find information about how we collected and cleaned data for the 2008 and 2012 elections in the file https://courses.cs.washington.edu/courses/cse140/13wi/homework/hw3/raw-data.zip. (Students:  this doesn't give any hints about how to solve the assignment.)

If you adapt the Python programs in that zip file and collect polling information about the 2016 election, please share the fruits of your labor by emailing me.  Other instructors and their students will appreciate it!

Friday, September 9, 2016

Python evaluation rules

You cannot program in a programming language if you don't understand how the language works.  Some novices find programming a frustrating, opaque endeavor because they don't understand how the computer executes their programs.  When their program does not work, they make wild guesses about what changes might improve the program, and they try out their guesses by running the program.

If a programmer understands the language, then the programmer can understand what the program is doing and why it is producing the observed output.  The programmer can determine a proper fix or devise meaningful experiments to better understand what values are being manipulated at run time.

Unfortunately, not all students are taught these simple, effective techniques.  I have been horrified to see instructors (even at my own institution!) teach bad habits by telling students, "The only way to know what this program does is to run it."  Many programming textbooks and websites are just as bad:  they don't explain the programming model, or they do so in vague English.

The Python Evaluation Rules document gives precise semantics for much of the Python language.  It presents step-by-step rules for executing a Python program.  Every skilled programmer uses such rules to reason about the effects of Python code.  This document helps beginners to become experts more quickly.  With this knowledge, a programmer finds it easier to write correct code, debug incorrect code, and read unfamiliar code.
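
As a small illustration of this kind of reasoning (the example is not taken from the document itself, and the comments paraphrase standard Python behavior rather than quoting the document's rules), consider how assignment and list operations interact:

```python
a = [1, 2]     # Evaluate [1, 2] to a new list object; bind the name a to it.
b = a          # Evaluate a to that same object; bind b to it (no copy is made).
b.append(3)    # Mutate the one shared object, so a sees the change too.
a = a + [4]    # Evaluate a + [4] to a NEW list, then rebind a; b is unaffected.

print(a)  # [1, 2, 3, 4]
print(b)  # [1, 2, 3]
```

A programmer who applies the rules one step at a time can predict both printed values before running the code, instead of guessing and re-running.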

The document applies to both Python 2 and Python 3. It does not cover every possible Python expression, but it does cover the most frequently used ones.

This document has been in use for 5 years, with great success.  I recently moved it to a new GitHub repository, where the source code is available and you can submit bug reports or make pull requests.