Wednesday, April 8, 2026

AI and the future of CS careers

There is a lot of fear that software development may no longer be a stable, well-paid career.  Those jobs aren't going away, though they are changing a lot.  Here is a message to that effect from the director of the UW CSE Allen School, Magda Balazinska, which I'm reproducing with her permission.  (Update: CNN has a story about Magda's message.)

AI and the future of CS careers


I keep hearing concerns about AI and the future of CS careers. So I wanted to share my honest perspective.

AI is expanding your opportunities, not shrinking them. AI is a disruptive technology that is dramatically changing the toolset that we have at our disposal when creating new systems and applications. I view this disruption as an exciting one: A change that is opening tremendous opportunities. AI is shifting where engineers spend their time — away from tedious tasks like writing boilerplate tests or reverse-engineering legacy code, and toward the creative work that actually matters: design, prototyping, and solving hard problems. Until recently, in industry, a small fraction of the time would be spent on the most exciting work and the majority of the time would go toward the necessary but often tedious tasks. Today, AI is changing that equation. One engineer told me recently: "I finally have the job I always dreamed of." That's not a threat. That's progress.

The "companies are cutting engineers due to AI efficiency" narrative is misleading. Whenever you hear that narrative, pause and question it. If AI makes your engineers more productive, any rational company will use that advantage to ship more, not fewer, products; to set more ambitious goals for themselves and achieve them. AI lets us aim higher. In some companies, layoffs and new hires are happening simultaneously because companies are reshuffling to align with new priorities and toward engineers who embrace AI. Net hiring and workforce numbers remain encouraging [See, for example, total workforce numbers for Microsoft, Amazon, Google over time; and this article about the general tech hiring landscape]. The companies cutting headcount aren't responding to AI efficiency — something else is going on in those businesses. But, as a CEO or VP (or as a writer for the popular press), it’s so much easier to blame AI efficiency. 

You are exactly who the market wants right now. Who has the time and hunger to master tools that have existed for two years or even less? You do. And employers know it. A venture capitalist I've spoken with recently said that most new startups are either founded by or primarily hire CS graduates. Use your internships and side projects to go beyond the classroom — especially with agentic AI systems. Software engineers who demonstrate fluency with these tools, even if their primary role is not AI-focused, have an advantage in the job market.

The Allen School is also adjusting: We are here to help you, of course. We have started to roll out courses such as AI-Assisted Software Development, which we piloted last fall and will offer again (in one or more new forms) next year. You can see more details on our page summarizing AI education in the Allen School. And more courses are to come next year, including a course on Human-AI Interaction and a course on Systems for Machine Learning. Many of our capstone courses are encouraging you to leverage new AI tools. Take those opportunities and also go much further on your own.

"Anyone can vibe code" — but that's precisely why you matter. Yes, anyone can generate code with a prompt. But who can evaluate whether that code is correct, secure, scalable, and maintainable? Who understands prompt injection vulnerabilities, cascading failures in agentic systems, or how to design interfaces that actually serve users? You will by the time you graduate. That expertise is what separates a software engineer from someone who got lucky on a demo. If you were founding a startup tomorrow, would you rather have on your team someone who understands systems deeply and uses AI — or someone who only uses AI?

What this means for you: lean in. Those who learn the material in our courses deeply, embrace new tools, and build real skills will find many doors open. Those who coast will not — the degree itself, while important, doesn’t get you the job; deeply learning and applying the material does. Take every opportunity the Allen School offers, then go further on your own.

The data on new-grad employment aligns with all this: Our graduates pursue many paths, including a range of industry jobs and grad school, with the most common outcome being software engineering in industry. Over the last couple of years, despite all the changes AI is bringing to our world in general and the tech industry in particular, the strong job outcomes of Allen School graduates have been completely stable. The vast majority of our students continue to take full-time software engineering positions in the tech industry, starting shortly after graduation. This is true even for students who graduated just this last quarter. If you lean in, you will be very well-positioned to join them.


So I hope that you will roll up your sleeves, embrace the opportunity, and learn as much as you can in your time in the Allen School. Then continue to learn throughout your career. There will be many more significant technological advances between now and when you retire. And that’s what contributes to making our profession so fun to be in!

Saturday, April 19, 2025

Run time, run-time, and runtime

 Here are three variants of the word "run time" in computer science.  It's easy to mix them up, but your writing will be clearer if you do not.

  • run-time: adjective ("the run-time performance")
  • run time: noun -- a moment in time ("occurs at run time"), or an amount of time ("the run time is 8 hours")
  • runtime: noun -- a program that runs/supports another program ("the runtime handles memory allocation"); a synonym for "runtime system"

Defects4J wins the ISSTA 2024 Impact Award

UW PLSE members René Just, Darioush Jalali, and Michael Ernst have been awarded the ISSTA 2024 Impact Award for their ISSTA 2014 paper “Defects4J: A Database of existing faults to enable controlled testing studies for Java programs”.  The Impact Award is given to the single best paper presented at the conference 10 years earlier, when the impact of all the papers is better known.  ISSTA is the ACM SIGSOFT International Symposium on Software Testing and Analysis, a premier research venue.

Defects4J is a dataset of defects (bugs) mined from the version control repositories of open-source programs.  It has become an essential resource in any research related to program defects, including test case generation and prioritization, fault localization, automated program repair, mutation analysis, and more.  Google Scholar reports 1700 publications that mention Defects4J.

When building a tool to help programmers, the gold standard of quality is whether the tool is effective on real problems in real code.  In the past, such an evaluation was so difficult that most papers used fake defects, such as randomly changing part of a program.  Such defects are quite different from the ones that arise in real-world programs, raising questions about the generality of the research results.  Defects4J collected real defects and organized them in a way that made them easy to use.

For each of 835 defects (a number that is always increasing), Defects4J contains information from the source code repository:

  1. A defective version of the program.
  2. A fixed version of the program.
  3. A test suite.

The Defects4J authors did significant work to make these defects more useful to researchers.

  • They manually pruned out irrelevant changes between the program versions.  For instance, if the program's developers both fixed the defect and performed a refactoring, the Defects4J authors removed the refactoring.
  • They removed tests that fail on the fixed version of the program.  They also removed flaky tests, which are non-deterministic or dependent on test order.  If there is no triggering test (a test that fails before the fix and passes after the fix), they did not include the defect in Defects4J.
  • They wrote scripts that present a common interface to every defect for compiling, testing, computing code coverage, and other tasks.  This common interface abstracts away from the program's version control system, build system, library dependencies, and other details.  This common Defects4J interface makes running experiments with Defects4J much easier than with other collections of defects.

The Defects4J authors (René, Darioush, and Mike) were motivated by their own experiments:  they wanted to ensure that their evaluations were realistic and accurate.  Their work in creating Defects4J ended up having a big influence on many research areas, as recognized by the Impact Award.

Defects4J is freely available at https://defects4j.org/.


Monday, August 21, 2023

Pluggable type inference for free

Testing cannot ensure correctness, but verification can.  The problem with a specify-and-verify approach is the first part:  the specification.  Programmers are often reluctant to write specifications because it seems like extra work, beyond writing the code.

In the context of pluggable type-checking, the specifications are annotations.  Writing @Nullable annotations lets a verifier prove your program has no null pointer exceptions.  Writing @MustCall and @Owning lets a verifier prove your program has no resource leaks.  And so forth.

Sometimes we can infer specifications from program executions or from code comments.  An even better source is the source code itself.  So far, writing a new type inference tool has been a big effort, which has to be re-done for each type system.

Help is on the way.  Martin Kellogg, his students, and I have just published a paper, "Pluggable type inference for free" (in ASE 2023), that shows how to convert any type-checker into a type inference tool.  In other words, every type-checker now has an accompanying type inference tool.

To build our tool, we observed that pluggable type systems already do local type inference (within a method).  Our whole-program inference (called WPI) wraps a fixed-point loop around this.  There are many complications beyond this idea, which you can read about in the paper.
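The fixed-point structure is easy to see in miniature.  Here is a toy sketch (in Python, for brevity; WPI itself operates on Java source via the Checker Framework and is far more involved): a hypothetical nullness inference that reruns per-method inference until no inferred annotation changes.

```python
# Toy whole-program inference in the spirit of WPI (illustration only).
# Local, per-method inference runs inside a fixed-point loop: keep
# re-inferring until no method's inferred annotation changes.

def infer_nullable(methods):
    """methods maps a method name to its return expressions, each a pair
    (kind, callee): ("null", None), ("value", None), or ("call", name)."""
    nullable = {name: False for name in methods}   # optimistic starting point
    changed = True
    while changed:                                 # the fixed-point loop
        changed = False
        for name, returns in methods.items():      # "local" inference, per method
            inferred = any(kind == "null" or (kind == "call" and nullable[callee])
                           for kind, callee in returns)
            if inferred and not nullable[name]:
                nullable[name] = True              # strengthen the annotation
                changed = True                     # some method changed: iterate again
    return nullable

program = {
    "lookup":   [("null", None), ("value", None)],  # may return null directly
    "wrapper":  [("call", "lookup")],               # nullable only via lookup
    "constant": [("value", None)],                  # never returns null
}
```

Running `infer_nullable(program)` marks `lookup` and `wrapper` as @Nullable but leaves `constant` non-null: the second iteration propagates `lookup`'s nullability through `wrapper`, and the loop stops once nothing changes.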

In addition to being published at an academic conference, the technique is practical.  It is distributed with the Checker Framework.  This means that WPI has already created type inference tools for dozens of type systems, which you can use immediately.

The manual section on WPI tells you how to use it.  You can get started by running the wpi.sh script, which automatically understands many Ant, Maven, and Gradle build files.  Give it a try!  We look forward to your comments, suggestions, and even your bug reports.

Thursday, March 1, 2018

Generalized data structure synthesis with Cozy

For most of your programming, a few data structures suffice:  lists, sets, and maps/dictionaries.  But sometimes, your program requires a data structure with a more sophisticated interface or a more efficient algorithm.  Then you need to implement it yourself -- often in terms of lists, sets, and maps/dictionaries.

Implementing a data structure is time-consuming, error-prone, and it is difficult to achieve high efficiency.  Our Cozy tool improves this situation by creating the data structure for you.

Cozy's input is small, easy to write, and easy to write correctly.  You provide as input a trivial implementation, which uses an inefficient representation and inefficient algorithms.  For example, you might define a graph data structure as a set of <node, node> pairs (one for each edge).  Every operation, such as determining whether two nodes are connected or determining the out-degree of a node, requires searching the set for the given pair of nodes.  You can also define (inefficient) update operations, such as adding or removing graph edges or nodes.  You write the specification declaratively, by writing a set comprehension rather than an iteration.

Cozy's output is an implementation of the data structure.  For instance, it can automatically generate an implementation of a graph that uses adjacency lists.  The implementation runs fast because it uses an efficient representation.  For each update operation, the implementation automatically updates the representation; programmers often make mistakes in the update code.
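To make the input/output contrast concrete, here is a sketch in Python (Cozy has its own specification language and emits real code; the class and method names below are invented for illustration): the trivial edge-set specification next to the kind of adjacency-list implementation Cozy might synthesize, exposing the same interface.

```python
# Illustration only -- not Cozy's actual input or output.

class GraphSpec:
    """Trivial specification: a set of (src, dst) pairs, one per edge.
    Every query scans the whole set."""
    def __init__(self):
        self.edges = set()
    def add_edge(self, u, v):
        self.edges.add((u, v))
    def connected(self, u, v):
        return (u, v) in self.edges
    def out_degree(self, u):
        # Inefficient: scans every edge in the graph.
        return sum(1 for (src, _) in self.edges if src == u)

class GraphImpl:
    """The kind of efficient implementation a synthesizer could produce:
    adjacency sets, with update operations keeping them consistent."""
    def __init__(self):
        self.adj = {}
    def add_edge(self, u, v):
        self.adj.setdefault(u, set()).add(v)
    def connected(self, u, v):
        return v in self.adj.get(u, set())
    def out_degree(self, u):
        # Efficient: a single dictionary lookup.
        return len(self.adj.get(u, set()))
```

The two classes answer every query identically; the synthesized version is just faster, and the error-prone part (keeping the adjacency sets consistent on every update) is exactly what Cozy generates for you.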

We used Cozy to replace the implementations of data structures in 4 open-source programs.  Cozy requires an order of magnitude fewer lines of code than manual implementation, makes no mistakes even when human programmers do, and generally matches the performance of hand-written code.

At its heart, Cozy searches over possible implementations of the user-provided specification and determines whether each one is correct.  (Cozy has to synthesize both a representation and operations over it.)  A key idea in Cozy is the technique of "wishful thinking":  when Cozy notices that it would be convenient to have access to the value of a particular expression, it simply uses that expression, then later tries to synthesize an efficient implementation of it.  In the end, Cozy outputs the most efficient overall implementation that it has discovered.

A paper on Cozy will appear in ICSE 2018.  Compared to a previous version of Cozy described in PLDI 2016, the current version is more general; it does not need a custom outline language (it handles arbitrary expressions, and its search over implementations is more powerful) nor a tuning step over a user-supplied benchmark (it contains a symbolic cost model).  It can generate more complex data structures that involve multiple collections and aggregation operations.

Cozy was built by a team led by Calvin Loncaric.
Cozy is publicly available at https://github.com/CozySynthesizer/cozy.  Give it a try!

Tuesday, May 9, 2017

Ski rental, and when should you refactor your code?

A codebase often accumulates technical debt:  bad code that builds up over time due to poor design or to shortcuts taken to meet deadlines.

Sometimes, you need to take a break from implementing features or fixing bugs, and temporarily focus on repaying technical debt via activities such as refactoring.  If you don't do so, then the technical debt will accumulate until you find it impossible to make any changes to your software, and even refactoring to reduce the technical debt may become impractical.  But, you don't want to spend all your time refactoring nor to do so prematurely:  you want to spend most of your effort on delivering improvements that are visible to users, and you cannot be sure of your future development plans and needs.

Typically, the decision about when to refactor or clean up code is based primarily on gut feel, emotions, or external deadlines.  The ski rental algorithm offers a better solution.

Ski rental is a canonical rent/buy problem.  When you are learning to ski and unsure of whether you will stick with the sport, each day you have the choice of renting or buying skis.  Buying skis prematurely would be a waste of money if you stop skiing.  Renting skis for too long would be a waste of money if you ski for many days.

Here is an algorithm that guarantees you never spend more than twice as much as you would have, if you had known in advance how many days you would ski.  Rent until you have spent as much money as it would cost to buy the skis, and then buy the skis.  If you quit skiing during the rental period, you have spent the minimum possible amount.  If you continue skiing until you purchase skis, then you have spent twice as much as you would have, were you omniscient.  (Randomized algorithms exist that give a better expected amount of money lost, but do not guarantee a limit on the amount of wasted money.)
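The 2x guarantee is easy to check numerically.  Here is a short sketch of the break-even strategy (the function and variable names are mine, chosen for illustration):

```python
# Ski rental, break-even strategy: rent until cumulative rental spending
# reaches the purchase price, then buy.

def strategy_cost(days, rent, buy):
    """Total cost of the break-even strategy if you end up skiing `days` days."""
    breakeven = buy // rent            # rental days whose total cost reaches the purchase price
    if days <= breakeven:
        return days * rent             # quit while still renting: you paid the minimum
    return breakeven * rent + buy      # rented to the break-even point, then bought

def optimal_cost(days, rent, buy):
    """Cost with perfect foresight about the number of ski days."""
    return min(days * rent, buy)
```

For example, with a $10/day rental and an $80 purchase price, quitting after 5 days costs $50 (the optimal amount), while skiing 20 days costs $160: exactly twice the $80 an omniscient skier would have paid, and never more.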

You can apply the same approach to refactoring.  For any proposed refactoring, estimate how much time it would take to perform, and estimate the ongoing costs of not refactoring (such as extra time required to add features, fix bugs, or test).  Use the ski rental algorithm to decide when to perform the refactoring.

The problem with the ski-rental approach is that programmers are notoriously bad at cost estimation, and ski rental requires you to compare two different estimates.  However, the alternative is to continue to make your decisions based on how much the code smells bother you -- an approach that is likely to waste your time, even if it satisfies your emotions.

Monday, May 1, 2017

Plotters, pantsers, and software development

Last week, I gave two talks at TU Delft and also had the privilege to hear a talk by Dr. Felienne Hermans that analogized programming to story-writing.  One source of inspiration for her was the observation that when kids program, their programs might not contain any user interaction, but only show a story, somewhat like a movie.

There are two general types of fiction writers:  plotters and pantsers.  A plotter outlines the story and plans its structure and characters before beginning to write and filling in the details.  By contrast, a pantser prefers to write by the seat of the pants, discovering the story as they write and later revising to achieve consistency.  There are great writers who are plotters and great writers who are pantsers (and most writers are probably some combination of the two personalities).  Each approach requires heavy work.  Plotters do their heavy work during the planning stage.  Pantsers do their heavy work during rewriting stages.

Most recommendations about software development come from a plotter mentality.  The developer should determine user requirements and decide on an architecture and specifications of components before writing the code.  Extreme Programming can be viewed as a reaction to this "Big Design Up Front" attitude.  Extreme Programming forbids pre-planning:  it encourages taking on one small task at a time and doing only enough work to complete that task.  It advocates refactoring during development -- similar to rewriting a text -- as the developers discover new requirements or learn the limitations of their design.

Perhaps Extreme Programming is a reaction from pantsers who feel alienated by the dominant software development approach.  Perhaps Extreme Programming is their attempt to express and legitimize their own style of thinking.  Perhaps by respecting those mental differences, we can improve education to attract more students and make them all feel welcome.  And perhaps both plotters and pantsers can understand the other in order to avoid needless religious wars over the right approach to software development.

Felienne wasn't able to answer my questions about plotters and pantsers, such as the following.  Can we look at a finished piece of writing and tell whether it was created by a plotter or a pantser?  How should we teach writing differently depending on the learner's preferred style?  Is one's personal style innate or learned?  Can people be trained to work in the other style, and what is the effect on their output?  For novices, which approach produces more successfully-completed manuscripts and fewer abandoned efforts?  Are different styles more appropriate for different genres, or for series rather than individual books?

Analogies can be useful, especially in sparking ideas, but they should not be taken too far.  For example, the frequent analogies between civil engineering and software engineering have led to unproductive "bridge envy" and incorrect comparisons.  Although many bridges are built each year, most of them do not require imaginative design because they are similar or identical to previously-built bridges.  By contrast, every new program is fundamentally different from what exists -- otherwise, we would just reuse or modify existing code.  Therefore, the design challenges are much greater for software.

I also have questions about the analogy between fiction writing and programming.  (I want to admit to you and to myself that I am a plotter, so these questions may reflect my personal bias.)  Although plotting and pantsing may both produce great novels when practiced by great writers, would they both produce great nonfiction -- or, for that matter, great bridges?  Extreme Programming has been shown to work in certain circumstances, but few people practice it in its pure form, and it does not scale up to large development efforts.  It is commonly said that you can't refactor a program to make it secure or to give it certain other desirable properties; is this actually true, and if so what does it say about the utility of pantsing in software engineering?  Can pantsing work well within the confines of a well-understood domain -- such as writing a period romance novel or building a website based on a framework you have used before?

Whatever the benefits of the writing analogy for software engineering, it is a thought-provoking alternative to the civil engineering analogy.  It reminds us to be aware of the many ways of thinking, not just our own.