Sunday, August 31, 2003

Lately, many people have been linking to these pictures of a Windows error dialog in a highly visible location. Of course this is nothing new, but here are a few of my thoughts on the subject:

Just because Windows crashes are frequently observed in highly visible locations doesn't necessarily mean Windows is less reliable than other operating systems. Suppose operating system A runs on 90% of computers and crashes 1% of the time, and operating system B runs on only 10% of computers but crashes twice as often. If you observe computers at random, you will notice more 'A' crashes than 'B' crashes (0.9% of your observations versus 0.2%) even though 'A' is in fact more reliable.
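To make the arithmetic concrete, here is a small Python sketch of that thought experiment. The market shares and crash rates are the made-up numbers from the paragraph above, not measurements, and the function name is just for illustration.

```python
# Hypothetical figures from the thought experiment, not real statistics.
share_a, crash_a = 0.90, 0.01   # O/S A: 90% of machines, crashed 1% of the time
share_b, crash_b = 0.10, 0.02   # O/S B: 10% of machines, crashes twice as often


def observed_crash_fraction(share, crash_rate):
    """Chance that a randomly observed machine runs this O/S and is crashed."""
    return share * crash_rate


seen_a = observed_crash_fraction(share_a, crash_a)
seen_b = observed_crash_fraction(share_b, crash_b)
ratio = seen_a / seen_b

print(f"crashed 'A' machines seen: {seen_a:.1%} of observations")
print(f"crashed 'B' machines seen: {seen_b:.1%} of observations")
print(f"'A' crashes are seen about {ratio:.1f} times as often as 'B' crashes")
```

With these numbers a passerby sees roughly 4.5 'A' crashes for every 'B' crash, even though each individual 'A' machine is half as likely to be down.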

Not every highly visible error message can be blamed on the operating system. Looking closely at the pictures, we see that the actual message is "Your system is low on virtual memory. Windows is increasing the size of your virtual memory paging file. During this process, memory requests for some applications may be denied." The most likely cause of this message is an application program leaking memory, not a problem with the operating system. Even the smallest memory leak in an application that runs continuously, like the one displaying whatever should have been on the Macy's sign, will gradually add up. But the operating system cannot distinguish memory that a process has leaked from storage that it just hasn't used in a long time, and so it must continue to keep track of it. When the size of the leaking process exceeds the space available for paging, the operating system must handle the condition in some way. Linux, for example, uses a heuristic to select a process to terminate with the intention of freeing up the most virtual memory. The designers of Windows, as explained in the error message above, decided to increase the size of the paging file while denying further allocation requests in the meantime. In both cases, the operating system is doing its best to continue despite an error in the running application. Programmers: Be careful about resource management when writing long-lived applications, like servers or embedded systems. Little mistakes add up over time.
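As a rough illustration of how little mistakes add up, here is a Python sketch of a long-running display loop. Everything in it is hypothetical (I have no idea what the sign software actually looks like): LeakyDisplay keeps a reference to every frame it has ever shown, while BoundedDisplay keeps only the most recent ones. In a garbage-collected language this is unbounded growth rather than a classic lost-pointer leak, but the effect on the paging file is the same.

```python
from collections import deque


class LeakyDisplay:
    """Hypothetical sign driver that remembers every frame forever."""

    def __init__(self):
        self.history = []  # never trimmed, so it grows for as long as we run

    def show(self, frame):
        self.history.append(frame)


class BoundedDisplay:
    """Same idea, but old frames are dropped once the limit is reached."""

    def __init__(self, keep=100):
        self.history = deque(maxlen=keep)  # deque discards the oldest entry

    def show(self, frame):
        self.history.append(frame)


leaky = LeakyDisplay()
bounded = BoundedDisplay(keep=100)

# Simulate a sign that has been running for a while.
for i in range(10_000):
    frame = f"frame {i}"
    leaky.show(frame)
    bounded.show(frame)

print(len(leaky.history), "frames retained by the leaky version")
print(len(bounded.history), "frames retained by the bounded version")
```

A few bytes per frame looks harmless in testing. Run it for weeks on a sign and the leaky version's footprint is limited only by the paging file.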

One mistake that the designers of Windows did make, however, was to assume that system errors should always be reported via a graphical dialog on the system console. In many computer applications, either there is no console, or, as in the case observed by New Yorkers on the corner of 34th and 7th in Manhattan, the users "at" the console are not the ones best equipped to handle the problem. I'm sure important system messages also end up in the Windows Event Log, but the lesson here is that popping up a dialog is not always appropriate. Programmers: Do not assume that the software you write will be run in an interactive environment.
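Here is one way an application might follow that advice, sketched in Python. The report_error function and the "signage" logger name are made up for this example. It assumes that a terminal attached to standard input is a fair hint that a person is watching, which will not hold in every deployment.

```python
import io
import logging
import sys


def report_error(message, stream=None):
    """Tell a human if one seems to be present, otherwise write to the log.

    Returns "console" or "log" so callers can see which path was taken.
    """
    if stream is None:
        stream = sys.stdin  # may itself be None when no console exists
    interactive = stream is not None and stream.isatty()
    if interactive:
        print(f"ERROR: {message}", file=sys.stderr)
        return "console"
    # No operator at the console: record the problem where an administrator
    # will look for it instead of interrupting whatever is on screen.
    logging.getLogger("signage").error(message)
    return "log"


# An in-memory stream is never a terminal, so this stands in for a
# headless machine such as the one driving a sign.
result = report_error("low on virtual memory", stream=io.StringIO())
print("reported via:", result)
```

The point is not the particular check but that the program decides where a message should go instead of assuming someone is sitting at the screen.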
