
Please feel free to forward this newsletter to as many friends as possible. I have a personal goal of having one million readers. I can achieve it with your help.
the Software View: What, me worrY2K? or How I stopped worrying and learned to love the Year-2000 problem. (Part I)
Welcome back, gentle readers. Your intrepid reporter and faithful correspondent has attempted to travel forward in time. All in an effort to bring you excellent content.
For those of you with Web access and a Netscape Navigator browser, please click here:
http://www.softwareview.com/
Scroll down the page and you will notice a link entitled "Daily view weblog". The daily news page is also known as a "web log"; it is the fashion these days to call it that. Click on the link, then click "reload" on your browser or clear your browser cache to ensure that you always receive the freshest, hottest daily news concerning Java™, Linux, XML, and the software industry! The link never changes, but I will be updating the HTML page behind it every day. Please do take a gander at it every day.
Also, gentle readers, the Software View is an Associate Web site of Amazon.com. I'd like to extend my sincere, heartfelt thanks for your patronage. I'm offering links to books, et cetera, that you can purchase through my web site, and I'd greatly appreciate it if you would buy your software industry books there. Doing so helps support my newsletter and web site. Here is the URL (Uniform Resource Locator):
Click here
Now, dear readers, on with this week's episode of the Software View!
Your company's an aircraft carrier, and it could also become your Titanic. The iceberg's January 1, 2000 - Black Saturday. The good news is that you can see it coming. The bad news is that you're heading right for it. The worse news is that you can't turn in time. Now what? The countdown's begun and we've less than a year until we discover what "really" happens when the year becomes "00". Trust the software industry to shorten the "Year-2000" problem to "Y2K". It was this kind of thinking that caused the problem in the first place. It's The End Of The World As We Know It (TEOTWAWKI).
THE PROBLEM: BLIND DATE
As the entire world now knows, there's a problem with computer hardware, software, and data. Norman Shakespeare writes, "The Year-2000 problem's root cause is easily explained. The earliest computer programmers had so little memory to work with, that any trick for saving two bits was worthwhile. In the 1950's, who was even worried about how computers would handle data in the 2050's? The chances that a year entered into corporate records would need to begin with anything other than "19" seemed quite remote, so dropping the century digits was adopted as a memory-saving method. As computers became more powerful, this abbreviated dating convention continued to be the standard, mostly out of habit.
For many reasons, programmers have routinely used only two digits to represent the year in dates. Thus, "25" meant "1925". This works fine through 1999. After that, two-digit dates cause confusion because, if "25" means "1925", then "00" means "1900". This is called the Year-2000 problem - or Y2K for short, or sometimes the Millennium Bug (although it's not a bug at all). Year-2000 is a crisis without precedent in human history. We know exactly when it's going to occur. We also know that its effects will be global. We even know what's causing it and what to do about it. That's right: We can, if we all choose, solve it before it happens, although we probably won't. But here we are at the end of the twentieth century, a time when the inability of our machines to answer the simple question "is it the twentieth century or the twenty-first?" could result in the collapse of the communications, financial, filing, monitoring, security, and manufacturing systems that our entire economy relies upon.
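To see the arithmetic go wrong, consider a minimal Python sketch. The function names are hypothetical, invented purely for illustration, and the sketch assumes the simplest possible scheme, in which a program always puts "19" in front of a stored two-digit year. Real systems vary widely in how they store and expand dates.

```python
def expand_year_naive(two_digit_year: int) -> int:
    """Expand a stored two-digit year by always assuming the 1900s.

    Hypothetical helper: it mirrors the convention described above,
    where "25" means 1925 and therefore "00" means 1900.
    """
    if not 0 <= two_digit_year <= 99:
        raise ValueError("expected a two-digit year between 0 and 99")
    return 1900 + two_digit_year


def years_between(start_yy: int, end_yy: int) -> int:
    """Return the elapsed years between two stored two-digit years."""
    return expand_year_naive(end_yy) - expand_year_naive(start_yy)


# A record dated "25", examined in "99": 1999 - 1925
print(years_between(25, 99))  # 74

# The same record examined one year later, in "00": 1900 - 1925
print(years_between(25, 0))   # -25
```

The second result is the whole problem in miniature: a span that should have grown to seventy-five years becomes negative twenty-five, and any interest, age, or expiry calculation built on it inherits the error.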
The timing'll be fortunate, giving businesses the weekend to accommodate the possible onslaught. New Year's Eve 1999 will fall on a Friday evening. January 1 is a Saturday. So if the world comes to an end for a couple of days, it'll be okay. We've all had weekends like that.
THE MYTH OF ORDER
Ellen Ullman writes, "The real lesson of the Year-2000 problem is that software operates just like any other natural system: out of control. Y2K has uncovered a hidden side of computing. It's always been there, of course, and always will be. It's simply been obscured by the pleasures we get from our electronic tools and toys, and then lost in the zingy glow of techno-boosterism. Y2K is showing everyone what technical people have been dealing with for years: the complex, muddled, bug-bitten systems we all depend upon, and their nasty tendency toward the occasional disaster.
It's almost a betrayal. After being told for years that technology's the path to a highly evolved future, it's come as something of a shock to discover that a computer system isn't a shining city on the hill - perfect and ever new - or a gleaming glass tower of academia, but something more akin to an old farmhouse built bit by bit over decades by non-union carpenters.
The reaction has been anger, outrage even - how could all you programmers be so stupid? Y2K has challenged a belief in digital technology that has been almost religious. But it isn't surprising. The public has had little understanding of the context in which Y2K exists. Glitches, patches, crashes - these are as inherent to the process of creating an intelligent electronic system as is the beauty of an elegant algorithm, the satisfaction of a finely tuned program, the gee-whiz pleasure of messages sent around the world at light speed. Until you understand that computers contain both of these aspects - elegance and error - you can't really understand Y2K.
WHY Y2K?
Technically speaking, the "millennium bug" isn't a bug at all, but what is called a "design flaw". Programmers are very sensitive to the difference, since a bug means the code is at fault (the program isn't doing what it was designed to do), and a design flaw means it's the designer's fault (the code is doing exactly what was specified in the design, but the design was wrong and/or inadequate). In the case of the millennium bug, of course, the code was designed to use two-digit years, and that is precisely what it's doing. The problem comes if computers misread the two-digit numbers - "00", "01", et cetera. Should these be seen as "1900" and "1901", or as "2000" and "2001"? Two-digit dates were used originally to save space, since computer memory and disk storage were prohibitively expensive. The designers who chose to specify these two-digit "bugs" weren't stupid, and perhaps they weren't even wrong. By some accounts and estimates, the savings accrued by using two-digit years will have outweighed the entire cost of fixing the code for the year 2000.
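The question of how "00" and "01" should be read can be made concrete with a small Python sketch. It uses a "pivot year" rule, which is one possible way to resolve the ambiguity and is not something this newsletter's sources prescribe. The function name and the pivot value of 50 are assumptions chosen only for illustration.

```python
def expand_year_windowed(two_digit_year: int, pivot: int = 50) -> int:
    """Expand a two-digit year using a pivot (an illustrative assumption).

    Years below the pivot are read as 20xx; years at or above the
    pivot are read as 19xx. The pivot of 50 is arbitrary.
    """
    if not 0 <= two_digit_year <= 99:
        raise ValueError("expected a two-digit year between 0 and 99")
    if two_digit_year < pivot:
        return 2000 + two_digit_year
    return 1900 + two_digit_year


print(expand_year_windowed(0))   # 2000
print(expand_year_windowed(1))   # 2001
print(expand_year_windowed(99))  # 1999
print(expand_year_windowed(25))  # 2025
```

Note the last line: under this rule "25" no longer means 1925. Any such rule only moves the ambiguity to a different year, because two digits cannot carry the century information that was given up.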
But Y2K didn't even begin its existence as a design flaw. Up until the mid-1980's - almost thirty years after two-digit years were first put into use - what we now call Y2K would've been called an "engineering trade-off," and a good one. A trade-off: To get something you need, you give up something else you need less urgently; to get more space on disk and in memory, you give up the precision of the century indicators. Perfectly reasonable. The correct decision. The surest sign of its correctness is what happened next: Two-digit years went on to have a long, successful life as a "standard." Computer systems could not work without standards - an agreement among programs and systems about how they will exchange information. Dates flowed from program to program, system to system, from tape to memory to paper, and back to disk - it all worked just fine for decades.
Though not for centuries, of course. The near immortality of computer software has come as a shock to programmers. Ask anyone who was there: they never expected this stuff to still be around.
Bug, design flaw, side effect, engineering trade-off - programmers have many names for system defects, the way Eskimos have many words for snow. And for the same reason: They are very familiar with the thing and can detect its fine gradations. To be a programmer is to develop a carefully managed relationship with error. There's no getting around it. You either make your accommodations with failure, or the work'll become intolerable. Every program has a bug; every complex system has its blind spots. Occasionally, given just the right set of circumstances, something will fail spectacularly. There is a Silicon Valley company, formerly called Failure Analysis, whose business consists of studying system disasters. The company's sign used to face the freeway like a warning to every technical person heading north out of Silicon Valley: Failure Analysis.
No one simply accepts the inevitability of errors - no honest programmer wants to write a bug that'll bring down a system. Both engineers and technical managers have continually looked for ways to normalize the process, to make it more reliable, predictable - schedulable, at the very least. They have talked perennially about certification programs, whereby programmers would have to prove minimal proficiency in standard skills. They have welcomed the advent of reusable software components, or "objects," because components are supposed to make programming more accessible, a process more like assembling hardware than proving a mathematical theorem. They have tried elaborate development methodologies. But the work of programming has remained maddeningly undefinable, some mix of mathematics, sculpting, scrupulous accounting, and wily, ingenious plumbing.
In the popular imagination, the programmer is a kind of traveler into the unknown, venturing near the margin of mind and meatspace. Maybe. For moments. On some extraordinary projects, sometimes - a new operating system, a newly conceived class of software. For most of us, though, programming is not a dramatic confrontation between human and machine; it is a confused conversation with programmers we will never meet, a frustrating wrangle with some other programmer's code called maintenance.
Most modern programming is done through what are called application programming interfaces, or API's. Your job is to write some code that will talk to another piece of code in a narrowly defined way using the specific methods offered by the interface, and only those methods. The interface is rarely documented well. The code on the other side of the interface is usually sealed in a proprietary black box. And below that black box is another, and below that another - a receding tower of black boxes, each with its own errors. You can't envision the whole tower, you can't open the boxes, and what information you have been given about any individual box could be wrong. The experience is a little like looking at a madman's electronic bomb and trying to figure out which wire to cut. You try to do it carefully but sometimes things blow up.
At its core, programming remains irrational - a time-consuming, painstaking, error-stalked process, out of which comes a functional but flawed piece of work. And it most likely will remain so as long as we are using computers whose basic design descends from ENIAC, a machine constructed to calculate the trajectory of artillery shells. A programmer is presented with a task that a program must accomplish. But it is a task as a human sees it: full of unexpressed knowledge, implicit associations, allusions to allusions. Its coherence comes from knowledge structures deep in the body, from experience, memory. Somehow all this must be expressed in the constricted language of the API, and all of the accumulated code must resolve into a set of instructions that can be performed by a machine that is, in essence, a giant calculator. It shouldn't be surprising if mistakes are made.
There is irrationality at the core of programming, and there is irrationality surrounding it from without. Factors external to the programmer - the whole enterprise of computing, its history and business practices - create an atmosphere in which flaws and oversights are that much more likely to occur.
The most irrational of all external factors, the one that makes the experience of programming feel most insane, is known as "aggressive scheduling." Whether software companies will acknowledge it or not, release schedules are normally driven by market demand, not the actual time it would take to build a reasonably robust system. The parts of the development process most often foreshortened are two crucial ones: design documentation and testing. There is a senior consultant - a woman who has been in the business for some thirty years, someone who founded and sold a significant software company - who explains why she would no longer work with a certain client. She had presented a software development schedule to the client, who received it, read it, then turned it back to her, asking if she'd remake the schedule so that it took exactly half the time. There were many veteran programmers in the room; they nodded along in weary recognition.
Even if programmers were given rational development schedules, the systems they work on are increasingly complex, patched together - and incoherent. Systems have become something like Russian nesting dolls or gift boxes within gift boxes, with newer software wrapped around older software, which is wrapped around software that is older yet. We have come to see that code doesn't evolve; it accumulates.
A BUG'S LIFE
A young Web company founder - very young; Scott Hassan of eGroups.com - suggests that all programs should be replaced every two years. He's probably right. It would be a great relief to toss all our old code into that trash container where we dumped the computer we bought a couple of years ago. Maybe on the Web we can constantly replenish our code: The developer never lets go of the software; it sits there on the server available for constant change, and the users have no choice but to take it as it comes.
But software doesn't follow Moore's Law, doubling its power every eighteen months. It is still the product of a handworked craft, with too much meticulous effort already put into it. Even eGroups.com, founded only nine months ago, finds itself stuck with code programmers have no time to redo. Said Carl Page, another of its founders, "We're living with code we wish we'd done better the first time."
The problem of old code is many times worse in a large corporation or a government office, where whole subsystems may have been built twenty or thirty years ago. Most of the original programmers are long gone, taking their knowledge with them - along with the programmers who followed them, and ones after that. The code, a sort of palimpsest by now, becomes difficult to understand. Even if the company had the time to replace it, it is no longer sure of everything the code does. So it's kept running behind wrappers of newer code - so-called middleware, or quickly developed user interfaces like the Web - which keeps the old code running, but as a fragile, precious object. The program runs, but isn't understood; it can be used, but not modified. Eventually, a complex computer system becomes a journey backward through time. Look into the center of the most slick-looking Web banking site, built a few months ago, and you are bound to see a creaky database running on an aged mainframe.
Adding yet more complexity are the electronic connections that have been built between systems: customers, suppliers, financial clearinghouses, whole supply chains interlinking their systems. One patched-together wrapped-up system exchanges data with another patched-together wrapped-up system - layer upon layer of software involved in a single transaction, until the possibility of failure increases exponentially.
It is from deep in there - somewhere near the middle-most Russian doll or gift box in the innermost layer of software - that the millennium bug originates. One system sends it on to the next, along with the many bugs and problems we already know about, and the untold numbers that remain to be discovered. One day - maybe when we switch to the new version of the Internet Protocol, or when some router somewhere is replaced - one day the undiscovered bugs will come to light and we will have to worry about each of them in turn. The millennium bug isn't unique; it is just the flaw we see now, the most convincing evidence yet of the human fallibility that lives inside every system.
It's hard to overstate just how common bugs are. Every week, the computer trade paper InfoWorld prints a little box called "The Bug Report," showing problems in commonly used software, some of them very serious. And the box itself is just a sampling from "www.bugnet.com", where one day's search for bugs relating to "security" yielded a list of sixty-eight links, many to other lists and to lists of links, reflecting what may be thousands of bugs related to this keyword alone. And those are just the ones that are known about and have been reported.
If you think about all the things that can go wrong, it'll drive you crazy. So technical people, who can't help knowing about the fragility of systems, have had to find some way to live with what they know. What they've done is develop a normal sense of failure, an everyday relationship with potential disaster.
To be continued ...
Sincerely,
Mark Kuharich
Join my free e-mail newsletter called the Software View by clicking here or by sending an e-mail to thesoftwareview-owner@west-point.org