Sunday, July 24, 2016


The Institutional Repository (IR) is obsolete. Its flawed foundation cannot be repaired. The IR must be phased out and replaced with viable alternatives.

Lack of enthusiasm. The number of IRs has grown because of a few motivated faculty and administrators. After twenty years of promoting IRs, there is no grassroots support. Scholars submit papers to an IR because they have to, not because they want to. Too few IR users become recruiters. There is no network effect.

Local management. At most institutions, the IR is created to support an Open Access (OA) mandate. As part of the necessary approval and consensus-building processes, various administrative and faculty committees impose local rules and exemptions. After launch, the IR is managed by an academic library accountable only to current faculty. Local concerns dominate those of the worldwide community of potential users.

Poor usability. Access, copy, reuse, and data-mining rights are overly restrictive or left unstated. Content consists of a mishmash of formats. The resulting federation of IRs is useless for serious research. Even the most basic queries cannot be implemented reliably. National IRs (like PubMed) and disciplinary repositories (like ArXiv) eliminate local idiosyncrasies and are far more useful. IRs were supposed to duplicate their success, while spreading the financial burden and immunizing the system against adverse political decisions. The sacrifice in usability is too high a price to pay.

Low use. Digital information improves with use. Unused, it remains stuck in obsolete formats. After extended non-use, recovering information requires a digital version of archaeology. Every user of a digital archive participates in its crowd-sourced quality control. Every access is an opportunity to discover, report, and repair problems. To succeed at its archival mission, a digital archive must be an essential research tool that all scholars need every day.

High cost. Once upon a time, the IR was a cheap experiment. Today's professionally managed IR costs far too much for its limited functionality.

Fragmented control. Over the course of their careers, most scholars are affiliated with several institutions. It is unreasonable to distribute a scholar's work according to where it was produced. At best, it is inconvenient to maintain multiple accounts. At worst, it creates long-term chaos to comply with different and conflicting policies of institutions with which one is no longer affiliated. In a cloud-computing world, scholars should manage their own personal repositories, and archives should manage the repositories of scholars no longer willing or able.

Social interaction. Research is a social endeavor. [Creating Knowledge] Let us be inspired by the titans of the network effect: Facebook, Twitter, Instagram, Snapchat, etc. Encourage scholars to build their personal repository in a social-network context. Disciplinary repositories like ArXiv and SSRN can expand their social-network services. Social networks like Mendeley, Zotero, and Figshare have the capability to implement and/or expand IR-like services.

Distorted market. Academic libraries are unlikely to spend money on services that compete with IRs. Ventures that bypass libraries must offer their services for free. In desperation, some have pursued (and dropped) controversial alternative methods of monetizing their services. [Scholars Criticize Proposal to Charge Authors for Recommendations]

Many academics are suspicious of any commercial interests in scholarly communication. Blaming publishers for the scholarly-journal crisis, they conveniently forget their own contribution to the dysfunction. Willing academics, with enthusiastic help from publishers, launch ever more journals. [Hitler, Mother Teresa, and Coke] They also pressure libraries to site-license "their" journals, giving publishers a strong negotiation position. Without library-paid site licenses, academics would have flocked to alternative publishing models, and publishers would have embraced alternative subscription plans like an iTunes for scholarly papers. [Where the Puck won't be] [What if Libraries were the Problem?] Universities and/or governments must change how they fund scholarly communication to eliminate the marketplace distortions that preserve the status quo, protect publishers, and stifle innovation. In a truly open market of individual subscriptions, start-up ventures would thrive.

I believed in IRs. I advocated for IRs. After participating in the First Meeting of the Open Archives Initiative (1999, Santa Fe, New Mexico), I started a project that would evolve into Caltech CODA. [The Birth of the Open Access Movement] We encouraged, then required, electronic theses. We captured preprints and historical documents. [E-Journals: Do-It-Yourself Publishing]

I was convinced IRs would disrupt scholarly communication. I was wrong. All High Energy Physics (HEP) papers are available in ArXiv. Being a disciplinary repository, ArXiv functions like an idealized version of a federation of IRs. It changed scholarly communication for the better by speeding up dissemination and improving social interaction, but it did not disrupt. On the contrary, HEP scholars organized what amounted to an authoritarian take-over of the HEP scholarly-journal marketplace. While ensuring open access to all HEP research, this take-over also cemented the status quo for the foreseeable future. [A Physics Experiment]

The IR is not equivalent to Green Open Access. The IR is only one possible implementation of Green OA. With the IR at a dead end, Green OA must pivot towards alternatives that have viable paths forward: personal repositories, disciplinary repositories, social networks, and innovative combinations of all three.

*Edited 7/26/2016 to correct formatting errors.

Tuesday, January 20, 2015

Creating Knowledge

Every scholar is part wizard, part muggle.

As wizards, scholars are lone geniuses in search of original insight. They question everything. They ignore conventional wisdom and tradition. They experiment.

As muggles, scholars are subject to the normal rules of power and influence. They are limited by common sense and group think. They are ambitious. They promote and market their ideas. They have the perfect elevator pitch ready for every potential funder of research. They connect their research to hot fields. They climb the social ladder in professional societies. As muggles, they know that the lone voice is probably wrong.

The sad fate of the wizards is that their discoveries, no matter how significant, are not knowledge until accepted by the muggles.

Einstein stood on the shoulders of giants: he needed all of the science that preceded him. First, he needed it to develop special relativity theory. Then, he needed it as a starting point from where to lead the physics community on an intellectual journey. Without that base of prior shared knowledge, they would not have followed.

As a social construct, knowledge moves at a speed limited by the wisdom of the crowd. The real process by which scholarly research moves from the world of the wizard into the world of muggles is murky, complicated, long-winded, and ambiguous. Despising these properties, muggles created a clear and straightforward substitute: the peer-review process.

When only a small number of distinguished scholarly bodies published journals, publishing signaled that the research was widely accepted as valid and important. Today, thousands of scholarly groups and commercial entities publish as many as 28,000 scholarly journals, and publishing no longer functions as a serious proxy for wide acceptance.

Most journals are created when some researchers believe established journals ignore or do not sufficiently support a new field of inquiry. New journals give new fields the time and space to grow and to prove themselves. They also reduce the size of the referee pool. They avoid generalists critical of the new field. Gradually, peer review becomes a process in which like-minded colleagues distribute stamps of approval to each other.

Publishers thrive by amplifying scholarly fractures and by creating scholarly islands. As discussed in previous blog posts, normal free-market principles do not apply to the scholarly-journal market. [What if Libraries were the Problem] Without an effective method to kill off journals, their number and size keep increasing. Unfortunately, the damage to universities and to scholarship far exceeds the cost of journals.

Niche fields use their success in the scholarly-communication market to acquire departmental status, making the scholarly fracture permanent. The economic crisis may have stopped or reversed the trend of ever more specialized, smaller, university departments, but the increased cost structure inherited from the boom years lingers. Creating a new department should be an exceptional event. Universities went overboard, influenced and pressured by commercial interests.

As a quality-control system, the scholarly-communication system should be conservative and skeptical. As a communication system, it should give exposure to new ideas and give them a chance to develop. By simultaneously pursuing two contradictory goals, scholarly journals have become ineffective at both. They are too specialized to be credible validators. They are too slow and bureaucratic for growing new ideas.

Journals survive because universities use them for assessment. Not surprisingly, scholarly papers solidly reside in muggle world. Too many papers are written by Very Serious Intellectuals (VSIs) for VSIs. Too many papers are written in self-aggrandizing pompous prose, loaded with countless footnotes. Too many papers are written to flatter VSIs with too many irrelevant references. Too many papers are written to puff up a tidbit of incremental information. Too many papers are written. Too few papers detail negative results or offer serious critique, because that only makes enemies.

When given the opportunity, scholarly authors produce awe-inspiring presentations. The edutainment universe of TED Talks may not be an appropriate forum for the daily grunt work of the scholar, but is it really too much to ask that the scholarly-communication system let the wizardry shine through?

Universities claim to be society's engines of innovation. They have preached the virtues of creative destruction brought on by technological innovation. Yet, the wizards of the ivory tower resist minor change as much as the muggles of the world.

Open Access is catalyzing reform on the business side of the scholarly-communication system. Will Open Access be enough to push universities into experimentation on the scholarly side?

That is an Open question.

Wednesday, October 1, 2014

The Metadata Bubble

In an ideal world, scholars deposit their papers in an Open Access repository, because they know it will advance their research, support their students, and promote a knowledge-based society. A few disciplinary repositories, like ArXiv, have shown that it is possible to close the virtuous cycle where scholars reinforce each other's Open Access habits. In these communities, no authority is needed to compel participation.

Institutional repositories have yet to build similar broad-based enthusiastic constituencies. Yet, many Open Access advocates believe that the decentralized approach of institutional repositories creates a more scalable system with a higher probability for long-term survival. The campaign to enact institutional deposit mandates hopes to jump start an Open Access virtuous cycle for all scholarly disciplines and all institutions. The risk of such a campaign is that it may backfire if scholars should experience Open Access as an obligation with few benefits. For long-term success, most scholars must perceive their compelled participation in Open Access as a positive experience.

It is, therefore, crucial that repositories become essential scholarly resources, not dark archives to be opened only in case of emergency. The Open Archives Initiative (OAI) repository design provided what was thought to be the necessary architecture. Unfortunately, we are far from realizing its anticipated potential. The Protocol for Metadata Harvesting (OAI-PMH) allows service providers to harvest any metadata in any format, but most repositories provide only minimal Dublin Core metadata, a format in which most fields are optional and several are ambiguous. Extremely few repositories enable Object Reuse and Exchange (OAI-ORE), which allows for complex inter-repository services through the exchange of multimedia objects, not just metadata about them. As a result, OAI-enabled services are largely limited to the most elementary kind of searches, and even these often deliver unsatisfactory results, like metadata-only placeholder records for works restricted by copyright or other considerations.
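To make the point about minimal Dublin Core concrete, here is a rough sketch that checks which of the fifteen Dublin Core elements a harvested record actually carries. The record itself is invented for illustration (the repository name, identifier, and values are all hypothetical), and the helper names dc_fields and missing_fields are my own; only the XML namespaces come from the OAI and Dublin Core specifications.

```python
import xml.etree.ElementTree as ET

# Namespace of the simple Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"

# The fifteen elements of simple Dublin Core; all are optional.
DC_ELEMENTS = [
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
]

# Hypothetical harvested record, invented for this example.
SAMPLE_RECORD = """<record xmlns="http://www.openarchives.org/OAI/2.0/">
  <header>
    <identifier>oai:repository.example.edu:1234</identifier>
    <datestamp>2014-10-01</datestamp>
  </header>
  <metadata>
    <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
               xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:title>An Example Paper</dc:title>
      <dc:creator>Doe, J.</dc:creator>
      <dc:date>2014</dc:date>
    </oai_dc:dc>
  </metadata>
</record>"""


def dc_fields(record_xml):
    """Return a dict of the Dublin Core elements present in one record."""
    root = ET.fromstring(record_xml)
    found = {}
    for name in DC_ELEMENTS:
        tag = "{%s}%s" % (DC_NS, name)
        values = [el.text.strip() for el in root.iter(tag) if el.text]
        if values:
            found[name] = values
    return found


def missing_fields(record_xml):
    """List the Dublin Core elements the record does not provide."""
    present = dc_fields(record_xml)
    return [name for name in DC_ELEMENTS if name not in present]


if __name__ == "__main__":
    print("present:", sorted(dc_fields(SAMPLE_RECORD)))
    print("missing:", missing_fields(SAMPLE_RECORD))
```

A record like this one is perfectly valid, yet it says nothing about rights, format, or language, and its bare date does not say whether it refers to creation, submission, or publication. A service provider cannot build much on that.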

In a few years, we will entrust our life and limb to self-driving cars. Their programs have just milliseconds to compute critical decisions based on information that is imprecise, approximate, incomplete, and inconsistent: all maps are outdated by the time they are produced, GPS signals may disappear, radar and/or lidar signatures are ambiguous, and video or images provide obstructed views in constantly changing environments. When we can extract so much actionable information from such "dirty" information, it seems quaint to obsess about metadata.

Databases automatically record user interactions. Users fill out forms and effectively crowdsource metadata. Expert systems can extract, from any document in any format and in any language, author information, citations, keywords, DNA sequences, chemical formulas, mathematical equations, etc. Other expert systems have growing capabilities to analyze sound, image, and video. Technology is evaporating the pool of problems that require human intervention at the transaction level. The opportunities for human metadata experts to add value are disappearing fast.

The metadata approach is obsolete for an even more fundamental reason. Metadata are the digital extension of a catalog-centered paper-based information system. In this kind of system, today's experts organize today's information so tomorrow's users may solve tomorrow's problems efficiently. This worked well when technology changed slowly, when experts could predict who the future users would be, what kind of problems they would like to solve, and what kind of tools they would have at their disposal. These conditions no longer apply.

When digital storage is cheap, why implement expensive selection processes for an archive? When search technology does not care whether information is excruciatingly organized or piled in a heap, why spend countless hours organizing and curating content? Why agonize over potential future problems with unreadable file formats? Preserve all the information about current software and standards, and start developing the expert systems to unscramble any historical format. Think of any information-management task. How reasonable is the proposition that this task will require direct human intervention in two years? In five years? In ten years?

For content, more is more. We must acquire as much content as possible, and store it safely.

For content administration, less is more. Expert systems give us the freedom to do the bare minimum and to make a mess of it. While we must make content useful and enable as many services as possible, it is no longer feasible to accomplish that by designing systems for an anticipated future. Instead, we must create the conditions that attract developers of expert systems. This is remarkably simple: Make the full text and all data available with no strings attached.

Real Open Access.

Monday, June 30, 2014

Disruption Disrupted?

The professor who books his flights online, reserves lodging with Airbnb, and arranges airport transportation with Uber understands the disruption of the travel industry. He actively supports that disruption every time he attends a conference. When MOOCs threaten his job, when The Economist covers reinventing the university and titles it "Creative Destruction", that same professor may have second thoughts. With or without disruption, academia surely is in a period of immense change. There is the pressure to reduce costs and tuition, the looming growth of MOOCs, the turmoil in scholarly communication (subscription prices, open access, peer review, alternative metrics), the increased competition for funding, etc.

The term disruption was coined and popularized by Harvard Business School Professor Clayton Christensen, author of The Innovator's Dilemma. [The Innovator's Dilemma, Clayton Christensen, Harvard Business Review Press, 1997] Christensen created a compelling framework for understanding the process of innovation and disruption. Along the way, he earned many accolades in academia and business. In recent years, a cooling of the academic admiration became increasingly noticeable. A snide remark here. A dismissive tweet there. Then, The New Yorker launched a major attack on the theory of disruption. [The Disruption Machine, Jill Lepore, The New Yorker, June 23rd, 2014] In this article, Harvard historian Jill Lepore questions Christensen's research by attacking the underlying facts. Were Christensen's disruptive startups really startups? Did the established companies really lose the war or just one battle? At the very least, Lepore is implying that Christensen misled his readers.

As of this writing, Christensen has only responded in a brief interview. [Clayton Christensen Responds to New Yorker Takedown of 'Disruptive Innovation', Bloomberg Businessweek, June 20th, 2014] It is clear he is preparing a detailed written response.

Lepore's critique appears at the moment when disruption may be at academia's door, seventeen years after The Innovator's Dilemma was published and with much of the research almost twenty years old. Perhaps, the article is merely a symptom of academics growing nervous. Yet, it would be wrong to dismiss Lepore's (or anyone else's) criticism based on any perceived motivation. Facts can be and should be examined.

In 1997, I was a technology manager tasked with dragging a paper-based library into the digital era. When reading (and re-reading) the book, I did not question the facts. When Christensen stated that upstart X disrupted established company Y, I accepted it. I assume most readers did. The book was based on years of research, all published in some of the most prestigious peer-reviewed journals. It is reasonable to assume that the underlying facts were scrutinized by several independent experts. Truth be told, I did not care much that his claims were backed by years of research. Christensen gave power to the simple idea that sticking with established technology can carry an enormous opportunity cost.

Established technology has had years, perhaps decades, to mitigate its weaknesses. It has a constituency of users, service providers, sales channels, and providers of derivative services. This constituency is a force that defends the status quo in order to maintain established levels of quality, profit margins, and jobs. The innovators do not compete on a level playing field. Their product may improve upon the old in one or two aspects, but it has not yet had the opportunity to mitigate its weaknesses. When faced with such innovations, all organizations tend to stick with what they know for as long as possible.

Christensen showed the destructive power of this mind set. While waiting until the new is good enough or better, organizations lose control of the transition process. While pleasing their current customers, they lose future customers. By not being ahead of the curve, by ignoring innovation, by not restructuring their organizations ahead of time, leaders may put their organizations at risk. Christensen told compelling disruption stories in many different industries. This allowed readers to observe their own industry with greater detachment. It gave readers the confidence to push for early adoption of inevitable innovation.

I am not about to take sides in the Lepore-Christensen debate. Neither needs my help. As an observer interested in scholarly communication, I cannot help but note that Lepore, a distinguished scholar, launched her critique from a distinctly non-scholarly channel. The New Yorker may cater to the upper-crust of intellectuals (and wannabes), but it remains a magazine with journalistic editorial-review processes, quite distinct from scholarly peer-review processes.

Remarkably, the same happened only a few weeks ago, when the Financial Times attempted to take down Piketty's book. [Capital in the Twenty-First Century, Thomas Piketty, Belknap Press, 2014] [Piketty findings undercut by errors, Chris Giles, Financial Times, May 23rd, 2014] Piketty had a distinct advantage over Christensen. The Financial Times critique appeared a few weeks after his book came out. Moreover, he had made all of his data public, including all technical adjustments required to make data from different sources compatible. As a result, Piketty was able to respond quickly, and the controversy quickly dissipated. Christensen has the unenviable task of defending twenty-year-old research. For his sake, I hope he was better at archiving data than I was in the 1990s.

What does it say about the status of scholarly journals when scholars use magazines to launch scholarly critiques? Was Lepore's article not sufficiently substantive for a peer-reviewed journal? Are scholarly journals unable or unwilling to handle academic controversy involving one of their eminent leaders? Is the mainstream press just better at it? Would a business journal even allow a historian to critique business research in its pages? If this is the case, is peer review less about maintaining standards and more about protecting an academic tribe? Is the mainstream press just a vehicle for some scholars to bypass peer review and academic standards? What would it say about peer review if Lepore's arguments should prevail?

This detached observer pours a drink and enjoys the show.

PS (7/15/2014): Reposted with permission at The Impact Blog of The London School of Economics and Political Science.

Friday, June 20, 2014

The Billionaires, Part 1: Elon Musk

Elon Musk did not need a journal to publicize his Hyperloop paper. [Hyperloop Alpha] No journal can create the kind of buzz he creates on his own. He did not need the validation of peer review; he had the credibility of his research teams that already revolutionized travel on earth and to space. He did not need the prestige of a journal's brand; he is his own brand.

Any number of journals would have published this paper by this author. They might even have expedited their review process. Yet, journals could hardly have done better than the public-review process that actually took place. Within days, experts from different disciplines had posted several insightful critiques. By now, there are too many to list. A journal would have insisted that the paper include author(s) and affiliations, a publication date (Aug. 12th, 2013), a bibliography... but those are irrelevant details to someone on a mission to change the world.

Does the Hyperloop paper even qualify as a scholarly paper? Or, is it an engineering-based political pamphlet written to undermine California's high-speed rail project? As a data point for scholarly communication, the Hyperloop paper may be an extreme outlier, but it holds some valuable lessons for the scholarly-communication community.

The gate-keeping role of journals is permanently over.

Neither researchers nor journalists rely on scholarly editors to dismiss research on their behalf.

In many disciplines, day-to-day research relies more on the grey literature (preprints, technical reports, even blogs and mailing lists) than on journal articles. In other words, researchers commit considerable time to refereeing one another, but they largely ignore each other's gate keeping. When it matters, they prefer immediacy over gate keeping and their own gate keeping over someone else's.

The same is true for journalists. If the story is interesting, it does not matter whether it comes from an established journal or the press release of a venture capitalist. Many journalists balance their reports with comments from neutral or adversarial experts. This practice may satisfy a journalistic concept of objectivity, but giving questionable research "equal treatment" may elevate it to a level it does not deserve.

Public review can be fast and effective. 

The web-based debate on Hyperloop remained remarkably professional and civil. Topics that attract trolls and conspiracy theorists may benefit from a more controlled discussion environment, but the public forum worked well for Hyperloop. The many critiques provide skeptical, but largely constructive, feedback that bold new ideas need.

Speculative papers that spark the imagination do not live by the stodgy rules of peer review.

The Hyperloop paper would be a success if its only accomplishment were to inspire a handful of young engineers to research radically different modes of mass transportation. Unfortunately, publishing speculative, incomplete, sloppy, or bad research may cause real harm. The imagined link between vaccines and autism (published in a peer-reviewed journal and later retracted) serves as an unhappy reminder of that risk.

Not all good research belongs in the scholarly record.

This episode points to an interactive future of scholarly communication. After the current public discussion, Hyperloop may gain acceptance, and engineering journals may publish many papers about it. Alternatively, the idea may die a quiet death, perhaps documented by one or more historical review papers (or books).

The ideal research paper solves a significant problem with inspiration (creative bold ideas) and perspiration (proper methodology, reproducibility, accuracy). Before that ideal is in sight, researchers travel long winding roads with many detours and dead ends. Most papers are small incremental steps along that road. A select few represent milestone research.

The de-facto system to identify milestone research is journal prestige. No journal could survive if it advertised itself as a place for routine research. Instead, the number of journals has exploded, and each journal claims high prestige for the narrowest of specializations. All of these journals treat all submissions as if they are milestone research and apply the same costly and inefficient refereeing processes across the board.

The cost of scholarly communication is more than the sum of subscriptions and page charges. While refereeing can be a valuable experience, there is a point of diminishing returns. Moreover, overwhelmed scholars are more likely to conduct only cursory reviews after ignoring the requests for extended periods. The expectation that all research deserves to be refereed has reduced the quality of the refereeing process, introduced inordinate delays, increased the number of journals, and indirectly increased the pressure to publish.

Papers should earn the privilege to be refereed. By channeling informal scholarly communication to social-network platforms, research can gain some scholarly weight based on community feedback and usage-based metrics. Such social networks, perhaps run by scholarly societies, would provide a forum for lively debate, and they could act as submission and screening systems for refereed journals. By restricting refereed journals to milestone research supported and validated by a significant fraction of the profession, we would need far fewer, less specialized journals.

A two-tier system would provide the immediacy and openness researchers crave, while reserving the highest level of scrutiny to research that has already shown significant promise.

Wednesday, May 21, 2014

Sustainable Long-Term Digital Archives

How do we build long-term digital archives that are economically sustainable and technologically scalable? We could start by building five essential components: selection, submission, preservation, retrieval, and decoding.

Selection may be the least amenable to automation and the least scalable, because the decision whether or not to archive something is a tentative judgment call. Yet, it is a judgment driven by economic factors. When archiving is expensive, content must be carefully vetted. When archiving is cheap, the time and effort spent on selection may cost more than archiving rejected content. The falling price of digital storage creates an expectation of cheap archives, but storage is just one component of preservation, which itself is only one component of archiving. To increase the scalability of selection, we must drive down the cost of all other archive services.

Digital preservation is the best understood service. Archive content must be transferred periodically from old to new storage media. It must be mirrored at other locations around the world to safeguard against natural and man-made disasters. Any data center performs processes like these every day.

The submission service enters bitstreams into the archive and enables future retrieval of identical copies. The decoding service extracts information from retrieved bitstreams, which may have been produced by lost or forgotten software.

We could try to eliminate the decoding service by regularly re-encoding bitstreams for current technology. While convenient for users, this approach has a weakness. If a refresh cycle should introduce an error, subsequent cycles may propagate and amplify the error, making recovery difficult. Fortunately, it is now feasible to preserve old technology using virtualization, which lets us emulate almost any system on almost any hardware. Anyone worried about the long term should consider the Chrome emulator of Amiga 500 (1987) or the Android emulator of the HP 45 calculator (1973). The hobbyists who developed these emulators are forerunners of a potential new profession. A comprehensive archive of virtual old systems is an essential enabling technology for all other digital archives.

The submission and retrieval services are interdependent. To enable retrieval, the submission service analyzes bitstreams and builds an index for the archive. When bitstreams contain descriptive metadata constructed specifically for this purpose, the process of submission is straightforward. However, archives must be able to accept any bitstream, regardless of the presence of such metadata. For bitstreams that contain a substantial amount of text, full-text indexing is appropriate. Current technology still struggles with non-text bitstreams, like images, graphics, video, or pure data.
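As a toy illustration of how submission and retrieval depend on each other, the sketch below accepts any bitstream, stores it under its SHA-256 checksum so that retrieval can confirm an identical copy, and builds a full-text index only when the bitstream decodes as text. The class name TinyArchive and its in-memory storage are assumptions made for the example; a real archive would add persistent storage, mirroring, and far better text analysis.

```python
import hashlib
import re


class TinyArchive:
    """In-memory toy archive: checksum-keyed storage plus a word index."""

    def __init__(self):
        self._store = {}   # checksum -> original bytes
        self._index = {}   # lower-case word -> set of checksums

    def submit(self, bitstream: bytes) -> str:
        """Store a bitstream and return the key needed to retrieve it."""
        key = hashlib.sha256(bitstream).hexdigest()
        self._store[key] = bitstream
        try:
            text = bitstream.decode("utf-8")
        except UnicodeDecodeError:
            # Non-text bitstream: accepted and stored, but not indexed.
            return key
        for word in set(re.findall(r"[a-z0-9]+", text.lower())):
            self._index.setdefault(word, set()).add(key)
        return key

    def retrieve(self, key: str) -> bytes:
        """Return the stored bytes after verifying they are unchanged."""
        data = self._store[key]
        if hashlib.sha256(data).hexdigest() != key:
            raise ValueError("bitstream failed its integrity check")
        return data

    def search(self, word: str):
        """Return the keys of all indexed bitstreams containing the word."""
        return sorted(self._index.get(word.lower(), set()))


if __name__ == "__main__":
    archive = TinyArchive()
    key = archive.submit(b"Sustainable long-term digital archives")
    print(archive.search("archives") == [key])
    print(archive.retrieve(key))
```

The non-text branch is where the sketch gives up, which mirrors the state of the art described above: the bitstream is preserved, but nothing in the index leads back to it.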

To simplify and automate the submission service, we need the participation of software developers. Most bitstreams are produced by mass-market software such as word processors, database or spreadsheet software, video editors, or image processors. Even data produced by esoteric experiments are eventually processed by applications that still serve hundreds of thousands of specialists. Within one discipline, the number of applications rarely exceeds a few hundred. To appeal to this relatively small number of developers, who are primarily interested in solving their customers' problems, we need a better argument than “making archiving easy.”

Too few application developers are aware of their potential role in research data management. Consider, for example, an application that converts data into graphs. Although most of the graphs are discarded after a quick glance, each is one small step in a research project. With little effort, that graphing software could provide transparent support for research data management. It could reformat raw input data into a re-usable and archivable format. It could give all files it produces unique identifiers and time stamps. It could store these files in a personal repository. It could log activity in a digital lab notebook. When a file is deleted, the personal repository could generate an audit trail that conforms to discipline-specific customs. When research is published, researchers could move packages of published and supporting material from personal to institutional repositories and/or to long-term archives.
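The bookkeeping described here need not be elaborate. As a minimal sketch, assuming a personal repository is nothing more than a directory and a lab notebook is an append-only log file, an application could wrap every file it writes as follows. The function names, the notebook.jsonl file, and the entry fields are all hypothetical choices for this example, not an existing standard.

```python
import json
import uuid
from datetime import datetime, timezone
from pathlib import Path

NOTEBOOK_NAME = "notebook.jsonl"  # hypothetical lab-notebook log file


def save_with_provenance(repo_dir, payload: bytes, source_name: str, action: str):
    """Store a file in the personal repository and log the activity.

    The file gets a unique identifier and a UTC time stamp; the entry
    appended to the notebook records where the data came from.
    """
    repo = Path(repo_dir)
    repo.mkdir(parents=True, exist_ok=True)

    file_id = str(uuid.uuid4())
    stamp = datetime.now(timezone.utc).isoformat()
    (repo / file_id).write_bytes(payload)

    entry = {"id": file_id, "time": stamp, "source": source_name, "action": action}
    with open(repo / NOTEBOOK_NAME, "a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")
    return entry


def read_notebook(repo_dir):
    """Return all notebook entries in the order they were written."""
    path = Path(repo_dir) / NOTEBOOK_NAME
    if not path.exists():
        return []
    with open(path, encoding="utf-8") as log:
        return [json.loads(line) for line in log if line.strip()]
```

A graphing application would call save_with_provenance once for the reformatted input data and once for each graph it produces. The researcher never sees any of it until the day someone asks how a published figure was made.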

Ad-hoc data management harms the longer-term interests of individual researchers and the scholarly community. Intermediate results may be discarded before it is realized they were, after all, important. The scholarly record may not contain sufficient data for reproducibility. Research-misconduct investigations may be more complicated and less reliable.

For archivists, the paper era is far from over. During the long transition, archivists may prepare for the digital future in incremental steps. Provide personal repositories. Work with a few application developers to extend key applications to support data management. After proof of concept, gradually add more applications.

Digital archives will succeed only if they are scalable and sustainable. To accomplish this, digital archivists must simplify and automate their services by getting involved well before information is produced. Within each discipline, archives must work with researchers, application providers, scholarly societies, universities, and funding agencies to develop appropriate policies for data management and the technology infrastructure to support those policies.

Monday, April 14, 2014

The Bleeding Heart of Computer Science

Who is to blame for the Heartbleed bug? Perhaps it does not matter. Just fix it, and move on. Until the next bug, and the next, and the next.

The Heartbleed bug is different from other Internet scares. It is a vulnerability at the core of the Internet infrastructure, a layer that provides the foundation for secure communication, and it went undetected for years. It should be a wake-up call. Instead, the problem will be patched. Some government and industry flacks will declare the crisis over. We will move on and forget about it.

There is no easy solution. No shortcut. We must redevelop our information infrastructure from the ground up. Everything. Funding and implementing such an ambitious plan may become feasible only after a major disaster strikes that leaves no other alternative. But even if a complete redesign were to become a debatable option, it is not at all clear that we are up to the task.

The Internet is a concurrent and asynchronous system. A concurrent system consists of many independent components like computers and network switches. An asynchronous system operates without a central clock. In synchronous systems, like single processors, a clock provides the heartbeat that tells every component when state changes occur. In asynchronous systems, components are interrupt-driven. They react to outside events, messages, and signals as they happen. The thing to know about concurrent asynchronous systems is this: It is impossible to debug them. It is impossible to isolate components from one another for testing purposes. Each additional round of testing buys a smaller reduction in the probability of bugs, and the cost quickly becomes prohibitive. Unfortunately, when a system consists of billions of components, even extremely low-probability events are a daily occurrence. These unavoidable fundamental problems are exacerbated by continual system changes in hardware and software and by bad actors seeking to introduce and/or exploit vulnerabilities.
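The arithmetic behind the claim about low-probability events is simple. The sketch below uses purely illustrative numbers (one billion components, each with a one-in-a-billion chance per day of hitting some rare failure) and assumes that failures are independent. Neither the numbers nor the independence assumption comes from measurements of the Internet.

```python
import math


def expected_events(components, p_per_day):
    """Expected number of rare events per day across all components."""
    return components * p_per_day


def prob_at_least_one(components, p_per_day):
    """Probability that at least one component fails on a given day.

    Computes 1 - (1 - p)**n in a numerically stable way, assuming
    independent components.
    """
    return -math.expm1(components * math.log1p(-p_per_day))


# Illustrative assumptions only, not measured values.
n = 1_000_000_000
p = 1e-9
print(expected_events(n, p))
print(prob_at_least_one(n, p))
```

Under these assumptions the expected count is about one event per day, and the chance of at least one event on any given day is roughly 63 percent.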

When debugging is not feasible, mathematical rigor is required. Current software-development environments are all about pragmatism, not rigor. Programming infrastructure is built to make programming easy, not rigorous. Most programmers develop their programs in a virtual environment and have no idea how their programs really function. Today's computer-science success stories are high-school geniuses who develop multimillion-dollar apps and college dropouts who start multibillion-dollar businesses. These are built on fast prototypes and viral marketing, not mathematical rigor. Who in their right mind would study computer science under people who made a career writing research proposals that never led to anything worth leaving a paltry academic job for?

Rigor in programming is the domain of Edsger W. Dijkstra, the most (in)famous, admired, and ignored computer-science eccentric. In 1996, he laid out his vision of Very Large Scale Application of Logic as the basis for the next fifty years of computer science. Although the examples are dated, his criticism of the software industry still rings true:
Firstly, simplicity and elegance are unpopular because they require hard work and discipline to achieve and education to be appreciated. Secondly we observe massive investments in efforts that are heading in the opposite direction. I am thinking about so-called design aids such as circuit simulators, protocol verifiers, algorithm animators, graphical aids for the hardware designers, and elaborate systems for version control: by their suggestion of power, they rather invite than discourage complexity. You cannot expect the hordes of people that have devoted a major part of their professional lives to such efforts to react kindly to the suggestion that most of these efforts have been misguided, and we can hardly expect a more sympathetic ear from the granting agencies that have funded these efforts: too many people have been involved and we know from past experience that what has been sufficiently expensive is automatically declared to have been a great success. Thirdly, the vision that automatic computing should not be such a mess is obscured, over and over again, by the advent of a monstrum that is subsequently forced upon the computing community as a de facto standard (COBOL, FORTRAN, ADA, C++, software for desktop publishing, you name it).
[The next fifty years, Edsger W. Dijkstra, circulated privately, 1996,
Document 1243a of the E. W. Dijkstra Archive,
or, for fun, a version formatted in the Dijkstra handwriting font]

The last twenty years were not kind to Dijkstra's vision. The hordes turned into horsemen of the apocalypse that trampled, gored, and burned any vision of rigor in software. For all of us, system crashes, application malfunctions, and software updates are daily occurrences. They are built into our expectations.

In today's computer science, the uncompromising radicals who prioritize rigor do not stand a chance. Today's computer science is the domain of genial consensus builders, merchants of mediocrity who promise everything to everyone. Computer science has become a social construct that evolves according to political rules.

A bottom-up redesign of our information infrastructure, if it ever becomes debatable, would be defeated before it even began. Those who could accomplish a meaningful redesign would never be given the necessary authority and freedom. Instead, the process would be taken over by political and business forces, resulting in an effective status quo.

In 1996, Dijkstra believed this:
In the next fifty years, Mathematics will emerge as The Art and Science of Effective Formal Reasoning, and we shall derive our intellectual excitement from learning How to Let the Symbols Do the Work.
There is no doubt that he would still cling to this goal, but even Dijkstra may have started to doubt his fifty-year timeline.