The Cathedral and the Bazaar by Eric S. Raymond

I analyze a successful free software project, fetchmail, that was run as a deliberate test of some surprising ideas about software engineering suggested by the history of Linux. I discuss these theories in terms of two fundamentally opposed styles of development: the cathedral model of most commercial software makers versus the bazaar model of the Linux world. I show that these models derive from conflicting assumptions about the nature of the software debugging task. I then argue from the Linux experience for the proposition that "given enough eyeballs, all bugs are shallow". Finally, I suggest some fruitful analogies with other self-correcting systems of selfish agents.

The Cathedral and the Bazaar

Linux is subversive. Who would have thought, even five years ago, that a world-class operating system could emerge as if by magic from the spare-time hacking of several thousand developers scattered all over the planet, connected only by the tenuous threads of the Internet?

Certainly not I. By the time Linux crossed my path in early 1993, I had been involved in Unix and free software development for about ten years. I was one of the first GNU contributors in the mid-1980s, and I had released a good deal of free software onto the net, developing or co-developing several programs (nethack, the Emacs VC and GUD modes, xlife, and others) that are still widely used. I thought I knew how things should be done.

Linux overturned much of what I thought I knew. I had been preaching the Unix gospel of small tools, rapid prototyping and evolutionary programming for years. But I also believed there was a certain critical complexity above which a more centralized, planned approach was required. I believed that the largest software (operating systems and really large tools such as Emacs) needed to be built like cathedrals, carefully crafted by individual wizards or small bands of mages working in splendid isolation, with no beta released before its time.

Linus Torvalds's style of development ("release early and often, delegate everything you can, be open to the point of promiscuity") came as a surprise. This was no quiet, reverent cathedral-building. On the contrary, the Linux community seemed to resemble a great babbling bazaar of differing agendas and approaches (aptly symbolized by the Linux archive sites, which would accept submissions from anyone), out of which a coherent and stable system could seemingly emerge only by a succession of miracles.

The fact that this bazaar style seemed to work, and work well, came as a distinct shock. As I learned my way around, I worked hard not just at individual projects, but also at trying to understand why the Linux world not only failed to fly apart in confusion but seemed to go from strength to strength at a speed barely imaginable to cathedral-builders.

By mid-1996 I thought I was beginning to understand. Chance handed me a perfect way to test my theory, in the form of a free software project that I could consciously try to run in the bazaar style. So I did, and it was a significant success.

In the remainder of this article I will tell the story of that project, and use it to propose some aphorisms about effective free software development. Not all of these are things I first learned in the Linux world, but we will see how the Linux world gives them particular point. If I am right, they will help you understand exactly what makes the Linux community such a source of good software, and help you become more productive yourself.

The mail had to get through

Since 1993 I have been running the technical side of a small free-access ISP called Chester County InterLink (CCIL) in West Chester, Pennsylvania. (I was one of the founders of CCIL and wrote its original multiuser BBS software, which can be reached by telnet and currently supports more than three thousand users on 19 lines.) The job allowed me 24-hour-a-day access to the net through CCIL's 56K line; in fact, it practically demanded it!

By then I had grown accustomed to email. For various reasons it was hard to get SLIP working between my home machine (snark) and CCIL. When I finally succeeded, I found it particularly annoying to have to telnet over to locke every so often to check my mail. What I wanted was for my mail to be forwarded to snark, so that biff(1) would notify me when it arrived.

A simple sendmail redirect would not work, because snark is not always online and does not have a static IP address. What I needed was a program that would reach out over my SLIP connection and pull the mail across to my machine. I knew that such programs existed, and that most of them used a simple protocol called POP (Post Office Protocol). And sure enough, a POP3 server was already included with locke's BSD/OS operating system.

I needed a POP3 client, so I searched the net and found one. Actually, I found three or four. I used pop-perl for a while, but it lacked what seemed an obvious feature: the ability to rewrite the addresses on fetched mail so that replies would work properly.

The problem was this: suppose someone named monty on locke sent me mail. If I fetched it to snark and then tried to reply, my mailer would cheerfully try to send the reply to a nonexistent monty on snark. Hand-editing reply addresses to tack on the missing '@' part soon became a serious nuisance.
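To make the problem concrete, here is a minimal Python sketch of the kind of header rewriting described above: a reply address with no host part gets the mail server's name appended. The function name `qualify_address` and the host name `locke.example.org` are assumptions made for illustration; this is not fetchmail's actual code.

```python
from email.utils import formataddr, parseaddr


def qualify_address(header_value: str, mailserver: str) -> str:
    """Append '@mailserver' to an address that has no host part.

    Hypothetical helper for illustration only.
    """
    display_name, address = parseaddr(header_value)
    if address and "@" not in address:
        # A bare local name only makes sense on the server it came from.
        address = f"{address}@{mailserver}"
    return formataddr((display_name, address))


# A message fetched from the server carries a bare local sender name.
print(qualify_address("monty", "locke.example.org"))
# Addresses that already name a host are left alone.
print(qualify_address("esr@snark.example.org", "locke.example.org"))
```

With this rewrite applied on fetch, a reply composed on the home machine goes back to the sender's real mailbox rather than to a nonexistent local user.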

It was clear that the computer ought to be doing this for me. (In fact, according to RFC1123, Section 5.2.18, sendmail should be doing it.) Yet none of the existing POP clients actually did! This brings us to the first lesson:

Every good work of software starts from a personal need of the programmer.

This may sound obvious: the old proverb says that "necessity is the mother of invention". Yet too many programmers spend their days working, in exchange for a salary, on programs they neither need nor want. Not so in the Linux world, which may explain why the average quality of software produced in that community is so high.

So, did I immediately plunge into a frenzy of writing a brand-new POP3 client from scratch to compete with the existing ones? Not on your life! I looked carefully at the POP tools I had at hand, asking myself "which one is closest to what I need?" because

Good programmers know what to write. Great ones know what to rewrite (and reuse).

Although I do not claim to be a great programmer, I have always tried to imitate one. An important trait of the great ones is constructive laziness. They know that you get top marks not for effort but for results, and that it is almost always easier to start from a good partial solution than from nothing at all.

Linus, for example, did not try to write Linux from scratch. Instead, he started by reusing code and ideas from Minix, a small Unix-like operating system for 386 machines. Eventually all the Minix code was thrown away or completely rewritten, but while it was there it served as an important launchpad for the infant project that later became Linux.

In the same spirit, I went looking for a reasonably well-written POP tool to use as a base for my development.

The Unix world's tradition of sharing source code has always been friendly to code reuse (this is why the GNU project chose Unix as its base OS, despite serious reservations about it). The Linux world has taken this tradition nearly to its technological limit; it has terabytes of source code generally available. So looking for someone else's almost-good-enough solution is more likely to pay off in the Linux world than anywhere else.

And it did for me. Counting the ones I had found earlier, my second search turned up a total of nine candidates: fetchpop, Poptart, get-mail, gwpop, pimp, pop-perl, POPC, popmail and UPOP. The one I settled on first was fetchpop, by Seung-Hong Oh. I added my header-rewriting code to it, along with several other improvements, and the author incorporated them into version 1.9.

A few weeks later, however, I stumbled across the source code for popclient, written by Carl Harris, and found I had a problem. Although fetchpop had some good original ideas (such as its daemon mode), it could only handle POP3 and was rather amateurishly coded (Seung-Hong was a bright but inexperienced programmer, and both traits showed). Carl's code was better, quite professional and robust, but his program lacked several important fetchpop features that were tricky to implement (including the ones I had added myself).

Stay or switch? Switching meant throwing away the code I had already written in exchange for a better development base.

A practical reason to switch was the need for multi-protocol support. POP3 is the most commonly used mail server protocol, but not the only one. Fetchpop and the other candidates did not handle POP2, RPOP or APOP, and I already had a vague idea of adding IMAP (Internet Message Access Protocol, the most recent and most powerful mail protocol) just for fun.

But I also had a more theoretical reason to think that switching might be a good idea, something I learned long before Linux:

3.    "Plan to throw one away; you will, anyhow." (Fred Brooks, The Mythical Man-Month, Chapter 11)

Or, to put it another way: you often do not really understand a problem until after you have implemented a first solution. The second time, perhaps you know enough to do it right. So if you want to get it right, be ready to start over at least once.

Well, I told myself, my changes to fetchpop had been a first attempt. So I switched.

After sending my first set of improvements to Carl Harris on 25 June 1996, I found out that he had lost interest in popclient some time before. The program was a little neglected and dusty, with some minor bugs hanging about. I had many corrections to make, and we quickly agreed that the logical thing was for me to take over the project.

Without my quite noticing, the project had escalated. I was no longer just contemplating a few minor changes to an existing POP client; I had taken on responsibility for an entire one, and the ideas bubbling in my head would probably lead to major changes.

In a software culture that encourages sharing source code, this is a natural way for a project to evolve. I was acting out the following principle:

If you have the right attitude, interesting problems will find you.

But Carl Harris's attitude was even more important. He understood that

When you lose interest in a program, your last duty to it is to hand it off to a competent successor.

Without ever having to discuss it, Carl and I knew that our common goal was to have the best solution out there. The only question for either of us was whether I could show that the project would be in good hands. Once I did, he acted with grace and dispatch. I hope I will do as well when my turn comes.

The importance of users

That was how I inherited popclient. Just as importantly, I inherited its user base. Users are wonderful things to have, and not just because they demonstrate that you are serving a need and have done something right. Properly cultivated, they can become excellent collaborators.

Another strength of the Unix tradition, one that Linux again pushes to the limit, is that many users are hackers too. Because source code is available, they can be effective hackers. This can be tremendously useful for shortening debugging time. Given a bit of encouragement, your users will diagnose problems, suggest fixes, and help improve the code far more quickly than you could unaided.

Treating your users as collaborators is the most direct route to rapid code improvement and effective debugging.

The power of this effect is easy to underestimate. In fact, nearly all of us drastically underestimated how well it would scale up with the number of users and against system complexity, until Linus showed us otherwise.

In fact, I believe that Linus's genius lies not in the construction of the Linux kernel itself, but in his invention of the Linux development model. When I once expressed this opinion in his presence, he smiled and quietly repeated something he has often said: "I'm basically a very lazy person who likes to get credit for things other people actually do." Lazy like a fox. Or, as Robert Heinlein might have said, too lazy to fail.

In retrospect, one precedent for the methods and success of Linux can be seen in the development of the GNU Emacs Lisp library and the Lisp code archives. In contrast to the cathedral-building style of the Emacs core written in C, and of most other FSF tools, the evolution of the Lisp code was fluid and largely driven by the users themselves. Ideas and prototype modes were often rewritten three or four times before reaching a stable final form. And loosely coupled collaborations made possible by the Internet, in the Linux style, were frequent.

Indeed, my own most successful program before fetchmail was probably the Emacs VC mode, a Linux-like collaboration carried out by email with three other people, only one of whom (Richard Stallman) I have met to this day. VC was a front end for SCCS, RCS and later CVS that offered "one-touch" version control operations from within Emacs. It evolved from a small and rather crude sccs.el that somebody else had written. The development of VC succeeded because, unlike Emacs itself, Emacs Lisp code could go through the cycle of release, testing and improvement very quickly.

(One side effect of the FSF's policy of legally binding code to the GPL was that it became harder for the FSF to use the bazaar mode, because they believed they had to obtain a copyright assignment for every individual contribution of more than twenty lines in order to immunize GPL-protected code against challenge under copyright law. Users of the BSD and MIT X Consortium licenses do not have this problem, because they are not trying to reserve rights that anyone might have a reason to challenge.)

Release early and often

Early and frequent releases are a critical part of the Linux development model. Most programmers, myself included, used to believe this was bad policy for anything larger than a trivial project, because early versions are almost by definition riddled with bugs, and nobody wants to wear out the patience of their users.

This belief reinforced programmers' preference for the cathedral style of development. If the overriding goal was for users to see as few bugs as possible, then you would release only once every six months (or even less often) and work like a dog on debugging between releases. The Emacs core written in C was developed this way. The Lisp library was not, because there were Lisp archives outside the FSF's control where you could find new and development versions of the code independently of the Emacs release cycle.

The most important of these was the Ohio State University elisp archive, which anticipated the spirit and many of the features of today's big Linux archives. But few of us really thought hard about what we were doing, or about what the very existence of that archive suggested about problems in the FSF's cathedral-style development model. Around 1992 I made a serious attempt to get much of the Ohio code formally merged into the official Emacs Lisp library. I ran into serious political trouble and did not succeed.

But a year later, as Linux grew by leaps and bounds, it became clear that something different and much healthier was going on there. Linus's open development policy was the very opposite of cathedral-building. The sunsite and tsx-11 archives were bustling with activity, and many Linux distributions were in circulation. And all of this was driven by a release frequency that was without precedent.

Linus was treating his users as collaborators in the most effective manner possible:

Release early. Release often. And listen to your customers.

Linus's innovation was not so much in doing this (something like it had been a tradition in the Unix world for a long time), but in scaling it up to a level of intensity that matched the complexity of what he was developing. In those days it was not unusual for him to release a new version of the kernel more than once a day! And because he cultivated his base of co-developers and sought help over the Internet more intensively than anyone else, it worked.

But how did it work? Was it something I could emulate, or did it depend on the unique genius of Linus?

I did not think it depended on that. Granted, Linus is a devilishly clever hacker (how many of us could design a high-quality kernel?). But Linux itself does not represent a startling conceptual leap forward. Linus is not (at least not so far) an innovative genius of design in the way that Richard Stallman or James Gosling are. Rather, Linus seems to me a genius of engineering; he has a sixth sense for avoiding dead ends in development and debugging, and a real knack for finding the minimum-effort path from point A to point B. Indeed, the whole design of Linux breathes this quality and reflects Linus's conservative, simplifying approach to design.

So, if frequent releases and seeking help across the Internet were not accidents but integral parts of Linus's insight into the minimum-effort path, what was he maximizing? What was he squeezing out of the machinery?

Put that way, the question answers itself. Linus was keeping his hacker users constantly stimulated and rewarded: stimulated by the prospect of taking part in the action and satisfying their egos, rewarded by the sight of constant, almost daily improvement in their work.

Linus was clearly betting on maximizing the number of person-hours spent on debugging and development, even at the risk of instability in the code and of exhausting the user base if a serious bug proved intractable. Linus was behaving as though he believed something like this:

Given a large enough base of beta-testers and co-developers, almost every problem will be characterized quickly, and its solution will be obvious to someone.

Or, put less formally, "given enough eyeballs, all bugs are shallow". I have named this Linus's Law.

My original formulation said that every problem would be transparent to someone. Linus pointed out that the person who understands and fixes a problem is not necessarily, or even usually, the person who first finds it. "Somebody finds the problem," he said, "and somebody else understands and solves it." But the key point is that both things tend to happen very quickly.

Here, I think, lies the essential difference between the bazaar and cathedral styles. In the cathedral view of programming, bugs and development problems are tricky, insidious, deep phenomena. It takes months of exhaustive scrutiny by a dedicated few to gain confidence that they have all been eliminated. Hence the long intervals between releases, and the inevitable disappointment when long-awaited releases turn out not to be perfect.

In the bazaar view, on the other hand, you assume that bugs are relatively shallow phenomena, or at least that they become shallow when exposed to thousands of eager co-developers pounding on every new release. Accordingly, you release often in order to get more corrections, and as a beneficial side effect you have less to lose if an occasional botch gets out the door.

And that's it. That's enough. If Linus's Law were false, then any system as complex as the Linux kernel, being hacked on by so many hands, should at some point have collapsed under the weight of unforeseen interactions and undiscovered "deep" bugs. If it is true, on the other hand, it is sufficient to explain the relative absence of bugs in the Linux code.

After all, this should not seem so surprising. Years ago sociologists found that the averaged opinion of a large number of equally expert (or equally ignorant) observers is a more reliable predictor than the opinion of a single observer chosen at random. This is known as the Delphi effect. It appears that what Linus has shown is that this holds even for debugging an operating system: the Delphi effect can tame development complexity even at the level of complexity of an operating system kernel.

I am indebted to Jeff Dutky, who suggested that Linus's Law can be restated as "debugging is parallelizable". Jeff observes that although debugging requires the participants to communicate with a coordinating developer, it does not require any significant coordination among the debuggers themselves. It therefore does not fall victim to the quadratic complexity and management costs that make adding developers problematic.
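The difference between the two growth rates can be made concrete with a little arithmetic. The sketch below counts communication channels under two simplifying assumptions that are mine rather than Jeff's: in a fully coordinated team every pair of people needs a channel, while in parallel debugging each debugger needs only one channel, to the coordinator. The function names are illustrative.

```python
def pairwise_channels(n: int) -> int:
    """Channels needed if each of n people must coordinate with every other."""
    return n * (n - 1) // 2


def hub_channels(n: int) -> int:
    """Channels needed if each of n debuggers talks only to one coordinator."""
    return n


for n in (5, 50, 500):
    print(n, pairwise_channels(n), hub_channels(n))
```

For 50 participants this gives 1225 pairwise channels against 50 hub channels; for 500 it gives 124750 against 500. The first count grows with the square of the group size, the second only linearly.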

In practice, the theoretical loss of efficiency due to duplication of work by debuggers is almost never an issue in the Linux world. One effect of a "release early and often" policy is to minimize such duplication by spreading fixes quickly.

Brooks made an observation related to Jeff's: "The total cost of maintaining a widely used program is typically 40 percent or more of the cost of developing it. Surprisingly, this cost is strongly affected by the number of users. More users find more bugs." (Emphasis added.)

More users find more bugs because they bring more different ways of exercising the program. This effect is amplified when the users are co-developers. Each one approaches the task of characterizing bugs with a slightly different conceptual background and set of analytical tools, from a different angle. The Delphi effect seems to work precisely because of this variation. In the specific context of debugging, the variation also tends to reduce duplication of effort.

So adding more beta-testers may not reduce the complexity of the current "deepest" bug from the developer's point of view, but it increases the probability that someone's toolkit will match the problem in such a way that the bug is obvious to that person.
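This argument can be sketched as a simple probability calculation. The model below is an assumption introduced for illustration, not something measured: each tester independently has the same small chance p that their particular toolkit makes a given bug obvious. The chance that at least one of n testers sees it is then 1 - (1 - p)^n.

```python
def chance_someone_sees_it(p: float, testers: int) -> float:
    """Probability that at least one tester finds the bug obvious.

    Assumes, purely for illustration, that testers are independent and
    that each has the same per-person probability p.
    """
    return 1.0 - (1.0 - p) ** testers


for testers in (10, 100, 1000):
    print(testers, round(chance_someone_sees_it(0.01, testers), 4))
```

Under these assumptions, a bug that any single tester has only a 1% chance of spotting is seen by someone roughly 10% of the time with 10 testers, about 63% of the time with 100, and almost always with 1000. The bug itself has not become any simpler; only the odds of a match have changed.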

Linus also hedges his bets. In case there really are serious bugs, Linux kernel versions are numbered in such a way that potential users can choose either to run the latest version designated "stable" or to ride the cutting edge and risk bugs in order to get new features. This tactic has not yet been formally imitated by most Linux hackers, but perhaps it should be. The fact that both options are available makes each of them more attractive.

When is a rose not a rose?

Having studied the way Linus acted and formed a theory about why it was successful, I made a conscious decision to test that theory on my new project (which, I must admit, is much less complex and ambitious).

The first thing I did was reorganize and simplify popclient. Carl Harris's work was very good, but it exhibited a kind of unnecessary complexity typical of many C programmers. He treated the code as central and the data structures as support for the code. As a result, the code was very elegant, but the design of the data structures was ad hoc and rather ugly (at least by the exacting standards of this old Lisp hacker).

However, I had another reason for rewriting besides improving the code and the design of the data structures: I wanted the project to evolve into something I understood completely. It is no fun to be responsible for fixing bugs in a program you do not understand.

For the first month or so, then, I was simply following out the implications of Carl's basic design. The first serious change I made was to add IMAP support. I did this by reorganizing the protocol handlers into a generic driver with three method tables (for POP2, POP3 and IMAP). This and some earlier changes illustrate a general principle that programmers do well to keep in mind, especially when working in languages such as C that do not naturally handle data dynamically:

Smart data structures and dumb code work much better than the other way around.

Again, Fred Brooks, Chapter 11: "Show me your code and conceal your data structures, and I shall continue to be mystified. Show me your data structures, and I won't usually need your code; it'll be obvious."

Actually, he spoke of "flowcharts" and "tables". But allowing for thirty years of terminological and cultural change, it is practically the same idea.
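A minimal sketch of the generic-driver idea follows, in Python rather than C. Everything in it is an illustrative assumption: the names `ProtocolTable` and `session_commands` are invented, and the command strings are heavily simplified (a real client reads server replies, and a real IMAP session involves more steps). It is not fetchmail's code. The point is only that the per-protocol knowledge lives in data, while the driver is one plain loop that never asks which protocol it is speaking.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProtocolTable:
    """Per-protocol knowledge expressed as data (simplified command templates)."""
    name: str
    port: int
    login: tuple[str, ...]
    fetch: tuple[str, ...]
    delete: tuple[str, ...]
    quit: str


POP2 = ProtocolTable("POP2", 109, ("HELO {user} {password}",),
                     ("READ {n}", "RETR"), ("ACKD",), "QUIT")
POP3 = ProtocolTable("POP3", 110, ("USER {user}", "PASS {password}"),
                     ("RETR {n}",), ("DELE {n}",), "QUIT")
IMAP = ProtocolTable("IMAP", 143, ("{tag} LOGIN {user} {password}", "{tag} SELECT INBOX"),
                     ("{tag} FETCH {n} RFC822",),
                     ("{tag} STORE {n} +FLAGS (\\Deleted)",), "{tag} LOGOUT")


def session_commands(proto, user, password, message_numbers):
    """Generic driver: the same dumb loop serves every protocol table."""
    commands = []

    def emit(template, **fields):
        # Each command gets a fresh tag; templates without {tag} ignore it.
        commands.append(template.format(tag=f"A{len(commands) + 1}", **fields))

    for template in proto.login:
        emit(template, user=user, password=password)
    for n in message_numbers:
        for template in proto.fetch + proto.delete:
            emit(template, n=n)
    emit(proto.quit)
    return commands


for table in (POP2, POP3, IMAP):
    print(table.name, session_commands(table, "esr", "secret", [1]))
```

In this arrangement, supporting another protocol means adding one more table; the driver does not change.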

At this point (early September 1996, about six weeks after starting) I began to think that a name change might be in order. After all, it was no longer just a POP client. But I hesitated, because there was as yet nothing genuinely new and my own in the design. My version of popclient had yet to develop an identity.

That changed radically when fetchmail learned how to forward fetched mail to the SMTP port. I will return to this in a moment. First I want to say this: I said earlier that I had decided to use this project to test my theory about what Linus Torvalds had done right. How (you may well ask) did I do that? In these ways:

I released early and often (almost never less often than every ten days; during periods of intense development, once a day).

I grew my beta list by adding to it everyone who contacted me about fetchmail.

I sent announcements to the beta list whenever I released a new version, encouraging people to participate.

And I listened to my beta-testers, consulting them about design decisions and acknowledging them whenever they sent in feedback and improvements.

The reward for these simple measures was immediate. From the beginning of the project I got bug reports of a quality most developers would envy, often with good fixes attached. I got constructive criticism, fan mail and intelligent suggestions. Which leads to the next lesson:

If you treat your beta-testers as if they were your most valuable resource, they will respond by becoming your most valuable resource.

One interesting measure of fetchmail's success was the size of the project's beta list, fetchmail-friends. When I wrote this, it had 249 members and was adding two or three a week.

As I revise this in late May 1997, the list is beginning to lose members for an extremely interesting reason. Several people have asked me to unsubscribe them because fetchmail is working so well for them that they no longer need to see the list traffic! Perhaps this is part of the normal life cycle of a mature bazaar-style project.

Popclient becomes fetchmail

The crucial moment for the project came when Harry Hochheiser sent me his source code for forwarding fetched mail to the client machine's SMTP port. I realized almost immediately that a proper implementation of this feature would leave all the other delivery methods one step from obsolete.

For weeks I had been tweaking fetchmail and adding features, all the while feeling that the interface design was serviceable but somewhat clumsy: inelegant, and with too many minor options hanging out all over. The options to dump fetched mail to a mailbox file or to standard output particularly bothered me, but I could not figure out why.

What I saw when I started thinking about SMTP forwarding was that popclient had been trying to do too many things at once. It had been designed to act as both a mail transport agent (MTA) and a mail delivery agent (MDA). With SMTP forwarding, it could get out of the MDA business and be a pure MTA, handing mail off to other programs for local delivery just as sendmail does.

Why suffer all the complexity of configuring a delivery agent, or of locking a mailbox file and appending to it, when port 25 is almost guaranteed to be there on any platform that supports TCP/IP? Especially when this means that fetched mail is guaranteed to look like mail that arrived normally by SMTP, which is what we really want.

There are several lessons here. First, the idea of forwarding to the SMTP port was the biggest single payoff I got from consciously trying to emulate Linus's methods. A user gave me a terrific idea, and all that remained was to understand its implications.

The next best thing to having good ideas is recognizing good ideas from your users. Sometimes the latter is better.

Interestingly enough, you will quickly find that if you are completely and candidly truthful about how much you owe to other people, the world will treat you as though you had made every bit of the invention yourself and were merely being modest about your natural genius. We can all see how well this worked for Linus himself!

(When I presented this paper at the Perl Conference in August 1997, Larry Wall was in the front row. When I got to what I have just said, Larry called out loudly, "Go on, tell it, tell them, brother!" Everyone present laughed, because they knew that this had also worked very well for the inventor of Perl.)

And within a few weeks of running the project in the same spirit, I began to receive similar praise, not just from my users but from other people who had heard about it secondhand. I have put some of that email away. I will read it again sometime, if I ever start to wonder whether my life has been worthwhile :-).

But there are two more fundamental lessons here, which have nothing to do with politics and are general to all kinds of design:

12.    Often, the most striking and innovative solutions come from realizing that your concept of the problem was wrong.

I had been trying to solve the wrong problem by continuing to develop popclient as a combined transport and delivery agent, with all kinds of odd local delivery modes. Fetchmail's design needed to be rethought from top to bottom as a pure transport agent, a link in the normal SMTP-speaking path that mail follows on the Internet.

When you run into a wall during development, when you find it hard to think beyond the next fix, it is often time to ask not whether you have the right answer, but whether you are asking the right question. Perhaps the problem needs to be reframed.

Well, I had reframed my problem. Clearly, what I had to do now was (1) add SMTP forwarding support to the generic driver, (2) make it the default mode, and (3) eventually eliminate all the other delivery modes, especially delivery to a mailbox file and dumping to standard output.

I hesitated over step 3 for some time, fearing to upset long-time popclient users who depended on the alternative delivery mechanisms. In theory, they could immediately switch to .forward files, or their equivalents in whatever non-sendmail scheme they used, to get the same results. In practice, the transition could have been messy.

When I finally did it, however, the benefits were immense. The most intricate parts of the driver code vanished. Configuration became radically simpler: no more dealing with the system MDA and the user's mailbox file, and no more worrying about whether the operating system supports file locking.

The only way of losing mail also vanished. Before, if you specified delivery to a mailbox file and the disk was full, the mail was lost irretrievably. This cannot happen with SMTP forwarding, because the SMTP receiver does not return OK until the message has been delivered successfully, or at least queued for later delivery.
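The safety property described here can be sketched in a few lines of Python using the standard smtplib module. The function name `forward_all`, the default sender address and the idea of returning a list of indexes are illustrative assumptions rather than fetchmail's actual design; the point is only that a message becomes safe to delete from the remote server after the local SMTP listener has accepted it, and not before.

```python
import smtplib


def forward_all(messages, smtp, recipient, sender="fetcher@localhost"):
    """Hand each fetched message to an SMTP listener.

    Returns the indexes of the messages the listener accepted; only those
    are safe to delete from the remote mail server. `smtp` is any object
    with an smtplib.SMTP-style sendmail() method. (Hypothetical sketch.)
    """
    safe_to_delete = []
    for index, raw_message in enumerate(messages):
        try:
            refused = smtp.sendmail(sender, [recipient], raw_message)
        except smtplib.SMTPException:
            # Not accepted (for example, no space to queue it):
            # leave the message on the server and try again later.
            continue
        if not refused:
            safe_to_delete.append(index)
    return safe_to_delete
```

In real use, `smtp` would be something like `smtplib.SMTP("localhost", 25)`. If the listener's disk is full it refuses the message, the exception is caught, and the message simply stays on the server instead of being lost.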

In addition, performance improved (though not so much that you would notice it in a single run). Another not insignificant benefit was that the manual page became simpler.

Later I had to bring back delivery through a local agent specified by the user, in order to handle some awkward situations involving dynamic address allocation with SLIP. However, I found a much simpler way to do it.

What was the lesson? Do not hesitate to throw away superfluous features when you can do it without loss of effectiveness. Antoine de Saint-Exupéry (who was an aviator and aircraft designer when he was not writing classic children's books) said:

"Perfection (in design) is achieved not when there is nothing more to add, but when there is nothing more to take away."

When your code is getting both better and simpler, that is when you know you are on the right track. And in the process, the fetchmail design acquired an identity of its own, different from that of its ancestor, popclient.

It was time to change the name. The new design looked much more like a dual of sendmail than the old popclient had; both are MTAs, mail transport agents, but where sendmail pushes and then delivers, the new popclient pulls and then delivers. So, after two arduous months, I renamed it fetchmail.

Fetchmail grows up

There I was with a neat and innovative design, a program that I knew worked well because I used it every day, and a very active beta list that I was learning from. It gradually dawned on me that I was no longer engaged in a trivial personal hack that might happen to be useful to a few other people. I had my hands on a program that every hacker with a Unix box and a SLIP/PPP mail connection really needs.

With the SMTP forwarding feature, it pulled far enough ahead of the competition to potentially become a "category killer", one of those classic programs that fills its niche so competently that the alternatives are not just discarded but almost forgotten.

I think you cannot really aim or plan for a result like this. You have to get pulled into it by design ideas so powerful that afterward the results seem inevitable, natural, even foreordained. The only way to get hold of ideas like that is to play with a lot of ideas, or to have the engineering judgment to take other people's good ideas beyond where their originators thought they could go.

Andrew Tanenbaum had the original idea to build a simple native Unix for the 386, to serve as a teaching tool. Linus Torvalds pushed the Minix concept further than Andrew probably thought it could go, and it grew into something wonderful. In the same way (though on a smaller scale), I took some ideas by Carl Harris and Harry Hochheiser and pushed them hard. None of us was "original" in the romantic way people think of as genius. But then, most science, engineering and software development is not done by original genius, hacker mythology to the contrary.

The results were heady stuff all the same: in fact, just the kind of challenge a hacker lives for! And they meant I would have to set my standards even higher. To make fetchmail as good as I now saw it could be, I would have to write not just for my own needs, but also include and support features necessary to others but outside my orbit. And I had to do that while keeping the program simple and robust.

The first and overwhelmingly most important feature I wrote after realizing this was multidrop support: the ability to fetch mail from mailboxes that had accumulated all the mail for a group of users, and then route each message to its individual recipients.

I decided to add multidrop support partly because some users were clamoring for it, but mostly because I thought it would shake bugs out of the single-drop code by forcing me to deal with addressing in full generality. And so it proved. Getting RFC 822 parsing right took me a remarkably long time, not because any individual piece of it is hard, but because it involved a pile of interdependent and fussy details.
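To make the multidrop idea concrete, here is a minimal Python sketch of routing one message from a shared mailbox to its local recipients. It is an assumption-laden illustration and not fetchmail's algorithm: the function name `local_recipients`, the `local_domain` and `known_users` parameters, and the choice to look only at the To and Cc headers are all hypothetical simplifications. The standard library's `email.utils.getaddresses` hides most of the fussy address parsing.

```python
from email import message_from_string
from email.utils import getaddresses


def local_recipients(raw_message, local_domain, known_users):
    """Return the known local users addressed in the To and Cc headers.

    Hypothetical simplification: a real multidrop router would also have to
    consider envelope information, aliases and malformed headers.
    """
    msg = message_from_string(raw_message)
    fields = msg.get_all("To", []) + msg.get_all("Cc", [])
    recipients = []
    for _display_name, address in getaddresses(fields):
        user, _, domain = address.partition("@")
        if (domain.lower() == local_domain
                and user in known_users
                and user not in recipients):
            recipients.append(user)
    return recipients
```

Each name returned would then receive its own copy of the message. Even this toy version has to handle display names, multiple headers and duplicates, which hints at why doing it properly took so long.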

But multidrop addressing turned out to be an excellent design decision as well. Here is how I knew:

14. Any tool should be useful in the expected way, but a truly great tool lends itself to uses you never expected.

The unexpected use for multidrop fetchmail is to run mailing lists with the list kept, and alias expansion done, on the client side of the SLIP/PPP connection. This means that someone with a personal machine and an ISP account can manage a mailing list without needing continuing access to the ISP's alias files.

Another important change demanded by my beta testers was support for 8-bit MIME operation. This was easy to do, because I had been careful to keep the code 8-bit clean. Not because I had anticipated the demand for this feature, but in obedience to another rule:

15. When writing gateway software of any kind, take pains to disturb the data stream as little as possible, and *never* throw away information unless the recipient forces you to!

Had I not obeyed this rule, 8-bit MIME support would have been difficult and buggy. As it was, all I had to do was read RFC 1652 and add a trivial bit of header-generation logic.

Some European users pressed me to add an option limiting the number of messages retrieved per session (so that they could control the costs of their expensive phone networks). I resisted this change for a long time, and I am still not completely happy with it. But if you are writing for the world, you have to listen to your customers; this does not change just because they are not paying you in money.

A few more lessons from fetchmail

Before returning to general software-engineering issues, there are a couple more specific lessons from the fetchmail experience to consider.

The rc file syntax includes optional "noise" keywords that are ignored entirely by the parser. The English-like syntax they allow is considerably more readable than the traditional terse keyword-value pairs you get when you strip those optional keywords out.

These began as a late-night experiment, when I noticed how much the rc file declarations were beginning to resemble an imperative minilanguage. (This is also why I changed the original popclient keyword "server" to "poll".)
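The noise-keyword idea is easy to sketch. The Python below is a hypothetical illustration, not fetchmail's parser: the particular noise words, the function names, and the sample rc lines are assumptions made for the example.

```python
# Assumed set of noise words; the text above does not list the actual ones.
NOISE_WORDS = {"and", "with", "has", "wants", "options"}


def strip_noise(line):
    """Return the tokens of an rc-style line with noise keywords removed."""
    return [tok for tok in line.split() if tok.lower() not in NOISE_WORDS]


def parse_entry(line):
    """Parse 'keyword value keyword value ...' into a dict, ignoring noise."""
    tokens = strip_noise(line)
    if len(tokens) % 2 != 0:
        raise ValueError(f"dangling keyword in: {line!r}")
    return dict(zip(tokens[0::2], tokens[1::2]))


english = "poll mail.example.org with protocol pop3 and username fred"
terse = "poll mail.example.org protocol pop3 username fred"

# Both spellings carry exactly the same information for the parser.
assert parse_entry(english) == parse_entry(terse)
```

The cost of this approach in the sketch is that a noise word can never be used as a value. In a language this small that is an acceptable trade, which is the point the following paragraphs develop.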

It seemed to me at the time that making that imperative minilanguage more like English might make it easier to use. Now, although I am a convinced partisan of the "make it a language" school of design, as exemplified by Emacs, HTML and many database engines, I am not normally a fan of English-like syntaxes.

Programmers have traditionally tended to favor control syntaxes that are very precise and compact and have no redundancy at all. This is a cultural legacy from the time when computing resources were expensive, so parsing stages had to be as cheap and simple as possible. English, with about 50% redundancy, looked like a very inappropriate model then.

This is not my reason for distrusting English-like syntaxes; I mention it here only to demolish it. With cheap cycles, terseness should not be an end in itself. Nowadays it is more important for a language to be convenient for humans than to be economical in computational resources.

There are, however, good reasons to tread carefully. One is the complexity cost of the parsing stage: you do not want to raise it to the point where it becomes a significant source of bugs and user confusion in itself. Another is that making a language syntax English-like often demands that the "English" it speaks be bent seriously out of shape, so much so that the superficial resemblance to natural language is as confusing as a traditional syntax would have been. (You can see a lot of this in the so-called fourth-generation languages and in commercial database-query languages.)

The fetchmail control syntax seems to avoid these problems because the language domain is extremely restricted. It is nowhere near a general-purpose language; the things it says are simply not very complicated, so there is little potential for confusion in moving between a tiny subset of English and the actual control language. I think there may be a broader lesson here:

16. When your language is nowhere near Turing-complete, syntactic sugar can be your friend.

Another lesson is about security by obscurity. Some fetchmail users asked me to change the software to store passwords encrypted in the rc file, so that snoopers could not casually see them.

I did not do it, because this would not actually add any protection. Anyone who has acquired permission to read your rc file would be able to run fetchmail as you anyway; and if it is your password they are after, they could rip the necessary decoder out of the fetchmail code itself to get it.
All that encrypting passwords in the .fetchmailrc file would have achieved is a false sense of security for people who do not think very hard about it. The general rule is as follows:

17. A security system is only as secure as its secret. Beware of pseudo-secrets.
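A short Python sketch shows why a password "encrypted" in the rc file would have been a pseudo-secret. Everything here is hypothetical (the key, the XOR scheme and the function names are invented for illustration): the point is only that any scheme the program can reverse unattended must ship its key and decoder, so whoever can read the file and the program can reverse it too.

```python
import base64

# Hypothetical key. It has to be embedded in the program, or the program
# could not recover the password without asking the user.
EMBEDDED_KEY = b"shipped-with-every-copy"


def _xor(data, key):
    """XOR data against a repeating key (symmetric, so it also decodes)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def obfuscate(password):
    """What the program would write into the rc file."""
    return base64.b64encode(_xor(password.encode("utf-8"), EMBEDDED_KEY)).decode("ascii")


def deobfuscate(stored):
    """What the program, or anyone who copies this function, can run."""
    return _xor(base64.b64decode(stored), EMBEDDED_KEY).decode("utf-8")
```

The stored string looks opaque, yet `deobfuscate(obfuscate(p))` returns `p` for anyone holding the code. The only real secret is the file's read permission, which is exactly what the attacker in this scenario already has.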

Necessary preconditions for the bazaar style

Early readers and test audiences for this paper consistently asked about the preconditions for successful bazaar-style development, including both the qualifications of the project leader and the state of the code at the time one goes public and starts trying to build a community of co-developers.

It is fairly clear that one cannot code from the ground up in bazaar style. One can test, debug and improve in bazaar style, but it would be very hard to originate a project in bazaar mode. Linus did not try it. Neither did I. Your nascent developer community needs something runnable and testable to play with.

When you start building a community, what you need to be able to present is a plausible promise. The program need not be particularly good. It can be crude, buggy, incomplete and poorly documented. What it must not fail to do is convince potential co-developers that it can evolve into something elegant in the foreseeable future.

Linux and fetchmail both went public with strong, attractive basic designs. Many people thinking about the bazaar model as I have presented it have correctly considered this critical, then jumped from it to the conclusion that a high degree of design intuition and cleverness in the project leader is indispensable.

But Linus got his design from Unix. I got mine initially from the ancestral popclient (though it would later change a great deal, much more, proportionately speaking, than Linux has). So does the leader or coordinator of a bazaar-style effort really need exceptional design talent, or can he get by on leveraging the design talent of others?

I think it is not critical that the coordinator be able to originate designs of exceptional brilliance, but it is absolutely critical that he (or she) be able to recognize good design ideas from others.

Both the Linux and fetchmail projects give evidence of this. Although Linus is not (as discussed above) a spectacularly original designer, he has displayed a powerful knack for recognizing good design and integrating it into the Linux kernel. And I have already described how the single most powerful design idea in fetchmail (SMTP forwarding) came from somebody else.

Early readers of this paper complimented me by suggesting that I am prone to undervalue design originality in bazaar projects because I have a lot of it myself, and therefore take it for granted. There may be some truth to this; design (as opposed to coding or debugging) is certainly my strongest skill.
But the problem with being clever and original in software design is that it becomes a habit: you start reflexively making things cute and complicated when you should be keeping them simple and robust. I have had projects fail on me because of this mistake, but I managed not to let it happen with fetchmail.

So I believe the fetchmail project succeeded partly because I restrained my tendency to be clever; this argues (at least) against design originality being essential for successful bazaar projects. And consider Linux again. Suppose Linus Torvalds had been trying to pull off fundamental innovations in operating system design during the development; does it seem at all likely that the resulting kernel would be as stable and successful as the one we have today?

A certain base level of design and coding skill is required, of course, but I expect that almost anybody seriously thinking of launching a bazaar-style effort will already be above that minimum. The free-software community's internal market in reputation exerts subtle pressure on people not to launch development efforts they are not competent to follow through on. So far, this seems to be working quite well.

There is another kind of skill, not normally associated with software development, which I think is as important to bazaar projects as design cleverness, and sometimes even more so. A bazaar project coordinator or leader must have good people and communication skills.

This should be obvious. In order to build a development community, you need to attract people, interest them in what you are doing, and keep them happy about the amount of work they are doing. Technical sizzle goes a long way towards accomplishing this, but it is far from the whole story. The personality you project matters, too.

It is no coincidence that Linus is a nice guy who makes people like him and want to help him. Nor is it a coincidence that I am an energetic extrovert who enjoys working a crowd and has some of the delivery and instincts of a stand-up comic. To make the bazaar model work, it helps enormously to have at least a little skill at charming people.

The social context of free software

It is truly said that the best hacks start out as personal solutions to the author's everyday problems, and spread because the problem turns out to be typical for a large class of users. This brings us back to the matter of rule 1, which can perhaps be restated in a more useful way:

18. To solve an interesting problem, start by finding a problem that is interesting to you.

So it was with Carl Harris and the ancestral popclient, and so with me and fetchmail. But this has been understood for a long time. The interesting point, the point that the histories of Linux and fetchmail seem to demand we focus on, is the next stage: the evolution of software in the presence of a large and active community of users and co-developers.

In The Mythical Man-Month, Fred Brooks observed that programmer time is not fungible; adding developers to a late software project makes it later. He argued that the complexity and communication costs of a project rise with the square of the number of developers, while work done rises only linearly. This claim is known as Brooks's Law, and is generally accepted as true. But if Brooks's Law were the whole picture, then Linux would be impossible.
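A little arithmetic shows the shape of that claim. The Python sketch below assumes one common reading of the "square" argument, namely that every pair of developers is a potential communication channel, giving n(n-1)/2 channels for n developers; the function name is hypothetical and the formula is an illustrative assumption, not something spelled out in the text above.

```python
def communication_paths(developers):
    """Number of distinct developer pairs, assuming everyone may talk to everyone."""
    return developers * (developers - 1) // 2


for n in (2, 5, 10, 50, 200):
    # Work capacity grows like n; pairwise coordination grows like n squared.
    print(f"{n:>3} developers -> {communication_paths(n):>5} possible channels")
```

Under this assumption, the two hundred or so contributors mentioned later for fetchmail would imply 19900 potential channels, which is why a project of that size cannot work if everyone really has to coordinate with everyone else.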

Gerald Weinberg's classic The Psychology of Computer Programming supplied what, in hindsight, we can see as a vital correction to Brooks. In his discussion of "egoless programming", Weinberg observed that in shops where developers are not territorial about their code, and encourage other people to look for bugs and potential improvements in it, improvement happens dramatically faster than elsewhere.

Weinberg's choice of terminology has perhaps prevented his analysis from gaining the acceptance it deserves: one has to smile at the thought of describing Internet hackers as "egoless". I believe, however, that his argument looks more compelling today than ever.

The history of Unix should have prepared us for what we are learning from Linux (and what I have verified experimentally on a smaller scale by deliberately copying Linus's methods). That is, while coding remains an essentially solitary activity, the really great hacks come from harnessing the attention and brainpower of entire communities. The developer who uses only his or her own brain in a closed project is going to fall behind the developer who knows how to create an open, evolutionary context in which bug-spotting and improvements are done by hundreds of people.

But the traditional Unix world was prevented from pushing this approach to its logical conclusion by several factors. One was the legal constraints of various licenses, trade secrets and commercial interests. Another (in hindsight) was that the Internet was not yet good enough.

Before the Internet was widely available, there were some geographically compact communities where the culture encouraged Weinberg's "egoless programming", and a developer could easily attract a lot of skilled users and co-developers. Bell Labs, the MIT AI Lab and the University of California, Berkeley became the home of innovations that are legendary and still potent.

Linux was the first project to make a conscious and successful effort to use the entire world as its talent pool. I do not think it is a coincidence that the gestation period of Linux coincided with the birth of the World Wide Web, and that Linux left its infancy during the same period in 1993-1994 that saw the takeoff of the ISP industry and the explosion of mainstream interest in the Internet. Linus was the first person who learned how to play by the new rules that pervasive Internet made possible.

While cheap Internet was a necessary condition for the Linux model to evolve, I do not think it was by itself a sufficient condition. Another vital factor was the development of a leadership style and set of cooperative customs that allow developers to attract co-developers and get the most out of the medium.

But what is this leadership style, and what are these customs? They cannot be based on power relationships, and even if they could be, leadership by coercion would not produce the results we are seeing. Weinberg quotes the autobiography of the nineteenth-century Russian anarchist Kropotkin, Memoirs of a Revolutionist, to good effect on this subject:

"Having been brought up in a serf-owner's family, I entered active life, like all young men of my time, with a great deal of confidence in the necessity of commanding, ordering, scolding, punishing and the like. But when, at an early stage, I had to manage serious enterprises and to deal with free men, and when each mistake would lead at once to heavy consequences, I began to appreciate the difference between acting on the principle of command and discipline and acting on the principle of common understanding. The former works admirably in a military parade, but it is worth nothing where real life is concerned, and the aim can be achieved only through the severe effort of many converging wills."

The "severe effort of many converging wills" is precisely what a project like Linux requires; and the "principle of command" is effectively impossible to apply among volunteers in the anarchist's paradise we call the Internet. To operate and compete effectively, hackers who want to lead collaborative projects have to learn how to recruit and energize communities of interest in the mode vaguely suggested by Kropotkin's "principle of understanding". They must learn to use Linus's Law.

Earlier I referred to the Delphi effect as a possible explanation for Linus's Law. But more powerful analogies to adaptive systems in biology and economics also irresistibly suggest themselves. The Linux world behaves in many respects like a free market or an ecology: a collection of selfish agents attempting to maximize utility, which in the process produces a self-correcting spontaneous order more elaborate and efficient than any amount of central planning could achieve. Here, then, is the place to look for the "principle of understanding".

The "utility function" Linux hackers are maximizing is not economic in the classical sense, but is the intangible of their own ego satisfaction and reputation among other hackers. (One could call their motivation "altruistic", but this ignores the fact that altruism is itself a form of ego satisfaction for the altruist.) Volunteer cultures that work this way are not actually uncommon; another in which I have participated is science fiction fandom, which unlike the hacker world explicitly recognizes "egoboo" (the enhancement of one's reputation among others) as the basic drive behind volunteer activity.

Linus, by successfully positioning himself as the gatekeeper of a project in which the development is mostly done by others, and nurturing interest in it until it became self-sustaining, has shown an acute grasp of Kropotkin's "principle of shared understanding". This quasi-economic view of the Linux world lets us see how that understanding is applied.

We can view Linus's method as a way to create an efficient market in "egoboo": to connect, as firmly as possible, the selfishness of individual hackers to difficult ends that can only be achieved by sustained cooperation. With the fetchmail project I have shown (on a much smaller scale, of course) that his methods can be copied successfully. Perhaps I have even done it a bit more consciously and systematically than he.

Many people (especially those who politically distrust free markets) would expect a culture of self-directed egoists to be fragmented, territorial, wasteful, secretive and hostile. But this expectation is clearly refuted by (for instance) the stunning variety, quality and depth of Linux documentation. It is taken for granted that programmers hate documenting; how is it, then, that Linux hackers generate so much of it? Evidently Linux's free market in egoboo works better to produce virtuous, other-directed behavior than the massively funded documentation departments of commercial software producers.

Both the fetchmail and Linux kernel projects show that, by properly rewarding the egos of many other hackers, a strong developer/coordinator can use the Internet to reap the benefits of having a large number of co-developers without the project collapsing into a chaotic mess. So to Brooks's Law I counter-propose the following:

19. Provided the development coordinator has a medium at least as good as the Internet, and knows how to lead without coercion, many heads are inevitably better than one.

I think the future of free software will increasingly belong to people who know how to play Linus's game, people who leave behind the cathedral and embrace the bazaar. This is not to say that individual vision and brilliance will no longer matter; on the contrary, I think the cutting edge of free software will belong to people who start from individual vision and brilliance, then amplify it by building voluntary communities of interest.

And perhaps this is not only the future of free software. No commercial developer can match the pool of talent the Linux community can bring to bear on a problem. Very few could afford even to hire the more than two hundred people who have contributed to fetchmail!

Perhaps in the long run the free-software culture will triumph, not because cooperation is morally right or because the "ownership" of software is morally wrong (assuming you believe the latter, which neither Linus nor I do), but simply because the commercial world cannot win an evolutionary arms race with free-software communities that can put orders of magnitude more skilled time into a problem than any company.

This paper was improved by conversations with a large number of people who helped debug it. In particular, I thank Jeff Dutky, who suggested the "debugging is parallelizable" formulation and helped develop the analysis that follows from it. I also thank Nancy Lebovitz for her suggestion that I emulate Weinberg by quoting Kropotkin. Perceptive criticisms also came from Joan Eslinger and Marty Franz of the General Technics list. Paul Eggert noticed the conflict between the GPL and the bazaar model. I am grateful to the members of PLUG, the Philadelphia Linux User's Group, for providing the first test audience for the first public version of this paper. Finally, Linus Torvalds's comments were very helpful, and his early endorsement was very encouraging.

Further reading

I quoted several passages from Frederick P. Brooks's classic The Mythical Man-Month because, in many respects, his insights have yet to be improved upon. I heartily recommend the 25th anniversary edition from Addison-Wesley (ISBN 0-201-83595-9), which includes his paper "No Silver Bullet".

The new edition includes an invaluable twenty-years-later retrospective, in which Brooks frankly admits to the few judgments in the original text that have not stood the test of time. I first read the retrospective after this paper was essentially finished, and was surprised to find that Brooks attributes bazaar-like practices to Microsoft!

Gerald M. Weinberg's The Psychology of Computer Programming (New York, Van Nostrand Reinhold, 1971) introduced the rather unfortunately labeled concept of "egoless programming". Although he was far from the first person to realize the futility of the "principle of command", he was probably the first to recognize and argue the point in connection with software development.

Richard P. Gabriel, contemplating the Unix culture of the pre-Linux era, argued for the superiority of a primitive bazaar-style model in his 1989 paper Lisp: Good News, Bad News, and How To Win Big. Though dated in some respects, this essay is still rightly celebrated among Lisp fans (among whom I include myself). One of them reminded me that the section titled "Worse Is Better" reads almost as an anticipation of Linux.

De Marco and Lister's Peopleware: Productive Projects and Teams (New York, Dorset House, 1987; ISBN 0-932633-05-6) is an underappreciated gem, which I was delighted to see Fred Brooks cite in his retrospective. While little of what the authors say is directly applicable to the Linux or free-software communities, their insight into the conditions necessary for creative work is acute and highly recommended for anyone who tries to bring some of the virtues of the bazaar to a more commercial context.

Epilogue: Netscape adopts the bazaar!

It's a strange feeling to realize you're helping to make history...

On January 22, 1998, approximately seven months after I first published this paper, Netscape Communications, Inc. announced plans to release the source for Netscape Communicator. I had no idea this was going to happen before the day of the announcement.

Eric Hahn, Executive Vice President and Chief Technology Officer at Netscape, sent me an email shortly after the announcement, which reads: "On behalf of everyone at Netscape, I want to thank you for helping us get to this point in the first place. Your thinking and writings were fundamental inspirations to our decision."

The following week, I flew out to Silicon Valley at Netscape's invitation for a day-long strategy conference (on February 4, 1998) with some of their top executives and technical people. Together, we designed Netscape's source-release strategy and license, and made some other plans that we hope will eventually have far-reaching positive effects on the free-software community. As I write, it is too early to be more specific, but details should be published in the weeks to come.

Netscape is about to provide us with a large-scale, real-world test of the bazaar model in the commercial world. The free-software culture now faces a danger; if Netscape's execution does not work, the free-software concept may be so discredited that the commercial world will not touch it again for another decade.

On the other hand, this is also a spectacular opportunity. The initial reaction to the move on Wall Street and elsewhere has been cautiously positive. We are being given a chance to prove ourselves, too. If Netscape regains substantial market share through this move, it may set off a long-overdue revolution in the software industry.

The next year should prove to be a very interesting and intense learning period.

Version and updates

I presented version 1.17 at the Linux Kongress on May 21, 1997. I added the bibliography on July 7, 1997.
I added the Perl Conference anecdote on November 18, 1997.

I replaced the term "free software" with "open source" on February 9, 1998 in version 1.29.

I added the section "Epilogue: Netscape adopts the bazaar!" on February 10, 1998 in version 1.31.

I removed Paul Eggert's paragraph on GPL vs. bazaar in response to cogent arguments from RMS on July 28, 1998.

Other revisions incorporate minor editorial changes and corrections of detail.