Chapter IV: Information Processing: Threat or Menace

Some years ago I decided to set up my own web site. One question was how much of my life to include. Did I want someone looking at my academic work – perhaps a potential employer – to discover that I had put a good deal of time and energy into researching medieval recipes, a subject unrelated to either law or economics, thus (arguably) proving that I was a dilettante rather than a serious scholar? Did I want that same potential employer to discover that I held unfashionable political opinions, ranging from support for drug legalization to support for open immigration? And did I want someone who might be outraged at my political views to be able to find out what I and my family members looked like and where we lived?

I concluded that keeping my life in separate compartments was not a practical option. I could have set up separate sites for each part, with no links between them – but anyone with a little enterprise could have found them all with a search engine. And even without a web site, anyone who wanted to know about me could find vast amounts of information by a quick search of Usenet, where I have been an active poster for more than ten years. Keeping my virtual mouth shut was not a price I was willing to pay, and nothing much short of that would do the job.

This is not a new problem. Before the internet existed, I still had to decide to what degree I wanted to live in multiple worlds – whether, for example, I should discuss my hobbies or my political views with professional colleagues. What has changed is the scale of the problem. In a large world where personal information was spread mostly by gossip and processed almost entirely by individual human brains, facts about me were to a considerable extent under my control – not because they were secret but because nobody had the time and energy to discover everything knowable about everyone else. Unless I was a major celebrity, I was the only one specializing in me.

That was not true everywhere. In the good old days – say most of the past 3,000 years – one reason to run away to the big city was to get a little privacy. In the villages in which most of the world lived, anyone’s business was everyone’s business. In Sumer or Rome or London the walls were no more opaque and you were no less visible than at home, but there was so much going on, so many people, that nobody could keep track of it all.

That form of privacy – privacy through obscurity – cannot survive modern data processing. No individual can keep track of it all but many of us have machines that can. The data of an individual life are not notably more complicated than they were 2,000 years ago. It is true that the number of lives has increased thirty- or fortyfold in the last 2,000 years,¹ but our ability to handle data has increased a great deal more than that. Not only can we keep track of the personal data for a single city, we could, to at least a limited degree, keep track of the data for the whole world, assuming we had it and wanted to.

The implications of these technologies have become increasingly visible over the past ten or fifteen years. Many are highly desirable. The ability to gather and process vast amounts of information permits human activities that would once have been impossible; to a considerable extent it abolishes the constraints of geography on human interaction. Consider two examples.

Thirty-some years ago, I spent several summers as a counselor at a camp for gifted children. Many of the children, and some of my fellow counselors, became my friends – only to vanish at the end of the summer. From time to time I wondered what had become of them.

I can now stop wondering, at least about some. A few years ago, someone who had been at the camp organized an email list for ex-campers and counselors; membership is currently over 200. That list exists because of technologies that make possible not only easy communication with people spread all over the country but also finding them in the first place – searching a very large haystack for a few hundred needles. Glancing down a page of Yahoo! Groups, I find almost 3,000 such lists, each for a different camp; the largest has more than 700 members.

For a second example, consider a Usenet newsgroup that I stumbled across many years ago, dedicated to the Vectrex, a technologically ingenious but now long obsolete video game machine of which I once owned two – one for my son and one for me. Reading the posts, I discovered that someone in the group had located Smith Engineering/Western Technologies, the firm that held the copyright on the Vectrex and its games, and written to ask permission to make copies of game cartridges. The response, pretty clearly from the person who designed the machine, was an enthusiastic yes. He was obviously delighted to discover that there were people still playing with his toy, his dream, his baby. Not only were they welcome to copy cartridges, if anyone wanted to write new games he would be happy to provide the necessary software. It was a striking, to me heartwarming, example of the ability of modern communications technology to bring together people with shared enthusiasms.

My examples so far are small and noncommercial – people learning other people’s secrets or getting together with old friends or strangers with shared interests. While such applications of informational technology are an increasingly important feature of the world we live in, they are not nearly as prominent or politically contentious as large-scale commercial uses of personal information. A first step in understanding such activities is to think about why some people would want to collect and use individual information about large numbers of strangers. Consider two examples.

You are planning to open a new grocery store in an existing chain – a multi-million dollar gamble. Knowledge about the people who live in the neighborhood – how likely they are to shop at your store and how much they will buy – is crucial. How do you get it?

The first step is to find out what sort of people shop in your present stores and what they buy. To do that you offer customers a shopping card. The card is used to get discounts, so shoppers pass the card through a reader almost every time they go through the checkout, providing you lots of detailed information about their shopping patterns. One way you use that information is to improve the layout of existing stores; if people who buy spaghetti almost always buy spaghetti sauce at the same time, putting them in the same aisle will make your store more convenient, hence more attractive, hence more profitable.
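To make the spaghetti example concrete, here is a minimal sketch in Python of the kind of tally a chain might run on its card data. The baskets and the function names (`pair_counts`, `share_bought_with`) are invented for illustration; a real chain would work from millions of checkout records and far more careful statistics.

```python
from collections import Counter
from itertools import combinations

# Hypothetical checkout records: one set of product names per card swipe.
baskets = [
    {"spaghetti", "spaghetti sauce", "milk"},
    {"spaghetti", "spaghetti sauce"},
    {"milk", "bread"},
    {"spaghetti", "spaghetti sauce", "bread"},
    {"spaghetti", "bread"},
]

def pair_counts(baskets):
    """Count how often each pair of products shows up in the same basket."""
    counts = Counter()
    for basket in baskets:
        # Sorting gives each pair one canonical order, e.g. ("bread", "milk").
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts

def share_bought_with(baskets, item, other):
    """Fraction of baskets containing `item` that also contain `other`."""
    with_item = [b for b in baskets if item in b]
    if not with_item:
        return 0.0
    return sum(other in b for b in with_item) / len(with_item)

print(pair_counts(baskets).most_common(1))
print(share_bought_with(baskets, "spaghetti", "spaghetti sauce"))
```

A high share for a pair of products is the signal that they belong in the same aisle.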

Another way is to help you decide where to locate your new store. If you discover that old people on average do not buy very much of what you are selling, perhaps a retirement community is the wrong place. If couples with young children do all their shopping on the weekend when one parent can stay home with the kids, singles shop after work on weekdays (weekends are for parties), and retired people during the working day (shorter lines), then a location with a suitable mix of all three types will give you a more even flow of customers, higher utilization of the store, and greater profits. Combining information about your customers with information about the demography of alternative locations, provided free by the U.S. census or at a higher price by private firms, you can substantially improve the odds on your gamble.

For a higher tech application of information technology, consider advertising. When I read a magazine, I see the same ads as everyone else – mostly for things I have no interest in. But a web page can send a different response to every query, customizing the ads I see to fit my interests. No TV ads, since I do not own a television, lots of ads for high-tech gadgets.

In order to show me the right ads, the people managing the page need to know what I am interested in. Striking evidence that such information is already out there and being used appears in my mailbox on a regular basis – a flood of catalogs.

How did the companies sending out those catalogs identify me as a potential customer? If they could see me, it would be easy. Not only am I wearing a technophile ID bracelet (Casio calls it a databank watch), I am wearing the model that, in addition to providing a calculator, database, and appointment calendar, also checks in three times a day with the U.S. atomic clock to make sure it has exactly the right time. Sharper Image, Techno-Scout, Innovations et al. cannot see what is on my wrist – although if the next chapter’s transparent society comes to pass that may change. They can, however, talk to each other. When I bought my Casio Wave Captor Databank 150 (the name would have been longer but they ran out of room on the watch), that purchase provided the proprietors of the catalog I bought it from with a snippet of information about me. They no doubt resold that information to anyone willing to pay for it. Sellers of gadgets respond to the purchase of a Casio Wave Captor the way sharks respond to blood in the water.

As our technology gets better, it becomes possible to create and use such information at lower cost and in much more detail. A web page can keep track not only of what you buy but also of what you look at and for how long. Combining information from many sources, it becomes both possible and potentially profitable to create databases with detailed information on the behavior of a very large number of individuals, certainly including me, probably including you.

The advantages of that technology to individual customers are fairly obvious. If I am going to look at ads, I would prefer that they be ads for things I might want to buy. If I am going to have my dinner interrupted by a telephone call from a stranger, I would prefer it be someone offering to prune my aging apricot tree – last year’s crop was a great disappointment – rather than someone offering to refinance my nonexistent mortgage.

As these examples suggest, there are advantages to individuals to having their personal information publicly available and easy to find. What are the disadvantages? Why are many people upset about the loss of privacy and the misuse of “their” private information? Why did Lotus, after announcing its plan to offer masses of such data on a CD, have to cancel it in response to massive public criticism? Why is the question of what information web sites are permitted to gather about their customers, what they may do with it, and what they must tell their customers about what they are doing with it, a live political and legal issue?

One gut-level answer is that many people feel strongly that information about them is theirs. They should be able to decide who gets it; if it is going to be sold, they should get the money.

The economist’s response is that they already do get the money. The fact that selling me a gadget provides the seller with a snippet of information that he can then resell makes the transaction a little more profitable for the seller, attracts additional sellers, and ultimately drives down the price I must pay for the gadget. The effect is tiny – but so is the price I could get for the information if I somehow arranged to sell it myself. It is only the aggregation of large amounts of such information that is valuable enough to be worth the trouble of buying and selling it.

A different response, motivated by moral intuition rather than economics, is that the argument confuses information about me – located in someone else’s mind or database – with information that belongs to me. How can I have a property right over the contents of your mind? If I am stingy or dishonest, do I have an inherent right to forbid those I treat badly from passing on the information? If not, why should I have a right to forbid them from passing on other information about me?

There is, however, a vaguer but more important reason why people are upset at the idea of a world where anyone willing to pay can learn almost everything about them. Many people value their privacy not because they want to be able to sell information about themselves but because they do not want other people to have it. While it is hard to come up with a clear explanation of why we feel that way – a subject discussed at greater length in the final chapter of this section – it is clear that we do. At some level, control over information about ourselves is seen as a form of self-protection. The less other people can find out about me, the less likely it is that they will use information about me either to injure me or to identify me as someone they wish to injure – which brings us back to some of the issues I considered when setting up my web page.

Concerns with privacy apply to at least two sorts of personal information. One is information generated by voluntary transactions with some other party – what products I have bought and sold, what catalogs and magazines I subscribe to, what web pages I browse. Such information starts in the possession of both parties to the transaction – I know what I bought from you, you know what you sold to me. The other kind is information generated by actions I take that are publicly visible – court records, newspaper stories, gossip.

Ownership of the first sort of information can, at least in principle, be determined by contract. A magazine can, and some do, promise its subscribers that their names will not be sold. Software firms routinely offer people registering their programs the option of having their names made or not made available to other firms selling similar products. Web pages can, and many do, provide explicit privacy policies limiting what they will do with the information generated in the process of browsing their sites.

To understand the economics of the process, think of information as a produced good; like other such goods, who owns how much of it is determined by agreement among the parties who produce it. When I subscribe to a magazine, the publisher and I are jointly producing a piece of information about my tastes – the information that I like that kind of magazine. That information is of value to the magazine, which may want to resell it. It is of value to me, either because I might want to resell it or because I might want to keep it off the market in order to protect my privacy. The publisher can, by selling subscriptions at a lower price without a privacy guarantee than with, offer to pay me for control over the information. If the information is worth more to me than he is offering, I refuse; if it is worth less, I accept. Control over the information ends up with whoever most values it. If no mutually acceptable terms can be found, I do not subscribe and that bit of information does not get produced.

This seems to imply that default rules about privacy, rules specifying who starts out owning the information, should not matter. That would be true in a world where arranging contracts was costless – a world of zero transaction costs. In the world we now live in, it is not. Most of us, unless we care a great deal about our privacy, do not bother to read privacy policies. Even if I prefer that catalogs and mailing lists not resell information about me, it is too much trouble to check the small print on everything I might subscribe to. It would be still more trouble if every firm I dealt with offered two prices, one with and one without a guarantee of privacy, and more still if the firm offered a menu of levels of protection, each with its associated price.

The result is that most magazines and web sites, at least in my experience, offer only a single set of terms; if they allow the subscriber some choice, it is not linked to price, probably because the amounts involved are too small to be worth bargaining over. Hence default rules matter and we get political and legal conflicts over the question of who, absent any explicit contractual agreement, has what control over the personal information generated by transactions.
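The logic of why default rules matter only when bargaining is costly can be put in a few lines of Python. This is a sketch under simplifying assumptions: two parties, values measured in cents, and a single lump-sum cost of striking a bargain. The function name `final_owner` and all the numbers are hypothetical.

```python
def final_owner(value_to_me, value_to_publisher, default_owner, transaction_cost):
    """Who ends up controlling the information, given who starts with it.

    Control moves away from the default owner only if the other party
    values it more by a margin larger than the cost of making the deal.
    """
    values = {"me": value_to_me, "publisher": value_to_publisher}
    if default_owner not in values:
        raise ValueError("default_owner must be 'me' or 'publisher'")
    other = "publisher" if default_owner == "me" else "me"
    gain_from_trade = values[other] - values[default_owner]
    if gain_from_trade > transaction_cost:
        return other
    return default_owner

# Costless bargaining: the publisher, who values the information more,
# ends up with it whichever way the default rule points.
print(final_owner(3, 10, "me", 0), final_owner(3, 10, "publisher", 0))

# Bargaining costs 25 cents: the information stays wherever the law put it.
print(final_owner(3, 10, "me", 25), final_owner(3, 10, "publisher", 25))
```

With a transaction cost of zero the two defaults give the same answer; with a cost larger than the seven cents at stake, they give different ones.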

That may change. What may change it is technology – the technology of intelligent agents. It is possible in principle, and is becoming possible in practice, to program your web browser with information about your privacy preferences. Using that information, the browser can decide what different levels of privacy protection are or are not worth to you and select pages and terms accordingly. Browsers work cheap.

For this to happen we need a language of privacy – a way in which a web page can specify what it does or does not do with information generated by your interactions with it in a form your browser can understand. Once such a language exists and is in widespread use, the transaction costs of bargaining over privacy drop sharply. You tell your browser what you want and what it is worth to you; your browser interacts with a program on the web server hosting the page and configured by the page’s owner. Between them they agree on mutually satisfactory terms – or they fail to do so, and you never see the page.

This is not a purely hypothetical idea. Its current incarnation is the Platform for Privacy Preferences (P3P), supported by several of the leading web browsers. Web pages provide information about their privacy policies, users provide information about what they are willing to accept, and the browser notifies the user if a site’s policies are inconsistent with his requirements. Presumably a web site that misrepresented its policies could be held liable for doing so, although, as far as I know, no such case has yet reached the courts.
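The comparison a browser makes can be sketched in Python. The practice names and the dictionary layout below are invented for illustration and are not the actual P3P vocabulary or file format; the point is only that, once both sides are machine-readable, the check is trivial.

```python
def unacceptable_practices(site_policy, user_accepts):
    """List the practices a site declares that the user has not agreed to.

    site_policy:  practice name -> True if the site engages in it
    user_accepts: practice name -> True if the user is willing to accept it
    A practice the user never mentioned is treated as not accepted.
    """
    return sorted(
        practice
        for practice, site_does_it in site_policy.items()
        if site_does_it and not user_accepts.get(practice, False)
    )

# Hypothetical declarations by one site and one user.
site_policy = {
    "log_pages_viewed": True,
    "resell_to_third_parties": True,
    "send_email_ads": False,
}
user_accepts = {"log_pages_viewed": True}

conflicts = unacceptable_practices(site_policy, user_accepts)
if conflicts:
    print("Warn the user about:", conflicts)
```

An empty list means the site's stated policy fits within what the user will tolerate, and the browser has nothing to report.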

Safe to tell a secret to one,
Risky to two,
To tell it to three is folly,
Everyone else will know.

Suppose we solve the transaction cost problems, permitting a true market in personal information. There remains a second problem – enforcing the rights you have contracted for. You can check the contents of your safe deposit box to be sure they are still there, but it does no good to check the contents of a firm’s database to make sure your information is still there. They can sell your information and still have it.

The problem of enforcing rights with regard to information is not limited to a future world of automated contracting – it exists today. As I like to put it when discussing current privacy law, there are only two ways of controlling information about you and one of them doesn’t work.

The way that doesn’t work is to let other people have information about you and then make rules about how they use it. That is the approach embodied in modern privacy law. If you disagree with my evaluation, I suggest a simple experiment. Start with $5,000, the name of a random neighbor, and the Yellow Pages for “Investigators.” The objective is to end up with a credit report on your neighbor – something that, under the Federal Fair Credit Reporting Act, you are not allowed to have. If you are a competent con man or internet guru, you can probably dispense with the money and the phone book.

That approach to protecting privacy works poorly when enforcing terms imposed by federal law. It should work somewhat better for enforcing terms agreed to in the marketplace, since in that case it is supported by reputational as well as legal sanctions – firms do not want the reputation of cheating their customers. But I would still not expect it to work terribly well. Once information is out there, it is very hard to keep track of who has it and what he has done with it. It is particularly hard when there are many uses of the information that you do not want to prevent – a central problem with the Fair Credit Reporting Act. Setting up rules that permit only people with a legitimate reason to look at your credit report is hard; enforcing them is harder.

The other way of protecting information, the way that does work, is not to let the information out in the first place. That is how the strong privacy of the previous chapter was protected. You do not have to trust your ISP or the operator of an anonymous remailer not to tell your secrets; you haven’t given them any secrets to tell.

There are problems with applying that approach to transactional information. When you subscribe to a magazine, the publisher knows who you are, or at least where you live – it needs that information to get the magazine to you. When you buy something from me, I know that I have sold it to you. The information starts in the possession of both of us – short of controlled amnesia, how can it end in the possession of only one?

In our present world, that is a nearly insuperable problem. But in a world of strong privacy, you do not have to know whom you are selling to. If, at some point in the future, privacy is sufficiently important to people, online transactions can be structured to make each party anonymous to the other, with delivery either online via a remailer (for information transactions) or the less convenient realspace equivalent of a physical forwarding system. In such a world, we are back with one of the oldest legal rules of all: possession. If I have not revealed the information to you, you do not have it, so I need not worry about what you are going to do with it.

Returning to something more like our present world, one can imagine institutions that would permit a considerably larger degree of individual control over the uses of personal information than now exists, modeled on arrangements now used to maintain firms’ control over their valuable mailing lists. Individuals subscribing to a magazine would send the seller not their name and address but the name of the information intermediary they employed and the number by which that intermediary identified them. The magazine’s publisher would ship the intermediary 4,000 copies and the numbers identifying 4,000 (anonymous) subscribers; the intermediary would put on the address labels and mail them out. The information would never leave the hands of the intermediary, a firm in the business of protecting privacy. To check its honesty, I establish an identity with my own address and the name “David Freidmann,” subscribe to a magazine using that identity, and see if David Freidmann gets any junk mail.
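As a rough illustration of how little machinery the scheme requires, here is a sketch in Python. The `Intermediary` class, the `unexpected_senders` helper, and the magazine and catalog names are all hypothetical; nothing here describes an existing service.

```python
class Intermediary:
    """Holds names and addresses; hands publishers nothing but numbers."""

    def __init__(self):
        self._addresses = {}   # subscriber number -> (name, address)
        self._next_number = 1

    def enroll(self, name, address):
        """Register a customer and return the number a publisher will see."""
        number = self._next_number
        self._next_number += 1
        self._addresses[number] = (name, address)
        return number

    def label_copies(self, subscriber_numbers):
        """Turn a publisher's list of numbers into mailing labels.

        This happens inside the intermediary, so the publisher never
        learns who is behind any number.
        """
        return [self._addresses[n] for n in subscriber_numbers]


def unexpected_senders(mail_received, expected_senders):
    """Senders who reached the test identity without being told about it."""
    return sorted(set(mail_received) - set(expected_senders))


firm = Intermediary()
real_me = firm.enroll("My real name", "My address")
canary = firm.enroll("David Freidmann", "My address")

# The publisher knows only the number; the firm produces the label.
print(firm.label_copies([canary]))

# If anything but the magazine arrives for the misspelled name, someone
# who held that name and address has passed it on.
print(unexpected_senders(["Gadget Monthly", "Acme Catalog"], ["Gadget Monthly"]))
```

The misspelled name works as a tracer: it exists in only one place, so any mail addressed to it identifies the source of the leak.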

Such institutions would be possible and, if widely used, not terribly expensive. My guess is that it will not happen. The reason is that most people either do not want to keep the relevant information secret (I don’t, for example; I like gadget catalogs) or do not want to enough to go to any significant trouble. But it is still worth thinking about how they could get privacy if they wanted to, and those thoughts may become of more practical relevance if technological progress sharply reduces the cost.

These discussions suggest two different ways in which the technologies that help to create the problem could be used to solve it. Both are ways of making it possible for an individual to treat information about himself as his property. One is to use computer technologies, including encryption, to give me or my trusted agents direct control over the information, permitting others to use it only with my permission – for instance, to send me information about goods they think I might want to buy – without ever getting possession of it.

The other is to treat information as we now treat real estate – to permit individuals to put restrictions on the use of property they own that are binding on subsequent purchasers. If, for example, I sell you an easement permitting you to cross my land in order to reach yours and I later sell the land, the easement is good against the buyer. Even if he did not know it existed, he now has no right to refuse to let you through.

That is not true for most other forms of property.² If I sell you a car with the restriction that you agree not to permit it to be driven on Sunday, I may be able to enforce the restriction against you, and I may be able to sue you for damages if, contrary to our contract, you sell it to someone else without requiring him to abide by the agreement. But I have no way of enforcing the restriction on him.

One plausible explanation of the difference is that land ownership involves an elaborate system for recording title, including modifications such as easements, making it possible for the prospective purchaser to determine in advance what obligations run with the land. We have no such system for recording ownership, still less for recording complicated forms of ownership, for most other sorts of property.

At first glance, personal information seems even less suitable for the more elaborate form of property rights than pens, chairs, or computers. In most likely uses, the purchaser is buying information about a very large number of people. If my particular bit of information is only worth three cents to him, a legal regime that requires him to spend a dollar checking the restrictions on it before he uses it means that the information will never be used.

A possible solution is to take advantage of the same data-processing technologies that make it possible to aggregate and use information on that scale to maintain the record of complicated property rights in it. One could imagine a legal regime where every piece of personal information had to be accompanied by a unique identification number. Using that number, a computer could access information about the restrictions on use of that information in machine-readable form at negligible cost. Again, it does not seem likely in the near future, but might become a real possibility farther down the road.
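A minimal sketch in Python of what such a lookup might involve follows. The registry contents, the identification numbers, and the purpose names are all invented; the three cents and the dollar are the figures used above, and the cost of an automated check is assumed, for illustration, to round to zero cents.

```python
# Hypothetical registry: identification number -> uses the owner permits.
registry = {
    1001: {"advertising": True, "resale": False},
    1002: {"advertising": False, "resale": False},
}

def may_use(record_id, purpose, registry):
    """True only if the registry explicitly permits this use of the record.

    An unknown number or an unlisted purpose is treated as forbidden.
    """
    return registry.get(record_id, {}).get(purpose, False)

def worth_checking(value_cents, check_cost_cents):
    """A buyer bothers with the information only if it is worth more
    than the cost of finding out what he is allowed to do with it."""
    return value_cents > check_cost_cents

print(worth_checking(3, 100))   # a dollar of checking by hand per record
print(worth_checking(3, 0))     # a machine lookup, assumed nearly free
print(may_use(1001, "advertising", registry), may_use(1001, "resale", registry))
```

The restrictions themselves are no simpler than before; what changes is that checking them costs less than the information is worth.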

1 World population two thousand years ago is estimated at about 170 million; see Colin McEvedy and Richard Jones, Atlas of World Population History.

2 One exception is that claims against property that was used as security for a loan may run with the property – as they do in the case of automobiles, where such claims are normally recorded on the title document.