Lawrence Lessig

Lessig, Lawrence. Code: Version 2.0. New York: Basic Books, 2006, pp. 43ff.

Identity and Authentication: Cyberspace
Identity and authentication in cyberspace and real space are in theory the
same. In practice they are quite different. To see that difference, however, we
need to see more about the technical detail of how the Net is built.
As I’ve already said, the Internet is built from a suite of protocols referred
to collectively as “TCP/IP.” At its core, the TCP/IP suite includes protocols for
exchanging packets of data between two machines “on” the Net.2 Brutally sim-
plified, the system takes a bunch of data (a file, for example), chops it up into
packets, and slaps on the address to which the packet is to be sent and the
address from which it is sent. The addresses are called Internet Protocol
addresses, and they look like this: 128.34.35.204. Once properly addressed, the
packets are then sent across the Internet to their intended destination.
Machines along the way (“routers”) look at the address to which the packet is
sent, and depending upon an (increasingly complicated) algorithm, the
machines decide to which machine the packet should be sent next. A packet
could make many “hops” between its start and its end. But as the network
becomes faster and more robust, those many hops seem almost instantaneous.
In the terms I’ve described, there are many attributes that might be asso-
ciated with any packet of data sent across the network. For example, the
packet might come from an e-mail written by Al Gore. That means the e-mail
is written by a former vice president of the United States, by a man knowl-
edgeable about global warming, by a man over the age of 50, by a tall man, by
an American citizen, by a former member of the United States Senate, and so
on. Imagine also that the e-mail was written while Al Gore was in Germany,
and that it is about negotiations for climate control. The identity of that
packet of information might be said to include all these attributes.
But the e-mail itself authenticates none of these facts. The e-mail may say
it’s from Al Gore, but the TCP/IP protocol alone gives us no way to be sure. It
may have been written while Gore was in Germany, but he could have sent it
through a server in Washington. And of course, while the system eventually
will figure out that the packet is part of an e-mail, the information traveling
across TCP/IP itself does not contain anything that would indicate what the
content was. The protocol thus doesn’t authenticate who sent the packet,
where they sent it from, and what the packet is. All it purports to assert is an
IP address to which the packet is to be sent, and an IP address from which the
packet comes. From the perspective of the network, this other information is
unnecessary surplus. Like a daydreaming postal worker, the network simply
moves the data and leaves its interpretation to the applications at either end.
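The addressing scheme described above can be made concrete with a minimal sketch. It is an illustration only, not how a real IP stack is written: the `Packet` class, the `packetize` and `reassemble` helpers, the 8-byte chunk size, and the destination address 192.0.2.10 are all assumptions introduced here, and the sketch relies on list order where real protocols carry sequence information.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src: str        # address the packet claims to come from
    dst: str        # address the packet is to be sent to
    payload: bytes  # opaque bytes; the network never interprets them


def packetize(data: bytes, src: str, dst: str, size: int = 8) -> list[Packet]:
    """Chop data into fixed-size chunks and slap both addresses on each."""
    return [Packet(src, dst, data[i:i + size]) for i in range(0, len(data), size)]


def reassemble(packets: list[Packet]) -> bytes:
    """The receiving application, not the network, puts the data back together."""
    return b"".join(p.payload for p in packets)


# The "From:" line is just payload. Nothing in a packet vouches for it.
message = b"From: Al Gore\n\nHello from Germany."
packets = packetize(message, src="128.34.35.204", dst="192.0.2.10")
```

Each packet carries two addresses and some bytes. Who wrote the message, where its author was sitting, and what the bytes mean appear nowhere in the fields the network looks at.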
This minimalism in the Internet’s design was not an accident. It reflects a
decision about how best to design a network to perform a wide range of
very different functions. Rather than build into this network a complex set of
functionality thought to be needed by every single application, this network
philosophy pushes complexity to the edge of the network—to the applications
that run on the network, rather than the network’s core. The core is kept as
simple as possible. Thus if authentication about who is using the network is
necessary, that functionality should be performed by an application con-
nected to the network, not by the network itself. Or if content needs to be
encrypted, that functionality should be performed by an application con-
nected to the network, not by the network itself.
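As a rough sketch of what "performed by an application" can mean, the example below lets the two endpoints authenticate a message with a keyed hash while the network function in the middle simply passes bytes along. The function names and the shared key are hypothetical, and a shared-secret HMAC is only one of many ways an application might do this.

```python
import hashlib
import hmac

# Assumption: both endpoints obtained this key out of band.
SHARED_KEY = b"hypothetical-key-known-only-to-both-endpoints"


def seal(message: bytes, key: bytes = SHARED_KEY) -> bytes:
    """Sending application: prepend a 32-byte authentication tag."""
    tag = hmac.new(key, message, hashlib.sha256).digest()
    return tag + message


def network_forward(blob: bytes) -> bytes:
    """The core of the network: moves bytes and inspects nothing."""
    return blob


def open_sealed(blob: bytes, key: bytes = SHARED_KEY) -> bytes:
    """Receiving application: verify the tag before trusting the message."""
    tag, message = blob[:32], blob[32:]
    expected = hmac.new(key, message, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("message failed authentication")
    return message
```

All of the authentication logic lives in `seal` and `open_sealed`. `network_forward` would be unchanged if the applications switched to a different scheme, which is the point of keeping the core simple.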
This design principle was named by network architects Jerome Saltzer,
David Clark, and David Reed as the end-to-end principle.3 It has been a core
principle of the Internet’s architecture, and, in my view, one of the most
important reasons that the Internet produced the innovation and growth that
it has enjoyed. But its consequences for purposes of identification and authen-
tication make both extremely difficult with the basic protocols of the Internet
alone. It is as if you were in a carnival funhouse with the lights dimmed to
darkness and voices coming from around you, but from people you do not
know and from places you cannot identify. The system knows that there are
entities out there interacting with it, but it knows nothing about who those
entities are. While in real space—and here is the important point—anonymity
has to be created, in cyberspace anonymity is the given.
Identity and Authentication: Regulability
This difference in the architectures of real space and cyberspace makes a big
difference in the regulability of behavior in each. The absence of relatively self-
authenticating facts in cyberspace makes it extremely difficult to regulate
behavior there. If we could all walk around as “The Invisible Man” in real
space, the same would be true about real space as well. That we’re not capable
of becoming invisible in real space (or at least not easily) is an important rea-
son that regulation can work.
Thus, for example, if a state wants to control children’s access to “inde-
cent” speech on the Internet, the original Internet architecture provides little
help. The state can say to websites, “don’t let kids see porn.” But the website
operators can’t know—from the data provided by the TCP/IP protocols at
least—whether the entity accessing its web page is a kid or an adult. That’s dif-
ferent, again, from real space. If a kid walks into a porn shop wearing a mus-
tache and stilts, his effort to conceal is likely to fail. The attribute “being a kid”
is asserted in real space, even if efforts to conceal it are possible. But in cyber-
space, there’s no need to conceal, because the facts you might want to conceal
about your identity (i.e., that you’re a kid) are not asserted anyway.
All this is true, at least, under the basic Internet architecture. But as the
last ten years have made clear, none of this is true by necessity. To the extent
that the lack of efficient technologies for authenticating facts about individ-
uals makes it harder to regulate behavior, there are architectures that could be
layered onto the TCP/IP protocol to create efficient authentication. We’re far
enough into the history of the Internet to see what these technologies could
look like. We’re far enough into this history to see that the trend toward this
authentication is unstoppable. The only question is whether we will build
into this system of authentication the kinds of protections for privacy and
autonomy that are needed.
Architectures of Identification
Most who use the Internet have no real sense about whether their behavior is
monitored, or traceable. Instead, the experience of the Net suggests
anonymity. Wikipedia doesn’t say “Welcome Back, Larry” when I surf to its
site to look up an entry, and neither does Google. Most, I expect, take this lack
of acknowledgement to mean that no one is noticing.
But appearances are quite deceiving. In fact, as the Internet has matured,
the technologies for linking behavior with an identity have increased dra-
matically. You can still take steps to assure anonymity on the Net, and many
depend upon that ability to do good (human rights workers in Burma) or evil
(coordinating terrorist plots). But to achieve that anonymity takes effort. For
most of us, our use of the Internet has been made at least traceable in ways
most of us would never even consider possible.
Consider first the traceability resulting from the basic protocols of the
Internet—TCP/IP. Whenever you make a request to view a page on the Web,
the web server needs to know where to send the packets of data that will
appear as a web page in your browser. Your computer thus tells the web server
where you are—in IP space at least—by revealing an IP address.
As I’ve already described, the IP address itself doesn’t reveal anything
about who you are, or where in physical space you come from. But it does
enable a certain kind of trace. If (1) you have gotten access to the web through
an Internet Service Provider (ISP) that assigns you an IP address while you’re
on the Internet and (2) that ISP keeps the logs of that assignment, then it’s
perfectly possible to trace your surfing back to you.
How?
Well, imagine you’re angry at your boss. You think she’s a blowhard who
is driving the company into bankruptcy. After months of frustration, you
decide to go public. Not “public” as in a press conference, but public as in a
posting to an online forum within which your company is being discussed.
You know you’d get in lots of trouble if your criticism were tied back to
you. So you take steps to be “anonymous” on the forum. Maybe you create an
account in the forum under a fictitious name, and that fictitious name makes
you feel safe. Your boss may see the nasty post, but even if she succeeds in get-
ting the forum host to reveal what you said when you signed up, all that stuff
was bogus. Your secret, you believe, is safe.
Wrong. In addition to the identification that your username might, or
might not, provide, if the forum is on the web, then it knows the IP address
from which you made your post. With that IP address, and the time you made
your post, using “a reverse DNS look-up,”4 it is simple to identify the Internet
Service Provider that gave you access to the Internet. And increasingly, it is rel-
atively simple for the Internet Service Provider to check its records to reveal
which account was using that IP address at that specified time. Thus, the ISP
could (if required) say that it was your account that was using the IP address
that posted the nasty message about your boss. Try as you will to deny it
(“Hey, on the Internet, no one knows you’re a dog!”), I’d advise you to give up
quickly. They’ve got you. You’ve been trapped by the Net. Dog or no, you’re
definitely in the doghouse.
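The last step of that trace, matching an IP address and a time against the ISP's assignment records, can be sketched as a simple lookup. The `Lease` record, the account names, and the sample addresses are invented for illustration. Real ISP logs vary in format and retention.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Lease:
    ip: str           # address the ISP handed out
    account: str      # customer account it was assigned to
    start: datetime   # when the assignment began
    end: datetime     # when the address was released


def account_for(logs: list[Lease], ip: str, when: datetime) -> Optional[str]:
    """Return the account that held `ip` at time `when`, if the log still has it."""
    for lease in logs:
        if lease.ip == ip and lease.start <= when < lease.end:
            return lease.account
    return None


# Hypothetical log: the same address served two customers on one day.
LOGS = [
    Lease("203.0.113.7", "account-1001", datetime(2006, 1, 5, 8, 0), datetime(2006, 1, 5, 12, 0)),
    Lease("203.0.113.7", "account-2002", datetime(2006, 1, 5, 12, 0), datetime(2006, 1, 5, 18, 0)),
]
```

The forum supplies the address and the timestamp, and the ISP supplies the log. Neither alone identifies the poster, but together they do, and only for as long as the log is kept.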
Now again, what made this tracing possible? No plan by the NSA. No
strategy of Microsoft. Instead, what made this tracing possible was a by-prod-
uct of the architecture of the Web and the architecture of ISPs charging access
to the Web. The Web must know an IP address; ISPs require identification
before they assign an IP address to a customer. So long as the log records of
the ISP are kept, the transaction is traceable. Bottom line: If you want
anonymity, use a pay phone!
This traceability in the Internet raised some important concerns at the
beginning of 2006. Google announced it would fight a demand by the govern-
ment to produce one million sample searches. (MSN and Yahoo! had both
complied with the same request.) That request was made as part of an inves-
tigation the government was conducting to support its defense of a statute
designed to block kids from porn. And though the request promised the data
would be used for no other purpose, it raised deep concerns in the Internet
community. Depending upon the data that Google kept, the request showed
in principle that it was possible to trace legally troubling searches back to
individual IP addresses (and to individuals with Google accounts). Thus, for
example, if your Internet address at work is a fixed-IP address, then every
search you’ve ever made from work is at least possibly kept by Google. Does
that make you concerned? And assume for the moment you are not a
terrorist: Would you still be concerned?
A link back to an IP address, however, only facilitates tracing, and again,
even then not perfect traceability. ISPs don’t keep data for long (ordinarily);
some don’t even keep assignment records at all. And if you’ve accessed the
Internet at an Internet café, then there’s no reason to believe anything could
be traced back to you. So still, the Internet provides at least some anonymity.
But IP tracing isn’t the only technology of identification that has been lay-
ered onto the Internet. A much more pervasive technology was developed
early in the history of the Web to make the web more valuable to commerce
and its customers. This is the technology referred to as “cookies.”
When the World Wide Web was first deployed, the protocol simply
enabled people to view content that had been marked up in a special pro-
gramming language. This language (HTML) made it easy to link to other
pages, and it made it simple to apply basic formatting to the content (bold, or
italics, for example).
But the one thing the protocol didn’t enable was a simple way for a website
to know which machines had accessed it. The protocol was “state-less.” When
a web server received a request to serve a web page, it didn’t know anything
about the state of the requester before that request was made.5
From the perspective of privacy, this sounds like a great feature for the
Web. Why should a website know anything about me if I go to that site to view
certain content? You don’t have to be a criminal to appreciate the value in
anonymous browsing. Imagine libraries kept records of every time you
opened a book at the library, even for just a second.
Yet from the perspective of commerce, this “feature” of the original Web
is plainly a bug, and not because commercial sites necessarily want to know
everything there is to know about you. Instead, the problem is much more
pragmatic. Say you go to Amazon.com and indicate you want to buy 20 copies
of my latest book. (Try it. It’s fun.) Now your “shopping cart” has 20 copies of
my book. You then click on the icon to check out, and you notice your shop-
ping cart is empty. Why? Well because, as originally architected, the Web had
no easy way to recognize that you were the same entity that just ordered 20
books. Or put differently, the web server would simply forget you. The Web as
originally built had no way to remember you from one page to another. And
thus, the Web as originally built would not be of much use to commerce.
But as I’ve said again and again, the way the Web was is not the way the
Web had to be. And so those who were building the infrastructure of the Web
quickly began to think through how the web could be “improved” to make it
easy for commerce to happen. “Cookies” were the solution. In 1994, Netscape
introduced a protocol to make it possible for a web server to deposit a small
bit of data on your computer when you accessed that server. That small bit of
data—the “cookie”—made it possible for the server to recognize you when
you traveled to a different page. Of course, there are lots of other concerns
about what that cookie might enable. We’ll get to those in the chapter about
privacy. The point that’s important here, however, is not the dangers this tech-
nology creates. The point is the potential and how that potential was built. A
small change in the protocol for client-server interaction now makes it possi-
ble for websites to monitor and track those who use the site.
This is a small step toward authenticated identity. It’s far from that, but it
is a step toward it. Your computer isn’t you (yet). But cookies make it possible
for the computer to authenticate that it is the same machine that was access-
ing a website a moment before. And it is upon this technology that the whole
of web commerce initially was built. Servers could now “know” that this
machine is the same machine that was here before. And from that knowledge,
they could build a great deal of value.
Now again, strictly speaking, cookies are nothing more than a tracing
technology. They make it simple to trace a machine across web pages. That
tracing doesn’t necessarily reveal any information about the user. Just as we
could follow a trail of cookie crumbs in real space to an empty room, a web
server could follow a trail of “mouse droppings” from the first entry on the
site until the user leaves. In both cases, nothing is necessarily revealed about
the user.
But sometimes something important is revealed about the user by associ-
ation with data stored elsewhere. For example, imagine you enter a site, and it
asks you to reveal your name, your telephone number, and your e-mail address
as a condition of entering a contest. You trust the website, and do that, and
then you leave the website. The next day, you come back, and you browse
through a number of pages on that website. In this interaction, of course,
you’ve revealed nothing. But if a cookie was deposited on your machine
through your browser (and you have not taken steps to remove it), then when
you return to the site, the website again “knows” all these facts about you. The
cookie traces your machine, and this trace links back to a place where you
provided information the machine would not otherwise know.
The traceability of IP addresses and cookies is the default on the Internet
now. Again, steps can be taken to avoid this traceability, but the vast majority
of us don’t take them. Fortunately, for society and for most of us, what we do
on the Net doesn’t really concern anyone. But if it did concern someone, it
wouldn’t be hard to track us down. We are a people who leave our “mouse
droppings” everywhere.
This default traceability, however, is not enough for some. They require
something more. That was Harvard’s view, as I noted in the previous chapter.
That is also the view of just about all private networks today. A variety of
technologies have developed that enable stronger authentication by those
who use the Net. I will describe two of these technologies in this section. But
it is the second of these two that will, in my view, prove to be the most impor-
tant.
The first of these technologies is the Single Sign-on (SSO) technology.
This technology allows someone to “sign-on” to a network once, and then get
access to a wide range of resources on that network without needing to
authenticate again. Think of it as a badge you wear at your place of work.
Depending upon what the badge says (“visitor” or “researcher”) you get dif-
ferent access to different parts of the building. And like a badge at a place of
work, you get the credential by giving up other data. You give the receptionist
an ID; he gives you a badge; you wear that badge wherever you go while at the
business.
The most commonly deployed SSO is a system called Kerberos. But
there are many different SSOs out there—Microsoft’s Passport system is an example—and there is a strong push to build federated SSOs for linking many
different sites on the Internet. Thus, for example, in a federated system, I
might authenticate myself to my university, but then I could move across any
domain within the federation without authenticating again. The big advan-
tage in this architecture is that I can authenticate to the institution I trust
without spreading lots of data about myself to institutions I don’t trust.
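The badge analogy can be sketched as a signed token: one party checks who you are and issues the badge, and every other door only checks that the badge is genuine. This is a deliberately simplified illustration and is not how Kerberos or Passport work internally. The federation key, the token layout, and both function names are assumptions.

```python
import hashlib
import hmac

# Assumption: every member of the federation trusts tokens signed with this key.
FEDERATION_KEY = b"hypothetical-key-shared-within-the-federation"


def sign_on(user: str, role: str, password_ok: bool) -> str:
    """The trusted institution authenticates the user once and issues a badge."""
    if not password_ok:
        raise PermissionError("authentication failed")
    body = f"{user}|{role}".encode()
    sig = hmac.new(FEDERATION_KEY, body, hashlib.sha256).hexdigest()
    return f"{user}|{role}|{sig}"


def check_badge(token: str):
    """Any other site: accept the badge if the signature is valid, else None."""
    try:
        user, role, sig = token.rsplit("|", 2)
    except ValueError:
        return None
    body = f"{user}|{role}".encode()
    expected = hmac.new(FEDERATION_KEY, body, hashlib.sha256).hexdigest()
    if hmac.compare_digest(sig.encode(), expected.encode()):
        return user, role
    return None
```

The password is presented only to `sign_on`. Sites that call `check_badge` learn the user name and role on the badge but never see the credential that was used to obtain it.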
SSOs have been very important in building identity into the Internet. But
a second technology, I believe, will become the most important tool for iden-
tification in the next ten years. This is because this alternative respects impor-
tant architectural features of the Internet, and because the demand for better
technologies of identification will continue to be strong. Forget the hassle of
typing your name and address at every site you want to buy something from.
You only need to think about the extraordinary growth in identity theft to rec-
ognize there are many who would be eager to see something better come
along.
To understand this second system, think first about how credentials work
in real space.6 You’ve got a wallet. In it is likely to be a driver’s license, some
credit cards, a health insurance card, an ID for where you work, and, if you’re
lucky, some money. Each of these cards can be used to authenticate some fact
about you—again, with very different levels of confidence. The driver’s license
has a picture and a list of physical characteristics. That’s enough for a wine
store, but not enough for the NSA. The credit card has your signature. Ven-
dors are supposed to use that data to authenticate that the person who signs
the bill is the owner of the card. If the vendor becomes suspicious, she might
demand that you show an ID as well.
Notice the critical features of this “wallet” architecture. First, these creden-
tials are issued by different entities. Second, depending upon their technology,
they offer different levels of confidence. Third, I’m free to use these credentials
in ways never originally planned or intended by the issuer of the credential.
The Department of Motor Vehicles never coordinated with Visa to enable
driver’s licenses to be used to authenticate the holder of a credit card. But
once the one was prevalent, the other could use it. And fourth, nothing
requires that I show all my cards when I can use just one. That is, to show my
driver’s license, I don’t also reveal my health insurance card. Or to use my Visa,
I don’t also have to reveal my American Express card.
These same features are at the core of what may prove to be the most
important addition to the effective architecture of the Internet since its birth.
This is a project being led by Microsoft to essentially develop an Identity
Metasystem—a new layer of the Internet, an Identity Layer, that would com-
plement the existing network layers to add a new kind of functionality. This Identity Layer is not Microsoft Passport, or some other Single Sign-On tech-
nology. Instead it is a protocol to enable a kind of virtual wallet of credentials,
with all the same attributes of the credentials in your wallet—except better.
This virtual wallet will not only be more reliable than the wallet in your
pocket, it will also give you the ability to control more precisely what data
about you is revealed to those who demand data about you.
For example, in real space, your wallet can easily be stolen. If it’s stolen,
then there’s a period of time when it’s relatively easy for the thief to use the
cards to buy stuff. In cyberspace, these wallets are not easily stolen. Indeed, if
they’re architected well, it would be practically impossible to “steal” them.
Remove the cards from their holder, and they become useless digital objects.
Or again, in real space, if you want to authenticate that you’re over 21 and
therefore can buy a six-pack of beer, you show the clerk your driver’s license.
With that, he authenticates your age. But with that bit of data, he also gets
access to your name, your address, and in some states, your social security
number. Those other bits of data are not necessary for him to know. In some
contexts, depending on how creepy he is, these data are exactly the sort you
don’t want him to know. But the inefficiencies of real-space technologies
reveal these data. This loss of privacy is a cost of doing business.
The virtual wallet would be different. If you need to authenticate your age,
the technology could authenticate that fact alone—indeed, it could authenti-
cate simply that you’re over 21, or over 65, or under 18, without revealing any-
thing more. Or if you need to authenticate your citizenship, that fact can be
certified without revealing your name, or where you live, or your passport
number. The technology is crafted to reveal just what you want it to reveal,
without also revealing other stuff. (As one of the key architects for this meta-
system, Kim Cameron, described it: “To me, that’s the center of the system.”7)
And, most importantly, using the power of cryptography, the protocol makes
it possible for the other side to be confident about the fact you reveal without
requiring any more data.
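One way to picture "reveal just what you want it to reveal" is an issuer that certifies each fact separately, so that a single fact can be shown without the others. The sketch below is an assumption-laden illustration, not the Identity Metasystem's protocol: it uses a shared-key HMAC from the standard library as a stand-in for the public-key signatures a real system would use, and it omits any binding of the credential to its holder.

```python
import hashlib
import hmac
import json

# Stand-in for the issuer's signing key (a real system would use a private key).
ISSUER_KEY = b"hypothetical-issuer-key"


def _sign(name: str, value) -> str:
    body = json.dumps({name: value}, sort_keys=True).encode()
    return hmac.new(ISSUER_KEY, body, hashlib.sha256).hexdigest()


def issue(claims: dict) -> dict:
    """Issuer certifies every claim on its own, so each can be shown alone."""
    return {name: (value, _sign(name, value)) for name, value in claims.items()}


def present(wallet: dict, name: str) -> dict:
    """Holder picks exactly one card out of the wallet."""
    value, sig = wallet[name]
    return {"claim": name, "value": value, "sig": sig}


def verify(presentation: dict) -> bool:
    """Relying party checks the issuer's mark on the one claim it was shown."""
    expected = _sign(presentation["claim"], presentation["value"])
    return hmac.compare_digest(presentation["sig"].encode(), expected.encode())


wallet = issue({"over_21": True, "name": "Jane Doe", "address": "1 Example St"})
shown = present(wallet, "over_21")
```

The clerk who receives `shown` can confirm the over-21 claim, yet the name and address never leave the wallet. Altering the claim's value makes `verify` return `False`.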
The brilliance in this solution to the problems of identification is first
that it mirrors the basic architecture of the Internet. There’s no central repos-
itory for data; there’s no network technology that everyone must adopt. There
is instead a platform for building identity technologies that encourages com-
petition among different privacy and security providers—TCP/IP for identity.
Microsoft may be leading the project, but anyone can build for this protocol.
Nothing ties the protocol to the Windows operating system. Or to any other
specific vendor. As Cameron wisely puts it, “it can’t be owned by any one
company or any one country . . . or just have the technology stamp of any one
engineer.”
