How much do you remember from elementary school? I remember vinyl tile floors,
the playground, the teacher sentencing me to standing in the hallway. I had a
teacher who was a chess fanatic; he painted a huge chess board in the paved
schoolyard and got someone to fabricate big wooden chess pieces. It was enough
of an event to get us on the evening news. I remember Run for the Arts, where I
tried to talk people into donating money on the theory that I could run, which
I could not. I'm about six months into trying to change that and I'm good for a
mediocre 5k now, but I don't think that's going to shift the balance on K-12
art funding.
I also remember a domain name: bridger.pps.k12.or.us
I have quipped before that
computer science is a field mostly concerned with assigning numbers to things,
which is true, but it only takes us so far. Computer scientists also like to
organize those numbers into structures, and one of their favorites has always
been the tree. The development of wide-area computer networking surfaced a
whole set of problems around naming or addressing computer systems that belong
to organizations. A wide-area network consists of a set of institutions that
manage their own affairs. Each of those institutions may be made up of
departments that manage their own affairs. A tree seemed a natural fit. Even
the "low level" IP addresses, in the days of "classful" addressing, were
a straightforward hierarchy: each dot separated a different level of the tree,
a different step in an organizational hierarchy.
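To make the classful hierarchy concrete, here is a minimal Python sketch of how the first octet of an address decided where the "network" part ended and the "host" part began. This is my own illustration, not anything from a specification: the function name `classful_split` is made up, and the only facts it relies on are the standard class A/B/C ranges for the first octet.

```python
def classful_split(address):
    """Split a dotted-quad IPv4 address into (class, network octets, host octets).

    Hypothetical helper for illustration: under classful addressing the
    first octet alone decided how many leading octets named the network.
    """
    octets = [int(part) for part in address.split(".")]
    if len(octets) != 4 or any(not 0 <= o <= 255 for o in octets):
        raise ValueError(f"not a dotted-quad IPv4 address: {address!r}")

    first = octets[0]
    if first < 128:        # class A: one network octet, three host octets
        cls, network_len = "A", 1
    elif first < 192:      # class B: two network octets, two host octets
        cls, network_len = "B", 2
    elif first < 224:      # class C: three network octets, one host octet
        cls, network_len = "C", 3
    else:                  # classes D and E have no network/host split
        raise ValueError("class D/E addresses have no network/host split")

    return cls, octets[:network_len], octets[network_len:]


print(classful_split("172.16.5.9"))  # ('B', [172, 16], [5, 9])
```

The point is only that the boundary between "whose network" and "which machine" always fell on a dot, which is what made the address read like a path through an organization.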
The first large computer networks, including those that would
become the Internet, initially relied on manually building lists of machines by
name. By the time the Domain Name System was developed, this had already become
cumbersome. The rapid growth of the internet was hard to keep up with, and besides,
why did any one central entity---Jon Postel or whoever---even care about the
names of all of the computers at Georgia Tech? Like IP addressing, DNS was designed
as a hierarchy with delegated control. A registrant obtains a name in the hierarchy,
say gatech.edu, and everything "under" that name is within the control, and
responsibility, of the registrant. This arrangement is convenient for both the
DNS administrator, which was a single organization even after the days of Postel,
and for registrants.
We still use the same approach today... mostly. The meanings of levels of the
hierarchy have ossified. Technically speaking, the top of the DNS tree, the DNS
root, is a null label referenced by a trailing dot. It's analogous to the '/'
at the beginning of POSIX file paths. "gatech.edu" really should be written as
"gatech.edu." to make it absolute rather than relative, but since resolution of
relative names almost always recurses to the top of the tree, the trailing dot
is "optional" enough that it is now almost always omitted. The analogy to POSIX
file paths raises an interesting point: domain names are backwards. The 'root'
is at the end, rather than at the beginning, or in other words, they run from
least significant to most significant, rather than most significant to least
significant. That's just... one of those things, you know? In the early days
one wasn't obviously better than the other, people wrote hierarchies out both
ways, and as the dust settled the left-to-right convention mostly prevailed
but right-to-left hung around in some protocols. If you've ever dealt with
endianness, this is just one of those things about computers that you have to
accept: we cannot agree on which way around to write things.
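If the file path analogy helps, here is a small Python sketch that flips a domain name into root-first order. The helper names (`is_absolute`, `labels_root_first`, `as_path`) are invented for this example, and the path rendering is purely an illustration of the analogy, not something any resolver does.

```python
def is_absolute(name):
    """A trailing dot marks a name as absolute (fully qualified)."""
    return name.endswith(".")


def labels_root_first(name):
    """Return the labels of a domain name ordered from the root downward."""
    # Drop the trailing dot if present; the root label itself is empty.
    trimmed = name[:-1] if name.endswith(".") else name
    labels = trimmed.split(".") if trimmed else []
    return list(reversed(labels))


def as_path(name):
    """Render a domain name the way a POSIX path would be written."""
    return "/" + "/".join(labels_root_first(name))


print(as_path("gatech.edu."))            # /edu/gatech
print(as_path("bridger.pps.k12.or.us"))  # /us/or/k12/pps/bridger
```

Written this way the name reads most significant to least significant, the same direction as a path, with the root up front where POSIX puts its '/'.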
Anyway, the analogy to file paths also illustrates the way that DNS has ossified.
The highest "real" or non-root component of a domain name is called the top-level
domain or TLD, while the component below it is called a second-level domain. In
the US, it was long the case that top-level domains were fixed while second-level
domains were available for registration. There have always been exceptions in
other countries and our modern proliferation of TLDs has changed this somewhat,
but it's still pretty much true. When you look at "gatech.edu" you know that
"edu" is just a fixed name in the hierarchy, used to organize domain names by
organization type, while "gatech" is a name that belongs to a registrant.
Under the second-level name, things get a little vague. We are all familiar with
the third-level name "www," which emerged as a convention for web servers and
became a practical requirement. Web servers having the name "www" under an
organization's domain was such a norm for so many years that hosting a webpage
directly at a second-level name came to be called a "naked domain" and had some
caveats and complications.
Other than www, though, there are few to no standards for the use of third-level
and below names. Larger organizations are more likely to use third-level names
for departments, infrastructure operators often have complex hierarchies of
names for their equipment, and enterprises the world 'round name their load-balanced
webservers "www2," "www3" and up. If you think about it, this situation seems like
kind of a failure of the original concept of DNS... we do use the hierarchy, but
for the most part it is not intended for human consumption. Users are only
expected to remember two names, one of which is a TLD that comes from a relatively
constrained set.
The issue is more interesting when we consider geography. For a very long time, TLDs
have been split into two categories: generic TLDs, or gTLDs, and country-code TLDs,
or ccTLDs. ccTLDs reflect the ISO country codes of each country, and are intended for
use by those countries, while gTLDs are arbitrary and reflect the fact that DNS was
designed in the US. The ".gov" gTLD, for example, is for use by the US government,
while the UK is stuck with ".gov.uk". This does seem unfair but it's now very much
cemented into the system: for the most part, US entities use gTLDs, while entities
in other countries use names under their respective ccTLDs. The ".us" ccTLD exists
just as much as all the others, but is obscure enough that my choice to put my
personal website under .us (not an ideological decision but simply a result of where
a nice form of my name was available) sometimes gets my email address rejected.
Also, a common typo for ".us" is ".su" and that's geopolitically amusing. .su is of
course the ccTLD for the Soviet Union, which no longer exists, but the ccTLD lives
on in a limited way because it became Structurally Important and difficult to remove, as names and addresses
tend to do.
We can easily imagine a world where this historical injustice had been fixed: as
the internet became more global, all of our US institutions could have moved under
the .us ccTLD. In fact, why not go further? Geographers have long organized political
boundaries into a hierarchy. The US is made up of states, each of which has been
assigned a two-letter code by the federal government. We have ".us", why not "nm.us"?
The answer, of course, is that we do.
In the modern DNS, all TLDs have been delegated to an organization that administers
them. The .us TLD is rightfully administered by the National Telecommunications and
Information Administration, on the same basis by which all ccTLDs are delegated to
their respective national governments. Being the US government, NTIA has naturally
privatized the function through a contract to telecom-industrial-complex giant
Neustar. Being a US company, Neustar restructured and sold its DNS-related business
to GoDaddy. Being a US company, GoDaddy rose to prominence on the back of infamously
tasteless television commercials, and its subsidiary Registry Services LLC now
operates our nation's corner of the DNS.
But that's the present---around here, we avoid discussing the present so as to hold
crushing depression at bay. Let's turn our minds to June 1993, and the publication of
RFC 1480 "The US Domain." To wit:
"Even though the original intention was that any educational
institution anywhere in the world could be registered under the EDU
domain, in practice, it has turned out with few exceptions, only
those in the United States have registered under EDU, similarly with
COM (for commercial). In other countries, everything is registered
under the 2-letter country code, often with some subdivision. For
example, in Korea (KR) the second level names are AC for academic
community, CO for commercial, GO for government, and RE for research.
However, each country may go its own way about organizing its domain,
and many have."
Oh, so let's sort it out!
"There are no current plans of putting all of the organizational
domains EDU, GOV, COM, etc., under US. These name tokens are not
used in the US Domain to avoid confusion."
Oh. Oh well.
"Currently, only four year colleges and universities are being
registered in the EDU domain. All other schools are being registered
in the US Domain."
Huh?
RFC 1480 is a very interesting read. It makes passing references to so many
facets of DNS history that could easily be their own articles. It also
defines a strict, geography-based hierarchy for the .us domain that is a
completely different universe from the one in which we now live. For example,
we learned above that, in 1993, only four-year institutions were being
placed under .edu. What about the community colleges? Well, RFC 1480 has an
answer. Central New Mexico Community College would, of course, fall under
cnm.cc.nm.us. Well, actually, in 1993 it was called the Technical-Vocational
Institute, so it would have been tvi.tec.nm.us. That's right, the RFC
describes both "cc" for community colleges and "tec" for technical institutes.
Even more surprising, it describes placing entities under a "locality" such as
a city. The examples of localities given are "berkeley.ca.us" and "portland.wa.us", the
latter of which betrays an ironic geographical confusion. It then specifies "ci"
for city and "co" for county, meaning that the city government of our notional
Portland, Washington would be ci.portland.wa.us. Agencies could go under the
city government component (the RFC gives the example "Fire-Dept.CI.Los-Angeles.CA.US")
while private businesses could be placed directly under the city (e.g. "IBM.Armonk.NY.US").
The examples here reinforce that the idea itself is different from how we use DNS
today: The DNS of RFC 1480 is far more hierarchical and far more focused on full
names, without abbreviations.
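To see how mechanical the scheme is, here is a rough Python sketch that reads one of these names from the root down and says what kind of entity it describes. It is an illustration under heavy assumptions: the function name `classify` and the result format are made up, the code tables contain only the codes mentioned in this article (RFC 1480 defines more), and nothing checks that the second label is a real state code.

```python
# Codes that sit directly under the state, per the examples in the text.
AFFINITY_CODES = {
    "k12": "school district",
    "cc": "community college",
    "tec": "technical institute",
}

# Codes that sit under a locality (a city or county name).
LOCAL_GOVERNMENT_CODES = {
    "ci": "city government",
    "co": "county government",
}


def classify(name):
    """Describe a locality-based .us name; 'entity' labels are root-first."""
    labels = name.lower().rstrip(".").split(".")
    labels.reverse()  # e.g. ["us", "nm", "tec", "tvi"]
    if len(labels) < 3 or labels[0] != "us":
        return {"kind": "not a locality-based .us name"}

    state, third, rest = labels[1], labels[2], labels[3:]
    if third in AFFINITY_CODES:
        return {"kind": AFFINITY_CODES[third], "state": state, "entity": rest}
    if not rest:
        return {"kind": "locality", "state": state, "locality": third}
    if rest[0] in LOCAL_GOVERNMENT_CODES:
        return {"kind": LOCAL_GOVERNMENT_CODES[rest[0]], "state": state,
                "locality": third, "entity": rest[1:]}
    return {"kind": "entity in locality", "state": state,
            "locality": third, "entity": rest}


for example in ("tvi.tec.nm.us", "Fire-Dept.CI.Los-Angeles.CA.US",
                "bridger.pps.k12.or.us"):
    print(example, "->", classify(example))
```

Every label carries meaning by position alone, which is exactly the property that makes these names easy for software to take apart and hard for people to remember.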
Of course, the concept is not limited to local government. RFC 1480 describes
"fed.us" as a suffix for the federal government (the example "dod.fed.us" illustrates
that this has not at all happened), and even "General Independent Entities" and
"Distributed National Institutes" for those trickier cases.
We can draw a few lessons from how this proposal compares to our modern day.
Back in the 1990s, .gov was limited to the federal government.
The thinking was that all government agencies would move into .us, where the
hierarchical structure made it easier to delegate management of state and
locality subtrees. What actually happened was the opposite: the .us thing
never really caught on, and a more straightforward and automated management
process made .gov available to state and local governments. The tree has
effectively been flattened.
That's not to say that none of these hierarchical names ever caught on.
GoDaddy continues to maintain what they call the "usTLD Locality-Based
Structure". At the decision of the relevant level of the hierarchy (e.g.
a state), locality-based subdomains of .us can either be delegated to
the state or municipality to operate, or operated by GoDaddy itself as
the "Delegated Manager." The latter arrangement is far more common, and
it's going to stay that way: RFC 1480 names are not dead, but they are
on life support. GoDaddy's contract allows them to stop onboarding any
additional delegated managers, and they have.
Few of these locality-based names found wide use, and there are even
fewer today. Multnomah County Library once used "multnomah.lib.or.us,"
which I believe was actually the very first "library" domain name registered.
It now silently redirects to "multcolib.org", which
we could consider a graceful name only in that the spelling of
"Multnomah" is probably not intuitive to those not from the region. As
far as I can tell, the University of Oregon and OGI (part of OHSU)
were keeping very close tabs on the goings-on of academic DNS, as
Oregon entities are conspicuously over-represented in the very early
days of RFC 1480 names---behind only California, although Georgia
Tech and Trent Heim of former Colorado company XOR both give their
respective states a run for the money.
"co.bergen.nj.us" works, but just gets you a redirect notice page to
bergencountynj.gov. It's interesting that this name is actually longer
than the RFC 1480 name, but I think most people would agree that bergencountynj.gov
is easier to remember. Some of that just comes down to habit, we all know ".gov",
but some of it is more fundamental. I don't think that people often
understand the hierarchical structure of DNS, at least not intuitively, and
that makes "deeply hierarchical" (as GoDaddy calls them) names confusing.
Certainly the RFC 1480 names for school districts produced complaints.
They were also by far the most widely adopted. You can pick and choose
examples of libraries (.lib.<state>.us) and municipal governments that have
used RFC 1480 names, but school districts are another world: most school
districts that existed at the time have a legacy of using RFC 1480 naming.
As one of its many interesting asides, RFC 1480 explains why: the practice
of putting school districts under .k12.<state>.us actually
predates RFC 1480. Indeed, the RFC seems to have been written in part to
formalize the existing practice. The idea of the k12.<state>.us hierarchy
originated within IANA in consultation with InterNIC (newly created at
the time) and the Federal Networking Council, a now-defunct advisory
committee of federal agencies that made a number of important early
decisions about internet architecture.
RFC 1480 is actually a revision of the slightly older RFC 1386, which
instead of saying that schools were already using the k12 domains, says that
"there ought to be a consistent scheme for naming them." It then says
that the k12 branch has been "introduced" for that purpose. RFC 1386 is
mostly silent on topics other than schools, so I think it was written
mostly to document the decision made about schools with other details about
the use of locality-based domains left sketchy until the more thorough
RFC 1480.
The decision to place "k12" under the state rather than under a municipality
or county might seem odd, but the RFC gives a reason. It's not unusual for
school districts, even those named after a municipality, to cover a larger
area than the municipality itself. Albuquerque Public Schools operates
schools in the East Mountains; Portland Public Schools operates schools
across multiple counties and beyond city limits. Actually the RFC gives
exactly that second one as an example:
"For example, the Portland school
district in Oregon, is in three or four counties. Each of those
counties also has non-Portland districts."
I include that quote mostly because I think it's funny that the authors
now know what state Portland is in. When you hear "DNS" you think Jon
Postel, at least if you're me, but RFC 1480 was written by Postel along
with a less familiar name, Ann Westine Cooper. Cooper was a coworker of
Postel at USC, and RFC 1480 very matter-of-factly names the duo of
Postel and Cooper as the administrator of the .US TLD. That's interesting
considering that almost five years later Postel would become involved in
a notable conflict with the federal government over control of DNS---one
of the events that precipitated today's eccentric model of public-private
DNS governance.
There are other corners of the RFC 1480 scheme that were not contemplated
in 1993, and have managed to outlive many of the names that were. Consider,
for example, our indigenous nations: these are an exception to the normal
political hierarchy of the US. The Navajo Nation, for example, has a
status that is often described as parallel to that of a state, but isn't really.
Native nations are sovereign, but are also subject to federal law by
statute, and subject to state law by various combinations of statute,
jurisprudence, and bilateral agreement. I didn't really give any detail
there and I probably still got something wrong, such is the complicated
legal history and present of Native America. So where would a native
sovereign government put their website? They don't fall under the
traditional realm of .gov, federal government, nor do they fall under a
state-based hierarchy. Well, naturally, the Navajo Nation is found at
navajo-nsn.gov.
We can follow the "navajo" part but the "nsn" is odd, unless they spelled
"nation" wrong and then abbreviated it, which I've always thought is what
it looks like on first glance. No, this domain name is very much an artifact
of history. When the problem of sovereign nations came to Postel and Cooper,
the solution they adopted was a new affinity group, like "fed" and "k12"
and "lib": "nsn", standing for Native Sovereign Nation. Despite being a
latecomer, nsn.us probably has the most enduring use of any part of the
RFC 1480 concept. Dozens of pueblos, tribes, bands, and confederations
still use it. squamishtribe.nsn.us, muckleshoot.nsn.us, ctsi.nsn.us,
sandiapueblo.nsn.us.
Yet others have moved away... in a curiously "partial" fashion. navajo-nsn.gov
as we have seen, but an even more interesting puzzler is tataviam-nsn.us. It's
only one character away from a "standardized" NSN affinity group locality domain,
but it's so far away. As best I can tell, most of these governments initially
adopted "nsn.us" names, which cemented the use of "nsn" in a similar way to "state"
or "city" as they appear in many .gov domains to this day. Policies on .gov
registration may be a factor as well, the policies around acceptable .gov names
seem to have gone through a long period of informality and then changed a number
of times. Without having researched it too deeply, I have seen bits and pieces
that make me think that at various points NTIA has preferred that .gov domains
for non-federal agencies have some kind of qualifier to indicate their "level"
in the political hierarchy. In any case, it's a very interesting situation because
"native sovereign nation" is not otherwise a common term in US government.
It's not like lawyers or lawmakers broadly refer to tribal governments as NSNs,
the term is pretty much unique to the domain names.
So what ever happened to locality-based names? RFC 1480 names have fallen
out of favor to such an extent as to be considered legacy by many of their
users. Most Americans are probably not aware of this name hierarchy at all,
despite it ostensibly being the unified approach for this country. In
short, it failed to take off, and those sectors that had widely adopted it
(such as schools) have since moved away. But why?
As usual, there seem to be a few reasons. The first is user-friendliness.
This is, of course, a matter of opinion---but anecdotally, many people
seem to find deeply hierarchical domain names confusing. This may be a
self-fulfilling prophecy, since the perception that multi-part DNS names
are user-hostile means that no one uses them which means that no users
are familiar with them. Maybe, in a different world, we could have broken
out of that loop. I'm not convinced, though. In RFC 1480, Postel and
Cooper argue that a deeper hierarchy is valuable because it allows for
more entities to have their "obviously correct" names. That does make
sense to me, splitting the tree up into more branches means that there is
less name contention within each branch. But, well, I think it might be
the kind of logic that is intuitive only to those who work in computing.
For the general public, I think long multi-part names quickly become
difficult to remember and difficult to type. When you consider the dollar
amounts that private companies have put into dictionary word domain names,
it's no surprise that government agencies tend to prefer one-level names
with full words and simple abbreviations.
I also think that the technology outpaced the need that RFC 1480 was
intended to address. The RFC makes it very clear that Postel and Cooper
were concerned about the growing size of the internet, and expected the
sheer number of organizations going online to make maintenance of the DNS
impractical. They correctly predicted the explosion of hosts, but not the
corresponding expansion of the DNS bureaucracy. Between the two versions
of the .us RFC, DNS operations were contracted to Network Solutions. This
began a winding path that led to delegation of DNS zones to various
private organizations, most of which fully automated registration and
delegation and then federated it via a common provisioning protocol. The
size of, say, the .com zone really did expand beyond what DNS's designers
had originally anticipated... but it pretty much worked out okay. The
mechanics of DNS's maturation probably had a specifically negative effect
on adoption of .us, since it was often under a different operator from the
"major" domain names and not all "registrars" initially had access.
Besides, the federal government never seems to have been all that on board
with the concept. RFC 1480 could be viewed as a casualty of the DNS wars,
a largely unexplored path on the branch of DNS futures that involved IANA
becoming completely independent of the federal government. That didn't
happen. Instead, in 2003 .gov registration was formally opened to municipal,
state, and tribal governments. It became federal policy to encourage use of
.gov for trust reasons (DNSSEC has only furthered this), and .us began to fall by the wayside.
That's not to say that RFC 1480 names have ever gone away. You can still
find many of them in use. state.nm.us doesn't have an A record, but governor.state.nm.us
and a bunch of other examples under it do. The internet is littered
with these locality-based names, many of them hiding out in smaller
agencies and legacy systems. Names are hard to get right, and one of the reasons is
that they're very hard to get rid of.
"When things are bigger, names have to be longer. There is an
argument that with only 8-character names, and in each position allow
a-z, 0-9, and -, you get 37**8 = 3,512,479,453,921 or 3.5 trillion
possible names. It is a great argument, but how many of us want
names like 'xs4gp-7q'. It is like license plate numbers, sure some
people get the name they want on a vanity plate, but a lot more
people who want something specific on a vanity plate can't get it
because someone else got it first. Structure and longer names also
let more people get their 'obviously right' name."
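The arithmetic in that passage checks out, and it's easy to reproduce. The sketch below is mine rather than the RFC's: `flat_name_count` simply raises the alphabet size to the name length, and `hostname_style_count` adds one assumption the quoted figure ignores, namely the usual hostname convention that a label can't begin or end with a hyphen.

```python
import string

# Characters allowed in each position by the quoted argument: a-z, 0-9, "-".
ALPHABET = string.ascii_lowercase + string.digits + "-"


def flat_name_count(length, alphabet_size=len(ALPHABET)):
    """Number of distinct names of exactly `length` characters."""
    return alphabet_size ** length


def hostname_style_count(length, alphabet_size=len(ALPHABET)):
    """Same count, but with no hyphen in the first or last position.

    Assumption added for illustration; the quoted figure does not apply it.
    """
    if length < 2:
        return (alphabet_size - 1) ** length
    return (alphabet_size - 1) ** 2 * alphabet_size ** (length - 2)


print(f"{flat_name_count(8):,}")       # 3,512,479,453,921
print(f"{hostname_style_count(8):,}")  # slightly fewer once hyphens are restricted
```

Either way the count is in the trillions, which is the whole point: running out of names was never the problem, running out of names anyone would want was.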
You look at Reddit these days and see all these usernames that are two
random words and four random numbers, and you see that Postel and Cooper
were right. Flat namespaces create a problem, names must either be complex
or long, and people don't like either one. What I think they got wrong, at
a usability level, is that deep hierarchies still create names that are
complex and long. It's a kind of complexity that computer scientists
are more comfortable with, but that's little reassurance when you're
staring down the barrel of "bridger.pps.k12.or.us".