Mailing List Archive

Re: [tlug] Website Question(s)



>>>>> "Michael" == Michael Smith <smith@example.com> writes:

    Michael> "Lyle (Hiroshi) Saxon" <ronfaxon@example.com> writes:

    >> http://www5d.biglobe.ne.jp/~LLLtrs/PhotoGlryMain/pgb/Kurihama01a.html

    >> What is weird is that sometimes (not always) this squiggly line

Actually, it ALWAYS happens, though usually not where you can see it.
The HTTP protocol requires that the URL be sent to the server in that
form.

    >> becomes two characters - a "%" and a "7", as follows:
    >> http://www5d.biglobe.ne.jp/%7ELLLtrs/PhotoGlryMain/pgb/Kurihama01a.html

    >> Is the bloody squiggly line actually a two-bit character that
    >> must be a "%" and a "7" on some computers?  Whatever bozo
    >> thought it was a good idea to put that in there should have
    >> their neck wrung!

    Michael> That is indeed some crazy and bizarre, wacky mixed-up
    Michael> stuff. I have no idea what in the world may be going on
    Michael> there. Maybe some kind of prank.

*sigh*  Go to Bookstore and tachiyomi (read standing up) the O'Reilly
books, do not pass Go, do not collect Summer Bonus.

First, it's called a tilde (hankaku nami in Japanese).

Second, as you can see, the encoded form is not two characters, it's
three: %7E, the URL escape character "%" followed by 7E, the ASCII
code for tilde in hexadecimal.
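
If you want to check the arithmetic yourself, here is a minimal Python
sketch.  The helper name percent_escape is made up for illustration;
it only handles a single ASCII character and is not a full URL
encoder.

```python
# Build the percent-escape for a single ASCII character by hand.
def percent_escape(ch):
    # "%" followed by the character's code as two uppercase hex digits
    return "%{:02X}".format(ord(ch))

tilde_code = ord("~")          # 126 decimal, 0x7E hexadecimal
escaped = percent_escape("~")  # three characters: "%", "7", "E"
print(tilde_code, hex(tilde_code), escaped, len(escaped))
```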

Third, the tilde is the conventional Unixoid abbreviation for "home
directory".  If followed by a valid username, it is the home directory
of that user.  If not, it is the home directory of the current user.
In the case of a web server, the notion of home directory is somewhat
unclear.  If the user is actually a system user, it is normally not
that user's system home (too dangerous), but a subdirectory (often
~/public_html).  This is a special case, since user homes should not
be under the webserver's DocumentRoot.  If the user is a virtual user
(i.e., exists only as web space and maybe a mail alias), then it is some
arbitrary place, usually under DocumentRoot for convenience.
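
To make the system-user case concrete, here is a rough Python sketch
of the kind of mapping a server might do.  Everything in it is an
assumption for illustration: the function name userdir_to_path, the
/home layout, and the public_html default.  Real servers make all of
this configurable and do far more checking.

```python
import posixpath

def userdir_to_path(url_path, home_root="/home", subdir="public_html"):
    """Map '/~user/rest' to '<home_root>/user/<subdir>/rest'.

    Returns None when the path does not start with a ~user component.
    Hypothetical helper; it does not guard against '..' in the path.
    """
    if not url_path.startswith("/~"):
        return None
    # Split off the username; whatever follows the first "/" is the rest.
    user, _, rest = url_path[2:].partition("/")
    if not user:
        return None
    return posixpath.join(home_root, user, subdir, rest)

print(userdir_to_path("/~LLLtrs/PhotoGlryMain/pgb/Kurihama01a.html"))
```

Note that the result points into the user's public subdirectory, not
at the system home itself, which is the point of the special case.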

Fourth, the translation you observe is called "URL-encoding", and it's
used so that "unsafe" characters do not appear in URLs.  The reason
for this is that the URLs get parsed by and passed into functions that
don't necessarily know what to do with the characters, or might do the
wrong thing.  A typical example is the ASCII space character, because
most string processing routines interpret it as a token separator.
The tilde is considered unsafe because it is an abbreviation which has
different meanings in different contexts, and thus behavior is
somewhat indeterminate.
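
A short sketch with Python's standard urllib.parse shows both
directions.  It decodes the tilde and encodes a space; which
characters a given library chooses to escape (the tilde included)
varies by library and version, so the example deliberately does not
rely on that.  The file name "my photo.html" is made up.

```python
from urllib.parse import quote, unquote

escaped_url = "http://www5d.biglobe.ne.jp/%7ELLLtrs/PhotoGlryMain/pgb/Kurihama01a.html"
plain_url = "http://www5d.biglobe.ne.jp/~LLLtrs/PhotoGlryMain/pgb/Kurihama01a.html"

# Decoding turns %7E back into the tilde, so both forms name the same page.
same_page = unquote(escaped_url) == plain_url
print(same_page)

# A space must be escaped, or it would be read as a token separator.
encoded_name = quote("my photo.html")
print(encoded_name)
```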

-- 
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.
