Re: [tlug] Weird font problem



> And yet their character codes are 0x92, 0x93 and 0x94 respectively. Aren't
> extended characters in Unicode at least 2 bytes in length?

#
UTF-8 is popular for HTML and similar protocols. UTF-8 is a way of 
transforming all Unicode characters into a variable length encoding of 
bytes. It has the advantages that the Unicode characters corresponding 
to the familiar ASCII set have the same byte values as ASCII, and that 
Unicode characters transformed into UTF-8 can be used with much existing 
software without extensive software rewrites.
<http://www.unicode.org/standard/principles.html>
#
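
To make the variable-length point concrete, here is a small Python sketch. It rests on an assumption of mine: that the 0x92, 0x93 and 0x94 you are seeing are single bytes from Windows-1252 (where they are the curly quotes), not Unicode code points. The helper name utf8_hex is made up for illustration.

```python
# Assumption: 0x92, 0x93, 0x94 are Windows-1252 bytes, not Unicode code points.
def utf8_hex(ch):
    """Return (code point label, UTF-8 bytes as hex) for one character."""
    encoded = ch.encode("utf-8")
    return f"U+{ord(ch):04X}", " ".join(f"{b:02X}" for b in encoded)

# Decode the three single bytes as Windows-1252 to get the real characters.
quotes = bytes([0x92, 0x93, 0x94]).decode("cp1252")

# ASCII stays one byte in UTF-8; the curly quotes take three bytes each.
for ch in "A" + quotes:
    label, hex_bytes = utf8_hex(ch)
    print(label, "->", hex_bytes)
```

If that assumption holds, the page is serving one-byte Windows-1252 codes, which is why they look too short to be Unicode.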

Most savvy coders who have switched to UTF-8 still stick to decimal HTML 
entities for extended characters: while today's standards-compliant 
browsers should have no problems handling the literals, back-end 
scripting may be a different matter.
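
As a rough sketch of what "decimal entities" means in practice, Python's built-in xmlcharrefreplace error handler will turn anything outside ASCII into a decimal reference. The function name to_decimal_entities is just an illustrative wrapper, not something from any particular toolchain.

```python
def to_decimal_entities(text):
    """Replace every non-ASCII character with a decimal HTML entity."""
    # 'xmlcharrefreplace' emits &#NNNN; for characters ASCII cannot encode.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")

sample = "\u201cWeird font problem\u201d isn\u2019t weird"
print(to_decimal_entities(sample))
```

The output is plain 7-bit ASCII, so it should survive a back end that mangles multi-byte literals.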

Practically speaking, test your setup against the big guns in the front 
end industry:

http://www.zeldman.com
http://www.alistapart.com/stories/emen/
http://www.bradchoate.com/
http://daringfireball.net/

Unlike that Whazzit page, they get their curly quotes right. If your 
system can't handle them, you'll need to tweak on.

:: Rudolf Ammann
:: Mie University, Japan
