Mailing List Archive
Re: [tlug] Character encoding stuff
- Date: Fri, 31 May 2013 02:01:51 +0900
- From: Nguyen Viet Cuong <mrcuongnv@example.com>
- Subject: Re: [tlug] Character encoding stuff
- References: <51A76F77.9030306@imaginatorium.org>
Hi Brian,

(1) You can extract such information from: ftp://ftp.unicode.org/Public/6.3.0/ucd/UnicodeData-6.3.0d3.txt
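
As a rough illustration of what that data gives you, here is a minimal sketch in Python rather than PHP. It assumes Python 3 is installed; its standard unicodedata module is built from the Unicode Character Database, though possibly a different version than the file above. The helper name describe is made up for this example. For each character it reports the code point, the general category (for example "Sm" for a math symbol, "Ll" for a lowercase letter), and the character name, which usually includes the script.

```python
import unicodedata


def describe(text):
    """Return (code point, general category, name) for each character in text."""
    rows = []
    for ch in text:
        rows.append((
            "U+%04X" % ord(ch),
            unicodedata.category(ch),
            # Fall back to a placeholder for characters that have no name.
            unicodedata.name(ch, "<unnamed>"),
        ))
    return rows


# The three look-alikes from the example URL in the question:
# Latin x, the multiplication sign, and Cyrillic ha.
for row in describe("x\u00d7\u0445"):
    print(*row)
# U+0078 Ll LATIN SMALL LETTER X
# U+00D7 Sm MULTIPLICATION SIGN
# U+0445 Ll CYRILLIC SMALL LETTER HA
```

Saved as a script, this can be run from a terminal on scraped text, which may cover the "screen utility" case as well.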
On Fri, May 31, 2013 at 12:25 AM, Brian Chandler <brian@example.com> wrote:

I write bits of my website in PHP, and am always bumping up against
character set issues. Here are (is??) a plurality of questions.
(1) In particular, when scraping jigsaw puzzle manufacturer websites, I
want to know what characters I'm looking at. Things like: is that cross
a *multiplication sign, *lowercase-x, *capital-X, *zenkaku-x,
*zenkaku-X, or who knows what (х, for example, and I managed to type that
one in)? I started looking on the web, then realised I actually wrote a
primitive one myself: for example
http://imaginatorium.org/svc/unicode.php?ins=x%C3%97%D1%85
But it would be nice to get more than just numbers: stuff like
"Cyrillic", "Punctuation" etc. Any suggestions for useful tools, either
Web-based or a screen utility I can run in Linux?
(2) I use gedit, which is sort of fine, but it does Really Stupid
(sorry, I mean "clever") display tricks, trying to guess how things
should be shown depending on surrounding characters. So paste in the
following two lines, and the two marus appear completely different (in
size: both are circles):
これは、○です。(マル)
But this is exactly the same character: ○
Are there any suggestions of editors more suited to multi-script work?
There are a few other things, but I'd better go and watch детараме хиро
now. (That came out wrong...)
Brian Chandler
--
Nguyen Viet Cuong