tlug.jp Mailing List Archive
Re: [tlug] Unicode/ICU question about joining lines
- Date: Fri, 13 Aug 2021 10:26:02 +0900
- From: eizietheez@example.com
- Subject: Re: [tlug] Unicode/ICU question about joining lines
- References: <CACaJP_QGLoO=qFPSQYUFp3PvZy7O7PTEBFvjBaom4-vPuHZLmw@mail.gmail.com> <24853.16777.99356.678602@turnbull.sk.tsukuba.ac.jp> <CACaJP_TprrtCgg-PNwUN7CfzCdNE4H13Gw6Srhw3Ck+yn-fe1A@mail.gmail.com>
- User-agent: mblaze/1.1
> I was able to work around the issue this time, but I have been
> frustrated with software that always inserts a space when joining lines
> for many years, so I will likely revisit the problem in the future and
> classify those blocks! :)

Admittedly, I am out of my comfort zone here, but isn't orthography an orthogonal issue to the script itself? For a fully cross-lingual solution, I would suspect that you at least need metadata about what language you are processing, in addition to the characters themselves.

What are some examples of scripts that differ in their usage of word-boundary spaces depending on language? A dumb example might be classical Latin and Greek vs. their modern equivalents.
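To make the language-metadata idea concrete, here is a minimal Python sketch, not a real solution. The function name `join_lines`, the `lang` parameter, and the `NO_SPACE_LANGS` set are all hypothetical names made up for illustration, and the set is deliberately incomplete. The point is only that the separator is chosen from a language tag supplied by the caller rather than inferred from the characters.

```python
# Hypothetical, incomplete set of primary language subtags whose
# orthography does not put spaces between words (an assumption made
# for illustration, not an authoritative list).
NO_SPACE_LANGS = {"ja", "zh"}


def join_lines(lines, lang):
    """Join wrapped lines, choosing the separator from the language tag.

    `lang` is assumed to be a BCP 47 style tag such as "ja" or "en-US";
    only the primary subtag (the part before the first "-") is used.
    """
    primary = lang.split("-")[0].lower()
    separator = "" if primary in NO_SPACE_LANGS else " "
    # Strip surrounding whitespace and skip lines that are empty.
    parts = [line.strip() for line in lines if line.strip()]
    return separator.join(parts)


print(join_lines(["東京は", "晴れです"], "ja"))
print(join_lines(["hello", "world"], "en-US"))
```

A real tool would still have to cope with mixed-language text, such as a Latin-script word at the end of a Japanese line, which a single tag per document cannot express.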
- Follow-Ups:
- Re: [tlug] Unicode/ICU question about joining lines
- From: Travis Cardwell
- References:
- [tlug] Unicode/ICU question about joining lines
- From: Travis Cardwell
- [tlug] Unicode/ICU question about joining lines
- From: Stephen J. Turnbull
- Re: [tlug] Unicode/ICU question about joining lines
- From: Travis Cardwell