tlug.jp Mailing List Archive
Re: [tlug] Unicode/ICU question about joining lines
- Date: Thu, 12 Aug 2021 17:05:58 +0900
- From: Travis Cardwell <travis.cardwell@example.com>
- Subject: Re: [tlug] Unicode/ICU question about joining lines
- References: <CACaJP_QGLoO=qFPSQYUFp3PvZy7O7PTEBFvjBaom4-vPuHZLmw@mail.gmail.com> <CAAhy3dufsFDgNaF0V5yq0-VxKSyC5kkF4kF7dLabnYKk8o67rQ@mail.gmail.com> <CADR0rncr_fhnuKBzk1qqx=3niZnHBEQp31k5XrjnuFzdNwR-Vg@mail.gmail.com> <CACaJP_QU4zS-NjzuX5mq4c+uuMCsOk6otJTD-GCig94k_ZQtmg@mail.gmail.com> <2ILEGJ5IPJ99U.3OOI8BA92KS0X@wilsonb.com>
On Thu, Aug 12, 2021 at 4:21 PM <eizietheez@example.com> wrote:
> Do you accept multi-lingual text? If not, then a simple hack would be to just
> look for spaces in the input text and classify the language accordingly. The
> probability of mis-detection should decrease exponentially with the input
> length.

That is an interesting idea! Thanks!

The software that I am working on does accept multilingual text, but users
can write the text in one (long) line in cases where lines are not joined
correctly, so this could be a viable option.

> Of course, even J語 does sometimes contain spaces in practice, simply as a
> mistake or as a kind of "scare quote" emphasis around words.

At a company that I worked at, developers put spaces around all ローマ字
words in Japanese text. I am not certain, but I think that the practice
originated because the ticketing system in use required spaces to correctly
parse markup:

    例えば、 @foldText@ は関数である。

I suspect that they started to put spaces around all such ローマ字 for
consistency. An unfortunate result was that such spaces would often cause
unsightly line wrapping in rendered text.

Cheers,

Travis
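The space-counting heuristic suggested in the quoted message could be sketched roughly as follows. This is only an illustration, not the implementation used in the software under discussion: the function names (`space_ratio`, `uses_spaces`, `join_lines`) and the threshold value are assumptions made for the example, and only the ASCII space is counted.

```python
# Hypothetical threshold (an assumption, not a measured value): text in
# space-delimited languages has a fairly high share of spaces, while
# Japanese text normally has very few.
SPACE_THRESHOLD = 0.1


def space_ratio(text: str) -> float:
    """Return the fraction of non-newline characters that are ASCII spaces."""
    chars = [c for c in text if c not in "\r\n"]
    if not chars:
        return 0.0
    return sum(c == " " for c in chars) / len(chars)


def uses_spaces(text: str, threshold: float = SPACE_THRESHOLD) -> bool:
    """Guess whether the text is written in a space-delimited language."""
    return space_ratio(text) >= threshold


def join_lines(text: str) -> str:
    """Join lines with a space only if the text appears to use spaces."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    separator = " " if uses_spaces(text) else ""
    return separator.join(lines)


if __name__ == "__main__":
    print(join_lines("The quick brown fox\njumps over the lazy dog"))
    print(join_lines("これは日本語の\n文章です。"))
```

As both messages point out, short inputs and Japanese text with spaces around ローマ字 words can push the ratio toward the threshold, so a guess like this becomes more trustworthy as the input gets longer.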
- References:
- [tlug] Unicode/ICU question about joining lines
- From: Travis Cardwell
- Re: [tlug] Unicode/ICU question about joining lines
- From: Raymond Wan
- Re: [tlug] Unicode/ICU question about joining lines
- From: Benjamin Kowarsch
- Re: [tlug] Unicode/ICU question about joining lines
- From: Travis Cardwell
- Re: [tlug] Unicode/ICU question about joining lines
- From: eizietheez