Mailing List Archive


Re: carriage returns



>>>>> "Tony" == Tony Laszlo <laszlo@example.com> wrote:

    Tony> I use jvim and yudit for editing Japanese documents. I am
    Tony> having trouble with long strings of Japanese text which have
    Tony> no carriage returns. With Yudit,

Save it to a file.  Use nkf or something to find out what the encoding
is (with yudit, I assume it will be UTF-8).  Then use iconv(1) to
convert it to 16-bit Unicode, and cut every 60 bytes (30 characters).
(Note that the NL character(s) should also be two bytes each; you
should probably specify bigendian Unicode and insert "\000\n" or
"\000\r\000\n".)  Then convert it back with iconv.  You can do this in
a pipeline once you've got the line cutter working, which should be a
one-liner in perl or sed or something like that.  Alternatively, use
perl to call nkf or kcc to get the char code, call iconv on the file
(sucking stdout in via a pipe), do the cutting, and spit it back out
via a pipe to iconv.
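
For what it's worth, the cutting step described above can be sketched
in Python.  This is only an illustration, not the one-liner mentioned
in the text: it uses Python's built-in codecs in place of iconv(1),
assumes the text has already been decoded (e.g. from UTF-8), and
assumes every character is in the Basic Multilingual Plane, so that it
occupies exactly two bytes in big-endian UTF-16.  The function name
wrap_fixed_width and its chars_per_line parameter are made up for this
example.  Unlike a blind cut every 60 bytes, it restarts the count at
each newline already present in the text.

```python
def wrap_fixed_width(text: str, chars_per_line: int = 30) -> str:
    """Insert a newline after every `chars_per_line` characters of each line.

    Assumption: all characters are in the BMP (2 bytes each in UTF-16BE).
    A character outside the BMP could be split across a cut and make the
    final decode fail.
    """
    bytes_per_line = 2 * chars_per_line      # e.g. 60 bytes for 30 characters
    newline = "\n".encode("utf-16-be")       # two bytes: b"\x00\n"
    out_lines = []
    for line in text.split("\n"):
        raw = line.encode("utf-16-be")       # the "16-bit Unicode" form
        # Cut the byte string into fixed-size pieces.
        chunks = [raw[i:i + bytes_per_line]
                  for i in range(0, len(raw), bytes_per_line)]
        out_lines.append(newline.join(chunks))
    # Rejoin the original lines and convert back to text.
    return newline.join(out_lines).decode("utf-16-be")
```

To use it as a filter, reading the whole input as text and writing out
wrap_fixed_width(...) of it should be enough, provided the input and
output streams are opened with the right encoding.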

Yo, Simon, why doesn't Perl support iconv internally?  It's an XPG2
function, most reasonable OSes should have it....  (Note, AFAICT
Python doesn't either.)

Many editors support using external filters on marked regions, maybe
you can do it in the editor that way.

Your all-ASCII lines will be short, and so will most paragraphs' first
and last lines, but so what.

Alternatively, you could use a Mule-kei (Mule-family) editor.  They
will automatically fill (breaking the lines, with
fill-paragraph-or-region, usually bound to M-q) or wrap (without
line-breaking, just set the variable truncate-lines to nil).

To do this to a whole file in batch mode (since you don't want to use
a different editor for actual editing):

xemacs -batch "$FILE" -eval '(replace-string "\n" "\n\n")' \
       -f mark-whole-buffer -f fill-region -f save-buffers-kill-emacs

The point of the replace-string call is to make sure that paragraphs
are separated by empty lines, otherwise you'll probably get one big
paragraph.

There are probably other ways to do this, but the best way is to stop
corresponding with people who send you junk like that ;-).  They
probably have other bad habits like sharing dirty needles.

-- 
University of Tsukuba                Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
Institute of Policy and Planning Sciences       Tel/fax: +81 (298) 53-5091
_________________  _________________  _________________  _________________
What are those straight lines for?  "XEmacs rules."

