Mailing List Archive



Re: [tlug] EDICT dictionary on Kindle



On 17 December 2012 09:26, John Mettraux <jmettraux@example.com> wrote:

> I've generated a Kindle (.mobi) version of EDICT2.
[...]
> The tool I used to generate it is at
>
>   https://github.com/jmettraux/edict2-kindle

As I mentioned before, I set up John's tool in a
script to do a weekly generation of a Kindle "mobi"
file. The script works fine when I run it "by hand", but
it fails as a cron job. The problem is in the
execution of the Ruby script. When I captured stderr to
a file, I found it full of:

...
failed to parse: 刖 [げつ] /(n) (arch) (obsc) (See 剕) cutting off the leg
at the knee (form of punishment in ancient China)/EntL2542160/
#<ArgumentError: invalid byte sequence in US-ASCII>
...

It seems that whatever locale is used by cron is not getting
through to Ruby, so it is defaulting to US-ASCII and chucking up
on the UTF-8 in the input file. The script has the usual
"# encoding: UTF-8" as the first line, but that seems to
affect only the script source, not the data it reads. Googling
the error turns up a lot of discussion, but no cron-related
solutions.
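
To illustrate the failure mode, here is a minimal sketch, assuming
Ruby 1.9 or later. It is not code from edict2-kindle: the sample line
is a shortened version of the entry in the error output, and try_parse
is a hypothetical stand-in for whatever parsing the tool does. The same
bytes are tagged first as US-ASCII (roughly what a read under an ASCII
default would give) and then as UTF-8.

```ruby
# encoding: UTF-8

# Shortened sample of the EDICT2 entry from the error output (illustrative only).
line = "刖 [げつ] /(n) (arch) cutting off the leg at the knee/"

# Same bytes, but tagged US-ASCII, as data read under an ASCII default would be.
ascii_tagged = line.dup.force_encoding("US-ASCII")

# Hypothetical stand-in for the tool's parser: a regexp match over the line.
def try_parse(s)
  s =~ /\[/ ? "parsed" : "no match"
rescue ArgumentError => e
  "failed: #{e.message}"
end

ascii_result = try_parse(ascii_tagged)

# Re-tagging the identical bytes as UTF-8 makes the string valid again.
utf8_result = try_parse(ascii_tagged.dup.force_encoding("UTF-8"))

puts ascii_result
puts utf8_result
```

If this is indeed what is happening, opening the input with an explicit
encoding, e.g. File.open(path, "r:UTF-8"), or setting
Encoding.default_external early in the script, should make the read
independent of cron's environment. I have not confirmed where the tool
actually opens its input.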

I've tried all sorts of locale-setting fiddles, and tried several different
shells, but nothing works.

Any suggestions?

Jim

-- 
Jim Breen
Adjunct Snr Research Fellow, Japanese Studies Centre, Monash University

