Re: [tlug] Editing XML



On 2014-06-19 09:21, Travis Cardwell wrote:
> On 2014-06-19 03:56, Brian Chandler wrote:
>> So the question is: What is the right tool? InDesign files (haven't
>> actually managed to see one yet) appear to be xml,
>
> InDesign files (INDD) are not XML, but the document can be exported to
> IDML (InDesign Markup Language [1]) or XLIFF (XML Localisation Interchange
> File Format [2]), which are XML.  Memsource apparently supports IDML.

OK -- basically the havoc has already been wreaked by the time the XLIFF file has been made, with its inappropriate segmentation... But looking at this:

http://wiki.memsource.com/wiki/MemSource_Cloud_User_Manual#Segmentation

The only customisation mentioned is all about stopping abbreviations like approx. from splitting this sentence into two. Looks like a desperate hack to me. OTOH, there must be a customisation to break on Japanese 'maru'... so this is something else to ask about.
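
To make the segmentation point concrete, here is a minimal sketch of a splitter that breaks on the Japanese 'maru' (。) and on an English full stop, but not after a listed abbreviation. It is not how Memsource works; the abbreviation list and the function name are assumptions made up for illustration.

```python
import re

# Hypothetical abbreviation list; "approx." is the example from the manual.
ABBREVIATIONS = ("approx.", "etc.", "e.g.")


def split_sentences(text):
    """Split after a maru, or after a full stop followed by whitespace or end of text."""
    segments = []
    start = 0
    for match in re.finditer(r"。|\.(?=\s|$)", text):
        candidate = text[start:match.end()]
        # Skip a full stop that merely ends a known abbreviation.
        if match.group() == "." and candidate.lower().endswith(ABBREVIATIONS):
            continue
        segments.append(candidate.strip())
        start = match.end()
    tail = text[start:].strip()
    if tail:
        segments.append(tail)
    return segments


print(split_sentences("It weighs approx. 5 kg. Ship it."))
print(split_sentences("これは黒です。それは白です。"))
```

Note that a sentence which really does end in "etc." would not be split, which is exactly why the abbreviation list feels like a hack.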

>> The question is whether there is some
>> other generic framework for cracking the text out of (specifically
>> .idml) xml files for translation, in an intelligent and flexible way,
>> capable of helping automation, rather than hindering it. For example,
>> one global replace, something like (imagined example):
>>
>> s/<char-special type='maru-suuji' value=$N>/($N)/
>>
>> ... would replace every circled number by the appropriate (n), supposing
>> that this is the design decision. To do this in Memsource effectively
>> means that every single numeral will be retyped, errors will occur, etc
>> etc. COST.
>
> sed! :)

Well, not exactly sed, because the need is for something xml-aware, which for example will replace 黒 by "black", but only inside <content> tags.
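
A minimal sketch of what "xml-aware" could mean here, using Python's standard xml.etree.ElementTree. The element name `content`, the one-entry glossary and the sample document are assumptions for illustration only (real IDML has its own element names); circled numbers are taken to be the Unicode characters U+2460 to U+2473 rather than the imagined char-special markup above.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical glossary; the 黒 -> "black" pair is the example from this message.
GLOSSARY = {"黒": "black"}

# Unicode circled numbers 1..20 occupy U+2460..U+2473.
CIRCLED = re.compile("[\u2460-\u2473]")


def localise_text(text):
    """Apply the glossary, then turn each circled number into (n)."""
    for source, target in GLOSSARY.items():
        text = text.replace(source, target)
    return CIRCLED.sub(lambda m: "(%d)" % (ord(m.group()) - 0x2460 + 1), text)


def localise_content(xml_string, tag="content"):
    """Rewrite text only inside <tag> elements; attributes and other elements are untouched."""
    root = ET.fromstring(xml_string)
    for elem in root.iter(tag):
        # Only the element's leading text is handled; text after nested
        # child elements (their .tail) is left alone in this sketch.
        if elem.text:
            elem.text = localise_text(elem.text)
    return ET.tostring(root, encoding="unicode")


sample = "<story name='黒'><content>黒①</content><note>黒</note></story>"
print(localise_content(sample))
```

Here the 黒 in the attribute and in the note element survives, while the one inside the content element is replaced, which a plain sed substitution cannot guarantee.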

> Translate Toolkit [3] is a set of utilities written in Python and easy to
> hack (that has saved me considerable effort in software translation
> projects).  Though it does not support IDML, it supports XLIFF.

Right; XLIFF is post-havoc. The fundamental problem is that the L10n industry has not yet noticed that you need to localise format as well as text, and that this could be done systematically, just like the text.

Thanks! (for the other responses too)
Brian Chandler