Mailing List Archive



[tlug] Re: jp encoding detection



Brett Robson wrote:

> I've got a lot of files that are inconsistent in encoding, most are
> EUC but a fair few are in SJIS. I don't want to go through each one so
> I'm looking for a utility that identifies which encoding is used in a
> file. nkf and kcc don't seem to do this. (I could force all to EUC,
> but first I want to know what the problem is)
> 
> I haven't thought it through. A programme for this wouldn't be too
> hard, but I don't really want to write one.

I use the following script to convert ID3 tags. It guesses the encoding
by checking which candidate iconv accepts without error:

-------cut here-------
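#!/bin/sh
# Convert the ID3 tag of an MP3 file to Shift_JIS, guessing the current
# encoding by letting iconv try each candidate in turn.

fname="$1"
# Candidates are tried in order; shift_jis comes last, so a tag is only
# reported as already Shift_JIS when no earlier candidate matches.
ENCODINGS="cp936 cn-big5 euc-jp shift_jis"

for i in $ENCODINGS; do
	# iconv exits non-zero if the input is not valid in encoding $i.
	if id3 -l "$fname" | iconv -f "$i" -t euc-jp >/dev/null 2>&1; then
		if [ "$i" = "shift_jis" ]; then
			echo "Tag for \"$fname\" already encoded in shift_jis, nothing to do."
			exit 1
		fi
		echo "Converting Tag for \"$fname\" from $i to shift_jis..."
		# Turn each "Field: value" line into a Field='value' assignment.
		# Values containing a single quote will break the eval.
		id3 -l -R "$fname" |
			iconv -f "$i" -t euc-jp |
			sed -e "1d;s/: /='/;s/ *$/'/" | (
			while read -r line; do
				eval "$line"
			done
			echo "$Artist - $Title ($Album)"
			JISTITLE=$(echo "$Title" | nkf -E -s)
			JISARTIST=$(echo "$Artist" | nkf -E -s)
			JISALBUM=$(echo "$Album" | nkf -E -s)
			id3 -t "$JISTITLE" -a "$JISARTIST" -A "$JISALBUM" "$Filename" >/dev/null
		)
		# An "exit" inside the ( ... ) subshell only leaves the subshell,
		# so stop here explicitly instead of trying the remaining encodings.
		exit $?
	fi
done
echo "Unknown Tag encoding."
exit 2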
-------cut here-----------------
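
For the original question (plain files that are either EUC or SJIS), the
same iconv trick can be wrapped in a small function. This is only a
sketch under a few assumptions: detect_encoding is a name made up here,
the candidate names EUC-JP and SHIFT_JIS assume an iconv that knows them
(glibc does), and EUC-JP is tried first because EUC-JP kana bytes are
also valid Shift_JIS (they read as half-width katakana), so the reverse
order would misreport such files. It is a heuristic, not a proof: a pure
ASCII file passes every candidate and is reported as the first one.

```sh
#!/bin/sh
# detect_encoding FILE [CANDIDATE...]
# Prints the first candidate encoding that iconv accepts for FILE.
# Hypothetical helper for illustration; defaults to EUC-JP, then SHIFT_JIS.
detect_encoding() {
	file="$1"
	shift
	# No candidates given: fall back to the two encodings from the question.
	[ "$#" -gt 0 ] || set -- EUC-JP SHIFT_JIS
	for enc in "$@"; do
		# iconv exits non-zero when FILE contains bytes invalid in $enc.
		if iconv -f "$enc" -t UTF-8 < "$file" >/dev/null 2>&1; then
			echo "$enc"
			return 0
		fi
	done
	echo "unknown"
	return 1
}
```

To list a whole directory, something like
for f in *.txt; do echo "$f: $(detect_encoding "$f")"; done
should do, after which only the files reported as SHIFT_JIS need to go
through nkf or iconv.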

-- 
Tobias								PGP: 0x9AC7E0BC
This mail is made of 100% recycled bits

