Re: [tlug] emacs conversion of HTML Entities



;; Use with the wrong Emacs is an exercise for the reader; Unicode
;; handling is not yet standardized.
;; There oughtta be a way to do this in PSGML but I can't find it;
;; maybe I'll send this one to Lennart.

(progn
 (unless (featurep 'xemacs) (error "ooh yuck you've got the wrong Emacs!"))

 (cond ((emacs-version>= 21 5 7)
        (message "You have been living right!"))
       ((emacs-version>= 21 4)
        (message "The impossible just takes a few more lines.")
        (require 'un-define)
        (defalias 'unicode-to-char 'ucs-to-char))
       (t
        (error "This XEmacs was released before you were born.  Upgrade!")))

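 (defun entities-to-characters ()
   "Replace each &#HEX; reference in the buffer with the character it names.
References that cannot be converted are left unchanged."
   (interactive)
   (goto-char (point-min))
   ;; The hex digits are matched directly after "&#", as in the text
   ;; being converted.
   (while (re-search-forward "&#\\([a-fA-F0-9]+\\);" nil t)
     (message "%s" (match-string 1))    ; show progress
     (let ((char (unicode-to-char (string-to-number (match-string 1) 16))))
       ;; fail-safe: leave the reference alone if there is no such character
       (when char
         (replace-match (char-to-string char) t t))))))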

;; Caveats: I think your entity syntax is wrong and the numbers you give
;; don't make any sense (U+0333 is a combining character, so it won't be
;; in the Japanese repertoire).  But it works for me with &#4E00; and
;; with &#FFFE;.
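
For readers on the "wrong" Emacs, here is a minimal sketch of the same idea for GNU Emacs. It is not the code above and has not been checked against XEmacs. It assumes a GNU Emacs whose characters are Unicode code points (23 or later) and standard numeric character references, decimal &#NNNN; or hexadecimal &#xHHHH;. The names my-decode-char-reference and my-entities-to-characters are made up for illustration.

```elisp
;; Sketch for GNU Emacs only; both function names are hypothetical.

(defun my-decode-char-reference (ref)
  "Return the character named by numeric reference REF, as a string.
REF is \"&#NNNN;\" (decimal) or \"&#xHHHH;\" (hexadecimal).  Return REF
unchanged if it is malformed or outside the Unicode range."
  ;; save-match-data keeps the caller's buffer match intact, because
  ;; string-match below would otherwise overwrite it.
  (save-match-data
    (if (string-match
         "\\`&#\\(?:[xX]\\([0-9a-fA-F]+\\)\\|\\([0-9]+\\)\\);\\'" ref)
        (let ((code (if (match-beginning 1)
                        (string-to-number (match-string 1 ref) 16)
                      (string-to-number (match-string 2 ref) 10))))
          (if (and (integerp code) (<= 0 code #x10FFFF))
              (char-to-string code)
            ref))                       ; fail-safe: out of range
      ref)))                            ; fail-safe: not a valid reference

(defun my-entities-to-characters ()
  "Replace numeric character references in the current buffer."
  (interactive)
  (save-excursion
    (goto-char (point-min))
    (while (re-search-forward "&#[xX]?[0-9a-fA-F]+;" nil t)
      ;; Literal replacement; an undecodable reference is replaced by itself.
      (replace-match (my-decode-char-reference (match-string 0)) t t))))
```

Evaluate both definitions, then run M-x my-entities-to-characters in the buffer to convert. A reference written without the "x", such as &#4E00;, is not a valid decimal reference, so this sketch leaves it alone.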

-- 
Institute of Policy and Planning Sciences     http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.

