stripping HTML tags with Perl
- To: tlug@example.com
- Subject: stripping HTML tags with Perl
- From: "Drew C. Poulin" <poulin@example.com>
- Date: Mon, 04 Dec 2000 13:30:53 -0800
- Content-Transfer-Encoding: 7bit
- Content-Type: Text/Plain; charset=us-ascii
- Reply-To: tlug@example.com
- Resent-From: tlug@example.com
- Resent-Message-ID: <0rIwEC.A.CLC.BzAL6@example.com>
- Resent-Sender: tlug-request@example.com
Pardon me if this is too far off-topic, but I plan to buy Simon Cozens's book soon (once I make myself worthy), so this should be my only question of this kind.

I'm getting my toes wet in Perl by trying to strip some strings, including HTML tags, from a file. What I have so far is below. (The file is named nd2.) The problem line is

    s/<.*?>//ig;

If the tag is <h3>, for example, the substitution above deletes only the 3> portion; it leaves <h untouched. I think I'll be on my way if someone can explain why that happens and what I ought to be doing.

Thanks for any leads.

Drew Poulin

    @ARGV="/home/poulin/scripts/nd2";
    $^I=".bk";
    while (<>) {
        s/diff .*?\n//ig;    #delete lines beginning with diff(sp)
        s/[0-9].*?\n//ig;
        s/\^M//ig;
        s/<.*?>//ig;
        print;
    }
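One possible reading of the symptom, offered as a sketch rather than as the answer given in the thread: the second substitution in the loop, s/[0-9].*?\n//ig, is not anchored, so it removes everything from the first digit on a line through the newline. On a line containing <h3>, that leaves only <h before the tag-stripping pattern ever runs. The example below assumes the intent was to drop lines that begin with "diff " or with a digit, and that ^M was meant as a carriage return; clean_line is a hypothetical helper name, not something from the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): cleans one line.
# Assumption: "diff " and digit lines should be dropped only when the
# match is at the START of the line, hence the ^ anchors.
sub clean_line {
    my ($line) = @_;
    return '' if $line =~ /^diff /;    # drop lines beginning with "diff "
    return '' if $line =~ /^[0-9]/;    # drop lines beginning with a digit
    $line =~ s/\r//g;                  # strip carriage returns (assumed meaning of ^M)
    $line =~ s/<.*?>//g;               # strip tags that open and close on this line
    return $line;
}

# Reproduce the symptom with the unanchored pattern from the post.
my $before = "<h3>Title</h3>\n";
(my $unanchored = $before) =~ s/[0-9].*?\n//ig;

# Show both results; brackets make a missing newline visible.
print "unanchored: [$unanchored]\n";
print "anchored:   [", clean_line($before), "]\n";
```

A regex like <.*?> only handles tags that start and end on the same line and contain no > inside an attribute value; for anything less regular, a real parser such as the CPAN module HTML::Parser is probably the safer route.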
- Follow-Ups:
  - Re: stripping HTML tags with Perl
    - From: Fredric Fredricson <fredric.fredriksson@example.com>
  - stripping HTML tags with Perl
    - From: "Stephen J. Turnbull" <turnbull@example.com>
  - Re: stripping HTML tags with Perl
    - From: Simon Cozens <simon@example.com>