Gwailo B.A. Economics (Hon) Sep 22, 2002 #1 I was wondering if there is any simple utility, either graphical or command-line under Darwin, to strip the HTML tags from an HTML file and leave plain text? A Perl utility would be best, since I do a lot of work remotely on the terminal. TIA
fddi1 Registered Sep 22, 2002 #2 I'm sure there is an HTML parser module for Perl. Try CPAN. You should be able to find at least one, if not several.
vertigo Swollen Member Sep 22, 2002 #3 If all you want to do is strip the tags, you could do something like: $html =~ s/<.+>//g;
Gwailo B.A. Economics (Hon) Sep 23, 2002 #4 Originally posted by vertigo: "if all you want to do is strip the tags, you could do something like $html =~ s/<.+>//g;" That's very kind of you, but I don't know how to put this into a script. I know zero Perl...
hazmat Rusher of Din Sep 23, 2002 #5 There's a utility called html2text on a NetBSD system I have an account on. I'm sure you could find it for Darwin.
Gwailo B.A. Economics (Hon) Sep 23, 2002 #6 While I'm still going to look for a Darwin alternative, I figured out that I could probably just open my web page in IE and select Save As Plain Text. That'll have to suffice for now, but thanks for all the hints, guys!
hazmat Rusher of Din Sep 23, 2002 #7 Here: http://www.google.com/search?hl=en&ie=UTF-8&oe=UTF-8&q=html2text (looks like plenty of options).