Perl modules to remove HTML tags

This page lists Perl modules for removing some or all tags from HTML pages. I started the list because I wanted to build a search facility for some web pages I run, and I needed to strip the HTML tags from the text before indexing the pages.
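As a rough illustration of the indexing use case, here is a minimal sketch using HTML::Strip, one of the modules listed on this page. It assumes HTML::Strip is installed from CPAN. The helper name html_to_text, the whitespace clean-up, and the sample page are my own assumptions for the example and are not part of any module.

```perl
use strict;
use warnings;
use HTML::Strip;    # CPAN module; not part of core Perl

# Hypothetical helper: reduce an HTML page to plain text for an indexer.
sub html_to_text {
    my ($html) = @_;

    my $hs   = HTML::Strip->new();
    my $text = $hs->parse($html);
    $hs->eof;    # reset the parser state so the object could be reused

    # Collapse the runs of whitespace left behind where tags were removed,
    # then trim the ends.
    $text =~ s/\s+/ /g;
    $text =~ s/^ | $//g;

    return $text;
}

# Made-up sample input, for illustration only.
my $page = '<html><body><h1>Search</h1><p>Hello <b>world</b></p></body></html>';
print html_to_text($page), "\n";
```

The other modules in the list have different interfaces and different goals (several are aimed at sanitizing untrusted HTML rather than extracting text), so this sketch should not be read as representative of all of them.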

Modules

HTML::Clean

HTML::Defang

HTML::Detoxifier

My review of version 0.02 on CPAN Ratings

HTML::EscapeEvil

HTML::Laundry

My review of version 0.0107 on CPAN Ratings

HTML::Restrict

My review of version 2.2.3 on CPAN Ratings

HTML::Sanitizer

This module is mentioned in the documentation of several other modules, which is why I include it here. However, it is now available only from BackPAN (backpan.perl.org), the archive of distributions that have been deleted from CPAN, so it is no longer supported.

HTML::Scrubber

My review of version 0.15 on CPAN Ratings

HTML::Strip

My review of version 2.10 on CPAN Ratings

HTML::TagFilter

My review of version 1.03 on CPAN Ratings

HTML::Trim

HTML::Truncate

MojoMojo::Declaw


Copyright © Ben Bullock 2009-2023. All rights reserved. For comments, questions, and corrections, please email Ben Bullock (benkasminbullock@gmail.com) or use the discussion group at Google Groups.