The following Perl script tells you which characters match a regular expression.
Its source code is given below. To use it, substitute your regular
expression for the argument of count_match on the final line.
#!/usr/local/bin/perl
use warnings;
use strict;
use Unicode::UCD 'charinfo';

binmode STDOUT, ":utf8";

sub count_match
{
    my ($re) = @_;
    my $overflow;
    # Print a maximum of $max_chars characters.
    my $max_chars = 50;
    my $total_characters = 0;
    # All the Unicode characters we're allowed. Found by trial and
    # error.
    for my $n (0x00 .. 0xD7FF, 0xE000 .. 0xFDCF, 0xFDF0 .. 0xFFFD) {
        if (chr ($n) =~ /$re/) {
            if ($total_characters < $max_chars) {
                # charinfo returns a false value for code points it
                # does not know about, so fall back to "?".
                my $name = "?";
                my $charinfo = charinfo ($n);
                if ($charinfo) {
                    $name = $charinfo->{name};
                }
                printf "%04X: '%s' %s\n", $n, chr $n, $name;
            }
            elsif (! $overflow) {
                $overflow = 1;
                print "Printing only first $max_chars.\n";
            }
            $total_characters++;
        }
    }
    print "\n$total_characters characters match.\n";
}

# $re is a placeholder. Substitute your own regular expression for it,
# for example qr/\d/, before running the script.
count_match ($re);
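As a rough illustration of the kind of argument count_match expects, here is a minimal self-contained sketch. The helper matching_code_points is hypothetical and is not part of the script. It applies the same test, chr ($n) matched against the regular expression, but only over the ASCII range, so that the output is short and does not depend on the Unicode version.

```perl
#!/usr/local/bin/perl
use warnings;
use strict;

# Hypothetical helper, not part of the original script: return the
# code points from $from to $to whose characters match $re.
sub matching_code_points
{
    my ($re, $from, $to) = @_;
    return grep { chr ($_) =~ $re } $from .. $to;
}

# Restrict the search to ASCII (0x00 to 0x7F) to keep the output short.
my @digits = matching_code_points (qr/\d/, 0x00, 0x7F);
printf "%d matches: %s\n", scalar @digits, join (' ', map { chr } @digits);
# prints "10 matches: 0 1 2 3 4 5 6 7 8 9"
```

Passing the regular expression as a qr// object, as in this sketch, means it can be handed around like any other scalar, and it should work equally well as the argument to count_match.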
Note that the results differ slightly depending on the version of Perl, since each version of Perl comes with its own copy of the Unicode character database, and this changed between Perl 5.8 and 5.10.
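If you are comparing results from two machines, it may help to check which Unicode version each Perl uses. The following is a small sketch, assuming that your copy of Unicode::UCD provides the UnicodeVersion function. It prints the version number of the running Perl alongside the version of the Unicode database it was built with.

```perl
#!/usr/local/bin/perl
use warnings;
use strict;
use Unicode::UCD;

# $] holds the version number of the running Perl. UnicodeVersion
# returns the version of the Unicode database which that Perl uses.
my $unicode_version = Unicode::UCD::UnicodeVersion ();
printf "Perl %s uses Unicode %s\n", $], $unicode_version;
```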