Unicode numbers are not numeric in Perl
The following program demonstrates that Perl does not treat Unicode
digit characters as numbers in arithmetic, even though they match the
regular expression \d.
#!perl
use warnings;
use strict;
use utf8;
use Scalar::Util 'looks_like_number';
my $count = 1;
# Fullwidth digit one, Unicode U+FF11.
my $ff11 = '１';
my $warned;
if ($ff11 =~ /\d/) {
    print "ok $count\n";
}
else {
    print "not ok $count\n";
}
$count++;
# Catch warnings.
$SIG{__WARN__} = sub { $warned = "@_"; };
if ($ff11 >= 1) {
    print "ok $count\n";
}
else {
    print "not ok $count\n";
}
$count++;
if ($warned) {
    print "not ok $count - warning '$warned'\n";
}
else {
    print "ok $count - no warnings\n";
}
$count++;
if (looks_like_number ($ff11)) {
    print "ok $count\n";
}
else {
    print "not ok $count\n";
}
print "1..$count\n";
This produces the following output:
ok 1
not ok 2
not ok 3 - warning 'Argument "\x{ff11}" isn't numeric in numeric ge (>=) at /usr/home/ben/lemoda/perl/perl-numeric/ff11.pl line 19.
'
not ok 4
1..4
Thus, when checking whether a string can safely be used in arithmetic,
it is better to match digits with [0-9] than with \d.
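As a minimal sketch of the difference, the following compares the two kinds of check. The helper names is_ascii_integer, is_unicode_integer and is_ascii_integer_a are hypothetical and not taken from any module. The third helper assumes Perl 5.14 or later, where the /a modifier restricts \d to the ASCII digits.

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration; the names are not from any module.

# True only for strings made entirely of the ASCII digits 0-9.
sub is_ascii_integer {
    my ($string) = @_;
    return $string =~ /\A[0-9]+\z/ ? 1 : 0;
}

# True for strings made of any Unicode decimal digits, which is too
# permissive if the value is going to be used in arithmetic.
sub is_unicode_integer {
    my ($string) = @_;
    return $string =~ /\A\d+\z/ ? 1 : 0;
}

# Same result as is_ascii_integer, using the /a modifier (Perl 5.14+).
sub is_ascii_integer_a {
    my ($string) = @_;
    return $string =~ /\A\d+\z/a ? 1 : 0;
}

my $ff11 = "\x{ff11}";    # Fullwidth digit one.
for my $check (\&is_ascii_integer, \&is_unicode_integer, \&is_ascii_integer_a) {
    # Each line shows the result for '42', then for the fullwidth digit.
    print $check->('42'), ' ', $check->($ff11), "\n";
}
```

This should print "1 0", "1 1" and "1 0": only the plain \d check accepts the fullwidth digit.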
For example, consider Lingua::EN::Numericalize. Line 106 of that module
validates numbers using \d and then does arithmetic on them. This goes
wrong if the input string contains non-ASCII Unicode digits like the
fullwidth '１' in the example program above: the \d check passes, but
Perl does not treat the string as a number.
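If fullwidth digits need to be accepted rather than rejected, one possible approach is to map them onto ASCII digits before validating. The following is only a sketch under that assumption: to_ascii_integer is a hypothetical helper, it is not how Lingua::EN::Numericalize works, and it handles only the fullwidth range U+FF10 to U+FF19, not other Unicode digits.

```perl
use strict;
use warnings;

# Hypothetical helper: map fullwidth digits U+FF10..U+FF19 onto ASCII 0-9,
# then accept the string only if it consists entirely of ASCII digits.
# Returns the numeric value, or undef if the string is not acceptable.
sub to_ascii_integer {
    my ($string) = @_;
    # Work on a copy so that the caller's string is not modified.
    (my $copy = $string) =~ tr/\x{ff10}-\x{ff19}/0-9/;
    return $copy =~ /\A[0-9]+\z/ ? $copy + 0 : undef;
}

# Fullwidth "12" becomes the number 12, so arithmetic gives no warning.
my $value = to_ascii_integer("\x{ff11}\x{ff12}");
print $value + 1, "\n";

# Digits from other scripts, such as Arabic-Indic one (U+0661), are rejected.
print defined to_ascii_integer("\x{0661}") ? "accepted\n" : "rejected\n";
```

This should print 13 and then "rejected".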
Copyright © Ben Bullock 2009-2025. All
rights reserved.
For comments, questions, and corrections, please email
Ben Bullock
(benkasminbullock@gmail.com).