Entropy of discrete probabilities in Perl

This Perl program computes the entropy $$H = -\sum_{i=1}^{K} p_i \log_2 p_i$$ of some discrete probability distributions. It computes the entropy of a single-symbol alphabet, of a four-symbol alphabet with probabilities $1/3$, $1/4$, $1/4$, $1/6$, of alphabets with four, six, and eight equally-likely symbols, of a two-symbol alphabet with probabilities $1/n$ and $1 - 1/n$, and of an alphabet of $n + 1$ symbols with probabilities $(1/2, 1/4, \ldots, 1/2^n, 1/2^n)$.
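
As a check on the formula, the four-symbol case can be worked by hand: $$H = \tfrac{1}{3}\log_2 3 + 2 \cdot \tfrac{1}{4}\log_2 4 + \tfrac{1}{6}\log_2 6 \approx 0.52832 + 1 + 0.43083 \approx 1.95915,$$ which agrees with the program's output below.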

This is the output of the program:

Entropy of a single constant:
0

Entropy of alphabet with probabilities 1/3, 1/4, 1/4, 1/6:
1.95914791702724

Entropy of four equally-likely symbols:
2

Entropy of six equally-likely symbols:
2.58496250072116 (log_2 (6) = 2.58496250072116)

Entropy of eight equally-likely symbols:
3

Entropy of a two-symbol alphabet with probabilities 1/n, 1 - 1/n:
1 0
2 1
3 0.91829583405449
4 0.811278124459133
5 0.721928094887362
6 0.650022421648354
7 0.591672778582327
8 0.543564443199596
9 0.503258334775646
10 0.468995593589281
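
This is the binary entropy function $$H_b(p) = -p \log_2 p - (1 - p) \log_2 (1 - p)$$ evaluated at $p = 1/n$: it is zero when one symbol is certain ($n = 1$), reaches its maximum of one bit at the fair case $n = 2$, and decreases towards zero again as the distribution becomes more lopsided.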

Entropy of an (n+1)-symbol alphabet, probs (1/2, 1/4, ..., 1/(2^n), 1/(2^n)):
n = 2: entropy = 1.5
n = 3: entropy = 1.75
n = 4: entropy = 1.875
n = 5: entropy = 1.9375
n = 6: entropy = 1.96875
n = 7: entropy = 1.98438
n = 8: entropy = 1.99219
n = 9: entropy = 1.99609
n = 10: entropy = 1.99805
n = 11: entropy = 1.99902
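
This last case has a closed form: the term $1/2^j$ contributes $j/2^j$ bits, so $$H_n = \sum_{j=1}^{n} \frac{j}{2^j} + \frac{n}{2^n} = 2 - 2^{1-n},$$ which approaches two bits as $n$ grows. For example, $H_{11} = 2 - 2^{-10} \approx 1.99902$, matching the last row of the output.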


This is the program itself:

#!/home/ben/software/install/bin/perl
use warnings;
use strict;

# Tolerance used when checking that the probabilities sum to 1.

use constant {
    eps => 1e-10,
};

print "Entropy of a single constant:\n";

print &entropy (1), "\n\n";

print "Entropy of alphabet with probabilities 1/3/, 1/4, 1/4, 1/6:\n";

my @p = (1/3, 1/4, 1/4, 1/6);

print &entropy (@p), "\n\n";

print "Entropy of four equally-likely symbols:\n";

my @p3 = (1/4) x 4;

print &entropy (@p3), "\n\n";

print "Entropy of six equally-likely symbols:\n";

my @p2 = (1/6) x 6;

print &entropy (@p2), " (log_2 (6) = ", log2 (6), ")\n\n";

print "Entropy of eight equally-likely symbols:\n";

my @p4 = (1/8) x 8;

print &entropy (@p4), "\n\n";

print "Entropy of a two-symbol alphabet with probabilities 1/n, 1 - 1/n:\n";

for my $factor (1..10) {
    my @p5 = (1 - 1/$factor, 1/$factor);
    print $factor, " ", entropy (@p5), "\n";
}

print "\n";

print "Entropy of an n-symbol alphabet, probs (1/2, 1/4, ... 1/(2^n), 1/(2^n)):\n";

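# For each n, build the probabilities 1/2, 1/4, ..., 1/2^n, then repeat
# the last one so that the whole list sums to exactly 1.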
for my $factor (1..10) {
    my @q;
    for my $j (0..$factor) {
        $q[$j] = 1/(2**($j+1));
    }
    $q[$factor + 1] = 1/(2**($factor+1));
    printf "n = %d: entropy = %g\n", $factor + 1, entropy (@q);
}

print "\n";

exit;

# Log to the base 2, so log2 (2) = 1, log2 (4) = 2, etc.

sub log2
{
    my ($n) = @_;
    return log ($n) / log (2);
}

# Given a set of discrete probabilities, "@p", compute the entropy.

sub entropy
{
    my (@p) = @_;

    # Do some checks on the inputs before proceeding.

    my $sum = 0;
    for (@p) {
        if ($_ < 0 || $_ > 1) {
            warn "$_ is not a probability.\n";
            return undef;
        }
        $sum += $_;
    }
    if (abs ($sum - 1) > eps) {
        warn "@p does not sum to 1.\n";
        return undef;
    }

    # The return value.

    my $h = 0;

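    # Terms with zero probability contribute nothing, by the usual
    # convention that 0 * log2 (0) = 0 (the limit of x log x as x -> 0).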
    for (@p) {
        if ($_ == 0) {
            next;
        }
        $h += - $_ * log2 ($_);
    }
    return $h;
}
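
As a usage sketch, not part of the original program, the same entropy sub can be applied to an empirical distribution, such as the character frequencies of a string. The example string here is arbitrary, and the fragment assumes the entropy and log2 subs above are in scope:

my $text = 'abracadabra';
my %count;
$count{$_}++ for split //, $text;
my @freq = map { $_ / length $text } values %count;
printf "Empirical entropy of '%s': %g bits\n", $text, entropy (@freq);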
