Perl modules to find the MIME type of a file

This page compares CPAN Perl modules which can be used to detect the MIME type of a file. (The MIME type is a short string like "image/png" or "text/html", which is sent along with a file over the internet so that the receiver can work out what kind of file it is.)

List of modules

File::LibMagic
My review of version 1.15 on CPAN Ratings

File::LibMagic::FFI

File::MMagic
My review of version 1.30 on CPAN Ratings

File::MMagic::XS
My review of version 0.09008 on CPAN Ratings

File::MimeInfo
My review of version 0.28 on CPAN Ratings

File::Type
My review of version 0.22 on CPAN Ratings

File::Type::WebImages
My review of version 1.01 on CPAN Ratings

MIME::Detect
My review of version 0.08 on CPAN Ratings

MIME::Type::FileName
My review of version 1.0 on CPAN Ratings

MIME::Types
My review of version 2.13 on CPAN Ratings

Media::Type::Simple
My review of version v0.31.0 on CPAN Ratings

Padre::MIME

Outputs

In this section, I show the outputs of the modules on various types of files. The source code which produces these results is shown in the "Test program" section. The accuracy scores of the modules are shown in the "Scores" section.

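If there is an official MIME type for the file type, it is given in quotes after the heading, and the results of the modules which report it correctly are highlighted in blue. File types with no recognised MIME type have nothing highlighted. The final "Valid-UTF-8" row is not a module result; it records whether the bytes of the file are valid UTF-8.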
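Some of the calls in the test program work from the file's name alone (for example type_from_ext in Media::Type::Simple and mime_types_from_name in MIME::Detect), while modules such as File::LibMagic examine the file's contents. The following is a minimal sketch of that difference using only core Perl. It is not how any of the modules above is implemented: the lookup tables and the names type_from_name and type_from_bytes are made up for illustration, and real modules use much larger databases.

```perl
use strict;
use warnings;

# Hypothetical lookup table for illustration only.
my %type_by_ext = (
    gif => 'image/gif',
    png => 'image/png',
    pdf => 'application/pdf',
);

# Leading bytes ("magic numbers") of three well-known formats.
my @signatures = (
    [ "GIF87a",            'image/gif' ],
    [ "GIF89a",            'image/gif' ],
    [ "\x89PNG\r\n\x1a\n", 'image/png' ],
    [ "%PDF-",             'application/pdf' ],
);

# Guess from the file name alone. Returns undef if the name has no
# extension or the extension is not in the table.
sub type_from_name {
    my ($name) = @_;
    my ($ext) = $name =~ /\.([^.\/]+)$/ or return;
    return $type_by_ext{ lc $ext };
}

# Guess from the first bytes of the content. Returns undef if no
# signature matches.
sub type_from_bytes {
    my ($bytes) = @_;
    for my $sig (@signatures) {
        my ($magic, $type) = @$sig;
        return $type if index ($bytes, $magic) == 0;
    }
    return;
}

# PNG data stored under the misleading name "picture.gif".
my $bytes = "\x89PNG\r\n\x1a\n" . "\0" x 8;
printf "from name:  %s\n", type_from_name ('picture.gif') // 'unknown';
printf "from bytes: %s\n", type_from_bytes ($bytes) // 'unknown';
```

In this sketch the name-based guess is "image/gif" and the content-based guess is "image/png". This is the same kind of disagreement which appears in the "from file" and "from name" rows of MIME::Detect below.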

Very long C file in UTF-8

File::LibMagic encoding binary
File::LibMagic mime type application/octet-stream
File::LibMagic::FFI application/octet-stream
File::MMagic text/plain
File::MMagic::XS text/plain
File::MimeInfo text/x-csrc
File::Type application/octet-stream
File::Type::WebImages undefined
MIME::Detect: from file text/plain; text/x-csrc
MIME::Detect: from name text/x-csrc
MIME::Type::FileName text/x-c
MIME::Types text/x-csrc
Media::Type::Simple text/x-csrc
Valid-UTF-8 Yes

Short SVG image file in UTF-8 ("image/svg+xml")

File::LibMagic encoding us-ascii
File::LibMagic mime type image/svg+xml
File::LibMagic::FFI image/svg+xml
File::MMagic text/plain
File::MMagic::XS text/xml
File::MimeInfo image/svg+xml
File::Type application/octet-stream
File::Type::WebImages undefined
MIME::Detect: from file text/plain; application/xml
MIME::Detect: from name image/svg+xml
MIME::Type::FileName image/svg+xml
MIME::Types image/svg+xml
Media::Type::Simple image/svg+xml
Valid-UTF-8 Yes

GIF image ("image/gif")

File::LibMagic encoding binary
File::LibMagic mime type image/gif
File::LibMagic::FFI image/gif
File::MMagic image/gif
File::MMagic::XS image/gif
File::MimeInfo image/gif
File::Type image/gif
File::Type::WebImages image/gif
MIME::Detect: from file image/gif
MIME::Detect: from name image/gif
MIME::Type::FileName image/gif
MIME::Types image/gif
Media::Type::Simple image/gif
Valid-UTF-8 No

Binary executable

File::LibMagic encoding binary
File::LibMagic mime type application/x-executable
File::LibMagic::FFI application/x-executable
File::MMagic application/octet-stream
File::MMagic::XS application/x-executable
File::MimeInfo application/octet-stream
File::Type application/x-executable-file
File::Type::WebImages undefined
MIME::Detect: from file 
MIME::Detect: from name 
MIME::Type::FileName application/octet-stream
MIME::Types unknown
Media::Type::Simple no extension
Valid-UTF-8 No

PNG image ("image/png")

File::LibMagic encoding binary
File::LibMagic mime type image/png
File::LibMagic::FFI image/png
File::MMagic image/png
File::MMagic::XS image/png
File::MimeInfo image/png
File::Type image/x-png
File::Type::WebImages image/png
MIME::Detect: from file image/png
MIME::Detect: from name image/png
MIME::Type::FileName image/png
MIME::Types image/png
Media::Type::Simple image/png
Valid-UTF-8 No

EUC-JP text data

File::LibMagic encoding iso-8859-1
File::LibMagic mime type text/plain
File::LibMagic::FFI text/plain
File::MMagic text/plain
File::MMagic::XS text/plain
File::MimeInfo text/plain
File::Type application/octet-stream
File::Type::WebImages undefined
MIME::Detect: from file text/plain
MIME::Detect: from name 
MIME::Type::FileName application/octet-stream
MIME::Types unknown
Media::Type::Simple no extension
Valid-UTF-8 No

JPEG image data ("image/jpeg")

File::LibMagic encoding binary
File::LibMagic mime type image/jpeg
File::LibMagic::FFI image/jpeg
File::MMagic image/jpeg
File::MMagic::XS image/jpeg
File::MimeInfo image/jpeg
File::Type image/jpeg
File::Type::WebImages image/jpeg
MIME::Detect: from file image/jpeg
MIME::Detect: from name image/jpeg
MIME::Type::FileName image/pjpeg
MIME::Types image/jpeg
Media::Type::Simple image/jpeg
Valid-UTF-8 No

BMP image file ("image/bmp")

File::LibMagic encoding binary
File::LibMagic mime type image/x-ms-bmp
File::LibMagic::FFI image/x-ms-bmp
File::MMagic image/bmp
File::MMagic::XS image/x-ms-bmp
File::MimeInfo image/bmp
File::Type image/x-bmp
File::Type::WebImages image/bmp
MIME::Detect: from file 
MIME::Detect: from name image/bmp
MIME::Type::FileName image/x-windows-bmp
MIME::Types image/x-bmp
Media::Type::Simple image/x-ms-bmp
Valid-UTF-8 No

X bitmap (xbm) file

File::LibMagic encoding us-ascii
File::LibMagic mime type text/plain
File::LibMagic::FFI text/plain
File::MMagic text/plain
File::MMagic::XS text/plain
File::MimeInfo image/x-xbitmap
File::Type application/octet-stream
File::Type::WebImages undefined
MIME::Detect: from file text/plain
MIME::Detect: from name image/x-xbitmap
MIME::Type::FileName image/xbm
MIME::Types image/x-xbitmap
Media::Type::Simple image/x-xbitmap
Valid-UTF-8 Yes

MNG video data

File::LibMagic encoding binary
File::LibMagic mime type video/x-mng
File::LibMagic::FFI video/x-mng
File::MMagic application/octet-stream
File::MMagic::XS video/x-mng
File::MimeInfo video/x-mng
File::Type application/octet-stream
File::Type::WebImages undefined
MIME::Detect: from file video/x-mng
MIME::Detect: from name video/x-mng
MIME::Type::FileName application/octet-stream
MIME::Types video/x-mng
Media::Type::Simple video/x-mng
Valid-UTF-8 No

Microsoft Office Excel file (old format) ("application/vnd.ms-excel")

File::LibMagic encoding application/vnd.ms-excelbinary
File::LibMagic mime type application/vnd.ms-excel
File::LibMagic::FFI application/vnd.ms-excel
File::MMagic application/msword
File::MMagic::XS application/msword
File::MimeInfo application/vnd.ms-excel
File::Type application/octet-stream
File::Type::WebImages undefined
MIME::Detect: from file application/x-ole-storage
MIME::Detect: from name application/vnd.ms-excel
MIME::Type::FileName application/vnd.ms-excel
MIME::Types application/vnd.ms-excel
Media::Type::Simple application/vnd.ms-excel
Valid-UTF-8 No

Microsoft Office Word file (old format) ("application/msword")

File::LibMagic encoding application/mswordbinary
File::LibMagic mime type application/msword
File::LibMagic::FFI application/msword
File::MMagic application/msword
File::MMagic::XS application/msword
File::MimeInfo application/msword
File::Type application/octet-stream
File::Type::WebImages undefined
MIME::Detect: from file application/msword; application/x-ole-storage
MIME::Detect: from name application/msword
MIME::Type::FileName application/msword
MIME::Types application/msword
Media::Type::Simple application/msword
Valid-UTF-8 No

Gzip file ("application/gzip")

File::LibMagic encoding binary
File::LibMagic mime type application/x-gzip
File::LibMagic::FFI application/x-gzip
File::MMagic application/x-gzip
File::MMagic::XS text/plain
File::MimeInfo application/gzip
File::Type application/x-gzip
File::Type::WebImages undefined
MIME::Detect: from file application/gzip
MIME::Detect: from name application/gzip
MIME::Type::FileName application/x-gzip
MIME::Types application/gzip
Media::Type::Simple application/x-gzip
Valid-UTF-8 No

XHTML file

File::LibMagic encoding utf-8
File::LibMagic mime type application/xml
File::LibMagic::FFI application/xml
File::MMagic text/html
File::MMagic::XS text/xml
File::MimeInfo text/html
File::Type application/octet-stream
File::Type::WebImages undefined
MIME::Detect: from file application/xhtml+xml; text/plain; text/html; application/xml
MIME::Detect: from name text/html
MIME::Type::FileName text/html
MIME::Types text/html
Media::Type::Simple text/html
Valid-UTF-8 Yes

HTML5 file ("text/html")

File::LibMagic encoding utf-8
File::LibMagic mime type text/html
File::LibMagic::FFI text/html
File::MMagic text/html
File::MMagic::XS text/html
File::MimeInfo text/html
File::Type application/octet-stream
File::Type::WebImages undefined
MIME::Detect: from file text/plain; text/html
MIME::Detect: from name text/html
MIME::Type::FileName text/html
MIME::Types text/html
Media::Type::Simple text/html
Valid-UTF-8 Yes

Empty file

File::LibMagic encoding binary
File::LibMagic mime type inode/x-empty
File::LibMagic::FFI inode/x-empty
File::MMagic x-system/x-unix; empty
File::MMagic::XS x-system/x-unix; empty
File::MimeInfo text/plain
File::Type application/octet-stream
File::Type::WebImages undefined
MIME::Detect: from file text/plain
MIME::Detect: from name 
MIME::Type::FileName application/octet-stream
MIME::Types unknown
Media::Type::Simple no extension
Valid-UTF-8 Yes

JSON file ("application/json")

File::LibMagic encoding utf-8
File::LibMagic mime type text/plain
File::LibMagic::FFI text/plain
File::MMagic text/plain
File::MMagic::XS text/plain
File::MimeInfo application/json
File::Type application/octet-stream
File::Type::WebImages undefined
MIME::Detect: from file text/plain
MIME::Detect: from name application/json
MIME::Type::FileName application/octet-stream
MIME::Types application/json
Media::Type::Simple Unknown extension
Valid-UTF-8 Yes

Manual page in troff format ("text/troff")

File::LibMagic encoding iso-8859-1
File::LibMagic mime type text/troff
File::LibMagic::FFI text/troff
File::MMagic text/x-roff
File::MMagic::XS text/plain
File::MimeInfo text/plain
File::Type application/octet-stream
File::Type::WebImages undefined
MIME::Detect: from file text/plain
MIME::Detect: from name 
MIME::Type::FileName application/octet-stream
MIME::Types unknown
Media::Type::Simple Unknown extension
Valid-UTF-8 No

TrueType font ("font/ttf")

File::LibMagic encoding binary
File::LibMagic mime type application/x-font-ttf
File::LibMagic::FFI application/x-font-ttf
File::MMagic application/octet-stream
File::MMagic::XS text/plain
File::MimeInfo application/x-font-ttf
File::Type font/ttf
File::Type::WebImages undefined
MIME::Detect: from file application/x-font-ttf
MIME::Detect: from name application/x-font-ttf
MIME::Type::FileName application/octet-stream
MIME::Types application/x-font-ttf
Media::Type::Simple Unknown extension
Valid-UTF-8 No

CSV file in Shift-JIS encoding ("text/csv")

File::LibMagic encoding unknown-8bit
File::LibMagic mime type text/plain
File::LibMagic::FFI text/plain
File::MMagic text/plain
File::MMagic::XS text/plain
File::MimeInfo text/csv
File::Type application/octet-stream
File::Type::WebImages undefined
MIME::Detect: from file text/plain
MIME::Detect: from name text/csv
MIME::Type::FileName text/csv
MIME::Types text/csv
Media::Type::Simple text/csv
Valid-UTF-8 No

JavaScript program ("application/javascript")

File::LibMagic encoding us-ascii
File::LibMagic mime type text/plain
File::LibMagic::FFI text/plain
File::MMagic text/plain
File::MMagic::XS text/plain
File::MimeInfo application/javascript
File::Type application/octet-stream
File::Type::WebImages undefined
MIME::Detect: from file text/plain
MIME::Detect: from name application/javascript
MIME::Type::FileName application/x-javascript
MIME::Types application/javascript
Media::Type::Simple application/javascript
Valid-UTF-8 Yes

Ogg audio ("audio/ogg")

File::LibMagic encoding binary
File::LibMagic mime type audio/ogg
File::LibMagic::FFI audio/ogg
File::MMagic application/octet-stream
File::MMagic::XS application/ogg
File::MimeInfo audio/ogg
File::Type application/octet-stream
File::Type::WebImages undefined
MIME::Detect: from file application/ogg; audio/ogg; video/ogg
MIME::Detect: from name audio/x-vorbis+ogg; audio/x-flac+ogg; audio/x-speex+ogg; video/x-theora+ogg; audio/ogg; video/ogg
MIME::Type::FileName application/ogg
MIME::Types audio/ogg
Media::Type::Simple application/ogg
Valid-UTF-8 No

PDF document ("application/pdf")

File::LibMagic encoding binary
File::LibMagic mime type application/pdf
File::LibMagic::FFI application/pdf
File::MMagic application/pdf
File::MMagic::XS application/pdf
File::MimeInfo application/pdf
File::Type application/pdf
File::Type::WebImages undefined
MIME::Detect: from file application/pdf; text/plain; text/x-matlab; text/x-tex
MIME::Detect: from name application/pdf
MIME::Type::FileName application/pdf
MIME::Types application/pdf
Media::Type::Simple application/pdf
Valid-UTF-8 No

Scores

File::MimeInfo 14/16
MIME::Detect: from name 14/16
MIME::Types 13/16
File::LibMagic::FFI 10/16
File::LibMagic mime type 10/16
Media::Type::Simple 10/16
MIME::Type::FileName 8/16
MIME::Detect: from file 8/16
File::MMagic 7/16
File::MMagic::XS 6/16
File::Type::WebImages 4/16
File::Type 4/16

Each score is the number of files for which the module reported the correct MIME type, counted over the 16 file types above which have an authoritative MIME type. The "hits" are shown with a blue background above.
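The test program counts a hit when the expected type matches the module's output as a whole word, using the pattern /\b\Q$ok\E\b/. The following sketch isolates that rule with core Perl only; the helper name is_hit and the sample cases are illustrative, although the type strings are taken from the tables above.

```perl
use strict;
use warnings;

# Returns 1 if the expected MIME type occurs as a whole "word" in the
# module's output, which may list several types separated by "; ".
# Returns 0 otherwise, including when the module returned nothing.
sub is_hit {
    my ($expected, $output) = @_;
    return 0 unless defined $output;
    return $output =~ /\b\Q$expected\E\b/ ? 1 : 0;
}

my @cases = (
    # expected            module output
    [ 'text/html',        'text/plain; text/html' ],
    [ 'application/gzip', 'application/x-gzip' ],
    [ 'image/png',        'image/x-png' ],
    [ 'image/jpeg',       'image/jpeg' ],
);
for my $case (@cases) {
    my ($expected, $output) = @$case;
    printf "%-18s %-24s %s\n", $expected, $output,
        is_hit ($expected, $output) ? 'hit' : 'miss';
}
```

Under this rule a module which returns several candidates, as MIME::Detect does, scores a hit if any one of them is right, while an "x-" variant such as "application/x-gzip" counts as a miss.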

Test program

The program I used to compare the modules is this:

#!/home/ben/software/install/bin/perl

# Test various file-to-mime programs on CPAN.

use warnings;
use strict;
use utf8;
use FindBin '$Bin';
use Unicode::UTF8 'valid_utf8';
use File::Slurper 'read_binary';
use Table::Readable 'read_table';
use HTML::Make;
use Getopt::Long;
use List::UtilsBy 'nsort_by';

# Candidate modules

use File::LibMagic;
use File::MMagic;
use File::MMagic::XS ':compat';
use File::MimeInfo;
use File::Type;
use File::Type::WebImages ();
use File::LibMagic::FFI;
use Media::Type::Simple;
use MIME::Types;
use MIME::Type::FileName;

GetOptions (
    html => \my $html,
    mimedetect => \my $mimedetect,
);

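if ($mimedetect) {
    # MIME::Detect is optional, so load it at run time and stop with a
    # clear message if it is not installed.
    eval "use MIME::Detect; 1" or die "Cannot load MIME::Detect: $@";
}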


my $flm = File::LibMagic->new ();
my $mm = File::MMagic->new ();
my $mmx = File::MMagic::XS->new ();
my $ft = File::Type->new();
my $mt    = MIME::Types->new();
my $mime;
if ($mimedetect) {
    $mime = MIME::Detect->new();
}
my $magic = File::LibMagic::FFI->new ();

my @files = read_table ("$Bin/good-bad.txt");

my @fresults;

# Number of the mime types for each module/method which are correct.

my %score;

# Total of known mime types

my $total;

for my $entry (@files) {
    my %results;
    $results{desc} = $entry->{desc};
    my $tfile = $entry->{file};
    my $i = $flm->info_from_filename ($tfile);
    $results{'File::LibMagic mime type'} = $i->{mime_type};
    $results{'File::LibMagic encoding'} = $i->{encoding};
    my $res = $mm->checktype_filename ($tfile);
    $results{'File::MMagic'} = $res;
    my $resxs = $mmx->checktype_filename ($tfile);
    $results{'File::MMagic::XS'} = $resxs;
    my $type_from_data = $ft->checktype_filename($tfile);
    $results{'File::Type'} = $type_from_data;
    my $mime_type = mimetype($tfile);
    $results{'File::MimeInfo'} = $mime_type;
    my $ffi = $magic->checktype_filename($tfile);
    $ffi =~ s/;.*$//;
    $results{'File::LibMagic::FFI'} = $ffi;


    my $ext = $tfile;
    if ($ext =~ s!^.*\.!!) {
        # Media::Type::Simple throws an exception when given an
        # extension it doesn't know about.
        my $media_type;
        eval {
            $media_type = type_from_ext ($ext);
        };
        if ($@ && $@ =~ /Unknown extension/) {
            $media_type = 'Unknown extension';
        }
        $results{'Media::Type::Simple'} = $media_type;
    }
    else {
        $results{'Media::Type::Simple'} = "no extension";
    }
    my $mt_type  = $mt->mimeTypeOf($tfile);
    if (! defined $mt_type) {
        $mt_type = 'unknown';
    }
    $results{'MIME::Types'} = $mt_type;
    if ($mimedetect) {
        my @types = $mime->mime_types($tfile);
        $results{"MIME::Detect: from file"} = join '; ', (map {$_->mime_type} @types);
        my @ntypes = $mime->mime_types_from_name ($tfile);
        $results{"MIME::Detect: from name"} = join '; ', (map {$_->mime_type} @ntypes);
    }
    my $ftw = File::Type::WebImages::mime_type ($tfile);
    if (! defined ($ftw)) {
        $ftw = 'undefined';
    }
    $results{'File::Type::WebImages'} = $ftw;
    $results{'MIME::Type::FileName'} = MIME::Type::FileName::guess ($tfile);
    my $bytes = read_binary ($tfile);
    # This gives exactly the same results.
    #my $flmsi = $flm->info_from_string ($bytes);
    #    $results{'File::LibMagic from string'} = $flmsi->{mime_type};
    my $valid_utf8 = valid_utf8 ($bytes);
    $results{'Valid-UTF-8'} = $valid_utf8 ? 'Yes' : 'No';
    my $ok = $entry->{ok};
    if ($ok) {
        for my $k (keys %results) {
            if ($k =~ /encoding/) {
                next;
            }
            my $v = $results{$k};
            # A module which returned nothing cannot score a hit.
            if (defined $v && $v =~ /\b\Q$ok\E\b/) {
                $score{$k}++;
            }
        }
        $results{ok} = $entry->{ok};
        $total++;
    }
    # my $file = `file $tfile`;
    # chomp $file;
    # print "file: $file\n";
    # print "\n";
    push @fresults, \%results;
}
my @scorder = keys %score;
@scorder = reverse (nsort_by {$score{$_}} @scorder);
if ($html) {
    # HTML output
    my $div = HTML::Make->new ('div');
    for my $entry (@fresults) {
        my $desc = ucfirst ($entry->{desc});
        if ($entry->{ok}) {
            $desc .= " (\"$entry->{ok}\")";
        }
        $div->push ('h3', text => $desc);
        my $table = $div->push ('table');
        for my $k (sort keys %$entry) {
            if ($k =~ /^(desc|ok)$/) {
                next;
            }
            my $v = $entry->{$k};
            if (! defined $v) {
                $v = 'undefined';
            }
            my $tr = $table->push ('tr');
            my $ok = $entry->{ok};
            if ($ok && $v =~ /\b\Q$ok\E\b/) {
                $tr->add_attr (style => 'background:skyblue;');
            }
            $tr->add_text ("<td>$k</td><td>&nbsp;</td><td>$v</td>\n");
        }
    }
    $div->push ('h2', text => 'Scores', attr => {id => 'scores'});
    my $table = $div->push ('table');
    for my $k (@scorder) {
        $table->add_text ("<tr><td>$k<td>$score{$k}/$total\n");
    }
    print $div->text ();
}
else {
    # Text output
    for my $entry (@fresults) {
        print "File type: $entry->{desc}";
        if ($entry->{ok}) {
            print " (expect mime type \"$entry->{ok}\")";
        }
        print "\n\n";
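        for my $k (sort keys %$entry) {
            next if $k =~ /^(desc|ok)$/;
            # As in the HTML branch, show a placeholder for missing results.
            my $v = defined $entry->{$k} ? $entry->{$k} : 'undefined';
            print "$k: $v\n";
        }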
        print "\n";
    }
    print "\nScores:\n\n";
    for my $k (@scorder) {
        print "$k: $score{$k} / $total\n";
    }
}
exit;
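The test program obtains the extension for Media::Type::Simple with the substitution s!^.*\.!!, which deletes everything up to the last dot anywhere in the path. The sketch below, in core Perl only, shows what that gives for a few made-up paths, and compares it with a stricter variant which only looks at the final path component. The names ext_simple and ext_basename are hypothetical, and the stricter variant is a suggestion, not part of the program above.

```perl
use strict;
use warnings;

# Same substitution as the test program: delete everything up to and
# including the last dot. Returns undef if the path contains no dot.
sub ext_simple {
    my ($ext) = @_;
    return $ext =~ s!^.*\.!! ? $ext : undef;
}

# Hypothetical stricter variant: the extension must come after the
# last slash, so a dot in a directory name is not taken for one.
sub ext_basename {
    my ($path) = @_;
    my ($ext) = $path =~ m!\.([^./]+)$!;
    return $ext;
}

for my $path ('photo.jpeg', 'archive.tar.gz', 'README',
              '/home/user.name/README') {
    printf "%-24s simple=%-12s basename=%s\n", $path,
        ext_simple ($path) // '(none)', ext_basename ($path) // '(none)';
}
```

The two functions agree on the first three paths. They differ only on the last one, where the simple substitution returns "name/README" because of the dot in the directory name. This would matter only if a test file without an extension were kept in such a directory.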


Copyright © Ben Bullock 2009-2017. All rights reserved. For comments, questions, and corrections, please email Ben Bullock (benkasminbullock@gmail.com) or use the discussion group at Google Groups.