Perl and XS: The typemap
This page was created on Thu May 05 2011 and last changed on Sat Oct 05 2024.

- Introduction
- Concepts
- Example 1: Geometry
- The typemap
- Example 2: Math::Ackermann
- Example 3: Set::Bit
This article discusses how xsubpp converts data between Perl and C. When the Perl interpreter calls a subroutine, it pushes a list of scalars onto the Perl stack. An xsub must get the scalars from the stack and convert them to C form before calling the C routine. When the C routine has finished, the xsub must convert the C data to Perl scalars and put them back on the stack. The C code for these conversions is generated by xsubpp.
Conversion between Perl and C data types is handled with macros and routines in the Perl C API, but the necessary operations vary, depending on the C data types and the direction of the conversion. Consider:
C data type | input | output |
---|---|---|
int n | n = (int) SvIV(ST(0)) | sv_setiv(ST(0), (IV)n) |
double x | x = (double) SvNV(ST(0)) | sv_setnv(ST(0), (double)x) |
char *psz | psz = (char *) SvPV(ST(0),na) | sv_setpv((SV*)ST(0), psz) |
We could imagine a big switch statement inside xsubpp to select the right code fragment for each C data type, but this would be clumsy and inflexible. It would be better to put the code fragments in a table, like the one shown above.
If we start writing such a table, we quickly discover that the mapping between Perl and C datatypes is not one-to-one. As a strongly typed language, C distinguishes more data types than Perl does. For example, these seven C integer types are all converted with essentially the same code fragment, the only variation being the typecast used to quiet the C compiler.
C data type | input | output |
---|---|---|
int n | n = (int)SvIV(ST(0)) | sv_setiv(ST(0), (IV)n) |
unsigned n | n = (unsigned)SvIV(ST(0)) | sv_setiv(ST(0), (IV)n) |
unsigned int n | n = (unsigned int)SvIV(ST(0)) | sv_setiv(ST(0), (IV)n) |
long n | n = (long)SvIV(ST(0)) | sv_setiv(ST(0), (IV)n) |
unsigned long n | n = (unsigned long)SvIV(ST(0)) | sv_setiv(ST(0), (IV)n) |
short n | n = (short)SvIV(ST(0)) | sv_setiv(ST(0), (IV)n) |
unsigned short n | n = (unsigned short)SvIV(ST(0)) | sv_setiv(ST(0), (IV)n) |
In view of this, xsubpp uses a two-level mapping. First, it maps C data types to XS types, like this:
C data type | XS type |
---|---|
int | T_IV |
unsigned | T_IV |
char | T_CHAR |
char * | T_PV |
Then it maps the XS types to code fragments, in two tables: one for input
XS type | input code fragment |
---|---|
T_IV | $var = ($ntype)SvIV($arg) |
T_CHAR | $var = (char)*SvPV($arg,na) |
T_PV | $var = ($ntype)SvPV($arg,na) |
and one for output
XS type | output code fragment |
---|---|
T_IV | sv_setiv ($arg, (IV)$var); |
T_CHAR | sv_setpvn($arg, (char *)&$var, 1); |
T_PV | sv_setpv ((SV*)$arg, $var); |
These tables constitute the typemap.
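The two-level lookup can be sketched in Perl with three hashes. This is only an illustration of the idea, not xsubpp's actual data structures; the hash names and the `fragments_for` helper are hypothetical, and the entries are copied from the tables above.

```perl
use strict;
use warnings;

# First level: C data type -> XS type (entries taken from the tables above).
my %typemap = (
    'int'      => 'T_IV',
    'unsigned' => 'T_IV',
    'char'     => 'T_CHAR',
    'char *'   => 'T_PV',
);

# Second level: XS type -> code fragment, one table per direction.
# Single quotes keep $var, $ntype and $arg as literal text.
my %input = (
    T_IV   => '$var = ($ntype)SvIV($arg)',
    T_CHAR => '$var = (char)*SvPV($arg,na)',
    T_PV   => '$var = ($ntype)SvPV($arg,na)',
);
my %output = (
    T_IV   => 'sv_setiv ($arg, (IV)$var);',
    T_CHAR => 'sv_setpvn($arg, (char *)&$var, 1);',
    T_PV   => 'sv_setpv ((SV*)$arg, $var);',
);

# Hypothetical helper: return the XS type and both fragments for a C type.
sub fragments_for {
    my ($ctype) = @_;
    my $xstype = $typemap{$ctype}
        or die "no typemap entry for C type '$ctype'\n";
    return ($xstype, $input{$xstype}, $output{$xstype});
}

for my $ctype ('int', 'unsigned', 'char *') {
    printf "%s maps to %s\n  input:  %s\n  output: %s\n",
        $ctype, fragments_for($ctype);
}
```

Both `int` and `unsigned` resolve to the same pair of fragments, which is the point of the intermediate XS type.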
The XS types are meaningful only to xsubpp, and appear only in the typemap. They do not appear in Perl code, XS code, or C code.
$var, $ntype, and $arg
The code fragments in the typemap are not pure C code: they contain Perl variables in their text. The variables are
$var
- The name of a C variable
$ntype
- The type of $var
$arg
- Code to access a Perl scalar
xsubpp is a Perl program. When it needs to convert an argument from Perl to C, it sets $var, $ntype, and $arg, obtains the appropriate code fragment from the typemap, and evals the fragment to replace the Perl variables with their values.
For example, consider this XS routine:

    int
    max(a, b)
        int a
        int b
To generate code to convert the first parameter from Perl to C, xsubpp sets the Perl variables like this:
variable | value |
---|---|
$var | a |
$ntype | int |
$arg | ST(0) |
Then, it evals the fragment
$var = ($ntype)SvIV($arg)
to yield the C code
a = (int)SvIV(ST(0))
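The substitution step can be reproduced in a few lines of Perl. This is a minimal sketch of the technique, not xsubpp's real code: the `expand_fragment` helper is hypothetical, and the sketch assumes the fragment contains no double quotes, backslashes, or `@` characters, which a real implementation would have to escape first.

```perl
use strict;
use warnings;

# Hypothetical helper: expand a typemap fragment by evaluating it as a
# double-quoted string, with $var, $ntype and $arg in scope.
sub expand_fragment {
    my ($fragment, $var, $ntype, $arg) = @_;

    # qq{"$fragment"} builds the text of a double-quoted string literal;
    # eval then interpolates $var, $ntype and $arg into that literal.
    my $code = eval qq{"$fragment"};
    die "could not expand fragment: $@" if $@;
    return $code;
}

my $fragment = '$var = ($ntype)SvIV($arg)';

# First and second parameters of max(a, b).
print expand_fragment($fragment, 'a', 'int', 'ST(0)'), "\n";
print expand_fragment($fragment, 'b', 'int', 'ST(1)'), "\n";
```

The first call prints `a = (int)SvIV(ST(0))`, matching the C code shown above; the second shows the same fragment reused for the parameter `b` at stack position 1.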
It is important to understand how these variables work, because sometimes you have to arrange for them to have the right values in order to make xsubpp do what you want. The next article in this series contains an example in the XS code for Align::NW.
Typemap files
The three tables that constitute the typemap are referred to as TYPEMAP, INPUT, and OUTPUT, respectively. All three tables may be stored in a single file, with each table headed by its own name. Here is an example to illustrate the file format:

    # A typemap file
    TYPEMAP
    int     T_IV
    SV *    T_SV

    INPUT
    T_SV
        $var = $arg
    T_IV
        $var = ($ntype)SvIV($arg)

    OUTPUT
    T_SV
        $arg = $var;
    T_IV
        sv_setiv($arg, (IV)$var);
The first TYPEMAP header may be omitted.
Files containing typemaps are conventionally named typemap. xsubpp can read and aggregate multiple typemap files to construct the typemap. Entries in later files override entries in earlier files.
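The override rule behaves like merging Perl hashes in reading order. The sketch below models only the TYPEMAP table; the hash contents are made up for illustration, and `T_MY_STRING` is a hypothetical XS type, not one from the default typemap.

```perl
use strict;
use warnings;

# Hypothetical TYPEMAP tables from two files, in the order they are read.
my %default_typemap = (
    'int'    => 'T_IV',
    'char *' => 'T_PV',
    'SV *'   => 'T_SV',
);
my %local_typemap = (
    'char *'  => 'T_MY_STRING',   # overrides the earlier entry
    'Point *' => 'T_PTROBJ',      # adds a new entry
);

# In a list assignment to a hash, a later key replaces an earlier one,
# so entries from the file read last win.
my %merged = (%default_typemap, %local_typemap);

printf "%-8s => %s\n", $_, $merged{$_} for sort keys %merged;
```

Entries that appear only in the earlier file, such as `int`, survive the merge unchanged.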
Perl supplies a default typemap in
/usr/local/lib/perl5/version/ExtUtils/typemap
XS modules may provide a local typemap file in the module directory. If the module declares structs or other C data types, it can map them to XS types in a TYPEMAP section. Local typemaps rarely need INPUT or OUTPUT sections; the default typemap almost always contains appropriate code fragments.
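As an illustration, a local typemap for a module that wraps a C struct might contain nothing but a TYPEMAP section. The struct name Point is an assumption made for this sketch. T_PTROBJ is one of the XS types defined in the default typemap, so its INPUT and OUTPUT code fragments come from there.

```
# typemap in the module directory; Point is a hypothetical struct
TYPEMAP
Point *    T_PTROBJ
```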

Notes on these adapted articles
These pages are an adaptation of articles written in 2000 by Steven W. McDougall. My goal in modifying these articles is to simplify and update them. I hope you find these adapted versions of the articles useful. You can find the original articles at the link at the bottom of this page. The major changes in this update are:
- h2xs is not used;
- XSLoader is used in place of DynaLoader;
- It is assumed that the reader understands the basic concepts of C and Perl programming.
This adaptation is a work in progress and many of the links on these pages may not work.
XS Mechanics by Steven W. McDougall is licensed under a Creative Commons Attribution 3.0 Unported License.