Perl and XS: The typemap


  1. Introduction
  2. Concepts
  3. Example 1: Geometry
  4. The typemap
  5. Example 2: Math::Ackermann
  6. Example 3: Set::Bit

This article discusses how xsubpp converts data between Perl and C. When the Perl interpreter calls a subroutine, it pushes a list of scalars onto the Perl stack. An xsub must take the scalars from the stack and convert them to C form before calling the C routine. When the C routine has finished, the xsub must convert the C results to Perl scalars and put those scalars back on the stack. xsubpp generates the C code that performs these conversions.

Conversion between Perl and C data types is handled with macros and routines in the Perl C API, but the necessary operations vary, depending on the C data types and the direction of the conversion. Consider:

C data type   input                           output
int n         n = (int)SvIV(ST(0))            sv_setiv(ST(0), (IV)n)
double x      x = (double)SvNV(ST(0))         sv_setnv(ST(0), (double)x)
char *psz     psz = (char *)SvPV(ST(0),na)    sv_setpv((SV*)ST(0), psz)

We could imagine a big switch statement inside xsubpp to select the right code fragment for each C data type, but this would be clumsy and inflexible. It would be better to put the code fragments in a table, like the one shown above.

If we start writing such a table, we quickly discover that the mapping between Perl and C data types is not one-to-one. As a strongly typed language, C distinguishes more data types than Perl does. For example, these seven C integer types are all converted with essentially the same code fragment; the only variation is the typecast used to quiet the C compiler.

C data type        input                              output
int n              n = (int)SvIV(ST(0))               sv_setiv(ST(0), (IV)n)
unsigned n         n = (unsigned)SvIV(ST(0))          sv_setiv(ST(0), (IV)n)
unsigned int n     n = (unsigned int)SvIV(ST(0))      sv_setiv(ST(0), (IV)n)
long n             n = (long)SvIV(ST(0))              sv_setiv(ST(0), (IV)n)
unsigned long n    n = (unsigned long)SvIV(ST(0))     sv_setiv(ST(0), (IV)n)
short n            n = (short)SvIV(ST(0))             sv_setiv(ST(0), (IV)n)
unsigned short n   n = (unsigned short)SvIV(ST(0))    sv_setiv(ST(0), (IV)n)

In view of this, xsubpp uses a two-level mapping. First, it maps C data types to XS types, like this

C data type   XS type
int           T_IV
unsigned      T_IV
char          T_CHAR
char *        T_PV

Then it maps the XS types to code fragments, in two tables: one for input

XS type   input code fragment
T_IV      $var = ($ntype)SvIV($arg)
T_CHAR    $var = (char)*SvPV($arg,na)
T_PV      $var = ($ntype)SvPV($arg,na)

and one for output

XS type   output code fragment
T_IV      sv_setiv($arg, (IV)$var);
T_CHAR    sv_setpvn($arg, (char *)&$var, 1);
T_PV      sv_setpv((SV*)$arg, $var);

These tables constitute the typemap.
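The two-level lookup can be modelled in a few lines of plain C. This is only an illustrative sketch, not xsubpp's implementation (xsubpp is written in Perl): the rows are copied from the tables above, while the names type_rows, code_rows, xs_type_for, and fragment_for are made up for this example.

```c
#include <stddef.h>
#include <string.h>

/* Level 1: C data type -> XS type (rows copied from the table above). */
struct type_row { const char *ctype; const char *xstype; };

static const struct type_row type_rows[] = {
    { "int",      "T_IV"   },
    { "unsigned", "T_IV"   },
    { "char",     "T_CHAR" },
    { "char *",   "T_PV"   },
};

/* Level 2: XS type -> input and output code fragments. */
struct code_row { const char *xstype; const char *input; const char *output; };

static const struct code_row code_rows[] = {
    { "T_IV",   "$var = ($ntype)SvIV($arg)",    "sv_setiv($arg, (IV)$var);" },
    { "T_CHAR", "$var = (char)*SvPV($arg,na)",  "sv_setpvn($arg, (char *)&$var, 1);" },
    { "T_PV",   "$var = ($ntype)SvPV($arg,na)", "sv_setpv((SV*)$arg, $var);" },
};

/* Return the XS type for a C type, or NULL if the type is not mapped. */
const char *xs_type_for(const char *ctype)
{
    size_t i;
    for (i = 0; i < sizeof type_rows / sizeof type_rows[0]; i++)
        if (strcmp(type_rows[i].ctype, ctype) == 0)
            return type_rows[i].xstype;
    return NULL;
}

/* Return the input fragment (want_output == 0) or the output fragment
   (want_output != 0) for a C type, or NULL if either lookup fails. */
const char *fragment_for(const char *ctype, int want_output)
{
    const char *xstype = xs_type_for(ctype);
    size_t i;
    if (xstype == NULL)
        return NULL;
    for (i = 0; i < sizeof code_rows / sizeof code_rows[0]; i++)
        if (strcmp(code_rows[i].xstype, xstype) == 0)
            return want_output ? code_rows[i].output : code_rows[i].input;
    return NULL;
}
```

In this model, fragment_for("int", 0) and fragment_for("unsigned", 0) return the same T_IV fragment, which is the point of the intermediate XS type: several C types share one pair of code fragments.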

The XS types are meaningful only to xsubpp, and appear only in the typemap. They do not appear in Perl code, XS code, or C code.

$var, $ntype, and $arg

The code fragments in the typemap are not pure C code: they contain Perl variables in their text. The variables are

$var     The name of a C variable
$ntype   The type of $var
$arg     Code to access a Perl scalar

xsubpp is a Perl program. When it needs to convert an argument from Perl to C, it sets $var, $ntype, and $arg, obtains the appropriate code fragment from the typemap, and evals the fragment to replace the Perl variables with their values.

For example, consider this XS routine

int
max(a, b)
	int a
	int b

To generate code to convert the first parameter from Perl to C, xsubpp sets the Perl variables like this

variable   value
$var       a
$ntype     int
$arg       ST(0)

Then, it evals the fragment

$var = ($ntype)SvIV($arg)

to yield the C code

a = (int)SvIV(ST(0))
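The substitution step can be imitated in plain C to make it concrete. xsubpp itself relies on Perl's eval and string interpolation; the function below is a hypothetical stand-in that performs a simple textual replacement of the three variable names, and it assumes that no other Perl variables appear in the fragment.

```c
#include <stddef.h>
#include <string.h>

/* Expand $var, $ntype and $arg in a typemap code fragment.
   The result is written to out, whose capacity (including the
   terminating NUL) is outsize.
   Returns 0 on success, -1 if the result does not fit. */
int expand_fragment(const char *fragment,
                    const char *var, const char *ntype, const char *arg,
                    char *out, size_t outsize)
{
    const char *names[3] = { "$var", "$ntype", "$arg" };
    const char *values[3];
    size_t used = 0;

    values[0] = var;
    values[1] = ntype;
    values[2] = arg;

    while (*fragment != '\0') {
        const char *piece = fragment;   /* default: copy one literal character */
        size_t piece_len = 1, consumed = 1, i;

        for (i = 0; i < 3; i++) {
            size_t n = strlen(names[i]);
            if (strncmp(fragment, names[i], n) == 0) {
                piece = values[i];      /* substitute the variable's value */
                piece_len = strlen(values[i]);
                consumed = n;
                break;
            }
        }
        if (used + piece_len >= outsize)
            return -1;                  /* no room for the piece plus NUL */
        memcpy(out + used, piece, piece_len);
        used += piece_len;
        fragment += consumed;
    }
    if (used >= outsize)
        return -1;
    out[used] = '\0';
    return 0;
}
```

Calling expand_fragment("$var = ($ntype)SvIV($arg)", "a", "int", "ST(0)", buf, sizeof buf) leaves the text a = (int)SvIV(ST(0)) in buf, matching the result shown above. For the second parameter of max, the same fragment would be expanded with b and ST(1).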

It is important to understand how these variables work, because sometimes you have to arrange for them to have the right values in order to make xsubpp do what you want. The next article in this series contains an example in the XS code for Align::NW.

Typemap files

The three tables that constitute the typemap (C data types to XS types, input code fragments, and output code fragments) are named TYPEMAP, INPUT, and OUTPUT, respectively. All three tables may be stored in a single file, with each table headed by its own name. Here is an example to illustrate the file format

# A typemap file

TYPEMAP
int			T_IV
SV *			T_SV

INPUT
T_SV
	$var = $arg
T_IV
	$var = ($ntype)SvIV($arg)

OUTPUT
T_SV
	$arg = $var;
T_IV
	sv_setiv($arg, (IV)$var);

The first TYPEMAP header may be omitted.
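To make the file format concrete, here is a rough C sketch of two steps a reader of such a file has to perform: recognizing a section header, and splitting a TYPEMAP line into its two fields. It is not xsubpp's parser. It assumes that the XS type is the last whitespace-separated word on the line, so that a C type such as "SV *" may itself contain spaces; the names section_header and split_typemap_line are invented for this example.

```c
#include <ctype.h>
#include <string.h>

enum section { SEC_TYPEMAP, SEC_INPUT, SEC_OUTPUT };

/* Return the section that a header line switches to,
   or -1 if the line is not a section header. */
int section_header(const char *line)
{
    if (strcmp(line, "TYPEMAP") == 0) return SEC_TYPEMAP;
    if (strcmp(line, "INPUT") == 0)   return SEC_INPUT;
    if (strcmp(line, "OUTPUT") == 0)  return SEC_OUTPUT;
    return -1;
}

/* Split one TYPEMAP line such as "SV *\t\t\tT_SV" in place.
   On success, *ctype and *xstype point into line and 0 is returned.
   Returns -1 if the line does not contain two fields. */
int split_typemap_line(char *line, char **ctype, char **xstype)
{
    char *end, *p;

    while (isspace((unsigned char)*line))
        line++;                                   /* skip leading blanks */
    end = line + strlen(line);
    while (end > line && isspace((unsigned char)end[-1]))
        end--;                                    /* trim trailing blanks */
    *end = '\0';

    p = end;
    while (p > line && !isspace((unsigned char)p[-1]))
        p--;                                      /* back up over the last word */
    if (p == line)
        return -1;                                /* empty line or a single word */
    *xstype = p;

    while (p > line && isspace((unsigned char)p[-1]))
        p--;                                      /* trim the gap between fields */
    *p = '\0';
    *ctype = line;
    return 0;
}
```

A reader built on these helpers would start in SEC_TYPEMAP, which is why the first TYPEMAP header can be left out, and would skip blank lines and lines beginning with # before calling either function.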

Files containing typemaps are conventionally named typemap. xsubpp can read several typemap files and merge them into a single typemap. When the same entry appears in more than one file, the entry in the later file overrides the entry in the earlier file.
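The override rule amounts to "last writer wins" when entries are loaded in file order. The following C sketch models that rule for the TYPEMAP table only; the struct, the function names, and the fixed table sizes are assumptions made for the example, not part of xsubpp.

```c
#include <stddef.h>
#include <string.h>

#define MAX_ROWS 64
#define MAX_NAME 32

/* A small fixed-size table of C type -> XS type mappings. */
struct typemap {
    char ctype[MAX_ROWS][MAX_NAME];
    char xstype[MAX_ROWS][MAX_NAME];
    size_t count;
};

/* Add a mapping, replacing any earlier mapping for the same C type.
   Returns 0 on success, -1 if a name is too long or the table is full. */
int typemap_set(struct typemap *tm, const char *ctype, const char *xstype)
{
    size_t i;

    if (strlen(ctype) >= MAX_NAME || strlen(xstype) >= MAX_NAME)
        return -1;
    for (i = 0; i < tm->count; i++)
        if (strcmp(tm->ctype[i], ctype) == 0)
            break;                      /* existing row: override it */
    if (i == tm->count) {
        if (tm->count == MAX_ROWS)
            return -1;
        strcpy(tm->ctype[i], ctype);    /* new row */
        tm->count++;
    }
    strcpy(tm->xstype[i], xstype);
    return 0;
}

/* Return the XS type for a C type, or NULL if it is not mapped. */
const char *typemap_get(const struct typemap *tm, const char *ctype)
{
    size_t i;
    for (i = 0; i < tm->count; i++)
        if (strcmp(tm->ctype[i], ctype) == 0)
            return tm->xstype[i];
    return NULL;
}
```

If the entries of an earlier file are loaded first (say, int mapped to T_IV) and a later file then maps int to a hypothetical T_MY_INT, typemap_get returns T_MY_INT for int, while entries that the later file does not mention keep their earlier values.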

Perl supplies a default typemap in

/usr/local/lib/perl5/version/ExtUtils/typemap

XS modules may provide a local typemap file in the module directory. If the module declares structs or other C data types, it can map them to XS types in a TYPEMAP section. Local typemaps rarely need INPUT or OUTPUT sections; the default typemap almost always contains appropriate code fragments.


Notes on these adapted articles

These pages are an adaptation of articles written in 2000 by Steven W. McDougall. My goal in modifying these articles is to simplify and update them. I hope you find these adapted versions of the articles useful. You can find the original articles at the link at the bottom of this page. The major changes in this update are:

This adaptation is a work in progress and many of the links on these pages may not work.


Creative Commons License
XS Mechanics by Steven W. McDougall is licensed under a Creative Commons Attribution 3.0 Unported License.

For comments, questions, and corrections, please email Ben Bullock (benkasminbullock@gmail.com) or use the discussion group at Google Groups.