This teaches creating a Perl extension in C. It was written for Unix.

=head1 Example 1: Hello world

The first example prints "Hello world".

Run C. This creates a directory F and files F, F, F, and F.
F<test.xs> looks like this:

    #include "EXTERN.h"
    #include "perl.h"
    #include "XSUB.h"

    #include "ppport.h"

    MODULE = test           PACKAGE = test

Edit F<test.xs> to add

    void
    hello()
        CODE:
            printf("Hello, world!\n");

to the end. Run C<perl Makefile.PL>:

    $ perl Makefile.PL
    Checking if your kit is complete...
    Looks good
    Writing Makefile for test
    $

This creates a file called F<Makefile>. Run the command "make":

    $ make
    cp lib/test.pm blib/lib/test.pm
    perl xsubpp -typemap typemap test.xs > test.xsc && mv test.xsc test.c
    Please specify prototyping behavior for test.xs (see perlxs manual)
    cc -c test.c
    Running Mkbootstrap for test ()
    chmod 644 test.bs
    rm -f blib/arch/auto/test/test.so
    cc -shared -L/usr/local/lib test.o -o blib/arch/auto/test/test.so
    chmod 755 blib/arch/auto/test/test.so
    cp test.bs blib/arch/auto/test/test.bs
    chmod 644 blib/arch/auto/test/test.bs
    Manifying blib/man3/test.3pm
    $

Now we run the extension. Create a file called F<hello> containing

@program_download{hello.pl}

Make F<hello> executable with C<chmod +x hello>, and run it:

    $ ./hello
    Hello, world!
    $

=head1 Example 2: Odd or even

This extension returns 1 if a number is even, and 0 if the number is odd.

Add the following to the end of F<test.xs> from example one:

    int
    is_even(input)
            int input
        CODE:
            RETVAL = (input % 2 == 0);
        OUTPUT:
            RETVAL

Run "make" again. Create a test script, F<t/test.t>, containing

    # 3 is the number of tests.
    use Test::More tests => 3;
    use test;
    is (test::is_even(0), 1);
    is (test::is_even(1), 0);
    is (test::is_even(2), 1);

Run it by typing C<make test>:

    $ make test
    PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e" "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t
    t/test.t .. ok
    All tests successful.
    Files=1, Tests=3, 0 wallclock secs ( 0.03 usr 0.02 sys + 0.02 cusr 0.02 csys = 0.09 CPU)
    Result: PASS
    $

=head1 Files and directories

C<h2xs> starts extensions.
It creates F<Makefile.PL>, which generates F<Makefile>, and F<test.xs>
and F<lib/test.pm>, which contain the extension. F<test.xs> is the C
part, and F<lib/test.pm> tells Perl how to load the extension.

Running C<make> creates a directory F<blib> for compiled output.
C<make test> invokes perl such that it finds the extension files in
F<blib>. To test an extension, use C<make test>, or run the test file
using

    perl -I blib/lib -I blib/arch t/test.t

Without this, the test script will fail to run, or, if there is another
version of the extension installed, it will use that, instead of the
version which was meant to be tested.

=head1 Example 3: Rounding numbers

This takes an argument and sets it to its rounded value.

To the end of F<test.xs>, add

    void
    round(arg)
            double arg
        CODE:
            if (arg > 0.0) {
                arg = floor(arg + 0.5);
            } else if (arg < 0.0) {
                arg = ceil(arg - 0.5);
            } else {
                arg = 0.0;
            }
        OUTPUT:
            arg

Add C<'-lm'> to the line containing C<'LIBS'> in F<Makefile.PL>:

    'LIBS'              => ['-lm'],   # e.g., '-lm'

This adds a link to the C maths library which contains C<floor> and
C<ceil>.

Change the number of tests in F<t/test.t> to "8",

    use Test::More tests => 8;

and add the following tests:

    my $i;

    $i = -1.5;
    test::round($i);
    is( $i, -2.0 );

    $i = -1.1;
    test::round($i);
    is( $i, -1.0 );

    $i = 0.0;
    test::round($i);
    is( $i, 0.0 );

    $i = 0.5;
    test::round($i);
    is( $i, 1.0 );

    $i = 1.2;
    test::round($i);
    is( $i, 1.0 );

Run C<perl Makefile.PL>, C<make>, then C<make test>. It should print out
that eight tests have passed.

=head1 Input and Output Parameters

Parameters of the XSUB are specified after the function's return value
and name. The output parameters are listed at the end of the function,
after C<OUTPUT:>. C<RETVAL> tells Perl to send this value back as the
return value of the XSUB function. In Example 3, the return value was
placed in the original variable which was passed in, so it and not
C<RETVAL> was listed in the C<OUTPUT:> section.

C<xsubpp> translates XS into C. Its rules to convert from Perl data
types, such as "scalar" or "array", to C data types such as C<int>, or
C<char>, are found in a file called a "typemap". This has three parts.
The first part maps C types to a name which corresponds to Perl types.
The second part contains C code which C<xsubpp> uses for input
parameters. The third part contains C code which C<xsubpp> uses for
output parameters.

For example, look at a portion of the C file created for the extension,
F<test.c>:

@program_download{test.c}

In the typemap file, doubles are of type C<T_DOUBLE>. In the C<INPUT>
section of F<typemap>, an argument that is C<T_DOUBLE> is assigned to the
variable C<arg> by calling C<SvNV>, then casting its value to C<double>,
then assigning that to C<arg>. In the C<OUTPUT> section, C<arg> is passed
to C<sv_setnv> to be passed back to the calling subroutine. (C<ST(0)> is
discussed in L</"The argument stack">).

=head1 XS file structure

The lines before C<MODULE> in the XS file are C. C<xsubpp> just copies
them. Parts after C<MODULE> are XSUB functions. C<xsubpp> translates them
to C.

=head1 Simplifying XSUBs

In L the second part of the XS file contained the following description
of an XSUB:

    double
    foo(a,b,c)
            int             a
            long            b
            const char *    c
        OUTPUT:
            RETVAL

In contrast with L</"Example 1: Hello world">,
L</"Example 2: Odd or even"> and L</"Example 3: Rounding numbers">, this
description does not contain code for what is done during a call to
C<foo>. Even if a CODE section is added to this XSUB:

    double
    foo(a,b,c)
            int             a
            long            b
            const char *    c
        CODE:
            RETVAL = foo(a,b,c);
        OUTPUT:
            RETVAL

the generated C code is almost identical: the B<xsubpp> compiler works
out the C<CODE:> section from the first two lines of the description of
the XSUB. The C<OUTPUT:> section can be removed as well, if a C<CODE:>
section is not specified: xsubpp can see that it needs to generate a
function call section, and will autogenerate the OUTPUT section too.
Thus the XSUB can be

    double
    foo(a,b,c)
            int             a
            long            b
            const char *    c

This can also be done for

    int
    is_even(input)
            int input
        CODE:
            RETVAL = (input % 2 == 0);
        OUTPUT:
            RETVAL

of L</"Example 2: Odd or even">, if a C function C<is_even> is supplied.
As in L</"XS file structure">, this may be placed in the first part of
the .xs file:

    int
    is_even(int arg)
    {
        return (arg % 2 == 0);
    }

If this is in the first part of the xs file, before C<MODULE>, the XS
part need only be

    int
    is_even(input)
            int input

=head1 XSUB arguments

When arguments to routines in the .xs file are specified, three things
are passed for each argument listed.
The first is the order of that argument relative to the others (first,
second, third). The second is the type of argument (C<int>, C<char*>).
The third is the calling convention for the argument in the call to the
library function.

Suppose two C functions with similar declarations, for example

    int string_length (char *s);
    int upper_case_char (char *cp);

operate differently on the argument: C<string_length> inspects the
characters pointed to by C<s> without changing their values, but
C<upper_case_char> manipulates what C<cp> points to. From Perl, these
functions are used in a different manner.

Tell B<xsubpp> which is which by replacing the C<*> before the argument
by C<&>. An ampersand, C<&>, means that the argument should be passed to
a library function by its address. In the example,

    int
    string_length(s)
            char *  s

but

    int
    upper_case_char(cp)
            char &  cp

For example, consider:

    int
    foo(a,b)
            char &  a
            char *  b

The first Perl argument to this function is treated as a char and
assigned to C<a>, and its address is passed into C<foo>. The second Perl
argument is treated as a string pointer and assigned to C<b>. The value
of b is passed into the function foo. The call to C<foo> that xsubpp
generates looks like this:

    foo (& a, b);

=head1 The argument stack

In the generated C code, there are references to C<ST(0)>, C<ST(1)> and
so on. C<ST> is a macro that points to the I<n>th argument on the
argument stack. C<ST(0)> is thus the first argument on the stack and
therefore the first argument passed to the XSUB, C<ST(1)> is the second
argument, and so on.

The list of arguments to the XSUB in the .xs file tells B<xsubpp> which
argument corresponds to which of the argument stack (i.e., the first one
listed is the first argument, and so on). These must be listed in the
same order as the function expects them.

The actual values on the argument stack are pointers to the values
passed in. When an argument is listed as being an OUTPUT value, its
corresponding value on the stack (i.e., C<ST(0)> if it was the first
argument) is changed. Verify this by looking at the C code generated for
Example 3.
The code for C<round> contains lines that look like this:

    double arg = (double) SvNV(ST(0));
    /* Round the contents of the variable arg */
    sv_setnv (ST(0), (double)arg);

The arg variable is initially set by taking the value from C<ST(0)>,
then is stored back into C<ST(0)> at the end of the routine.

XSUBs are also allowed to return lists, not just scalars. This must be
done by manipulating stack values C<ST(0)>, C<ST(1)>, etc. See L<perlxs>.

XSUBs are also allowed to avoid automatic conversion of Perl function
arguments to C function arguments. See L<perlxs>. Some people prefer
manual conversion by inspecting C<ST(i)> even in the cases when automatic
conversion will do, arguing that this makes the logic of an XSUB call
clearer. Compare with L</"Simplifying XSUBs">.

=head1 Example 4: Returning an array

This example illustrates working with the argument stack. The previous
examples have all returned only a single value. This example shows an
extension which returns an array. This example uses the C<statfs> system
call.

Return to the F<test> directory. Add the following to the top of
F<test.xs>, after C<#include "XSUB.h">:

    #include <sys/vfs.h>

or

    #include <sys/param.h>
    #include <sys/mount.h>

depending on your operating system (read "man statfs" for the correct
details for your version of Unix).

Add to the end:

    void
    statfs(path)
            char *  path
        INIT:
            int i;
            struct statfs buf;

        PPCODE:
            i = statfs(path, &buf);
            if (i == 0) {
                XPUSHs(sv_2mortal(newSVnv(buf.f_bavail)));
                XPUSHs(sv_2mortal(newSVnv(buf.f_bfree)));
                XPUSHs(sv_2mortal(newSVnv(buf.f_blocks)));
                XPUSHs(sv_2mortal(newSVnv(buf.f_bsize)));
                XPUSHs(sv_2mortal(newSVnv(buf.f_ffree)));
                XPUSHs(sv_2mortal(newSVnv(buf.f_files)));
                XPUSHs(sv_2mortal(newSVnv(buf.f_type)));
            } else {
                XPUSHs(sv_2mortal(newSVnv(errno)));
            }

In F<t/test.t>, change the number of tests from 8 to 10, and add

    @a = test::statfs("/blech");
    ok( scalar(@a) == 1 && $a[0] == 2 );
    @a = test::statfs("/");
    is( scalar(@a), 7 );

This routine returns a different number of values depending on whether
the call to C<statfs> succeeds. If there is an error, the error number is
returned as a single-element array.
If the call is successful, then a 7-element array is returned.

C<INIT:> says to place the code following it immediately after the
argument stack is decoded. C<PPCODE:> tells C<xsubpp> that the xsub
manages the return values put on the argument stack by itself.

To place values to be returned to the caller onto the stack, use the
series of macros that begin with C<XPUSH>. There are five different
versions, for placing integers, unsigned integers, doubles, strings, and
Perl scalars on the stack. In the example, a Perl scalar was placed onto
the stack.

The values pushed onto the return stack of the XSUB are "mortal"
C<SV>s. They are made "mortal" so that once their values are copied by
the calling program, the SV's that held the returned values can be
deallocated. If they were not mortal, then they would continue to exist
after the XSUB routine returned, but would not be accessible, causing a
memory leak.

=head1 Example 5: Arrays and hashes

This example takes an array reference as input, and returns a reference
to an array of hash references:

    my $stats = multi_statfs (['/', '/usr/']);
    my $usr_bfree = $stats->[1]->{f_bfree};

It is based on L</"Example 4: Returning an array">. It takes a reference
to an array of filenames as input, calls C<statfs> for each file name,
and returns a reference to an array of hashes containing the data for
each of the filesystems.

In the F<test> directory add the following code to the end of
F<test.xs>:

    SV *
    multi_statfs(paths)
            SV * paths
        INIT:
            /* The return value. */
            AV * results;
            /* The highest index in "paths" (one less than the number
               of paths). */
            I32 numpaths = 0;
            int i, n;

            /* Check that paths is a reference, then check that it is an
               array reference, then check that it is non-empty. */
            if ((! SvROK(paths))
                || (SvTYPE(SvRV(paths)) != SVt_PVAV)
                || ((numpaths = av_len((AV *)SvRV(paths))) < 0))
            {
                XSRETURN_UNDEF;
            }
            /* Create the array which holds the return values. */
            results = (AV *) sv_2mortal ((SV *) newAV ());
        CODE:
            for (n = 0; n <= numpaths; n++) {
                HV * rh;
                STRLEN l;
                struct statfs buf;
                /* Get the nth value from array "paths".
                   */
                char * fn = SvPV (*av_fetch ((AV *) SvRV (paths), n, 0), l);

                i = statfs (fn, &buf);
                if (i != 0) {
                    av_push (results, newSVnv (errno));
                    continue;
                }
                /* Create a new hash. */
                rh = (HV *) sv_2mortal ((SV *) newHV ());
                /* Store the numbers in rh, under the given names. */
                hv_store(rh, "f_bavail", 8, newSVnv(buf.f_bavail), 0);
                hv_store(rh, "f_bfree", 7, newSVnv(buf.f_bfree), 0);
                hv_store(rh, "f_blocks", 8, newSVnv(buf.f_blocks), 0);
                hv_store(rh, "f_bsize", 7, newSVnv(buf.f_bsize), 0);
                hv_store(rh, "f_ffree", 7, newSVnv(buf.f_ffree), 0);
                hv_store(rh, "f_files", 7, newSVnv(buf.f_files), 0);
                hv_store(rh, "f_type", 6, newSVnv(buf.f_type), 0);
                av_push(results, newRV((SV *)rh));
            }
            RETVAL = newRV((SV *)results);
        OUTPUT:
            RETVAL

Add the following to F<t/test.t>, increasing the number of tests by two:

    $results = test::multi_statfs([ '/', '/blech' ]);
    ok( ref $results->[0] );
    ok( ! ref $results->[1] );

This function does not use a typemap. Instead, it accepts one C<SV*>
(scalar) parameter, and returns an C<SV*>. These scalars are populated
within the code. Because it only returns one value, there is no need for
a C<PPCODE:> directive, only C<CODE:> and C<OUTPUT:> directives.

When dealing with references, it is important to handle them with
caution. The C<INIT:> block first checks that C<SvROK> returns true,
which indicates that paths is a valid reference. It then verifies that
the object referenced by paths is an array, using C<SvRV> to dereference
paths, and C<SvTYPE> to discover its type. As an added test, it checks
that the array referenced by paths is non-empty, using C<av_len>, which
returns -1 if the array is empty. The C<XSRETURN_UNDEF> macro aborts the
XSUB and returns the undefined value unless all three of these
conditions are met.

We manipulate several arrays in this XSUB. An array is represented
internally by a pointer to C<AV>. The functions and macros for
manipulating arrays are similar to the functions in Perl: C<av_len>
returns the highest index in an C<AV*>, much like C<$#array>;
C<av_fetch> fetches a scalar value from an array, given its index;
C<av_push> pushes a scalar value onto the array, extending it if
necessary.
Specifically, we read pathnames one at a time from the input array, and
store the results in an output array (results) in the same order. If
C<statfs> fails, the element pushed onto the return array is the value
of C<errno> after the failure. If C<statfs> succeeds, the value pushed
onto the return array is a reference to a hash containing some of the
information in the C<statfs> structure.

As with the return stack, it would be possible (and a small performance
win) to pre-extend the return array before pushing data into it, since
we know how many elements we will return:

    av_extend(results, numpaths);

We are performing only one hash operation in this function, which is
storing a new scalar under a key using C<hv_store>. A hash is
represented by an C<HV*> pointer. Like arrays, the functions for
manipulating hashes from an XSUB mirror the functionality available from
Perl. See L<perlguts> and perlapi for details.

To create a reference, use C<newRV>. An C<AV*> or an C<HV*> can be cast
to type C<SV*> in this case. This allows taking references to arrays,
hashes and scalars with the same function. Conversely, the C<SvRV>
function always returns an C<SV*>, which may need to be cast to the
appropriate type if it is something other than a scalar (check with
C<SvTYPE>).

=head1 Example 6: Passing open files

To create a wrapper around C<fputs>,

    #define PERLIO_NOT_STDIO 0
    #include "EXTERN.h"
    #include "perl.h"
    #include "XSUB.h"

    #include <stdio.h>

    int
    fputs(s, stream)
            char *          s
            FILE *          stream

=head1 SEE ALSO

=over

=item L<perlguts>

This documents functions such as C which convert Perl scalars into C Cs.

=item perlapi (see L)

This lists functions such as C used for error handling.

=item L<perlxs>

This is the "official" documentation for Perl XS.

=item L<perlmod>

This is the "official" documentation for Perl modules.

=item L

This documents C.

=back

=head1 NOTES

=head2 make

This tutorial assumes that the "make" program that Perl is configured to
use is called C<make>. Instead of running "make" in the examples, a
substitute may be required. The command B<perl -V:make> gives the name
of the substitute program.
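
=head2 Checking is_even outside Perl

The C version of C<is_even> shown in L</"Simplifying XSUBs"> is plain C,
so it can be compiled and checked on its own before it is wrapped in an
XSUB. The following is a minimal sketch under that assumption. The
helper C<is_even_selfcheck> is a hypothetical name introduced here for
illustration; it is not part of the extension built in this tutorial.

```c
#include <assert.h>

/* Same body as the C version of is_even shown in this tutorial:
   returns 1 for even input and 0 for odd input. */
int is_even(int arg)
{
    return (arg % 2 == 0);
}

/* Hypothetical helper (not part of the extension): repeats the three
   checks from t/test.t and adds two negative inputs. Returns 0 if
   every check holds; a failed assert aborts the program. */
int is_even_selfcheck(void)
{
    assert(is_even(0) == 1);
    assert(is_even(1) == 0);
    assert(is_even(2) == 1);
    /* In C99 and later, -3 % 2 is -1, so comparing the remainder
       with 0 also classifies negative odd numbers correctly. */
    assert(is_even(-3) == 0);
    assert(is_even(-4) == 1);
    return 0;
}
```

Calling C<is_even_selfcheck> from a small C<main> and compiling with the
system C compiler exercises the function without building the
extension. Comparing the remainder with 0, rather than testing whether
it equals 1, is what keeps the result correct for negative numbers.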

=head1 AUTHOR

Jeff Okamoto, reviewed and assisted by Dean Roehrich, Ilya Zakharevich,
Andreas Koenig, and Tim Bunce.

PerlIO material contributed by Lupe Christoph, with some clarification
by Nick Ing-Simmons.

Changes for h2xs as of Perl 5.8.x by Renee Baecker.

This digest web version (L) was edited from that found in the Perl
distribution by Ben Bullock.