Pliant->C Library Wrapping Tutorial

Foreword

I'd like to invite any reader to comment on this tutorial or ask questions relating to the topic. This is my first document like this so I'm expecting there to be holes in my coverage of the topic. I hope to fill these based on my own experiences and feedback from others over time. I should note that this tutorial is mainly oriented toward Linux/*nix users. I'm not sure how much help it will be to a Windows user.

Introduction

As a long-time non-C hacker, I've found that a language's ability to interface with C is crucial to its long-term usefulness, particularly on my preferred platform, Linux. I'm sure this holds for just about any modern platform, since C (or C++) is the standard systems-level programming language.

One of the many things that attracted me to Pliant is the ease with which you can create interfaces to external C libraries. It can hardly even be compared with a typical foreign language interface, as it's almost as integrated with C as C++/ObjC. There are a few catches, but it's really the best I've seen in a non-C language.

Anyways... this little tutorial is meant to help you use this aspect of Pliant (not as a general introduction). To date I've only partially wrapped 3 C libraries: the GNU readline library, the ncurses-5 library and the slang library. I'll be using the first 2 as examples as I proceed, and you can download them from the resources section.

Basic Concepts

Thus far I've encountered 3 basic types of C entities which need to be considered when wrapping a library: functions, global variables and structures. I'll first go over the basics of dealing with each of these.

Functions

Wrapping a function is very simple. It's just a matter of using the external function attribute as described on the 'Defining a new function' page of the documentation. Here's the basic syntax:

function c_function_call argument -> result
  arg Type argument
  arg Type result
  external "c_library.so" "c_function"

The argument and result types can be either Pliant types or wrapped C types. The c_library.so library may be any shared library the dynamic linker can find (on Linux, anything in the ld.so cache). The c_function is an ordinary function exported by that library. Finally, you use the wrapped function in Pliant just like a standard function.

The only complication in dealing with this is the types. For strings and characters, special Pliant types must be used. For strings, which represent 'char *' arguments in C, use the CStr type. For characters, 'char' in C, use the CChr type. This is due to differences between the native Pliant types and their C counterparts. Casting is done auto-magically (see implicit casting below).

Global Variables

Global variables are handled very similarly to C functions, using the external attribute. Again, here's the basic syntax:

var Type var_name
  external "c_library.so" "c_global_variable"

The var Type var_name should seem familiar if you've used Pliant at all. The extra external line is taken from the syntax of functions, and works in the same way. The only thing to watch for is to make sure the types match up. The rules about character and string types mentioned above apply here as well.

Structures

The trickiest of the 3 are C structures. These can either be pretty straightforward or a bit of a pain (comparatively, anyway). It depends on whether the C compiler has done any alignment optimizations on the structure or not. It's hard to give much in the way of generalities for wrapping C structs, but here's what they tend to look like:

type c_struct
  packed
  field Type first
  field Type second

The packed keyword is the important difference between a wrapper and a normal type declaration. It keeps Pliant's compiler from aligning the fields in memory, so you can make sure the type stays matched to the memory layout of the wrapped structure. See the ncurses wrapper below for an example and additional details. Particularly important is the part dealing with memory alignment problems.
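
To make the alignment issue concrete, here is a small C sketch of the same struct laid out twice: once the way a C compiler normally does it, and once packed, which is what a Pliant type marked packed resembles. The struct and function names are made up for illustration, and the packed attribute is a GCC/Clang extension rather than standard C.

```c
#include <assert.h>
#include <stddef.h>

/* Natural layout: the compiler may insert padding so 'value' is aligned. */
struct natural_layout {
    char tag;
    int  value;
};

/* Packed layout (GCC/Clang extension): fields sit back to back. */
struct __attribute__((packed)) packed_layout {
    char tag;
    int  value;
};

/* Compile-time check that packing really removed the padding. */
static_assert(offsetof(struct packed_layout, value) == sizeof(char),
              "packed fields must be adjacent");

/* Number of padding bytes between 'tag' and 'value' in the natural layout. */
size_t natural_padding(void)
{
    return offsetof(struct natural_layout, value) - sizeof(char);
}

/* The same measurement for the packed layout. */
size_t packed_padding(void)
{
    return offsetof(struct packed_layout, value) - sizeof(char);
}
```

Wherever natural_padding() reports a non-zero gap, a packed wrapper has to fill that gap itself with an explicit padding field.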

Readline Wrapper

After briefly messing around with the interpreter, I decided that the first library I had to wrap was the GNU Readline library. I'm sure any other CLI lovers will empathize. So, I needed to replace the interpreter's current CLI with one provided by readline. This turned out to be really simple and straightforward. It's one of the things that originally got me hooked. Here is everything that is required to wrap the basic prompt/command line provided by the readline lib.

module "/pliant/language/unsafe.pli"

function readline prompt -> line
  arg CStr prompt line
  external "libreadline.so" "readline"

The unsafe module is needed for the automatic CStr casting. But otherwise this is what most of the external function wrappers will look like. To use the history features of the readline lib, the add_history function also needs to be wrapped.

function add_history line
  arg CStr line
  external "libreadline.so" "add_history"

Using these 2 functions, most of the readline's basic functionality can be utilized. Here is the final function which uses both the above functions to provide the new CLI functionality.

function rl_get prompt -> line
  arg Str prompt line
  var CStr ret
  ret := readline:prompt
  if ret:characters = null
    line := "[0]"
  else
    line := ret
  if not line = "" and not line = "[0]"
    add_history:line

As you can see, the two wrapped functions work just as if they were Pliant functions. The '[0]' is returned in place of a null, as the interpreter is expecting a normal Pliant string and Pliant Str types can't hold nulls except in their escaped form, ie. '[0]'.
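
For comparison, here is roughly the same logic written in C. It is only a sketch: to keep it runnable without linking libreadline, the reader and history functions are passed in as pointers, and the names rl_get_c, line_reader, line_recorder and the fake_* stand-ins are all made up. One thing the C side makes visible is that readline() returns a buffer allocated with malloc, which the caller is expected to free.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Signatures matching GNU readline's readline() and add_history(). */
typedef char *(*line_reader)(const char *prompt);
typedef void (*line_recorder)(const char *line);

/*
 * Read one line, remember it in the history unless it is empty, and
 * return it. Returns NULL at end of input. The caller owns the returned
 * buffer and must free() it.
 */
char *rl_get_c(line_reader read_line, line_recorder record, const char *prompt)
{
    assert(read_line != NULL && record != NULL);

    char *line = read_line(prompt);      /* NULL means end of input */
    if (line != NULL && line[0] != '\0')
        record(line);                    /* skip empty lines, as rl_get does */
    return line;
}

/* Stand-ins so the sketch can be exercised without libreadline. */
static int recorded = 0;

static char *fake_reader(const char *prompt)
{
    (void)prompt;
    char *copy = malloc(6);
    if (copy != NULL)
        strcpy(copy, "hello");
    return copy;
}

static char *eof_reader(const char *prompt)
{
    (void)prompt;
    return NULL;
}

static void fake_recorder(const char *line)
{
    (void)line;
    recorded++;
}
```

In a program linked with -lreadline, the call would be rl_get_c(readline, add_history, "> ").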

Ncurses-5 Wrapper

The Ncurses wrapper is more complicated than the readline wrapper. Not only does it have many functions to wrap, it also has global variables and a C structure.

Function Wrapping

Wrapping a function for Ncurses is much the same as for the readline library; there are just a lot more of them. So there isn't much new to go into in this section. Ncurses does have one twist compared to readline: quite a few of its functions are provided as macros. In C, these are expanded into the program at compile time (strictly speaking, by the preprocessor). This doesn't help us, since we're accessing the library directly.

To deal with macros, you need to port them over manually. This is really pretty simple, as macros are defined in the header file and are usually constructed from the other functions exported from the library or an extremely simple bit of code. Each is handled in basically the same way. You simply re-implement them in Pliant. The only difference is whether you use another function exported by the library or not.

Here's an example of a reimplemented macro from the Ncurses wrapper. First, the C:

#define touchwin(win)           wtouchln((win), 0, getmaxy(win), 1)

And the Pliant version:

function touchwin w
  arg Address w
  wtouchln w 0 (w:_maxy + 1) 1

The touchwin function in ncurses marks a window so that the next refresh redraws it. It is implemented in C via a macro calling the wtouchln function (which marks a range of 1 or more lines for redraw) on the contents of the target window. The Pliant version does the same. The only difference is that instead of calling getmaxy (which is yet another macro), it accesses the window structure directly and computes the number of lines from _maxy (getmaxy expands to _maxy + 1).
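
The relationship between the macro and its hand-written port can be sketched in plain C. Everything here is a stand-in: mini_window is a cut-down, hypothetical replacement for WINDOW, fake_wtouchln only records its arguments instead of calling ncurses, and the demo_ macros are merely shaped like the ones in the ncurses 5 header.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for WINDOW; only the field the macro needs. */
struct mini_window {
    short _maxy;                /* index of the last line */
};

/* Stand-in for wtouchln: just remembers how many lines were touched. */
static int touched_lines = 0;

static int fake_wtouchln(struct mini_window *win, int start, int count, int changed)
{
    (void)win; (void)start; (void)changed;
    touched_lines = count;
    return 0;
}

/* Macro versions: these exist only while the C program is being compiled. */
#define demo_getmaxy(win)  ((win) ? (win)->_maxy + 1 : -1)
#define demo_touchwin(win) fake_wtouchln((win), 0, demo_getmaxy(win), 1)

/* The same thing as a real function, mirroring the Pliant port. */
int touchwin_ported(struct mini_window *win)
{
    assert(win != NULL);
    return fake_wtouchln(win, 0, win->_maxy + 1, 1);
}
```

Both forms end up passing the same line count to the underlying function, which is all the port has to guarantee.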

Global Variable Wrapping

Global variables are easy to wrap and there's not much to complicate things. So let's get right to an example from Ncurses:

var Address stdscr
  external curses "stdscr"

In Ncurses stdscr is the address of the default window created when you initialize the display. Pretty simple... eh.

C Structure Wrapping

The most difficult part of writing my Ncurses wrapper was dealing with the Ncurses window structure. It probably wouldn't have been as hard for someone with more C experience than I have, but with help from the forum I figured it out.

There are several steps I went through to get a working wrapper that I was happy with, and this section will reflect my development path. To start we need to know what we're dealing with.

The C structure

struct _win_st
{
    short   _cury, _curx;   /* current cursor position */
    /* window location and size */
    short   _maxy, _maxx;   /* maximums of x and y, NOT window size */
    short   _begy, _begx;   /* screen coords of upper-left-hand corner */
    short   _flags;         /* window state flags */
    /* attribute tracking */
    attr_t  _attrs;         /* current attribute for non-space character */
    chtype  _bkgd;          /* current background char/attribute pair */

    /* option values set by user */
    bool    _notimeout;     /* no time out on function-key entry? */
    bool    _clear;         /* consider all data in the window invalid? */
    bool    _leaveok;       /* OK to not reset cursor on exit? */
    bool    _scroll;        /* OK to scroll this window? */
    bool    _idlok;         /* OK to use insert/delete line? */
    bool    _idcok;         /* OK to use insert/delete char? */
    bool    _immed;         /* window in immed mode? (not yet used) */
    bool    _sync;          /* window in sync mode? */
    bool    _use_keypad;    /* process function keys into KEY_ symbols? */
    int     _delay;         /* 0 = nodelay, <0 = blocking, >0 = delay */

    struct ldat *_line;     /* the actual line data */

    /* global screen state */
    short   _regtop;        /* top line of scrolling region */
    short   _regbottom;     /* bottom line of scrolling region */
    /* these are used only if this is a sub-window */
    int     _parx;          /* x coordinate of this window in parent */
    int     _pary;          /* y coordinate of this window in parent */
    WINDOW  *_parent;       /* pointer to parent if a sub-window */
    /* these are used only if this is a pad */
    struct pdat
    {
        short _pad_y,      _pad_x;
        short _pad_top,    _pad_left;
        short _pad_bottom, _pad_right;
    } _pad;
    short   _yoffset;       /* real begy is _begy + _yoffset */
};

I won't get into the gory details of the above, but there are a few items which need more explanation for what comes next to make sense. First, it should be noted that the bool type used above is an unsigned char and the attr_t type is an unsigned long. Finally, there are the two other structures used in this struct, namely ldat and pdat. The latter is already defined above, so here's the C for the former:

struct ldat
{
    chtype  *text;          /* text of the line */
    short   firstchar;      /* first changed character in the line */
    short   lastchar;       /* last changed character in the line */
    short   oldindex;       /* index of the line at last update */
};

chtype is an unsigned long int.
As you will see below, I don't actually wrap this struct. Ncurses provides functions to deal with this struct via the window. So all that will be needed is to store the address in the window struct.

The Pliant version

First, let's wrap that pdat struct:

type PDat
  packed
  field Int16 _pad_y _pad_x
  field Int16 _pad_top _pad_left
  field Int16 _pad_bottom _pad_right

This creates the new type in Pliant, PDat. It looks pretty much just like the example, and there are no real surprises here. Notice here and below that the choice of which Pliant type to use in the wrapper is based primarily on the size of the type it's mapping. That's why you'll see the bool type, an unsigned char, in the C struct replaced with uInt8, an 8-bit unsigned integer (the same size as the wrapped char type).

public
  type Window
    packed

    field Int16 _cury _curx    # current cursor position
    # window location and size
    field Int16 _maxy _maxx    # maximums of x and y, NOT window size
    field Int16 _begy _begx    # screen coords of upper-left-hand corner
    field Int16 _flags         # window state flags
    # padding to correct offset
    field (Array Byte 2) padding1

    # attribute tracking
    field uInt32 _attrs      # current attribute for non-space character
    field uInt32 _bkgd       # current background char/attribute pair

    # option values set by user
    field uInt8 _notimeout   # no time out on function-key entry?
    field uInt8 _clear       # consider all data in the window invalid?
    field uInt8 _leaveok     # OK to not reset cursor on exit?
    field uInt8 _scroll      # OK to scroll this window?
    field uInt8 _idlok       # OK to use insert/delete line?
    field uInt8 _idcok       # OK to use insert/delete char?
    field uInt8 _immed       # window in immed mode? (not yet used)
    field uInt8 _sync        # window in sync mode?
    field uInt8 _use_keypad  # process function keys into KEY_ symbols?
    # padding to correct offset
    field (Array Byte 3) padding2

    field Int _delay         # 0 = nodelay, <0 = blocking, >0 = delay

    field Address _line      # the actual line data

    # global screen state
    field Int16 _regtop        # top line of scrolling region
    field Int16 _regbottom     # bottom line of scrolling region
    # these are used only if this is a sub-window
    field Int _parx          # x coordinate of this window in parent
    field Int _pary          # y coordinate of this window in parent
    field Address _parent    # pointer to parent if a sub-window
    # these are used only if this is a pad
    field PDat _pad
    field Int16 _yoffset       # real begy is _begy + _yoffset
    # padding to correct offset - not really needed, as its at the end
    field (Array Byte 2) padding3

Most of this is pretty straightforward, except for the three padding fields (padding1, padding2 and padding3). These are present because the C compiler aligns the fields of the structure in memory as an optimization trick. How you figure out if you need these and how much padding to add is covered next.

Memory Alignment with C structures

There are 2 related problems to solve if the struct you're writing gets aligned by the C compiler. First is determining this and then what to do about it.

If you think you've got your wrapper right, yet it still won't map onto the C struct it's supposed to be wrapping (basically if all else fails), it's probably due to alignment issues. To be sure, compare the sizes of the C struct and the Pliant type.

In Pliant this is a snap, as all types have the built-in method size which, oddly enough, returns the memory size of the type. So, to check the size of the above Pliant Window type, in the file that contains its source I'd just add:

console "Window size: " (Window size) eol

In case you're curious, the correct size is 76 bytes (with 4-byte longs and pointers, as on 32-bit x86).

To determine the size of the struct in C, you need to whip up a little C program. Here is the program I wrote to find out the size of the WINDOW struct Ncurses defines:

#include <curses.h>
#include <stdio.h>

int main(void)
{
    /* sizeof yields a size_t, so cast it for printing */
    printf("WINDOW size: %lu\n", (unsigned long) sizeof(WINDOW));
    return 0;
}

After running this, you can see for sure whether the memory sizes of the Pliant and C versions are the same. If the Pliant type's size is larger than the C struct's, you need to check the field types you used in the Pliant type (eg. a uInt instead of a uInt16). If the C struct is the larger of the two, then you probably need to add padding to the Pliant type. How much and where? To find out, you need to determine the offset of each field in the struct. Here's the abridged version of the program I used for this:

#include <curses.h>
#include <stddef.h>
#include <stdio.h>

int main(void)
{
    /* offsetof yields a size_t, so cast it for printing */
    printf("cury offset: %lu\n", (unsigned long) offsetof(struct _win_st, _cury));
    printf("curx offset: %lu\n", (unsigned long) offsetof(struct _win_st, _curx));
    [... cut-n-paste code snipped ...]
    printf("pad offset: %lu\n", (unsigned long) offsetof(struct _win_st, _pad));
    printf("yoffset offset: %lu\n", (unsigned long) offsetof(struct _win_st, _yoffset));
    return 0;
}

After you have this information, compare each offset with the previous field's offset plus that field's size. Wherever the next offset is larger than that sum, the compiler has inserted padding, and the difference is the number of bytes you need to add at that point in the Pliant type. I use arrays of bytes to explicitly allocate these chunks (Hubert suggested this).
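
The comparison can also be automated. The sketch below is one way to do it, not the program I actually used: demo_win is a hypothetical struct that only mimics the first few fields of _win_st using fixed-width types, and field_info, FIELD, gap_after and report_gaps are names made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical struct mimicking the start of ncurses' _win_st. */
struct demo_win {
    int16_t  cury, curx, maxy, maxx, begy, begx, flags;
    uint32_t attrs;
    uint32_t bkgd;
    uint8_t  options[9];        /* stands in for the nine bool fields */
    int32_t  delay;
};

/* One row per field: where it starts and how big it is. */
struct field_info {
    const char *name;
    size_t offset;
    size_t size;
};

#define FIELD(type, member) \
    { #member, offsetof(type, member), sizeof(((type *)0)->member) }

static const struct field_info demo_fields[] = {
    FIELD(struct demo_win, flags),
    FIELD(struct demo_win, attrs),
    FIELD(struct demo_win, bkgd),
    FIELD(struct demo_win, options),
    FIELD(struct demo_win, delay),
};

/* Padding bytes between field i and field i + 1 of the table. */
size_t gap_after(const struct field_info *fields, size_t i)
{
    assert(fields != NULL);
    return fields[i + 1].offset - (fields[i].offset + fields[i].size);
}

/* Print every place where a packed wrapper needs explicit padding. */
void report_gaps(const struct field_info *fields, size_t count)
{
    for (size_t i = 0; i + 1 < count; i++) {
        size_t gap = gap_after(fields, i);
        if (gap > 0)
            printf("%lu padding byte(s) after %s\n",
                   (unsigned long) gap, fields[i].name);
    }
}
```

On a typical platform where 32-bit integers are 4-byte aligned, the gaps after flags and after the option bytes correspond to the padding1 and padding2 fields in the Window type above.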

The Magic of Implicit Casting

Now that we have our wrapper, there's one last detail to make it easy to deal with...

In Ncurses, all the functions and variables deal with the window as an Address. They expect windows to be passed in as addresses, and they return addresses. But you also need the information contained in the struct (wrapped via the Pliant type) for things like the macro replacement above. It is possible to cast manually each time, but that is a pain. Instead, you can have them cast automatically when necessary via implicit casting.

Setting up implicit casting for automatically converting Address types to Window types and vice versa requires 2 separate casting functions.
First, casting from a Window to an Address:

function 'cast Address' w -> a
  arg Window w ; arg Address a
  implicit
  a := addressof:w

export 'cast Address'

Now, the other way:

function 'cast Window' a -> w
  arg Window w ; arg Address a
  implicit
  w := (a map Window)

export 'cast Window'

For more examples of implicit casting, see cchar.pli and cstr.pli in the Pliant distribution.
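
If it helps to think of it in C terms, the two casts play roughly the role of converting between an untyped pointer and a typed struct pointer. This is only an analogy, sketched with a hypothetical cut-down mini_window struct and made-up helper names. It says nothing about how Pliant implements the casts internally.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down window record; only the field we need. */
struct mini_window {
    short maxy;                 /* index of the last line, like _maxy */
};

/* Counterpart of 'cast Window': view an untyped address as a window. */
static struct mini_window *as_window(void *address)
{
    return (struct mini_window *) address;
}

/* Counterpart of 'cast Address': forget the type again. */
static void *as_address(struct mini_window *window)
{
    return window;
}

/* Reads a field through the typed view of an untyped handle. */
int line_count(void *address)
{
    assert(address != NULL);
    return as_window(address)->maxy + 1;
}
```

The difference in Pliant is that, once the two functions are marked implicit, the compiler inserts these conversions for you wherever an Address is used as a Window or the other way round.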

Questions?

Some parts of this unclear? Have an idea for an improvement?

Resources

I want to do a little more work on the Ncurses wrapper before releasing it. But here's a link to the readline wrapper, such as it is.

readline.pli - the simple wrapper for gnu's readline library
plncurses.tar.gz - will be a collection of modules for ncurses wrapping
