Gwydion Dylan User's Guide

Edited by

Peter Housel

Andreas Bogk

Use and copying of this software and preparation of derivative works based on this software are permitted, including commercial use, provided that the following conditions are observed:

  • This copyright notice must be retained in full on any copies and on appropriate parts of any derivative works.

  • Documentation (paper or online) accompanying any system that incorporates this software, or any part of it, must acknowledge the contribution of the Gwydion Project at Carnegie Mellon University.

This software is made available "as is". Neither the authors nor Carnegie Mellon University make any warranty about the software, its performance, or its conformity to any specification.

Bug reports, questions, comments, and suggestions should be sent by E-mail to the Internet address .

17 September 2002


Table of Contents

1. Introduction
What is Dylan?
About Gwydion Dylan
Gwydion Dylan Resources
2. Installing Gwydion Dylan
Prerequisite Software Installation
Supported Systems
RPM Installation
Debian Package Installation
FreeBSD Installation
3. Using d2c
Getting Started with d2c
A simple Hello World
Structure of a Dylan Program
Using make-dylan-app
A Complete Hello World
Working with LID (Library Interchange Definition) Files
Invoking d2c
Environment Variables
4. Melange
Introduction
A Concrete Example
Basic Use
Loading and Finding Objects
Importing Header Files
Specifying Object Names
Mapping functions
Prefixes
Explicit Renaming
Anonymous Types
Type Definitions
Implicit class definitions
Specifying class inheritance
Translating Object Representations
Specifying low level transformations
Specifying high level transformations
Other File Options
Function Clauses
Struct and Union Clauses
Pointer Clauses
Constant Clauses
Variable Clauses
Low level support facilities
Predefined types
Locating native C objects
Pointer manipulation operations
Static linking mechanisms
Differences from Creole
Known limitations
Proposed modifications
Enumeration clauses
Inheritance of "map" and "equate" options
Remerging of the "equate:" and "map:" options

List of Tables

4.1. Standard Name Mapping Functions

Chapter 1. Introduction

What is Dylan?

Dylan™ is a fully buzzword-compliant programming language. It supports:

  • An advanced object model, including headache-free multiple inheritance

  • Multiple dispatch, which makes complex object-oriented designs simple (research at Harlequin shows that three quarters of the concepts illustrated in Design Patterns can be implemented more easily in Dylan than in C++ or Java)

  • A choice between efficient static typing as used in C++ and Eiffel, and convenient dynamic typing, which is commonly associated with scripting and prototyping languages

  • Convenient goodies to make development easier, including garbage collection, functional access to data members, and runtime safety

  • High performance (add a type declaration here and there, and speed will match that of a comparable C implementation)

  • Advanced development tools such as closures, definable language constructs and introspection

Dylan is easy to learn, easy to use, and more powerful than C++ or Java. It tries to balance the best features of traditional compiled languages with advantages of prototyping and scripting languages.

Dylan isn't without some serious shortcomings. It's not widely supported as of the summer of 1998, with only two major implementations, one of which is still incomplete. Several features of the language could be more elegant, and there's a wickedly intricate set of tradeoffs when implementing shared libraries.

About Gwydion Dylan

Gwydion Dylan is an implementation of the Dylan programming language for Unix systems. Originally written as a research project by the Gwydion group at CMU, it is now maintained by a group of volunteers.

The current version of Gwydion is development code, and intended only for use by Dylan fanatics. The compiler is slow, shared libraries are somewhat fragile, and the system as a whole still needs lots of bug fixes. To make life more exciting, the documentation is incomplete, and you'll need to read the source and ask questions on the mailing list. If this sounds like fun, you'll enjoy Gwydion.

Thanks to the skilled programmers at CMU, Gwydion can already generate exceptionally efficient code (about the speed of C in most cases) and implements about 98% of the Dylan standard with many extra libraries.

Gwydion Dylan Resources

Chapter 2. Installing Gwydion Dylan

This chapter gives instructions for installing Gwydion Dylan.

Prerequisite Software Installation

Gwydion Dylan depends on a C compiler being installed, plus, of course, all the tools and header files the C compiler needs to do its job. Your best bet on most platforms is having gcc installed, plus the binutils, and whatever standard C library and headers are appropriate. Installation of GNU Make is also required.

If you want to use shared libraries (which is the default on most systems), you need to install the libtool package as well.

Supported Systems

As of the current writing, Gwydion Dylan supports several flavors of Unix-like systems, including GNU/Linux (on x86 and PowerPC machines), FreeBSD, and Solaris. Other systems are supported but not current, including Windows NT and several varieties of Unix. Binary packages are available for a wide range of Linux distributions. Bringing an ignored port up to date requires one or more recompilations. Creating a new port requires several days to a few weeks of work, and a familiarity with both Dylan and the target system.

RPM Installation

To install Gwydion Dylan on a Linux system using RPM, type: (FIXME: FTP commands are now different.)

	$ ncftp ftp://berlin.ccc.de/pub/gd/
	ncftp> get gwydion-dylan*.rpm
	ncftp> quit
	$ su
	Password: secret!
	% rpm -Uvh gwydion-dylan*.rpm
	% exit
      

Debian Package Installation

Debian packages for Gwydion Dylan are available using the standard Debian package distribution mechanism. To install, type:

	$ sudo apt-get install gwydion-dylan-dev
      

FreeBSD Installation

Gwydion Dylan installation on FreeBSD systems can be done using the FreeBSD ports system, described in the FreeBSD Handbook. To install, type:

	# cd /usr/ports/lang/dylan
	# make install
      

Chapter 3. Using d2c

This chapter describes the Gwydion Dylan-to-C compiler, d2c.

Getting Started with d2c

This section explains the structure of a simple Dylan program, and shows how to use d2c.

A simple Hello World

Usually, a Dylan program consists of multiple libraries, which contain multiple modules, which again can be spread across multiple files. While this is powerful, it's a little overkill for something small and simple like a "Hello, World" application.

When you need just one module, in a single source file, with no complex module import and export, you can use d2c's single file mode. For instance, you can put the following into a file called hello.dylan:

module: hello

format-out("Hello, World!\n");
      

The structure should be obvious: the first line is a header line, specifying the module this code belongs to. Since we are in single file mode, the library and executable names are derived from the module name too. Single file mode will generate proper module and library definitions for you.

After the empty line terminating the header is a line of code, which simply prints the message "Hello, World!" on the screen, followed by a newline. There's no entry point: all top-level statements are executed in order, just like in a Perl or shell script.

To compile the program, just pass the name as an argument to d2c:

	$ d2c hello.dylan
	$ ./hello
	Hello, World!
        $
      

Structure of a Dylan Program

A Dylan program according to the Dylan Interchange Format consists of several files. These are the Library Interchange Definition (or LID) file, the import/exports file, and one or more source files.

The LID file instructs the compiler how to build a Dylan library or application. It includes a list of source files, compiler options and other descriptive information. It can be identified by the file name extension lid.

A second file lists all the function and variable names imported and exported by the program. Its name generally ends in -exports.dylan or -library.dylan. Technically, this file consists of Dylan source code, but it almost never contains any declarations other than define library or define module.

Other files end in the extension dylan and contain the actual Dylan source code.

Using make-dylan-app

Because d2c requires such a complex set of files for even a simple program, it's usually best to take advantage of the program make-dylan-app to create a new application. To use this, cd to an appropriate directory and type make-dylan-app program-name. This will create a new directory containing a simple program which uses a set of libraries useful for a command line utility.

A Complete Hello World

Create a new program called hello by running the command make-dylan-app hello. This will create a new directory named hello. Go there by typing cd hello and take a look at the files.

Open up the file hello/hello.dylan and fill in reasonable values for the author: and copyright: keywords. You will notice that make-dylan-app already filled in a statement printing “Hello, World!” for us, as a placeholder for the code we want to add to the program.

module:    hello
synopsis:  Print the string "Hello, World!"
author:    J. Random Hacker <jrandom@randomhacks.com>
copyright: Copyright 1998, J. Random Hacker

define function main(name, arguments)
  format-out("Hello world!\n");
  exit-application(0);
end function main;

// Invoke our main() function.
main(application-name(), application-arguments());
      

The three other files in this directory are a Makefile for compiling your program (it is not strictly needed, as you could call d2c directly), an export declaration file named hello-exports.dylan, which declares the libraries and modules that are used by your program, and a LID file listing all of your Dylan source files (two in this case) plus the name of the resulting executable.

To compile the program, simply type make. Alternatively, call d2c, passing it the LID file as a parameter, as in d2c hello.lid. You can now run your program by typing ./hello.

Note that, when using shared libraries, the file hello is just a script that calls the real program .libs/hello. The reason for that is rather intricate and has to do with the rpath to shared libraries which are part of your project. In other words, don't worry about it now, and read the libtool documentation when you need more advanced features.

Working with LID (Library Interchange Definition) Files

As described in the Dylan Reference Manual, in the Dylan language the basic unit of compilation is the library. FIXME: more to say about this.

A LID file is composed of entries of the form keyword: value, similar to mail headers and to the Dylan file header format. For reasons of backwards compatibility, d2c allows the list of source files to appear as the "main body" of the LID file, after the header and a blank line. In the Dylan Interchange Format LID, there is a Files: entry which is used instead.

d2c recognizes these LID entries:

Library: dylan-library-name

The Dylan name for the library that we are defining. There must be a corresponding define library somewhere in the source for this library.

Files: list-of-source-files

A whitespace-separated list of source files that constitute this library. The .dylan extension can be omitted. Note that you can continue header statements on the next line by indenting the continuation line with whitespace.

Executable: result-file-name

Specifies that we are building a runnable application rather than a library. The executable is generated with the specified name.

Linker-options: various-ld-flags

This option specifies flags which must be passed to ld when linking against this library. This is primarily used when a foreign library is called via one of the undocumented callout mechanisms. For example, Dylan.lid specifies -lm so that it can use the math library. This dependency is automatically propagated to users of the library.

Unique-ID-base: decimal-integer

Unique class identifiers for classes defined in this library are assigned sequentially starting with the specified integer. This should always be specified, but you won't get a sensible error if it is missing. The base should be sufficiently far from the base for any other library so that class IDs won't overlap. You will get a compile-time error if overlap occurs. A good base for user code would be 30000.

The Unique ID is used for multimethod dispatch, and it is bound to go away in the near future.
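
The overlap condition can be made concrete with a small Python sketch. It assumes, purely for illustration, that each library occupies a contiguous block of IDs starting at its base. The function name, the library names, and the class counts are hypothetical; d2c performs its own check at compile time.

```python
from itertools import combinations


def find_id_overlaps(libraries):
    """Return pairs of library names whose class ID ranges intersect.

    `libraries` maps a library name to (unique_id_base, class_count).
    Each library is assumed to use the IDs base .. base + class_count - 1.
    """
    overlaps = []
    for (n1, (b1, c1)), (n2, (b2, c2)) in combinations(sorted(libraries.items()), 2):
        # Two half-open ranges [b, b + c) intersect when each one starts
        # before the other one ends.
        if b1 < b2 + c2 and b2 < b1 + c1:
            overlaps.append((n1, n2))
    return overlaps


# Hypothetical project: "util" was given a base too close to "app".
print(find_id_overlaps({"app": (30000, 50), "util": (30040, 10), "other": (40000, 5)}))
```

Leaving generous gaps between bases, as recommended above, keeps a library from running into its neighbour as it grows.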

Entry-Point: dylan-module: dylan-variable

When generating an executable, this LID option specifies which Dylan function is called as the main entry point. You can also have no main entry point, in which case the program exits after running all of the top level forms. This entry-point function is called with two arguments, argc (an integer) and argv (a raw pointer). Note that this is incompatible with Mindy, and rather brutal as well. You can get the mindy semantics of calling Extensions:Main by using the Extensions module in your main module and then specifying: Entry-Point: mymodule:%main in the LID file. The %Main function parses the arguments and then calls Main.

This header item is deprecated. Just use the side effect of top-level statements instead.

Unit-prefix: c_legal_identifier_fragment

This prefix is used to make the C translation of names in this library unique w.r.t. any other libraries that might be used. This defaults to the library name, mangled according to the Dylan-to-C name mangling rules. You shouldn't need this.

Features: features-list

The argument is a space-separated list of features or misfeatures. If the token begins with ~, then the rest of the token is interpreted as a feature to remove. Otherwise, the token is added as a feature.
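
As a rough illustration of how such a list is interpreted, the following Python sketch applies a Features: value to a starting set of features. The function name and the feature names are hypothetical, and processing the tokens from left to right is an assumption; this is not d2c's implementation.

```python
def apply_features(current, spec):
    """Return the feature set after applying a Features: value.

    `current` is the starting collection of feature names and `spec` is the
    space-separated list from the LID file. A token starting with "~"
    removes the named feature; any other token adds it.
    """
    features = set(current)
    for token in spec.split():
        if token.startswith("~"):
            features.discard(token[1:])  # "~name" removes "name"
        else:
            features.add(token)
    return features


# Hypothetical feature names, for illustration only.
print(sorted(apply_features({"debug", "unix"}, "~debug fast-math")))
```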

Float-precision:

Used to determine the precision of floating-point constants that are not suffixed with precision markers. Legal values are single (to default to <single-float>), double (to default to <double-float>), or extended (to default to <extended-float>). Without a float-precision: keyword, unsuffixed floating-point constants default to <double-float>.

A fourth alternative, auto, indicates that an unsuffixed floating-point constant should be a <double-float> if it has eight or more digits after the decimal point, or a <single-float> otherwise. This behavior corresponds to Functional Developer's default behavior, and is intended for code ported from Functional Developer.
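
The selection rule can be summarized in a short Python sketch. The function name is hypothetical, and the handling of exponents (only the digits directly after the decimal point are counted) is an assumption; the sketch restates the rules above rather than d2c's actual lexer.

```python
def default_float_class(literal, precision="double"):
    """Pick the class for an unsuffixed floating-point literal.

    `precision` is the value of the float-precision: keyword; "double"
    mirrors the behavior when the keyword is absent.
    """
    fixed = {
        "single": "<single-float>",
        "double": "<double-float>",
        "extended": "<extended-float>",
    }
    if precision in fixed:
        return fixed[precision]
    if precision == "auto":
        # Count the digits that directly follow the decimal point.
        _, _, fraction = literal.partition(".")
        digits = 0
        for ch in fraction:
            if not ch.isdigit():
                break
            digits += 1
        return "<double-float>" if digits >= 8 else "<single-float>"
    raise ValueError("unknown float-precision value: " + precision)


print(default_float_class("3.14", "auto"))
print(default_float_class("3.14159265", "auto"))
```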

Implicitly-define-next-method:

Yes (the default) if define method should implicitly define the next-method variable within the body, or no if it should not. This keyword is available for compatibility with Mindy (which does not implicitly define next-method), or for those who don't want to pay the cost of implicitly defined next methods in every method definition.

Dynamic:

Yes if definitions in the library should be dynamic (modifiable at runtime, for example with add-method), or no if they should not. The default is no.

Here is a sample LID file:

library: my-program
unique-id-base: 30000
executable: mp

myprog-exports.dylan
myprog.dylan
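
To make the file format concrete, here is a minimal Python sketch that reads a LID file according to the rules in this section: keyword: value entries, continuation lines that start with whitespace, and an optional old-style list of source files after a blank line. The function name parse_lid is hypothetical, treating keywords case-insensitively is an assumption, and the sketch is not how d2c itself reads these files.

```python
def parse_lid(text):
    """Parse LID text into (headers, files).

    Headers are "keyword: value" lines. A line that starts with whitespace
    continues the previous header. Everything after the first blank line is
    treated as the old-style list of source files.
    """
    headers = {}
    files = []
    last_key = None
    lines = text.splitlines()
    body_start = len(lines)
    for i, line in enumerate(lines):
        if not line.strip():
            body_start = i + 1  # blank line ends the header
            break
        if line[0] in " \t" and last_key is not None:
            headers[last_key] += " " + line.strip()  # continuation line
            continue
        key, _, value = line.partition(":")
        last_key = key.strip().lower()  # assumption: keywords ignore case
        headers[last_key] = value.strip()
    for line in lines[body_start:]:
        files.extend(line.split())
    if "files" in headers:
        files = headers["files"].split() + files
    return headers, files


SAMPLE = """library: my-program
unique-id-base: 30000
executable: mp

myprog-exports.dylan
myprog.dylan
"""

headers, files = parse_lid(SAMPLE)
print(headers["library"], headers["executable"], files)
```

Splitting each entry at the first colon keeps values that themselves contain a colon, such as an Entry-Point: value, intact.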
      

Invoking d2c

Environment Variables

These environment variables are used by d2c:

DYLANDIR

The root of the installed Gwydion tree. In the default configuration, this defaults to /usr/local on Unix and c:\dylan on win32. This variable in turn establishes the defaults for DYLANPATH and the platforms.descr file.

DYLANPATH

The search path for Dylan libraries. Directories in the DYLANPATH are searched after any directories specified by explicit -L options. If not set, this defaults to ".:$DYLANDIR/lib/dylan" (".;%DYLANDIR%\lib\dylan" on Win32). If set, the value must include the directory where the Dylan library is to be found.
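
The lookup order described above can be sketched in Python as follows. The function and parameter names are hypothetical, and the sketch only restates the defaults given in this section; it is not taken from the d2c sources.

```python
import os


def dylan_search_path(explicit_dirs=(), environ=None, win32=False):
    """Return the library search directories in the order they are tried.

    `explicit_dirs` stands for directories given with -L options; they are
    searched before anything from DYLANPATH.
    """
    env = os.environ if environ is None else environ
    sep = ";" if win32 else ":"
    if "DYLANPATH" in env:
        path_dirs = env["DYLANPATH"].split(sep)
    else:
        # Defaults from this section: DYLANDIR falls back to /usr/local
        # (c:\dylan on Win32), and DYLANPATH to "." plus its lib/dylan.
        dylandir = env.get("DYLANDIR", "c:\\dylan" if win32 else "/usr/local")
        libdir = dylandir + ("\\lib\\dylan" if win32 else "/lib/dylan")
        path_dirs = [".", libdir]
    return list(explicit_dirs) + path_dirs


print(dylan_search_path(["/opt/mylibs"], environ={"DYLANDIR": "/opt/gd"}))
```

Note that a DYLANPATH set by the user replaces the default entirely, which is why it must name the directory holding the Dylan library itself.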

PATH

d2c expects to find make and the C compiler in PATH. On Unix we use the GNU tools gmake, gcc, and ld. Other compilers can work, but at a minimum this requires a new platform description in $DYLANDIR/share/dylan/platforms.descr. You must also have some of the GNU-win32 tools to run d2c on Windows, though make and Visual C++ are normally used for compilation. To build d2c, you also need perl and the various scripts in the tools directory.

The GNU assembler (gas) must be used in conjunction with the generated code from gcc. If you somehow end up running the HP/UX assembler (as) with gcc, it will produce many errors about STAB entries and the like.

CCFLAGS

This variable holds the flags passed to the C compiler. The default is platform-specific, but always includes -I$DYLANDIR/include. If you do set this variable, you must also specify the Dylan system include directory. The default optimization flags for gcc are -O2 -fomit-frame-pointer -fno-strict-aliasing. Leaving out the optimization flags will speed compilation at the cost of runtime speed.

Using the Melange Interface Generator

Robert Stockton

Abstract

The Melange interface generator provides a mechanism for providing access to native C code. It is modeled upon Apple Computer's Creole, and shares Creole's goals of automatically providing full support for a foreign interface based upon existing interface descriptions. It also, like Creole, provides mechanisms for explicitly adapting these interfaces to provide a greater match between C and Dylan data models.

Melange, however, differs from Creole in a number of significant ways. This document, therefore, provides a gentle introduction to Melange without attempting to build upon any existing descriptions of Creole.

Introduction

Melange is an automatic interface generator which provides transparent access to both functions and data defined or generated by existing C libraries. It allows users to import "interfaces"[1] from existing C header files, controlled by the contents of a "define interface" top-level form which may be included in the same file as arbitrary Dylan code. The user may use the functions and data specified by this interface as if they were native Dylan objects, and may export them to other modules.

Melange provides reasonable interpretations for the various sorts of C declarations which may appear in a header file, as well as mechanisms for explicitly modifying the default interpretations when necessary. For example, users may:

  • specify rules for the translation of foreign names

  • explicitly specify new names for specific objects or routines

  • specify parameter passing conventions or mutability of foreign objects

  • specify mappings or equivalences between "foreign" data and native equivalents

  • choose to import only a subset of the declarations in the header file

All of these customizations, as well as the name of the C header file, are specified by a "define interface" clause. See the section called “A Concrete Example” for an example.

The basic model for interface importation is based upon that used within Apple Computer's "Creole" interface generator. There are, however, significant differences in some of the details. (For instance, the "equate", "map", and "object-file" directives used in Melange interface definitions are unique to Melange. Likewise, Creole's "type" directive would not be accepted by Melange.) You should, therefore, not expect Creole interface declarations to work within Melange without some modification.

A Concrete Example

In order to get a feel for using Melange, it is probably best to start with a concrete example. This section contains a complete program which will use native C libraries to list the contents of some directories. For now, you should simply skim this example to get a general overview of Melange's capabilities. These will be described in more detail in later sections.

We will first begin with an "interface file" which contains a mixture of basic Dylan code and "define interface" forms which will be processed by Melange. We will name this file "dirent.intr".

module: Junk
synopsis: A poor imitation of "ls"

define library junk
  use dylan;
  use streams;
end library junk;

define module junk
  use dylan;
  use extensions;
  use extern;
  use streams;
  use standard-io;
end module junk;

define interface
  // This clause is more complex than it needs to be, but it does
  // demonstrate a lot of Melange's features.
  #include "/usr/include/sys/dirent.h",
    mindy-include-file: "dirent.inc",
    equate: {"char /* Any C declaration is legal */ *" => <c-string>},
    map: {"char *" => <byte-string>},
    // The two functions require callbacks, which we don’t support.
    exclude: {"scandir", "alphasort", "struct _dirdesc"},
    seal-functions: open,
    read-only: #t,
    name-mapper: minimal-name-mapping;
  function "opendir", map-argument: {#x1 => <string>};
  function "telldir" => tell, map-result: <integer>;
  struct "struct dirent",
    prefix: "dt-",
    exclude: {"d_namlen", "d_reclen"};
end interface;

define method main (program, #rest args)
  for (arg in args)
    let dir = opendir(arg);
    for (entry = readdir(dir) then readdir(dir),
         until entry = $null-pointer)
      write-line(entry.dt-d-name, *standard-output*);
    end for;
    closedir(dir);
  end for;
end method main;
      

We will then process this file through Melange to produce a file of pure Dylan code. On a system with d2c support, there will be an executable named melange, which you can invoke like this:

	  melange dirent.intr dirent.dylan
	

On a mindy-only system, where melange is contained in a file melange.dbc, we would use the following command line:

	  mindy -f melange.dbc dirent.intr dirent.dylan
	

This command will process "dirent.intr" and write a file named "dirent.dylan". In this case, it will also silently write a file named "dirent.inc", whose use will be explained later.

You can compile "dirent.dylan" normally, via mindycomp, but in order to execute it, you must make sure that the Mindy interpreter will be able to load the appropriate routines from the library containing the "dirent" routines. Ideally, you would simply let mindy load the appropriate code dynamically. However, this is presently only available for a few machines. Therefore, we will follow a messier approach and build a new version of the interpreter which is aware of the desired functions.

Move to the build directory for the Mindy interpreter, and edit the Makefile so the "EXTERN-INCLUDES" line mentions "your_dir_path/dirent.inc" and then run "make mindy". In this case, this is all that is required to build a new interpreter which is aware of the dirent routines.

You can now put it all together, invoking the new interpreter on the compiled program, with:

        mindy -f dirent.dbc .
      

This should print a list of all files in the current directory.

Because of the difficulty of relinking the interpreter for each new library, it is expected that administrators will build a set of "standard" library interfaces which are prelinked into the interpreter and exported as general Dylan library interfaces. In the future, as Melange (and the Gwydion environment) are extended to support better linking and loading capabilities, it should become easier to incorporate C libraries on an "as-needed" basis.

Basic Use

Although the "define interface" form provides a fairly rich sublanguage for specifying interfaces, it is often sufficient to use just the "minimal" form. For example, if "gc.h" contained the following code:

typedef char bool;
typedef struct obj obj_t;
typedef char *str;
extern obj_t alloc(obj_t class, int bytes);
extern void scavenge(obj_t *addr);
extern obj_t transport(obj_t obj, int bytes);
extern void shrink(obj_t obj, int bytes);
extern void collect_garbage(void);
extern bool TimeToGC;
#define ForwardingMarker ((obj_t)(0xDEADBEEF))
      

then you could import it by creating a file named "class.intr" which includes arbitrary Dylan code and the following:

define interface
   #include "gc.h";
end interface;

You would then run melange class.intr (or possibly mindy -f melange.dbc class.intr, depending upon the installation on your particular machine), which would produce a file of Dylan code which contains appropriate definitions for the classes "<bool>", "<obj>", "<obj_t>", and "<str>"; the variable TimeToGC; and the functions alloc, scavenge, transport, shrink, and collect_garbage. (The constant ForwardingMarker will be excluded because it is not a simple literal.) You could then use the imported definitions from Dylan code such as:

if (TimeToGC() ~= 0)
   collect_garbage();
end if;
      

This code fragment points out some of the hazards of "simple" imports. Melange has no way of knowing that "bool" should correspond to mindy's <boolean> class, so you are stuck with a simple integer. Likewise, the system wouldn't be able to guess that "char *" should correspond to the mindy class "<c-string>". We will explain in later sections how "map:" or "equate:" options may be used to provide this information to Melange.

Loading and Finding Objects

As mentioned above, the include directive in the previous example will only work for files which have been previously linked into mindy. There are extra facilities available to handle other situations.

If your machine is one for which we support dynamic loading (currently support is primarily for HPUX machines, but some work has been done on Macintoshes and ELF systems; contact us for more details), and you wish to load some declared objects from a shared library, you can add one or more "object-file:" options to the "#include" clause, as in the following:

define interface
   #include "gc.h",
      object-file: "/usr/lib/mindy/gc.sl";
end interface;
	

This would cause the code from "gc.sl" to be loaded into mindy at run-time and make its functions and objects available just as they were in the previous example. If you are running on a non-HPUX machine, you will have to statically link mindy with the appropriate library and a list of mappings from names to addresses. This can be accomplished most easily by following these steps:

  1. Add a "mindy-include-file:" option to your interface definition. This specifies the name of an "interface description file" which will be written by Melange, and which can later be linked into mindy along with the appropriate library.

  2. Run Melange on the source file in the normal manner. You may wish to move the newly created interface description file into your mindy build directory.

  3. Change the Makefile in the Mindy build directory, by adding the imported library to LIBS and the interface description file to EXTERN-INCLUDES.

  4. Run make to rebuild mindy with the new library information.

  5. Compile and run the generated Dylan code as normal.

A typical interface definition for this approach might be:

define interface
   #include "gc.h",
      mindy-include-file: "/usr/local/mindy-build/gc.inc";
end interface;
	

Importing Header Files

You import C definitions into Dylan by specifying one or more header files in an "#include" clause. This may take one of two different forms:

define interface
   #include "file1.h";
end interface;

or

define interface
   #include {"file1.h", "file2.h"};
end interface;
      

Melange will parse all of the named files in the specified order, and produce Dylan equivalents for (i.e. "import") some fraction of declarations in these files. By default, Melange will import all of the declarations from the named files, and any declarations in recursively included files (i.e. those specified via "#include" directives in the ".h" file) which are referenced by imported definitions. It will not, however, import every declaration in recursively included files. This ensures that you will see a complete, usable set of declarations without having to closely control the importation process. If you wish to exert more control over the set of objects to be imported, you may do so via the "import", "exclude", and "exclude-file" options.

If you only need a small set of definitions from a set of imported files, you can explicitly specify the complete list of declarations to be imported by using the "import:" option. You could, for example, say:

define interface
   #include "gc.h",
      import: {"scavenge", "transport" => move};
end interface;
      

This would result in Dylan definitions for "scavenge", "move", "<obj_t>", and "<obj>". The latter types would be dragged in because they are referenced by the two imported functions. If, however, you equated "obj_t" to <object>, then neither of the types would be imported. The second import in the above example performs a renaming at the same time as it specifies the object to be imported.

Other forms specify global behaviors. "Import: all" will cause Melange to import every "top level" definition which is not explicitly excluded. "Import: all-recursive" causes it to import definitions from recursively included files as well. "Import: none" restricts importation to those declarations which are explicitly imported or are used by an imported declaration.

You may also use the "import:" option to specify importation behavior on a per-file basis. The options

import: "file.h" => {"import1", ...}
import: "file.h" => all
import: "file.h" => none
      

work like the options described above, except that they only apply to the symbols in a single imported file.

The "exclude:" and "exclude-file:" options may be used to keep one or more unwanted definitions from being imported. For example, you could use:

define interface
   #include "gc.h",
      exclude: {"scavenge", "transport"},
      exclude-file: "gc1.h";
end interface;
      

This would prevent the two named functions and everything in the named file from being imported, while still including all of the other definitions from "gc.h". Note that these options should be used with care, as they can easily result in "incomplete" interfaces in which some declarations refer to types which are not defined. This could result in errors in the generated Dylan code. (The "import: file => none" option described above is a safer way of achieving an effect similar to "exclude-file:".)

You may also prevent some type declarations from being imported by using the "equate:" option (described in a later section). If, for example, you equated "obj_t" to <object>, then Melange would ignore the definition for "obj_t" and simply assume that the existing definition for <object> was sufficient.

You may have any number of "import:", "exclude:", and "exclude-file:" options, and may name the same declarations in multiple clauses. "Exclude:" options take priority over "import:"s. If no "import:" options are specified, the system will import all non-excluded symbols, just as if you had said "import: all".
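
The way these options interact can be illustrated with a small Python model. It is a simplified sketch, not Melange's algorithm: the function name, the `references` table (which records what each declaration in "gc.h" mentions), and the treatment of equated types as simply skipped are all assumptions made for the example.

```python
def select_imports(references, imports, excludes=(), equated=()):
    """Return the declarations that end up imported, in discovery order.

    `references` maps a C declaration name to the names it refers to.
    Explicitly imported names drag in whatever they reference, excluded
    names always lose, and equated types are skipped because an existing
    Dylan class stands in for them.
    """
    excluded = set(excludes)
    skipped = set(equated)
    selected = []
    pending = list(imports)
    while pending:
        name = pending.pop(0)
        if name in excluded or name in skipped or name in selected:
            continue
        selected.append(name)
        pending.extend(references.get(name, ()))
    return selected


# Hypothetical reference table for part of "gc.h".
GC_H = {
    "scavenge": ["obj_t"],
    "transport": ["obj_t"],
    "obj_t": ["struct obj"],
}

print(select_imports(GC_H, ["scavenge", "transport"]))
print(select_imports(GC_H, ["scavenge", "transport"], equated=["obj_t"]))
```

The first call pulls in both types along with the two functions; the second shows how equating "obj_t" keeps either type from being imported. Passing a referenced type in `excludes` instead yields exactly the kind of "incomplete" interface warned about above.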

Specifying Object Names

Because naming conventions differ between C and Dylan, Melange attempts to translate the names specified in C declarations into a form more appropriate to Dylan. This involves

  • Adding angle brackets around type names.

  • Adding dollar signs at the beginning of constant names.

  • Translating (non-initial) underlines into hyphens.

  • Adding "struct-name$" prefixes to slot accessors.

In many cases, this default behavior will be precisely what you want. However, Melange provides mechanisms for specifying different translations for some or all of the declarations.

Mapping functions

The translations described above are provided by calls to a built-in "name mapping function" named "minimal-name-mapping-with-structure-prefix". You may specify other mapping functions via a "name-mapper:" option. Our example interface might then look like this:

define interface
   #include "gc.h",
      object-file: "/usr/lib/mindy/gc.o",
      name-mapper: c-to-dylan;
end interface;
	

Table 4.1. Standard Name Mapping Functions

minimal-name-mapping-with-structure-prefix

Provides the translations described above.

minimal-name-mapping

Same as above, but excludes the "struct-name$" prefixes.

c-to-dylan

Like minimal-name-mapping, but:

  • Adds hyphens to reinforce "CaseBased" word separation.

  • Adds "get-" prefixes to slot accessors.

identity-name-mapping

Does no translation.

Table 4.1, “Standard Name Mapping Functions” describes the four standard mapping functions that are provided by Melange. Users may link new mapping functions into Melange. In the Mindy implementation, this is done as follows:

  1. Create a new module which imports module "name-mappers" from library "c-parse".

  2. Define methods on the "map-name" generic function which accepts the following parameters:

    mapper

    a <symbol> which is typically specialized by a singleton to select a specific name mapper method.

    category

    a <symbol> which will always be one of: #"type", #"constant", #"variable", or #"function".

    prefix

    a <string> which is typically prepended to the result string.

    name

    a <string> which supplies the original C name.

    sequence-of-classes

    a sequence of simple names for the classes which logically "contain" the given object. For example, if we were processing the declaration "struct str {int size; char *chars;}", one of the calls to the mapping function would have name bound to "size" and classes bound to #["str"].

    and returns a <string> which will be used as the Dylan name for the declaration.

  3. Compile this module and "link" it into Melange by concatenating it to the end of melange.dbc.

Mapping functions may call "hyphenate-case-breaks" which performs the same "CaseBased separation" as is done by "c-to-dylan". The trivial "identity-name-mapping" described above might be implemented by:

define method map-name
   (mapper == #"identity-name-mapping", category, prefix, name, classes)
=> (result :: <string>)
   name;
end method map-name;
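
A mapper that does slightly more than the identity mapping could follow the same pattern. The sketch below is an illustration under stated assumptions, not one of the standard mappers: the name "prefixed-name-mapping" is made up, and the method simply prepends the prefix and then adds brackets or a dollar sign according to the category.

define method map-name
   (mapper == #"prefixed-name-mapping", category, prefix, name, classes)
=> (result :: <string>)
   // Choose the decoration from the kind of declaration being named.
   select (category)
      #"type" => concatenate("<", prefix, name, ">");
      #"constant" => concatenate("$", prefix, name);
      otherwise => concatenate(prefix, name);
   end select;
end method map-name;

Once linked into Melange as described above, it would presumably be selected with "name-mapper: prefixed-name-mapping".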
	

You may specify different name mappers to be applied to the slots of "container types". This capability is described in a later section.

Prefixes

As noted above, the name mapping function is passed a "prefix" argument. By default, it is an empty string, but users may specify a different value by adding a "prefix:" option to the interface definition. For example, we might expand the previous example to:

define interface
   #include "gc.h",
      object-file: "/usr/lib/mindy/gc.o",
      name-mapper: c-to-dylan,
      prefix: "gc-";
end interface;
	

This would cause Melange to tack "gc-" onto the beginning of every translated symbol. Because the system knows about the "standard" Dylan naming conventions, it can do this intelligently. You would, therefore, get names like "<gc-bool>", "gc-time-to-gc", and "gc-scavenge".

Note that the interpretation of the "prefix" is entirely up to the name mapping routine. Identity-name-mapping, for example, completely ignores the prefix. All of the other standard mapping functions prepend it to the name before adding brackets or dollar signs, but after performing all other transformations.

Facilities for adding "localized" prefixes to slot accessors, enumeration literals, etc. will be described in later sections.

Explicit Renaming

Although the automatic name mapping described above is sufficient for most objects named within a header file, there are cases in which you might wish to explicitly control the name of one or more specific objects. You can do this through a "rename:" option. This option specifies a list of translations between raw C names and Dylan identifiers. For example, we might have:

define interface
   #include "gc.h",
      object-file: "/usr/lib/mindy/gc.o",
      name-mapper: c-to-dylan,
      prefix: "gc-",
      rename: {"struct obj" => <C-Object>, "collect_garbage" => GC};
end interface;
	

Note that the "target" of the renaming is an ordinary Dylan variable and is therefore case-insensitive. However, the source is an "alien name", which is (like all C code) case sensitive. Alien names should refer to an object, function, or type in exactly the same way you would refer to them in C. We therefore say "struct obj" instead of simply "obj", and might also say "enum foo" or "union bar". Alien names are actually parsed according to the standard lexical conventions of C, so you may use arbitrary spacing and even include comments if you really wish.

Note that "rename:" options supply names for new objects (and types) that are being imported into Dylan. You cannot, therefore, simply rename "bool" to "<Boolean>" to make it equivalent to the existing type -- this would simply result in a name conflict. For these purposes, you would instead use the "equate" and "map" operations, which will be described later. (In fact, if the C declaration had defined a type name "boolean", you might have to explicitly rename it to something else in order to avoid name conflicts with the existing type. Of course, in the above example, the "gc-" prefix would be sufficient to make the name unique.)

Anonymous Types

The alien names described above can also be used to refer to C's so-called "anonymous types". You can therefore refer to "char *", "int [23]", or even "int (*) (char *foo)" (i.e. a pointer to a function which takes a string and returns an integer). [At present, function types are not fully supported. You should not depend upon them to work as expected.] The ability to refer to anonymous types is useful because it allows you to use the "rename:" option to provide explicit names for such types. Normally Melange would simply generate an arbitrary "anonymous" identifier for the type. Without knowing the name of this type, you could not define new operations upon it. However, by saying, for example, "rename: {"char *" => <char-ptr>}", you can provide a convenient handle to use in defining new operations.
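
For instance, once "char *" has a name, ordinary methods can be specialized on it. This is only a sketch: "<char-ptr>" and "first-char-code" are illustrative names, and the method relies on the "pointer-value" operation described later in this chapter.

define interface
   #include "gc.h",
      rename: {"char *" => <char-ptr>};
end interface;

// Returns the integer code of the first character the pointer refers to.
define method first-char-code (ptr :: <char-ptr>) => (code :: <integer>)
   pointer-value(ptr);
end method first-char-code;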

Type Definitions

When Melange encounters a "type definition" [The definition may be implicit, as in "char ** int" or "struct foo *bar". Simply by being present these code fragments supply implicit definitions for "char *", "char **" and "struct foo".] within a header file, it will typically create a new Dylan class which corresponds to that C type. Usually, this will be a subclass of <statically-typed-pointer>, which encapsulates the raw C pointer value (i.e. an object address). Each statically typed pointer class will have exactly the same structure (i.e. a single address), but the class itself can be used to determine what operations are supported on the data. This could include slot accessors for "struct"s and "union"s, dereference operations for "pointer" types, or general information about the object's size, etc.

There are times when you will find that some of the types defined in a header file are not really "new". It might be that they are completely identical to some type defined in another interface definition, or they might be "isomorphic" to some existing type which has more complete support. Melange provides support for both of these cases. The first case is handled by "equating" the two types, while the second is handled by "mapping" (i.e. transforming) one type into the other.

For example, many header files contain definitions that use the types "char *" and "boolean". The declarations of these types don't provide any semantic interpretations -- "char *" is simply the address of a character, and boolean is nothing but a one-byte integer. However, by equating "char *" to the predefined <c-string> type, we can tell Melange that it is actually a <string> and should inherit all of the operations defined upon <string>s. Likewise, we can map the integral "boolean" values into "#t" and "#f" to get a <boolean>. These integral values will be automatically translated into <boolean>s when they are returned by a C function, and <boolean>s will be translated back into integers when passed as arguments to C functions.

Implicit class definitions

Unless otherwise specified, new classes will be created for each type defined in a C header file. When the header file provides meaningful names for these types, then Melange will pass those names to the mapping functions to generate names for the Dylan classes. Otherwise, an anonymous name will be generated, limiting your ability to refer to the new type. For example, "struct foo" would typically generate the class "<foo>", while "struct foo ***" might generate the class "<anonymous-107>". In either case, you can explicitly specify the name for the new class by using the "rename:" option described above.

Different sorts of C declarations will yield different sorts of Dylan classes as well as different sets of operations defined upon them. Therefore, we will consider each variety separately:

Primitive types

The types "int", "char", "long", "short" and their unsigned counterparts are simply translated into <integer>, while "float" and "double" are translated into <float>. However, Melange knows the sizes of each of these types so that pointers and native C "vectors" of them (described below) will work properly. No new types are created for these types.

Pointer types

Declarations like "int *" or "struct foo ***" generate new subclasses of "<statically-typed-pointer>". Note that "struct foo *" is actually treated as a synonym for "struct foo", and does not get a distinct class, although any extra levels of indirection (i.e. "struct foo **") will generate new pointer classes. The following operations are supported upon pointer classes:

pointer-value (pointer, #key index) => (value)
	      

This function "dereferences" the pointer and returns the value. If index is supplied, then "pointer" is treated as a vector of values and the appropriate element is returned.

content-size (cls) => integer
	      

Returns the size of the value referenced by instances of "cls". If the size is not known, this is 0.

Note that these types are not automatically treated as vectors. You may, however, make them so by using a "superclasses:" option to make them <c-vector>s.
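
As a sketch of how the "index:" keyword might be used, the following method adds up the first "count" integers behind an "int *" pointer. It assumes the pointer class has been given the name "<int-ptr>" (for example through a "rename:" option); both that name and "sum-ints" are illustrative.

define method sum-ints (ptr :: <int-ptr>, count :: <integer>)
=> (sum :: <integer>)
   let total = 0;
   // Treat the pointer as a vector and dereference each element in turn.
   for (i from 0 below count)
      total := total + pointer-value(ptr, index: i);
   end for;
   total;
end method sum-ints;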

Vector types

Declarations like "char [256]" are treated almost identically to pointer types, but they are automatically defined as subclasses of <c-vector>, so that all vector operations will be defined on them. However, because many systems depend upon the lack of bounds checking in C, vector types have a default size of "#f". You may explicitly define "size" functions to provide a more accurate size.
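
A "size" method for such a class can be very small. The sketch below assumes that "char [256]" has been named "<fixed-string>", as in the pointer clause example later in this chapter.

// Report the declared length of the C array.
define method size (str :: <fixed-string>) => (result :: <integer>)
   256;
end method size;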

Structure types

Declarations like "struct bar {int a; char *b;}" also generate new subclasses of "<statically-typed-pointer>". Melange will define all of the operations defined for pointer values (described above), as well as accessors for each of the structure slots. Structure objects are always accessed through "pointers" to them. Therefore, unless a non-zero index is specified, "pointer-value" will simply return the object passed to it. (The operation is still defined because non-zero indices can be used for vector access.)

Union types

Declarations like "union bar {int a; char *b;}" are treated the same as struct declarations, except that the slot accessors all refer to the same areas in memory.

Enumeration types

Declarations like "enum foo {one, two, three};" are simply aliased to <integer>. However, constants are defined for each of the enumeration literals.

Typedefs

Declarations like "typedef struct foo bar" simply define new names for existing types.

Specifying class inheritance

When Melange creates new "<statically-typed-pointer>" classes, it typically creates them as simple subclasses of "<statically-typed-pointer>", with no other superclasses. However, you might sometimes need more control over the class hierarchy. For example, you might wish to specify that a C type should be considered a subtype of the abstract class "<sequence>". You could accomplish this via the following declarations:

define interface
   #include "sequence.h";
   struct "struct cons_cell" => <c-list>,
      superclasses: {<sequence>};
   function "c_list_size" => size;
end interface;

define method forward-iteration-protocol (seq :: <c-list>)
....
	

Note that the type "<c-list>" will still be a subclass of "<statically-typed-pointer>"—we have simply added "<sequence>" to the list of superclasses. If "<statically-typed-pointer>" is not explicitly included in the "superclasses:" option, then it will be added at the end of the superclass list.

As demonstrated in the above example, you are still responsible for specifying whatever functions are required to satisfy the contract for the declared superclasses. "<C-list>" will be declared as a sequence, but you must specify a forward iteration protocol before any of the standard sequence operations will work properly.

The "superclasses:" option may currently be used within "struct", "union", and "pointer" clauses.

Translating Object Representations

Whenever a native C object is returned from a function or a Dylan object is passed into a C function, it is necessary to translate between the object representations used by the two languages. From Melange's standpoint, native C objects consist of an arbitrary bit pattern which can be translated to or from a small number of "low level" Dylan types -- namely <integer>, <float>, or any subclass of <statically-typed-pointer>. This translation is handled automatically, although the user may explicitly specify which of the possible Dylan types should be chosen for any given C object type. In some cases, a further translation may take place, converting the "low level" Dylan value to or from some arbitrary "high level" Dylan type. (For example, an <integer> might be translated into a <boolean> or a <character>, and a <c-string> might be translated into a <byte-string>.) These "high level" translations are automatically invoked at the appropriate times, but both the "target" types and the methods for performing the translation must be specified by the user.

Specifying low level transformations

The target Dylan type for "low level" translations is typically chosen automatically by Melange. Integer and enumeration types are translated into <integer>; floating point types are translated to <float>; and all other types are translated into newly created subclasses of <statically-typed-pointer>. However, you may explicitly declare the target Dylan type for any C type by means of an "equate:" option:

define interface
   #include "gc.h",
      equate: {"char *" => <c-string>};
end interface;
	

This declaration makes the very strong statement that any values declared in C as "char *" are identical in form to the predefined type "<c-string>" (which is described in Appendix I). The system will therefore not define a distinct type for "char *" and will ignore any structural information provided in the header file. You might also use an "equate:" option to equate a type mentioned in one interface definition with an identically named type which was defined in an earlier interface definition.

You should use caution when equating two types. Since Melange has no way of knowing when two types are equivalent, it must trust your declarations. No type checking can or will be done, so if you incorrectly equate two types, the results will be unpredictable. In some cases, you may wish to go with the less efficient but slightly safer technique of letting Melange create a new type and then "mapping" that new type into the desired type. (This is described in detail below.)

Note also that two types with identical purposes will not necessarily have identical representations. For example, C's boolean types are simple integers and are not equivalent to Dylan's <boolean>. Again, explicit "mapping" may be used to transform between these two representations.

In the current implementation, an "equate:" option only applies within a single interface definition. Other interface definitions will not automatically inherit the effects of the declaration. In future versions, we may add the ability to "use" other interface definitions (just as you would "use" another module within a module definition) and thus pick up the effects of the "equate:" (and "map:") options within those interfaces.

Specifying high level transformations

Sometimes you may wish to use instances of some C type as if they were instances of some existing Dylan class, even though they have different representations. In this case, you can specify a secondary translation phase which semi-automatically translates between a "low level" and a "high level" Dylan representation. In order to do this, you must provide a "map:" option:

define interface
   #include "gc.h",
      equate: {"char *" => <c-string>},
      map: {"bool" => <boolean>};
end interface;
	

This clause will cause any functions defined within the interface to call transformation functions wherever the original C functions accept or return values of type "bool". Two different functions may be called:

import-value (high-level-class :: <class>, low-level-value :: <object>)
	

This function is called to transform result values returned by C functions into a "high level" Dylan class. It should always return an instance of "high-level-class".

export-value (low-level-class :: <class>, high-level-value :: <object>)
	

This function is called to transform "high level" argument values passed to C functions into the "low level" representations which will be meaningful to native C code. It should always return an instance of "low-level-class".

Default methods, which simply call "as", are provided for each of these functions. This will be sufficient to transform C's integral "char"s into <character>s, <c-string>s into other <string>s, or one "pointer" type into another. There is also a predefined method which will transform <integer>s into <boolean>s. However, if you wish to perform arbitrary transformations upon the values, you may need to define additional methods for either or both of these functions. For example, the default methods for transforming to and from <boolean> are:

define method export-value (cls == <integer>, value :: <boolean>)
 => (result :: <integer>);
   if (value) 1 else 0 end if;
end method export-value;

define method import-value (cls == <boolean>, value :: <integer>)
 => (result :: <boolean>);
   value ~= 0;
end method import-value;
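
User-defined transformations follow the same shape. As a hypothetical illustration, suppose a header declares an integral type "status_t" in which 0 means success; the type name and the symbols below are assumptions made for this sketch. With "map: {"status_t" => <symbol>}" in the interface definition, methods such as these would translate between the two representations:

// C result -> Dylan: zero becomes #"ok", anything else #"error".
define method import-value (cls == <symbol>, value :: <integer>)
 => (result :: <symbol>);
   if (value = 0) #"ok" else #"error" end if;
end method import-value;

// Dylan argument -> C: #"ok" becomes 0, any other symbol becomes 1.
define method export-value (cls == <integer>, value :: <symbol>)
 => (result :: <integer>);
   if (value == #"ok") 0 else 1 end if;
end method export-value;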
	

It is important to note that, unlike "equate:" options, "map:" options don't prevent Melange from creating new types. You may, in fact, both equate and map the same type. This will cause low level values to be created as instances of the "equated" type and then transformed into instances of the "target" type of the mapping. For example, you might take advantage of the defined transformations between string types by declaring:

define interface
   #include "/usr/include/sys/dirent.h",
      equate: {"char *" => <c-string>},
      map: {"char *" => <byte-string>};
end interface;
	

This causes the system to automatically translate "char *" pointers into <c-string>s (i.e. a particular variety of statically typed pointer) and then to call "import-value" to translate the <c-string> into a <byte-string>. If we did not provide the "equate:" option, then we would have to explicitly provide a function to transform "pointers to characters" into <byte-string>s. The "equate:" option lets us take advantage of all of the predefined functions for <string>s, which includes transformation into other string types.

Other File Options

There are a few other options that may be specified within an "#include" clause, but which do not fit into any of the above categories. These options are "define:", "undefine:", "seal-functions:" and "read-only:".

The "define:" and "undefine:" options control the C preprocessor definitions which will be implicitly defined during parsing of the header files. If you specify neither of these options, Melange will use a default set of definitions which correspond to those used by a typical C compiler for the machine you are running on. [At present, the only set of definitions provided will be those appropriate for the HPUX OS. However, it is straightforward to add different sets of definitions to Melange.] The "define:" option takes a string containing a single C token and an optional string or integer literal, which will be used as the expansion. (If no literal is specified, the token will be expanded to "1".) The "undefine:" option removes one or more of the default definitions. You might, for example, use:

define interface
   #include "gc.h",
      define: {"PMAX", "BSD_VERSION" => "4.3"},
      undefine: {"HPUX"};
end interface;
      

The "seal-functions:" option controls whether the various imported functions and slot accessors will be sealed or open. By default, functions are sealed, but you may explicitly specify this by using "seal-functions: sealed" or reverse it by using "seal-functions: open". Melange does not support Creole's "inline" sealing option.

The "read-only:" option specifies whether setter functions should be defined for slot and object accessors. They will be defined by default, but if you specify "read-only: #t", no setters will be defined.

The effects of the "seal-functions:" and "read-only:" options can be modified for particular container types. We will explain how to do this in later sections.
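
For example, an interface whose functions should remain open to new methods, but whose slots and variables should never be modified from Dylan, might be declared as follows (a sketch reusing the "gc.h" example):

define interface
   #include "gc.h",
      seal-functions: open,
      read-only: #t;
end interface;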

Function Clauses

Imported functions can be easily invoked, in almost every case, without any additional declarations. However, by exerting explicit control over argument handling, the interfaces to some functions may be made cleaner. This control is exerted via function clauses. The primary purpose of these clauses is to specify additional type information for specific parameters or to specify alternative argument passing conventions. For example, if we had two alternate "read-integers" functions with the following declarations:

int ReadInts1(int **VectorPtr);  /* result is a count of integers */
int *ReadInts2(int *Count);      /* result is a vector of integers */
      

we might use the following interface definition:

define interface
   #include "readints.h",
      rename: {"int *" => <int-vector>};
   function "ReadInts1",
      output-argument: 1;
   function "ReadInts2" => Read-Integers-Vector,
      output-argument: Count;
end interface;
      

This would produce two functions, both of which take 0 arguments but return two values. The first would return an <integer> followed by an <int-vector>, while the second would return the <int-vector> first and the <integer> second.

let (count :: <integer>, values :: <int-vector>)
   = Read-Ints1();
let (values :: <int-vector>, count :: <integer>) 
   = Read-Integers-Vector();
      

The function clause consists of a function name (which is a string), an optional renaming (as illustrated above), and an optional sequence of "options". The options include the following:

seal:

specifies whether the resulting method should be sealed. Possible values are sealed or open, and the default is taken from the value specified in the initial file clause. (The "default default" is sealed.)

equate-result:

overrides the default interpretation of the result type. The named type is assumed to be fully defined.

map-result:

specifies that "import-value" should be called to map the result value to the named type.

ignore-result:

specifies that the function's result value should be ignored, just as if the function had been declared "void". Although you may specify any boolean literal, the only meaningful value is #t.

equate-argument:

overrides the default interpretation of some argument's type. The argument may be specified by name or by position.

map-argument:

specifies that "export-value" should be called to map the given argument into the named type. Again, the argument may be specified by position or by name.

input-argument:

indicates that the specified argument should be passed by value. This is the default.

output-argument:

indicates that the specified argument should be treated as a return value rather than a "parameter". The effect is to declare that the C parameter will be passed by reference and that the reference variable need not be initialized to any object. This option assumes that the C parameter will have been declared as a "pointer" type, and will strip one "*" off of the argument type. Thus, if the parameter declaration specifies "int **", the actual value returned will have the Dylan type corresponding to "int *".

input-output-argument:

indicates that the specified argument should be passed as an input argument and that its (potentially modified) value should also be returned as an additional result value. The effect is similar to that of "output-argument:" except that the reference variable will be initialized with the argument value.

The following (nonsensical) example demonstrates all of the options, as they might be applied to the functions:

extern struct object *bar(int first, int *second, struct object **third);
extern baz(char first, struct object *second);

define interface
   #include "demo.h";
   function "bar",
      seal: open,
      equate-result: <object>,
      map-result: <bar-object>,
      input-argument: first,   // passed normally
      output-argument: 2,      // nothing passed in, second result value
            // will be <integer>
      input-output-argument: third;   // passed in as second argument, 
            // returned as third result
   function "baz" => arbitrary-function-name,
      seal: sealed,      // default
      ignore-result: #t,
      equate-argument: {second => <object>},
      map-argument: {2 => <baz-object>};
end interface;
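
Given those declarations, calls from Dylan might look like the sketch below. The variables "some-objects" and "some-baz-object" are placeholders for values of the appropriate types, and the result names are arbitrary.

// "second" is an output argument, so only "first" and "third" are passed in.
// The results are the function result, then "second", then "third".
let (result, second, third) = bar(42, some-objects);

// The result of "baz" is ignored, so the call is made only for its effect.
arbitrary-function-name(65, some-baz-object);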
      

Struct and Union Clauses

"Struct clauses" and "union clauses" (referred to collectively as "container clauses") are used to specify naming in inclusion of class slots in exactly the same way that the options in the file clause control the handling of global definitions. Like the function clauses described above, they consist of the reserved word "struct" or "union", a string which gives the full C name of the container declaration, an optional renaming, and a list of options. If we have a structure defined by

typedef struct cons {
   int index;
   struct object *head;
   struct cons *tail;
} cons_cell;
      

we could use the following interface definition:

define interface
   #include "cons.h";
   struct "struct cons" => <c-list>,
      superclasses: {<sequence>},
      prefix: "c-list-",
      name-mapper: identity-name-mapping,
      exclude: {"index"};
end interface;
      

Valid options for container clauses include: import:, prefix:, exclude:, rename:, seal-functions:, read-only:, equate:, and map:. These options act like the equivalent options which may be specified in a file clause, but they apply to the slots of a single "class" rather than to globally defined objects. Options specified within a container clause override any global defaults that might have been specified in the "#include" clause. Container clauses also permit the "superclasses:" option described in the section called “Specifying class inheritance”. Although the recommended method for specifying a container type is to use the full C name (i.e. "struct foo"), you may also use an alias defined by a typedef. Thus, in the above example, you could have specified either "struct cons" or "cons_cell", with identical results.
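
As a sketch of the alias form, the clause below names the same structure through its typedef and additionally suppresses setters for its slots. It uses only options listed above; the particular combination is purely illustrative.

define interface
   #include "cons.h";
   struct "cons_cell" => <c-list>,
      read-only: #t,
      exclude: {"index"};
end interface;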

Pointer Clauses

"Pointer clauses" modify the definitions of pointer declarations such as "int *" or "struct foo ***", or vector declarations such as "char [256]". Like all such clauses, they may be used to specify renamings for the classes. This is particularly useful for pointer types since they are not automatically assigned user-meaningful names. It also allows specification of the "superclasses:" option described in the section called “Specifying class inheritance”. A typical use might be:

define interface
   #include "vec.h";
   pointer "int *" => <int-vector>,
      superclasses: {<c-vector>};
   pointer "struct person **" => <people>,
      superclasses: {<c-vector>};
   pointer "char [256]" => <fixed-string>;
end interface;
      

This clause is particularly useful for declaring pointer types to be subclasses of <c-vector> so that they can be indexed via "element". (Note that this is not necessary for vector declarations, since they are automatically declared to be <c-vector>s.)
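
With the declarations above, elements can then be read with the usual sequence syntax. A minimal sketch, in which "third-int" is an illustrative name:

define method third-int (vec :: <int-vector>) => (value :: <integer>)
   vec[2];   // shorthand for element(vec, 2)
end method third-int;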

Constant Clauses

Constant clauses are used to override the values of constants specified in header files (i.e. "#define MAXINT 27"). The "value:" option, which is the only one supported, specifies a Dylan literal which will be taken as the value of the named constant. A typical use might be:

define interface
   #include "const.h";
   constant "MAXINT" => $maximum-fixed-integer,
      value: 9999999;
end interface;
      

Variable Clauses

Global variables declared within C header files are translated into "getter" functions which retrieve the value of the C variables and optional "setter" functions to modify those values. In effect, they are treated as slots of a "null object"—the getter function takes no arguments and returns the value of the variable, while the setter function takes a single value which will be the new value of the variable. Type mapping takes place for the arguments and results of these functions, just as it would for slot accessors.

Variable clauses support the following options:

getter:

specifies a Dylan variable name which will be used to hold the getter function.

setter:

specifies either a Dylan variable name which will be used to hold the setter function, or #f to indicate that there should be no setter function.

read-only:

specifies whether the variable should be settable. "read-only: #t" is equivalent to "setter: #f".

seal:

specifies whether the getter and setter functions should be sealed. Possible values are "sealed" or "open", and the default is taken from the "seal-functions:" option in the "#include" clause (or "sealed" if not specified there).

map:

specifies the high-level type to which the variable should be mapped.

equate:

specifies the low-level type to which the raw C value should be implicitly converted.
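
The text above does not show a complete variable clause, so the following is only a guess at its shape, modeled on the function and constant clauses: it assumes the clause is introduced by the word "variable" followed by the C name. The header "counter.h" and the variable "call_count" are hypothetical.

define interface
   #include "counter.h";
   variable "call_count",
      getter: call-count,
      setter: #f;     // no setter, equivalent to "read-only: #t"
end interface;

// The getter takes no arguments and returns the current value.
let calls = call-count();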

Low level support facilities

The high level functions for calling C routines or for accessing global variables are all built upon a relatively small number of built-in primitives which perform specific low-level tasks. You should seldom have any need to deal with these primitives directly, but they are nonetheless available should you need to make use of them. To use these types and operations, you should "use" the module "system" from the "Dylan" library.

Predefined types

<statically-typed-pointer> [class]
	

Unless otherwise specified, C pointers are implicitly "equated" to newly created subclasses of <statically-typed-pointer>. This class contains a single implicit slot which contains the raw C pointer. Because of implementation limitations in Mindy, you may not add any extra slots to subclasses of <statically-typed-pointer>, nor can such a subclass inherit slots from other classes. You may, however, create classes which are subclasses of both <statically-typed-pointer> and other (presumably abstract) classes which have no slots.

The "make" method for <statically-typed-pointer> takes three keywords. The "pointer:" keyword tells it to initialize the new variable with the given value, which must be a <statically-typed-pointer> or an <integer>. If no pointer value is specified, space will be allocated based upon the content-size of the specific type and upon the "extra-bytes:" and "element-count:" keywords. These keywords, which default to "0" and "1" respectively, tell how many bytes of extra memory (beyond that specified by "content-size") should be allocated for each element and how many objects are going to be stored in the memory.
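
A sketch of both styles of creation, assuming a pointer class named "<int-vector>" as in the pointer clause example earlier in this chapter:

// Allocate fresh space for ten integers.
let numbers = make(<int-vector>, element-count: 10);

// Create a second object which refers to the same address.
let alias = make(<int-vector>, pointer: numbers);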

<c-vector> [class]
	

<C-vector> is a subclass of <statically-typed-pointer> which inherits operations from <vector>. Because systems often depend upon C's lack of bounds checking, the default size for <c-vector>s is "#f". However, subclasses of <c-vector> may provide a concrete size if desired. Types corresponding to declarations such as "char [256]" are automatically declared as subclasses of <c-vector>, but pointer declarations such as "char *" are not.

<c-string> [class]
	

<c-string> is a subclass of <statically-typed-pointer> which also inherits operations from <string>. It is implemented as a C pointer to a null-terminated vector of characters, and provides a method on forward-iteration-protocol which understands this implementation. This class may, therefore, be used for manipulating C's native format for "string"s (i.e. "char *"). Note that the "null" string is considered to be a valid empty string. This is somewhat contrary to the semantics of many C operations, but provides a safer model for Dylan code.

The "make" method for <c-string>s accepts the "size:" and "fill:" keywords.

There are a few surprising properties of <c-string>s which users should be aware of, all of which result from the "null-terminated" implementation. Firstly, the "size" of the string is computed by counting from the beginning of the string, and is therefore not nearly as efficient as you might expect. Secondly, you should expect odd results if you try to store "as(<character>, 0)" into such a string. Finally, the "element" and "element-setter" methods must scan the string in order to do bounds checking, and may therefore be fairly slow. If you wish to (unsafely) bypass this checking, you must use "pointer-value" instead.
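The cost of these properties is easiest to see on the C side. The sketch below is not Mindy's implementation; the function names are hypothetical and merely mimic the behavior described above: the size is found by scanning for the terminator, a null pointer counts as an empty string, and a bounds-checked access has to perform that scan first.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for "size": count characters up to the
 * terminating zero byte. A null pointer is treated as an empty string. */
size_t c_string_size(const char *s)
{
    size_t n = 0;
    if (s == NULL)
        return 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Hypothetical stand-in for a bounds-checked "element": the whole
 * string is scanned to find its end before the index is accepted.
 * Returns 0 and stores the character on success, -1 if out of range. */
int c_string_element(const char *s, size_t index, char *out)
{
    assert(out != NULL);
    if (index >= c_string_size(s))
        return -1;
    *out = s[index];
    return 0;
}
```

Storing a zero byte into the middle of such a string shortens it: after writing '\0' at index 2 of "hello", the scan stops there and the reported size is 2, although the remaining characters are still in memory.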

<c-function> [class]
	

<c-function>s, like <statically-typed-pointer>s, encapsulate a raw "C" pointer. However, <c-function>s also encode information about the calling conventions of the function which is (presumably) located at the given address. They may, therefore, be directly invoked just like any other function.

<foreign-file> [class]
	

The <foreign-file> class is used to store information about the contents of a particular object file. It is created by "load-object-file", and may be passed as an option to "find-c-function" and "find-c-pointer". (All of these functions are described below.)

Locating native C objects

There are several functions provided which search for C functions or variables and return Dylan objects which refer to them. Note that Mindy does not have sufficient information to determine whether any given C object is a function, and therefore it depends upon the user (or, more often, Melange) to provide it with correct information.

load-object-file(files :: <list>, #key symbols) [function]
	

This function (which presently works only on HPUX machines) attempts to dynamically load a given object file (i.e. ".o" or ".a") into the current Mindy process and load its symbol table to allow its contents to be located by "find-c-pointer" or "find-c-function". If it successfully loads the file, it will return a <foreign-file> encapsulating the symbol table information. Otherwise, it will return #f.

If you are not running on an HPUX machine, you will have to statically link object files into Mindy, as described in Chapter II.

find-c-pointer (name :: <string>, #key file :: <foreign-file>) [function]
	

This function searches through the symbol table for the object file corresponding to the specified file (or for Mindy itself) and attempts to locate a symbol with the given name. If it finds such a symbol, it converts the corresponding address to a <statically-typed-pointer> and returns it. Otherwise, it returns #f.

find-c-function (name :: <string>, #key file) [function]

constrain-c-function (fun :: <c-function>, specializer :: <list>, rest? :: <boolean>, results :: <list>) [function]
	

The function "find-c-function" is like "find-c-pointer", except that the result is a <c-function> (or #f). The resulting function is specialized to "fun(#rest args) :: <object>". However, it may be constrained to a different set of specializers via "constrain-c-function". This function accepts lists of types for the arguments and for the return values, and a boolean value which states whether optional arguments are accepted. The result declarations are particularly important, since they are used to coerce the raw C result value into an appropriate low level Dylan type. The possible types are <boolean>, <integer>, or any subclass of <statically-typed-pointer>. Note that although a list of result types is accepted, only the first can be meaningful since C does not support multiple return values.

Note

The functions in this section are likely to change drastically in the near future.

Pointer manipulation operations

Each <statically-typed-pointer> encapsulates a pointer to some area of memory (i.e. a raw machine address). In itself, this does little good, except as an arbitrary token. However, Mindy provides a number of primitive operations which manipulate the contents of these addresses, or do basic comparisons and arithmetic upon the addresses themselves.

signed-byte-at (ptr :: <statically-typed-pointer>, #key offset) [function]

unsigned-byte-at (ptr :: <statically-typed-pointer>, #key offset) [function]

signed-short-at (ptr :: <statically-typed-pointer>, #key offset) [function]

unsigned-short-at (ptr :: <statically-typed-pointer>, #key offset) [function]

signed-long-at (ptr :: <statically-typed-pointer>, #key offset) [function]

unsigned-long-at (ptr :: <statically-typed-pointer>, #key offset) [function]

pointer-at (ptr :: <statically-typed-pointer>, #key offset, class) [function]
	

These operations return an object which represents the value stored at the address corresponding to "ptr". The first six operations all return <integer>s -- the different versions are required because the same number may be represented in a variety of formats (differing in length and interpretation of the high-order bit) and because Mindy has no way of determining which might be used in a given situation. The final operation, "pointer-at", returns a new <statically-typed-pointer> encapsulating the address referenced by the original pointer. You may use the "class:" keyword to specify that the new object should be an instance of some particular subclass of <statically-typed-pointer>. (Thus, for example, "pointer-at(foo, class: <bar>)" would be roughly equivalent to "as(<bar>, pointer-at(foo))".)

The offset parameter (if provided) is added to the integer corresponding to the machine address before the pointer is dereferenced. This is useful, for example, in loading an object from within a C "struct".
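The following C sketch illustrates why separate signed and unsigned accessors exist and what the offset is for. It is not Mindy's code: the function names and the "struct sample" layout are hypothetical, a C "short" is assumed to be 16 bits wide, and the value is read in the machine's native byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical struct used only to demonstrate field offsets. */
struct sample {
    int32_t  id;
    int16_t  delta;
    uint16_t flags;
};

/* Read the 16 bits stored at (ptr + offset) as a signed value.
 * memcpy avoids alignment and aliasing problems. */
int16_t read_signed_short_at(const void *ptr, size_t offset)
{
    int16_t value;
    assert(ptr != NULL);
    memcpy(&value, (const unsigned char *)ptr + offset, sizeof value);
    return value;
}

/* Read the same 16 bits, but interpret the high-order bit as part
 * of the magnitude instead of as a sign. */
uint16_t read_unsigned_short_at(const void *ptr, size_t offset)
{
    uint16_t value;
    assert(ptr != NULL);
    memcpy(&value, (const unsigned char *)ptr + offset, sizeof value);
    return value;
}
```

If the "delta" field of a "struct sample" holds -1, reading it at "offsetof(struct sample, delta)" yields -1 through the signed accessor and 65535 through the unsigned one; the bits in memory are identical, and only the caller knows which interpretation is intended.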

Setter functions are provided corresponding to each of the above functions. You can, therefore, say

signed-short-at(ptr) := 32767;
pointer-at(ptr1) := ptr2;

as (cls == <integer>, ptr :: <statically-typed-pointer>) [G.F. Method]

as (cls == <statically-typed-pointer>, ptr :: <statically-typed-pointer>) [G.F. Method]

as (cls == <statically-typed-pointer>, int :: <integer>) [G.F. Method]
	

Methods upon "as" are provided for converting from <integer> to any statically typed pointer class and from any statically typed pointer class to <integer> or to another statically typed pointer class.

\+ (ptr :: <statically-typed-pointer>, int :: <integer>) [G.F. Method]

\- (ptr1 :: <statically-typed-pointer>, ptr2 :: <statically-typed-pointer>) [G.F. Method]

\= (ptr1 :: <statically-typed-pointer>, ptr2 :: <statically-typed-pointer>) [G.F. Method]
	

These functions do arithmetic upon the integers corresponding to the given pointers. The following code fragment

let new-ptr = ptr1 + 3;
let difference = ptr2 - ptr3;
let same? = (ptr2 = ptr3);
	

is equivalent to

let new-ptr = as(ptr1.object-class, as(<integer>, ptr1) + 3);
let difference = as(<integer>, ptr2) - as(<integer>, ptr3);
let same? = (as(<integer>, ptr2) = as(<integer>, ptr3));
	

Static linking mechanisms

Because object file formats vary widely by architecture, Mindy does not support dynamic loading of object files or automatic symbol table look up for all architectures. In the general case, it is necessary to depend upon a less elegant technique for explicitly making certain C objects available.

Simple instructions for using this mechanism from within Melange are given in the section called “Loading and Finding Objects”. This section simply provides more information on the underlying mechanism.

In order to make sure that the desired symbols can be located, it is necessary to build an explicit table which maps between the symbol's name and its address. This table is automatically created by running the "make-init.pl" script (which requires Perl to be installed on your system) upon a list of "interface definition files". This will create two files ",extern1.def" and ",extern2.def", which should then be renamed to "extern1.def" and "extern2.def" respectively. These files are automatically included by "ext-init.c" so that the table will be created after Mindy is rebuilt.

The interface definition files consist of zero or more lines of text, each of which should contain the name of one object. If the object is a function, it should be immediately followed by a set of parentheses. For example, the file which defines the memory allocation routines used by Melange's support code contains the following four lines:

free()
malloc()
strcmp()
strlen()
      

The only other step required to make the objects available is simply to ensure that the library which contains them is linked into Mindy. The easiest way to accomplish all of this is to simply modify the Makefile in Mindy's source directory. If you add the names of the required libraries to LIBS and the names of the interface definition files to EXTERN-INCLUDES, make will do the necessary work for you. You should be sure to leave "../compat/libcompat.a" or "-lm" in LIBS and "malloc.inc" in EXTERN-INCLUDES.
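The name-to-address table described in this section can be pictured with a small C sketch. The layout shown here is purely illustrative: the real contents of "extern1.def" and "extern2.def" and the code in "ext-init.c" are not reproduced in this guide, and the names "symbol_entry", "symbol_table", and "lookup_symbol" are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Generic function pointer type used to store addresses of functions
 * with differing signatures; callers cast back before calling. */
typedef void (*generic_fn)(void);

/* One table row: a symbol's name and its address. */
struct symbol_entry {
    const char *name;
    generic_fn  address;
};

/* Hypothetical table for the four routines listed above. Referring
 * to each function here also forces the linker to keep it. */
static const struct symbol_entry symbol_table[] = {
    { "free",   (generic_fn)free   },
    { "malloc", (generic_fn)malloc },
    { "strcmp", (generic_fn)strcmp },
    { "strlen", (generic_fn)strlen },
};

/* Linear search by name; returns NULL when the symbol is unknown. */
generic_fn lookup_symbol(const char *name)
{
    size_t count = sizeof symbol_table / sizeof symbol_table[0];
    assert(name != NULL);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(symbol_table[i].name, name) == 0)
            return symbol_table[i].address;
    }
    return NULL;
}
```

A table of this kind explains both requirements stated above: a symbol can be found only if it was listed in an interface definition file, and the library that defines it must be linked in, since otherwise the table itself could not be linked.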

Differences from Creole

It would be difficult to produce an exhaustive list of the differences between Creole and Melange. We can, however, include a brief examination of the most important incompatibilities between the two systems.

  • Creole's "type:" options have been replaced by Melange's "equate:" and "map:" options.

  • Creole's "access path" options have been replaced by "object-file:" and "mindy-include-file:".

  • The interface to "import-value" and "export-value" differs between the two systems.

  • Melange does not inherit type mappings from other "define interface" forms.

  • Creole does not import definitions from "recursively included" header files, even if they are referenced by definitions which are imported.

  • Creole does not support C vectors or "sub-structures" as first class objects.

  • Melange does not presently support callbacks, "export-temporary-value", "<pascal-string>", "with-stack-structure", "with-stack-block", or "alien-method".

  • Creole will never consider instances of two distinct statically typed pointer classes to be "=", even if they refer to the same address.

Known limitations

Although mostly complete, the current implementation of Melange is missing a few elements which might be required for some applications. The following capabilities probably should be present, but are not yet supported:

  • Floating point numbers.

  • Callbacks.

  • Function types. (It is, however, possible to import a function as a simple <statically-typed-pointer> and then manipulate it like any other object.)

Proposed modifications

Although Melange seems to be fairly useful in its present form, we are currently considering a number of ways in which it may be made more useful. This section contains a brief discussion of several potential changes which may be implemented in the future.

Enumeration clauses

At present, there is no way to modify the default handling of a C enumeration declaration. It is clear that you might wish a mechanism to specify several different explicit options: prefixes for the enumeration constants; respecification of constant values; and, of course, explicit "import:" and "exclude:" options.

Inheritance of "map" and "equate" options

There are some cases in which a set of types imported within one interface definition might be used extensively within another. In the present implementation, the two interface definitions would be handled independently and equivalences between types would not be recognized in the absence of explicit "equate:" options.

One proposed solution would involve the ability to explicitly "use" one interface definition within another. This would result in all identically named types being implicitly equated and all top-level "map:" options being inherited. The "use" clause could support roughly the same syntax as the "use" clauses in library and module definitions.

In order to make this work, it would be necessary to assign arbitrary names to interface definitions. This would have the added benefit of making them more consistent with other standard Dylan definition forms.

If this change were implemented, a typical interface definition might look something like the following:

define interface date
   #include "date.h";
   use time, import: {"struct time"};
end interface date;
	

A less ambitious version might remain compatible with the current syntax by replacing the interface name with an "interface-name" option, which would default to the root of the file name. Thus,

define interface
   #include "date.h",
      interface-name: "date";
end interface;
	

would yield the same effect as the previous example.

Remerging of the "equate:" and "map:" options

It has been pointed out that the current method of specifying low-level and high-level mappings, while sufficiently expressive, is somewhat verbose and confusing. It would therefore be good to find an alternative notation.

It has been suggested that definitions like:

define interface
   #include "dirent.h",
      equate: {"char *" => <c-string>},
      map: {"char *" => <byte-string>};
end interface;
	

might be replaced by something like:

define interface
   #include "dirent.h",
      equate-and-map: {"char *" => <c-string> => <byte-string>};
end interface;

or

define interface
   #include "dirent.h";
   transform "char *",
      low-level: <c-string>,
      high-level: <byte-string>;
end interface;
	


[1] In fact, a C header file may contain arbitrary C code which Melange is unprepared to handle. By convention, however, ".h" files contain only "interface declarations": type declarations, function prototypes, global variable declarations, and "preprocessor constants." Since Melange can meaningfully process all of these, it is capable of handling the vast majority of header files which will be encountered in practice.