C2Ada - a translator from C to Ada

C2Ada is a translator from the C programming language to the Ada 95 programming language. It is written in C and is hosted on Unix. (We are aware of an OS/2 port.)

C2Ada was used by Intermetrics to produce Ada 95 bindings to X Windows, Microsoft Windows, and GCCS.

C2Ada reduces the work of translating C headers into Ada (producing a binding) and of translating whole C programs into Ada (producing a translation). C2Ada can do about 80% to 90% of the work automatically, but the last 10% to 20% still takes manual work.

The technology for translating C headers is more mature than the technology for translating C functions and statements. C2Ada does not translate C++, just C.

Table of contents:

  • C2Ada origins and authors
  • Downloading and building C2Ada
  • Using C2Ada
  • Main program issues
  • Configuring C2Ada
  • C2Ada restrictions and known bugs
  • Release notes on the various versions of C2Ada
  • C2Ada copyright and warranty disclaimer

    C2Ada origins and authors

    C2Ada is based on the cbind program written by Mark Schimmel of Rational Software Corporation.

    C2Ada was written starting with the above sources, by Randy Hudson and Mitch Gart of Intermetrics. C2Ada translates C headers into Ada package specifications and also translates C functions and statements into Ada package bodies.

    In 2007 C2Ada was ported to Linux by Nasser Abbasi. Jeffrey Creem set up a SourceForge project to act as the public repository.


    Downloading and building C2Ada

    To download and build C2Ada:
  • if you don't have Python, download, build and install it (C2Ada was developed with version 1.3); make the libainstall target as well
  • if you don't have gperf, download, build and install it
  • checkout the source from SourceForge:
       svn co https://c2ada.svn.sourceforge.net/svnroot/c2ada/trunk c2ada
    
  • change directory to c2ada and edit setup, specifically to reference the correct version of Python. Execute ./setup to create Makefile.config.
  • make (this will create c2ada, the C2Ada executable)

    Using C2Ada

    The simplest use of the command is

    c2ada test1.c
    
    (test1.c is not the name of a file in the distribution.) With no command line options, this command translates the source test1.c into the Ada files bindings/test1.ads and bindings/test1.adb.

    It uses C as the name of the "predefined" or "parent" package. (See further discussion below.) Comments are not retained, and no package filename map or configuration file is used.

    If test1.c contains any #include statements, then Ada files will be generated as well for the included C files.

    Command line options

    Running the command c2ada with no arguments will print a full list of the options available.

    The most important command line options are:

    -Dname[=value]
    Define a macro with an optional value. By default macros will be defined with the value 1.
    -Idir
    Add a search path for finding include files.
    -Sdir
    Add a search path for system include files.
    -C
    Attempt to retain C comments in the translation.
    -pp name
    Predefined (or parent) package name (default is C)
    -mf
    Use file cbind.map to map unit names.
    -Pfilename
    configuration file name
    -Opathname
    output directory (default=bindings)
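
    As an illustration of how the switches above combine, here is a minimal Python sketch that assembles a c2ada argument list. The helper name c2ada_argv and its keyword names are hypothetical; only the switches themselves (-C, -pp, -P, -O, -D, -I) are taken from the list above. The sketch only builds the list and does not run c2ada.

```python
def c2ada_argv(source, parent=None, include_dirs=(), defines=None,
               config=None, out_dir=None, keep_comments=False):
    """Build an argument list for c2ada (hypothetical helper).

    Only the switches documented above are used.
    """
    argv = ["c2ada"]
    if keep_comments:
        argv.append("-C")
    if parent is not None:
        argv.extend(["-pp", parent])   # -pp takes its value as a separate word
    if config is not None:
        argv.append("-P" + config)     # -P, -O, -D and -I are joined to their value
    if out_dir is not None:
        argv.append("-O" + out_dir)
    for name, value in sorted((defines or {}).items()):
        if value is None:
            argv.append("-D" + name)   # c2ada then defines the macro as 1
        else:
            argv.append("-D%s=%s" % (name, value))
    for directory in include_dirs:
        argv.append("-I" + directory)
    argv.append(source)
    return argv


print(" ".join(c2ada_argv("test.h", parent="gccs", config="config-gccs",
                          include_dirs=["gccs/", "gccs/AcctGrps/"],
                          keep_comments=True)))
```

    The resulting list could be handed to a process launcher such as Python's subprocess module.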

    Example usage

    When using C2Ada to produce the GCCS binding/translation, this command script was used:
    #!/bin/sh
    
    # -C:		 Preserve comments
    # -pp:		 parent package = gccs
    # -sih:          Suppress import declarations from included headers.
    # -src:          Suppress all record rep clauses.
    # -ap:           Automatic packaging.
    # -gnat and -95: Select gnat and Ada 95
    # -mwarn:        Warnings about untranslated macros.
    # -mf:		 use cbind.map file for package name mappings
    # -noref:	 no reference comments back to C
    # -I.:		 process #includes from this directory
    
    I="-Igccs/ -Igccs/AcctGrps/"
    P="-Pconfig-gccs"
    PP="-pp gccs"
    
    $c2ada/c2ada -C $P $PP -sih -src -ap -gnat -95 -mwarn -mf $I test.h
    
    The c2ada command is run from the Unix command line with a series of switches, mentioned above, on the file test.h. This file test.h was written to have a series of C #define and #include statements which cause the right constants to be set for the binding, and cause the right files to be brought into the binding.

    Parent package

    The -pp gccs switch says that the "parent package" of most other packages will be named gccs. (Using GNAT conventions, the package's Ada name is package gccs, the file gccs.ads contains its specification, and the file gccs.adb contains its body.) If no -pp command line option is given the default is a package named C in file c.ads.

    Another gloss of the -pp switch is that it defines the "predefined package". The translation relies on a number of predefined types and subroutines. (Many of these declarations are simply renamings of facilities in Interfaces.C.) The distribution includes files c.ads, c.adb, c-ops.ads, and c-ops.adb which contain these definitions. These names reflect the default predefined package name C. The user can make modified copies of these files to define a parent package for a particular C-to-Ada translation project.

    Ada 95 package hierarchy

    A useful feature of Ada 95 is parent-child package hierarchies. C2Ada translates Unix directory hierarchies into Ada 95 parent-child library hierarchies. If a C source file contains the lines
    #include "name1.h"
    #include "gccs/name2.h"
    #include "gccs/name3.h"
    
    and the parent package is set to gccs, the Ada package names that will be generated are
    package name1;
    package gccs.name2;
    package gccs.name3;
    
    Directories can be nested to any level, causing an equivalent nesting of Ada parent and child libraries.
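
    The directory-to-package rule shown above can be sketched in a few lines of Python. This is only a model of the naming pattern in the example, not C2Ada's own code: the function name ada_package_name is hypothetical, and the sketch ignores cbind.map, reserved names, and any adjustment C2Ada may make to names that are not legal Ada identifiers.

```python
import posixpath


def ada_package_name(include_path):
    """Map an #include path to a dotted Ada package name (illustrative only)."""
    stem, _extension = posixpath.splitext(include_path)
    # Each directory level becomes one level of the Ada library hierarchy.
    parts = [part for part in stem.split("/") if part not in ("", ".")]
    return ".".join(parts)


for path in ("name1.h", "gccs/name2.h", "gccs/name3.h"):
    print("package %s;" % ada_package_name(path))
```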

    The cbind.map file

    If additional control is needed over the package names that are generated, the -mf command line option causes C2Ada to read package name mappings from the file cbind.map. Each line of this file contains 4 entries separated by whitespace:
    gccs/string.h StringPkg stringpkg.ads stringpkg.adb
    
    This line says that a C #include file gccs/string.h will be mapped into Ada name "package StringPkg", and the Ada code that is generated will go into files stringpkg.ads and stringpkg.adb.
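
    To make the four-field layout concrete, here is a small Python sketch that reads text in that layout into a lookup table. The names MapEntry and parse_cbind_map are hypothetical, and the sketch assumes blank lines can be skipped; whether C2Ada itself accepts blank lines or comments in cbind.map is not stated here.

```python
from collections import namedtuple

# Field order follows the example line: C header, Ada package, spec file, body file.
MapEntry = namedtuple("MapEntry", "c_header ada_package spec_file body_file")


def parse_cbind_map(text):
    """Parse cbind.map-style text into a dict keyed by C header path."""
    entries = {}
    for lineno, line in enumerate(text.splitlines(), 1):
        fields = line.split()
        if not fields:
            continue  # skipping blank lines is an assumption of this sketch
        if len(fields) != 4:
            raise ValueError("line %d: expected 4 fields, found %d"
                             % (lineno, len(fields)))
        entry = MapEntry(*fields)
        entries[entry.c_header] = entry
    return entries


mapping = parse_cbind_map("gccs/string.h StringPkg stringpkg.ads stringpkg.adb\n")
print(mapping["gccs/string.h"].ada_package)
```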

    The C2ADA_PYTHONPATH environment variable

    C2Ada requires no environment variables to be set. The build procedure predefines a path along which to find the required Python modules. These modules fall into two categories and two corresponding directories:
    1. source modules: modules written specifically as integral parts of C2Ada. These are in the source directory.
    2. library modules: these are modules distributed with the Python language system.

    If you have any reason to override the default path, you can define the environment variable C2ADA_PYTHONPATH. It must include the directories that contain the source and library modules.

    Translating C Preprocessor Macros

    C2Ada can automatically translate some C macros into Ada but it has a hard time with some others. There are two ways to get good translations of macros. One is described under Configuring C Macro Translations. The other is explained here. The idea is to change the C source to use C++ inline function syntax. For example
    #define max(a,b) ((a)>=(b)?(a):(b))
    
    can be changed in the C source to
    inline int max(int a,int b) {return ((a)>=(b)?(a):(b));}
    
    (presuming that the user decides the parameters and result are all of type "int".) This will then produce an Ada specification:
    function max (a, b: integer) return integer;
    pragma inline(max);
    
    ... and a body for max, in the Ada package body, that does the right thing. C2Ada cannot guess the types of the parameters and the type of the function result but if this information is supplied by the user C2Ada can do the rest.

    Other options

    Typing the command c2ada with no options causes it to print a full list of its options.

    Generating a main program

    If you're translating a complete C program, the source directory contains a utility program to write an Ada main procedure to call the translated C main procedure. The program is AdaMain.py, a Python script. The command line is:

    python Adamain.py cmain predef unit [filename]
    
    where cmain is the name of the Ada unit containing the translated C main(); predef is the name of the predefined Ada package; unit is the name of the output compilation unit (an Ada procedure); and filename is the pathname to which the unit is written. If filename is omitted, the output appears on stdout.

    Configuration

    The C2Ada translator is able to translate most C constructs into equivalent Ada constructs. In some cases, however, the information in the C source is inadequate to suggest the proper action, or several translations are possible, or the translator simply needs a hint about what to do. Most of this information is specific to the corpus of code being translated.

    Specifying this additional information to the translator is the role of the configuration file. This file contains a series of statements relevant to the overall translation, to particular source files, and to particular C declarations and macros. C2Ada interprets these statements to build a data structure which guides the translation process. Hence configuration information about particular entities does not need to be presented in any particular order.

    Using the configuration file (rather than just editing the output Ada code) may be of particular benefit if you're tracking revisions of a C source. Since the configuration file identifies constructs by name, and the names of most existing constructs don't change from version to version, most of the changes you specify can be automatically performed for the new version.

    The configuration file facility is implemented using the programming language Python. Python is a simple yet powerful object-oriented language whose syntax and semantics (and the fact that it can be embedded in C!) make it quite suitable for use as a scripting language. The configuration file is in fact a series of Python statements which are interpreted in an environment that defines useful objects and classes.

    (These objects and classes are defined in the Python source file C2ada.py.)

    A knowledge of Python isn't necessary to read and write a configuration file, however. As new syntactic constructs are used in the examples which follow, a brief discussion of the syntax appears like this:

    (Blank lines and whitespace may be used freely. A comment starts with the character # and ends at the end of line.)

    The name of the configuration file is specified by using the -P command line switch.

    Specifying the configuration

    Statements in the configuration file which place information in the configuration data structure must reference the single configuration object. The expression which denotes this object is the identifier "the" (as in the.reserved_names).

    (Case is significant in identifiers.)

    The attributes of this object are:

    reserved_names
    A list of names to be avoided in naming Ada declarations. The names in this list are typically the names of Ada units which might conflict with symbol names.
    source
    A method which maps C source file names to the object representing that source file.

    Configuring reserved names

    Assigning a list of names to the.reserved_names configures those names as reserved names. For example, a project using X Windows is apt to have X as the name of a parent package. To allow direct use of qualified names referring to any X Windows symbols, the translator should avoid assigning the name X to any local entities. The declaration to do this is
    the.reserved_names = ['X']
    
    (The dot notation selects an attribute or method of an object. The equal sign is assignment. Brackets enclose a comma-separated list -- here the list is one item long. Single or double quotes enclose string literals. A statement ends at logical end-of-line, which is the physical end-of-line unless the last character on the line is backslash (\). )
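
    Because the configuration file is a series of Python statements run in a prepared environment, a statement like the one above can be pictured as follows. This is a minimal sketch, not the code in C2ada.py: the class Configuration and the function load_configuration are hypothetical stand-ins that model only the reserved_names attribute.

```python
class Configuration(object):
    """Toy stand-in for the object that configuration files see as `the`."""

    def __init__(self):
        self.reserved_names = []


def load_configuration(text):
    """Run configuration statements with `the` predefined and return it."""
    the = Configuration()
    namespace = {"the": the}
    exec(text, namespace)  # the statements mutate the shared object
    return the


config_text = """
# Keep X free for the X Windows parent package.
the.reserved_names = ['X']
"""
config = load_configuration(config_text)
print(config.reserved_names)
```

    Running statements against one shared object is what lets configuration entries appear in any order.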

    Specifying a source file

    The expression to specify the source file whatever.c is
    the.source('whatever.c')
    
    or, equivalently,
    the.source("whatever.c")
    
    (Methods are invoked by enclosing the comma-separated argument list in parentheses.)

    This construct always designates the same C_source object, which is created the first time the file name is used in this construct.
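
    The "same object every time" behaviour can be modelled with a dictionary keyed by file name. The sketch below is an assumption about the general technique, not a copy of C2Ada's implementation; the class names CSource and Configuration are hypothetical.

```python
class CSource(object):
    """Toy stand-in for a C_source object (hypothetical)."""

    def __init__(self, filename):
        self.filename = filename


class Configuration(object):
    """Toy stand-in for the configuration object (hypothetical)."""

    def __init__(self):
        self._sources = {}

    def source(self, filename):
        # Create the object on first use, then hand back the same one.
        if filename not in self._sources:
            self._sources[filename] = CSource(filename)
        return self._sources[filename]


the = Configuration()
first = the.source('whatever.c')
first.note = 'set through the first reference'
print(the.source("whatever.c").note)
```

    In a model like this the lookup key is the exact string passed in, so 'header.h' and 'dir/header.h' would name two different objects.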

    In this version of C2Ada, the argument to the.source must be the pathname of the file exactly as it is used to open the file. For instance, if the C source contains the include directive

    #include "header.h"
    
    and C2Ada finds header.h in the directory dir, then the expression to designate the corresponding object is
    the.source('dir/header.h')
    

    C_source objects

    A C_source object holds information about the translation of a C source file, and about the translation of declarations and macros within that file. The attributes and methods are:

    decl
    A method that maps a C identifier into an object (of class Decl) holding configuration information about the translation of the denoted C construct.
    macro
    A method that maps a C preprocessing identifier for a macro into an object (of class Macro) holding configuration information about the denoted macro.
    interfaces(filename)
    A method declaring that this source, which should be a .h file, is the interface for the .c file named by filename.
    unchecked_conversions_to_spec
    A boolean attribute that specifies whether the unchecked conversions which are generated in the body should be declared in the Ada package's spec rather than body. [TBD: This attribute is no longer necessary?]

    Configuring C modules

    A common style in C code is to write a module as a file whatever.c, then write an interface for that module as an include file whatever.h. (Of course, nothing in C requires, or particularly supports, doing things this way.)

    To specify that a .h and .c file are related, use a command like this:

    the.source('whatever.h').interfaces('whatever.c')
    

    Specifying a C declaration

    A C declaration is specified in an expression like this:
    the.source('whatever.h').decl('whichever')
    

    In this example, whichever is the name of a C entity declared in the file whatever.h.

    A C struct type with tag whatever can be specified in an expression like this:

    the.source('whatever.h').decl('struct whatever')
    

    Specifying a C macro

    A C macro is specified by indicating its source file and its name in an expression like this:

    the.source('whatever.h').macro('WHICHEVER')
    

    Configuring C Declarations

    The attributes and methods associated with a C declaration (class Decl) are:

    ada_name
    The Ada name to use for the equivalent symbol in the translation; overrides the default generated name.
    return_type_is_void
    A boolean which specifies whether the declaration (expected to be a function without an explicit type in C) should be translated into a procedure rather than a function returning int.
    private
    A boolean specifying whether the declaration (expected to be a pointer type or an incomplete struct type) is to be translated into an Ada private type.
    declare_in_spec
    A boolean specifying whether the declaration (expected to be an external function definition in a .c source file) is to be declared in the package specification, as well as the package body, of the resulting package.

    Configuring a private type

    A C idiom for an opaque or private type is a pointer to an incomplete struct. Typically the type is used or typedef'd in an include file that serves as a module interface, and the struct is actually defined in the source file serving as the module body. For instance, the include file whatever.h may contain

    typedef struct example * Handle;
    
    but with no definition of struct example. The straightforward Ada translation into a package specification,
        type struct_example;
        type struct_example_access is access all struct_example;
        subtype Handle is struct_example_access;
    
    is unsatisfactory because these declarations are only valid if there's a definition of struct_example in the public part of the package spec.

    Adding these statements to the configuration file:

    the.source('whatever.h').decl('struct example').private = True
    the.source('whatever.h').decl('Handle').private = True
    
    (True and False are predefined.)
    causes the translation to be:
    -- (in public part of spec)
    
        type struct_example_access is private;
        null_struct_example_access : constant struct_example_access;
    
        type Handle is private;
        null_Handle : constant Handle;
    
    private
    
        type Struct_example;
        type Struct_example_access is access all struct_example;
        null_struct_example_access : constant struct_example_access := null;
    
        type Handle is access all struct_example;
        null_Handle : constant Handle := null;
    
    

    The configuration statement for struct example causes the equivalent Ada type, struct_example, to be declared in the private part, and an access-to-object type for this type to be declared private in the public part, then defined in the private part. In addition, the name for the null value of this pointer type is declared.

    Similarly, the configuration statement for Handle causes a private type, and a null value for the access type, to be declared; the private part contains the requisite declarations and definitions.

    Configuring visible function declarations

    In C, an external function definition is implicitly visible in parts of the program outside the source file it is defined in, whether the call site has an explicit declaration visible or not.

    The default translation performed by C2Ada makes no externally visible declaration of such a function, unless there's a C declaration of the function in an include file, and that include file is configured with an interfaces statement.

    The declare_in_spec attribute of a Decl object is used to override this default behavior. If the function func is defined in the source file whatever.c, then a statement like this will cause a declaration of the function to be placed in the resulting Ada package spec:

    the.source('whatever.c').decl('func').declare_in_spec = True
    

    Configuring C Macro Translations

    Specifying a C macro in the configuration file indicates an intention to replace the macro's #define directive with a C declaration. The attribute(s) specified for the macro determine how to rewrite the #define directive as a C declaration. This facility for configuring macro rewrites currently provides no capabilities that couldn't be achieved by modifying the source file. (Note, however

    The attributes and methods associated with a C macro (class Macro) are:

    replacement
    The text which should completely replace the #define directive: the preprocessor deletes the #define directive and forwards the replacement text to the C parser.

    If the replacement attribute is specified, C2Ada does not examine any other attributes.

    signature
    Specifies the argument list of the inline function which will replace the macro definition.
    returns
    The type returned by the function which will replace the macro definition; if unspecified, the function will have void return type.

    Configuring a function-like macro

    Say the C source file whatever.h contains a macro like this:

    #define AccessCount(x) ((x)->hits)
    

    There is little that C2Ada can do to translate this into a function. (In fact what it does is perform C-preprocessor expansion at any call site.) It is not clear from examining the definition what the type of x is, or what the type of the resulting expression is.

    But the user may know that x is always a pointer to an object of type struct info, and that the hits field of this record has type short int. The user could then declare signature and returns attributes that provide C2Ada with enough information to transform the definition into a declaration:

    ac = the.source('whatever.h').macro('AccessCount')
    ac.signature = 'struct info * x'
    ac.returns   = 'short int'
    
    (The first statement, for convenience, assigns the Macro object to the local variable ac; this variable requires no declaration.)
    C2Ada then rewrites the definition as:
    inline short int AccessCount(struct info * x) { return (x)->hits; }
    

    inline is a special extension to the ANSI C syntax accepted by C2Ada to tag functions whose translations should be given a pragma Inline declaration in Ada.
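
    The rewrite driven by the signature and returns attributes can be approximated with simple text substitution. The Python sketch below is an illustration under stated assumptions, not C2Ada's algorithm: the function rewrite_macro is hypothetical, it handles only one-line function-like macros, and it keeps the macro body's outer parentheses.

```python
import re

# Matches a one-line function-like macro: name, parameter list, body.
_DEFINE = re.compile(r"\s*#\s*define\s+(\w+)\(([^)]*)\)\s+(.*\S)\s*$")


def rewrite_macro(define_line, signature, returns="void"):
    """Turn a function-like #define into an inline C function definition."""
    match = _DEFINE.match(define_line)
    if match is None:
        raise ValueError("not a one-line function-like macro")
    name, _macro_params, body = match.groups()
    if returns == "void":
        statement = "%s;" % body          # no returns attribute: void function
    else:
        statement = "return %s;" % body
    return "inline %s %s(%s) { %s }" % (returns, name, signature, statement)


print(rewrite_macro("#define AccessCount(x) ((x)->hits)",
                    signature="struct info * x", returns="short int"))
```

    The same helper applied to the max macro from the section on translating preprocessor macros, with signature 'int a, int b' and returns 'int', yields an inline int function.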


    Restrictions

    Environment restrictions

    The C2Ada code was developed on SunOS 4.1.3 using the GNU C compiler gcc. (In principle, there's no reason to think that the code can't be built and run in other environments.)

    C2Ada is written using ANSI C, so it cannot be built with an old K&R C compiler like the one that comes with SunOS.

    The translator generally produces portable Ada 95 code, relying for the most part only on the additional facilities of the Interfaces.C package. But in some places, C2Ada assumes that the target Ada compiler is GNAT, the GNU Ada compiler. Features specific to GNAT incorporated in the translation are:

    C Source Code Restrictions

    ANSI/ISO C

    C2Ada assumes that the C source code can be compiled by a compiler that conforms to the ANSI/ISO standard for the C language.

    C2Ada supports as an extension the keyword inline. C code which contains this word as an identifier must be modified.

    C2Ada supports C "/* ... */" comments and also C++ "//" comments.

    No function declarations without arguments

    In C, a function may be declared without specifying the number or types of its arguments. There is no corresponding construct in Ada (and a good thing, too). C2Ada will translate such a construct (and emit a warning), and assume that there's a 0-length argument list, but this assumption is usually wrong.

    No circular dependencies

    Suppose we have C files a.h and a.c that implement a module A, and files b.h and b.c that implement a module B. It is quite possible for b.c to include a.h, and a.c to include b.h.

    We cannot translate these files to the spec and body for packages A and B, however: the rules of Ada simply do not allow it.

    Bugs, Inelegant Translations, and Unimplemented Features

    Release notes

    Current release notes

    At present (7 June 2011) there is no formal release at SourceForge. Please check out from Subversion:
       svn co https://c2ada.svn.sourceforge.net/svnroot/c2ada/trunk c2ada
    

    Old release notes

    The current release is the third beta release. Beta releases can be identified by examining the catalog.des file in the release. The first line of that file contains a Revision field. The first beta was Revision 1.135; second, 1.136; third, 1.138.

    Beta 3 corrects these problems:

    Beta 2 fixed a problem with hardwired pathnames. The initial beta release of C2Ada had a hardwired pathname in the file symset.c. The simplest way to correct this problem (aside from picking up the current release) is to edit Makefile, delete symset.o, and re-make. The required edit is to change the line

    PYTHON_SCRIPTS_PATH = $(HERE)
    
    to
    PYTHON_SCRIPTS_PATH = $(HERE):$(PYTHON)/Lib
    

    Aside from fixing the problem described above, Beta 2 added the optional C2ADA_PYTHONPATH environment variable.

    Copyright and Warranty Disclaimer

    The C2Ada source code is NOT copyrighted but is instead published to the public domain as free software. Any attempt to copyright the source code will be refutable in a court of law.

    If the C source file you are translating into Ada is copyrighted then the resulting Ada translation may also be copyrighted.

    This software is provided with the hope that it will be useful but it is provided "as is" without any warranty whatsoever.


    Please report problems to the SourceForge bug tracker.