CodeGen tool

CodeGen is the tool used to generate Java code and a binding definition from an XML schema. It currently handles most types of schema definitions, but as with most data binding tools, some aspects of schema are not completely supported. The unsupported or partially supported schema features include the following:

  • Schemas using <xs:any> extension points, where the <xs:any> is not the last item in a sequence. Most often <xs:any> is used at the end of a content model, where it provides compatibility with future extensions that add more details to the content model. It doesn't have to be used this way, though; in particular, it can be used anywhere within a content model if it has the attribute namespace="##other". This usage is not currently supported by CodeGen.
  • Schemas using <xs:anyAttribute> extension points. <xs:anyAttribute> handling is not yet implemented by CodeGen, and is unlikely to be supported until JiBX version 2.0.
  • minOccurs values other than "0" or "1", and maxOccurs values other than "1" or "unbounded". CodeGen treats any maxOccurs value greater than "1" as equivalent to "unbounded" (and ignores the minOccurs value in this case, allowing any number of occurrences, including none). It also treats minOccurs values greater than "1" as equivalent to "1". This means that there are really only three variations of minOccurs/maxOccurs generated by CodeGen: optional components, with minOccurs="0" and maxOccurs="1"; required components, with minOccurs="1" and maxOccurs="1"; and repeating components, with maxOccurs greater than "1".
  • <xs:union> simple type derivations are currently handled as simple string values.
  • The only type of simple type <xs:restriction> facet currently processed by CodeGen is the <xs:enumeration> facet.
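
The minOccurs/maxOccurs handling described in the list above can be sketched in Java as follows. This is only an illustration of the stated rules, not CodeGen's actual code: the class OccurrenceSketch, the Kind enum, and the classify method are hypothetical names, and treating a missing attribute as "1" is an assumption based on the XML Schema defaults.

```java
// Illustration of the occurrence rules described in the text.
// Hypothetical names throughout; this is not taken from the JiBX sources.
final class OccurrenceSketch {

    /** The three variations the text says CodeGen effectively generates. */
    enum Kind { OPTIONAL, REQUIRED, REPEATING }

    /**
     * Classify a component from its raw minOccurs/maxOccurs attribute values.
     * A null value stands for an absent attribute (assumed default "1").
     * maxOccurs="0" is not covered by the text and is not treated specially.
     */
    static Kind classify(String minOccurs, String maxOccurs) {
        String min = (minOccurs == null) ? "1" : minOccurs.trim();
        String max = (maxOccurs == null) ? "1" : maxOccurs.trim();

        // Any maxOccurs above 1 is treated like "unbounded", and the
        // minOccurs value is ignored in that case.
        if ("unbounded".equals(max) || Long.parseLong(max) > 1) {
            return Kind.REPEATING;
        }
        // Otherwise minOccurs values above 1 collapse to 1 (required).
        return Long.parseLong(min) == 0 ? Kind.OPTIONAL : Kind.REQUIRED;
    }

    public static void main(String[] args) {
        System.out.println(classify("0", "1"));         // OPTIONAL
        System.out.println(classify("1", "1"));         // REQUIRED
        System.out.println(classify("2", "5"));         // REPEATING
        System.out.println(classify("3", "unbounded")); // REPEATING
    }
}
```

Note that a declaration such as minOccurs="2" maxOccurs="5" ends up in the same category as minOccurs="0" maxOccurs="unbounded", so any occurrence limits from the schema are not reflected in the generated code.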

Running CodeGen

CodeGen executes as a Java application, meaning it needs to be run directly from a console window using the "java" command, or through some equivalent technique (such as an Ant <java> task, discussed below). However you run it, you need to include jibx-tools.jar from your JiBX installation lib directory in the Java classpath. You'll also need several of the other jars from the JiBX lib directory (including jibx-bind.jar, jibx-schema.jar, and jibx-run.jar, along with log4j.jar and all the eclipse jars). As long as these jars are in the same directory as jibx-tools.jar, you don't need to list them in the classpath; they'll be picked up automatically.

The CodeGen application main class is org.jibx.schema.codegen.CodeGen, and it takes as parameters the names or name patterns (using '*' wildcard characters) for schemas to be used as the basis for code generation. Only "top-level" schemas need to be specified; schemas referenced by means of xs:include or xs:import will automatically be loaded by CodeGen and included in the code generation.

Here's a sample of running CodeGen on Unix/Linux systems from the examples/codegen directory of the distribution (in a single line, shown split here only for formatting):

java -cp ../../lib/jibx-tools.jar org.jibx.schema.codegen.CodeGen
 otasubset/OTA_AirLowFareSearchRQ.xsd

On Windows, the corresponding command line is:

java -cp ..\..\lib\jibx-tools.jar org.jibx.schema.codegen.CodeGen
 otasubset\OTA_AirLowFareSearchRQ.xsd

By default, CodeGen writes its output to the directory from which it was executed. The generated root binding definition is named binding.xml, and the names of the generated Java packages are derived from the schema namespaces.

When working with large schemas you may find performance to be a problem using the standard JVM memory settings. You should be able to dramatically improve performance by increasing your Java runtime memory settings. With the JVMs provided by Sun this is done using the -Xms and -Xmx command line flags, so passing the command line parameters -Xms512M -Xmx512M would increase the memory available to the JVM from the standard 16 megabytes to 512 megabytes.
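
If you want to confirm that the memory flags actually reached the JVM, one option is to ask the runtime for its heap limit. The sketch below is not part of JiBX; the class name HeapCheckSketch and its method are hypothetical, and it relies only on the standard Runtime.maxMemory() call.

```java
// Hypothetical helper, not part of JiBX: reports the maximum heap size
// the running JVM will attempt to use.
final class HeapCheckSketch {

    /** Maximum heap available to this JVM, in whole megabytes. */
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap (MB): " + maxHeapMegabytes());
    }
}
```

Running this class with the same -Xms and -Xmx flags you intend to use for CodeGen should report a figure roughly in line with the -Xmx value, though the exact number varies by JVM.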

Using build tools

You can easily run CodeGen from an Ant build, just as you would any other Java application. The build.xml in the examples/codegen directory gives an example of this (which passes an optional generation directory path parameter, in addition to a schema file path pattern), as shown below:

  <!-- set classpath for compiling and running application with JiBX -->
  <path id="classpath">
    <fileset dir="${jibx-home}/lib" includes="*.jar"/>
    <pathelement location="bin"/>
  </path>
  ...
  <!-- generate using default settings -->
  <target name="codegen" depends="check-runtime,clean">
    
    <echo message="Running code generation from schema"/>
    <java classname="org.jibx.schema.codegen.CodeGen" fork="yes"
        classpathref="classpath" failonerror="true">
      <arg value="-t"/>
      <arg value="gen/src"/>
      <arg value="otasubset/OTA_AirLowFareSearch*.xsd"/>
    </java>
    
  </target>

Most IDEs allow you to directly execute an Ant build target, so you can use the Ant approach to running CodeGen from within your IDE.

You can change the default memory size for the <java> Ant task using nested <jvmarg> parameters. By way of example, here's how you'd change the above Ant target to use 512 megabytes of memory:

    <java classname="org.jibx.schema.codegen.CodeGen" fork="yes"
        classpathref="classpath" failonerror="true">
      <jvmarg value="-Xms512M"/>
      <jvmarg value="-Xmx512M"/>
      <arg value="-t"/>
      <arg value="gen/src"/>
      <arg value="otasubset/OTA_AirLowFareSearch*.xsd"/>
    </java>

Command line parameters

You can pass a variety of command line parameters to CodeGen, as listed below in alphabetical order:

Parameter            Purpose
-b name              Generated root binding definition file name (default name is binding.xml)
-c path              Path to input customizations file
-i path1,path2,...   Include existing bindings and use mappings from the bindings for matching schema global definitions (this is the basis for modular code generation)
-n package           Default package for code generated from schema definitions with no namespace (default is the package "dflt", if not set)
-p package           Default package for code generated from all schema definitions
-s path              Root directory path for schema definitions (so that simple names can be used when specifying multiple schemas)
-t path              Target directory path for generated output (default is current directory)
-u uri               Namespace applied in code generation when no-namespaced schema definitions are found (to generate no-namespaced schemas as though they were included in a particular namespace)
-v                   Verbose output
-w                   Wipe all files from target directory before generating output (ignored if the target directory is the same as the current directory)

You need to specify one or more schema paths or file path patterns as command line parameters to CodeGen. Each schema you specify is used as a starting point for generating code and binding definitions. CodeGen examines each specified schema to find references to other schemas, and then recursively examines the referenced schemas, to find the complete set of schemas used to represent the data. It then generates code and binding definitions for all of these schemas. The schema names or file path patterns must be at the end of the command line, following any other command line parameters. '*' wildcard characters can be used in schema names, but only as part of file paths. Schema names can also be specified using HTTP or other forms of URLs, but wildcard characters are not allowed in this case.
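
The ordering rule above (options first, schema paths last) can be illustrated with a small Java sketch that assembles a CodeGen command line. The class CodeGenCommandSketch and its buildCommand method are hypothetical helpers written for this page, not part of JiBX; only the -t and -w options, the main class name, and the sample paths come from the text.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of JiBX: builds a CodeGen command line
// with the options placed before the schema paths.
final class CodeGenCommandSketch {

    static List<String> buildCommand(String toolsJar, String targetDir,
                                     boolean wipe, List<String> schemas) {
        if (schemas == null || schemas.isEmpty()) {
            throw new IllegalArgumentException("at least one schema path is required");
        }
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-cp");
        cmd.add(toolsJar);
        cmd.add("org.jibx.schema.codegen.CodeGen");
        if (targetDir != null) {
            cmd.add("-t");          // target directory for generated output
            cmd.add(targetDir);
        }
        if (wipe) {
            cmd.add("-w");          // wipe the target directory first
        }
        cmd.addAll(schemas);        // schema paths always come last
        return cmd;
    }

    public static void main(String[] args) {
        List<String> cmd = buildCommand("../../lib/jibx-tools.jar", "gen/src", true,
                List.of("otasubset/OTA_AirLowFareSearchRQ.xsd"));
        // Prints: java -cp ../../lib/jibx-tools.jar org.jibx.schema.codegen.CodeGen -t gen/src -w otasubset/OTA_AirLowFareSearchRQ.xsd
        System.out.println(String.join(" ", cmd));
    }
}
```

A list built this way could be passed to java.lang.ProcessBuilder to launch CodeGen from another Java program, assuming the jar path is valid relative to the working directory.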

Finally, you can pass global customizations to CodeGen as command-line parameters, by using -- as a special prefix to the customization attribute name. This is explained in more detail in the CodeGen customization reference page. Before digging into the details of customizations you may find it useful to review the CodeGen examples to learn how CodeGen works and see some basic applications of customizations.

Logging support

CodeGen includes logging code at a variety of levels of detail, using the log4j library. The jibx-tools.jar includes a default log4j.properties which only supports ERROR level logging, with output to the console. This default properties file can be overridden by another log4j.properties which is placed earlier in the classpath. The log4j.properties file in the examples/codegen directory is supplied as a sample, which can be activated by changing the classpath definition in the Ant build.xml as follows:

  <!-- set classpath for compiling and running application with JiBX -->
  <path id="classpath">
    <pathelement location="."/>
    <fileset dir="${jibx-home}/lib" includes="*.jar"/>
    <pathelement location="bin"/>
  </path>

This logging support is only intended for use by JiBX developers and others who are investigating the operation of the CodeGen program. The logging information is generally not useful to end users.