CodeGen customizations

Customizations are used to control how CodeGen interprets your schemas and generates both Java code and a corresponding JiBX binding. The customizations are normally supplied to CodeGen in the form of an XML document, though certain customizations can alternatively be set by means of command line parameters.

Customization document structure

The general form of a customizations document consists of a root schema-set element which applies to all the schemas being used, within which there may be both other schema-set elements and schema elements for individual schemas. If you're only working with a single schema, you can instead skip the schema-set element and use the schema element directly as the root of your customizations. Within a schema element, you can use elements with names corresponding to various schema definition components (attribute, complexType, element, etc.) to customize specific schema components, using XPath-like features to relate the customization elements to particular instances of the corresponding schema definition element.

Sound confusing? It's really pretty simple and intuitive, once you get the basic concepts down. Here's a sample using nested schema-set and schema elements for a complex collection of schemas, to show how this works:

<schema-set xmlns:xs="http://www.w3.org/2001/XMLSchema"
    type-substitutions="xs:integer xs:int xs:decimal xs:float">
  <schema-set package="org.ota.air" names="OTA_Air*.xsd">
    <schema-set generate-all="false" prefer-inline="true"
        names="OTA_AirCommonTypes.xsd OTA_AirPreferences.xsd"/>
    <schema name="OTA_AirAvailRS.xsd">
      <element path="element[@name=OTA_AirAvailRS]/**/element[@name=OriginDestinationOption]"
        ignore="true"/>
    </schema>
  </schema-set>
  <schema-set package="org.ota.hotel" names="OTA_Hotel*.xsd">
    <schema-set generate-all="false" prefer-inline="true"
        names="OTA_HotelCommonTypes.xsd OTA_HotelContentDescription.xsd
        OTA_HotelEvent.xsd OTA_HotelPreferences.xsd OTA_HotelReservation.xsd
        OTA_HotelRFP.xsd"/>
  </schema-set>
  ...
</schema-set>

In this customizations document, the root schema-set element sets some customization options which apply to the full collection of schemas. The child schema-set elements each specify a particular subset of the schemas (which must be distinct from those specified by sibling schema-set elements); in this example, the first child schema-set applies to schemas with names matching the "OTA_Air*.xsd" pattern, the second to schemas with names matching "OTA_Hotel*.xsd". The first child schema-set has yet another schema-set element and a schema element as its children. Within the schema element, an element child is used to customize the handling of one particular xs:element component within the schema definition.
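
For the simpler single-schema case, the schema element can serve as the root of the customizations. The following is a minimal sketch rather than a tested configuration; the schema name order.xsd, the package com.example.order, the Order element, and the PurchaseOrder class name are all hypothetical, chosen only for illustration.

```xml
<!-- Hypothetical single-schema customizations: order.xsd, com.example.order,
     Order, and PurchaseOrder are illustrative names, not from a real schema. -->
<schema name="order.xsd" package="com.example.order" prefer-inline="true">
  <!-- Customize the global xs:element named Order -->
  <element name="Order" class-name="PurchaseOrder"/>
</schema>
```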

The remainder of this page discusses the customization elements used to model the structure of the collection of schemas being generated. In addition to these structural customization elements, there are a few other customization elements which are used to extend or modify the actual code generation. These extension customization elements must be direct children of a schema or schema-set element, and must precede any other child elements. See CodeGen Extension for the details of using these elements. The full schema definition for the customizations is included in the documentation.

schema-set and schema customizations

Customization attributes which apply at the schema level can be used with both schema-set and schema elements. These elements can be nested in the customizations document, and customizations are inherited through the nesting: A customization attribute on a schema-set applies to all the schemas in the set, but may be overridden by a different setting of the attribute on a nested schema-set or schema element. Here's the alphabetical list of these schema-level customization attributes:

Schema-level customization attributes

binding-file-name

Specify the file name to be used for the generated binding definition. If this is not specified, the root binding is always named binding.xml and any other bindings use names derived from the namespace associated with the binding.

binding-per-schema

Control whether a separate binding definition will be generated for each schema definition. By default, bindings for schemas using the same namespace are merged into a single binding definition. Allowed values are true and false (the default).

delete-annotations

Delete annotations from schema fragments shown in Javadocs. This generally makes the schema fragments easier to understand, especially since xs:documentation elements in the schema are normally converted to Javadocs in any case. Allowed values are true (the default) and false.

enumeration-type

Control the type of classes generated for enumerations. Allowed values are java5 for Java 5 enum classes (the default) and simple for simple typesafe enumeration classes compatible with all Java compiler versions.

generate-all

If the value is false, skip any unused global schema definitions in the code generation. This is intended for use with schemas referenced by xs:include or xs:import, which often include definitions not needed by the original schema. By skipping code generation for these unnecessary definitions you can reduce the number of classes in the generated data model. Allowed values are true (the default) and false.

import-docs

Convert xs:documentation annotations in the schema definition to Javadocs in the generated code if true. Allowed values are true (the default) and false.

line-width

Specify the desired maximum line width in the generated Java code. The value can be any integer.

package

Give the name of the package to be used for generated Java code. The value can be any package name.

prefer-inline

If true, use inline definitions where possible rather than creating separate classes. Allowed values are true and false (the default).

prefix

Prefix to be used in the generated bindings for the namespace associated with a schema. Prefixes are normally assigned based on the usages found in the schema documents, but this customization can be used to change these values. If you use this customization on a schema-set element, all the schemas included in the set must use the same namespace.

repeated-type

Control how repeated schema components (both xs:list values, and particles with maxOccurs > 1) are represented in Java code. Allowed values are array (for arrays), list (for untyped java.util.List), and typed (for Java 5 typed list, the default).

show-schema

If true, include schema fragments corresponding to the generated code in class Javadocs. The schema fragments included are based on a post-processing view of the schema, after processing type substitutions, deletions, and schema normalizations. Allowed values are true (the default) and false.

structure-optional

Control whether references to classes with no associated element and all components optional should be made optional in the generated binding. The effect of making such class references optional is that the reference will be set null when unmarshalling if none of the components are present, and will be checked for null when marshalling. Allowed values are true (the default) and false.

use-inner

Control whether inner classes are used for secondary structures within the generated Java code. If true, inner classes will be used; otherwise, separate top-level classes will be used. This only applies for the equivalent of anonymous xs:complexType definitions, or definitions which have been inlined; top-level classes are always used when definitions are used in more than one place. Allowed values are true (the default) and false.
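
To illustrate how schema-level attributes are inherited and overridden through nesting, here is a small sketch. The file names, package names, and binding file name are hypothetical; only the attribute names and allowed values come from the list above.

```xml
<!-- Root settings apply to every schema unless overridden below.
     All file and package names here are hypothetical. -->
<schema-set package="com.example.model" line-width="100"
    enumeration-type="java5" repeated-type="typed">
  <!-- Overrides two inherited settings for schemas matching legacy*.xsd -->
  <schema-set names="legacy*.xsd" enumeration-type="simple" repeated-type="array"/>
  <!-- Overrides the package for one schema; line-width, enumeration-type,
       and repeated-type are still inherited from the root -->
  <schema name="orders.xsd" package="com.example.orders"
      binding-file-name="orders-binding.xml"/>
</schema-set>
```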

Besides these schema-level customizations, any of the nesting customizations listed in the next section can also be used on schema-set and schema elements. There are also some attributes which are only allowed with schema-set, and some which are only allowed with schema. Here's the list of these attributes for schema-set:

Attributes only allowed on schema-set element

names

List of name patterns for schemas included in this set. Individual patterns are whitespace-separated, and may include '*' characters as wildcards matching any number of arbitrary characters. Multiple '*' wildcards may be used within a single pattern, but may not be contiguous (i.e., there must be one or more regular characters between any pair of wildcards).

namespaces

List of namespace URIs for schemas included in this set. Individual URIs are whitespace-separated.

These attributes are not allowed on the root schema-set element of a customizations document, but at least one is required on any nested schema-set element (since they determine which schemas are actually included in the set). When multiple schema-set child elements are used, the sets of schemas identified by each schema-set must be disjoint.
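
As a sketch of how names and namespaces might select disjoint sets of schemas, consider the following. The file name patterns and namespace URIs are hypothetical, and the example assumes that none of the schemas matched by the name patterns uses either of the listed namespaces.

```xml
<!-- The root schema-set carries neither names nor namespaces -->
<schema-set>
  <!-- Two whitespace-separated name patterns, each using a '*' wildcard
       (hypothetical file names) -->
  <schema-set package="com.example.common" names="Common*.xsd *Types.xsd"/>
  <!-- Selection by target namespace URI instead of file name
       (hypothetical URIs, assumed not to overlap with the set above) -->
  <schema-set package="com.example.orders"
      namespaces="http://example.com/ns/orders http://example.com/ns/billing"/>
</schema-set>
```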

Here's the list of attributes only allowed with schema customization elements:

Attributes only allowed on schema element

excludes

List of schema global definitions to be excluded from the code generation. Names are separated by whitespace characters. This overrides the normal reference checks used to determine which schema definitions are going to be generated as Java code.

includes

List of schema global definitions to be included in the code generation. Names are separated by whitespace characters. This overrides the normal reference checks used to determine which components are going to be generated as Java code.

name

The schema name, meaning the last component in the schema path (whether accessed from the file system, by using HTTP, or by any other means). No wildcard characters are allowed, so the name must be an exact match.

namespace

Schema target namespace URI. This can only be used to identify a schema if there's only one schema using that namespace.
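
The schema-only attributes might be combined as in the following sketch. The schema name, namespace URI, and definition names are hypothetical, and writing the includes and excludes entries as simple local names is an assumption about the expected form.

```xml
<!-- generate-all="false" skips unused global definitions by default -->
<schema-set generate-all="false">
  <!-- Identify the schema by exact file name, then force two definitions in
       and one out regardless of the normal reference checks.
       All names here are hypothetical. -->
  <schema name="types.xsd" includes="AddressType PhoneType" excludes="LegacyCode"/>
  <!-- Identify a schema by target namespace; this only works when a single
       schema uses that namespace (hypothetical URI) -->
  <schema namespace="http://example.com/ns/orders" package="com.example.orders"/>
</schema-set>
```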

Nesting customizations

Nesting customization attributes can be used on any customization element, including schema-set, schema, and schema component customizations (described below). Here's the alphabetical list of these attributes:

Nesting customization attributes

any-handling

Controls how xs:any particles are represented in the generated Java code and binding definition. Allowed values are discard (meaning discard when unmarshalling and don't generate when marshalling), dom (meaning use an org.w3c.dom.Element or list of elements for a repeating xs:any, the default), and mapped (meaning require any element(s) matching the xs:any to be defined as a global element in the schema).

choice-exposed

When true, the generated code directly exposes xs:choice states to the user. In this case the constants used for the choice states are made public, and there's an added stateXXX() method which returns the current state of the choice. Otherwise, the choice state is only exposed to the user via ifXXX() methods checking if a particular state has been set. Allowed values are true and false (the default).

choice-handling

Control how xs:choice is implemented in the generated code. xs:choice handling always uses a separate property for each alternative in the choice, and in most cases also uses a state variable that tracks the most-recent setting. There are several options for how the state is set and changed, though, and this customization selects the option to be used. Allowed values are stateless (meaning there is no state variable, and it's up to the user to make sure only one of the choice values is set), checkset (meaning that when the 'set' access method for one of the choice properties is called the code will throw an exception if a different choice had previously been set and the clearXXX() method has not been called), checkboth (meaning that in addition to the checkset check on setting a choice property, the choice property 'get' access methods will also check that the current state is either unset or matches that property), overset (meaning that when the 'set' access method for one of the choice properties is called it will overwrite any previous choice), and overboth (meaning 'set' methods overwrite previous choices, while 'get' access methods check that the current state is either unset or matches that property).

enforced-facets

This is included for use once xs:simpleType facet handling is implemented, but is currently ignored.

ignored-facets

This is included for use once xs:simpleType facet handling is implemented, but is currently ignored.

union-exposed

This is included for use once full xs:union handling is implemented, but is currently ignored.

union-handling

This is included for use once full xs:union handling is implemented, but is currently ignored.

type-substitutions

Defines type substitutions to be applied before generating code. The substitutions are given as pairs of type names, with the type to be replaced first and the type to be substituted second. The type names are all treated as namespace-qualified values, and are separated by one or more whitespace characters.
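
Because nesting attributes can appear at any level, a setting can be applied broadly and then refined for a single component. The sketch below assumes a hypothetical schema messages.xsd containing a global complex type named Envelope; the attribute names and values are taken from the list above.

```xml
<!-- The xs prefix must be declared so the type-substitutions names can be
     resolved as namespace-qualified values -->
<schema-set xmlns:xs="http://www.w3.org/2001/XMLSchema"
    any-handling="discard"
    type-substitutions="xs:integer xs:int">
  <!-- messages.xsd and Envelope are hypothetical names -->
  <schema name="messages.xsd" choice-handling="overset" choice-exposed="true">
    <!-- Keep xs:any content as DOM elements for this one type only -->
    <complexType name="Envelope" any-handling="dom"/>
  </schema>
</schema-set>
```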

Schema component customizations

Schema component customizations each apply to a particular element within a schema definition. The element name for the customization always matches the name of the schema definition element being customized (but without namespace), so the following are all legal schema component customization element names: any, anyAttribute, all, attribute, attributeGroup, choice, complexType, element, extension, group, list, restriction, sequence, simpleContent, simpleType, and union. These component customizations can be nested, and use XPath-like path specifications (relative to the containing customization) to identify the particular instance of the schema definition element to which the component customization applies. All schema component customizations can use any of the nesting customization attributes defined in the last section, and also the following attributes:

Common component customization attributes

path

Path to the schema element to be customized. The path is in XPath-like form, with path steps separated by '/' characters. '*' can be used as a path step matching any arbitrary schema element, and '**' as a path step matching any nesting of arbitrary schema elements, with the restriction that these wildcard steps cannot be used as the initial path step if the customization element is a direct child of a schema customization. In other words, you can only use wildcards once you've identified the global schema component involved. Steps matching named components of the schema definition (global type, group, or attribute group definitions, or element or attribute definitions whether global or not) can use a '[@name=...]' predicate to single out a particular instance of the component type (which will match either a 'name' or 'ref' attribute value in the schema definition). Steps can also use a numeric predicate '[n]' to identify which of several potential matches is being referenced, where the numbering starts at '1' for the first match (as in XPath). The last path step may be left empty, since the element name for this path step must always be the same as the customization element name.

position

Number of the instance to be matched. This is equivalent to a '[n]' predicate on the last step in a path expression, but as a convenience can be used directly for cases where no path expression is otherwise required.
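
The path rules can be sketched with a few variations. This example assumes a hypothetical schema orders.xsd with a global element Order and a global complex type OrderType; it has not been run against CodeGen, so treat the exact matching behavior as an assumption.

```xml
<!-- orders.xsd, Order, OrderType, Note, and version are hypothetical names -->
<schema name="orders.xsd">
  <!-- The initial step names the global component, so the '**' wildcard
       step that follows it is allowed -->
  <element path="element[@name=Order]/**/element[@name=Note]" ignore="true"/>
  <complexType path="complexType[@name=OrderType]">
    <!-- Relative to OrderType: the second xs:element directly inside its
         xs:sequence, selected with a numeric predicate -->
    <element path="sequence/element[2]" ignore="true"/>
    <!-- Nested inside the complexType customization, where the global
         component is already identified, so a leading wildcard step
         should be acceptable -->
    <attribute path="**/attribute[@name=version]" ignore="true"/>
  </complexType>
</schema>
```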

Beyond these common attributes, different types of customization elements support one or more added attributes as defined below:

Specialized component customization attributes

class-name

Java class name used for the representation of the schema component (ignored if no class required). This must be a simple class name, without package information (since the package is determined on a per-schema basis). This attribute is allowed on the following types of customization elements: all, attribute, attributeGroup, choice, complexType, element, group, sequence, simpleType.

exclude

Remove component from code generation if true. This effectively deletes the target component from the schema definition before code generation. Allowed values are true and false (the default). This attribute is allowed on the following types of customization elements: all, any, anyAttribute, attribute, attributeGroup, choice, complexType, element, group, sequence, simpleType.

ignore

Ignore element or attribute when unmarshalling documents. This drops the component from the generated data model, but accepts it and discards any content when unmarshalling (as opposed to the exclude behavior, which completely removes the component from the schema definition, meaning input documents containing an excluded element will cause errors in unmarshalling). Allowed values are true and false (the default). This attribute is allowed on the following types of customization elements: attribute and element.

inline

Control whether the schema component is represented using a separate class or inline values. The control provided by this customization is limited, in that all schema components which represent or contain data values can be represented by a separate class, but not all can be inlined by CodeGen. Allowed values are: default, meaning the default handling applies (the default); block, meaning the component should be represented using a separate class; and prefer, meaning inlining should be used if possible. Since the default CodeGen handling is to prefer inlining for local definitions, the prefer value generally only makes a difference when used on a global definition schema component. This attribute is allowed on all types of customization elements, but will be ignored if not applicable.

name

Schema component name attribute value. This is equivalent to an '[@name=XXX]' predicate on the last step in a path expression, but as a convenience can be used directly for cases where no path expression is otherwise required. The attribute is allowed on the following types of customization elements: attribute, attributeGroup, complexType, element, group, simpleType.

type

Substitute type to be used for component. This is used to replace the type specified in the schema for the target component with some other type before code generation. It is similar to using the 'type-substitutions' nesting customization attribute, but supports replacing anonymous type definitions in addition to global types. This attribute is allowed on the following customization elements: attribute, complexType, element, simpleType. Note: this attribute is not yet supported.

value-name

Name used for the Java property value representing the component, and also for the class name when the component is represented using a separate class. This customization is normally only useful when applied to nested components of a schema definition, rather than global definitions (so on group references, rather than group definitions, for instance). It is allowed on the following customization elements: all, attribute, attributeGroup, choice, element, group, sequence.
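
Several of the specialized attributes are shown together in the sketch below. The schema and all component names are hypothetical, and the comments describe the intended effect as documented above rather than verified output.

```xml
<!-- orders.xsd and every component name below are hypothetical -->
<schema name="orders.xsd">
  <!-- Rename the generated class and force a separate class (no inlining) -->
  <complexType name="OrderType" class-name="PurchaseOrder" inline="block"/>
  <!-- Delete this definition from the schema before code generation -->
  <simpleType name="LegacyCode" exclude="true"/>
  <element name="Order">
    <!-- Drop Comment from the data model, but still accept and discard it
         when unmarshalling -->
    <element path="**/element[@name=Comment]" ignore="true"/>
    <!-- Name the property for a group reference, not the group definition -->
    <group path="**/group[@name=ShippingGroup]" value-name="shipping"/>
  </element>
</schema>
```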

Command line customizations

If you just want to set some basic global customizations for CodeGen, you can do this using command-line parameters and avoid the need to create a customizations file. The special prefix "--" is used to do this. So to set delete-annotations="true" and any-handling="mapped", for instance, you'd add --delete-annotations=true and --any-handling=mapped to the CodeGen command line. No quotes are needed for the attribute value when you use this technique. This technique only allows you to set global customizations, though, so if you're doing anything at the individual schema or component level you'll still need to use a customizations file.
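
For comparison, the two command-line settings used as the example here should correspond roughly to the following minimal customizations document, assuming the global settings are placed on a root schema-set element:

```xml
<!-- Same two global settings, written as attributes on the root element -->
<schema-set delete-annotations="true" any-handling="mapped"/>
```

Starting from a file like this makes it easy to add schema-level or component-level customizations later, which the command-line form cannot express.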