XML Schema – Reusable groups of elements

Want to avoid sprawling, arbitrary inheritance hierarchies in your XML schema but keep the resulting XML free of nested, complex elements? Groups might be just what you’re looking for.

It is a truth universally acknowledged that clients are always a pain in the arse. Recently asked, for an undisclosed reason (but my usual fee), to flatten an XML structure, I found myself wondering what to do with a nested, sequence-based element whose fields did not fit into my nice, meaningful inheritance hierarchy but which I did not want to cut-and-paste into every element that required them.

Put simply, what I really needed was aggregation rather than inheritance but without the common sets of fields being wrapped in a parent element.

The solution I found was XML Schema Groups.

The problem

To illustrate the problem I’ll use the example of modelling account details for the world’s simplest and least evil bank.

We’ve got a parent type, accountType, which we specialise into different types of account: savingsAccountType, mortgageAccountType and loanAccountType:-

<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        targetNamespace="https://www.devguerrilla.com/xml/accounts"
        xmlns:tns="https://www.devguerrilla.com/xml/accounts"
        elementFormDefault="qualified">

    <!--  Common properties of all accounts -->      
    <complexType name="accountType">
        <sequence>
            <element name="number" type="string" minOccurs="1" 
                     maxOccurs="1"/>
            <element name="name" type="string" minOccurs="1" 
                     maxOccurs="1"/>
            <element name="balance" type="float" minOccurs="1" 
                     maxOccurs="1"/>
            <element name="interestRate" type="float" minOccurs="1" 
                     maxOccurs="1"/>
        </sequence>
    </complexType>

   <!-- Delinquency details - only relevant to debt accounts (though 
        potentially also for, say, a current account overdraft) -->
   <complexType name="delinquencyDetailsType">
       <sequence>
           <element name="daysOverdue" type="unsignedInt" minOccurs="1" 
                    maxOccurs="1"/>
           <element name="totalOverdue" type="float" minOccurs="1" 
                    maxOccurs="1"/>
       </sequence>
   </complexType>

   <!-- Savings account with an optional notice period -->
   <complexType name="savingsAccountType">
       <complexContent>
           <extension base="tns:accountType">
               <sequence>
                   <element name="noticePeriod" type="unsignedInt" 
                            minOccurs="0" maxOccurs="1"/>
               </sequence>
           </extension>
       </complexContent>
    </complexType>

    <!--  Mortgage account - monthly payment, asset details 
          and optional delinquency details -->
    <complexType name="mortgageAccountType">
       <complexContent>
           <extension base="tns:accountType">
               <sequence>
                   <element name="securedAsset" type="string" 
                            minOccurs="1" maxOccurs="1"/>
                   <element name="repayment" type="float" 
                            minOccurs="1" maxOccurs="1"/>
                   <element name="delinquent" 
                            type="tns:delinquencyDetailsType" 
                            minOccurs="0" maxOccurs="1"/>
               </sequence>
           </extension>
       </complexContent>
    </complexType>

    <!--  Loan account - monthly payment amount and 
          optional delinquency details -->

    <complexType name="loanAccountType">
        <complexContent>
            <extension base="tns:accountType">
                <sequence>
                    <element name="repayment" type="float" 
                             minOccurs="1" maxOccurs="1"/>
                    <element name="delinquent" 
                             type="tns:delinquencyDetailsType" 
                             minOccurs="0" maxOccurs="1"/>
                </sequence>
            </extension>
        </complexContent>
    </complexType>

    <element name="accounts">
        <complexType>
            <choice minOccurs="1" maxOccurs="unbounded">
                <element name="savingsAccount" 
                         type="tns:savingsAccountType"/>
                <element name="mortgageAccount" 
                         type="tns:mortgageAccountType"/>
                <element name="loanAccount" 
                         type="tns:loanAccountType"/>
            </choice>
        </complexType>
    </element>

</schema>

Since mortgages and loans can go overdue we have a delinquent element, of type delinquencyDetailsType, to capture this. It is only meaningful for certain types of account:-

<?xml version="1.0" encoding="UTF-8"?>
<accounts xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xmlns="https://www.devguerrilla.com/xml/accounts">

    <loanAccount>
        <number>000-001</number>
        <name>James Bloggs</name>
        <balance>-1000.00</balance>
        <interestRate>8.3</interestRate>
        <repayment>50.00</repayment>
        <delinquent>
            <daysOverdue>60</daysOverdue>
            <totalOverdue>100.00</totalOverdue>
        </delinquent>
    </loanAccount>

    <mortgageAccount>
       <number>000-002</number>
       <name>Fred Bloggs</name>
       <balance>100000.00</balance>
       <interestRate>3.5</interestRate>
       <securedAsset>24 Acacia Avenue</securedAsset>
       <repayment>800.00</repayment>
       <delinquent>
           <daysOverdue>45</daysOverdue>
           <totalOverdue>1600.00</totalOverdue>
       </delinquent>
   </mortgageAccount>

   <savingsAccount>
       <number>000-003</number>
       <name>Sally Smith</name>
       <balance>3000.00</balance>
       <interestRate>2.5</interestRate>
       <noticePeriod>30</noticePeriod>
   </savingsAccount>

</accounts>

Fussy about their XML whilst curiously indifferent to the sheer amount of relevant data they’re not capturing, our bank likes the delinquency details but doesn’t want these fields wrapped in a parent element:-

<mortgageAccount>
    <number>000-002</number>
    <name>Fred Bloggs</name>
    <balance>100000.00</balance>
    <interestRate>3.5</interestRate>
    <securedAsset>24 Acacia Avenue</securedAsset>
    <repayment>800.00</repayment>
    <daysOverdue>45</daysOverdue>
    <totalOverdue>1600.00</totalOverdue>
</mortgageAccount>

So, what are our options here? We could add these fields to accountType and just ignore them for credit accounts, but this allows us to create totally meaningless XML, such as a savings account that’s in arrears. That is exactly the sort of thing schema constraints are there to help catch.
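
To make that concrete, here is a sketch of the kind of nonsense that would validate, assuming (hypothetically) that daysOverdue and totalOverdue had been bolted onto the end of accountType as optional elements:

```xml
<!-- Assumes daysOverdue and totalOverdue were appended to accountType
     with minOccurs="0"; this is NOT the schema shown above -->
<savingsAccount>
    <number>000-003</number>
    <name>Sally Smith</name>
    <balance>3000.00</balance>
    <interestRate>2.5</interestRate>
    <!-- A savings account in arrears: meaningless, but schema-valid -->
    <daysOverdue>45</daysOverdue>
    <totalOverdue>1600.00</totalOverdue>
    <noticePeriod>30</noticePeriod>
</savingsAccount>
```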

We could create a second level of inheritance, debtAccount, to store these additional fields, but using hierarchies to model aspects of data rather than classifications of data can quickly become painful. Say we wanted to add cardDetails to our accounts: a creditCardAccount and a currentAccount would have these, but a savingsAccount and mortgageAccount wouldn’t, yet a creditCardAccount would be a debtAccount whereas a currentAccount would be a creditAccount. Head spinning? It should be.
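
For the record, a minimal sketch of what that extra level might look like. The debtAccountType name and the decision to make the two overdue fields an optional pair are my assumptions, not something from the bank's schema:

```xml
<!-- Hypothetical intermediate type sitting between accountType
     and the debt-style accounts -->
<complexType name="debtAccountType">
    <complexContent>
        <extension base="tns:accountType">
            <!-- Optional as a pair: both fields appear or neither does -->
            <sequence minOccurs="0" maxOccurs="1">
                <element name="daysOverdue" type="unsignedInt"/>
                <element name="totalOverdue" type="float"/>
            </sequence>
        </extension>
    </complexContent>
</complexType>

<!-- The loan account now extends the intermediate type instead -->
<complexType name="loanAccountType">
    <complexContent>
        <extension base="tns:debtAccountType">
            <sequence>
                <element name="repayment" type="float"/>
            </sequence>
        </extension>
    </complexContent>
</complexType>
```

Note that extension always appends to the base type's content, so with this approach the overdue fields would have to appear before repayment in the instance document. That is one more way the hierarchy ends up dictating things it has no business dictating.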

Favouring neither option, lazy developers (and I often am one) might be inclined to just cut and paste the nested fields into each element that needs them. Needless to say that’s both wasteful and a potential maintenance headache down the line.

Fortunately it turns out that there’s a next best thing to cut-and-paste.

Re-usable element groups.

The solution

To use a complexType defining a sequence of elements as just that sequence of elements, and not as the content of an element in its own right, we simply need to change its declaration from <complexType/> to <group/>:-

<group name="delinquencyDetailsType">
    <sequence>
        <element name="daysOverdue" type="unsignedInt" minOccurs="1" 
                 maxOccurs="1"/>
        <element name="totalOverdue" type="float" minOccurs="1" 
                 maxOccurs="1"/>
    </sequence>
</group>

And, wherever we declared an element of this type in our XSD, we reference the group instead:-

<complexType name="mortgageAccountType">
    <complexContent>
        <extension base="tns:accountType">
            <sequence>
                <element name="securedAsset" type="string" 
                         minOccurs="1" maxOccurs="1"/>
                <element name="repayment" type="float" 
                         minOccurs="1" maxOccurs="1"/>
                <group ref="tns:delinquencyDetailsType" 
                       minOccurs="0" maxOccurs="1"/>
            </sequence>
        </extension>
    </complexContent>
</complexType>
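
The same swap presumably applies to loanAccountType, the other type that declared a delinquent element. A sketch, based on the original definition of that type:

```xml
<complexType name="loanAccountType">
    <complexContent>
        <extension base="tns:accountType">
            <sequence>
                <element name="repayment" type="float" 
                         minOccurs="1" maxOccurs="1"/>
                <!-- Replaces the nested delinquent element: the group's
                     two elements are inlined here, both or neither -->
                <group ref="tns:delinquencyDetailsType" 
                       minOccurs="0" maxOccurs="1"/>
            </sequence>
        </extension>
    </complexContent>
</complexType>
```

Because minOccurs="0" sits on the group reference rather than on the individual elements, daysOverdue and totalOverdue remain an all-or-nothing pair, just as they were when wrapped in delinquent.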

Presto! Our flattened version of delinquencyDetails now validates.
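
Groups also deal with the head-spinning cardDetails scenario, since a type can reference as many groups as it needs without caring where it sits in the hierarchy. The following is only a sketch: the cardDetails group, its element names and creditCardAccountType are all hypothetical and not part of the bank's actual schema.

```xml
<!-- Hypothetical group; element names are illustrative only.
     Named groups must be declared as direct children of <schema> -->
<group name="cardDetails">
    <sequence>
        <element name="cardNumber" type="string"/>
        <element name="expiryDate" type="gYearMonth"/>
    </sequence>
</group>

<!-- Hypothetical account type aggregating two unrelated groups -->
<complexType name="creditCardAccountType">
    <complexContent>
        <extension base="tns:accountType">
            <sequence>
                <group ref="tns:cardDetails"/>
                <group ref="tns:delinquencyDetailsType" minOccurs="0"/>
            </sequence>
        </extension>
    </complexContent>
</complexType>
```

A currentAccountType could reference cardDetails in exactly the same way without becoming any sort of debt account, which is the aggregation-over-inheritance behaviour I was after.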
