Package tigase.xml



Simple XML parser implementation.

This package contains a simple XML parser implementation. The main idea was to create a lightweight parser that supports multithreaded processing, with a special focus on efficiency. It therefore supports only basic XML structures, but these are enough for many simple cases such as parsing XML streams from network connections, processing XML configuration files, or implementing a simple XML database.

The base classes define and implement a SAX-style parser:

  • SimpleParser - an implementation of a SAX parser. This is a very basic XML parser designed to be lightweight and to parse XML streams such as a Jabber XML stream. It is very efficient and can parse fragments of an XML document received from a network connection as well as several XML documents in one buffer. This is especially useful for network data, because a received packet can contain an incomplete XML document as well as several complete ones. It does not support XML comments, processing instructions, or document inclusions. It supports only:
    • Start element event (with all attributes found).
    • End element event.
    • Character data event.
    • 'Other-XML' data event - everything between '<' and '>' when the character after '<' is '?' or '!'. It can therefore 'catch' a doc-type declaration or processing instructions, but it cannot correctly process commented blocks.
    Although very simple, this implementation is sufficient for the needs of the Jabber protocol and is also used by other packages of this server, such as the XML-file-based implementation of UserRepository and the server configuration.

    It is also worth noting that this class is fully thread-safe: one instance can be used simultaneously by many threads. This improves resource usage when processing many client connections at the same time.

  • SimpleHandler - the parser handler interface for the event-driven parser. It is a much simplified version of the org.xml.sax.ContentHandler interface, created for the needs of SimpleParser. It receives events such as start element (with element attributes), end element, element c-data, other XML content, and an error event if an XML error is found.
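
The interplay described above (an event-driven handler, documents split across network buffers, and one parser shared by many connections) can be illustrated with a minimal, self-contained sketch. It does not use the tigase.xml classes: the names SharedParserSketch, Handler, State, TinyParser, and RecordingHandler are hypothetical, attributes and error events are left out, and keeping the parsing state in the handler is an assumption about how a stateless, shareable parser could be built.

```java
import java.util.ArrayList;
import java.util.List;

final class SharedParserSketch {

    /** Hypothetical, reduced handler: receives events and also keeps the parser state. */
    public interface Handler {
        void startElement(String name);
        void endElement(String name);
        void characterData(String text);
        void otherXml(String content);
        void saveState(Object state);
        Object restoreState();
    }

    /** Per-stream state: the token read so far and whether we are inside '<' ... '>'. */
    static final class State {
        final StringBuilder buf = new StringBuilder();
        boolean inTag;
    }

    /** Parser without fields: all state lives in the handler, so one instance can be shared. */
    public static final class TinyParser {
        public void parse(Handler h, char[] data, int off, int len) {
            State s = (State) h.restoreState();
            if (s == null) {
                s = new State();
            }
            for (int i = off; i < off + len; i++) {
                char c = data[i];
                if (!s.inTag && c == '<') {
                    // Text collected before a tag is reported as character data.
                    if (s.buf.length() > 0) {
                        h.characterData(s.buf.toString());
                    }
                    s.buf.setLength(0);
                    s.inTag = true;
                } else if (s.inTag && c == '>') {
                    emitTag(h, s.buf.toString());
                    s.buf.setLength(0);
                    s.inTag = false;
                } else {
                    s.buf.append(c);
                }
            }
            // An unfinished token stays in the state until the next buffer arrives.
            h.saveState(s);
        }

        private void emitTag(Handler h, String tag) {
            if (tag.startsWith("?") || tag.startsWith("!")) {
                h.otherXml(tag);
            } else if (tag.startsWith("/")) {
                h.endElement(tag.substring(1).trim());
            } else {
                boolean selfClosing = tag.endsWith("/");
                String body = selfClosing ? tag.substring(0, tag.length() - 1) : tag;
                String name = body.trim().split("\\s+")[0]; // attributes are ignored in this sketch
                h.startElement(name);
                if (selfClosing) {
                    h.endElement(name);
                }
            }
        }
    }

    /** Handler that records every event as a string; one instance per connection. */
    public static final class RecordingHandler implements Handler {
        public final List<String> events = new ArrayList<>();
        private Object state;

        public void startElement(String name) { events.add("start:" + name); }
        public void endElement(String name) { events.add("end:" + name); }
        public void characterData(String text) { events.add("cdata:" + text); }
        public void otherXml(String content) { events.add("other:" + content); }
        public void saveState(Object state) { this.state = state; }
        public Object restoreState() { return state; }
    }

    /** Feeds two documents, split at arbitrary points, through one parser. */
    public static List<String> demo() {
        TinyParser parser = new TinyParser();
        RecordingHandler handler = new RecordingHandler();
        String[] chunks = {
            "<?xml version='1.0'?><mess", "age to='a'><body>Hi</bo", "dy></message><presence/>"
        };
        for (String chunk : chunks) {
            char[] data = chunk.toCharArray();
            parser.parse(handler, data, 0, data.length);
        }
        return handler.events;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

Because TinyParser has no fields, several threads could call parse on the same instance as long as each connection uses its own handler. Like the parser described above, this sketch does not handle comments correctly: a '>' inside a comment would end the 'Other-XML' event early.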

Built on the above SAX parser, there is also a DOM implementation. The classes used to build a DOM for XML content are:

  • DomBuilderHandler - an implementation of SimpleHandler that builds DOM structures during parsing. It also supports the creation of multiple, separate document trees if the parsed buffer contains several XML documents. As the result of its work it always returns a Queue containing all XML trees found, in the same order as they appeared in the network data.
    Document trees created by this DOM builder consist of instances of the Element class or of a class extending Element. To receive trees built from instances of the proper class, the user must provide an ElementFactory implementation that creates instances of the required Element extension.
  • Element - the basic document tree node implementation. It supports Java 5.0 generic types, which makes it easier to extend while preserving some useful functionality. It is sufficient for simple cases, but in most advanced cases it should probably be extended with additional features. See the API documentation for more details and for information about existing extensions. The most important features, apart from the obvious tree implementation, are:
    • toString() implementation that generates valid XML content from this element and all its children.
    • addChild(...), getChild(childName) supporting generic types.
    • findChild(childPath), which finds a child in the subtree by the given path to the element.
    • getChildCData(childPath), getAttribute(childPath, attName), which return the element c-data or an attribute of a child in the subtree identified by the given path to the element.
  • ElementFactory is the interface definition for factories that create proper instances of the Element class or of its extensions.
  • DefaultElementFactory is an ElementFactory implementation that creates instances of the basic Element class. It exists to make the DOM implementation complete and can be used when the basic Element class is sufficient for particular needs.
  • SingletonFactory provides a way to use only one instance of SimpleParser in all your code. Since SimpleParser is a fully thread-safe implementation, there is no point in using multiple instances of this class. This is particularly useful when processing many network connections sending XML streams, where using one instance for all connections can save some resources.
    Of course, it is still possible to create as many instances of SimpleParser as you like in the normal way, using the public constructor.
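
The DOM-building approach described in this list can be sketched in a few lines: a handler keeps a stack of open elements, asks a factory for each new node, and moves every completed root into a queue. The sketch below is self-contained and does not use the tigase.xml classes. The names DomBuilderSketch, Node, NodeFactory, and TreeBuilder are hypothetical, attributes are omitted, and the path syntax "/message/body" (with the root name as the first segment) is an assumption made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Queue;

final class DomBuilderSketch {

    /** Hypothetical tree node; a stand-in for illustration, not the tigase.xml Element class. */
    public static class Node {
        public final String name;
        public final StringBuilder cdata = new StringBuilder();
        public final List<Node> children = new ArrayList<>();

        public Node(String name) {
            this.name = name;
        }

        /** Follows a path such as "/message/body"; the first segment must name this node. */
        public Node findChild(String path) {
            String p = path.startsWith("/") ? path.substring(1) : path;
            String[] parts = p.split("/");
            if (!parts[0].equals(name)) {
                return null;
            }
            Node current = this;
            for (int i = 1; i < parts.length && current != null; i++) {
                Node next = null;
                for (Node child : current.children) {
                    if (child.name.equals(parts[i])) {
                        next = child;
                        break;
                    }
                }
                current = next;
            }
            return current;
        }

        /** Returns the character data of the node at the given path, or null if absent. */
        public String getChildCData(String path) {
            Node n = findChild(path);
            return n == null ? null : n.cdata.toString();
        }

        /** Serializes the subtree; no escaping, and text is written before child elements. */
        @Override
        public String toString() {
            StringBuilder sb = new StringBuilder("<").append(name).append('>').append(cdata);
            for (Node child : children) {
                sb.append(child);
            }
            return sb.append("</").append(name).append('>').toString();
        }
    }

    /** Hypothetical factory, so callers can supply their own Node subclass. */
    public interface NodeFactory {
        Node create(String name);
    }

    /** Turns start/end/character-data events into trees, one per top-level element. */
    public static class TreeBuilder {
        private final NodeFactory factory;
        private final Deque<Node> open = new ArrayDeque<>();
        private final Queue<Node> parsed = new ArrayDeque<>();

        public TreeBuilder(NodeFactory factory) {
            this.factory = factory;
        }

        public void startElement(String name) {
            Node node = factory.create(name);
            if (!open.isEmpty()) {
                open.peek().children.add(node);
            }
            open.push(node);
        }

        public void characterData(String text) {
            if (!open.isEmpty()) {
                open.peek().cdata.append(text);
            }
        }

        public void endElement(String name) {
            Node node = open.pop();
            if (open.isEmpty()) {
                parsed.add(node); // a root element was closed: one complete tree
            }
        }

        /** Completed trees in the order in which they were closed. */
        public Queue<Node> getParsed() {
            return parsed;
        }
    }

    /** Simulates the events for two documents arriving in one buffer. */
    public static Queue<Node> demo() {
        TreeBuilder builder = new TreeBuilder(Node::new);
        builder.startElement("message");
        builder.startElement("body");
        builder.characterData("Hi");
        builder.endElement("body");
        builder.endElement("message");
        builder.startElement("presence");
        builder.endElement("presence");
        return builder.getParsed();
    }

    public static void main(String[] args) {
        for (Node tree : demo()) {
            System.out.println(tree);
        }
    }
}
```

Passing a different NodeFactory (for example, one that returns a subclass of Node) changes the node type of the resulting trees without touching the builder, which is the role the text above assigns to ElementFactory.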