Thursday, May 31, 2012

SAX vs DOM Parsers

SAX parser can get better speed. A tree-based API is centered around a tree structure and therefore provides interfaces on components of a tree (which is a DOM document) such as Document interface,Node interface, NodeList interface, Element interface, Attr interface and so on. By contrast, however, an event-based API provides interfaces on handlers. There are four handler interfaces, ContentHandler interface, DTDHandler interface, EntityResolver interface and ErrorHandler interface.
DOM parsers and SAX parsers work in different ways.

* A DOM parser creates a tree structure in memory from the input document and then waits for requests from client. But a SAX parser does not create any internal structure. Instead, it takes the occurrences of components of a input document as events, and tells the client what it reads as it reads through the input document.

* A DOM parser always serves the client application with the entire document no matter how much is actually needed by the client. But a SAX parser serves the client application always only with pieces of the document at any given time. * With DOM parser, method calls in client application have to be explicit and forms a kind of chain. But with SAX, some certain methods (usually overriden by the cient) will be invoked automatically (implicitly) in a way which is called "callback" when some certain events occur. These methods do not have to be called explicitly by the client, though we could call them explicitly.

Ideally a good parser should be fast (time efficient),space efficient, rich in functionality and easy to use . But in reality, none of the main parsers have all these features at the same time. For example, a DOMParser is rich in functionality (because it creates a DOM tree in memory and allows you to access any part of the document repeatedly and allows you to modify the DOM tree), but it is space inefficient when the document is huge, and it takes a little bit long to learn how to work with it. A SAXParser, however, is much more space efficient in case of big input document (because it creates no internal structure). What's more, it runs faster and is easier to learn than DOMParser because its API is really simple. But from the functionality point of view, it provides less functions which mean that the users themselves have to take care of more, such as creating their own data structures.

SAX DOM
Both SAX and DOM are used to parse the XML document. Both has advantages and disadvantages and can be used in our programming depending on the situation.
Parses node by node Stores the entire XML document into memory before processing
Doesn’t store the XML in memory Occupies more memory
We cant insert or delete a node We can insert or delete nodes
Top to bottom traversing Traverse in any direction.
SAX is an event based parser DOM is a tree model parser
SAX is a Simple API for XML Document Object Model (DOM) API
import javax.xml.parsers.*;
import org.xml.sax.*;
import org.xml.sax.helpers.*;
import javax.xml.parsers.*;
import org.w3c.dom.*;
 
doesn’t preserve comments preserves comments
SAX generally runs a little faster than DOM SAX generally runs a little faster than DOM
If we need to find a node and doesn’t need to insert or delete we can go with SAX itself otherwise DOM provided we have more memory.

6 comments:

Anonymous said...
This comment has been removed by a blog administrator.
Anonymous said...
This comment has been removed by a blog administrator.
Anonymous said...
This comment has been removed by a blog administrator.
Anonymous said...
This comment has been removed by a blog administrator.
Anonymous said...
This comment has been removed by a blog administrator.
Anonymous said...
This comment has been removed by a blog administrator.