mirror of
https://github.com/zeux/pugixml.git
synced 2025-01-03 01:55:25 +08:00
75a0d2379a
git-svn-id: http://pugixml.googlecode.com/svn/trunk@449 99668b35-9821-0410-8761-19e4c4f06640
824 lines
52 KiB
HTML
<html>
|
|
<head>
|
|
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
|
|
<title>pugixml documentation</title>
|
|
</head>
|
|
<body link="#0000ff" vlink="#800080">
|
|
<table border="0" cellpadding="4" cellspacing="0" width="100%" summary="header">
|
|
<tr>
|
|
<td valign="top" bgcolor="#eeeeee">
|
|
<h2 align="left">pugixml documentation</h2>
|
|
</td>
|
|
</tr>
|
|
</table>
|
|
<hr>
|
|
<h2>Contents</h2>
|
|
<dl class="index">
|
|
<dt><a href="#Introduction">Introduction</a></dt>
|
|
<dt><a href="#QuickStart">Quick start</a></dt>
|
|
<dt><a href="#Reference">Reference</a></dt>
|
|
<dt><a href="#Compliance">W3C compliance</a></dt>
|
|
<dt><a href="#ComparisonTable">Comparison with existing parsers</a></dt>
|
|
<dt><a href="#FAQ">FAQ</a></dt>
|
|
<dt><a href="#Bugs">Bugs</a></dt>
|
|
<dt><a href="#Future_work">Future work</a></dt>
|
|
<dt><a href="#Changelog">Changelog</a></dt>
|
|
<dt><a href="#Acknowledgements">Acknowledgements</a></dt>
|
|
<dt><a href="#License">License</a></dt>
|
|
</dl>
|
|
|
|
<hr>
|
|
|
|
<a name="Introduction"></a>
|
|
<h2>Introduction</h2>
|
|
<p><i>pugixml</i> is just another XML parser. It is a successor to
<a href="http://www.codeproject.com/soap/pugxml.asp">pugxml</a>, although, to be honest, the only part
kept unchanged is the wildcard matching code; the rest was either heavily refactored or rewritten
from scratch. The main features are:</p>
|
|
|
|
<ul>
|
|
<li>low memory consumption and fragmentation (about 1.3 times less than <i>pugxml</i>, 2.5 times less than <i>TinyXML</i>
and 4.3 times less than <i>Xerces (DOM)</i> <a href="#annot-1"><sup>1</sup></a>). Exact numbers are given
in the <a href="#ComparisonTable">Comparison with existing parsers</a> section.</li>
|
|
<li>extremely high parsing speed (about 6 times faster than <i>pugxml</i>, 10 times faster than <i>TinyXML</i>
and 17.6 times faster than <i>Xerces-DOM</i> <a href="#annot-1"><sup>1</sup></a>)</li>
|
|
<li>extremely high parsing speed (well, I'm repeating myself, but it is so fast that it outperforms
<i>Expat</i> by <b>2.8 times</b> on the test XML) <a href="#annot-2"><sup>2</sup></a></li>
|
|
<li>more or less standard-conformant (it will parse any standard-compliant file correctly, with the
|
|
exception of DTD related issues)</li>
|
|
<li>pretty much error-ignorant (it will not choke on something like &lt;text&gt;You &amp; Me&lt;/text&gt;,
as <i>Expat</i> does; it will parse files with data in the wrong encoding; and so on)</li>
|
|
<li>clean interface (a heavily refactored version of pugxml's)</li>
|
|
<li>more or less Unicode-aware (it assumes UTF-8 encoding of the input data, though
it will readily work with ANSI; there is no UTF-16 support for now, see <a href="#Future_work">Future work</a>), with
helper conversion functions (UTF-8 &lt;-&gt; UTF-16/32, whichever is the default for std::wstring &amp; wchar_t)</li>
|
|
<li>fully standard-compliant C++ code (approved by <a href="http://www.comeaucomputing.com/tryitout/">Comeau</a>
strict mode); the library is multiplatform (see <a href="#Reference">Reference</a> for the list of
platforms)</li>
|
|
<li>high flexibility. You can control many aspects of file parsing and DOM tree building via parsing
options.</li>
|
|
</ul>
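<p>The helper conversions mentioned in the Unicode item above can be pictured with a small sketch. This is not
<i>pugixml</i>'s implementation: the function name <b>decode_utf8</b> is hypothetical, and validation is kept to a
minimum (stray, malformed or truncated bytes become U+FFFD; overlong forms and surrogates are not rejected). It only
shows how UTF-8 bytes map to the 32-bit code points that a 4-byte wchar_t would hold.</p>

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of pugixml): decode UTF-8 bytes into
// UTF-32 code points. Stray lead bytes, bad continuation bytes and
// truncated sequences yield U+FFFD; overlong forms are NOT rejected.
std::u32string decode_utf8(const std::string& in)
{
    std::u32string out;
    std::size_t i = 0;

    while (i < in.size())
    {
        unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        std::size_t extra = 0; // number of continuation bytes expected

        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else { out.push_back(0xFFFD); ++i; continue; } // stray byte

        // Input ends in the middle of a sequence.
        if (i + extra >= in.size()) { out.push_back(0xFFFD); break; }

        bool ok = true;
        for (std::size_t k = 1; k <= extra; ++k)
        {
            unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80) { ok = false; break; }
            cp = (cp << 6) | (c & 0x3F); // append 6 payload bits
        }

        if (ok) { out.push_back(cp); i += extra + 1; }
        else    { out.push_back(0xFFFD); ++i; }
    }

    return out;
}
```

<p>A UTF-16 variant would additionally split code points above U+FFFF into surrogate pairs; that step is omitted here.</p>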
|
|
|
|
<p>Okay, you might ask - what's the catch? Everything looks so nice - it's a small, fast, robust, clean solution
for parsing XML. What is missing? Well, we are fair developers - so here is a misfeature list:</p>
|
|
|
|
<ul>
|
|
<li>memory consumption. It beats every DOM-based parser that I know of - but against a SAX parser
it has no chance. You can't process a 2 Gb XML file with less than 4 Gb of memory - and do it fast.
Still, <i>pugixml</i> behaves better than all other DOM-based parsers, so if you're stuck with DOM,
it's not a problem.</li>
|
|
<li>memory consumption. Ok, I'm repeating myself. Again. While other parsers allow you to provide the
XML file in constant storage (or even as a memory-mapped area), <i>pugixml</i> does not. So you'll
have to copy the entire data into non-constant storage. Moreover, it must persist for the
parser's lifetime (the reasons for that, and more about lifetimes, are given below). Again, if you're
ok with DOM, it should not be a problem, because the overall memory consumption is lower (though
you'll need a contiguous chunk of memory, which can be a problem).</li>
|
|
<li>lack of validation, DTD processing, XML namespaces and proper handling of encodings. If you need those,
use MSXML or XercesC or anything like that.</li>
|
|
<li>lack of UTF-16/32 parsing. This is not implemented for now, but it is a feature planned for the next
release.</li>
|
|
</ul>
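<p>The requirement for writable storage that outlives the parsed data can be illustrated with a toy example. The sketch
below is not <i>pugixml</i> code, and <b>first_tag_name</b> is a hypothetical function; it assumes an in-place approach
in which the parser writes string terminators directly into the input buffer and hands out pointers into it instead of
copying.</p>

```cpp
#include <cassert>
#include <cstring>

// Hypothetical illustration (not pugixml code): return the name of the first
// element in `buffer` without copying it. The terminating '\0' is written
// into the buffer itself, which is why the buffer must be writable, and the
// returned pointer is only valid while the buffer is alive.
const char* first_tag_name(char* buffer)
{
    assert(buffer != 0);

    char* open = std::strchr(buffer, '<');
    if (!open) return 0; // no markup at all

    char* name = open + 1;
    char* end = name;

    // Scan to the end of the name: whitespace, '/', '>' or end of input.
    while (*end && *end != '>' && *end != '/' && *end != ' ' &&
           *end != '\t' && *end != '\n' && *end != '\r')
        ++end;

    *end = '\0'; // in-place modification: overwrites the delimiter
    return name;
}
```

<p>Called on a writable char array holding &lt;sample-xml&gt;some text&lt;/sample-xml&gt;, the function returns a pointer
to "sample-xml" inside that same array. Passing a string literal would be undefined behaviour, and deleting the array
would leave the returned pointer dangling.</p>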
|
|
|
|
<hr>
|
|
|
|
<a name="annot-1"></a><sup>1</sup><small> The tests were done on a 1 MB XML file with a 4-level-deep tree
and a small amount of text. The times are those of building the DOM tree. <i>pugixml</i> was run in the default
parsing mode, so the differences in speed are even bigger with minimal settings.</small> <br>
<a name="annot-2"></a><sup>2</sup><small> Obviously, you can't measure the time of building a DOM tree for a
SAX parser, so the time of reading the data into storage that closely represented the structure of
the XML file was measured instead.</small>
|
|
|
|
<hr>
|
|
|
|
<a name="QuickStart"></a>
|
|
<h2>Quick start</h2>
|
|
|
|
<p>Here is a small collection of code snippets to help the reader begin using <i>pugixml</i>.</p>
|
|
|
|
<p>For everything you can do with <i>pugixml</i>, you need a document. There are several ways to obtain one:</p>
|
|
|
|
<table width = "100%" bgcolor="#e6e6e6"><tr><td><pre><font color="white">
|
|
<font color="#008000" >#include</font> <font color="#ff0000" >&lt;cstring&gt;</font>
<font color="#008000" >#include</font> <font color="#ff0000" >&lt;fstream&gt;</font>
<font color="#008000" >#include</font> <font color="#ff0000" >&lt;iostream&gt;</font>
|
|
|
|
<font color="#008000" >#include</font> <font color="#ff0000" >"pugixml.hpp"</font>
|
|
|
|
<b><font color="#0000ff" >using</font></b> <b><font color="#0000ff" >namespace</font></b> <font color="#000000" >std;</font>
|
|
<b><font color="#0000ff" >using</font></b> <b><font color="#0000ff" >namespace</font></b> <font color="#000000" >pugi;</font>
|
|
|
|
<b><font color="#0000ff" >int</font></b> <font color="#000000" >main()</font>
|
|
<font color="#000000" >{</font>
|
|
<i><font color="#808080" >// Several ways to get XML document</font></i>
|
|
|
|
<font color="#000000" >{</font>
|
|
<i><font color="#808080" >// Load from string</font></i>
|
|
<font color="#000000" >xml_document</font> <font color="#000000" >doc;</font>
|
|
|
|
<font color="#000000" >cout</font> <font color="#000000" >&lt;&lt;</font> <font color="#000000" >doc.load(</font><font color="#ff0000" >"&lt;sample-xml&gt;some text &lt;b&gt;in bold&lt;/b&gt; here&lt;/sample-xml&gt;"</font><font color="#000000" >)</font> <font color="#000000" >&lt;&lt;</font> <font color="#000000" >endl;</font>
|
|
<font color="#000000" >}</font>
|
|
|
|
<font color="#000000" >{</font>
|
|
<i><font color="#808080" >// Load from file</font></i>
|
|
<font color="#000000" >xml_document</font> <font color="#000000" >doc;</font>
|
|
|
|
<font color="#000000" >cout</font> <font color="#000000" ><<</font> <font color="#000000" >doc.load_file(</font><font color="#ff0000" >"sample.xml"</font><font color="#000000" >)</font> <font color="#000000" ><<</font> <font color="#000000" >endl;</font>
|
|
<font color="#000000" >}</font>
|
|
|
|
<font color="#000000" >{</font>
|
|
<i><font color="#808080" >// Load from any input stream (STL)</font></i>
|
|
<font color="#000000" >xml_document</font> <font color="#000000" >doc;</font>
|
|
|
|
<font color="#000000" >std::ifstream</font> <font color="#000000" >in(</font><font color="#ff0000" >"sample.xml"</font><font color="#000000" >);</font>
|
|
<font color="#000000" >cout</font> <font color="#000000" ><<</font> <font color="#000000" >doc.load(in)</font> <font color="#000000" ><<</font> <font color="#000000" >endl;</font>
|
|
<font color="#000000" >}</font>
|
|
|
|
<font color="#000000" >{</font>
|
|
<i><font color="#808080" >// More advanced: parse the specified string without duplicating it</font></i>
|
|
<font color="#000000" >xml_document</font> <font color="#000000" >doc;</font>
|
|
|
|
<b><font color="#0000ff" >char</font></b><font color="#000000" >*</font> <font color="#000000" >s</font> <font color="#000000" >=</font> <font color="#000000" >new</font> <b><font color="#0000ff" >char</font></b><font color="#000000" >[</font><b><font color="#40b440" >100</font></b><font color="#000000" >];</font>
|
|
<font color="#000000" >strcpy(s,</font> <font color="#ff0000" >"&lt;sample-xml&gt;some text &lt;b&gt;in bold&lt;/b&gt; here&lt;/sample-xml&gt;"</font><font color="#000000" >);</font>
|
|
<font color="#000000" >cout</font> <font color="#000000" ><<</font> <font color="#000000" >doc.parse(transfer_ownership_tag(),</font> <font color="#000000" >s)</font> <font color="#000000" ><<</font> <font color="#000000" >endl;</font>
|
|
<font color="#000000" >}</font>
|
|
|
|
<font color="#000000" >{</font>
|
|
<i><font color="#808080" >// Even more advanced: assume manual lifetime control</font></i>
|
|
<font color="#000000" >xml_document</font> <font color="#000000" >doc;</font>
|
|
|
|
<b><font color="#0000ff" >char</font></b><font color="#000000" >*</font> <font color="#000000" >s</font> <font color="#000000" >=</font> <font color="#000000" >new</font> <b><font color="#0000ff" >char</font></b><font color="#000000" >[</font><b><font color="#40b440" >100</font></b><font color="#000000" >];</font>
|
|
<font color="#000000" >strcpy(s,</font> <font color="#ff0000" >"&lt;sample-xml&gt;some text &lt;b&gt;in bold&lt;/b&gt; here&lt;/sample-xml&gt;"</font><font color="#000000" >);</font>
|
|
<font color="#000000" >cout</font> <font color="#000000" ><<</font> <font color="#000000" >doc.parse(</font><font color="#000000" >s)</font> <font color="#000000" ><<</font> <font color="#000000" >endl;</font>
|
|
|
|
<font color="#000000" >delete[]</font> <font color="#000000" >s;</font> <i><font color="#808080" >// &lt;-- after this point, all string contents of the document are invalid!</font></i>
|
|
<font color="#000000" >}</font>
|
|
|
|
<font color="#000000" >{</font>
|
|
<i><font color="#808080" >// Or just create document from code?</font></i>
|
|
<font color="#000000" >xml_document</font> <font color="#000000" >doc;</font>
|
|
|
|
<i><font color="#808080" >// add nodes to document (see next samples)</font></i>
|
|
<font color="#000000" >}</font>
|
|
<font color="#000000" >}</font>
|
|
</font></pre></td></tr><tr><td align="right"><b><i><a href="http://dobrokot.nm.ru/WinnieColorizer.html"><font color="#666666">_Winnie C++ Colorizer</font></a></i></b></td></tr></table>
|
|
|
|
<p>This sample should print five lines containing 1, meaning that all load/parse functions returned true (of course, if sample.xml does not exist or is malformed, there will be 0's).</p>
|
|
|
|
<p>Once you have your document, there are several ways to extract data from it.</p>
|
|
|
|
<table width = "100%" bgcolor="#e6e6e6"><tr><td><pre><font color="white">
|
|
<font color="#008000" >#include</font> <font color="#ff0000" >&lt;iostream&gt;</font>
|
|
|
|
<font color="#008000" >#include</font> <font color="#ff0000" >"pugixml.hpp"</font>
|
|
|
|
<b><font color="#0000ff" >using</font></b> <b><font color="#0000ff" >namespace</font></b> <font color="#000000" >std;</font>
|
|
<b><font color="#0000ff" >using</font></b> <b><font color="#0000ff" >namespace</font></b> <font color="#000000" >pugi;</font>
|
|
|
|
<b><font color="#0000ff" >struct</font></b> <font color="#000000" >bookstore_traverser:</font> <b><font color="#0000ff" >public</font></b> <font color="#000000" >xml_tree_walker</font>
|
|
<font color="#000000" >{</font>
|
|
<b><font color="#0000ff" >virtual</font></b> <b><font color="#0000ff" >bool</font></b> <font color="#000000" >for_each(xml_node&</font> <font color="#000000" >n)</font>
|
|
<font color="#000000" >{</font>
|
|
<b><font color="#0000ff" >for</font></b> <font color="#000000" >(</font><b><font color="#0000ff" >int</font></b> <font color="#000000" >i</font> <font color="#000000" >=</font> <b><font color="#40b440" >0</font></b><font color="#000000" >;</font> <font color="#000000" >i</font> <font color="#000000" ><</font> <font color="#000000" >depth();</font> <font color="#000000" >++i)</font> <font color="#000000" >cout</font> <font color="#000000" ><<</font> <font color="#ff0000" >" "</font><font color="#000000" >;</font> <i><font color="#808080" >// indentation</font></i>
|
|
|
|
<b><font color="#0000ff" >if</font></b> <font color="#000000" >(n.type()</font> <font color="#000000" >==</font> <font color="#000000" >node_element)</font> <font color="#000000" >cout</font> <font color="#000000" ><<</font> <font color="#000000" >n.name()</font> <font color="#000000" ><<</font> <font color="#000000" >endl;</font>
|
|
<b><font color="#0000ff" >else</font></b> <font color="#000000" >cout</font> <font color="#000000" ><<</font> <font color="#000000" >n.value()</font> <font color="#000000" ><<</font> <font color="#000000" >endl;</font>
|
|
|
|
<b><font color="#0000ff" >return</font></b> <b><font color="#0000ff" >true</font></b><font color="#000000" >;</font> <i><font color="#808080" >// continue traversal</font></i>
|
|
<font color="#000000" >}</font>
|
|
<font color="#000000" >};</font>
|
|
|
|
<b><font color="#0000ff" >int</font></b> <font color="#000000" >main()</font>
|
|
<font color="#000000" >{</font>
|
|
<font color="#000000" >xml_document</font> <font color="#000000" >doc;</font>
|
|
<font color="#000000" >doc.load(</font><font color="#ff0000" >"&lt;bookstore&gt;&lt;book title='ShaderX'&gt;&lt;price&gt;3&lt;/price&gt;&lt;/book&gt;&lt;book title='GPU Gems'&gt;&lt;price&gt;4&lt;/price&gt;&lt;/book&gt;&lt;/bookstore&gt;"</font><font color="#000000" >);</font>
|
|
|
|
<i><font color="#808080" >// If you want to iterate through nodes...</font></i>
|
|
|
|
<font color="#000000" >{</font>
|
|
<i><font color="#808080" >// Get a bookstore node</font></i>
|
|
<font color="#000000" >xml_node</font> <font color="#000000" >bookstore</font> <font color="#000000" >=</font> <font color="#000000" >doc.child(</font><font color="#ff0000" >"bookstore"</font><font color="#000000" >);</font>
|
|
|
|
<i><font color="#808080" >// Iterate through books</font></i>
|
|
<b><font color="#0000ff" >for</font></b> <font color="#000000" >(xml_node</font> <font color="#000000" >book</font> <font color="#000000" >=</font> <font color="#000000" >bookstore.child(</font><font color="#ff0000" >"book"</font><font color="#000000" >);</font> <font color="#000000" >book;</font> <font color="#000000" >book</font> <font color="#000000" >=</font> <font color="#000000" >book.next_sibling(</font><font color="#ff0000" >"book"</font><font color="#000000" >))</font>
|
|
<font color="#000000" >{</font>
|
|
<font color="#000000" >cout</font> <font color="#000000" ><<</font> <font color="#ff0000" >"Book "</font> <font color="#000000" ><<</font> <font color="#000000" >book.attribute(</font><font color="#ff0000" >"title"</font><font color="#000000" >).value()</font> <font color="#000000" ><<</font> <font color="#ff0000" >", price "</font> <font color="#000000" ><<</font> <font color="#000000" >book.child(</font><font color="#ff0000" >"price"</font><font color="#000000" >).first_child().value()</font> <font color="#000000" ><<</font> <font color="#000000" >endl;</font>
|
|
<font color="#000000" >}</font>
|
|
|
|
<i><font color="#808080" >// Output:</font></i>
|
|
<i><font color="#808080" >// Book ShaderX, price 3</font></i>
|
|
<i><font color="#808080" >// Book GPU Gems, price 4</font></i>
|
|
<font color="#000000" >}</font>
|
|
|
|
<font color="#000000" >{</font>
|
|
<i><font color="#808080" >// Alternative way to get a bookstore node (wildcards)</font></i>
|
|
<font color="#000000" >xml_node</font> <font color="#000000" >bookstore</font> <font color="#000000" >=</font> <font color="#000000" >doc.child_w(</font><font color="#ff0000" >"*[sS]tore"</font><font color="#000000" >);</font> <i><font color="#808080" >// this will select bookstore, anyStore, Store, etc.</font></i>
|
|
|
|
<i><font color="#808080" >// Iterate through books with STL compatible iterators</font></i>
|
|
<b><font color="#0000ff" >for</font></b> <font color="#000000" >(xml_node::iterator</font> <font color="#000000" >it</font> <font color="#000000" >=</font> <font color="#000000" >bookstore.begin();</font> <font color="#000000" >it</font> <font color="#000000" >!=</font> <font color="#000000" >bookstore.end();</font> <font color="#000000" >++it)</font>
|
|
<font color="#000000" >{</font>
|
|
<i><font color="#808080" >// Note the use of helper function child_value()</font></i>
|
|
<font color="#000000" >cout</font> <font color="#000000" ><<</font> <font color="#ff0000" >"Book "</font> <font color="#000000" ><<</font> <font color="#000000" >it->attribute(</font><font color="#ff0000" >"title"</font><font color="#000000" >).value()</font> <font color="#000000" ><<</font> <font color="#ff0000" >", price "</font> <font color="#000000" ><<</font> <font color="#000000" >it->child_value(</font><font color="#ff0000" >"price"</font><font color="#000000" >)</font> <font color="#000000" ><<</font> <font color="#000000" >endl;</font>
|
|
<font color="#000000" >}</font>
|
|
|
|
<i><font color="#808080" >// Output:</font></i>
|
|
<i><font color="#808080" >// Book ShaderX, price 3</font></i>
|
|
<i><font color="#808080" >// Book GPU Gems, price 4</font></i>
|
|
<font color="#000000" >}</font>
|
|
|
|
<font color="#000000" >{</font>
|
|
<i><font color="#808080" >// You can also traverse the whole tree (or a subtree)</font></i>
|
|
<font color="#000000" >bookstore_traverser</font> <font color="#000000" >t;</font>
|
|
|
|
<font color="#000000" >doc.traverse(t);</font>
|
|
|
|
<i><font color="#808080" >// Output:</font></i>
|
|
<i><font color="#808080" >// bookstore</font></i>
|
|
<i><font color="#808080" >// book</font></i>
|
|
<i><font color="#808080" >// price</font></i>
|
|
<i><font color="#808080" >// 3</font></i>
|
|
<i><font color="#808080" >// book</font></i>
|
|
<i><font color="#808080" >// price</font></i>
|
|
<i><font color="#808080" >// 4</font></i>
|
|
|
|
<font color="#000000" >doc.first_child().traverse(t);</font>
|
|
|
|
<i><font color="#808080" >// Output:</font></i>
|
|
<i><font color="#808080" >// book</font></i>
|
|
<i><font color="#808080" >// price</font></i>
|
|
<i><font color="#808080" >// 3</font></i>
|
|
<i><font color="#808080" >// book</font></i>
|
|
<i><font color="#808080" >// price</font></i>
|
|
<i><font color="#808080" >// 4</font></i>
|
|
<font color="#000000" >}</font>
|
|
|
|
<i><font color="#808080" >// If you want a distinct node...</font></i>
|
|
|
|
<font color="#000000" >{</font>
|
|
<i><font color="#808080" >// You can specify the way to it through child() functions</font></i>
|
|
<font color="#000000" >cout</font> <font color="#000000" ><<</font> <font color="#000000" >doc.child(</font><font color="#ff0000" >"bookstore"</font><font color="#000000" >).child(</font><font color="#ff0000" >"book"</font><font color="#000000" >).next_sibling().attribute(</font><font color="#ff0000" >"title"</font><font color="#000000" >).value()</font> <font color="#000000" ><<</font> <font color="#000000" >endl;</font>
|
|
|
|
<i><font color="#808080" >// Output:</font></i>
|
|
<i><font color="#808080" >// GPU Gems</font></i>
|
|
|
|
<i><font color="#808080" >// You can use a sometimes convenient path function</font></i>
|
|
<font color="#000000" >cout</font> <font color="#000000" ><<</font> <font color="#000000" >doc.first_element_by_path(</font><font color="#ff0000" >"bookstore/book/price"</font><font color="#000000" >).child_value()</font> <font color="#000000" ><<</font> <font color="#000000" >endl;</font>
|
|
|
|
<i><font color="#808080" >// Output:</font></i>
|
|
<i><font color="#808080" >// 3</font></i>
|
|
|
|
<i><font color="#808080" >// And you can use powerful XPath expressions</font></i>
|
|
<font color="#000000" >cout</font> <font color="#000000" ><<</font> <font color="#000000" >doc.select_single_node(</font><font color="#ff0000" >"/bookstore/book[@title = 'ShaderX']/price"</font><font color="#000000" >).node().child_value()</font> <font color="#000000" ><<</font> <font color="#000000" >endl;</font>
|
|
|
|
<i><font color="#808080" >// Output:</font></i>
|
|
<i><font color="#808080" >// 3</font></i>
|
|
|
|
<i><font color="#808080" >// Of course, XPath is much more powerful</font></i>
|
|
|
|
<i><font color="#808080" >// Compile a query that computes the total price of all Gems books in the store</font></i>
|
|
<font color="#000000" >xpath_query</font> <font color="#000000" >query(</font><font color="#ff0000" >"sum(/bookstore/book[contains(@title, 'Gems')]/price)"</font><font color="#000000" >);</font>
|
|
|
|
<font color="#000000" >cout</font> <font color="#000000" ><<</font> <font color="#000000" >query.evaluate_number(doc)</font> <font color="#000000" ><<</font> <font color="#000000" >endl;</font>
|
|
|
|
<i><font color="#808080" >// Output:</font></i>
|
|
<i><font color="#808080" >// 4</font></i>
|
|
|
|
<i><font color="#808080" >// You can apply the same XPath query to any document. For example, let's add another Gems</font></i>
|
|
<i><font color="#808080" >// book (more detail about modifying tree in next sample):</font></i>
|
|
<font color="#000000" >xml_node</font> <font color="#000000" >book</font> <font color="#000000" >=</font> <font color="#000000" >doc.child(</font><font color="#ff0000" >"bookstore"</font><font color="#000000" >).append_child();</font>
|
|
<font color="#000000" >book.set_name(</font><font color="#ff0000" >"book"</font><font color="#000000" >);</font>
|
|
<font color="#000000" >book.append_attribute(</font><font color="#ff0000" >"title"</font><font color="#000000" >)</font> <font color="#000000" >=</font> <font color="#ff0000" >"Game Programming Gems 2"</font><font color="#000000" >;</font>
|
|
|
|
<font color="#000000" >xml_node</font> <font color="#000000" >price</font> <font color="#000000" >=</font> <font color="#000000" >book.append_child();</font>
|
|
<font color="#000000" >price.set_name(</font><font color="#ff0000" >"price"</font><font color="#000000" >);</font>
|
|
|
|
<font color="#000000" >xml_node</font> <font color="#000000" >price_text</font> <font color="#000000" >=</font> <font color="#000000" >price.append_child(node_pcdata);</font>
|
|
<font color="#000000" >price_text.set_value(</font><font color="#ff0000" >"5.3"</font><font color="#000000" >);</font>
|
|
|
|
<i><font color="#808080" >// Now let's reevaluate query</font></i>
|
|
<font color="#000000" >cout</font> <font color="#000000" ><<</font> <font color="#000000" >query.evaluate_number(doc)</font> <font color="#000000" ><<</font> <font color="#000000" >endl;</font>
|
|
|
|
<i><font color="#808080" >// Output:</font></i>
|
|
<i><font color="#808080" >// 9.3</font></i>
|
|
<font color="#000000" >}</font>
|
|
<font color="#000000" >}</font>
|
|
</font></pre></td></tr><tr><td align="right"><b><i><a href="http://dobrokot.nm.ru/WinnieColorizer.html"><font color="#666666">_Winnie C++ Colorizer</font></a></i></b></td></tr></table>
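<p>The <b>child_w</b> call in the sample above selects a node by wildcard pattern. To make the idea concrete, here is a
minimal glob-style matcher. It is a sketch, not the library's matching code: the function name <b>wildcard_match</b> is
hypothetical, and the supported syntax ('*', '?' and a plain character set in square brackets) is an assumption based on
the pattern used in the sample.</p>

```cpp
#include <cassert>

// Hypothetical glob-style matcher (not pugixml's implementation).
// Assumed syntax: '*' matches any run of characters (including none),
// '?' matches exactly one character, "[abc]" matches one character
// from the listed set. Ranges and negation are not handled.
bool wildcard_match(const char* pattern, const char* text)
{
    assert(pattern && text);

    while (*pattern)
    {
        if (*pattern == '*')
        {
            // Try every possible length for the run matched by '*'.
            for (const char* t = text; ; ++t)
            {
                if (wildcard_match(pattern + 1, t)) return true;
                if (!*t) return false;
            }
        }
        else if (*pattern == '[')
        {
            if (!*text) return false;

            bool found = false;
            const char* p = pattern + 1;
            for (; *p && *p != ']'; ++p)
                if (*p == *text) found = true;

            if (!*p || !found) return false; // unterminated set or no match
            pattern = p + 1;
            ++text;
        }
        else
        {
            if (!*text) return false;
            if (*pattern != '?' && *pattern != *text) return false;
            ++pattern;
            ++text;
        }
    }

    return *text == 0; // the whole text must be consumed
}
```

<p>With this sketch, the pattern "*[sS]tore" accepts "bookstore", "anyStore" and "Store" and rejects "book". The
recursive handling of '*' is simple but can be slow for patterns with many stars.</p>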
|
|
|
|
<p>Finally, let's get into more details about tree modification and saving.</p>
|
|
|
|
<table width = "100%" bgcolor="#e6e6e6"><tr><td><pre><font color="white">
|
|
<font color="#008000" >#include</font> <font color="#ff0000" >&lt;iostream&gt;</font>
|
|
|
|
<font color="#008000" >#include</font> <font color="#ff0000" >"pugixml.hpp"</font>
|
|
|
|
<b><font color="#0000ff" >using</font></b> <b><font color="#0000ff" >namespace</font></b> <font color="#000000" >std;</font>
|
|
<b><font color="#0000ff" >using</font></b> <b><font color="#0000ff" >namespace</font></b> <font color="#000000" >pugi;</font>
|
|
|
|
<b><font color="#0000ff" >int</font></b> <font color="#000000" >main()</font>
|
|
<font color="#000000" >{</font>
|
|
<i><font color="#808080" >// For this example, we'll start with an empty document and create nodes in it from code</font></i>
|
|
<font color="#000000" >xml_document</font> <font color="#000000" >doc;</font>
|
|
|
|
<i><font color="#808080" >// Append several children and set values/names at once</font></i>
|
|
<font color="#000000" >doc.append_child(node_comment).set_value(</font><font color="#ff0000" >"This is a test comment"</font><font color="#000000" >);</font>
|
|
<font color="#000000" >doc.append_child().set_name(</font><font color="#ff0000" >"application"</font><font color="#000000" >);</font>
|
|
|
|
<i><font color="#808080" >// Let's add a few modules</font></i>
|
|
<font color="#000000" >xml_node</font> <font color="#000000" >application</font> <font color="#000000" >=</font> <font color="#000000" >doc.child(</font><font color="#ff0000" >"application"</font><font color="#000000" >);</font>
|
|
|
|
<i><font color="#808080" >// Save node wrapper for convenience</font></i>
|
|
<font color="#000000" >xml_node</font> <font color="#000000" >module_a</font> <font color="#000000" >=</font> <font color="#000000" >application.append_child();</font>
|
|
<font color="#000000" >module_a.set_name(</font><font color="#ff0000" >"module"</font><font color="#000000" >);</font>
|
|
|
|
<i><font color="#808080" >// Add an attribute, immediately setting its value</font></i>
|
|
<font color="#000000" >module_a.append_attribute(</font><font color="#ff0000" >"name"</font><font color="#000000" >).set_value(</font><font color="#ff0000" >"A"</font><font color="#000000" >);</font>
|
|
|
|
<i><font color="#808080" >// You can use operator=</font></i>
|
|
<font color="#000000" >module_a.append_attribute(</font><font color="#ff0000" >"folder"</font><font color="#000000" >)</font> <font color="#000000" >=</font> <font color="#ff0000" >"/work/app/module_a"</font><font color="#000000" >;</font>
|
|
|
|
<i><font color="#808080" >// Or even assign numbers</font></i>
|
|
<font color="#000000" >module_a.append_attribute(</font><font color="#ff0000" >"status"</font><font color="#000000" >)</font> <font color="#000000" >=</font> <b><font color="#40b440" >85.4</font></b><font color="#000000" >;</font>
|
|
|
|
<i><font color="#808080" >// Let's add another module</font></i>
|
|
<font color="#000000" >xml_node</font> <font color="#000000" >module_c</font> <font color="#000000" >=</font> <font color="#000000" >application.append_child();</font>
|
|
<font color="#000000" >module_c.set_name(</font><font color="#ff0000" >"module"</font><font color="#000000" >);</font>
|
|
<font color="#000000" >module_c.append_attribute(</font><font color="#ff0000" >"name"</font><font color="#000000" >)</font> <font color="#000000" >=</font> <font color="#ff0000" >"C"</font><font color="#000000" >;</font>
|
|
<font color="#000000" >module_c.append_attribute(</font><font color="#ff0000" >"folder"</font><font color="#000000" >)</font> <font color="#000000" >=</font> <font color="#ff0000" >"/work/app/module_c"</font><font color="#000000" >;</font>
|
|
|
|
<i><font color="#808080" >// Oh, we missed module B. Not a problem, let's insert it before module C</font></i>
|
|
<font color="#000000" >xml_node</font> <font color="#000000" >module_b</font> <font color="#000000" >=</font> <font color="#000000" >application.insert_child_before(node_element,</font> <font color="#000000" >module_c);</font>
|
|
<font color="#000000" >module_b.set_name(</font><font color="#ff0000" >"module"</font><font color="#000000" >);</font>
|
|
<font color="#000000" >module_b.append_attribute(</font><font color="#ff0000" >"folder"</font><font color="#000000" >)</font> <font color="#000000" >=</font> <font color="#ff0000" >"/work/app/module_b"</font><font color="#000000" >;</font>
|
|
|
|
<i><font color="#808080" >// We can do the same thing for attributes</font></i>
|
|
<font color="#000000" >module_b.insert_attribute_before(</font><font color="#ff0000" >"name"</font><font color="#000000" >,</font> <font color="#000000" >module_b.attribute(</font><font color="#ff0000" >"folder"</font><font color="#000000" >))</font> <font color="#000000" >=</font> <font color="#ff0000" >"B"</font><font color="#000000" >;</font>
|
|
|
|
<i><font color="#808080" >// Let's add some text in module A</font></i>
|
|
<font color="#000000" >module_a.append_child(node_pcdata).set_value(</font><font color="#ff0000" >"Module A description"</font><font color="#000000" >);</font>
|
|
|
|
<i><font color="#808080" >// Well, there's not much left to do here. Let's output our document to file using several formatting options</font></i>
|
|
|
|
<font color="#000000" >doc.save_file(</font><font color="#ff0000" >"sample_saved_1.xml"</font><font color="#000000" >);</font>
|
|
|
|
<i><font color="#808080" >// Contents of file sample_saved_1.xml (tab size = 4):</font></i>
|
|
<i><font color="#808080" >// &lt;?xml version="1.0"?&gt;</font></i>
<i><font color="#808080" >// &lt;!--This is a test comment--&gt;</font></i>
<i><font color="#808080" >// &lt;application&gt;</font></i>
<i><font color="#808080" >//     &lt;module name="A" folder="/work/app/module_a" status="85.4"&gt;Module A description&lt;/module&gt;</font></i>
<i><font color="#808080" >//     &lt;module name="B" folder="/work/app/module_b" /&gt;</font></i>
<i><font color="#808080" >//     &lt;module name="C" folder="/work/app/module_c" /&gt;</font></i>
<i><font color="#808080" >// &lt;/application&gt;</font></i>
|
|
|
|
<i><font color="#808080" >// Let's use two spaces for indentation instead of tab character</font></i>
|
|
<font color="#000000" >doc.save_file(</font><font color="#ff0000" >"sample_saved_2.xml"</font><font color="#000000" >,</font> <font color="#ff0000" >"  "</font><font color="#000000" >);</font>
|
|
|
|
<i><font color="#808080" >// Contents of file sample_saved_2.xml:</font></i>
|
|
<i><font color="#808080" >// &lt;?xml version="1.0"?&gt;</font></i>
<i><font color="#808080" >// &lt;!--This is a test comment--&gt;</font></i>
<i><font color="#808080" >// &lt;application&gt;</font></i>
<i><font color="#808080" >//   &lt;module name="A" folder="/work/app/module_a" status="85.4"&gt;Module A description&lt;/module&gt;</font></i>
<i><font color="#808080" >//   &lt;module name="B" folder="/work/app/module_b" /&gt;</font></i>
<i><font color="#808080" >//   &lt;module name="C" folder="/work/app/module_c" /&gt;</font></i>
<i><font color="#808080" >// &lt;/application&gt;</font></i>
|
|
|
|
<i><font color="#808080" >// Let's save a raw XML file</font></i>
|
|
<font color="#000000" >doc.save_file(</font><font color="#ff0000" >"sample_saved_3.xml"</font><font color="#000000" >,</font> <font color="#ff0000" >""</font><font color="#000000" >,</font> <font color="#000000" >format_raw);</font>
|
|
|
|
<i><font color="#808080" >// Contents of file sample_saved_3.xml:</font></i>
|
|
<i><font color="#808080" >// &lt;?xml version="1.0"?&gt;&lt;!--This is a test comment--&gt;&lt;application&gt;&lt;module name="A" folder="/work/app/module_a" status="85.4"&gt;Module A description&lt;/module&gt;&lt;module name="B" folder="/work/app/module_b" /&gt;&lt;module name="C" folder="/work/app/module_c" /&gt;&lt;/application&gt;</font></i>
|
|
|
|
<i><font color="#808080" >// Finally, you can print a subtree to any output stream (including cout)</font></i>
|
|
<font color="#000000" >xml_writer_stream writer(cout);</font>
|
|
<font color="#000000" >doc.child(</font><font color="#ff0000" >"application"</font><font color="#000000" >).child(</font><font color="#ff0000" >"module"</font><font color="#000000" >).print(writer);</font>
|
|
|
|
<i><font color="#808080" >// Output:</font></i>
|
|
<i><font color="#808080" >// &lt;module name="A" folder="/work/app/module_a" status="85.4"&gt;Module A description&lt;/module&gt;</font></i>
|
|
<font color="#000000" >}</font>
|
|
</font></pre></td></tr><tr><td align="right"><b><i><a href="http://dobrokot.nm.ru/WinnieColorizer.html"><font color="#666666">_Winnie C++ Colorizer</font></a></i></b></td></tr></table>
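<p>The effect of the indentation string and of raw output in the sample above can be reproduced on a toy tree. The
following sketch does not use <i>pugixml</i> types: <b>mini_node</b> and <b>print_node</b> are hypothetical names, and
attributes and text nodes are left out. It only shows how an indent string repeated once per depth level, or dropped
altogether in raw mode, shapes the serialized result.</p>

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical miniature tree (not pugixml types), used only to show how an
// indent string and a "raw" switch shape the serialized output.
struct mini_node
{
    std::string name;
    std::vector<mini_node> children;
};

// Appends the serialized form of `n` to `out`. In raw mode no indentation
// and no line breaks are written.
void print_node(const mini_node& n, const std::string& indent, bool raw,
                std::size_t depth, std::string& out)
{
    if (!raw)
        for (std::size_t i = 0; i < depth; ++i) out += indent;

    if (n.children.empty())
    {
        out += "<" + n.name + " />";
    }
    else
    {
        out += "<" + n.name + ">";
        if (!raw) out += "\n";

        for (std::size_t i = 0; i < n.children.size(); ++i)
            print_node(n.children[i], indent, raw, depth + 1, out);

        if (!raw)
            for (std::size_t i = 0; i < depth; ++i) out += indent;
        out += "</" + n.name + ">";
    }

    if (!raw) out += "\n";
}
```

<p>Printing an "application" node with two empty "module" children using a two-space indent puts each module on its own
line, indented by two spaces; in raw mode the same tree comes out as a single line with no whitespace between tags.</p>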
|
|
|
|
<p>Note that these examples do not cover the whole <i>pugixml</i> API. For further information, see the Reference section.</p>
|
|
|
|
<hr>
|
|
|
|
<a name="Reference">
|
|
<h2>Reference</h2>
|
|
|
|
<p><i>pugixml</i> is a library for parsing XML files: you give it XML data in some way,
and it gives you a DOM tree together with the means to traverse it and to extract information from it.
The library source consists of two headers, <b>pugixml.hpp</b> and <b>pugiconfig.hpp</b>, and two source
files, <b>pugixml.cpp</b> and <b>pugixpath.cpp</b>.
You can either compile the cpp files as part of your project or build a static library.
All library classes reside in namespace <b>pugi</b>, so you can either use fully qualified
names (<b>pugi::xml_node</b>) or write a using directive or declaration (<b>using namespace pugi;</b>, <b>using
pugi::xml_node</b>) and use plain names. All classes have either the <b>xml_</b> or the <b>xpath_</b> prefix.</p>
|
|
|
|
<p>By default, it is assumed that you compile the source files together with your project (add them to your
project, add the relevant entries to your Makefile, or do whatever your build
environment requires). The library is written in standard-conformant C++ and was tested on the following platforms:</p>
|
|
|
|
<p>
|
|
<ul>
|
|
<li>Windows 32-bit (MSVC<sup><a href="#annot-3">3</a></sup> 6.0, MSVC 7.0 (2002), MSVC 7.1 (2003), MSVC 8.0 (2005), MSVC 9.0 (2008), MSVC 10.0 (2010), ICC<sup><a href="#annot-4">4</a></sup> 8.0, ICC 8.1, GCC 3.4.2 (MinGW), GCC 4.4.0 (MinGW), BCC<sup><a href="#annot-5">5</a></sup> 5.82, DMC<sup><a href="#annot-6">6</a></sup> 8.50, Comeau C++ 4.3.3, PGI<sup><a href="#annot-7">7</a></sup> 6.2, CW<sup><a href="#annot-8">8</a></sup> 8.0)
|
|
<li>Windows 64-bit (MSVC 9.0 (2008))
|
|
<li>Linux 32-bit (GCC 3.2)
|
|
<li>Sony Playstation Portable (GCC 3.4.2; in PUGIXML_NO_STL mode)
|
|
<li>Sony Playstation 3 (GCC 4.0.2; in PUGIXML_NO_EXCEPTIONS mode (-fno-exceptions))
|
|
<li>Microsoft Xbox (MSVC 7.1)
|
|
<li>Microsoft Xbox 360 (MSVC 8.0)
|
|
</ul>
|
|
</p>
|
|
|
|
<p>The documentation for <i>pugixml</i> classes, functions and constants <a href="html/index.html">is available here</a>.</p>
|
|
|
|
<hr>
|
|
|
|
<a name="annot-3"><sup>3</sup><small> MSVC is Microsoft Visual C++ Compiler</small> <br>
|
|
<a name="annot-4"><sup>4</sup><small> ICC is Intel C++ Compiler</small> <br>
|
|
<a name="annot-5"><sup>5</sup><small> BCC is Borland C++ Compiler</small> <br>
|
|
<a name="annot-6"><sup>6</sup><small> DMC is Digital Mars C++ Compiler</small> <br>
|
|
<a name="annot-7"><sup>7</sup><small> PGI is Portland Group C++ Compiler</small> <br>
|
|
<a name="annot-8"><sup>8</sup><small> CW is Metrowerks CodeWarrior</small>
|
|
|
|
<hr>
|
|
|
|
<a name="Compliance">
|
|
<h2>W3C compliance</h2>
|
|
|
|
<p><i>pugixml</i> is not a compliant XML parser. The main reason is that it does not reject
most malformed XML files. A more or less complete list of incompatibilities follows (these
are the ones that remain when using <b>parse_w3c</b> mode):
|
|
|
|
<ul>
|
|
<li>The parser is completely DOCTYPE-ignorant, that is, it does not even skip all possible DOCTYPEs
|
|
correctly, let alone use them for parsing
|
|
<li>It accepts multiple attributes with the same name in one node
|
|
<li>It is charset-ignorant
|
|
<li>It accepts invalid attribute values (those with < in them) and does not reject invalid entity
|
|
references or character references (in fact, it does not do DOCTYPE parsing, so it does not perform
|
|
entity reference expansion)
|
|
<li>It does not reject comments with -- inside
|
|
<li>It does not reject processing instructions (PI) named 'xml' and the like
|
|
<li>And some other things that I forgot to mention
|
|
</ul>
|
|
|
|
In short, it accepts some malformed XML files and does nothing that is related to DOCTYPE.
This is because the main goal was to develop a fast, easy-to-use and error-tolerant parser (so you can get
something even from a malformed document); good validating and conformant
parsers already exist.</p>
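<p>Because the parser accepts these constructs silently, code that needs stricter input has to check for them itself.
The following is a minimal sketch, not part of <i>pugixml</i>, of two checks from the list above: comment bodies
that contain <b>--</b> and duplicate attribute names. The function names <b>is_valid_comment_body</b> and
<b>has_unique_attribute_names</b> are hypothetical, and the caller is assumed to have collected the comment text
and the attribute names already.</p>

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Returns true if 'body' (the text between "<!--" and "-->") is a legal
// XML 1.0 comment body: it must not contain "--" and must not end with '-'.
bool is_valid_comment_body(const std::string& body)
{
    if (body.find("--") != std::string::npos) return false;
    return body.empty() || body[body.size() - 1] != '-';
}

// Returns true if no attribute name occurs more than once in 'names'.
bool has_unique_attribute_names(const std::vector<std::string>& names)
{
    std::set<std::string> seen;

    for (std::size_t i = 0; i < names.size(); ++i)
    {
        // insert() reports false in .second when the name was already present
        if (!seen.insert(names[i]).second) return false;
    }

    return true;
}
```

<p>Checks like these could be run on the parsed tree (or on the raw input) when a document comes from an untrusted source.</p>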
|
|
|
|
<hr>
|
|
|
|
<a name="ComparisonTable">
|
|
<h2>Comparison with existing parsers</h2>
|
|
|
|
<p>This table compares <i>pugixml</i> with other parsers in terms of time and memory consumption.
For DOM parsers (all except Expat, irrXML and the SAX parser of XercesC), the process is
as follows:</p>
|
|
|
|
<ul>
|
|
<li>construct a DOM tree from a file that is preloaded in memory (all parsers take const char* and size
as input). 'parse time' is the number of CPU clocks spent, 'parse allocs' is the number of allocations, and
'parse memory' is the peak memory consumption
<li>traverse the DOM tree to fill some structure with information from it (the structure is the same for all parsers,
of course). 'walk time' is the number of CPU clocks spent, and 'walk allocs' is the number of allocations
|
|
</ul>
|
|
|
|
<p>For SAX parsers, the parse step is skipped (hence the N/A in the relevant table cells); the structure is
filled during the 'walk' step.</p>

<p>For all parsers, the 'total time' column is the total time spent on the whole process, 'total allocs' is
the total allocation count, and 'total memory' is the peak memory consumption for the whole process.</p>
|
|
|
|
<p>The tests were performed on a 1 MB XML file with a small amount of text. The test programs were compiled with
the Microsoft Visual C++ 8.0 (2005) compiler in Release mode, with checked iterators/secure STL turned
off. The test system is an AMD Sempron 2500+ with 512 MB RAM.</p>
|
|
|
|
<table cellspacing=0 cellpadding=2 border=1>
|
|
|
|
<tr><th>parser</th>
|
|
<th>parse time</th><th>parse allocs</th><th>parse memory</th>
|
|
<th>walk time</th><th>walk allocs</th>
|
|
<th>total time</th><th>total allocs</th><th>total memory</th></tr>
|
|
|
|
<tr><td><a href="http://xml.irrlicht3d.org/">irrXML</a></td>
|
|
<td>N/A</td><td>N/A</td><td>N/A</td>
|
|
<td>352 Mclocks</td><td>697 245</td>
|
|
<td>356 Mclocks</td><td>697 284</td><td>906 kb</td></tr>
|
|
|
|
<tr><td><a href="http://expat.sourceforge.net/">Expat</a></td>
|
|
<td>N/A</td><td>N/A</td><td>N/A</td>
|
|
<td>97 Mclocks</td><td>19</td>
|
|
<td>97 Mclocks</td><td>23</td><td>1028 kb</td></tr>
|
|
|
|
<tr><td><a href="http://tinyxml.sourceforge.net/">TinyXML</a></td>
|
|
<td>168 Mclocks</td><td>50 163</td><td>5447 kb</td>
|
|
<td>37 Mclocks</td><td>0</td>
|
|
<td>242 Mclocks</td><td>50 163</td><td>5447 kb</td></tr>
|
|
|
|
<tr><td><a href="http://www.codeproject.com/soap/pugxml.asp">PugXML</a></td>
|
|
<td>100 Mclocks</td><td>106 597</td><td>2747 kb</td>
|
|
<td>38 Mclocks</td><td>0</td>
|
|
<td>206 Mclocks</td><td>131 677</td><td>2855 kb</td></tr>
|
|
|
|
<tr><td><a href="http://xml.apache.org/xerces-c/">XercesC</a> SAX</td>
|
|
<td>N/A</td><td>N/A</td><td>N/A</td>
|
|
<td>411 Mclocks</td><td>70 380</td>
|
|
<td>411 Mclocks</td><td>70 495</td><td>243 kb</td></tr>
|
|
|
|
<tr><td><a href="http://xml.apache.org/xerces-c/">XercesC</a> DOM</td>
|
|
<td>300 Mclocks</td><td>30 491</td><td>9251 kb</td>
|
|
<td>65 Mclocks</td><td>1</td>
|
|
<td>367 Mclocks</td><td>30 492</td><td>9251 kb</td></tr>
|
|
|
|
<tr><td>pugixml</td>
|
|
<td>17 Mclocks</td><td>40</td><td>2154 kb</td>
|
|
<td>14 Mclocks</td><td>0</td>
|
|
<td>32 Mclocks</td><td>40</td><td>2154 kb</td></tr>
|
|
|
|
<tr><td>pugixml (test of non-destructive parsing)</td>
|
|
<td>12 Mclocks</td><td>51</td><td>1632 kb</td>
|
|
<td>21 Mclocks</td><td>0</td>
|
|
<td>34 Mclocks</td><td>51</td><td>1632 kb</td></tr>
|
|
|
|
</table>
|
|
|
|
<p>Note that the non-destructive parsing mode was only an experiment and is not yet part of <i>pugixml</i>.</p>
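<p>The table is easier to read as ratios relative to <i>pugixml</i>. The sketch below recomputes them from the
'parse time' and 'parse memory' columns for the DOM parsers; the figures are copied from the table, while the
type and function names (<b>parser_row</b>, <b>ratio</b>, <b>parse_time_ratio</b>, <b>parse_memory_ratio</b>) are
illustrative only.</p>

```cpp
#include <cassert>
#include <cstddef>

// One row of the comparison table (DOM parsers only).
struct parser_row
{
    const char* name;
    double parse_mclocks;   // 'parse time' column, in Mclocks
    double parse_memory_kb; // 'parse memory' column, in kb
};

// Figures copied from the table; pugixml is deliberately the last row.
static const parser_row rows[] =
{
    { "TinyXML",     168.0, 5447.0 },
    { "PugXML",      100.0, 2747.0 },
    { "XercesC DOM", 300.0, 9251.0 },
    { "pugixml",      17.0, 2154.0 }
};

static const std::size_t row_count = sizeof(rows) / sizeof(rows[0]);

// How many times larger 'other' is than 'baseline'.
double ratio(double other, double baseline)
{
    assert(baseline > 0);
    return other / baseline;
}

// Parse time of rows[index] relative to pugixml.
double parse_time_ratio(std::size_t index)
{
    assert(index < row_count);
    return ratio(rows[index].parse_mclocks, rows[row_count - 1].parse_mclocks);
}

// Peak parse memory of rows[index] relative to pugixml.
double parse_memory_ratio(std::size_t index)
{
    assert(index < row_count);
    return ratio(rows[index].parse_memory_kb, rows[row_count - 1].parse_memory_kb);
}
```

<p>With these figures, PugXML takes about 5.9 times as long to parse as <i>pugixml</i> and uses about 1.3 times as
much memory; for TinyXML the ratios are about 9.9 and 2.5.</p>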
|
|
|
|
<hr>
|
|
|
|
<a name="FAQ">
|
|
<h2>FAQ</h2>
|
|
|
|
<p><b>Q:</b> I do not have/want STL support. How can I compile <i>pugixml</i> without STL?</p>
|
|
<p><b>A:</b> There is an undocumented define, PUGIXML_NO_STL. If you uncomment the relevant line
in the <i>pugixml</i> header file, the library will compile without any STL classes. The define is undocumented
because it makes some documented functions unavailable (specifically, xml_document::load, which
operates on std::istream, the xml_node::path function, XPath-related functions and classes, and the as_utf16/as_utf8
conversion functions). Otherwise, it will work fine.</p>
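<p>As a sketch, assuming your copy of <b>pugiconfig.hpp</b> carries the define as a commented-out line (check the
header you actually have), enabling the mode amounts to removing the comment marker:</p>

```cpp
// pugiconfig.hpp (assumed location of the commented-out line)

// Before:
// #define PUGIXML_NO_STL

// After: the STL-dependent parts of the interface are compiled out
#define PUGIXML_NO_STL
```

<p>The define has to be visible both when the library sources are compiled and wherever <b>pugixml.hpp</b> is
included; otherwise declarations and definitions will not match.</p>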
|
|
|
|
<p><b>Q:</b> Do paths that are accepted by <b>first_element_by_path</b> have to end with a delimiter?</p>
<p><b>A:</b> Either way works; both /path/to/node/ and /path/to/node are fine.</p>
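<p>One way to see why the trailing delimiter does not matter: if a path is split on the delimiter and empty
components are dropped, both spellings yield the same list of names. The sketch below illustrates that idea only;
<b>split_path</b> is a hypothetical helper, not the code that <b>first_element_by_path</b> actually uses.</p>

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Splits 'path' on 'delimiter' and drops empty components, so leading,
// trailing and repeated delimiters do not produce entries of their own.
std::vector<std::string> split_path(const std::string& path, char delimiter = '/')
{
    std::vector<std::string> parts;
    std::string current;

    for (std::size_t i = 0; i < path.size(); ++i)
    {
        if (path[i] == delimiter)
        {
            if (!current.empty()) parts.push_back(current);
            current.clear();
        }
        else
        {
            current += path[i];
        }
    }

    // The last component has no delimiter after it in "/path/to/node"
    if (!current.empty()) parts.push_back(current);

    return parts;
}
```

<p>Here <b>split_path("/path/to/node/")</b> and <b>split_path("/path/to/node")</b> both produce the three components
path, to and node.</p>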
|
|
|
|
<p>I'm always open to questions; feel free to send them to <a href="mailto:arseny.kapoulkine@gmail.com">arseny.kapoulkine@gmail.com</a>.</p>
|
|
|
|
<hr>
|
|
|
|
<a name="Bugs">
|
|
<h2>Bugs</h2>
|
|
|
|
<p>I'm always open to bug reports; feel free to send them to <a href="mailto:arseny.kapoulkine@gmail.com">arseny.kapoulkine@gmail.com</a>.
Please provide as much information as possible: the version of <i>pugixml</i>, the compiler and OS environment
(the compiler and its version, STL version, OS version, etc.), a description of the situation in which
the bug arises, the code and data files that reproduce the bug, and so on; the more, the better. Please
do not send executable files, though.</p>
|
|
|
|
<p>Note that you can also submit bug reports and suggestions on the
<a href="http://code.google.com/p/pugixml/issues/list">project page</a>.</p>
|
|
|
|
<hr>
|
|
|
|
<a name="Future_work">
|
|
<h2>Future work</h2>
|
|
|
|
<p>Here are some improvements that are planned for future versions (they are sorted by priority; the
ones at the top will be implemented sooner).</p>
|
|
|
|
<ul>
|
|
<li>Support for UTF-16 files (parsing BOM to get file's type and converting UTF-16 file to UTF-8 buffer
|
|
if necessary)
|
|
<li>More intelligent parsing of DOCTYPE (it does not always skip DOCTYPE for now)
|
|
<li>XML 1.1 changes (changed EOL handling, normalization issues, etc.)
|
|
<li>Name your own?
|
|
</ul>
|
|
|
|
<hr>
|
|
|
|
<a name="Changelog">
|
|
<h2>Changelog</h2>
|
|
|
|
<dl>
|
|
<dt>15.07.2006 - v0.1
|
|
<dd>First private release for testing purposes
|
|
</dt>
|
|
<dt>6.11.2006 - v0.2
|
|
<dd>First public release. Changes: <ul>
|
|
<li>Introduced child_value(name) and child_value_w(name)
|
|
<li>Fixed child_value() (for empty nodes)
|
|
<li>Fixed xml_parser_impl warning at W4
|
|
<li>parse_eol_pcdata and parse_eol_attribute flags + parse_minimal optimizations
|
|
<li>Optimizations of strconv_t
|
|
</ul>
|
|
</dt>
|
|
<dt>21.02.2007 - v0.3
|
|
<dd>Refactored, reworked and improved version. Changes: <ul>
|
|
<li>Interface: <ul>
|
|
<li>Added XPath
|
|
<li>Added tree modification functions
|
|
<li>Added no STL compilation mode
|
|
<li>Added saving document to file
|
|
<li>Refactored parsing flags
|
|
<li>Removed xml_parser class in favor of xml_document
|
|
<li>Added transfer ownership parsing mode
|
|
<li>Modified the way xml_tree_walker works
|
|
<li>Iterators are now non-constant
|
|
</ul>
|
|
<li>Implementation: <ul>
|
|
<li>Support of several compilers and platforms
|
|
<li>Refactored and sped up parsing core
|
|
<li>Improved standard compliance
|
|
<li>Added XPath implementation
|
|
<li>Fixed several bugs
|
|
</ul>
|
|
</ul>
|
|
</dd>
|
|
</dt>
|
|
<dt>31.10.2007 - v0.34
|
|
<dd>Maintenance release. Changes: <ul>
|
|
<li>Improved compatibility (supported Digital Mars C++, MSVC 6, CodeWarrior 8, PGI C++, Comeau, supported PS3 and XBox360)
|
|
<li>Fixed bug with loading from text-mode iostreams
|
|
<li>Fixed leak when transfer_ownership is true and parsing is failing
|
|
<li>Fixed bug in saving (\r and \n are now escaped in attribute values)
|
|
<li>PUGIXML_NO_EXCEPTION flag for platforms without exception handling
|
|
<li>Renamed free() to destroy() - some macro conflicts were reported
|
|
</ul>
|
|
</dd>
|
|
</dt>
|
|
<dt>18.01.2009 - v0.4
|
|
<dd>Changes: <ul>
|
|
<li>Bugs: <ul>
|
|
<li>Documentation fix in samples for parse() with manual lifetime control
|
|
<li>Fixed document order sorting in XPath (it caused wrong order of nodes after xpath_node_set::sort and wrong results of some XPath queries)
|
|
</ul>
|
|
<li>Node printing changes: <ul>
|
|
<li>Single quotes are no longer escaped when printing nodes
|
|
<li>Symbols in the second half of the ASCII table are no longer escaped when printing nodes; because of this, the format_utf8 flag was removed as it is no longer needed, and
format_write_bom was renamed to format_write_bom_utf8.
|
|
<li>Reworked node printing - now it works via xml_writer interface; implementations for FILE* and std::ostream are available. As a side-effect, xml_document::save_file
|
|
now works without STL.
|
|
</ul>
|
|
<li>New features: <ul>
|
|
<li>Added unsigned integer support for attributes (xml_attribute::as_uint, xml_attribute::operator=)
|
|
<li>Now document declaration (<?xml ...?>) is parsed as node with type node_declaration when parse_declaration flag is specified (access to encoding/version is performed as if they
|
|
were attributes, i.e. doc.child("xml").attribute("version").as_float()); corresponding flags for node printing were also added
|
|
<li>Added support for custom memory management (see set_memory_management_functions for details)
|
|
<li>Implemented node/attribute copying (see xml_node::insert_copy_* and xml_node::append_copy for details)
|
|
<li>Added find_child_by_attribute and find_child_by_attribute_w to simplify parsing code in some cases (e.g. COLLADA files)
|
|
<li>Added file offset information querying for debugging purposes (now you're able to determine exact location of any xml_node in parsed file, see xml_node::offset_debug for details)
|
|
<li>Improved error handling for parsing - now load(), load_file() and parse() return xml_parse_result, which contains the error code and the last parsed offset; this does not break the old interface, as xml_parse_result can be implicitly converted to bool.
|
|
</ul>
|
|
</ul>
|
|
</dd>
|
|
</dt>
|
|
<dt>8.02.2009 - v0.41
|
|
<dd>Maintenance release. Changes: <ul>
|
|
<li>Fixed bug with node printing (occasionally some content was not written to output stream)
|
|
</ul>
|
|
</dd>
|
|
</dt>
|
|
<dt>17.09.2009 - v0.42
|
|
<dd>Maintenance release. Changes: <ul>
|
|
<li>Fixed deallocation in case of custom allocation functions or if delete[] / free are incompatible
|
|
<li>XPath parser fixed for incorrect queries (i.e. incorrect XPath queries should now always fail to compile)
|
|
<li>Added PUGIXML_API/PUGIXML_CLASS/PUGIXML_FUNCTION configuration macros to control class/function attributes
|
|
<li>Const-correctness fixes for find_child_by_attribute
|
|
<li>Improved compatibility (miscellaneous warning fixes, fixed cstring include dependency for GCC)
|
|
<li>Fixed iterator begin/end and print function to work correctly for empty nodes
|
|
<li>Added xml_attribute::set_value overloads for different types
|
|
</ul>
|
|
</dd>
|
|
</dt>
|
|
<dt>8.11.2009 - v0.5
|
|
<dd>Major bugfix release. Changes: <ul>
|
|
<li>XPath bugfixes: <ul>
|
|
<li>Fixed translate(), lang() and concat() functions (infinite loops/crashes)
|
|
<li>Fixed compilation of queries with empty literal strings ("")
|
|
<li>Fixed axis tests: they never add empty nodes/attributes to the resulting node set now
|
|
<li>Fixed string-value evaluation for node-set (the result excluded some text descendants)
|
|
<li>Fixed self:: axis (it behaved like ancestor-or-self::)
|
|
<li>Fixed following:: and preceding:: axes (they included descendant and ancestor nodes, respectively)
|
|
<li>Minor fix for namespace-uri() function (namespace declaration scope includes the parent element of namespace declaration attribute)
|
|
<li>Some incorrect queries are no longer parsed now (e.g. foo: *)
|
|
<li>Fixed text()/etc. node test parsing bug (e.g. foo[text()] failed to compile)
|
|
<li>Fixed root step (/) - it now selects empty node set if query is evaluated on empty node
|
|
<li>Fixed string to number conversion ("123 " converted to NaN, "123 .456" converted to 123.456 - now the results are 123 and NaN, respectively)
|
|
<li>Node set copying now preserves sorted type; leads to better performance on some queries
|
|
</ul>
|
|
<li>Miscellaneous bugfixes: <ul>
|
|
<li>Fixed xml_node::offset_debug for PI nodes
|
|
<li>Added empty attribute checks to xml_node::remove_attribute
|
|
<li>Fixed node_pi and node_declaration copying
|
|
<li>Const-correctness fixes
|
|
</ul>
|
|
<li>Specification changes: <ul>
|
|
<li>xpath_node::select_nodes() and related functions now throw an exception if the expression return type is not node set (instead of triggering an assertion)
|
|
<li>xml_node::traverse() now sets depth to -1 for both begin() and end() callbacks (was 0 at begin() and -1 at end())
|
|
<li>In case of non-raw node printing a newline is output after PCDATA inside nodes if the PCDATA has siblings
|
|
<li>UTF8 -> wchar_t conversion now considers 5-byte UTF8-like sequences as invalid
|
|
</ul>
|
|
<li>New features: <ul>
|
|
<li>Added xpath_node_set::operator[] for index-based iteration
|
|
<li>Added xpath_query::return_type()
|
|
<li>Added getter accessors for memory-management functions
|
|
</ul>
|
|
</ul>
|
|
</dd>
|
|
</dt>
|
|
<dt>7.05.2010 - v0.6
|
|
<dd>Changes: <ul>
|
|
<li>Bug fixes:<ul>
|
|
<li>Fixed document corruption on failed parsing bug
|
|
<li>XPath string <-> number conversion improvements (increased precision, fixed crash for huge numbers)
|
|
</ul>
|
|
<li>Major Unicode improvements:<ul>
|
|
<li>Introduced encoding support (automatic/manual encoding detection on load, manual encoding selection on save, conversion from/to UTF8, UTF16 LE/BE, UTF32 LE/BE)
|
|
<li>Introduced wchar_t mode (you can set PUGIXML_WCHAR_MODE define to switch pugixml internal encoding from UTF8 to wchar_t; all functions are switched to their Unicode variants)
|
|
<li>Load/save functions now support wide streams
|
|
</ul>
|
|
<li>Specification changes:<ul>
|
|
<li>parse() API changed to load_buffer/load_buffer_inplace/load_buffer_inplace_own; load_buffer APIs do not require zero-terminated strings.
|
|
<li>Renamed as_utf16 to as_wide
|
|
<li>Changed xml_node::offset_debug return type and xml_parse_result::offset type to ptrdiff_t
|
|
</ul>
|
|
<li>Miscellaneous:<ul>
|
|
<li>Optimized document parsing and saving
|
|
<li>All STL includes in pugixml.hpp are replaced with forward declarations
|
|
<li>Added contrib/ folder with Boost.Foreach compatibility helpers for iterators and header-only configuration support through special header
|
|
</ul>
|
|
</ul>
|
|
</dd>
|
|
</dt>
|
|
<dt>25.05.2010 - v0.7
|
|
<dd>Changes: <ul>
|
|
<li>Compatibility:<ul>
|
|
<li>Added parse() and as_utf16 for compatibility (these functions are deprecated and will be removed in pugixml-1.0)
|
|
<li>Wildcard functions, document_order/precompute_document_order functions, format_write_bom_utf8 and parse_wnorm_attribute flags are deprecated and will be removed in version 1.0
|
|
</ul>
|
|
<li>Optimizations:<ul>
|
|
<li>Changed internal memory management: internal allocator is used for both metadata and name/value data; allocated pages are deleted if all allocations from them are deleted
|
|
<li>Optimized memory consumption: sizeof(xml_node_struct) reduced from 40 bytes to 32 bytes on x86
|
|
<li>Unicode conversion optimizations
|
|
<li>Optimized debug mode parsing/saving by an order of magnitude
|
|
</ul>
|
|
<li>Major Unicode improvements:<ul>
|
|
<li>Introduced encoding support (automatic/manual encoding detection on load, manual encoding selection on save, conversion from/to UTF8, UTF16 LE/BE, UTF32 LE/BE)
|
|
<li>Introduced wchar_t mode (you can set PUGIXML_WCHAR_MODE define to switch pugixml internal encoding from UTF8 to wchar_t; all functions are switched to their Unicode variants)
|
|
<li>Load/save functions now support wide streams
|
|
</ul>
|
|
<li>Bug fixes / specification changes:<ul>
|
|
<li>Improved DOCTYPE parsing: now parser recognizes all well-formed DOCTYPE declarations
|
|
<li>Fixed as_uint() for large numbers (e.g. 2^32-1)
|
|
<li>Nodes/attributes with empty names are now printed as :anonymous
|
|
</ul>
|
|
</ul>
|
|
</dd>
|
|
</dt>
|
|
</dl>
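<p>The v0.5 entry about string to number conversion follows from the XPath 1.0 rules: a string converts to a number
only if it consists of optional whitespace, an optional minus sign, a Number (digits with an optional fractional
part, or a dot followed by digits) and optional whitespace; anything else is NaN. The function below is a minimal
sketch of those rules under that reading of the specification, not the <i>pugixml</i> implementation;
<b>xpath_string_to_number</b> and its helpers are hypothetical names, and the default "C" locale is assumed for
<b>strtod</b>.</p>

```cpp
#include <cstddef>
#include <cstdlib>
#include <limits>
#include <string>

// XML whitespace: space, tab, line feed, carriage return.
static bool is_xml_space(char c)
{
    return c == ' ' || c == '\t' || c == '\n' || c == '\r';
}

static bool is_digit(char c)
{
    return c >= '0' && c <= '9';
}

// Converts a string to a number following the XPath 1.0 number() rules.
double xpath_string_to_number(const std::string& s)
{
    const double nan = std::numeric_limits<double>::quiet_NaN();
    const std::size_t n = s.size();
    std::size_t i = 0;

    // Optional leading whitespace
    while (i < n && is_xml_space(s[i])) ++i;

    const std::size_t start = i;

    // Optional minus sign (a plus sign is not allowed)
    if (i < n && s[i] == '-') ++i;

    // Integer part, then an optional '.' with an optional fractional part
    std::size_t digits = 0;
    while (i < n && is_digit(s[i])) { ++i; ++digits; }

    if (i < n && s[i] == '.')
    {
        ++i;
        while (i < n && is_digit(s[i])) { ++i; ++digits; }
    }

    // At least one digit is required: "", "-" and "." are not numbers
    if (digits == 0) return nan;

    const std::size_t end = i;

    // Optional trailing whitespace; anything else makes the string invalid
    while (i < n && is_xml_space(s[i])) ++i;
    if (i != n) return nan;

    return std::strtod(s.substr(start, end - start).c_str(), 0);
}
```

<p>Under these rules "123 " converts to 123 and "123 .456" converts to NaN, which matches the fixed behavior
described in that entry.</p>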
|
|
|
|
<hr>
|
|
|
|
<a name="Acknowledgements">
|
|
<h2>Acknowledgements</h2>
|
|
|
|
<ul>
|
|
<li><a href="mailto:kristen@tima.net">Kristen Wegner</a> for <i>pugxml</i> parser
|
|
<li><a href="mailto:readonly@getsoft.com">Neville Franks</a> for contributions to <i>pugxml</i> parser
|
|
</ul>
|
|
|
|
<hr>
|
|
|
|
<a name="License">
|
|
<h2>License</h2>
|
|
|
|
<p>The <i>pugixml</i> parser is distributed under the MIT license:</p>
|
|
|
|
<pre>
|
|
Copyright (c) 2006-2010 Arseny Kapoulkine
|
|
|
|
Permission is hereby granted, free of charge, to any person
|
|
obtaining a copy of this software and associated documentation
|
|
files (the "Software"), to deal in the Software without
|
|
restriction, including without limitation the rights to use,
|
|
copy, modify, merge, publish, distribute, sublicense, and/or sell
|
|
copies of the Software, and to permit persons to whom the
|
|
Software is furnished to do so, subject to the following
|
|
conditions:
|
|
|
|
The above copyright notice and this permission notice shall be
|
|
included in all copies or substantial portions of the Software.
|
|
|
|
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
|
|
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
|
|
OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
|
|
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
|
|
HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
|
|
WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
|
FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
|
|
OTHER DEALINGS IN THE SOFTWARE.
|
|
</pre>
|
|
|
|
<hr>
|
|
|
|
<p>Revised 25 May, 2010</p>
|
|
<p><i>© Copyright <a href="mailto:arseny.kapoulkine@gmail.com">Arseny Kapoulkine</a> 2006-2010. All Rights Reserved.</i></p>
|
|
</body>
|
|
</html>
|