0
0
mirror of https://github.com/zeux/pugixml.git synced 2024-12-26 21:04:25 +08:00

docs: Adjust docs wrt parse_merge_pcdata x parse_embed_pcdata

parse_merge_pcdata expects to find node_pcdata nodes which aren't
present when parse_embed_pcdata is active. For now we mention this in
the documentation; changing this is possible in the future, but carries
a small performance penalty so it requires a specific use case.

Fixes #600
This commit is contained in:
Arseny Kapoulkine 2024-01-26 09:23:26 -08:00
parent 3cc2daf449
commit b2b4664030
3 changed files with 16 additions and 22 deletions

View File

@ -749,7 +749,7 @@ These flags control the resulting tree contents:
* [[parse_embed_pcdata]]`parse_embed_pcdata` determines if PCDATA contents is to be saved as element values. Normally element nodes have names but not values; this flag forces the parser to store the contents as a value if PCDATA is the first child of the element node (otherwise PCDATA node is created as usual). This can significantly reduce the memory required for documents with many PCDATA nodes. To retrieve the data you can use `xml_node::value()` on the element nodes or any of the higher-level functions like `child_value` or `text`. This flag is *off* by default. * [[parse_embed_pcdata]]`parse_embed_pcdata` determines if PCDATA contents is to be saved as element values. Normally element nodes have names but not values; this flag forces the parser to store the contents as a value if PCDATA is the first child of the element node (otherwise PCDATA node is created as usual). This can significantly reduce the memory required for documents with many PCDATA nodes. To retrieve the data you can use `xml_node::value()` on the element nodes or any of the higher-level functions like `child_value` or `text`. This flag is *off* by default.
Since this flag significantly changes the DOM structure it is only recommended for parsing documents with many PCDATA nodes in memory-constrained environments. This flag is *off* by default. Since this flag significantly changes the DOM structure it is only recommended for parsing documents with many PCDATA nodes in memory-constrained environments. This flag is *off* by default.
* [[parse_merge_pcdata]]`parse_merge_pcdata` determines if PCDATA contents is to be merged with the previous PCDATA node when no intermediary nodes are present between them. If the PCDATA contains CDATA sections, PI nodes, or comments in between, and either of the flags <<parse_cdata,parse_cdata>> ,<<parse_pi,parse_pi>> ,<<parse_comments,parse_comments>> is not set, the contents of the PCDATA node will be merged with the previous one. This flag is *off* by default. * [[parse_merge_pcdata]]`parse_merge_pcdata` determines if PCDATA contents is to be merged with the previous PCDATA node when no intermediary nodes are present between them. If the PCDATA contains CDATA sections, PI nodes, or comments in between, and either of the flags <<parse_cdata,parse_cdata>> ,<<parse_pi,parse_pi>> ,<<parse_comments,parse_comments>> is not set, the contents of the PCDATA node will be merged with the previous one. This flag is *off* by default. Note that this flag is not compatible with `parse_embed_pcdata`.
* [[parse_fragment]]`parse_fragment` determines if document should be treated as a fragment of a valid XML. Parsing document as a fragment leads to top-level PCDATA content (i.e. text that is not located inside a node) to be added to a tree, and additionally treats documents without element nodes as valid and permits multiple top-level element nodes (currently multiple top-level element nodes are also permitted when the flag is off, but that behavior should not be relied on). This flag is *off* by default. * [[parse_fragment]]`parse_fragment` determines if document should be treated as a fragment of a valid XML. Parsing document as a fragment leads to top-level PCDATA content (i.e. text that is not located inside a node) to be added to a tree, and additionally treats documents without element nodes as valid and permits multiple top-level element nodes (currently multiple top-level element nodes are also permitted when the flag is off, but that behavior should not be relied on). This flag is *off* by default.

View File

@ -4,7 +4,7 @@
<meta charset="UTF-8"> <meta charset="UTF-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge"> <meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0"> <meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta name="generator" content="Asciidoctor 2.0.18"> <meta name="generator" content="Asciidoctor 2.0.20">
<meta name="author" content="website, repository"> <meta name="author" content="website, repository">
<title>pugixml 1.14 manual</title> <title>pugixml 1.14 manual</title>
<link rel="stylesheet" href="https://fonts.googleapis.com/css?family=Open+Sans:300,300italic,400,400italic,600,600italic%7CNoto+Serif:400,400italic,700,700italic%7CDroid+Sans+Mono:400,700"> <link rel="stylesheet" href="https://fonts.googleapis.com/css?family=Open+Sans:300,300italic,400,400italic,600,600italic%7CNoto+Serif:400,400italic,700,700italic%7CDroid+Sans+Mono:400,700">
@ -209,13 +209,10 @@ table.tableblock.fit-content>caption.title{white-space:nowrap;width:0}
.admonitionblock>table td.content{padding-left:1.125em;padding-right:1.25em;border-left:1px solid #dddddf;color:rgba(0,0,0,.6);word-wrap:anywhere} .admonitionblock>table td.content{padding-left:1.125em;padding-right:1.25em;border-left:1px solid #dddddf;color:rgba(0,0,0,.6);word-wrap:anywhere}
.admonitionblock>table td.content>:last-child>:last-child{margin-bottom:0} .admonitionblock>table td.content>:last-child>:last-child{margin-bottom:0}
.exampleblock>.content{border:1px solid #e6e6e6;margin-bottom:1.25em;padding:1.25em;background:#fff;border-radius:4px} .exampleblock>.content{border:1px solid #e6e6e6;margin-bottom:1.25em;padding:1.25em;background:#fff;border-radius:4px}
.exampleblock>.content>:first-child{margin-top:0}
.exampleblock>.content>:last-child{margin-bottom:0}
.sidebarblock{border:1px solid #dbdbd6;margin-bottom:1.25em;padding:1.25em;background:#f3f3f2;border-radius:4px} .sidebarblock{border:1px solid #dbdbd6;margin-bottom:1.25em;padding:1.25em;background:#f3f3f2;border-radius:4px}
.sidebarblock>:first-child{margin-top:0}
.sidebarblock>:last-child{margin-bottom:0}
.sidebarblock>.content>.title{color:#7a2518;margin-top:0;text-align:center} .sidebarblock>.content>.title{color:#7a2518;margin-top:0;text-align:center}
.exampleblock>.content>:last-child>:last-child,.exampleblock>.content .olist>ol>li:last-child>:last-child,.exampleblock>.content .ulist>ul>li:last-child>:last-child,.exampleblock>.content .qlist>ol>li:last-child>:last-child,.sidebarblock>.content>:last-child>:last-child,.sidebarblock>.content .olist>ol>li:last-child>:last-child,.sidebarblock>.content .ulist>ul>li:last-child>:last-child,.sidebarblock>.content .qlist>ol>li:last-child>:last-child{margin-bottom:0} .exampleblock>.content>:first-child,.sidebarblock>.content>:first-child{margin-top:0}
.exampleblock>.content>:last-child,.exampleblock>.content>:last-child>:last-child,.exampleblock>.content .olist>ol>li:last-child>:last-child,.exampleblock>.content .ulist>ul>li:last-child>:last-child,.exampleblock>.content .qlist>ol>li:last-child>:last-child,.sidebarblock>.content>:last-child,.sidebarblock>.content>:last-child>:last-child,.sidebarblock>.content .olist>ol>li:last-child>:last-child,.sidebarblock>.content .ulist>ul>li:last-child>:last-child,.sidebarblock>.content .qlist>ol>li:last-child>:last-child{margin-bottom:0}
.literalblock pre,.listingblock>.content>pre{border-radius:4px;overflow-x:auto;padding:1em;font-size:.8125em} .literalblock pre,.listingblock>.content>pre{border-radius:4px;overflow-x:auto;padding:1em;font-size:.8125em}
@media screen and (min-width:768px){.literalblock pre,.listingblock>.content>pre{font-size:.90625em}} @media screen and (min-width:768px){.literalblock pre,.listingblock>.content>pre{font-size:.90625em}}
@media screen and (min-width:1280px){.literalblock pre,.listingblock>.content>pre{font-size:1em}} @media screen and (min-width:1280px){.literalblock pre,.listingblock>.content>pre{font-size:1em}}
@ -394,7 +391,7 @@ b.conum *{color:inherit!important}
dt,th.tableblock,td.content,div.footnote{text-rendering:optimizeLegibility} dt,th.tableblock,td.content,div.footnote{text-rendering:optimizeLegibility}
h1,h2,p,td.content,span.alt,summary{letter-spacing:-.01em} h1,h2,p,td.content,span.alt,summary{letter-spacing:-.01em}
p strong,td.content strong,div.footnote strong{letter-spacing:-.005em} p strong,td.content strong,div.footnote strong{letter-spacing:-.005em}
p,blockquote,dt,td.content,span.alt,summary{font-size:1.0625rem} p,blockquote,dt,td.content,td.hdlist1,span.alt,summary{font-size:1.0625rem}
p{margin-bottom:1.25rem} p{margin-bottom:1.25rem}
.sidebarblock p,.sidebarblock dt,.sidebarblock td.content,p.tableblock{font-size:1em} .sidebarblock p,.sidebarblock dt,.sidebarblock td.content,p.tableblock{font-size:1em}
.exampleblock>.content{background:#fffef7;border-color:#e0e0dc;box-shadow:0 1px 4px #e0e0dc} .exampleblock>.content{background:#fffef7;border-color:#e0e0dc;box-shadow:0 1px 4px #e0e0dc}
@ -710,7 +707,7 @@ No documentation is perfect; neither is this one. If you find errors or omission
</div> </div>
<div class="literalblock"> <div class="literalblock">
<div class="content"> <div class="content">
<pre>Copyright (c) 2006-2023 Arseny Kapoulkine <pre>Copyright (c) 2006-2024 Arseny Kapoulkine
Permission is hereby granted, free of charge, to any person Permission is hereby granted, free of charge, to any person
obtaining a copy of this software and associated documentation obtaining a copy of this software and associated documentation
@ -740,7 +737,7 @@ OTHER DEALINGS IN THE SOFTWARE.</pre>
<div class="literalblock"> <div class="literalblock">
<div class="content"> <div class="content">
<pre>This software is based on pugixml library (https://pugixml.org). <pre>This software is based on pugixml library (https://pugixml.org).
pugixml is Copyright (C) 2006-2023 Arseny Kapoulkine.</pre> pugixml is Copyright (C) 2006-2024 Arseny Kapoulkine.</pre>
</div> </div>
</div> </div>
</div> </div>
@ -1802,7 +1799,7 @@ You should use the usual bitwise arithmetics to manipulate the bitmask: to enabl
Since this flag significantly changes the DOM structure it is only recommended for parsing documents with many PCDATA nodes in memory-constrained environments. This flag is <strong>off</strong> by default.</p> Since this flag significantly changes the DOM structure it is only recommended for parsing documents with many PCDATA nodes in memory-constrained environments. This flag is <strong>off</strong> by default.</p>
</li> </li>
<li> <li>
<p><a id="parse_merge_pcdata"></a><code>parse_merge_pcdata</code> determines if PCDATA contents is to be merged with the previous PCDATA node when no intermediary nodes are present between them. If the PCDATA contains CDATA sections, PI nodes, or comments in between, and either of the flags <a href="#parse_cdata">parse_cdata</a> ,<a href="#parse_pi">parse_pi</a> ,<a href="#parse_comments">parse_comments</a> is not set, the contents of the PCDATA node will be merged with the previous one. This flag is <strong>off</strong> by default.</p> <p><a id="parse_merge_pcdata"></a><code>parse_merge_pcdata</code> determines if PCDATA contents is to be merged with the previous PCDATA node when no intermediary nodes are present between them. If the PCDATA contains CDATA sections, PI nodes, or comments in between, and either of the flags <a href="#parse_cdata">parse_cdata</a> ,<a href="#parse_pi">parse_pi</a> ,<a href="#parse_comments">parse_comments</a> is not set, the contents of the PCDATA node will be merged with the previous one. This flag is <strong>off</strong> by default. Note that this flag is not compatible with <code>parse_embed_pcdata</code>.</p>
</li> </li>
<li> <li>
<p><a id="parse_fragment"></a><code>parse_fragment</code> determines if document should be treated as a fragment of a valid XML. Parsing document as a fragment leads to top-level PCDATA content (i.e. text that is not located inside a node) to be added to a tree, and additionally treats documents without element nodes as valid and permits multiple top-level element nodes (currently multiple top-level element nodes are also permitted when the flag is off, but that behavior should not be relied on). This flag is <strong>off</strong> by default.</p> <p><a id="parse_fragment"></a><code>parse_fragment</code> determines if document should be treated as a fragment of a valid XML. Parsing document as a fragment leads to top-level PCDATA content (i.e. text that is not located inside a node) to be added to a tree, and additionally treats documents without element nodes as valid and permits multiple top-level element nodes (currently multiple top-level element nodes are also permitted when the flag is off, but that behavior should not be relied on). This flag is <strong>off</strong> by default.</p>
@ -6156,7 +6153,7 @@ If exceptions are disabled, then in the event of parsing failure the query is in
</div> </div>
<div id="footer"> <div id="footer">
<div id="footer-text"> <div id="footer-text">
Last updated 2023-09-07 11:54:45 -0700 Last updated 2024-01-26 09:23:09 -0800
</div> </div>
</div> </div>
</body> </body>

View File

@ -4,7 +4,7 @@
<meta charset="UTF-8"> <meta charset="UTF-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge"> <meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0"> <meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta name="generator" content="Asciidoctor 2.0.18"> <meta name="generator" content="Asciidoctor 2.0.20">
<meta name="author" content="website, repository"> <meta name="author" content="website, repository">
<title>pugixml 1.14 quick start guide</title> <title>pugixml 1.14 quick start guide</title>
<link rel="stylesheet" href="https://fonts.googleapis.com/css?family=Open+Sans:300,300italic,400,400italic,600,600italic%7CNoto+Serif:400,400italic,700,700italic%7CDroid+Sans+Mono:400,700"> <link rel="stylesheet" href="https://fonts.googleapis.com/css?family=Open+Sans:300,300italic,400,400italic,600,600italic%7CNoto+Serif:400,400italic,700,700italic%7CDroid+Sans+Mono:400,700">
@ -209,13 +209,10 @@ table.tableblock.fit-content>caption.title{white-space:nowrap;width:0}
.admonitionblock>table td.content{padding-left:1.125em;padding-right:1.25em;border-left:1px solid #dddddf;color:rgba(0,0,0,.6);word-wrap:anywhere} .admonitionblock>table td.content{padding-left:1.125em;padding-right:1.25em;border-left:1px solid #dddddf;color:rgba(0,0,0,.6);word-wrap:anywhere}
.admonitionblock>table td.content>:last-child>:last-child{margin-bottom:0} .admonitionblock>table td.content>:last-child>:last-child{margin-bottom:0}
.exampleblock>.content{border:1px solid #e6e6e6;margin-bottom:1.25em;padding:1.25em;background:#fff;border-radius:4px} .exampleblock>.content{border:1px solid #e6e6e6;margin-bottom:1.25em;padding:1.25em;background:#fff;border-radius:4px}
.exampleblock>.content>:first-child{margin-top:0}
.exampleblock>.content>:last-child{margin-bottom:0}
.sidebarblock{border:1px solid #dbdbd6;margin-bottom:1.25em;padding:1.25em;background:#f3f3f2;border-radius:4px} .sidebarblock{border:1px solid #dbdbd6;margin-bottom:1.25em;padding:1.25em;background:#f3f3f2;border-radius:4px}
.sidebarblock>:first-child{margin-top:0}
.sidebarblock>:last-child{margin-bottom:0}
.sidebarblock>.content>.title{color:#7a2518;margin-top:0;text-align:center} .sidebarblock>.content>.title{color:#7a2518;margin-top:0;text-align:center}
.exampleblock>.content>:last-child>:last-child,.exampleblock>.content .olist>ol>li:last-child>:last-child,.exampleblock>.content .ulist>ul>li:last-child>:last-child,.exampleblock>.content .qlist>ol>li:last-child>:last-child,.sidebarblock>.content>:last-child>:last-child,.sidebarblock>.content .olist>ol>li:last-child>:last-child,.sidebarblock>.content .ulist>ul>li:last-child>:last-child,.sidebarblock>.content .qlist>ol>li:last-child>:last-child{margin-bottom:0} .exampleblock>.content>:first-child,.sidebarblock>.content>:first-child{margin-top:0}
.exampleblock>.content>:last-child,.exampleblock>.content>:last-child>:last-child,.exampleblock>.content .olist>ol>li:last-child>:last-child,.exampleblock>.content .ulist>ul>li:last-child>:last-child,.exampleblock>.content .qlist>ol>li:last-child>:last-child,.sidebarblock>.content>:last-child,.sidebarblock>.content>:last-child>:last-child,.sidebarblock>.content .olist>ol>li:last-child>:last-child,.sidebarblock>.content .ulist>ul>li:last-child>:last-child,.sidebarblock>.content .qlist>ol>li:last-child>:last-child{margin-bottom:0}
.literalblock pre,.listingblock>.content>pre{border-radius:4px;overflow-x:auto;padding:1em;font-size:.8125em} .literalblock pre,.listingblock>.content>pre{border-radius:4px;overflow-x:auto;padding:1em;font-size:.8125em}
@media screen and (min-width:768px){.literalblock pre,.listingblock>.content>pre{font-size:.90625em}} @media screen and (min-width:768px){.literalblock pre,.listingblock>.content>pre{font-size:.90625em}}
@media screen and (min-width:1280px){.literalblock pre,.listingblock>.content>pre{font-size:1em}} @media screen and (min-width:1280px){.literalblock pre,.listingblock>.content>pre{font-size:1em}}
@ -394,7 +391,7 @@ b.conum *{color:inherit!important}
dt,th.tableblock,td.content,div.footnote{text-rendering:optimizeLegibility} dt,th.tableblock,td.content,div.footnote{text-rendering:optimizeLegibility}
h1,h2,p,td.content,span.alt,summary{letter-spacing:-.01em} h1,h2,p,td.content,span.alt,summary{letter-spacing:-.01em}
p strong,td.content strong,div.footnote strong{letter-spacing:-.005em} p strong,td.content strong,div.footnote strong{letter-spacing:-.005em}
p,blockquote,dt,td.content,span.alt,summary{font-size:1.0625rem} p,blockquote,dt,td.content,td.hdlist1,span.alt,summary{font-size:1.0625rem}
p{margin-bottom:1.25rem} p{margin-bottom:1.25rem}
.sidebarblock p,.sidebarblock dt,.sidebarblock td.content,p.tableblock{font-size:1em} .sidebarblock p,.sidebarblock dt,.sidebarblock td.content,p.tableblock{font-size:1em}
.exampleblock>.content{background:#fffef7;border-color:#e0e0dc;box-shadow:0 1px 4px #e0e0dc} .exampleblock>.content{background:#fffef7;border-color:#e0e0dc;box-shadow:0 1px 4px #e0e0dc}
@ -1057,7 +1054,7 @@ XPath functions throw <code>xpath_exception</code> objects on error; the sample
</div> </div>
<div class="literalblock"> <div class="literalblock">
<div class="content"> <div class="content">
<pre>Copyright (c) 2006-2023 Arseny Kapoulkine <pre>Copyright (c) 2006-2024 Arseny Kapoulkine
Permission is hereby granted, free of charge, to any person Permission is hereby granted, free of charge, to any person
obtaining a copy of this software and associated documentation obtaining a copy of this software and associated documentation
@ -1087,7 +1084,7 @@ OTHER DEALINGS IN THE SOFTWARE.</pre>
<div class="literalblock"> <div class="literalblock">
<div class="content"> <div class="content">
<pre>This software is based on pugixml library (https://pugixml.org). <pre>This software is based on pugixml library (https://pugixml.org).
pugixml is Copyright (C) 2006-2023 Arseny Kapoulkine.</pre> pugixml is Copyright (C) 2006-2024 Arseny Kapoulkine.</pre>
</div> </div>
</div> </div>
</div> </div>
@ -1101,7 +1098,7 @@ pugixml is Copyright (C) 2006-2023 Arseny Kapoulkine.</pre>
</div> </div>
<div id="footer"> <div id="footer">
<div id="footer-text"> <div id="footer-text">
Last updated 2023-09-07 11:54:46 -0700 Last updated 2024-01-26 09:23:09 -0800
</div> </div>
</div> </div>
</body> </body>