mirror of https://github.com/zeux/pugixml.git synced 2024-12-27 13:33:17 +08:00

777 Commits

Author SHA1 Message Date
Arseny Kapoulkine
58616a29aa Avoid store-load penalty on cursor->parent
This reclaims the performance lost in PCDATA reorganization and gains a
little more on top of that.
2023-09-05 22:23:31 -07:00
Arseny Kapoulkine
dfb2b7f7b4 Cache ~last position of merged PCDATA
This allows us to fix the quadratic complexity of parse_merge_pcdata.
After parsing the first PCDATA we need to advance by its length; we
still compute the length of each fragment twice with this approach, but
it's constant time.
2023-09-05 21:34:02 -07:00
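The idea in the commit above (remember where the merged text ends instead of rescanning it) can be sketched in isolation. This is an illustrative sketch only, not pugixml's parser code: `merged_text`, `start_merge` and `append_fragment` are hypothetical names, and the fixed-capacity buffer is an assumption made to keep the example self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical structure (not from pugixml): a merged string plus a cached
// pointer to its terminating '\0'.
struct merged_text
{
    char* begin;          // start of the merged string
    char* end;            // cached position of the terminator
    std::size_t capacity; // bytes available starting at begin
};

inline merged_text start_merge(char* buffer, std::size_t capacity)
{
    assert(buffer && capacity > 0);
    buffer[0] = 0;
    return merged_text{buffer, buffer, capacity};
}

// Appends one fragment. Cost is proportional to the fragment length only,
// because the end of the existing text is cached rather than found by
// scanning from begin. Returns false if the fragment does not fit.
inline bool append_fragment(merged_text& m, const char* fragment)
{
    std::size_t length = std::strlen(fragment);
    std::size_t used = static_cast<std::size_t>(m.end - m.begin);

    if (length >= m.capacity - used) return false; // no room for text + '\0'

    std::memcpy(m.end, fragment, length + 1); // copies the terminator too
    m.end += length;
    return true;
}
```

Without the cached `end`, each append would have to walk the whole merged string first, which makes merging n fragments quadratic in the total length.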
Arseny Kapoulkine
e9d17a045e Streamline conditions with else if 2023-08-26 08:35:40 -07:00
Arseny Kapoulkine
6749789ec4 Fix interaction between parse_merge_pcdata and append_buffer
strconcat in the parsing loop only works if we know the source string
comes from the same buffer that we're parsing. This is somewhat
cumbersome to establish during parsing and it requires extra tracking
data, so we just disable this combination as it's unlikely to be
actually useful - usually append_buffer would be called on a possibly
empty collection of elements, not on something with PCDATA.
2023-08-25 19:32:42 -07:00
Arseny Kapoulkine
37ba937e05 Fix code style and simplify conditions
prev_sibling_c is never NULL so it should be safe to check first_child
instead, which ensures we pay the minimum cost when the feature isn't
enabled.
2023-08-25 18:57:06 -07:00
Vineeth
3ad133a2ad Review Changes
i) Indentation added
ii) Optimal parsing ensured
2023-08-23 08:12:57 +05:30
vineeth-11316
a28252205a Final Commit 2023-08-20 10:21:26 +05:30
vineeth-11316
e15adbe704 Initial Commit for merge_pcdata 2023-08-19 09:17:54 +05:30
Arseny Kapoulkine
980cf57ff4 XPath: Account for non-English locales during number->string conversion
We use a special number formatting routine to generate the XPath
REC-compliant number representation; it relies on being able to get a
decimal representation of the source number, which we use sprintf for as
a fallback.

This is fairly insensitive to current locale, except for an assertion
that validates the decimal point as a precaution, and this check
triggers when the locale decimal point is not a dot.

Ideally we'd use a locale-insensitive routine here. On some systems we
have ecvt_r (similarly to MSVC's ecvt_s), but it's deprecated so
adopting it might be fraught with peril.

For now let's simply adjust the assertion to account for locales with
comma as a separator. This is probably not fully comprehensive but
probably gets us from a 90% solution to a 99% solution...

Fixes #574.
2023-07-27 09:24:10 -07:00
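A rough sketch of the kind of check the commit above describes: format a number in scientific notation, then accept either a dot or a comma as the decimal separator when validating the result. The helper name `scientific_digits` and its exact shape are assumptions made for illustration; this is not the routine from pugixml, and it only handles finite, non-zero values.

```cpp
#include <cassert>
#include <cfloat>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical helper: returns the significant decimal digits of a finite,
// non-zero value and stores its decimal exponent in `exponent`.
inline std::string scientific_digits(double value, int& exponent)
{
    char buffer[64];
    std::snprintf(buffer, sizeof(buffer), "%.*e", DBL_DIG, value);

    // Split "d.ddddde+XX" at the exponent marker.
    char* exponent_string = std::strchr(buffer, 'e');
    assert(exponent_string);
    exponent = std::atoi(exponent_string + 1);
    *exponent_string = 0;

    // Skip the sign, if any.
    char* mantissa = buffer[0] == '-' ? buffer + 1 : buffer;

    // The separator depends on LC_NUMERIC, so accept '.' and ','.
    assert(mantissa[0] != '0' && (mantissa[1] == '.' || mantissa[1] == ','));

    // Leading digit followed by the fractional digits, separator dropped.
    std::string digits(1, mantissa[0]);
    digits += mantissa + 2;
    return digits;
}
```

As the commit message notes, accepting two separators is not a complete answer to locale sensitivity. It only widens the set of locales in which the precautionary check passes.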
Sergey Abramov
1e9636303e Fix compilation errors on old GCC (2.95.3, 3.3.5) 2023-05-11 09:04:22 +07:00
Andy Maloney
058fc601a1 Fix weak vtable warning regarding xml_writer
Using Apple clang (clang-1400.0.29.202) with `-Wweak-vtables` would produce the following warning:

'xml_writer' has no out-of-line virtual method definitions; its vtable will be emitted in every translation unit [-Wweak-vtables]
2023-04-21 17:23:38 -04:00
Arseny Kapoulkine
a13b5cc08d Use stricter subset for now to avoid compat issues with Unix-like platforms 2023-04-15 13:41:15 -07:00
Arseny Kapoulkine
d3199a0c39 Fix get_file_size behavior inconsistency for folders
Different OSes have different behavior when trying to fopen/fseek/ftell
a folder. On Linux, some systems return 0 size, some systems return an
error, and some systems return LONG_MAX. LONG_MAX is particularly
problematic because that causes spurious OOMs under address sanitizer.

Using fstat manually cleans this up, however it introduces a new
dependency on platform specific headers that we didn't have before, and
also has unclear behavior on 64-bit systems wrt 32-bit sizes which will
need to be tested further as I'm not certain if the behavior needs to be
special-cased only for MSVC/MinGW, which are currently not handled by
this path (unless MinGW defines __unix__...)
2023-04-15 12:48:59 -07:00
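A minimal POSIX-only sketch of the approach described above: ask `fstat` about the open handle and refuse anything that is not a regular file, rather than trusting `fseek`/`ftell` on a directory. The function name `regular_file_size` is hypothetical, and the sketch deliberately leaves out the MSVC/MinGW and 32-bit size questions that the commit message raises.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <sys/stat.h> // POSIX; not available on every platform

// Hypothetical helper: stores the size of an open regular file in out_size.
// Returns false for directories and other non-regular files.
inline bool regular_file_size(FILE* file, std::size_t& out_size)
{
    assert(file);

    struct stat st;
    if (fstat(fileno(file), &st) != 0) return false;

    // Directories, pipes, devices and so on have no meaningful size here.
    if (!S_ISREG(st.st_mode)) return false;
    if (st.st_size < 0) return false;

    out_size = static_cast<std::size_t>(st.st_size);
    return true;
}
```

Rejecting non-regular files up front avoids ever seeing a bogus value such as LONG_MAX, so the caller never attempts the oversized allocation that tripped address sanitizer.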
Arseny Kapoulkine
a469fa2cfc Add assertion about header-dest relation to strcpy_insitu
May improve static analysis behavior for #555.
2023-03-19 14:57:10 -07:00
David Seifert
36aa487e9c Fix -Wreserved-macro-identifier
* https://eel.is/c++draft/lex.name#3.1
  "Each identifier that contains a double underscore __ [...] is reserved to the implementation for any use."
2023-03-19 17:10:53 +01:00
Arseny Kapoulkine
c2c61a5905 Add a cautionary comment to xml_node::children(name)
Fixes #538
2023-01-23 09:37:11 -08:00
Arseny Kapoulkine
e11e0c965f Fix comment typo. 2022-11-06 13:47:53 -08:00
Arseny Kapoulkine
b6b747244e Adjust the workaround for -pedantic mode and fix tests 2022-11-06 10:21:35 -08:00
Arseny Kapoulkine
8be081fbbe Fix Xcode 14 sprintf deprecation warning
We use snprintf when stdc is set to C++11, however in C++98 mode we can't use variadic macros,
and Xcode 14 complains about the use of sprintf.

It should be safe however to use variadic macros on any remotely recent version of clang on Apple,
unless -pedantic is defined which warns against the use of variadic macros in C++98 mode...

This change fixes the problem for the builds that don't specify -pedantic, which is a problem for
another day.
2022-11-06 10:16:21 -08:00
Arseny Kapoulkine
76dcd89427 Update version number in preparation for 1.13 2022-10-20 20:08:52 -07:00
Arseny Kapoulkine
444963e269 Fix error handling in xml_document::save_file
There were two conditions under which xml_document::save_file could
previously return true even though the saving failed:

- The last write to the file was buffered in stdio buffer, and it's that
  last write that would fail due to lack of disk space
- The data has been written correctly but fclose failed to update file
  metadata, which can result in truncated size / missing inode updates.

This change fixes both by adjusting save_file to fflush before the check,
and also checking fclose results. Note that while fflush here is
technically redundant, because it's implied by fclose, we must check
ferror explicitly anyway, and so it feels a little cleaner to do most of
the error handling in save_file_impl, so that the chances of fclose()
failing are very slim.

Of course, neither change guarantees that the contents of the file are
going to be safe on disk following a power failure.
2022-10-07 22:13:04 -07:00
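The two failure modes listed above can be illustrated with plain stdio. This is a sketch under assumptions, not the body of `save_file` or `save_file_impl`: `save_text_file` is a hypothetical stand-in that writes a string, flushes before checking `ferror`, and treats a failing `fclose` as a failed save.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical stand-in for a save routine; returns true only if every
// step, including the final close, reported success.
inline bool save_text_file(const char* path, const char* contents)
{
    assert(path && contents);

    FILE* file = std::fopen(path, "wb");
    if (!file) return false;

    std::size_t length = std::strlen(contents);
    bool written = std::fwrite(contents, 1, length, file) == length;

    // Flush first so that a failure of the last buffered write is visible
    // to ferror() instead of being deferred to fclose().
    bool flushed = std::fflush(file) == 0 && std::ferror(file) == 0;

    // fclose() can still fail; call it unconditionally to release the handle.
    bool closed = std::fclose(file) == 0;

    return written && flushed && closed;
}
```

As the commit message says, none of this makes the data durable across a power failure; that would need platform-specific syncing, which is outside this sketch.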
Arseny Kapoulkine
0cb4f02579 Final tweaks after #522
This cleans up xml_attribute::set_value to be uniform wrt
xml_node::set_value and xml_text::set_value - for now we duplicate the
body since the logic is trivial and this keeps debug performance
excellent.
2022-10-07 21:46:27 -07:00
Arseny Kapoulkine
c342266fae Merge pull request #522 from Ferenc-/followup-on-pr-490
Followup on pr 490
2022-10-07 21:42:41 -07:00
Ferenc Géczi
f327371219 Add overloads with size_t type argument
* xml_node::set_value(const char_t* rhs, size_t sz)
* xml_text::set(const char_t* rhs, size_t sz)

Signed-off-by: Ferenc Géczi <ferenc.gm@gmail.com>
2022-09-29 18:26:05 +00:00
Matthäus Brandl
f7de324855 Enable usage of nullptr for MSVC 16 and newer (MSVS 2010) 2022-08-04 16:59:37 +02:00
Arseny Kapoulkine
2639dfd053 Merge pull request #477 from zeux/compactopt
Optimize compact mode
2022-05-31 20:00:01 -05:00
Arseny Kapoulkine
832a4f4914 Use more idiomatic code in this codebase 2022-05-16 19:14:29 -07:00
Arseny Kapoulkine
33a75c734b Fix memory leak during OOM in convert_buffer
This is the same fix as #497, but we're using auto_deleter instead
because if allocation function throws, we can't rely on an explicit call
to deallocate.

Comes along with two tests that validate the behavior.
2022-05-16 19:12:52 -07:00
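Why a scope guard helps when the allocation function can throw can be shown with a small self-contained sketch. Everything here is hypothetical and for illustration only: `scoped_deleter` is a simplified stand-in for the `auto_deleter` mentioned above, and `tracked_allocate`, `tracked_free` and `convert_with_guard` exist solely to make the leak observable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Simplified guard: frees the pointer on scope exit unless released.
template <typename T> struct scoped_deleter
{
    typedef void (*deleter_fn)(T*);

    T* data;
    deleter_fn deleter;

    scoped_deleter(T* data_, deleter_fn deleter_): data(data_), deleter(deleter_) {}
    ~scoped_deleter() { if (data) deleter(data); }

    T* release() { T* result = data; data = 0; return result; }

    scoped_deleter(const scoped_deleter&) = delete;
    scoped_deleter& operator=(const scoped_deleter&) = delete;
};

// Counts live buffers so that a leak shows up as a non-zero value.
static int live_buffers = 0;

inline void* tracked_allocate(std::size_t size)
{
    void* result = std::malloc(size);
    if (result) ++live_buffers;
    return result;
}

inline void tracked_free(void* ptr)
{
    if (ptr) { --live_buffers; std::free(ptr); }
}

// Allocates a scratch buffer, then a result buffer. The flag simulates an
// allocation function that throws on the second request.
inline void* convert_with_guard(bool fail_second_allocation)
{
    void* scratch = tracked_allocate(16);
    if (!scratch) throw std::bad_alloc();

    // The guard frees scratch on every exit path, including exceptions.
    scoped_deleter<void> guard(scratch, tracked_free);

    if (fail_second_allocation) throw std::bad_alloc();

    void* result = tracked_allocate(32);
    if (!result) throw std::bad_alloc();

    return result;
}
```

An explicit `tracked_free(scratch)` placed after the second allocation would never run when that allocation throws, which is the situation the commit message describes.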
TodorHryn
6fbf32140b Fix memory leak 2022-05-16 13:21:20 +03:00
Viktor Govako
effc46f0ed Added bool set_value(const char_t* rhs, size_t sz). 2022-04-13 12:25:01 +03:00
Arseny Kapoulkine
dd50fa5b45 Fix PUGIXML_VERSION macro
Also make sure the line shows up in grep when using the current version
number.

Fixes #478.
2022-02-10 08:36:19 -08:00
Arseny Kapoulkine
2fa9158b4f Optimize compact mode: xml_text 2022-02-08 23:04:31 -08:00
Arseny Kapoulkine
fad2d5e4ef Optimize compact mode: xml_attribute/xml_node implementation 2022-02-08 23:00:17 -08:00
Arseny Kapoulkine
f388c465dd Optimize compact mode: reuse access in insert/remove 2022-02-08 22:44:31 -08:00
Arseny Kapoulkine
25c4fb74a8 Update copyright year to 2022 2022-02-08 19:58:58 -08:00
Arseny Kapoulkine
c9e219c17b Update version to 1.12 2022-02-08 19:56:41 -08:00
Arseny Kapoulkine
9ba92a7fa7 Restore compatibility with WinCE
WinCE lacks most recent CRT additions to MSVC; we used to explicitly disable specific sections
of code, but it's more comprehensive to just specify that the CRT is from MSVC7 instead of MSVC8.

Fixes #401
2022-02-08 19:19:34 -08:00
Arseny Kapoulkine
8cece4b9fe Fix a bug in move construction when move source is empty
Previously when copying the allocator state we would copy an incorrect
root pointer into the document's current state; while this had a minimal
impact on the allocation state due to the fact that any new allocation
would need to create a new page, this used a potentially stale field of
the moved document when setting up new pages, which could create issues
in future uses of the pages.

This change fixes the core problem and also removes the use of the
_root->allocator from allocate_page since it's not clear why we need it
there in the first place.
2021-05-11 22:53:54 -07:00
Arseny Kapoulkine
56c9afa7c8 XPath: Improve recursion limit for deep chains of //
Since foo//bar//baz adds two nodes for each //, we need to increment the
depth by 2 on each iteration to limit the AST correctly.

Fixes the stack overflow found by cluster-fuzz (I suspect the issue
there is a bit deeper, but this part is definitely a bug and as such I'd
rather wait for the next test case for now).
2021-05-11 22:27:53 -07:00
Rosen Penev
c167259e60 add empty method
Simple, and avoids the need for std::distance.

Signed-off-by: Rosen Penev <rosenp@gmail.com>
2021-04-27 13:55:02 -07:00
Rosen Penev
ef257796db remove const from operator++/--
This prevents usage with C++20 ranges since it does not satisfy
std::weakly_incrementable.

Signed-off-by: Rosen Penev <rosenp@gmail.com>
2021-04-26 14:06:19 -07:00
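The incompatibility can be approximated without C++20. `std::weakly_incrementable<I>` requires, among other things, that `++i` has exactly the type `I&`, so a pre-increment operator that returns a const reference does not qualify. The sketch below assumes this is the kind of `const` the commit removed; the two iterator types and the `preincrement_returns_self` trait are hypothetical and only mirror that one requirement.

```cpp
#include <type_traits>
#include <utility>

// Pre-increment returning a const reference (the assumed "before" shape).
struct old_style_iterator
{
    int value;
    const old_style_iterator& operator++() { ++value; return *this; }
};

// Pre-increment returning a plain reference (the assumed "after" shape).
struct new_style_iterator
{
    int value;
    new_style_iterator& operator++() { ++value; return *this; }
};

// C++11 approximation of one clause of std::weakly_incrementable:
// the type of ++i must be exactly I&.
template <typename I>
struct preincrement_returns_self
    : std::is_same<decltype(++std::declval<I&>()), I&>
{
};

static_assert(!preincrement_returns_self<old_style_iterator>::value,
              "const reference return does not satisfy the requirement");
static_assert(preincrement_returns_self<new_style_iterator>::value,
              "plain reference return satisfies the requirement");
```

The real concept checks more than this single clause, so passing this trait is necessary but not sufficient for use with C++20 ranges.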
Arseny Kapoulkine
fe4bc946b2 Update copyright year to 2020 everywhere 2020-11-26 01:00:27 -08:00
Arseny Kapoulkine
70bd6a6b0a Update version to 1.11 and update documentation 2020-11-25 10:18:42 -08:00
Arseny Kapoulkine
5f97d5d66f Fix -Wshadow in remove_children()
child variable was shadowing xml_node::child
2020-11-25 09:28:26 -08:00
Arseny Kapoulkine
28aebf2b22 Merge pull request #382 from zeux/TheNicker-master
Fix MSVC deprecation warnings when using clang-cl
2020-11-25 09:19:24 -08:00
Arseny Kapoulkine
df42668e18 Cleanup code and feature detection
We now use open_file similarly to open_file_wide, and activate the
workaround for MSVC 2005+ since that's when the _s versions were added
in the first place.
2020-11-25 08:38:22 -08:00
Arseny Kapoulkine
8e5b8e0f46 XPath: Fix stack overflow in functions with long argument lists
Function call arguments are stored in a list which is processed
recursively during optimize(). We now limit the depth of this construct
as well to make sure optimize() doesn't run out of stack space.
2020-09-11 09:50:41 -07:00
Arseny Kapoulkine
20aef1cd4b Fix stack overflow in tests on MSVC x64
The default stack on MSVC/x64/debug is sufficient for 1692 nested
invocations only, whereas on clang/linux it's ~8K...

For now set the limit to be conservative.
2020-09-10 09:11:46 -07:00
Arseny Kapoulkine
1f84db837b XPath: Restrict AST depth to prevent stack overflow
XPath parser and execution engine isn't stackless; the depth of the
query controls the amount of C stack space required.

This change instruments places in the parser where the control flow can
recurse, requiring too much C stack space to produce an AST, or where a
stackless parse is used to produce arbitrarily deep AST which will
create issues for downstream processing.

As a result XPath parser should now be fuzz safe for malicious inputs.
2020-09-10 00:55:26 -07:00
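The general technique (count nesting depth at each point where the parser recurses and fail once a limit is exceeded) can be sketched on a toy grammar. This is not the XPath parser: `paren_parser`, `parse_balanced` and the `max_depth` value of 1024 are assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical limit; a real value would be tuned to the available stack.
const std::size_t max_depth = 1024;

// Toy recursive-descent parser for: sequence := ( '(' sequence ')' )*
struct paren_parser
{
    const char* cursor;
    std::size_t depth;

    bool parse_sequence()
    {
        while (*cursor == '(')
        {
            // Refuse to recurse further once the limit is exceeded, so the
            // C stack usage stays bounded regardless of the input.
            if (++depth > max_depth) return false;

            ++cursor;
            if (!parse_sequence()) return false;

            if (*cursor != ')') return false;
            ++cursor;

            --depth;
        }

        return true;
    }
};

// Returns true if text is a balanced sequence nested no deeper than max_depth.
inline bool parse_balanced(const char* text)
{
    assert(text);
    paren_parser parser = {text, 0};
    return parser.parse_sequence() && *parser.cursor == 0;
}
```

Input nested more deeply than `max_depth` is rejected as a parse failure instead of overflowing the stack, which is the property that matters for fuzzed or malicious queries.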
Lior Lahav
c258fba6f1 Replaced fopen and _wfopen deprecated functions with the safer fopen_s and _wfopen_s 2020-07-21 22:37:16 +03:00