mirror of
https://github.com/protobuf-c/protobuf-c.git
synced 2024-12-26 21:04:23 +08:00
d9a2f8470d
Originally, someone complained about protobuf_c_message_unpack() using alloca() for the allocation of the temporary bitmap used to detect that all required fields were present in the unpacked message (Issue #60). Commit 248eae1d eliminated the use of alloca(), replacing the variable-length alloca()'d bitmap with a 16 byte stack-allocated bitmap, treating field numbers mod 128. Andrei Nigmatulin noted in PR #137 problems with this approach: Apparently 248eae1d has introduced a serious problem to protobuf-c decoder. Originally the function of required_fields_bitmap was to prevent decoder from returning incomplete messages. That means, each required field in the message either must have a default_value or be present in the protobuf stream. The purpose of this behaviour was to provide user with 100% complete ProtobufCMessage struct on return from protobuf_c_message_unpack(), which does not need to be checked for completeness just after. This is exactly how original protobuf C++ decoder behaves. The patch 248eae1d broke this functionality by hashing bits of required fields instead of storing them separately. Consider a protobuf message with 129 fields where the first and the last fields set as 'required'. In this case it is possible to trick decoder to return incomplete ProtobufCMessage struct with missing required fields by providing only one of the two fields in the source byte stream. This can be considered as a security issue as well because user's code do not expect incomplete messages with missing required fields out from protobuf_c_message_unpack(). Such a change could introduce undefined behaviour to user programs. This patch is based on Andrei's fix and restores the exact detection of missing required fields, but avoids doing a separate allocation for the required fields bitmap except for messages whose descriptors define a large number of fields. In the "typical" case where the message descriptor has <= 128 fields we can just use a 16 byte array allocated on the stack. (Note that the hash-based approach also used a 16 byte stack allocated array.)
91 lines
3.4 KiB
Plaintext
91 lines
3.4 KiB
Plaintext
----------------------
|
|
--- IMPORTANT TODO ---
|
|
----------------------
|
|
- ensure enums are int-size
|
|
- per comments on the main page: wire_format_inl.h is now
|
|
wire_format_lite_inl.h .. adjust somehow .. possibly write an email
|
|
- BUG? shouldn't enum..by_name[] and enum..by_number[] in the generated-code
|
|
be static??
|
|
- dispatch-within-dispatch protection needed (ie assert that you can't
|
|
recursively invoke the main-loop)
|
|
|
|
--------------------
|
|
--- NEEDED TESTS ---
|
|
--------------------
|
|
- test:
|
|
- service method lookups
|
|
- out-of-order fields in messages (ie if the number isn't ascending)
|
|
- gaps in numbers: check that the number of ranges is correct
|
|
- default values
|
|
- message unpack alloc failures when allocating new slab
|
|
- message unpack alloc failures when allocating unknown field buffers
|
|
- packed message corruption.
|
|
- meta-todo: get a list of all the unpack errors together to check off
|
|
- run gcov
|
|
|
|
---------------------
|
|
--- DOCUMENTATION ---
|
|
---------------------
|
|
Document:
|
|
- services
|
|
- check over documentation again
|
|
|
|
--------------------------
|
|
--- LOW PRIORITY STUFF ---
|
|
--------------------------
|
|
- support Group (whatever it is)
|
|
- proper support for extensions
|
|
- slot for ranges in descriptor
|
|
- extends is implemented as c-style function
|
|
whose name is built from the package, the base message type-name
|
|
and the member. which takes the base message and returns the
|
|
value, if it is found in "unknown_values".
|
|
boolean package__extension_member_name__get(Message *message,
|
|
type *out);
|
|
void package__extension_member_name__set_raw(type in,
|
|
ProtobufCUnknownValue *to_init);
|
|
|
|
------------------------------------
|
|
--- EXTREMELY LOW PRIORITY STUFF ---
|
|
------------------------------------
|
|
- stop using qsort in the code generator: find some c++ish way to do it
|
|
|
|
----------------------------------------------
|
|
--- ISSUES WE ARE PROBABLY GOING TO IGNORE ---
|
|
----------------------------------------------
|
|
- strings may not contain NULs
|
|
|
|
-------------------------
|
|
--- IDEAS TO CONSIDER ---
|
|
-------------------------
|
|
|
|
- optimization: structures without repeated members could skip
|
|
the ScannedMember phase
|
|
|
|
- optimization: a way to ignore unknown-fields when unpacking
|
|
|
|
- optimization: certain functions are not well setup for WORDSIZE==64;
|
|
especially the int64 routines are inefficient that way.
|
|
The best might be an internal #define WORDSIZE (sizeof(long)*8)"
|
|
except w/ a real constant there, one that the preprocessor can use.
|
|
I think the functions in protobuf-c.c are already tagged.
|
|
|
|
- lifetime functions for messages:
|
|
message__new()
|
|
return a new message using an allocator with standard allocation policy
|
|
message__unpack_onto(...)
|
|
unpack onto an initialized message
|
|
message__clear(...)
|
|
clears all allocations, does not free the message itself
|
|
message__free(...)
|
|
free the message.
|
|
[yeah, right: after typing it out, i see it's way too complicated]
|
|
|
|
- switching to pure C.
|
|
- Rewrite the code-generator in C, including the parser.
|
|
- This would have the huge advantage that we could use ".proto" files
|
|
directly, instead of having to invoke the compilers.
|
|
- keep in a separate c file for static linking optimziation purposes
|
|
- need alignment tests
|
|
- the CAVEATS should discuss our structure-packing assumptions
|