Originally, someone complained about protobuf_c_message_unpack() using
alloca() for the allocation of the temporary bitmap used to detect that
all required fields were present in the unpacked message (Issue #60).
Commit 248eae1d eliminated the use of alloca(), replacing the
variable-length alloca()'d bitmap with a 16 byte stack-allocated bitmap,
treating field numbers mod 128.
Andrei Nigmatulin noted in PR #137 problems with this approach:
    Apparently 248eae1d has introduced a serious problem to protobuf-c
    decoder.
    Originally the function of required_fields_bitmap was to prevent
    decoder from returning incomplete messages. That means, each
    required field in the message either must have a default_value or be
    present in the protobuf stream. The purpose of this behaviour was to
    provide user with 100% complete ProtobufCMessage struct on return
    from protobuf_c_message_unpack(), which does not need to be checked
    for completeness just after. This is exactly how original protobuf
    C++ decoder behaves. The patch 248eae1d broke this functionality by
    hashing bits of required fields instead of storing them separately.
    Consider a protobuf message with 129 fields where the first and the
    last fields set as 'required'. In this case it is possible to trick
    decoder to return incomplete ProtobufCMessage struct with missing
    required fields by providing only one of the two fields in the
    source byte stream. This can be considered as a security issue as
    well because user's code do not expect incomplete messages with
    missing required fields out from protobuf_c_message_unpack(). Such a
    change could introduce undefined behaviour to user programs.
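The collision Andrei describes can be reproduced with a small model of the hashed bitmap. This is only a sketch: it assumes the bitmap is indexed by field number modulo 128, as described above, and the names hashed_set(), hashed_test() and hashed_collides() are hypothetical, not taken from protobuf-c.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the hashed bitmap: 16 bytes = 128 bits,
 * indexed by field number modulo 128. */
#define HASHED_BITS 128

static void hashed_set(uint8_t *bitmap, unsigned field_number)
{
	unsigned bit = field_number % HASHED_BITS;

	bitmap[bit / 8] |= (uint8_t)(1u << (bit % 8));
}

static int hashed_test(const uint8_t *bitmap, unsigned field_number)
{
	unsigned bit = field_number % HASHED_BITS;

	return (bitmap[bit / 8] >> (bit % 8)) & 1;
}

/* Returns 1 if having seen only `seen` makes `missing` look present. */
static int hashed_collides(unsigned seen, unsigned missing)
{
	uint8_t bitmap[HASHED_BITS / 8];

	memset(bitmap, 0, sizeof(bitmap));
	hashed_set(bitmap, seen);
	return hashed_test(bitmap, missing);
}
```

With this model, fields 1 and 129 map to the same bit, so seeing field 1 alone marks field 129 as present too.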
This patch is based on Andrei's fix and restores the exact detection of
missing required fields, but avoids doing a separate allocation for the
required fields bitmap except for messages whose descriptors define a
large number of fields. In the "typical" case where the message
descriptor has <= 128 fields we can just use a 16 byte array allocated
on the stack. (Note that the hash-based approach also used a 16 byte
stack allocated array.)
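A minimal sketch of the allocation strategy described above, not the actual patch: the helper all_required_seen() and its parameters are hypothetical, and it uses plain malloc()/free() where the library would go through its allocator. Bits are indexed by position in the descriptor's field array, so no two fields share a bit.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define STACK_BITMAP_BYTES 16	/* enough for descriptors with <= 128 fields */

/*
 * Hypothetical helper. Returns 1 if every index in `required` also
 * appears in `seen`, 0 if a required field is missing, and -1 if the
 * heap allocation for a large descriptor fails.
 */
static int all_required_seen(unsigned n_fields,
			     const unsigned *required, size_t n_required,
			     const unsigned *seen, size_t n_seen)
{
	uint8_t stack_bitmap[STACK_BITMAP_BYTES];
	uint8_t *bitmap = stack_bitmap;
	size_t bitmap_len = ((size_t)n_fields + 7) / 8;
	size_t i;
	int ok = 1;

	/* Only descriptors with more than 128 fields need a heap bitmap. */
	if (bitmap_len > sizeof(stack_bitmap)) {
		bitmap = malloc(bitmap_len);
		if (bitmap == NULL)
			return -1;
	}
	memset(bitmap, 0, bitmap_len);

	/* Record each field that was present on the wire. */
	for (i = 0; i < n_seen; i++) {
		if (seen[i] < n_fields)
			bitmap[seen[i] / 8] |= (uint8_t)(1u << (seen[i] % 8));
	}

	/* Every required field must have its own bit set. */
	for (i = 0; i < n_required; i++) {
		unsigned idx = required[i];

		if (idx >= n_fields || !(bitmap[idx / 8] & (1u << (idx % 8)))) {
			ok = 0;
			break;
		}
	}

	if (bitmap != stack_bitmap)
		free(bitmap);
	return ok;
}
```

For a 129-field descriptor whose first and last fields are required, supplying only one of them is now reported as incomplete.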
protoc may not be on the default PATH, so augment $PATH with the
executable path registered by pkg-config for the protobuf package.
additionally declare PROTOC as a precious variable, thus allowing it to
be explicitly set by the user at ./configure time.
based on a patch from Andrei Nigmatulin.
the protobuf header files may be installed in a non-standard location
and thus we need to use the CFLAGS registered for protobuf in pkg-config
in order to find them.
based on a patch from Andrei Nigmatulin.
if pkg-config is installed, the libprotobuf-c .pc file will be
installed; if pkg-config is not installed, the .pc file won't be
installed.
this behavior only applies when we're building with ./configure
--disable-protoc, since pkg-config is required in order to detect the
protobuf dependency.
this is conditional on whether the linker supports version scripts, for
which we use the gl_LD_VERSION_SCRIPT macro from the gnulib project.
on platforms without version scripts, we fall back to libtool's
-export-symbols-regex.
it's possible for the <google/protobuf/compiler/> header files to be
shipped in a separate package (e.g., debian's libprotoc-dev). check for
this at configure time rather than allowing the build process to fail.
there is some confusion with regard to the use of lower case letters in
enum values. take the following message definition:
    message LowerCase {
      enum CaseEnum {
        UPPER = 1;
        lower = 2;
      }
      optional CaseEnum value = 1 [default = lower];
    }
this generates the following C enum:
    typedef enum _LowerCase__CaseEnum {
      LOWER_CASE__CASE_ENUM__UPPER = 1,
      LOWER_CASE__CASE_ENUM__lower = 2
      _PROTOBUF_C_FORCE_ENUM_TO_BE_INT_SIZE(LOWER_CASE__CASE_ENUM)
    } LowerCase__CaseEnum;
note that the case of the enum value 'lower' was preserved in the C
symbol name as 'LOWER_CASE__CASE_ENUM__lower', but that the _INIT macro
references the same enum value with the (non-existent) C symbol name
'LOWER_CASE__CASE_ENUM__LOWER':
    #define LOWER_CASE__INIT \
     { PROTOBUF_C_MESSAGE_INIT (&lower_case__descriptor) \
        , 0,LOWER_CASE__CASE_ENUM__LOWER }
additionally, the ProtobufCEnumValue array generated also refers to the
same enum value with the (non-existent) upper cased version:
    const ProtobufCEnumValue lower_case__case_enum__enum_values_by_number[2] =
    {
      { "UPPER", "LOWER_CASE__CASE_ENUM__UPPER", 1 },
      { "lower", "LOWER_CASE__CASE_ENUM__LOWER", 2 },
    };
we should preserve the existing behavior of copying the case from the
enum values in the message definition and fix up the places where the
(non-existent) upper case version is used, rather than changing the enum
definition itself to match the case used in the _INIT macro and
enum_values_by_number array, because it's possible that there might be
existing working code that uses enum values with lower case letters that
would be affected by such a change.
incidentally, google's C++ protobuf implementation preserves case in
enum values. protoc --cpp_out generates the following enum declaration
for the message descriptor above:
    enum LowerCase_CaseEnum {
      LowerCase_CaseEnum_UPPER = 1,
      LowerCase_CaseEnum_lower = 2
    };
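for illustration, a simplified and self-contained model of what consistent output could look like. the types LowerCaseSketch and EnumValueSketch and the macro LOWER_CASE_SKETCH__INIT are hypothetical stand-ins for the generated struct, ProtobufCEnumValue and the _INIT macro; this is not the generator's actual output.

```c
#include <string.h>

typedef enum {
	LOWER_CASE__CASE_ENUM__UPPER = 1,
	LOWER_CASE__CASE_ENUM__lower = 2
} LowerCase__CaseEnum;

/* Simplified stand-in for the generated message struct. */
typedef struct {
	int has_value;
	LowerCase__CaseEnum value;
} LowerCaseSketch;

/* The initializer spells the symbol exactly as the enum declares it. */
#define LOWER_CASE_SKETCH__INIT { 0, LOWER_CASE__CASE_ENUM__lower }

/* Simplified stand-in for ProtobufCEnumValue. */
typedef struct {
	const char *name;
	const char *c_name;
	int value;
} EnumValueSketch;

/* The c_name strings also preserve the case used in the .proto file. */
static const EnumValueSketch case_enum_values_by_number[2] = {
	{ "UPPER", "LOWER_CASE__CASE_ENUM__UPPER", 1 },
	{ "lower", "LOWER_CASE__CASE_ENUM__lower", 2 },
};
```

the point is only that all three places (enum declaration, initializer, value table) use the same spelling, so the initializer compiles and the table names a symbol that exists.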
Still need to add the comments in the source code. Currently I've
seeded it with the libprotobuf-c files. I've configured it
to make man pages and html pages. Might not be ideal, but makes it easy
for me to check things (html is nicer, but man pages are handier for
remote servers).
It’s important to note that, differently from what we’ve seen for
the serial test harness (see Parallel Test Harness), the
AM_TESTS_ENVIRONMENT and TESTS_ENVIRONMENT variables cannot be used
to define a custom test runner; the LOG_COMPILER and LOG_FLAGS (or
their extension-specific counterparts) should be used instead:
    ## This is WRONG!
    AM_TESTS_ENVIRONMENT = PERL5LIB='$(srcdir)/lib' $(PERL) -Mstrict -w

    ## Do this instead.
    AM_TESTS_ENVIRONMENT = PERL5LIB='$(srcdir)/lib'; export PERL5LIB;
    LOG_COMPILER = $(PERL)
    AM_LOG_FLAGS = -Mstrict -w
(http://www.gnu.org/software/automake/manual/html_node/Parallel-Test-Harness.html)
"As with the serial harness above, by default one status line is printed
per completed test, and a short summary after the suite has completed.
However, standard output and standard error of the test are redirected
to a per-test log file, so that parallel execution does not produce
intermingled output. The output from failed tests is collected in the
test-suite.log file. If the variable ‘VERBOSE’ is set, this file is
output after the summary."
(http://www.gnu.org/software/automake/manual/html_node/Parallel-Test-Harness.html)
Makefile.am: add valgrind to the AM_TESTS_ENVIRONMENT
configure.ac: enable valgrind testing option for ./configure
m4/valgrind-tests.m4: enable tracing children for libtool wrapper
script compatibility, but ignore standard binaries in /usr or /bin
it turned out to be a bad idea to not export the default allocator's
methods via the ProtobufCAllocator's function pointers. there is at
least one user (protobuf-c-rpc) that makes allocations by directly
invoking those methods.
according to microsoft's platform evangelist, "we recommend that you
consider using a different compiler such as Intel or gcc" if you need a
conforming C compiler. since there's already a project that maintains
the stdint.h / inttypes.h headers for microsoft compilers
(https://code.google.com/p/msinttypes/) there's not much point in
maintaining this ourselves.
there's not much point to having the "private" definitions split out
into a separate header file. they're still in the namespace and there's
nothing that can be done to prevent "unauthorized" uses. just integrate
the definitions into the main header file but put them in the bottom and
note that they're "private".
this makes it very slightly easier to copy the protobuf-c support
library into another project wholesale, since one less file is required.
i'm not quite sure what this thing is used for or how useful it could
possibly be. it's not used anywhere in protobuf-c or protobuf-c-rpc, and
it wasn't in protobuf-c 0.14 or earlier. just delete it; if anything
actually needs this kind of logic, it's easy enough to replicate it.
libraries should never generate output on their own to stdout/stderr.
remove the PRINT_UNPACK_ERRORS macro and rename UNPACK_ERROR to
PROTOBUF_C_UNPACK_ERROR.
the error strings are left in but compiled out by default. they could
theoretically be re-enabled for a debugging session by changing the
PROTOBUF_C_UNPACK_ERROR macro to something like:
    #define PROTOBUF_C_UNPACK_ERROR(...) do { fprintf(stderr, __VA_ARGS__); fputc('\n', stderr); } while (0)
some of these headers aren't used in the protobuf-c code base any more,
and in any case the results of these checks (the HAVE_*_H defines in
config.h) are not actually used anywhere and the absence of any of these
headers doesn't cause configure to fail, so just delete these useless
checks.
this reworks memory allocation throughout the support library.
the old DO_ALLOC macro had several problems:
1) only by reading the macro implementation is it possible to tell
what actually occurs. consider:
    DO_ALLOC(x, ...);
vs.:
    x = do_alloc(...);
only in the latter is it clear that x is being assigned to.
2) it looks like a typical macro/function call, except it alters the
control flow, usually by return'ing or executing a goto in the
enclosing function. this type of anti-pattern is explicitly called out
in the linux kernel coding style.
3) in one instance, setting the destination pointer to NULL is
actually a *success* return. in parse_required_member(), when parsing
a PROTOBUF_C_TYPE_BYTES wire field, it is possible that the field is
present but of zero length, in which case memory shouldn't be
allocated and nothing should actually be copied. this is not apparent
from reading:
    DO_ALLOC(bd->data, allocator, len - pref_len, return 0);
    memcpy(bd->data, data + pref_len, len - pref_len);
instead, make this behavior explicit:
    if (len - pref_len > 0) {
            bd->data = do_alloc(allocator, len - pref_len);
            if (bd->data == NULL)
                    return 0;
            memcpy(bd->data, data + pref_len, len - pref_len);
    }
this is much more readable and makes it possible to write a
replacement for DO_ALLOC which returns NULL on failures.
this changes the protobuf_c_default_allocator to contain only NULL
values; if a function pointer in this struct is NULL (i.e., no
replacement has been supplied), the default malloc/free implementation
is used. this makes it impossible to call the default allocator
functions directly and represents an API/ABI break, which required a
fix to the PROTOBUF_C_BUFFER_SIMPLE_CLEAR macro.
despite turning one-line allocations in the simple case:
    DO_ALLOC(rv, allocator, desc->sizeof_message, return NULL);
into three-line statements like:
    rv = do_alloc(allocator, desc->sizeof_message);
    if (!rv)
            return (NULL);
this changeset actually *reduces* the total number of lines in the
support library.
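a rough sketch of the shape this takes, under stated assumptions: the SketchAllocator struct, its field names, do_free() and copy_bytes() are illustrative and are not the library's real definitions. only the do_alloc() call shape and the zero-length handling follow the snippets above.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical allocator struct; field names are illustrative only. */
typedef struct {
	void *(*alloc)(void *allocator_data, size_t size);
	void (*free)(void *allocator_data, void *pointer);
	void *allocator_data;
} SketchAllocator;

/* Plain function: returns NULL on failure, never alters control flow. */
static void *do_alloc(const SketchAllocator *allocator, size_t size)
{
	if (allocator != NULL && allocator->alloc != NULL)
		return allocator->alloc(allocator->allocator_data, size);
	return malloc(size);	/* NULL pointer means "use the default" */
}

static void do_free(const SketchAllocator *allocator, void *data)
{
	if (data == NULL)
		return;
	if (allocator != NULL && allocator->free != NULL)
		allocator->free(allocator->allocator_data, data);
	else
		free(data);
}

/*
 * Returns 1 on success and 0 on allocation failure. For a zero-length
 * field nothing is allocated and *out stays NULL, which is a success.
 */
static int copy_bytes(const SketchAllocator *allocator,
		      const unsigned char *src, size_t len,
		      unsigned char **out)
{
	*out = NULL;
	if (len > 0) {
		*out = do_alloc(allocator, len);
		if (*out == NULL)
			return 0;
		memcpy(*out, src, len);
	}
	return 1;
}
```

the caller sees an ordinary assignment followed by an ordinary NULL check, and a NULL result for an empty field is distinguishable from a failure by the return value.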
in general, libraries shouldn't be responsible for terminating the
program if memory allocation fails. if we need to allocate memory and
can't, we should be returning a failure indicator, not providing a
strange interface to the user for receiving a callback in the event of
such an error.
also in general, libraries should never write to stdout or stderr.
this breaks the API/ABI and will require a note in the ChangeLog.
i'm confused as to why these fields exist, since the typical
implementation of a "temporary alloc" would be something like alloca(),
and alloca() is usually just inlined code that adjusts the stack
pointer, which is not a function whose address could be taken.
this breaks the API/ABI and will require a note in the ChangeLog.
possibly we could revisit the idea of "temporary allocations" by using
C99 variable length arrays. this would have the advantage of being
standardized, unlike alloca().
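as a sketch of what a VLA-based temporary could look like (count_distinct() is a hypothetical example function, not library code): note that VLAs became an optional feature in C11, that some compilers do not support them at all, and that a very large n_fields would still risk exhausting the stack.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical example: counts how many distinct field indices below
 * n_fields appear in `indices`, using a C99 variable length array as
 * the temporary bitmap instead of alloca() or a heap allocation.
 */
static size_t count_distinct(const unsigned *indices, size_t n,
			     unsigned n_fields)
{
	/* VLA sized at run time; the +1 avoids a zero-length array. */
	unsigned char bitmap[(size_t)n_fields / 8 + 1];
	size_t i, count = 0;

	memset(bitmap, 0, sizeof(bitmap));
	for (i = 0; i < n; i++) {
		unsigned idx = indices[i];

		if (idx >= n_fields)
			continue;	/* out of range, ignore */
		if (!(bitmap[idx / 8] & (1u << (idx % 8)))) {
			bitmap[idx / 8] |= (unsigned char)(1u << (idx % 8));
			count++;
		}
	}
	return count;
}
```

the array is released automatically when the function returns, so no "temporary free" callback is needed.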
this should silence Coverity #1153648, which complains because
tmp.length_prefix_len is uninitialized for certain wire types when
copied on line 2486:
    scanned_member_slabs[which_slab][in_slab_index++] = tmp;
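one common way to address this class of warning is to give every field a defined value before the struct is copied. the sketch below illustrates that idea only; the ScannedMemberSketch struct and scan_varint_member() are hypothetical simplifications, not the library's definitions or the exact change made here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the scanned member record. */
typedef struct {
	unsigned tag;
	unsigned char wire_type;
	unsigned char length_prefix_len; /* only used for length-prefixed data */
	size_t len;
} ScannedMemberSketch;

/* Builds a record for a wire type that has no length prefix. */
static ScannedMemberSketch scan_varint_member(unsigned tag, size_t len)
{
	ScannedMemberSketch tmp;

	/* Zero everything first so no field is copied uninitialized. */
	memset(&tmp, 0, sizeof(tmp));
	tmp.tag = tag;
	tmp.wire_type = 0;	/* varint */
	tmp.len = len;
	return tmp;
}
```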
dave's original style drives me crazy. reformat the C code in
protobuf-c/ with "indent -kr -i8" and manually reflow for readability.
try to fit most lines in 80 columns, but due to the lengthy type and
function names in protobuf-c, enforcing an 80 column rule would result
in a lot of cramped statements, so try to fit lines in up to 100 columns
if it would improve readability. (e.g., one <=100 column line is
probably better than 3-4 <=80 column lines.)
ultimately i'd like to adopt most of the recommendations in the linux
coding style: https://www.kernel.org/doc/Documentation/CodingStyle.
this commit gets us most of the kernel indentation and comment coding
style recommendations. later commits will tackle style recommendations
that require more intrusive changes: breaking up large functions,
replacing macros that affect control flow (e.g., DO_ALLOC). this will
hopefully facilitate review and make the code base easier to maintain.
i ran the old and new versions of protobuf-c.c through something like:
    gcc -S -D__PRETTY_FUNCTION__=0 -D__FILE__=0 -D__LINE__=0 -Wall -O0 \
        -o protobuf-c.S -c protobuf-c.c
and reviewed the diffs of the assembly output to spot any functions that
changed, and went back to make sure that any differences were
functionally equivalent.
of the same field on the wire (Fixes #91)
t/generated-code2/test-generated-code2.c: add a test case for merging
messages
t/test-full.proto: expand message definitions to test for merging nested
messages