mirror of
https://github.com/protobuf-c/protobuf-c.git
synced 2025-01-02 01:18:08 +08:00
384 lines
12 KiB
XML
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
 "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" >
<article>
<title>The C Code Generator</title>

<section>
<title>Design</title>

<para>The overall goal is to keep the code generator as simple
as possible. Hopefully performance is not sacrificed to that end.</para>

<para>We generate very little code: mostly
structure definitions (for example, enums and structures
for messages) and some metadata, which is essentially
reflection-type data.</para>

<para>Serialization and deserialization are implemented in a library,
libprotobuf-c, rather than in generated code.</para>

</section>
<section>
<title>The Generated Code</title>
<para>
For each enum, we generate a C enum.
For each message, we generate a C structure
which can be cast to a <type>ProtobufCMessage</type>.
</para>
<para>
For each enum and message, we generate a descriptor
object that allows us to implement a kind of reflection
on the structures.
</para>
<para>First, some naming conventions:
<itemizedlist>
<listitem><para>
The type names for enums, messages, and services
are camel case (meaning WordsAreCrammedTogether),
except that double underscores are used to delimit
scopes. For example:
<programlisting><![CDATA[
package foo.bar;
message BazBah {
  required int32 val = 1;
}
]]></programlisting>
would generate a C type <type>Foo__Bar__BazBah</type>.</para>
</listitem><listitem>
<para>Functions and globals are all lowercase, with camel-case
words separated by single underscores.
For example:
<programlisting><![CDATA[
Foo__Bar__BazBah *foo__bar__baz_bah__unpack
                     (ProtobufCAllocator *allocator,
                      size_t length,
                      const unsigned char *data);
]]></programlisting>
</para>
</listitem><listitem>
<para>Enum values are all uppercase.</para>
</listitem>
<listitem><para>
Anything we add to your symbol names is also
separated by a double underscore, for example
the <function>__unpack</function> suffix above.</para></listitem>
</itemizedlist>
</para>
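<para>
The naming rules above can be sketched as two small C helpers.
This is only an illustration: <function>mangle_type_name</function> and
<function>mangle_symbol_prefix</function> are hypothetical names, not part of
protobuf-c, and the real code generator may treat edge cases (digits,
runs of capitals) differently.
</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: build the C type name for a message, e.g.
 * ("foo.bar", "BazBah") -> "Foo__Bar__BazBah".  Each package component
 * is capitalized and scopes are joined with a double underscore.
 * Returns 0 on success, -1 if `out` is too small (contents unspecified). */
static int mangle_type_name(const char *package, const char *message,
                            char *out, size_t out_size)
{
    size_t n = 0;
    int start_of_word = 1;

    assert(package != NULL && message != NULL && out != NULL);
    for (const char *p = package; *p != '\0'; p++) {
        if (*p == '.') {                       /* scope separator */
            if (n + 2 >= out_size) return -1;
            out[n++] = '_';
            out[n++] = '_';
            start_of_word = 1;
        } else {
            if (n + 1 >= out_size) return -1;
            out[n++] = start_of_word ? (char) toupper((unsigned char) *p) : *p;
            start_of_word = 0;
        }
    }
    if (n > 0) {                               /* package / message separator */
        if (n + 2 >= out_size) return -1;
        out[n++] = '_';
        out[n++] = '_';
    }
    size_t len = strlen(message);
    if (n + len >= out_size) return -1;
    memcpy(out + n, message, len + 1);         /* copies the trailing NUL too */
    return 0;
}

/* Hypothetical helper: derive the lowercase prefix used for functions and
 * globals, e.g. "Foo__Bar__BazBah" -> "foo__bar__baz_bah".  A single
 * underscore is inserted before each capital that starts a new word. */
static int mangle_symbol_prefix(const char *type_name,
                                char *out, size_t out_size)
{
    size_t n = 0;

    assert(type_name != NULL && out != NULL);
    for (const char *p = type_name; *p != '\0'; p++) {
        unsigned char c = (unsigned char) *p;
        if (isupper(c) && p != type_name && p[-1] != '_') {
            if (n + 1 >= out_size) return -1;
            out[n++] = '_';
        }
        if (n + 1 >= out_size) return -1;
        out[n++] = (char) tolower(c);
    }
    if (n >= out_size) return -1;
    out[n] = '\0';
    return 0;
}
```
]]></programlisting>
<para>
With these helpers, the package <literal>foo.bar</literal> and message
<literal>BazBah</literal> yield <type>Foo__Bar__BazBah</type>, and the derived
prefix is <literal>foo__bar__baz_bah</literal>, to which a suffix such as
<literal>__unpack</literal> would then be appended.
</para>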
<para>
We also generate descriptor objects for messages
and enums. These are declared in the .h files:
<programlisting><![CDATA[
extern const ProtobufCMessageDescriptor
   foo__bar__baz_bah__descriptor;
]]></programlisting>
</para>
<para>
The message structures all begin with <type>ProtobufCMessageDescriptor*</type>
which is sufficient to allow them to be cast to <type>ProtobufCMessage</type>.
</para>
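<para>
The following sketch shows why a leading descriptor pointer is enough for
generic code. The <type>Demo*</type> types are stand-ins invented for this
example, not the real protobuf-c declarations. The sketch embeds the base
structure as the first member, which is one way to make the cast well
defined in C.
</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in declarations (hypothetical; the real ones live in the
 * protobuf-c header). */
typedef struct { const char *name; } DemoDescriptor;
typedef struct { const DemoDescriptor *descriptor; } DemoMessage;

/* Shaped like a generated message: the base, and therefore the
 * descriptor pointer, comes first. */
typedef struct {
    DemoMessage base;
    int32_t val;
} DemoBazBah;

static const DemoDescriptor demo_baz_bah_descriptor = { "foo.bar.BazBah" };

/* Generic code only needs the common prefix to identify a message. */
static const char *demo_message_name(const DemoMessage *message)
{
    assert(message != NULL && message->descriptor != NULL);
    return message->descriptor->name;
}
```
]]></programlisting>
<para>
A <type>DemoBazBah *</type> can then be passed as
<literal>(const DemoMessage *) &amp;msg</literal>, because a pointer to a
structure also points to its first member.
</para>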
<para>
We generate some functions for each message:
<itemizedlist>
<listitem>
<para><function>unpack()</function>. Unpack data for a particular
message format:
<programlisting><![CDATA[
Foo__Bar__BazBah *
foo__bar__baz_bah__unpack  (ProtobufCAllocator *allocator,
                            size_t length,
                            const unsigned char *data);
]]></programlisting>
Note that <parameter>allocator</parameter> may be NULL,
in which case the default allocator is used.
</para>
</listitem>
<listitem>
<para><function>free_unpacked()</function>. Free a message
that you obtained with the unpack method:
<programlisting><![CDATA[
void
foo__bar__baz_bah__free_unpacked  (Foo__Bar__BazBah *baz_bah,
                                   ProtobufCAllocator *allocator);
]]></programlisting>
</para>
</listitem>
<listitem>
<para><function>get_packed_size()</function>. Find how long
the serialized representation of the data will be:
<programlisting><![CDATA[
size_t
foo__bar__baz_bah__get_packed_size
                    (const Foo__Bar__BazBah *message);
]]></programlisting>
</para>
</listitem>
<listitem>
<para><function>pack()</function>. Pack a message
into a buffer; assumes that the buffer is long enough
(use <function>get_packed_size()</function> first!).
<programlisting><![CDATA[
size_t
foo__bar__baz_bah__pack
                    (const Foo__Bar__BazBah *message,
                     unsigned char *packed_data_out);
]]></programlisting>
</para>
</listitem>
<listitem>
<para><function>pack_to_buffer()</function>. Pack a message
into a virtual buffer.
<programlisting><![CDATA[
size_t
foo__bar__baz_bah__pack_to_buffer
                    (const Foo__Bar__BazBah *message,
                     ProtobufCBuffer *buffer);
]]></programlisting>
</para>
</listitem>
</itemizedlist>
</para>

</section>

<section>
<title>The protobuf-c Library</title>

<para>This library is used by the generated code;
it includes common structures and enums,
as well as functions that most users of the generated code
will want.</para>

<para>
There are three main components:
<orderedlist>
<listitem><para>the Descriptor structures</para></listitem>
<listitem><para>helper structures and objects</para></listitem>
<listitem><para>packing and unpacking code</para></listitem>
</orderedlist>
</para>

</section>
<section>
<title>protobuf-c: the Descriptor structures</title>

<para>For example, enums are described in terms of structures:

<programlisting><![CDATA[
struct _ProtobufCEnumValue
{
  const char *name;
  const char *c_name;
  int value;
};

struct _ProtobufCEnumDescriptor
{
  const char *name;
  const char *short_name;
  const char *package_name;

  /* sorted by value */
  unsigned n_values;
  const ProtobufCEnumValue *values;

  /* sorted by name */
  unsigned n_value_names;
  const ProtobufCEnumValue *values_by_name;
};
]]></programlisting></para>
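<para>
Because the two arrays are sorted, lookups in either direction can use
binary search. The sketch below assumes trimmed stand-in structures
(<type>DemoEnumValue</type> and <type>DemoEnumDescriptor</type> are
hypothetical names that mirror the layout above) and is not the
library's own lookup code.
</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins mirroring the structures above, trimmed to what the
 * lookups need (hypothetical names). */
typedef struct {
    const char *name;
    const char *c_name;
    int value;
} DemoEnumValue;

typedef struct {
    const char *name;
    unsigned n_values;
    const DemoEnumValue *values;           /* sorted by value */
    unsigned n_value_names;
    const DemoEnumValue *values_by_name;   /* sorted by name */
} DemoEnumDescriptor;

/* Binary search over the value-sorted array; NULL if not found. */
static const DemoEnumValue *
demo_enum_by_value(const DemoEnumDescriptor *d, int value)
{
    unsigned lo = 0, hi;

    assert(d != NULL);
    hi = d->n_values;                      /* search the range [lo, hi) */
    while (lo < hi) {
        unsigned mid = lo + (hi - lo) / 2;
        if (d->values[mid].value == value)
            return &d->values[mid];
        if (d->values[mid].value < value)
            lo = mid + 1;
        else
            hi = mid;
    }
    return NULL;
}

/* Binary search over the name-sorted array; NULL if not found. */
static const DemoEnumValue *
demo_enum_by_name(const DemoEnumDescriptor *d, const char *name)
{
    unsigned lo = 0, hi;

    assert(d != NULL && name != NULL);
    hi = d->n_value_names;
    while (lo < hi) {
        unsigned mid = lo + (hi - lo) / 2;
        int cmp = strcmp(d->values_by_name[mid].name, name);
        if (cmp == 0)
            return &d->values_by_name[mid];
        if (cmp < 0)
            lo = mid + 1;
        else
            hi = mid;
    }
    return NULL;
}
```
]]></programlisting>
<para>
Keeping two sorted views trades a little static data for logarithmic
lookups both when parsing (by value) and when printing or reading
text (by name).
</para>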

<para>Likewise, messages are described by:

<programlisting><![CDATA[
struct _ProtobufCFieldDescriptor
{
  const char *name;
  int id;
  ProtobufCFieldLabel label;
  ProtobufCFieldType type;
  unsigned quantifier_offset;
  unsigned offset;
  void *descriptor;   /* for MESSAGE and ENUM types */
};
struct _ProtobufCMessageDescriptor
{
  const char *name;
  const char *short_name;
  const char *package_name;

  /* sorted by field-id */
  unsigned n_fields;
  const ProtobufCFieldDescriptor *fields;
};
]]></programlisting></para>
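<para>
The <structfield>offset</structfield> member is what makes reflection
possible: generic code can find a field by id and then read it at that byte
offset inside the message structure. A minimal sketch, assuming stand-in
types (all <type>Demo*</type> names are hypothetical) and only
<type>int32_t</type> fields:
</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Trimmed stand-ins for the descriptors above (hypothetical names). */
typedef struct {
    const char *name;
    int id;
    unsigned offset;     /* byte offset of the member in the message */
} DemoFieldDescriptor;

typedef struct {
    const char *name;
    unsigned n_fields;
    const DemoFieldDescriptor *fields;   /* sorted by field id */
} DemoMessageDescriptor;

typedef struct {
    const DemoMessageDescriptor *descriptor;
    int32_t val;
    int32_t count;
} DemoBazBah;

static const DemoFieldDescriptor demo_baz_bah_fields[] = {
    { "val",   1, offsetof(DemoBazBah, val) },
    { "count", 2, offsetof(DemoBazBah, count) },
};

static const DemoMessageDescriptor demo_baz_bah_descriptor = {
    "foo.bar.BazBah", 2, demo_baz_bah_fields
};

/* Binary search by field id; NULL if the message has no such field. */
static const DemoFieldDescriptor *
demo_find_field(const DemoMessageDescriptor *d, int id)
{
    unsigned lo = 0, hi = d->n_fields;

    while (lo < hi) {
        unsigned mid = lo + (hi - lo) / 2;
        if (d->fields[mid].id == id)
            return &d->fields[mid];
        if (d->fields[mid].id < id)
            lo = mid + 1;
        else
            hi = mid;
    }
    return NULL;
}

/* Read an int32 field without knowing the concrete message type.
 * Returns 0 on success, -1 if the field id is unknown. */
static int demo_get_int32(const void *message,
                          const DemoMessageDescriptor *d,
                          int id, int32_t *out)
{
    const DemoFieldDescriptor *f;

    assert(message != NULL && d != NULL && out != NULL);
    f = demo_find_field(d, id);
    if (f == NULL)
        return -1;
    memcpy(out, (const char *) message + f->offset, sizeof *out);
    return 0;
}
```
]]></programlisting>
<para>
A real implementation would also consult <structfield>type</structfield>
and <structfield>label</structfield> before deciding how many bytes to read.
</para>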

<para>
And finally services are described by:

<programlisting><![CDATA[
struct _ProtobufCMethodDescriptor
{
  const char *name;
  const ProtobufCMessageDescriptor *input;
  const ProtobufCMessageDescriptor *output;
};
struct _ProtobufCServiceDescriptor
{
  const char *name;
  unsigned n_methods;
  ProtobufCMethodDescriptor *methods;    // sorted by name
};
]]></programlisting></para>

</section>
<section>
<title>protobuf-c: helper structures and typedefs</title>

<para>We define typedefs for a few types
which are used in .proto files but do not
have obvious standard C equivalents:
<itemizedlist>
<listitem><para>a boolean type (<type>protobuf_c_boolean</type>)</para></listitem>
<listitem><para>a binary-data (bytes) type (<type>ProtobufCBinaryData</type>)</para></listitem>
<listitem><para>the various int types (<type>int32_t</type>, <type>uint32_t</type>, <type>int64_t</type>, <type>uint64_t</type>)
are obtained by including <filename>inttypes.h</filename></para></listitem>
</itemizedlist>
</para>

<para>We also define a simple allocator object, <type>ProtobufCAllocator</type>,
that lets you control how allocations are done.
This is predominantly used for parsing.</para>

<para>There is a virtual buffer facility that
only has to implement a method to append binary data
to the buffer. This can be used to serialize messages
to different targets (instead of a flat slab of data).</para>

<para>We define a base type for all messages,
for code that handles messages generically.
All it has is the descriptor object.</para>

<section id="buffers">
<title>Buffers</title>
<para>One important helper type is the <type>ProtobufCBuffer</type>,
which allows you to abstract the target of serialization. The only
thing that a buffer has is an <function>append</function> method:
<programlisting><![CDATA[
struct _ProtobufCBuffer
{
  void (*append)(ProtobufCBuffer *buffer,
                 size_t len,
                 const unsigned char *data);
};
]]></programlisting>
ProtobufCBuffer subclasses are often defined on the stack.
</para>

<para>
For example, to write to a <type>FILE</type> you could make:
<programlisting><![CDATA[
typedef struct
{
  ProtobufCBuffer base;
  FILE *fp;
} BufferAppendToFile;

static void my_buffer_file_append (ProtobufCBuffer *buffer,
                                   size_t len,
                                   const unsigned char *data)
{
  /* base is the first member, so the buffer pointer can be cast back */
  BufferAppendToFile *file_buf = (BufferAppendToFile *) buffer;
  fwrite (data, len, 1, file_buf->fp); // XXX: no error handling!
}
]]></programlisting>
</para>

<para>
To use this new type of buffer, you would do something like:
<programlisting><![CDATA[
...
BufferAppendToFile tmp;
tmp.base.append = my_buffer_file_append;
tmp.fp = fp;
protobuf_c_message_pack_to_buffer (&message, &tmp.base);
...
]]></programlisting>
</para>
<para>
A commonly used built-in subtype is the BufferSimple,
which is declared on the stack and uses a scratch buffer provided by the user
for its initial allocation. It does exponential resizing.
To create a BufferSimple, use code like:
<programlisting><![CDATA[
unsigned char pad[128];
ProtobufCBufferSimple buf = PROTOBUF_C_BUFFER_SIMPLE_INIT (pad);
ProtobufCBuffer *buffer = (ProtobufCBuffer *) &buf;
protobuf_c_buffer_append (buffer, 6, (unsigned char *) "hi mom");
]]></programlisting>
You can access the data as buf.len and buf.data. For example,
<programlisting><![CDATA[
assert (buf.len == 6);
assert (memcmp (buf.data, "hi mom", 6) == 0);
]]></programlisting>
To finish up, use:
<programlisting><![CDATA[
PROTOBUF_C_BUFFER_SIMPLE_CLEAR (&buf);
]]></programlisting>
</para>
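<para>
To make the "scratch buffer first, then exponential resizing" behaviour
concrete, here is a self-contained sketch of such a buffer. It is an
illustration under assumptions, not the library's implementation: every
<type>Demo*</type> type and <function>demo_*</function> function is a
hypothetical stand-in, and allocation failure is handled only by dropping
the append.
</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the virtual buffer base type (hypothetical). */
typedef struct DemoBuffer DemoBuffer;
struct DemoBuffer {
    void (*append)(DemoBuffer *buffer, size_t len, const unsigned char *data);
};

typedef struct {
    DemoBuffer base;        /* first member, so it can act as a DemoBuffer */
    size_t alloced;         /* capacity of data */
    size_t len;             /* bytes used so far */
    unsigned char *data;
    int must_free_data;     /* nonzero once data lives on the heap */
} DemoBufferSimple;

static void demo_buffer_simple_append(DemoBuffer *buffer, size_t len,
                                      const unsigned char *data)
{
    DemoBufferSimple *simple = (DemoBufferSimple *) buffer;
    size_t needed;

    assert(buffer != NULL);
    if (len == 0)
        return;
    needed = simple->len + len;
    if (needed > simple->alloced) {
        /* Double the capacity until the new data fits. */
        size_t new_alloced = simple->alloced ? simple->alloced * 2 : 16;
        unsigned char *new_data;

        while (new_alloced < needed)
            new_alloced *= 2;
        new_data = malloc(new_alloced);
        if (new_data == NULL)
            return;         /* sketch only: the append is silently dropped */
        if (simple->len > 0)
            memcpy(new_data, simple->data, simple->len);
        if (simple->must_free_data)
            free(simple->data);   /* never free the caller's scratch space */
        simple->data = new_data;
        simple->alloced = new_alloced;
        simple->must_free_data = 1;
    }
    memcpy(simple->data + simple->len, data, len);
    simple->len = needed;
}

/* Start out on caller-provided scratch space (typically on the stack). */
static void demo_buffer_simple_init(DemoBufferSimple *simple,
                                    unsigned char *scratch, size_t scratch_size)
{
    simple->base.append = demo_buffer_simple_append;
    simple->alloced = scratch_size;
    simple->len = 0;
    simple->data = scratch;
    simple->must_free_data = 0;
}

/* Release heap storage, if the buffer ever outgrew its scratch space. */
static void demo_buffer_simple_clear(DemoBufferSimple *simple)
{
    if (simple->must_free_data)
        free(simple->data);
    simple->data = NULL;
    simple->alloced = 0;
    simple->len = 0;
    simple->must_free_data = 0;
}
```
]]></programlisting>
<para>
Small messages never touch the heap in this scheme, while doubling keeps
the total copying cost proportional to the final size for large ones.
</para>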
</section>
</section>
<section>
<title>protobuf-c: packing and unpacking messages</title>

<para>
To pack a message, first compute its packed size,
then provide a buffer of at least that size to pack into.
<programlisting><![CDATA[
size_t protobuf_c_message_get_packed_size
                                (ProtobufCMessage *message);
void   protobuf_c_message_pack  (ProtobufCMessage *message,
                                 unsigned char *out);
]]></programlisting>
</para>
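<para>
The two-step pattern (measure, allocate, pack) can be sketched without the
library for a message that has a single <type>int32</type> field. Everything
named <function>demo_*</function> below is a hypothetical stand-in, the field
number 1 is an assumption, and the encoding follows the standard Protocol
Buffers varint wire format rather than any code taken from protobuf-c.
</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in message with one int32 field, assumed to be field number 1. */
typedef struct { int32_t val; } DemoBazBah;

/* Number of bytes needed to encode v as a base-128 varint. */
static size_t varint_size(uint64_t v)
{
    size_t n = 1;
    while (v >= 0x80) {
        v >>= 7;
        n++;
    }
    return n;
}

/* Write v as a varint (7 bits per byte, low bits first, high bit set on
 * every byte except the last); returns the number of bytes written. */
static size_t varint_write(uint64_t v, unsigned char *out)
{
    size_t n = 0;
    while (v >= 0x80) {
        out[n++] = (unsigned char) (v | 0x80);
        v >>= 7;
    }
    out[n++] = (unsigned char) v;
    return n;
}

/* Step 1: measure.  One key byte plus the value; negative int32 values
 * are sign-extended to 64 bits before encoding. */
static size_t demo_get_packed_size(const DemoBazBah *m)
{
    return 1 + varint_size((uint64_t) (int64_t) m->val);
}

/* Step 2: pack into a buffer the caller has already sized. */
static size_t demo_pack(const DemoBazBah *m, unsigned char *out)
{
    out[0] = (1 << 3) | 0;          /* field number 1, varint wire type */
    return 1 + varint_write((uint64_t) (int64_t) m->val, out + 1);
}

/* Caller-side pattern: measure, allocate, pack.  The caller frees the
 * result with free(). */
static unsigned char *demo_pack_alloc(const DemoBazBah *m, size_t *len_out)
{
    size_t len = demo_get_packed_size(m);
    unsigned char *buf = malloc(len);
    size_t written;

    if (buf == NULL)
        return NULL;
    written = demo_pack(m, buf);
    assert(written == len);         /* the two passes must agree */
    (void) written;
    *len_out = len;
    return buf;
}
```
]]></programlisting>
<para>
Measuring first means the pack step never has to grow or bounds-check its
output, at the cost of walking the message twice.
</para>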

<para>
Or you can use the "streaming" approach:
<programlisting><![CDATA[
void   protobuf_c_message_pack_to_buffer
                                (ProtobufCMessage *message,
                                 ProtobufCBuffer  *buffer);
]]></programlisting>
where <type>ProtobufCBuffer</type> is a base object with an append method.
See <xref linkend="buffers" />.
</para>

<para>
To unpack messages, simply call
<programlisting><![CDATA[
ProtobufCMessage *
protobuf_c_message_unpack (const ProtobufCMessageDescriptor *,
                           ProtobufCAllocator *allocator,
                           size_t len,
                           const unsigned char *data);
]]></programlisting>
If you pass NULL for <parameter>allocator</parameter>, then
the default allocator will be used.
</para>

<para>
You can cast the result to the type that matches
the descriptor.
</para>

<para>
The result of unpacking should be freed with <function>protobuf_c_message_free()</function>.
</para>

</section>
<section>
<title>Author</title>
<para>Dave Benson.</para>
</section>
</article>