Benchmarks
Hardware: Intel(R) Xeon(R) CPU E3-1225 V2 @ 3.20GHz
Software: Windows 10, MSVC 2017, MinGW GCC 7.2.0
Time unit: milliseconds (unless explicitly specified)
EventQueue enqueue and process -- single threading
Iterations | Queue size | Event count | Event Types | Listener count | Time of single threading | Time of multi threading |
---|---|---|---|---|---|---|
100k | 100 | 10M | 100 | 100 | 401 | 1146 |
100k | 1000 | 100M | 100 | 100 | 4012 | 11467 |
100k | 1000 | 100M | 1000 | 1000 | 4102 | 11600 |
Given eventpp::EventQueue<size_t, void (size_t), Policies>, where Policies is either single threading or multi threading, the benchmark adds Listener count listeners to the queue; each listener is an empty lambda. Then the benchmark starts timing. It loops Iterations times. In each loop, the benchmark enqueues Queue size events, then processes the event queue.
There are Event types kinds of event type. Event count is Iterations * Queue size.
The EventQueue is processed in one thread. "Single threading" and "multi threading" in the table refer to the policies used, not to the number of threads.
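The benchmark loop described above can be sketched roughly as follows. This is a minimal stand-in written with the standard library only, not eventpp's implementation: `MiniQueue`, `NoOpMutex`, `SingleThreadingPolicy`, `MultiThreadingPolicy`, and `runBenchmark` are hypothetical names, and the assumption that listener `i` is attached to event type `i % eventTypes` is ours, since the text does not say how listeners are distributed. Unlike the real benchmark, the listeners here increment a counter so the number of dispatches can be checked.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical policies: each one only selects the mutex type the queue uses.
struct NoOpMutex
{
    void lock() {}
    void unlock() {}
};
struct SingleThreadingPolicy { using Mutex = NoOpMutex; };
struct MultiThreadingPolicy { using Mutex = std::mutex; };

// Hypothetical stand-in for an event queue; this is NOT eventpp's implementation.
template <typename Policy>
class MiniQueue
{
public:
    using Callback = std::function<void (std::size_t)>;

    void appendListener(const std::size_t event, Callback callback)
    {
        std::lock_guard<typename Policy::Mutex> lock(mutex);
        listeners[event].push_back(std::move(callback));
    }

    void enqueue(const std::size_t event)
    {
        std::lock_guard<typename Policy::Mutex> lock(mutex);
        pending.push_back(event);
    }

    // Takes all queued events under the lock, then dispatches them.
    // Listeners are read without the lock, which is only safe here because
    // no listener is added while the queue is being processed.
    void process()
    {
        std::deque<std::size_t> batch;
        {
            std::lock_guard<typename Policy::Mutex> lock(mutex);
            batch.swap(pending);
        }
        for(const std::size_t event : batch) {
            const auto it = listeners.find(event);
            if(it == listeners.end()) {
                continue;
            }
            for(auto & callback : it->second) {
                callback(event);
            }
        }
    }

private:
    typename Policy::Mutex mutex;
    std::unordered_map<std::size_t, std::vector<Callback>> listeners;
    std::deque<std::size_t> pending;
};

struct BenchmarkResult
{
    std::size_t dispatched;   // number of listener invocations
    long long milliseconds;   // elapsed time of the timed loop
};

template <typename Policy>
BenchmarkResult runBenchmark(
    const std::size_t iterations,
    const std::size_t queueSize,
    const std::size_t eventTypes,
    const std::size_t listenerCount)
{
    assert(eventTypes > 0);
    MiniQueue<Policy> queue;
    std::size_t dispatched = 0;
    // Assumption: listener i listens to event type i % eventTypes.
    for(std::size_t i = 0; i < listenerCount; ++i) {
        queue.appendListener(i % eventTypes, [&dispatched](std::size_t) { ++dispatched; });
    }

    // Timing starts only after the listeners are registered.
    const auto start = std::chrono::steady_clock::now();
    for(std::size_t iteration = 0; iteration < iterations; ++iteration) {
        for(std::size_t i = 0; i < queueSize; ++i) {
            queue.enqueue(i % eventTypes);
        }
        queue.process();
    }
    const auto end = std::chrono::steady_clock::now();

    const auto elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(end - start);
    return BenchmarkResult{ dispatched, static_cast<long long>(elapsed.count()) };
}
```

Calling `runBenchmark<SingleThreadingPolicy>(100000, 100, 100, 100)` would mirror the first table row in shape. The timings of this stand-in are not comparable to the figures in the table.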
EventQueue enqueue and process -- multiple threading
Enqueue threads | Process threads | Event count | Event Types | Listener count | Time |
---|---|---|---|---|---|
1 | 1 | 10M | 100 | 100 | 2387 |
1 | 1 | 100M | 100 | 100 | 23656 |
1 | 3 | 10M | 100 | 100 | 3755 |
1 | 3 | 100M | 100 | 100 | 37983 |
2 | 2 | 10M | 100 | 100 | 4323 |
2 | 2 | 100M | 100 | 100 | 42263 |
There are Enqueue threads threads enqueuing events to the queue, and Process threads threads processing the events. The total event count is Event count.
The multi threading version is slower than the previous single threading version because the mutex locks cost time.
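To illustrate where the locking cost comes from, here is a minimal sketch of several enqueue threads and several process threads sharing one mutex-protected queue. It uses only the standard library and is not how eventpp is implemented; `runThreadedBenchmark` and every name inside it are hypothetical. Every enqueue and every dequeue takes the same mutex, so the threads spend part of their time waiting on each other.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical sketch: returns the number of events that were processed.
std::size_t runThreadedBenchmark(
    const std::size_t enqueueThreads,
    const std::size_t processThreads,
    const std::size_t eventCount)
{
    assert(enqueueThreads > 0 && processThreads > 0);

    std::mutex mutex;
    std::condition_variable condition;
    std::deque<std::size_t> pending;            // guarded by mutex
    std::size_t producersLeft = enqueueThreads; // guarded by mutex
    std::atomic<std::size_t> processed{0};

    std::vector<std::thread> threads;

    for(std::size_t t = 0; t < enqueueThreads; ++t) {
        // Split the events as evenly as possible among the enqueue threads.
        const std::size_t share = eventCount / enqueueThreads
            + (t < eventCount % enqueueThreads ? 1 : 0);
        threads.emplace_back([&, share]() {
            for(std::size_t i = 0; i < share; ++i) {
                {
                    std::lock_guard<std::mutex> lock(mutex);
                    pending.push_back(i);
                }
                condition.notify_one();
            }
            {
                std::lock_guard<std::mutex> lock(mutex);
                --producersLeft;
            }
            // Wake all process threads so they can notice that enqueuing has finished.
            condition.notify_all();
        });
    }

    for(std::size_t t = 0; t < processThreads; ++t) {
        threads.emplace_back([&]() {
            for(;;) {
                std::unique_lock<std::mutex> lock(mutex);
                condition.wait(lock, [&]() {
                    return ! pending.empty() || producersLeft == 0;
                });
                if(pending.empty()) {
                    return; // every enqueue thread is done and nothing is left
                }
                pending.pop_front();
                lock.unlock();
                ++processed; // stands in for invoking the (empty) listeners
            }
        });
    }

    for(auto & thread : threads) {
        thread.join();
    }
    return processed.load();
}
```

For example, `runThreadedBenchmark(2, 2, 10000000)` has the same shape as the "2 | 2 | 10M" row, although its timing would not match the table.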
CallbackList append/remove callbacks
The benchmark loops 100K times. In each loop it appends 1000 empty callbacks to a CallbackList, then removes all 1000 callbacks. So there are 100M append/remove operations in total, counting each append and its matching remove as one operation.
The total benchmarked time is about 21000 milliseconds. That is to say, about 4800 append/remove operations (roughly 5000) complete in 1 millisecond.
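The throughput figure follows directly from the two numbers above. As a small worked check, with hypothetical helper names:

```cpp
#include <cstdint>

// Total operations: one append plus its matching remove counts as one operation.
constexpr std::int64_t totalOperations(const std::int64_t loops, const std::int64_t callbacksPerLoop)
{
    return loops * callbacksPerLoop;
}

// Average number of operations completed per millisecond.
constexpr double operationsPerMillisecond(const std::int64_t operations, const double milliseconds)
{
    return static_cast<double>(operations) / milliseconds;
}

// 100K loops * 1000 callbacks = 100M operations.
static_assert(totalOperations(100000, 1000) == 100000000, "100K loops of 1000 callbacks is 100M operations");
```

`operationsPerMillisecond(totalOperations(100000, 1000), 21000.0)` evaluates to about 4762, which the text rounds to 5000.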
CallbackList invoking VS native function invoking
Iterations: 100,000,000
Function | Compiler | Native invoking | CallbackList single threading | CallbackList multi threading |
---|---|---|---|---|
Inline global function | MSVC 2017 | 217 | 1501 | 6921 |
Inline global function | GCC 7.2 | 187 | 1489 | 4463 |
Non-inline global function | MSVC 2017 | 241 | 1526 | 6544 |
Non-inline global function | GCC 7.2 | 233 | 1488 | 4787 |
Function object | MSVC 2017 | 194 | 1498 | 6433 |
Function object | GCC 7.2 | 212 | 1485 | 4951 |
Member virtual function | MSVC 2017 | 207 | 1533 | 6558 |
Member virtual function | GCC 7.2 | 212 | 1485 | 4489 |
Member non-virtual function | MSVC 2017 | 214 | 1533 | 6390 |
Member non-virtual function | GCC 7.2 | 211 | 1486 | 4872 |
Member non-inline virtual function | MSVC 2017 | 206 | 1522 | 6578 |
Member non-inline virtual function | GCC 7.2 | 182 | 1666 | 4593 |
Member non-inline non-virtual function | MSVC 2017 | 206 | 1491 | 6992 |
Member non-inline non-virtual function | GCC 7.2 | 205 | 1486 | 4490 |
All functions | MSVC 2017 | 1374 | 10951 | 29973 |
All functions | GCC 7.2 | 1223 | 9770 | 22958 |
Testing functions
#if defined(_MSC_VER)
#define NON_INLINE __declspec(noinline)
#else
// gcc
#define NON_INLINE __attribute__((noinline))
#endif

volatile int globalValue = 0;

void globalFunction(int a, const int b)
{
    globalValue += a + b;
}

NON_INLINE void nonInlineGlobalFunction(int a, const int b)
{
    globalValue += a + b;
}

struct FunctionObject
{
    void operator() (int a, const int b)
    {
        globalValue += a + b;
    }

    virtual void virFunc(int a, const int b)
    {
        globalValue += a + b;
    }

    void nonVirFunc(int a, const int b)
    {
        globalValue += a + b;
    }

    NON_INLINE virtual void nonInlineVirFunc(int a, const int b)
    {
        globalValue += a + b;
    }

    NON_INLINE void nonInlineNonVirFunc(int a, const int b)
    {
        globalValue += a + b;
    }
};

#undef NON_INLINE
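A harness for comparing a native call with a call through a callback list might look like the sketch below. It is an assumption about the general shape of such a measurement, not the benchmark's actual code: a `std::vector` of `std::function` stands in for CallbackList, and `sinkValue`, `sampleFunction`, `timeMilliseconds`, `timeNative`, and `timeCallbackList` are hypothetical names. As in the testing functions above, the callee writes to a volatile variable so the compiler cannot optimize the calls away.

```cpp
#include <chrono>
#include <functional>
#include <vector>

// Plays the role of globalValue: a volatile sink that keeps the calls observable.
volatile int sinkValue = 0;

void sampleFunction(int a, const int b)
{
    sinkValue = sinkValue + a + b;
}

// Runs `invoke` `iterations` times and returns the elapsed time in milliseconds.
template <typename F>
long long timeMilliseconds(const long iterations, F && invoke)
{
    const auto start = std::chrono::steady_clock::now();
    for(long i = 0; i < iterations; ++i) {
        invoke();
    }
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count();
}

// Direct call to the function.
long long timeNative(const long iterations)
{
    return timeMilliseconds(iterations, []() { sampleFunction(1, 2); });
}

// Call through a list of type-erased callbacks (stand-in for CallbackList).
long long timeCallbackList(const long iterations)
{
    std::vector<std::function<void (int, int)>> callbacks;
    callbacks.push_back(&sampleFunction);
    return timeMilliseconds(iterations, [&callbacks]() {
        for(auto & callback : callbacks) {
            callback(1, 2);
        }
    });
}
```

Comparing `timeNative(100000000)` with `timeCallbackList(100000000)` gives the same kind of ratio the table reports, though the absolute numbers depend on the compiler, the optimization flags, and the machine.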