mirror of https://github.com/wqking/eventpp.git synced 2024-12-27 00:17:02 +08:00
Benchmarks

Hardware: HP laptop, Intel(R) Core(TM) i5-8300H CPU @ 2.30GHz, 16 GB RAM
Software: Windows 10, MinGW GCC 11.3.0, MSVC 2022
Time unit: milliseconds (unless explicitly specified)

Unless otherwise specified, the compiler is GCC.
The hardware was mid-range to low-end at the time of benchmarking (December 2023).

EventQueue enqueue and process -- single threading

| Iterations | Queue size | Event count | Event Types | Listener count | Time of single threading | Time of multi threading |
|---|---|---|---|---|---|---|
| 100k | 100 | 10M | 100 | 100 | 289 | 939 |
| 100k | 1000 | 100M | 100 | 100 | 2822 | 9328 |
| 100k | 1000 | 100M | 1000 | 1000 | 2923 | 9502 |

Given eventpp::EventQueue<size_t, void (size_t), Policies>, where Policies selects either single threading or multi threading, the benchmark first adds Listener count listeners to the queue; each listener is an empty lambda. Then the benchmark starts timing and loops Iterations times. In each loop, it enqueues Queue size events and then processes the event queue.
There are Event types different event types. Event count is Iterations * Queue size.
The EventQueue is processed in one thread in both cases. "Single threading" and "multi threading" in the table refer to the policies used, not to the number of threads.
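
The loop described above can be sketched roughly as follows. This is only an illustration of the benchmark's shape, not the actual benchmark code: `ToyEventQueue`, `BenchmarkResult`, and `runQueueBenchmark` are hypothetical names, the toy queue is a standard-library stand-in rather than eventpp::EventQueue, and spreading the listeners round-robin over the event types is an assumption.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for an event queue; NOT eventpp::EventQueue.
class ToyEventQueue
{
public:
    using Listener = std::function<void (std::size_t)>;

    explicit ToyEventQueue(std::size_t eventTypes) : listeners(eventTypes) {}

    void appendListener(std::size_t event, Listener listener)
    {
        listeners[event].push_back(std::move(listener));
    }

    void enqueue(std::size_t event) { pending.push_back(event); }

    // Dispatch every pending event to the listeners of its type.
    // Returns the number of events processed.
    std::size_t process()
    {
        std::vector<std::size_t> batch;
        batch.swap(pending);
        for(std::size_t event : batch) {
            for(auto & listener : listeners[event]) {
                listener(event);
            }
        }
        return batch.size();
    }

private:
    std::vector<std::vector<Listener>> listeners;
    std::vector<std::size_t> pending;
};

struct BenchmarkResult
{
    std::size_t eventCount;   // should equal iterations * queueSize
    long long milliseconds;   // wall-clock time of the timed loop
};

BenchmarkResult runQueueBenchmark(
    std::size_t iterations, std::size_t queueSize,
    std::size_t eventTypes, std::size_t listenerCount)
{
    assert(eventTypes > 0);
    ToyEventQueue queue(eventTypes);
    // Assumption: listeners are spread round-robin over the event types.
    for(std::size_t i = 0; i < listenerCount; ++i) {
        queue.appendListener(i % eventTypes, [](std::size_t) {});
    }

    std::size_t processed = 0;
    const auto start = std::chrono::steady_clock::now(); // timing starts after setup
    for(std::size_t iteration = 0; iteration < iterations; ++iteration) {
        for(std::size_t i = 0; i < queueSize; ++i) {
            queue.enqueue(i % eventTypes);
        }
        processed += queue.process();
    }
    const auto elapsed = std::chrono::steady_clock::now() - start;
    const auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(elapsed);
    return BenchmarkResult { processed, static_cast<long long>(ms.count()) };
}
```

Calling `runQueueBenchmark(100000, 100, 100, 100)` would mirror the parameters of the first table row (10M events); the timings of this toy queue are not comparable to the numbers in the table.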

EventQueue enqueue and process -- multiple threading

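| Mutex | Enqueue threads | Process threads | Event count | Event Types | Listener count | Time |
|---|---|---|---|---|---|---|
| std::mutex | 1 | 1 | 10M | 100 | 100 | 1824 |
| SpinLock | 1 | 1 | 10M | 100 | 100 | 1303 |
| std::mutex | 1 | 3 | 10M | 100 | 100 | 2989 |
| SpinLock | 1 | 3 | 10M | 100 | 100 | 3186 |
| std::mutex | 2 | 2 | 10M | 100 | 100 | 3151 |
| SpinLock | 2 | 2 | 10M | 100 | 100 | 3049 |
| std::mutex | 4 | 4 | 10M | 100 | 100 | 1657 |
| SpinLock | 4 | 4 | 10M | 100 | 100 | 1659 |
| std::mutex | 16 | 16 | 10M | 100 | 100 | 708 |
| SpinLock | 16 | 16 | 10M | 100 | 100 | 1891 |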

Enqueue threads threads enqueue events to the queue, and Process threads threads process the events. The total number of events is Event count. Mutex is the mutex type used to protect the data.
The multi threading version is slower than the previous single threading version, because locking the mutex costs time.
With few threads (up to about the number of CPU cores, which is 4 here), eventpp::SpinLock is comparable to or faster than std::mutex in most rows; the case with 1 enqueue thread and 3 process threads is the exception. With many more threads than CPU cores (here 16 enqueue threads and 16 process threads), eventpp::SpinLock is much slower than std::mutex (1891 vs. 708).
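
To make the two lock types concrete, here is a minimal sketch of a spin lock built on `std::atomic_flag`, used interchangeably with `std::mutex` through `std::lock_guard`. `ToySpinLock` and `countWithLock` are hypothetical names, and this is not the implementation of eventpp::SpinLock. A waiting thread in such a lock keeps spinning on the CPU instead of sleeping, which is one plausible (unmeasured here) reason why it may fall behind once there are many more threads than cores.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical minimal spin lock; NOT the implementation of eventpp::SpinLock.
class ToySpinLock
{
public:
    void lock()
    {
        // Busy-wait: keep setting the flag until it was previously clear.
        while(flag.test_and_set(std::memory_order_acquire)) {
        }
    }

    void unlock() { flag.clear(std::memory_order_release); }

private:
    std::atomic_flag flag = ATOMIC_FLAG_INIT;
};

// Increment a shared counter from several threads, protecting it with Mutex.
// Mutex only needs lock() and unlock(), so std::mutex and ToySpinLock both fit.
template <typename Mutex>
long long countWithLock(int threadCount, int incrementsPerThread)
{
    assert(threadCount > 0);
    Mutex mutex;
    long long counter = 0;
    std::vector<std::thread> threads;
    for(int t = 0; t < threadCount; ++t) {
        threads.emplace_back([&]() {
            for(int i = 0; i < incrementsPerThread; ++i) {
                std::lock_guard<Mutex> guard(mutex);
                ++counter;
            }
        });
    }
    for(auto & thread : threads) {
        thread.join();
    }
    return counter;
}
```

`countWithLock<ToySpinLock>(4, 1000)` and `countWithLock<std::mutex>(4, 1000)` should both return 4000; wrapping each call in a timer and varying the thread count is one way to explore the trade-off on a given machine.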

CallbackList append/remove callbacks

The benchmark loops 100K times. In each loop it appends 1000 empty callbacks to a CallbackList, then removes all 1000 callbacks. So there are 100M append/remove operations in total.
The total benchmarked time is about 16000 milliseconds. That is, about 6000 append/remove operations complete in 1 millisecond (100M / 16000 = 6250).
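
The arithmetic and the loop shape can be sketched as below. `runAppendRemove`, `AppendRemoveResult`, and `operationsPerMillisecond` are hypothetical names, and `std::list` of `std::function` is only a stand-in for CallbackList. Following the counting used above, one append plus its matching remove counts as one operation.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <list>

struct AppendRemoveResult
{
    std::size_t operations; // one append plus its matching remove counts once
    double milliseconds;    // wall-clock time of the whole run
};

// Stand-in benchmark: std::list<std::function> instead of CallbackList.
AppendRemoveResult runAppendRemove(std::size_t loops, std::size_t callbacksPerLoop)
{
    std::list<std::function<void ()>> callbacks;
    std::size_t operations = 0;
    const auto start = std::chrono::steady_clock::now();
    for(std::size_t loop = 0; loop < loops; ++loop) {
        for(std::size_t i = 0; i < callbacksPerLoop; ++i) {
            callbacks.push_back([]() {});
        }
        while(! callbacks.empty()) {
            callbacks.pop_front();
            ++operations;
        }
    }
    const std::chrono::duration<double, std::milli> elapsed
        = std::chrono::steady_clock::now() - start;
    return AppendRemoveResult { operations, elapsed.count() };
}

// Throughput in operations per millisecond.
double operationsPerMillisecond(double operations, double milliseconds)
{
    assert(milliseconds > 0.0);
    return operations / milliseconds;
}
```

With the figures from the text, `operationsPerMillisecond(100e6, 16000.0)` gives 6250, which the text rounds to about 6000.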

CallbackList invoking VS native function invoking

Iterations: 100,000,000

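| Function | Compiler | Native invoking | CallbackList single threading | CallbackList multi threading |
|---|---|---|---|---|
| Inline global function | MSVC | 139 | 1267 | 3058 |
| Inline global function | GCC | 141 | 1149 | 2563 |
| Non-inline global function | MSVC | 143 | 1273 | 3047 |
| Non-inline global function | GCC | 132 | 1218 | 2583 |
| Function object | MSVC | 139 | 1198 | 2993 |
| Function object | GCC | 141 | 1107 | 2633 |
| Member virtual function | MSVC | 159 | 1221 | 3076 |
| Member virtual function | GCC | 140 | 1231 | 2691 |
| Member non-virtual function | MSVC | 140 | 1266 | 3054 |
| Member non-virtual function | GCC | 140 | 1193 | 2701 |
| Member non-inline virtual function | MSVC | 158 | 1223 | 3103 |
| Member non-inline virtual function | GCC | 133 | 1231 | 2676 |
| Member non-inline non-virtual function | MSVC | 134 | 1266 | 3028 |
| Member non-inline non-virtual function | GCC | 134 | 1205 | 2652 |
| All functions | MSVC | 91 | 903 | 2214 |
| All functions | GCC | 89 | 858 | 1852 |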
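
The kind of comparison measured in this table can be sketched as follows: time a direct call, then time the same call made through a list of type-erased callbacks. All names here (`sketchValue`, `sketchFunction`, `timeMilliseconds`, `timeNative`, `timeThroughList`) are hypothetical, and a `std::vector` of `std::function` is only a stand-in for CallbackList, so the numbers it produces are not comparable to the table. The sink is unsigned (an assumption made for this sketch) so that long runs wrap around instead of overflowing a signed int.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <vector>

// volatile so the compiler cannot optimize the additions away.
volatile unsigned int sketchValue = 0;

void sketchFunction(int a, const int b)
{
    sketchValue = sketchValue + static_cast<unsigned int>(a + b);
}

// Run body once and return the elapsed wall-clock time in milliseconds.
template <typename F>
double timeMilliseconds(F && body)
{
    const auto start = std::chrono::steady_clock::now();
    body();
    const std::chrono::duration<double, std::milli> elapsed
        = std::chrono::steady_clock::now() - start;
    return elapsed.count();
}

// Native invoking: call the function directly.
double timeNative(int iterations)
{
    return timeMilliseconds([=]() {
        for(int i = 0; i < iterations; ++i) {
            sketchFunction(1, 2);
        }
    });
}

// Stand-in for CallbackList invoking: call through a list of std::function.
double timeThroughList(int iterations)
{
    std::vector<std::function<void (int, int)>> callbackList;
    callbackList.push_back(&sketchFunction);
    return timeMilliseconds([&]() {
        for(int i = 0; i < iterations; ++i) {
            for(auto & callback : callbackList) {
                callback(1, 2);
            }
        }
    });
}
```

Comparing `timeNative(n)` with `timeThroughList(n)` for a large `n` gives a rough feel for the indirection cost on a given compiler; results depend heavily on optimization flags.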

Testing functions

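```cpp
// NON_INLINE asks the compiler not to inline the function it decorates.
#if defined(_MSC_VER)
#define NON_INLINE __declspec(noinline)
#else
// gcc
#define NON_INLINE __attribute__((noinline))
#endif

// volatile so the additions below cannot be optimized away.
volatile int globalValue = 0;

// "Inline global function": the compiler is free to inline it.
void globalFunction(int a, const int b)
{
    globalValue += a + b;
}

// "Non-inline global function".
NON_INLINE void nonInlineGlobalFunction(int a, const int b)
{
    globalValue += a + b;
}

struct FunctionObject
{
    // "Function object": invoked through operator().
    void operator() (int a, const int b)
    {
        globalValue += a + b;
    }

    // "Member virtual function".
    virtual void virFunc(int a, const int b)
    {
        globalValue += a + b;
    }

    // "Member non-virtual function".
    void nonVirFunc(int a, const int b)
    {
        globalValue += a + b;
    }

    // "Member non-inline virtual function".
    NON_INLINE virtual void nonInlineVirFunc(int a, const int b)
    {
        globalValue += a + b;
    }

    // "Member non-inline non-virtual function".
    NON_INLINE void nonInlineNonVirFunc(int a, const int b)
    {
        globalValue += a + b;
    }
};

#undef NON_INLINE
```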