llama.cpp/pocs/CMakeLists.txt

# dependencies

find_package(Threads REQUIRED)

# third-party

include_directories(${CMAKE_CURRENT_SOURCE_DIR})

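if (EMSCRIPTEN)
    # no proof-of-concept programs are built for Emscripten targets
else()
    # vdot is built only when backends are not loaded dynamically
    if (NOT GGML_BACKEND_DL)
        add_subdirectory(vdot)
    endif()
endif()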

# history (git blame)
#
# 2023-04-18 21:00:14 +02:00
# Adding a simple program to measure speed of dot products (#1041)
# Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
#   On my Mac, the direct Q4_1 product is marginally slower (~69 vs ~55 us
#   for Q4_0). The SIMD-ified ggml version is now almost 2X slower (~121 us).
#   On a Ryzen 7950X CPU, the direct product for Q4_1 quantization is faster
#   than the AVX2 implementation (~60 vs ~62 us).
#   lines: `# dependencies` through `else()`, and the final `endif()`
#
# 2024-11-25 15:13:39 +01:00
# ggml : add support for dynamic loading of backends (#10469)
# Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
#   lines: `if (NOT GGML_BACKEND_DL)`, `add_subdirectory(vdot)`, and the inner `endif()`