5592f278b6
ggml-cpu : remove stdlib include from repack.cpp (ggml/1276)
...
This commit removes the inclusion of `<cstdlib>`.
The motivation for this change is that this source file does not seem to
use any functions from this header and the comment about `qsort` is a
little misleading/confusing.
2025-07-24 20:27:23 +03:00
60ef23d6c1
ggml-cpu: enable IBM NNPA Vector Intrinsics ( #14317 )
...
* ggml-cpu: add nnpa compile flag
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
(cherry picked from commit 4a9f60c201
)
* ggml-cpu: add fp16->fp32 nnpa first
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
(cherry picked from commit 8d4a7987f9
)
* ggml-cpu: add fp32->fp16
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
(cherry picked from commit 0ff0d65162
)
* ggml-cpu: better variable names
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
(cherry picked from commit 2f58bbcbb8
)
* docs: update s390x docs
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
(cherry picked from commit 01b929491b
)
* ggml-cpu: add debugging prints to see if dlf16 is correct
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: fix print vs printf
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: fix float placeholder
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: ensure fp16 and fp32 load and stores are called
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: fp16 load ensured to hit
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: remove sigint from fp16 store
for some reason, the function is not getting a hit when debugged with
gdb. we will need to investigate further
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: activate nnpa for ggml_cpu_fp16_to_fp32
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: nnpa activate ggml_cpu_fp16_to_fp32 for 8 elements
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: nnpa switch to vec_xst test
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: switch to vec_xst for 4 element loops also
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: rework noop
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: remove noop, general code cleanup
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: clarify variable naming
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: activate nnpa for ggml_cpu_fp32_to_fp16
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: add breakpoint for debugging
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: test fix for conversion failure
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: disable fp32->fp16 nnpa conversions for now
there are some conversion failures in nnpa that requires the eyes of an
ibm stsm. will create a separate pr to introduce the fp32->fp16 change.
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: switch to elif macro
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: reattempt fp32->fp16
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: fix typo
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: reattempt fp32->fp16
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: fix compiler types
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: change to typedef vector types
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: add 4 element loops for fp32->fp16
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: clarified vector naming
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: bring back fp32->fp16 store nnpa
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: activate nnpa fp32->fp16 or fp16->fp32 compute
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: add nnpa macro check in ggml-impl
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: add missing __func__
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: diagnose why __NNPA__ macro is not being defined
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: import vecintrin.h to fix compiler errors
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: update macro tests
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: move s390x typedef to own header file
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* Revert "ggml-cpu: move s390x typedef to own header file"
This reverts commit 157f856c34
.
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: switch to importing ggml-cpu-impl instead
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: fix macro declaration
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: test more macros
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: add debug prints
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: bruteforce macro definitions
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: move macro definitions
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: add ggml-impl.h to cmakelists
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: switch to private macros
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: move s390x typedef to own header file
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
(cherry picked from commit 157f856c34
)
* ggml-cpu: move things around
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: bring back compile macros
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: switch to quotes for import
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: add compiler error macro
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: add s390x detection in ggml-src
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: bring back compile definitions
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: undo cmakelists work
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* Revert "ggml-cpu: move s390x typedef to own header file"
This reverts commit 18d79e1a30
.
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: remove typedefs.h
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: remove typedef from cmakelists
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: add ggml-impl.h future notes
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: add todo comment for future reference
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: clarify naming of dlf16
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: remove unnecessary target compile definitions
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: move nnpa fp16->fp32 and fp32->fp16 to simd-mappings
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml: refactor fp32->fp16 and fp16->fp32 simd to ggml-cpu
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* docs: update broken huggingface link for s390x
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: fix duplicate func names during compile
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* Revert "ggml-cpu: fix duplicate func names during compile"
This reverts commit fbb733451f
.
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* Revert "ggml: refactor fp32->fp16 and fp16->fp32 simd to ggml-cpu"
This reverts commit bd288e8fa5
.
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml: refactor fp16<->fp32 simd to ggml-cpu
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: fix missing simd-mappings.h import in quants.c
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: fix missing simd-mappings.h within repack
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: fix amx mmq missing simd-mappings.h
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: attempt at fixing loongarch failing build
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: move nnpa together with other fp16<->fp32 simd
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: fix wrong refactor of ggml-base
ref: https://github.com/ggml-org/llama.cpp/pull/14317#discussion_r2164176555
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml: remove dependency on ggml-cpu from ggml-base
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: rename all fp16<->fp32 macros to prefix with ggml_cpu
ref: https://github.com/ggml-org/llama.cpp/pull/14317#discussion_r2164449406
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: remove mistaken fallback macro
fallback logic was already implemented but i was too sleepy to realise
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml: move ggml_table_f32_f16 to ggml-cpu
ref: https://github.com/ggml-org/llama.cpp/pull/14317#discussion_r2164775006
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: move ggml_table_f32_f16 back to ggml-base due to ci failures
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* Revert "ggml-cpu: move ggml_table_f32_f16 back to ggml-base due to ci failures"
This reverts commit 32a3533564
.
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* Revert "ggml: move ggml_table_f32_f16 to ggml-cpu"
This reverts commit 9e40d984ad
.
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml: move ggml_table_f32_f16 to ggml-cpu
ref: https://github.com/ggml-org/llama.cpp/pull/14317#discussion_r2164775006
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
(cherry picked from commit 9e40d984ad
)
* ggml: move ggml_table_f32_f16 to ggml-cpu.c
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: extern c ggml_table_f32_f16 + chore docs
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: dedup ggml_table_f32_f16 from simd-mappings.h
we rely on the variable declaration in ggml-cpu.c instead
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* Revert "ggml-cpu: dedup ggml_table_f32_f16 from simd-mappings.h"
This reverts commit f71b21d2f7
.
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* ggml-cpu: bring back ggml_table_f32_f16
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* Revert "ggml-cpu: bring back ggml_table_f32_f16"
This reverts commit 2dce119178
.
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
* fix ggml time initialization
* fix f32_f16 table init
* remove extra line
---------
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
Co-authored-by: slaren <slarengh@gmail.com >
2025-06-25 23:49:04 +02:00
6369be0735
Implement GGML_CPU_ALL_VARIANTS for PowerPC ( #14286 )
...
* Add PowerPC feature detection and scoring
* ggml-cpu: Implement GGML_CPU_ALL_VARIANTS for PowerPC
* ggml-cpu: Delay some initializations until function is called
When using GGML_BACKEND_DL=ON, these initializations might use
instructions that are not supported by the current CPU.
---------
Co-authored-by: Diego Devesa <slarengh@gmail.com >
2025-06-20 14:17:32 +02:00
d27b3ca175
ggml : fix repack work size for mul_mat_id ( #14292 )
...
ggml-ci
2025-06-20 11:19:15 +03:00
860a9e4eef
ggml-cpu : remove the weak alias trick ( #14221 )
2025-06-17 12:58:32 +03:00
3555b3004b
ggml-cpu : rework weak alias on apple targets ( #14146 )
...
* ggml-cpu : rework weak alias on apple targets
* fix powerpc detection
* fix ppc detection
* fix powerpc detection on darwin
2025-06-16 13:54:15 +08:00
f470bc36be
ggml-cpu : split arch-specific implementations ( #13892 )
...
* move ggml-cpu-aarch64 to repack
* split quantize_row_q8_0/1
* split helper functions
* split ggml_vec_dot_q4_0_q8_0
* split ggml_vec_dot_q4_1_q8_1
* split ggml_vec_dot_q5_0_q8_0
* split ggml_vec_dot_q5_1_q8_1
* split ggml_vec_dot_q8_0_q8_0
* split ggml_vec_dot_tq1_0_q8_K
* split ggml_vec_dot_tq2_0_q8_K
* split ggml_vec_dot_q2_K_q8_K
* split ggml_vec_dot_q3_K_q8_K
* split ggml_vec_dot_q4_K_q8_K
* split ggml_vec_dot_q5_K_q8_K
* split ggml_vec_dot_q6_K_q8_K
* split ggml_vec_dot_iq2_xxs_q8_K
* split ggml_vec_dot_iq2_xs_q8_K
* split ggml_vec_dot_iq2_s_q8_K
* split ggml_vec_dot_iq3_xxs_q8_K
* split ggml_vec_dot_iq3_s_q8_K
* split ggml_vec_dot_iq1_s_q8_K
* split ggml_vec_dot_iq1_m_q8_K
* split ggml_vec_dot_iq4_nl_q8_0
* split ggml_vec_dot_iq4_xs_q8_K
* fix typos
* fix missing prototypes
* rename ggml-cpu-quants.c
* rename ggml-cpu-traits
* rename arm folder
* move cpu-feats-x86.cpp
* rename ggml-cpu-hbm
* update arm detection macro in quants.c
* move iq quant tables
* split ggml_quantize_mat_q8_0/K
* split ggml_gemv_*
* split ggml_gemm_*
* rename namespace aarch64 to repack
* use weak aliases to replace test macros
* rename GGML_CPU_AARCH64 to GGML_CPU_REPACK
* rename more aarch64 to repack
* clean up rebase leftover
* fix compilation errors
* remove trailing spaces
* try to fix clang compilation errors
* try to fix clang compilation errors again
* try to fix clang compilation errors, 3rd attempt
* try to fix clang compilation errors, 4th attempt
* try to fix clang compilation errors, 5th attempt
* try to fix clang compilation errors, 6th attempt
* try to fix clang compilation errors, 7th attempt
* try to fix clang compilation errors, 8th attempt
* try to fix clang compilation errors, 9th attempt
* more cleanup
* fix compilation errors
* fix apple targets
* fix a typo in arm version of ggml_vec_dot_q4_K_q8_K
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com >
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com >
2025-06-09 16:47:13 +02:00