llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-07-25 04:02:50 +00:00

Files

Jeff Bolz af148c9386 vulkan: Optimize binary ops (#10270)

Reuse the index calculations across all of src0/src1/dst. Add a shader
variant for when src0 and src1 have the same dimensions, so the additional
modulus operations for src1 aren't needed. Div/mod are slow, so add "fast"
div/mod helpers with a fast path for when the calculation isn't needed or
can be done more cheaply.

2024-11-14 06:22:55 +01:00
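
The commit message above describes the idea but not the code. The following is a minimal C++ sketch of what such "fast" div/mod helpers and a shared index decomposition could look like. It is an illustration under stated assumptions, not the shader source from the commit: the names `fast_div`, `fast_mod`, `Index4`, `unflatten`, and `src1_offset` are hypothetical, the fast paths chosen here (skip the divide when the quotient is known to be zero, use a mask when the divisor is a power of two) are common techniques that may differ from the ones in the actual shader, and divisors are assumed to be nonzero.

```cpp
#include <cstdint>

// Hypothetical helper: skip the divide when the quotient is known to be 0.
// Assumes b > 0.
inline uint32_t fast_div(uint32_t a, uint32_t b) {
    return (a < b) ? 0u : a / b;
}

// Hypothetical helper: use a bit mask when the divisor is a power of two,
// otherwise fall back to the regular modulus. Assumes b > 0.
inline uint32_t fast_mod(uint32_t a, uint32_t b) {
    if ((b & (b - 1u)) == 0u) {
        return a & (b - 1u);
    }
    return a % b;
}

// 4D coordinates of one element (i0 is the fastest-varying dimension).
struct Index4 {
    uint32_t i0, i1, i2, i3;
};

// Decompose a flat element index once; the result can then be reused to
// compute the src0, src1, and dst offsets instead of repeating the divisions.
inline Index4 unflatten(uint32_t idx, const uint32_t ne[4]) {
    const uint32_t ne01  = ne[0] * ne[1];
    const uint32_t ne012 = ne01 * ne[2];
    Index4 r;
    r.i3 = fast_div(idx, ne012);
    idx -= r.i3 * ne012;
    r.i2 = fast_div(idx, ne01);
    idx -= r.i2 * ne01;
    r.i1 = fast_div(idx, ne[0]);
    r.i0 = idx - r.i1 * ne[0];
    return r;
}

// Offset into a broadcast src1: each coordinate wraps around src1's extent.
// ne1 holds src1's extents, nb1 its strides (in elements for this sketch).
// When src0 and src1 have the same dimensions, the fast_mod calls can be
// dropped entirely, which is the case the commit's extra shader variant covers.
inline uint32_t src1_offset(const Index4 &i, const uint32_t ne1[4], const uint32_t nb1[4]) {
    return fast_mod(i.i0, ne1[0]) * nb1[0] +
           fast_mod(i.i1, ne1[1]) * nb1[1] +
           fast_mod(i.i2, ne1[2]) * nb1[2] +
           fast_mod(i.i3, ne1[3]) * nb1[3];
}
```

In this sketch the branch in each helper trades one cheap comparison for a potentially expensive integer divide, which tends to pay off when the fast path is taken often, for example when a broadcast dimension has extent 1.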

cmake

llama : reorganize source code + improve CMake (#8006)

2024-06-26 18:33:02 +03:00

include

metal : optimize FA kernels (#10171)

2024-11-08 13:47:22 +02:00

src

vulkan: Optimize binary ops (#10270)

2024-11-14 06:22:55 +01:00

.gitignore

vulkan : cmake integration (#8119)

2024-07-13 18:12:39 +02:00

CMakeLists.txt

metal : opt-in compile flag for BF16 (#10218)

2024-11-08 21:59:46 +02:00