llama.cpp/ggml/src
Jeff Bolz af148c9386 vulkan: Optimize binary ops (#10270)
Reuse the index calculations across all of src0/src1/dst. Add a shader
variant for the case where src0 and src1 have the same dimensions, so the
additional modulus operations for src1 aren't needed. Div/mod are slow, so
add "fast" div/mod helpers that take a fast path when the calculation isn't
needed or can be done more cheaply.
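
The "fast" div/mod idea can be illustrated with a minimal C sketch. This is not the shader code from the commit: the names `fast_mod` and `fast_div` and the specific fast paths chosen here (a power-of-two divisor for modulus, a dividend smaller than the divisor for division) are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: modulus with a cheap path for power-of-two divisors.
 * When b is a power of two (including 1), a % b equals a & (b - 1),
 * which avoids the integer modulus instruction. */
static uint32_t fast_mod(uint32_t a, uint32_t b) {
    assert(b != 0);
    if ((b & (b - 1)) == 0) {
        return a & (b - 1);   /* fast path: bit mask */
    }
    return a % b;             /* general path */
}

/* Hypothetical helper: division that skips the divide entirely
 * when the quotient is known to be zero. */
static uint32_t fast_div(uint32_t a, uint32_t b) {
    assert(b != 0);
    return (a < b) ? 0u : (a / b);
}
```

Under these assumptions, a broadcast dimension of size 1 costs only a mask (`fast_mod(i, 1)` is always 0), and an index that already lies within a dimension never reaches the divider.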
2024-11-14 06:22:55 +01:00