llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-07-17 08:14:50 +00:00

Author	SHA1	Message	Date
slaren	2005469ea1	Add Q4_3 support to cuBLAS (#1086)	2023-04-20 20:49:53 +02:00
slaren	02d6988121	Improve cuBLAS performance by dequantizing on the GPU (#1065)	2023-04-20 03:14:14 +02:00