llama.cpp/ggml-cuda at 5f2d4e60e202aabee10051e6615bb821e51787be - llama.cpp - Cat's Mantra

tqcq/llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-08-01 15:09:32 -04:00

Clint Herron 07a3fc0608 Removes multiple newlines at the end of files that is breaking the editorconfig step of CI. (#8258)

2024-07-02 12:18:10 -04:00

template-instances

…

acc.cu

…

acc.cuh

…

arange.cu

…

arange.cuh

…

argsort.cu

…

argsort.cuh

…

binbcast.cu

…

binbcast.cuh

…

clamp.cu

…

clamp.cuh

…

common.cuh

CUDA: refactor and optimize IQ MMVQ (#8215)

2024-07-01 20:39:06 +02:00

concat.cu

…

concat.cuh

…

convert.cu

…

convert.cuh

…

cpy.cu

Removes multiple newlines at the end of files that is breaking the editorconfig step of CI. (#8258)

2024-07-02 12:18:10 -04:00

cpy.cuh

…

dequantize.cuh

…

diagmask.cu

…

diagmask.cuh

…

dmmv.cu

…

dmmv.cuh

…

fattn-common.cuh

CUDA: refactor and optimize IQ MMVQ (#8215)

2024-07-01 20:39:06 +02:00

fattn-tile-f16.cu

…

fattn-tile-f16.cuh

…

fattn-tile-f32.cu

…

fattn-tile-f32.cuh

…

fattn-vec-f16.cuh

…

fattn-vec-f32.cuh

…

fattn-wmma-f16.cuh

…

fattn.cu

…

fattn.cuh

…

getrows.cu

…

getrows.cuh

…

im2col.cu

…

im2col.cuh

…

mma.cuh

…

mmq.cu

…

mmq.cuh

…

mmvq.cu

CUDA: refactor and optimize IQ MMVQ (#8215)

2024-07-01 20:39:06 +02:00

mmvq.cuh

…

norm.cu

…

norm.cuh

…

pad.cu

…

pad.cuh

…

pool2d.cu

…

pool2d.cuh

…

quantize.cu

…

quantize.cuh

…

rope.cu

…

rope.cuh

…

scale.cu

…

scale.cuh

…

softmax.cu

…

softmax.cuh

…

sumrows.cu

…

sumrows.cuh

…

tsembd.cu

…

tsembd.cuh

…

unary.cu

…

unary.cuh

…

upscale.cu

…

upscale.cuh

…

vecdotq.cuh

CUDA: refactor and optimize IQ MMVQ (#8215)

2024-07-01 20:39:06 +02:00