llama.cpp/examples
Latest commit 8c70a5ff25 by Georgi Gerganov, 2023-10-11 21:25:33 +03:00: batched : add bench tool (#3545)

* batched : add bench tool
* batched : minor fix table
* batched-bench : add readme + n_kv_max is now configurable
* batched-bench : init warm-up batch
* batched-bench : pass custom set of PP, TG and PL
* batched-bench : add mmq CLI arg
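The last three bullets describe the tool's command line: n_kv_max becomes configurable, a custom set of PP (prompt tokens), TG (generated tokens) and PL (parallel sequences) values can be passed, and an mmq flag is added. A rough invocation sketch follows; the exact positional argument order and any additional arguments between the KV size and the PP/TG/PL lists are assumptions here, so treat examples/batched-bench/README.md as the authoritative usage:

    # hypothetical example, assuming the order: MODEL_PATH N_KV_MAX IS_PP_SHARED NGL MMQ
    # followed by comma-separated PP, TG and PL lists
    ./batched-bench ./models/llama-7b/ggml-model-q8_0.gguf 2048 0 999 1 128,256,512 128,256 1,2,4,8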