llama.cpp

Mirror of https://github.com/ggml-org/llama.cpp.git, synced 2025-08-04 00:08:38 -04:00


Francis Couture-Harpin 04eec58112 ggml : remove q1_3 and q2_2

* llama : remove the separate scale tensors of BitNet b1.58

They won't be needed, since the remaining ternary quant types have
built-in scales.

2024-08-02 20:16:26 -04:00

baby-llama

…

batched

batched: fix n_predict parameter (#8527)

2024-07-17 10:34:28 +03:00

batched-bench

…

batched.swift

Detokenizer fixes (#8039)

2024-07-05 19:01:35 +02:00

benchmark

…

convert-llama2c-to-ggml

…

cvector-generator

…

deprecation-warning

examples : remove finetune and train-text-from-scratch (#8669)

2024-07-25 10:39:04 +02:00

embedding

Removes multiple newlines at the end of files that is breaking the editorconfig step of CI. (#8258)

2024-07-02 12:18:10 -04:00

eval-callback

ggml : reduce hash table reset cost (#8698)

2024-07-27 04:41:55 +02:00

export-lora

examples : export-lora : fix issue with quantized base models (#8687)

2024-07-25 23:49:39 +02:00

gbnf-validator

llama : move vocab, grammar and sampling into separate files (#8508)

2024-07-23 13:10:17 +03:00

gguf

gguf : handle null name during init (#8587)

2024-07-20 17:15:42 +03:00

gguf-hash

gguf-hash : update clib.json to point to original xxhash repo (#8491)

2024-07-16 10:14:16 +03:00

gguf-split

…

gritlm

…

imatrix

ggml : reduce hash table reset cost (#8698)

2024-07-27 04:41:55 +02:00

infill

infill : assert prefix/suffix tokens + remove old space logic (#8351)

2024-07-08 09:34:35 +03:00

jeopardy

…

llama-bench

ggml : reduce hash table reset cost (#8698)

2024-07-27 04:41:55 +02:00

llama.android

examples: fix android example cannot be generated continuously (#8621)

2024-07-22 09:54:42 +03:00

llama.swiftui

llama.swiftui: fix end of generation bug (#8268)

2024-07-20 16:09:37 +03:00

llava

ggml : reduce hash table reset cost (#8698)

2024-07-27 04:41:55 +02:00

lookahead

…

lookup

lookup: fibonacci hashing, fix crashes (#8548)

2024-07-17 23:35:44 +02:00

main

llama : fix llama_chat_format_single for mistral (#8657)

2024-07-24 13:48:46 +02:00

main-cmake-pkg

…

parallel

…

passkey

passkey : add short intro to README.md [no-ci] (#8317)

2024-07-05 09:14:24 +03:00

perplexity

…

quantize

ggml : remove q1_3 and q2_2

2024-08-02 20:16:26 -04:00

quantize-stats

ggml : minor naming changes (#8433)

2024-07-12 10:46:02 +03:00

retrieval

…

rpc

…

save-load-state

llama : refactor session file management (#8699)

2024-07-28 00:42:05 -04:00

server

server : add Speech Recognition & Synthesis to UI (#8679)

2024-07-26 00:10:16 +02:00

simple

…

speculative

…

sycl

…

tokenize

ggml : reduce hash table reset cost (#8698)

2024-07-27 04:41:55 +02:00

base-translate.sh

…

chat-13B.bat

…

chat-13B.sh

…

chat-persistent.sh

…

chat-vicuna.sh

…

chat.sh

…

CMakeLists.txt

examples : remove finetune and train-text-from-scratch (#8669)

2024-07-25 10:39:04 +02:00

convert_legacy_llama.py

convert-*.py: GGUF Naming Convention Refactor and Metadata Override Refactor (#7499)

2024-07-18 20:40:15 +10:00

json_schema_pydantic_example.py

py : type-check all Python scripts with Pyright (#8341)

2024-07-07 15:04:39 -04:00

json_schema_to_grammar.py

py : type-check all Python scripts with Pyright (#8341)

2024-07-07 15:04:39 -04:00

llama.vim

…

llm.vim

…

Miku.sh

…

pydantic_models_to_grammar_examples.py

examples : Rewrite pydantic_models_to_grammar_examples.py (#8493)

2024-07-20 22:09:17 -04:00

pydantic_models_to_grammar.py

pydantic : replace uses of __annotations__ with get_type_hints (#8474)

2024-07-14 19:51:21 -04:00

reason-act.sh

…

regex_to_grammar.py

py : switch to snake_case (#8305)

2024-07-05 07:53:33 +03:00

server_embd.py

py : type-check all Python scripts with Pyright (#8341)

2024-07-07 15:04:39 -04:00

server-llama2-13B.sh

…

ts-type-to-grammar.sh

…