llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-07-19 00:57:41 +00:00

Files

Clauszy 06a92a193a server : fix cache reuse logic (#12161)

The first KV shift offsets the positions of all tokens after head_c.
A subsequent llama_kv_cache_seq_rm call that still uses head_c then removes valid tokens, because their positions have already been offset.

2025-03-05 09:25:45 +02:00
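
The failure mode described in the commit message above can be reproduced with a toy model. This is a minimal sketch, not the llama.cpp API: `seq_rm`, `seq_add`, and `reuse` are hypothetical stand-ins that only track token positions, and the bounded shift shown as the alternative is an assumption about one way to avoid the problem, not a statement of what the commit changes.

```python
# Toy KV cache: a dict mapping a token label to its current position.
# All names here are hypothetical; none of this is the real llama.cpp API.

def seq_rm(cache, p0, p1):
    """Drop every token whose position lies in [p0, p1)."""
    return {tok: pos for tok, pos in cache.items() if not (p0 <= pos < p1)}

def seq_add(cache, p0, p1, delta):
    """Shift positions in [p0, p1) by delta; p1 < 0 means 'up to the end'."""
    def in_range(pos):
        return pos >= p0 and (p1 < 0 or pos < p1)
    return {tok: (pos + delta if in_range(pos) else pos)
            for tok, pos in cache.items()}

def reuse(chunks, bounded):
    """Reuse cached chunks given as (head_c, n_match) pairs.

    head_c is the chunk's position in the ORIGINAL cache layout.
    With bounded=False every token after head_c is shifted, so later
    head_c values no longer describe where the tokens actually are.
    """
    cache = {t: t for t in range(10)}  # tokens 0..9 at positions 0..9
    head_p = 2                         # assume tokens 0 and 1 are a shared prefix
    for head_c, n_match in chunks:
        shift = head_p - head_c
        cache = seq_rm(cache, head_p, head_c)       # discard the gap
        end = head_c + n_match if bounded else -1
        cache = seq_add(cache, head_c, end, shift)  # move the chunk forward
        head_p += n_match
    return cache

chunks = [(4, 2), (8, 2)]  # reuse tokens 4,5 and then tokens 8,9
print(reuse(chunks, bounded=False))  # tokens 8 and 9 are lost
print(reuse(chunks, bounded=True))   # tokens 8 and 9 end up at positions 4 and 5
```

In the unbounded run, the first shift moves tokens 6 through 9 down to positions 4 through 7, so the second removal over `[4, 8)` deletes the chunk that was meant to be reused.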

batched

…

batched-bench

…

batched.swift

…

convert-llama2c-to-ggml

…

cvector-generator

…

deprecation-warning

…

embedding

…

eval-callback

…

export-lora

…

gbnf-validator

…

gen-docs

…

gguf

…

gguf-hash

…

gguf-split

…

gritlm

…

imatrix

…

infill

…

jeopardy

…

llama-bench

…

llama.android

…

llama.swiftui

…

llava

…

lookahead

…

lookup

…

main

…

parallel

…

passkey

…

perplexity

Fix: Compile failure due to Microsoft STL breaking change (#11836)

2025-02-12 21:36:11 +01:00

quantize

…

quantize-stats

…

retrieval

…

rpc

…

run

Adding UTF-8 support to llama.cpp (#12111)

2025-03-03 12:44:56 +00:00

save-load-state

…

server

server : fix cache reuse logic (#12161)

2025-03-05 09:25:45 +02:00

simple

llama : add llama_vocab, functions -> methods, naming (#11110)

2025-01-12 11:32:42 +02:00

simple-chat

…

simple-cmake-pkg

…

speculative

…

speculative-simple

…

sycl

…

tokenize

…

tts

…

chat-13B.bat

…

chat-13B.sh

…

chat-persistent.sh

…

chat-vicuna.sh

…

chat.sh

…

CMakeLists.txt

…

convert_legacy_llama.py

metadata: Detailed Dataset Authorship Metadata (#8875)

2024-11-13 21:10:38 +11:00

json_schema_pydantic_example.py

…

json_schema_to_grammar.py

…

llama.vim

…

llm.vim

…

Miku.sh

…

pydantic_models_to_grammar_examples.py

…

pydantic_models_to_grammar.py

…

reason-act.sh

…

regex_to_grammar.py

…

server_embd.py

…

server-llama2-13B.sh

…

ts-type-to-grammar.sh

…