llama.cpp/common at e9b6350e61d592634263a14b3d77ecbf6c1fb096 - llama.cpp - Cat's Mantra

tqcq/llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-07-05 02:23:54 +00:00

Files

History

matteo caf5681fcb server : support jinja extra template kwargs (Qwen3 enable_thinking feature), from command line and from client (#13196 )

* initial commit for handling extra template kwargs

* enable_thinking and assistant prefill cannot be enabled at the same time

* can set chat_template_kwargs in command line

* added doc

* fixed formatting

* add support for extra context in generic template init

* coding standard: common/chat.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* coding standard:  common/chat.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Apply suggestions from code review

coding standard: cosmetic changes

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* fix merge conflict

* chat.cpp: simplify calls to apply to ensure systematic propagation of extra_context (+ the odd existing additional_context)

* normalize environment variable name

* simplify code

* prefill cannot be used with thinking models

* compatibility with the new reasoning-budget parameter

* fix prefill for non thinking models

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: Olivier Chafik <olivier.chafik@gmail.com>

2025-06-29 20:02:53 +02:00

..

arg.cpp

server : support jinja extra template kwargs (Qwen3 enable_thinking feature), from command line and from client (#13196 )

2025-06-29 20:02:53 +02:00

arg.h

common : add common_remote_get_content (#13123 )

2025-04-26 22:58:12 +02:00

base64.hpp

llava : expose as a shared library for downstream projects (#3613 )

2023-11-07 00:36:23 +03:00

build-info.cpp.in

cmake: Add ability to pass in LLAMA_BUILD_NUMBER/COMMIT (#14167 )

2025-06-13 10:38:52 +02:00

chat-parser.cpp

llama-chat : Do not throw when tool parsing fails (#14012 )

2025-06-14 17:25:15 +01:00

chat-parser.h

llama-chat : Do not throw when tool parsing fails (#14012 )

2025-06-14 17:25:15 +01:00

chat.cpp

server : support jinja extra template kwargs (Qwen3 enable_thinking feature), from command line and from client (#13196 )

2025-06-29 20:02:53 +02:00

chat.h

server : support jinja extra template kwargs (Qwen3 enable_thinking feature), from command line and from client (#13196 )

2025-06-29 20:02:53 +02:00

CMakeLists.txt

cmake : Improve build-info.cpp generation (#14156 )

2025-06-13 09:51:34 +03:00

common.cpp

vocab : prevent tokenizer overflow (#14301 )

2025-06-20 07:13:06 -07:00

common.h

server : support jinja extra template kwargs (Qwen3 enable_thinking feature), from command line and from client (#13196 )

2025-06-29 20:02:53 +02:00

console.cpp

console : utf-8 fix for windows stdin (#9690 )

2024-09-30 11:23:42 +03:00

console.h

gguf : new file format with flexible meta data (beta) (#2398 )

2023-08-21 23:07:43 +03:00

json-partial.cpp

sync : vendor (#13901 )

2025-05-30 16:25:45 +03:00

json-partial.h

sync : vendor (#13901 )

2025-05-30 16:25:45 +03:00

json-schema-to-grammar.cpp

common : use std::string_view now that we target c++17 (#14319 )

2025-06-22 08:37:43 +03:00

json-schema-to-grammar.h

sync : vendor (#13901 )

2025-05-30 16:25:45 +03:00

llguidance.cpp

llguidance : set tokenizer slices to default (#13424 )

2025-05-10 17:19:52 +02:00

log.cpp

Fix: Compile failure due to Microsoft STL breaking change (#11836 )

2025-02-12 21:36:11 +01:00

log.h

cleanup: fix compile warnings associated with gnu_printf (#11811 )

2025-02-12 10:06:53 -04:00

ngram-cache.cpp

ggml : portability fixes for VS 2017 (#12150 )

2025-03-04 18:53:26 +02:00

ngram-cache.h

llama : use LLAMA_TOKEN_NULL (#11062 )

2025-01-06 10:52:15 +02:00

regex-partial.cpp

common: add partial regex support (#12808 )

2025-05-14 19:50:57 +01:00

regex-partial.h

common: add partial regex support (#12808 )

2025-05-14 19:50:57 +01:00

sampling.cpp

server: streaming of tool calls and thoughts when --jinja is on (#12379 )

2025-05-25 01:48:08 +01:00

sampling.h

sampling : support for llguidance grammars (#10224 )

2025-02-02 09:55:32 +02:00

speculative.cpp

llama : deprecate llama_kv_self_ API (#14030 )

2025-06-06 14:11:15 +03:00

speculative.h

speculative : update default params (#11954 )

2025-02-19 13:29:42 +02:00