* initial commit for handling extra template kwargs
* enable_thinking and assistant prefill cannot be enabled at the same time
* can set chat_template_kwargs in command line
* added doc
* fixed formatting
* add support for extra context in generic template init
* coding standard: common/chat.cpp
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* coding standard: common/chat.cpp
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* Apply suggestions from code review
coding standard: cosmetic changes
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* fix merge conflict
* chat.cpp: simplify calls to apply to ensure systematic propagation of extra_context (+ the odd existing additional_context)
* normalize environment variable name
* simplify code
* prefill cannot be used with thinking models
* compatibility with the new reasoning-budget parameter
* fix prefill for non thinking models
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: Olivier Chafik <olivier.chafik@gmail.com>
* add preludes to content on partial regex match
* allow all parsers to parse non-tool-call content.
* tweak order of <|python_tag|> vs <function= parsing for functionary v3.1 format. still not ideal but hopefully less prone to crash