llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-07-21 10:17:58 +00:00

Files

woodx a5cabd7649 server : do not get prompt in infill mode (#7286)

* avoid to get prompt in infill mode and embedding mode

* remove embedding mode

* refactor format

---------

Co-authored-by: wudexiang <wudexiang@bytedance.com>

2024-06-07 10:09:45 +03:00

baby-llama

…

batched

…

batched-bench

…

batched.swift

…

benchmark

…

convert-llama2c-to-ggml

…

embedding

…

eval-callback

…

export-lora

…

finetune

…

gbnf-validator

…

gguf

…

gguf-split

…

gritlm

…

imatrix

check for nans in imatrix and quantize (#7807)

2024-06-07 09:01:29 +03:00

infill

…

jeopardy

…

llama-bench

…

llama.android

…

llama.swiftui

…

llava

…

lookahead

…

lookup

…

main

…

main-cmake-pkg

…

parallel

…

passkey

…

perplexity

…

quantize

…

quantize-stats

…

retrieval

…

rpc

…

save-load-state

…

server

server : do not get prompt in infill mode (#7286)

2024-06-07 10:09:45 +03:00

simple

…

speculative

…

sycl

…

tokenize

…

train-text-from-scratch

…

alpaca.sh

…

base-translate.sh

…

chat-13B.bat

…

chat-13B.sh

…

chat-persistent.sh

…

chat-vicuna.sh

…

chat.sh

…

CMakeLists.txt

…

convert-legacy-llama.py

…

gpt4all.sh

…

json_schema_to_grammar.py

…

json-schema-pydantic-example.py

…

llama2-13b.sh

…

llama2.sh

…

llama.vim

…

llm.vim

…

Miku.sh

…

pydantic_models_to_grammar.py

…

pydantic-models-to-grammar-examples.py

…

reason-act.sh

…

regex-to-grammar.py

…

server-embd.py

…

server-llama2-13B.sh

…

ts-type-to-grammar.sh

…