Opt class for positional argument handling (#10508)
Added support for positional arguments `model` and `prompt`.

Added functionality to download via strings like:

    llama-run llama3
    llama-run ollama://granite-code
    llama-run ollama://granite-code:8b
    llama-run hf://QuantFactory/SmolLM-135M-GGUF/SmolLM-135M.Q2_K.gguf
    llama-run huggingface://bartowski/SmolLM-1.7B-Instruct-v0.2-GGUF/SmolLM-1.7B-Instruct-v0.2-IQ3_M.gguf
    llama-run https://example.com/some-file1.gguf
    llama-run some-file2.gguf
    llama-run file://some-file3.gguf

Signed-off-by: Eric Curtin <ecurtin@redhat.com>
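The prefixes above (`ollama://`, `hf://`, `huggingface://`, `https://`, `file://`, or a bare name) each select a download source. The following is a minimal sketch of how such prefix dispatch can be implemented; the names `ModelRef`, `Protocol`, `strip_prefix`, and `resolve_model` are illustrative assumptions, not the actual `Opt` class API in `examples/run`.

```cpp
#include <iostream>
#include <string>

// Hypothetical sketch of prefix-based model resolution. The real Opt class
// in examples/run may be structured differently.
enum class Protocol { Ollama, HuggingFace, Https, File };

struct ModelRef {
    Protocol    proto;
    std::string path;   // argument with any "scheme://" prefix removed
};

// Remove `prefix` from the front of `s` in place; report whether it matched.
static bool strip_prefix(std::string & s, const std::string & prefix) {
    if (s.rfind(prefix, 0) == 0) {   // rfind(..., 0) tests "starts with"
        s = s.substr(prefix.size());
        return true;
    }
    return false;
}

// Map a positional `model` argument to a download source.
static ModelRef resolve_model(std::string arg) {
    if (strip_prefix(arg, "ollama://"))      { return { Protocol::Ollama,      arg }; }
    if (strip_prefix(arg, "hf://"))          { return { Protocol::HuggingFace, arg }; }
    if (strip_prefix(arg, "huggingface://")) { return { Protocol::HuggingFace, arg }; }
    if (strip_prefix(arg, "https://"))       { return { Protocol::Https,       arg }; }
    if (strip_prefix(arg, "file://"))        { return { Protocol::File,        arg }; }
    // A bare *.gguf filename is a local file; any other bare name
    // (e.g. "llama3") defaults to the Ollama registry.
    if (arg.size() >= 5 && arg.compare(arg.size() - 5, 5, ".gguf") == 0) {
        return { Protocol::File, arg };
    }
    return { Protocol::Ollama, arg };
}

static const char * proto_name(Protocol p) {
    switch (p) {
        case Protocol::Ollama:      return "ollama registry";
        case Protocol::HuggingFace: return "huggingface";
        case Protocol::Https:       return "https";
        case Protocol::File:        return "local file";
    }
    return "?";
}

int main(int argc, char ** argv) {
    if (argc < 2) {
        std::cerr << "usage: " << argv[0] << " MODEL [PROMPT]\n";
        return 1;
    }
    // Positional arguments: model (required) and prompt (optional).
    const ModelRef    model  = resolve_model(argv[1]);
    const std::string prompt = argc > 2 ? argv[2] : "";

    std::cout << "source:     " << proto_name(model.proto) << "\n"
              << "model path: " << model.path << "\n"
              << "prompt:     " << (prompt.empty() ? "(none)" : prompt) << "\n";
    return 0;
}
```

Under this shape, a bare name such as `granite-code` falls through to the Ollama registry, matching the default described in the README section below.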
README.md | 14 ++++++++++++++
@@ -433,6 +433,20 @@ To learn more about model quantization, [read this documentation](examples/quant

</details>

## [`llama-run`](examples/run)

#### A comprehensive example for running `llama.cpp` models. Useful for inference. Used with RamaLama [^3].

- <details>
    <summary>Run a model with a specific prompt (by default it's pulled from the Ollama registry)</summary>

    ```bash
    llama-run granite-code
    ```

    </details>

[^3]: [RamaLama](https://github.com/containers/ramalama)

## [`llama-simple`](examples/simple)