Tokenizer SPM fixes for phi-3 and llama-spm (bugfix) (#7425)
* Update brute force test: add_special
* Update brute force test: default values for add_bos_token and add_eos_token
* Enable rtrim when pre-inserting BOS
* Revert "server : fix test regexes"

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
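Two of the items above benefit from a sketch. First, the brute force test compares llama.cpp's tokenizer against the Hugging Face one, so both sides must agree on whether BOS/EOS get added when tokenizer_config.json omits the flags. A minimal illustration of resolving such defaults (the helper is hypothetical; LLaMA-family SPM models conventionally default to add_bos_token=True, add_eos_token=False):

    import json

    def read_add_special_defaults(path: str = "tokenizer_config.json") -> "tuple[bool, bool]":
        # Hypothetical helper: fall back to the conventional LLaMA SPM
        # defaults when the config omits the flags.
        with open(path) as f:
            cfg = json.load(f)
        return bool(cfg.get("add_bos_token", True)), bool(cfg.get("add_eos_token", False))

Second, "enable rtrim when pre-inserting BOS" concerns SPM whitespace handling: when BOS carries the rstrip attribute (as in phi-3), whitespace to the right of a pre-inserted BOS must be trimmed before the rest of the text is encoded, so a prompt with and without a leading space tokenizes identically. A conceptual sketch, not llama.cpp's actual implementation (encode stands in for the SentencePiece piece encoder):

    def tokenize_with_bos(text: str, bos_id: int, encode) -> "list[int]":
        # rstrip on BOS means whitespace to its right is dropped before
        # encoding, so " Hello" and "Hello" yield the same ids here.
        return [bos_id] + encode(text.lstrip())

Without the trim, the space after BOS is folded into the first piece, which is the kind of mismatch the brute force test is meant to catch.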
@@ -26,7 +26,7 @@ Feature: llama.cpp server slot management
     # Since we have cache, this should only process the last tokens
     Given a user prompt "What is the capital of Germany?"
     And a completion request with no api error
-    Then 24 tokens are predicted matching (Thank|special|Lily)
+    Then 24 tokens are predicted matching (Thank|special)
     And 7 prompt tokens are processed
     # Loading the original cache into slot 0,
     # we should only be processing 1 prompt token and get the same output
@@ -41,7 +41,7 @@ Feature: llama.cpp server slot management
     Given a user prompt "What is the capital of Germany?"
     And using slot id 1
     And a completion request with no api error
-    Then 24 tokens are predicted matching (Thank|special|Lily)
+    Then 24 tokens are predicted matching (Thank|special)
     And 1 prompt tokens are processed

   Scenario: Erase Slot
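For context on the lines above: the pattern at the end of each "Then ... tokens are predicted matching ..." step is a regular expression applied to the completion text, so dropping |Lily tightens the assertion back to its pre-workaround wording. A minimal sketch of how such a behave step can be defined (illustrative; the real step definitions in the server test suite track more state, and the response field names here are assumptions):

    import re
    from behave import step

    @step('{n_predicted:d} tokens are predicted matching {pattern}')
    def step_n_tokens_predicted(context, n_predicted, pattern):
        # context.completion is assumed to hold the last completion response.
        content = context.completion["content"]
        assert context.completion["timings"]["predicted_n"] == n_predicted, \
            f"expected {n_predicted} predicted tokens"
        assert re.search(pattern, content), \
            f"pattern {pattern!r} not found in {content!r}"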