server: fix regression on streamed non-chat completion w/ stops (#13785)

* more forgiving message diffs: partial stop words aren't erased, full stops are
* Add (slow) server test for completion + stream + stop
2025-06-26 19:55:04 +00:00 · 2025-05-26 06:16:37 -07:00
parent 79c137f776
commit f13847cfb5
2 changed files with 29 additions and 0 deletions
--- a/common/chat.cpp
+++ b/common/chat.cpp
@@ -31,6 +31,11 @@ static std::string string_diff(const std::string & last, const std::string & cur
        return current;
    }
    if (!string_starts_with(current, last)) {
+        if (string_starts_with(last, current)) {
+            // This happens if the last generation ended on a partial stop word (not erased),
+            // and the current ended on a stop word (erased).
+            return "";
+        }
        throw std::runtime_error("Invalid diff: '" + last + "' not found at start of '" + current + "'");
    }
    return current.substr(last.size());
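
To illustrate the case this hunk handles, here is a minimal standalone sketch of the patched `string_diff`. It is not the repository's code verbatim: the `string_starts_with` helper below is an assumed stand-in for the project's own, and the `last.empty()` guard is an assumption about the condition that sits just above the hunk (only its `return current;` body is visible in the diff).

```cpp
#include <stdexcept>
#include <string>

// Assumed stand-in for the project's string_starts_with helper.
static bool string_starts_with(const std::string & str, const std::string & prefix) {
    return str.rfind(prefix, 0) == 0;
}

// Returns the text that `current` adds on top of `last`.
static std::string string_diff(const std::string & last, const std::string & current) {
    if (last.empty()) { // assumed guard; not visible in the hunk
        return current;
    }
    if (!string_starts_with(current, last)) {
        if (string_starts_with(last, current)) {
            // `current` is shorter than `last`: the previous chunk kept a partial
            // stop word, and the full stop word has now been erased. Nothing new to emit.
            return "";
        }
        throw std::runtime_error("Invalid diff: '" + last + "' not found at start of '" + current + "'");
    }
    return current.substr(last.size());
}
```

With a hypothetical stop word `world`, a stream might go from `last = "Hello wor"` (partial stop kept) to `current = "Hello "` (full stop erased). Before this change that pair would throw; with it, the diff is the empty string. Inputs that share no prefix relationship, such as `"abc"` and `"xyz"`, still throw.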