server: fix regression on streamed non-chat completion w/ stops (#13785)

* more forgiving message diffs: partial stop words aren't erased, full stops are

* Add (slow) server test for completion + stream + stop
Olivier Chafik
2025-05-26 06:16:37 -07:00
committed by GitHub
parent 79c137f776
commit f13847cfb5
2 changed files with 29 additions and 0 deletions

@@ -31,6 +31,11 @@ static std::string string_diff(const std::string & last, const std::string & cur
         return current;
     }
     if (!string_starts_with(current, last)) {
+        if (string_starts_with(last, current)) {
+            // This happens if the last generation ended on a partial stop word (not erased),
+            // and the current ended on a stop word (erased).
+            return "";
+        }
         throw std::runtime_error("Invalid diff: '" + last + "' not found at start of '" + current + "'");
     }
     return current.substr(last.size());
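
To make the new branch concrete, here is a minimal standalone sketch of the patched function. The first branch (`last.empty()`) is inferred from the hunk's leading context lines, and `string_starts_with` is a local stand-in assumed to be a plain prefix test; the project's own helper may be implemented differently. The strings and the `</s>` stop word in `run_examples` are hypothetical.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Stand-in for the helper called in the diff (assumption: a simple prefix test).
static bool string_starts_with(const std::string & str, const std::string & prefix) {
    return str.rfind(prefix, 0) == 0;
}

// Returns the text appended to `last` to obtain `current`.
static std::string string_diff(const std::string & last, const std::string & current) {
    if (last.empty()) {
        return current;
    }
    if (!string_starts_with(current, last)) {
        if (string_starts_with(last, current)) {
            // `current` is a prefix of `last`: the previous chunk kept a partial
            // stop word, and the full stop word has now been erased.
            return "";
        }
        throw std::runtime_error("Invalid diff: '" + last + "' not found at start of '" + current + "'");
    }
    return current.substr(last.size());
}

// Illustrative cases only; "</s>" is a hypothetical stop word.
static void run_examples() {
    // Normal streaming: the new text extends the old text.
    assert(string_diff("", "Hello") == "Hello");
    assert(string_diff("Hello", "Hello, wor") == ", wor");

    // Previous chunk ended on the partial stop word "</" (kept);
    // the current one ended on the full stop word (erased).
    assert(string_diff("Hello</", "Hello") == "");

    // Unrelated strings are still rejected.
    bool threw = false;
    try {
        string_diff("Hello", "Goodbye");
    } catch (const std::runtime_error &) {
        threw = true;
    }
    assert(threw);
}
```

In this sketch the shrinking case yields an empty delta rather than an exception, which matches the "more forgiving message diffs" described in the commit message; a `current` that neither extends `last` nor is a prefix of it still throws.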