Xuan Son Nguyen
958367bf53
server : refactor slot input data, move tokenizer to HTTP thread (#10023)
* server : refactor slot input data, move tokenizer to HTTP thread
* move prompt_tokens.empty() check
* fix incorrect if branch
* fix infinite generation loop
* bring back infill validation
* add infill test
* try fixing format_infill
* fix test
* remove redundant code
* rename completion to inference
* update docs
* use llama_tokens everywhere
2024-10-24 21:51:22 +02:00