llama.cpp/common at a59f8fdc85e1119d470d8766e29617962549d993 - llama.cpp - Cat's Mantra

tqcq/llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-08-17 05:25:09 -04:00

Kevin Wang 470939d483 common : preallocate sampling token data vector (#8363)

Calling `emplace_back` repeatedly is slower than preallocating the vector to the vocab size and inserting the data directly. Some rudimentary profiling with `chrono` shows this block of code improving from ~500us/op to ~40us/op.

Overall, this slightly improves sampling performance, which has a more substantial impact on the `examples/lookahead` implementation: I am able to see a ~10% performance boost in lookahead inference.

2024-07-08 10:26:53 +03:00
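
The change described in the commit message above can be illustrated with a minimal sketch. This is not the code from `sampling.cpp`: the `token_data` struct and both function names are hypothetical stand-ins, and the sketch only assumes that one candidate entry is built per vocab token from a vector of logits.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a per-token sampling candidate
// (an assumption for illustration, not the library's own type).
struct token_data {
    int   id;
    float logit;
    float p;
};

// Baseline: grow the vector one element at a time.
// Without a prior reserve(), this may reallocate several times as it grows.
std::vector<token_data> fill_with_emplace_back(const std::vector<float> & logits) {
    std::vector<token_data> cur;
    for (std::size_t i = 0; i < logits.size(); ++i) {
        cur.emplace_back(token_data{static_cast<int>(i), logits[i], 0.0f});
    }
    return cur;
}

// Preallocated: size the vector to the vocab size once, then assign by index.
// If the caller keeps `cur` alive between calls, resize() does no allocation
// after the first call because the size no longer changes.
void fill_preallocated(const std::vector<float> & logits, std::vector<token_data> & cur) {
    cur.resize(logits.size());
    for (std::size_t i = 0; i < logits.size(); ++i) {
        cur[i] = token_data{static_cast<int>(i), logits[i], 0.0f};
    }
}
```

Both functions produce the same contents; the second avoids repeated growth and lets a caller reuse a single buffer across sampling steps. The actual speedup depends on the vocab size, compiler, and allocator, so the figures quoted in the commit message should not be expected from this sketch.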

..

llama : reorganize source code + improve CMake (#8006)

2024-06-26 18:33:02 +03:00

base64.hpp

llava : expose as a shared library for downstream projects (#3613)

2023-11-07 00:36:23 +03:00

build-info.cpp.in

build : link against build info instead of compiling against it (#3879)

2023-11-02 08:50:16 +02:00

CMakeLists.txt

llama : reorganize source code + improve CMake (#8006)

2024-06-26 18:33:02 +03:00

common.cpp

added support for Authorization Bearer tokens when downloading model (#8307)

2024-07-06 22:32:04 +02:00

common.h

added support for Authorization Bearer tokens when downloading model (#8307)

2024-07-06 22:32:04 +02:00

console.cpp

…

console.h

…

grammar-parser.cpp

Added support for . (any character) token in grammar engine. (#6467)

2024-06-06 06:08:52 -07:00

grammar-parser.h

…

json-schema-to-grammar.cpp

json: restore default additionalProperties to false, fix some pattern escapes (#8180)

2024-06-28 09:26:45 +01:00

json-schema-to-grammar.h

JSON: [key] -> .at(key), assert() -> GGML_ASSERT (#7143)

2024-05-08 21:53:08 +02:00

json.hpp

json-schema-to-grammar improvements (+ added to server) (#5978)

2024-03-21 11:50:43 +00:00

log.h

infill : assert prefix/suffix tokens + remove old space logic (#8351)

2024-07-08 09:34:35 +03:00

ngram-cache.cpp

Fixed lookup compilation issues on Windows (#6273)

2024-03-24 14:21:17 +01:00

ngram-cache.h

lookup: complement data from context with general text statistics (#5479)

2024-03-23 01:24:36 +01:00

sampling.cpp

common : preallocate sampling token data vector (#8363)

2024-07-08 10:26:53 +03:00

sampling.h

common : normalize naming style (#7462)

2024-05-22 20:04:20 +03:00

stb_image.h

…

train.cpp

train : change default FA argument (#7528)

2024-05-25 15:22:35 +03:00

train.h

sync : ggml (backend v2) (#3912)

2023-11-13 14:16:23 +02:00