* Add support for Llada-8b: diffusion model
* Add README
* Fix README and convert_hf_to_gguf
* convert_hf_to_gguf.py: address review comments
* Make everything in a single example
* Remove model-specific sampling
* Remove unused argmax
* Remove braced initializers, improve README.md a bit
* Add diffusion specific gguf params in set_vocab, remove setting rope_theta and rms_norm_eps
* Remove adding the mask token
* Move add_add_bos_token to set_vocab
* use add_bool in gguf_writer.py
* Support diffusion models: Add Dream 7B
* Move diffusion to examples
* Move stuff to examples. Add patch to not use kv-cache
* Address review comments
* Make sampling fast
* llama: remove diffusion functions
* Add basic timings + cleanup
* More cleanup
* Review comments: better formating, use LOG instead std::cerr, re-use batch, use ubatch instead of max_length
* fixup!
* Review: move everything to diffusion-cli for now