* CUDA: optimize and refactor MMQ
* explicit q8_1 memory layouts, add documentation