a7528a5d2b
1) Inline EncodeFixed{32,64}(). They emit single machine instructions on 64-bit processors. 2) Remove size narrowing compiler warnings from DecodeFixed{32,64}(). 3) Add comments explaining the current state of optimizations in compilers we care about. 4) Change C-style includes, like <stdint.h>, to C++ style, like <cstdint>. 5) memcpy -> std::memcpy. The optimization comments are based on https://godbolt.org/z/RdIqS1. The missed optimization opportunities in clang have been reported as https://bugs.llvm.org/show_bug.cgi?id=41761 The change does not have significant impact on benchmarks. Results below. LevelDB: version 1.22 Date: Mon May 6 10:42:18 2019 CPU: 72 * Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz CPUCache: 25344 KB Keys: 16 bytes each Values: 100 bytes each (50 bytes after compression) Entries: 1000000 RawSize: 110.6 MB (estimated) FileSize: 62.9 MB (estimated) With change ------------------------------------------------ fillseq : 2.327 micros/op; 47.5 MB/s fillsync : 4185.526 micros/op; 0.0 MB/s (1000 ops) fillrandom : 3.662 micros/op; 30.2 MB/s overwrite : 4.261 micros/op; 26.0 MB/s readrandom : 4.239 micros/op; (1000000 of 1000000 found) readrandom : 3.649 micros/op; (1000000 of 1000000 found) readseq : 0.174 micros/op; 636.7 MB/s readreverse : 0.271 micros/op; 408.7 MB/s compact : 570495.000 micros/op; readrandom : 2.735 micros/op; (1000000 of 1000000 found) readseq : 0.118 micros/op; 937.3 MB/s readreverse : 0.190 micros/op; 583.7 MB/s fill100K : 860.164 micros/op; 110.9 MB/s (1000 ops) crc32c : 1.131 micros/op; 3455.2 MB/s (4K per op) snappycomp : 3.034 micros/op; 1287.5 MB/s (output: 55.1%) snappyuncomp : 0.544 micros/op; 7176.0 MB/s Baseline ------------------------------------------------ fillseq : 2.365 micros/op; 46.8 MB/s fillsync : 4240.165 micros/op; 0.0 MB/s (1000 ops) fillrandom : 3.244 micros/op; 34.1 MB/s overwrite : 4.153 micros/op; 26.6 MB/s readrandom : 4.698 micros/op; (1000000 of 1000000 found) readrandom : 4.065 micros/op; (1000000 of 1000000 found) readseq : 0.192 micros/op; 576.3 MB/s readreverse : 0.286 micros/op; 386.7 MB/s compact : 635979.000 micros/op; readrandom : 3.264 micros/op; (1000000 of 1000000 found) readseq : 0.169 micros/op; 652.8 MB/s readreverse : 0.213 micros/op; 519.5 MB/s fill100K : 1055.367 micros/op; 90.4 MB/s (1000 ops) crc32c : 1.353 micros/op; 2887.3 MB/s (4K per op) snappycomp : 3.036 micros/op; 1286.7 MB/s (output: 55.1%) snappyuncomp : 0.540 micros/op; 7238.6 MB/s PiperOrigin-RevId: 246856811