From 1bfbe76b4e3677c6ba5cff0a94b7c90a47d58d25 Mon Sep 17 00:00:00 2001
From: "gabor@google.com"
Date: Fri, 29 Jul 2011 21:35:05 +0000
Subject: [PATCH] Improved benchmark, fixed bugs and SQLite parameters.

- Based on suggestions on the sqlite-users mailing list, we removed the
  superfluous index on the primary key for SQLite's benchmarks, and turned
  write-ahead logging ("WAL") on. This led to performance improvements for
  SQLite.

- Based on a suggestion by Florian Weimer on the leveldb mailing list, we
  disabled hard drive write-caching via hdparm when testing synchronous
  writes. This led to performance losses for LevelDB and Kyoto TreeDB.

- Fixed a mistake in 2.A.->Random where the bar sizes were switched for
  Kyoto TreeDB and SQLite.

git-svn-id: https://leveldb.googlecode.com/svn/trunk@45 62dab493-f737-651d-591e-8d6aee1b9529
---
 doc/bench/db_bench_sqlite3.cc |  10 ++-
 doc/benchmark.html            | 126 +++++++++++++++++-----------------
 2 files changed, 69 insertions(+), 67 deletions(-)

diff --git a/doc/bench/db_bench_sqlite3.cc b/doc/bench/db_bench_sqlite3.cc
index a6f9a75..a15510e 100644
--- a/doc/bench/db_bench_sqlite3.cc
+++ b/doc/bench/db_bench_sqlite3.cc
@@ -74,7 +74,7 @@ static bool FLAGS_use_existing_db = false;
 static bool FLAGS_transaction = true;
 
 // If true, we enable Write-Ahead Logging
-static bool FLAGS_WAL_enabled = false;
+static bool FLAGS_WAL_enabled = true;
 
 inline
 static void ExecErrorCheck(int status, char *err_msg) {
@@ -448,16 +448,20 @@ class Benchmark {
     // Change journal mode to WAL if WAL enabled flag is on
     if (FLAGS_WAL_enabled) {
       std::string WAL_stmt = "PRAGMA journal_mode = WAL";
+
+      // LevelDB's default cache size is a combined 4 MB
+      std::string WAL_checkpoint = "PRAGMA wal_autocheckpoint = 4096";
       status = sqlite3_exec(db_, WAL_stmt.c_str(), NULL, NULL, &err_msg);
       ExecErrorCheck(status, err_msg);
+      status = sqlite3_exec(db_, WAL_checkpoint.c_str(), NULL, NULL, &err_msg);
+      ExecErrorCheck(status, err_msg);
     }
 
     // Change locking mode to exclusive and create tables/index for database
     std::string locking_stmt = "PRAGMA locking_mode = EXCLUSIVE";
     std::string create_stmt =
       "CREATE TABLE test (key blob, value blob, PRIMARY KEY(key))";
-    std::string index_stmt = "CREATE INDEX keyindex ON test (key)";
-    std::string stmt_array[] = { locking_stmt, create_stmt, index_stmt };
+    std::string stmt_array[] = { locking_stmt, create_stmt };
     int stmt_array_length = sizeof(stmt_array) / sizeof(std::string);
     for (int i = 0; i < stmt_array_length; i++) {
       status = sqlite3_exec(db_, stmt_array[i].c_str(), NULL, NULL, &err_msg);
diff --git a/doc/benchmark.html b/doc/benchmark.html
index f842118..a0d6b02 100644
--- a/doc/benchmark.html
+++ b/doc/benchmark.html
@@ -85,7 +85,7 @@ div.bsql {

In order to test LevelDB's performance, we benchmark it against other well-established database implementations. We compare LevelDB (revision 39) against SQLite3 (version 3.7.6.3) and Kyoto Cabinet's (version 1.2.67) TreeDB (a B+Tree based key-value store). We would like to acknowledge Scott Hess and Mikio Hirabayashi for their suggestions and contributions to the SQLite3 and Kyoto Cabinet benchmarks, respectively.

-Benchmarks were all performed on a six-core Intel(R) Xeon(R) CPU X5650 @ 2.67GHz, with 12288 KB of total L3 cache and 12 GB of DDR3 RAM at 1333 MHz. (Note that LevelDB uses at most two CPUs since the benchmarks are single threaded: one to run the benchmark, and one for background compactions.) We ran the benchmarks on two machines (with identical processors), one with an Ext3 file system and one with an Ext4 file system. The machine with the Ext3 file system has a SATA Hitachi HDS721050CLA362 hard drive. The machine with the Ext4 file system has a SATA Samsung HD502HJ hard drive. Both hard drives spin at 7200 RPM. The numbers reported below are the median of three measurements.
+Benchmarks were all performed on a six-core Intel(R) Xeon(R) CPU X5650 @ 2.67GHz, with 12288 KB of total L3 cache and 12 GB of DDR3 RAM at 1333 MHz. (Note that LevelDB uses at most two CPUs since the benchmarks are single threaded: one to run the benchmark, and one for background compactions.) We ran the benchmarks on two machines (with identical processors), one with an Ext3 file system and one with an Ext4 file system. The machine with the Ext3 file system has a SATA Hitachi HDS721050CLA362 hard drive. The machine with the Ext4 file system has a SATA Samsung HD502HJ hard drive. Both hard drives spin at 7200 RPM and have hard drive write-caching enabled (using `hdparm -W 1 [device]`). The numbers reported below are the median of three measurements.

Benchmark Source Code

We wrote benchmark tools for SQLite and Kyoto TreeDB based on LevelDB's db_bench. The code for each of the benchmarks resides here:

@@ -97,9 +97,9 @@ div.bsql {

Custom Build Specifications

1. Baseline Performance

@@ -130,8 +130,8 @@ parameters are varied. For the baseline:

1,010,000 ops/sec
 
SQLite3
-186,000 ops/sec
+174,000 ops/sec

B. Random Reads

@@ -142,8 +142,8 @@ parameters are varied. For the baseline:

151,000 ops/sec
SQLite3
-146,000 ops/sec
+134,000 ops/sec

C. Sequential Writes

@@ -154,8 +154,8 @@ parameters are varied. For the baseline:

342,000 ops/sec
SQLite3
-26,900 ops/sec
+48,600 ops/sec

D. Random Writes

@@ -166,8 +166,8 @@ parameters are varied. For the baseline:

88,500 ops/sec
SQLite3
-420 ops/sec
+9,860 ops/sec

LevelDB outperforms both SQLite3 and TreeDB in sequential and random write operations and sequential read operations. Kyoto Cabinet has the fastest random read operations.

@@ -178,26 +178,26 @@ parameters are varied. For the baseline:

Sequential Writes

LevelDB
-1,060 ops/sec
+1,100 ops/sec
Kyoto TreeDB
-1,020 ops/sec
+1,000 ops/sec
SQLite3
-2,910 ops/sec
+1,600 ops/sec

Random Writes

LevelDB 480 ops/sec
Kyoto TreeDB 1,100 ops/sec
SQLite3
-2,200 ops/sec
+1,600 ops/sec

LevelDB doesn't perform as well with large values of 100,000 bytes each. This is because LevelDB writes keys and values at least twice: first time to the transaction log, and second time (during a compaction) to a sorted file. With larger values, LevelDB's per-operation efficiency is swamped by the
@@ -211,9 +211,9 @@ cost of extra copies of large values.

 
(1.08x baseline)
SQLite3
-100,000 entries/sec
-(3.72x baseline)
+124,000 entries/sec
+(2.55x baseline)

Random Writes

@@ -222,22 +222,20 @@ cost of extra copies of large values.

(1.35x baseline)
SQLite3
-1,000 entries/sec
-(2.38x baseline)
+22,000 entries/sec
+(2.23x baseline)

Because of the way LevelDB persistent storage is organized, batches of random writes are not much slower (only a factor of 4x) than batches
-of sequential writes. However SQLite3 sees a significant slowdown
-(factor of 100x) when switching from sequential to random batch
-writes. This is because each random batch write in SQLite3 has to
-update approximately as many pages as there are keys in the batch.
+of sequential writes.

C. Synchronous Writes

In the following benchmark, we enable the synchronous writing modes of all of the databases. Since this change significantly slows down the
-benchmark, we stop after 10,000 writes.
+benchmark, we stop after 10,000 writes. For synchronous write tests, we've
+disabled hard drive write-caching (using `hdparm -W 0 [device]`).