Commit Graph

104 Commits

Author SHA1 Message Date
Kyle Zhang
d3d1c8a0f4 don't check current key in DBIter::Next()
When iter_ is pointing to current key, we can safely move to the next
key to avoid checking current key, which is of course not necessary.

Benchmark shows that 'readseq' has about 8% performance improvement.

Without patch:

>./db_bench --benchmarks=readseq --num=$((4<<20)) --db=/tmp/db --use_existing_db=1
LevelDB:    version 1.21
Date:       Thu Apr 25 09:37:21 2019
CPU:        32 * Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz
CPUCache:   20480 KB
Keys:       16 bytes each
Values:     100 bytes each (50 bytes after compression)
Entries:    4194304
RawSize:    464.0 MB (estimated)
FileSize:   264.0 MB (estimated)
------------------------------------------------
readseq      :       0.196 micros/op;  565.7 MB/s

With patch:

>./db_bench --benchmarks=readseq --num=$((4<<20)) --db=/tmp/db --use_existing_db=1
LevelDB:    version 1.21
Date:       Thu Apr 25 09:38:20 2019
CPU:        32 * Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz
CPUCache:   20480 KB
Keys:       16 bytes each
Values:     100 bytes each (50 bytes after compression)
Entries:    4194304
RawSize:    464.0 MB (estimated)
FileSize:   264.0 MB (estimated)
------------------------------------------------
readseq      :       0.181 micros/op;  612.3 MB/s

Signed-off-by: Kyle Zhang <kyle@smartx.com>
2019-04-25 09:54:05 +08:00
Chris Mumford
2f008ac19e Initialize class members to default values in constructors.
There were a few members which were identified to have been left
uninitialized in some constructors. These were very likely to
have been set before being used, otherwise the ASan tests would
have caught them, but still good practice to have them
initialized. This addresses some items reported in issue #668.

PiperOrigin-RevId: 243370145
2019-04-12 18:50:15 -07:00
Chris Mumford
65e86f75ea Fix formatting of recent snapshot compaction fix.
Fix variable names, line lengths, namespace use, and a few other
minor issues to conform to Google C++ style guide.

PiperOrigin-RevId: 243187729
2019-04-12 00:17:03 -07:00
Chris Mumford
7711e76766 Merge pull request #339 from richcole-at-amazon:master
PiperOrigin-RevId: 243156105
2019-04-12 00:16:52 -07:00
Chris Mumford
71ed7c401e Fixed typo in comment in version_set.h.
Flagged by presubmit check.

PiperOrigin-RevId: 243118632
2019-04-12 00:16:41 -07:00
Richard Cole
20fb601aa9 Fix snapshot compaction bug
Closes google/leveldb#320

During compaction it was possible that records from a block b1=(l1,u1)
would be pushed down from level i to level i+1. If there is a block
b2=(l2,u2) at level i with k1 = user_key(u1) = user_key(l2) then
a subsequent search for k1 will yield the record l2 which has a smaller
sequence number than u1 because the sort order for records sorts
increasing by user key but decreaing by sequence number.

This change add a call to a new function AddBoundaryInputs to
SetupOtherInputs. AddBoundaryInputs searches for a block b2 matching the
criteria above and adds it to the set of files to be compacted. Whenever
AddBoundaryInputs is called it is important that the compaction fileset
in level i+1 (known as c->inputs_[1] in the code) be recomputed. Each
call to AddBoundaryInputs is followed by a call to GetOverlappingInputs.

SetupOtherInputs is called on both manual and automated compaction
passes. It is called for both level zero and for levels greater than 0.

The original change posted in https://github.com/google/leveldb/pull/339
has been modified to also include changed made by Chris Mumford<cmumford@google.com>
in 4b72cb14f8

  1. Releasing snapshots during test cleanup to avoid
     memory leak warnings.
  2. Refactored test to use testutil.h to be in line
     with other issue tests and to create the test
     database in the correct temporary location.
  3. Added copyright banner.

  Otherwise, just minor formatting and limiting character
  width to 80 characters.

Additionally the change was rebased on top of current master and
changes previously made to the Makefile were ported to the
CMakeLists.txt.

Testing Done:

  A test program (issue320_test) was constructed that performs mutations
  while snapshots are active. issue320_test fails without this bug fix
  after 64k writes. It passes with this bug fix. It was run with 200M
  writes and passed.

  Unit tests were written for the new function that was added to the
  code. Make test was run and seen to pass.

Signed-off-by: Richard Cole <richcole@amazon.com>
2019-04-01 13:10:09 -07:00
costan
6188a54ce9 leveldb: Add tests for empty keys and values.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=239695281
2019-03-29 11:22:22 -07:00
usurai
6571279d6d fix a typo in the comment of skiplist_test.cc (#664) 2019-03-21 07:58:29 -07:00
cmumford
ea49b27d06 Switch corruption_test to use InMemEnv.
This change switches corruption_test, which previously used direct file
I/O to corrupt table files for open databases, to use InMemEnv. Using an
Env eliminates some platform dependencies thus simplifying the tests.

Also removed EnvWindowsTestHelper::RelaxFilePermissions().  This was
only added because the Windows Env opens files for exclusive access.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=239305329
2019-03-20 13:57:03 -07:00
costan
201f77d137 Inline defaults in options.
This CL moves default values for
leveldb::{Options,ReadOptions,WriteOptions} from constructors to member
declarations, and removes now-redundant comments stating the defaults.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=239271242
2019-03-20 13:56:22 -07:00
costan
7d8e41e49b leveldb: Replace AtomicPointer with std::atomic.
This CL removes AtomicPointer from leveldb's port interface. Its usage is replaced with std::atomic<> from the C++11 standard library.

AtomicPointer was used to wrap flags, numbers, and pointers, so its instances are replaced with std::atomic<bool>, std::atomic<int>, std::atomic<size_t> and std::atomic<Node*>.

This CL does not revise the memory ordering. AtomicPointer's methods are replaced mechanically with their std::atomic equivalents, even when the underlying usage is incorrect. (Example: DBImpl::has_imm_ is written using release stores, even though it is always read using relaxed ordering.) Revising the memory ordering is left for future CLs.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=237865146
2019-03-11 13:41:25 -07:00
Dimitris Apostolou
a20508dc6a Fix typo (#565) 2019-03-11 10:36:11 -07:00
cmumford
c69d33b0ec Added native support for Windows.
This change adds a native Windows port (port_windows.h) and a
Windows Env (WindowsEnv).

Note1: "small" is defined when including <Windows.h> so some
parameters were renamed to avoid conflict.

Note2: leveldb::Env defines the method: "DeleteFile" which is
also a constant defined when including <Windows.h>. The solution
was to ensure this macro is defined in env.h which forces
the function, when compiled, to be either DeleteFileA or
DeleteFileW when building for MBCS or UNICODE respectively.

This resolves #519 on GitHub.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=236364778
2019-03-01 18:00:35 -08:00
costan
fe4494804f leveldb: Make WriteBatch::ApproximateSize() const.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=229395810
2019-01-15 18:43:13 +00:00
costan
89af27bde5 Remove ssize_t from code that is not POSIX-specific.
ssize_t is not standard C++. It is a POSIX extension. Therefore, it does
not belong in generic code.

This change tweaks the logic in DBIter to remove the need for signed
integers, so ssize_t can be replaced with size_t. The impacted method
and private member are renamed to better express their purpose.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=211471606
2018-09-04 10:37:22 -07:00
costan
16a2b8bb3a Expose WriteBatch::Append in the C API.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=209345072
2018-08-19 19:54:34 -07:00
costan
f7b0e1d901 Expose WriteBatch::Append().
WriteBatchInternal has a method for efficiently concatenating two
WriteBatches. This commit exposes the method to the public API.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=208724311
2018-08-14 15:30:29 -07:00
costan
1868398150 Clean up SnapshotImpl.
* Omit SnapshotImpl::list_ when assert() isn't on
* Make SnapshotImpl::number_ const and set it in the constructor
* Make SnapshotImpl::number_ private and access it via a getter
* Rename SnapshotImpl::number_ to SnapshotImpl::sequence_number_
* Rename SnapshotList::list_ to SnapshotList::head_
* Wrap casting from Snapshot* to SnapshotImpl* in ToSnapshotImpl()

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=194852828
2018-04-30 16:01:39 -07:00
MarcoFalke
14cce848e7 Fix sign mismatch warnings in GCC.
This was contributed in https://github.com/google/leveldb/pull/492

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=193080913
2018-04-16 18:13:09 -07:00
costan
09217fd067 Replace NULL with nullptr in C++ files.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=192365747
2018-04-10 16:26:43 -07:00
costan
0db30413a4 leveldb: Add more thread safety annotations.
After this CL, all classes with Mutex members should be covered by annotations. Exceptions are atomic members, which shouldn't need locking, and DBImpl members that cause errors when annotated, which will be tackled separately.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=190260865
2018-03-23 12:56:14 -07:00
costan
739c25100e Add CMake build support.
Fixes https://github.com/google/leveldb/issues/466

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=189425354
2018-03-16 19:17:27 -07:00
costan
0fa5a4f7b1 Extend thread safety annotations.
This CL makes it easier to reason about thread safety by:

1) Adding Clang thread safety annotations according to comments.
2) Expanding a couple of variable names, without adding extra lines of code.
3) Adding const in a couple of places.
4) Replacing an always-non-null const pointer with a reference.
5) Fixing style warnings in the modified files.

This CL does not annotate the DBImpl members that claim to be protected
by the instance mutex, but are accessed without the mutex being held.
Those members (and their unprotected accesses) will be addressed in
future CLs.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=189354657
2018-03-16 10:32:40 -07:00
costan
8143c12f3f Fix includes in util/testharness.h.
This CL removes unused headers included by util/testharness.h, adds
precise includes where the build breaks, and fixes style errors in the
edited files.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=189331061
2018-03-16 10:31:48 -07:00
costan
aece2068d7 Remove extern from function declarations.
External linkage is the default for function declarations in C++.

This also fixes ClangTidy errors generated by removing the "extern"
keyword as described above.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=188730416
2018-03-12 09:24:48 -07:00
costan
ddab751002 Add tests for {Old}InfoLogFileName().
This change was contributed by GitHub user @LopatkinEvgeniy in
https://github.com/google/leveldb/pull/559.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=188728845
2018-03-12 09:24:25 -07:00
costan
7fd7c00721 Remove unused function ExtractValueType.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=188728505
2018-03-12 09:24:07 -07:00
costan
623d014a54 Expose Env::GetTempDirectory() for use in C test.
This removes the use of the non-portable headers <sys/types.h> and <unistd.h> in c_test.c.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=188503102
2018-03-09 10:38:04 -08:00
cmumford
47cb9e2a21 Add leveldb_options_set_max_file_size to the C API.
When the max file size option was added in CL 134391640 the C API
was not modified to support this.

This change was contributed by GitHub user @olt and fixes issue #439.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=173466388
2017-11-03 15:04:26 -07:00
cmumford
3da4d8b989 Deleted unused assignments in Reader.
Deleted two unused assignments:

1. offset_in_block in Reader::SkipToInitialBlock().
2. in_fragmented_record in Reader::ReadRecord().

Reasons for the change:
1. offset_in_block is not read again after the if condition.
2. The kFullRecordType switch branch returns, so
   in_fragmented_record isn't read again.
3. The kFirstType switch branch sets in_fragmented_record to
   true after the if, so the write in the if is ignored.

Change contributed by @C0deAi on GitHub.

This fixes https://github.com/google/leveldb/issues/517

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=172763897
2017-11-03 15:03:44 -07:00
cmumford
0509414f85 leveldb::DestroyDB will now delete empty directories.
Env's that filtered out dot files ("." and "..") would return an
empty vector of children causing DestroyDB to do nothing. This fixes
https://github.com/google/leveldb/issues/215

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=172501335
2017-11-03 15:03:20 -07:00
costan
23162ca1c6 Fix typo (forgotten reference operator) in test.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=171708408
2017-10-10 11:47:13 -07:00
sanjay
7e12c00ecf Fix issue 474: a race between the f*_unlocked() STDIO calls in
env_posix.cc and concurrent application calls to fflush(NULL).

The fix is to avoid using stdio in env_posix.cc but add our own
buffering where we need it.

Added a test to reproduce the bug.

Added a test for Env reads/writes.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=170738066
2017-10-03 11:27:09 -07:00
costan
ea0a7586b8 Remove confusing and unnecessary if.
12 lines above, there is an "if (!s.ok()) { return s; }" block of code.
"s" is never modified between that block and the "if" removed by this
CL, so "s.ok()" must be true.

The code most likely intended to say "if (!builder->ok())", because the
builder->Add() call above can modify the TableBuilder's status, as a
side-effect. However, this approach would have required setting "s =
builder.status()" in the "else" branch, near the "builder.Abandon()"
call. So, removing the "if" outright is simpler than following that line
of thought.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=167326229
2017-09-01 14:41:28 -07:00
davidair
02f43c0fcd Remove dead code.
The dead code has been in the codebase since the initial commit and is
generating a compiler warning when used in Xcode.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=164174594
2017-08-24 15:00:52 -07:00
costan
8415f00eee leveldb: Report missing CURRENT manifest file as database corruption.
BTRFS reorders rename and write operations, so it is possible that a filesystem crash and recovery results in a situation where the file pointed to by CURRENT does not exist. DB::Open currently reports an I/O error in this case. Reporting database corruption is a better hint to the caller, which can attempt to recover the database or erase it and start over.

This issue is not merely theoretical. It was reported as having showed up in the wild at https://github.com/google/leveldb/issues/195 and at https://crbug.com/738961. Also, asides from the BTRFS case described above, incorrect data in CURRENT seems like a possible corruption case that should be handled gracefully.

The Env API changes here can be considered backwards compatible, because an implementation that returns Status::IOError instead of Status::NotFound will still get the same functionality as before.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=161432630
2017-07-10 14:14:00 -07:00
costan
69e2bd224b LevelDB: Add WriteBatch::ApproximateSize().
This can be used to report metrics on LevelDB usage.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=156934930
2017-07-10 14:13:30 -07:00
cmumford
7fa20948d5 Convert documentation to markdown.
Markdown is more readable in a text editor and when hosted
on GitHub is more readable than HTML.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=148830423
2017-03-01 09:42:25 -08:00
corrado
a2fb086d07 Add option for max file size. The currend hard-coded value of 2M is inefficient in colossus.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=134391640
2016-09-28 10:52:24 -07:00
Nicholas Westlake
ea992b467b Change std::uint64_t to uint64_t (#354)
-This fixes compile errors with default setup on RHEL 6 systems.
2016-04-12 15:38:09 -07:00
mjwiacek
e84b5bdb5a This CL fixes a bug encountered when reading records from leveldb files that have been split, as in a [] input task split.
Detailed description:

Suppose an input split is generated between two leveldb record blocks and the preceding block ends with null padding.

A reader that previously read at least 1 record within the first block (before encountering the padding) upon trying to read the next record, will successfully and correctly read the next logical record from the subsequent block, but will return a last record offset pointing to the padding in the first block.

When this happened in a [], it resulted in duplicate records being handled at what appeared to be different offsets that were separated by only a few bytes.

This behavior is only observed when at least 1 record was read from the first block before encountering the padding. If the initial offset for a reader was within the padding, the correct record offset would be reported, namely the offset within the second block.

The tests failed to catch this scenario/bug, because each read test only read a single record with an initial offset. This CL adds an explicit test case for this scenario, and modifies the test structure to read all remaining records in the test case after an initial offset is specified.  Thus an initial offset that jumps to record #3, with 5 total records in the test file, will result in reading 2 records, and validating the offset of each of them in order to pass successfully.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=115338487
2016-03-31 15:53:34 -07:00
Bruce Dawson
6b18316d01 Fix signed/unsigned mismatch on VC++ builds 2016-02-19 13:59:19 -08:00
cmumford
adbe3eb073 Putting build artifacts in subdirectory.
1. Object files, libraries, and compiled executables are put
   into subdirectories.
2. The shared library is linked from individual object files.
   This provides for greater parallelism on large desktops
   while at the same time making for easier builds on small
   (i.e. embedded) systems. Fixes issue #279.
3. One program, db_bench, is compiled using the shared library.
4. The source file for "leveldbutil" was renamed from
   leveldb_main.cc to leveldbutil.cc. This provides for simpler
   makefile rules.
5. Because all targets placed the library (libleveldb.a) at the top
   level, the last platform built (desktop/device) always overwrote
   any prior artifact.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=113407013
2016-01-29 16:10:00 -08:00
Chris Mumford
dac40d25f6 Merge pull request #284 from ideawu/master
log compaction output file's level along with number
2016-01-12 11:30:32 -08:00
ssid
706b7f8d43 Resolve race when getting approximate-memory-usage property
The write operations in the table happens without holding the mutex
lock, but concurrent writes are avoided using "writers_" queue.
The Arena::MemoryUsage could access the blocks when write happens.
So, the memory usage is cached in atomic word and can be loaded
from any thread safely.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=107573379
2015-12-09 11:27:50 -08:00
cmumford
3c9ff3c691 Only compiling TrimSpace on linux.
Incorporated change by zmodem at https://github.com/google/leveldb/pull/310
to fix issue #310.

This change will only build TrimSace on linux to avoid unused function
warning/error.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=105323419
2015-12-09 10:35:07 -08:00
ssid
528c2bc6ad Add "approximate-memory-usage" property to leveldb::DB::GetProperty
The approximate RAM usage of the database is calculated from the memory
allocated for write buffers and the block cache. This is to give an
estimate of memory usage to leveldb clients.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=104222307
2015-12-09 10:34:58 -08:00
Mike Wiacek
ce45404bba Suppress error reporting after seeking but before a valid First or Full record is encountered.
Fix a spelling mistake.
2015-12-09 10:34:57 -08:00
Chris Mumford
65190ac48b Will not reuse manifest if reuse_logs options is false.
Prior implementation would always try to reuse the manifest, even if reuse_logs
was false (the default). This was missed because the stock
Env::NewAppendableFile implementation returns false forcing the creation of a
new log.
2015-08-11 14:59:48 -07:00
Sanjay Ghemawat
ac1d69da31 LevelDB now attempts to reuse the preceding MANIFEST and log file when re-opened.
(Based on a suggestion by cmumford.)

"open" benchmark on my workstation speeds up significantly since we
can now avoid three fdatasync calls and a compaction per open:

  Before: ~80000 microseconds
  After:    ~130 microseconds

Details:

(1) Added Options::reuse_logs (currently defaults to false) to control
new behavior.  The intention is to change the default to true after some
baking.

(2) Added Env::NewAppendableFile() whose default implementation returns
a not-supported error.

(3) VersionSet::Recovery attempts to reuse the MANIFEST from which
it is recovering.

(4) DBImpl recovery attempts to reuse the last log file and memtable.

(5) db_test.cc now tests a new configuration that sets reuse_logs to true.

(6) fault_injection_test also tests a reuse_logs==true config.

(7) Added a new recovery_test.
2015-08-11 14:56:39 -07:00