Use clang-format to correct formatting to be in agreement with the [Google C++ Style Guide](https://google.github.io/styleguide/cppguide.html). Doing this simplifies the process of accepting changes. Also fixed a few warnings flagged by clang-tidy.
PiperOrigin-RevId: 246350737
There were a few members which were identified to have been left
uninitialized in some constructors. These were very likely to
have been set before being used, otherwise the ASan tests would
have caught them, but still good practice to have them
initialized. This addresses some items reported in issue #668.
PiperOrigin-RevId: 243370145
This change switches corruption_test, which previously used direct file
I/O to corrupt table files for open databases, to use InMemEnv. Using an
Env eliminates some platform dependencies thus simplifying the tests.
Also removed EnvWindowsTestHelper::RelaxFilePermissions(). This was
only added because the Windows Env opens files for exclusive access.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=239305329
This CL moves default values for
leveldb::{Options,ReadOptions,WriteOptions} from constructors to member
declarations, and removes now-redundant comments stating the defaults.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=239271242
This CL removes AtomicPointer from leveldb's port interface. Its usage is replaced with std::atomic<> from the C++11 standard library.
AtomicPointer was used to wrap flags, numbers, and pointers, so its instances are replaced with std::atomic<bool>, std::atomic<int>, std::atomic<size_t> and std::atomic<Node*>.
This CL does not revise the memory ordering. AtomicPointer's methods are replaced mechanically with their std::atomic equivalents, even when the underlying usage is incorrect. (Example: DBImpl::has_imm_ is written using release stores, even though it is always read using relaxed ordering.) Revising the memory ordering is left for future CLs.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=237865146
Fixes GitHub issue #657.
This CL also makes the Windows CI green.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=237255887
This change adds a native Windows port (port_windows.h) and a
Windows Env (WindowsEnv).
Note1: "small" is defined when including <Windows.h> so some
parameters were renamed to avoid conflict.
Note2: leveldb::Env defines the method: "DeleteFile" which is
also a constant defined when including <Windows.h>. The solution
was to ensure this macro is defined in env.h which forces
the function, when compiled, to be either DeleteFileA or
DeleteFileW when building for MBCS or UNICODE respectively.
This resolves#519 on GitHub.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=236364778
This prevents file descriptors from leaking to child processes.
When compiled for older (pre-2.6.23) kernels which lack support for
O_CLOEXEC there is no change in behavior. With newer kernels, child
processes will no longer inherit leveldb's file handles, which
reduces the changes of accidentally corrupting the database.
Fixes https://github.com/google/leveldb/issues/623
Apple doesn't follow POSIX specifications for fsync(). Instead, fsync() guarantees to flush the buffer cache to the device, which means the data will survive kernel panics, but may not survive power outages. Applications that need stronger guarantees (like databases) need to use fcntl(F_FULLFSYNC).
This CL switches PosixWritableFile::Sync() to get the stronger guarantees on Apple systems. The improved implementation follows the same principles as SQLite [1] and node.js [2].
Research for the fcntl() to fsync() fallback strategy:
Apple's released source code at https://opensource.apple.com/ shows at least three different error codes being returned when a filesystem does not support F_FULLFSYNC.
fcntl() is implemented in xnu-4903.221.2 in bsd/kern/kern_descrip.c, where it delegates to fcntl_nocancel(). The documentation for fcntl_nocancel() mentions error codes for some operations, but does not include F_FULLFSYNC. The F_FULLSYNC branch in fcntl_nocancel() calls VNOP_IOCTL(_, F_FULLSYNC, NULL, 0, _), whose return value sets the error
code.
VNOP_IOCTL() is implemented in bsd/vfs/kpi_vfs.c and calls the ioctl function in the vnode's operation vector. The per-filesystem function names follow the pattern _vnop_ioctl() for all the instances in opensource code: {hfs,msdosfs,nfs,ntfs,smbfs,webdav,zfs}_vnop_ioctl().
hfs-407.30.1, msdosfs-229.200.3, and nfs in xnu-4903.221.2 handle F_FULLFSYNC. ntfs-94.200.1 and smb-759.40.1 do not handle F_FULLFSYNC, and the default branch returns ENOSUP. webdav-380.200.1 also does not handle F_FULLFSYNC, but the default branch returns EINVAL. zfs-59 also does not handle F_FULLSYNC, and its default branch returns ENOTTY.
From a different angle, Apple's ntfs-94.200.1 includes utility code that uses fcntl(F_FULLFSYNC) and falls back to fsync() just like we do, supporting the hypothesis that there is no good way to detect lack of F_FULLFSYNC support. Also, Apple's fcntl() man page [3] does not mention a way to detect lack of F_FULLFSYNC support.
[1] https://www.sqlite.org/src/doc/trunk/src/os_unix.c
[2] https://github.com/libuv/libuv/blob/master/src/unix/fs.c
[3] https://developer.apple.com/library/archive/documentatiVon/System/Conceptual/ManPages_iPhoneOS/man2/fcntl.2.html
Tested:
https://travis-ci.org/pwnall/leveldb/builds/477318498
TAP global presubmit
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=228593729
The space in between the header and log message was mistakenly omitted
in a prior commit. Re-adding.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=228202737
General cleanup principles:
* Use override when applicable.
* Remove static when redundant (methods and globals in anonymous
namespaces).
* Use const on class members where possible.
* Standardize on "status" for Status local variables.
* Renames where clarity can be improved.
* Qualify standard library names with std:: when possible, to
distinguish from POSIX names.
* Qualify POSIX names with the global namespace (::) when possible, to
distinguish from standard library names.
This also refactors the background thread synchronization logic so that
it's statically analyzable.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=219212089
C++11 guarantees thread-safe initialization of static variables inside
functions. This is a more restricted form of std::call_once or
pthread_once_t (e.g., single call site), so the compiler might be able
to generate better code [1]. Equally important, having less
platform-dependent code in env_posix.cc makes it easier to port to other
platforms.
Due to the change above, this CL introduced a new approach for storing
the singleton PosixEnv instance returned by Env::Default(). The new
approach avoids a dynamic memory allocation, which eliminates the false
positive from LeakSanitizer reported in
https://github.com/google/leveldb/issues/539 and
https://github.com/google/leveldb/issues/113
[1] https://stackoverflow.com/a/27206650/
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=214293129
This commit replaces the use of pthreads in the POSIX port with std::thread
and port::Mutex + port::CondVar. This is intended to simplify porting
the env to a different platform.
The indirect use of pthreads in PosixLogger is replaced with
std:🧵:id(), based on an approach prototyped by @cmumfordx@.
The pthreads dependency in CMakeFiles is not removed, because some C++
standard library implementations must be linked against pthreads for
std::thread use. Figuring out this dependency is left for future work.
Switching away from pthreads also fixes
https://github.com/google/leveldb/issues/381
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=212478311
This is not an API-breaking change, because it reduces the API that the
leveldb embedder must implement. The project will build just fine
against ports that still implement InitOnce.
C++11 guarantees thread-safe initialization of static variables inside
functions. This is a more restricted form of std::call_once or
pthread_once_t (e.g., single call site), so the compiler might be able
to generate better code [1]. Equally important, having less code in
port_example.h makes it easier to port to other platforms.
Due to the change above, this CL introduces a new approach for storing
the singleton BytewiseComparatorImpl instance returned by
BytewiseComparator(). The new approach avoids a dynamic memory
allocation, which eliminates the false positive from LeakSanitizer
reported in https://github.com/google/leveldb/issues/200
[1] https://stackoverflow.com/a/27206650/
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=212348004
This is separated from the general cleanup because of the logic changes
in SyncDirIfManifest().
General cleanup principles:
* Use override when applicable.
* Remove static when redundant (methods and globals in anonymous
namespaces).
* Use const on class members where possible.
* Standardize on "status" for Status local variables.
* Renames where clarity can be improved.
* Qualify standard library names with std:: when possible, to
distinguish from POSIX names.
* Qualify POSIX names with the global namespace (::) when possible, to
distinguish from standard library names.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=211709673
General cleanup principles:
* Use override when applicable.
* Use const on class members where possible.
* Renames where clarity can be improved.
* Qualify standard library names with std:: when possible, to
distinguish from POSIX names.
* Qualify POSIX names with the global namespace (::) when possible, to
distinguish from standard library names.
This also revamps the logic for putting together a message into the
in-memory buffer before that is passed to fwrite(). While correct in
practice, the current implementation advances a char pointer past the
size of its buffer, which is technically undefined behavior.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=211472570
Now that we require C++11, we can use std::atomic<int>, which has
primitives for most of the logic we need. As a bonus, the happy path for
Limiter::Acquire() and Limiter::Release() only performs one atomic
operation.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=211469518
The porting layer implements threading primitives: atomic pointers,
condition variables, mutexes, thread-safe initialization. These are all
specified in C++11, so the reference open source port implementation can
become platform-independent.
The porting layer will remain in place to allow the use of other
implementations with more features, such as the built-in deadlock
detection in abseil's Mutex.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=193245934
Commit a0008deb67 introduced
std::numeric_limits usage in logging.cc, but didn't #include <limits>
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=192840190
The old implementation caused odd crashes on ARM, which were fixed by
changing a local variable type. The main suspect is the use of a static
local variable. This CL replaces the static local variable with
constexpr, which still ensures the compiler sees the expressions as
constants.
The CL also replaces Slice operations in the functions' inner loop with
iterator-style pointer operations, which can help the compiler generate
less code.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=192832175
ConsumeDecimalNumber has fairly non-trivial logic, and a previous
version has crashed inexplicably on Android. Having some test coverage
will make it easier to tweak / simplify the function later on.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=192821751
After this CL, all classes with Mutex members should be covered by annotations. Exceptions are atomic members, which shouldn't need locking, and DBImpl members that cause errors when annotated, which will be tackled separately.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=190260865
This CL makes it easier to reason about thread safety by:
1) Adding Clang thread safety annotations according to comments.
2) Expanding a couple of variable names, without adding extra lines of code.
3) Adding const in a couple of places.
4) Replacing an always-non-null const pointer with a reference.
5) Fixing style warnings in the modified files.
This CL does not annotate the DBImpl members that claim to be protected
by the instance mutex, but are accessed without the mutex being held.
Those members (and their unprotected accesses) will be addressed in
future CLs.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=189354657
This CL removes unused headers included by util/testharness.h, adds
precise includes where the build breaks, and fixes style errors in the
edited files.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=189331061
External linkage is the default for function declarations in C++.
This also fixes ClangTidy errors generated by removing the "extern"
keyword as described above.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=188730416
CL 170738066 introduced std::min and std::max to env_test.cc. These
require the <algorithm> header.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=171024062
Deleting a PosixWritableFile without calling Close() leaks the file
descriptor. While the API description in include/leveldb/env.h does not
specify whether the caller is responsible for Close()ing the file before
deleting it, all other Env file implementations do release underlying
resources when destroyed, even if Close() is not called.
The leak shows up when running db_tests on Mac Travis, or on a vanilla
MacOS install.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=170906843
This also removes std::unique_ptr introduced in CL 170738066, because
it's C++11-only, and the open source version still supports older
versions at the moment.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=170876919
If the file already existed, we should have truncated it. This was not
detected by leveldb tests since leveldb code avoids reusing same files,
but there was code elsewhere that was directly using leveldb files and
relying on this behavior.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=170769101
If leveldb::Options::block_cache is set to a cache of zero capacity
then it is possible for LRUHandle::next to be used without having been
set.
Conditional jump or move depends on uninitialised value(s):
leveldb::(anonymous namespace)::LRUHandle::key() const (cache.cc:58)
leveldb::(anonymous namespace)::LRUCache::Unref(leveldb::(anonymous namespace)::LRUHandle*) (cache.cc:234)
leveldb::(anonymous namespace)::LRUCache::Release(leveldb::Cache::Handle*) (cache.cc:266)
leveldb::(anonymous namespace)::ShardedLRUCache::Release(leveldb::Cache::Handle*) (cache.cc:375)
leveldb::CacheTest::Insert(int, int, int) (cache_test.cc:59)
This bug forced a commit reversion in Chromium. For more information see
https://bugs.chromium.org/p/chromium/issues/detail?id=761398#c4
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=170749054
env_posix.cc and concurrent application calls to fflush(NULL).
The fix is to avoid using stdio in env_posix.cc but add our own
buffering where we need it.
Added a test to reproduce the bug.
Added a test for Env reads/writes.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=170738066
An Android test was occasionally crashing with a SEGV in ConsumeDecimalNumber
Switching a local variable from an int to uint64_t eliminated these crashes.
Speculating this is either a compiler, runtime library, or emulator issue.
Switching this type to uint64_t also eliminates a compiler warning
about comparing an int with a uint64_t.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=166399695
BTRFS reorders rename and write operations, so it is possible that a filesystem crash and recovery results in a situation where the file pointed to by CURRENT does not exist. DB::Open currently reports an I/O error in this case. Reporting database corruption is a better hint to the caller, which can attempt to recover the database or erase it and start over.
This issue is not merely theoretical. It was reported as having showed up in the wild at https://github.com/google/leveldb/issues/195 and at https://crbug.com/738961. Also, asides from the BTRFS case described above, incorrect data in CURRENT seems like a possible corruption case that should be handled gracefully.
The Env API changes here can be considered backwards compatible, because an implementation that returns Status::IOError instead of Status::NotFound will still get the same functionality as before.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=161432630
env_test.cc defines EnvPosixTest which tests the Env implementation returned by Env::Default(). The naming is a bit unfortunate, as the tests in env_test.cc are written against the Env contract, and therefore are applicable to any Env implementation. An instance of the confusion caused by the naming is [] which added a dependency from env_test.cc to EnvPosixTestHelper, which is closely coupled to EnvPosix.
This change disentangles EnvPosix-specific test code into a env_posix_test.cc file. The code there uses EnvPosixTestHelper and specifically targets the EnvPosix implementation. env_test.cc now implements EnvTest, and contains tests that are also applicable to other ports, which may define their own Env implementation.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=148914642
This change authored by vadimskipin and submitted via:
https://github.com/google/leveldb/pull/309
Changes made to support iOS builds and other architectures
without support for SSE 4.2.
db_bench reports original crc32 speed at:
crc32c : 3.610 micros/op; 1082.0 MB/s (4K per op)
with this change performance has increased to:
crc32c : 0.843 micros/op; 4633.6 MB/s (4K per op)
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=148694935
Background compaction can create an unbounded number of
leveldb::RandomAccessFile instances. On 64-bit systems mmap is used and
file descriptors are only used beyond a certain number of mmap's.
32-bit systems to not use mmap at all. leveldb::RandomAccessFile does not
observe Options.max_open_files so compaction could exhaust the file
descriptor limit.
This change uses getrlimit to determine the maximum number of open
files and limits RandomAccessFile to approximately 20% of that value.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=143505556
Background:
LevelDB uses a cache (util/cache.h, util/cache.cc) of (key,value)
pairs for two purposes:
- a cache of (table, file handle) pairs
- a cache of blocks
The cache places the (key,value) pairs in a reference-counted
wrapper. When it returns a value, it returns a reference to this
wrapper. When the client has finished using the reference and
its enclosed (key,value), it calls Release() to decrement the
reference count.
Each (key,value) pair has an associated resource usage. The
cache maintains the sum of the usages of the elements it holds,
and removes values as needed to keep the sum below a capacity
threshold. It maintains an LRU list so that it will remove the
least-recently used elements first.
The max_open_files option to LevelDB sets the size of the cache
of (table, file handle) pairs. The option is not used in any
other way.
The observed behaviour:
If LevelDB at any time used more file handles concurrently than
the cache size set via max_open_files, it attempted to reduce the
number by evicting entries from the table cache. This could
happen most easily during compaction, and if max_open_files was
low. Because the handles were in use, their reference count did
not drop to zero, and so the usage sum in the cache was not
modified by the evictions. Subsequent Insert() calls returned
valid handles, but their entries were immediately evicted from
the cache, which though empty still acted as though full. As a
result, there was effectively no caching, and the number of open
file handles rose []ly until it hit system-imposed limits and
the process died.
If one set max_open_files lower, the cache was more likely to
exhibit this beahviour, and cause the process to run out of file
descriptors. That is, max_open_files acted in almost exactly the
opposite manner from what was intended.
The problems:
1. The cache kept all elements on its LRU list eligible for capacity
eviction---even those with outstanding references from clients. This was
ineffective in reducing resource consumption because there was an
outstanding reference, guaranteeing that the items remained. A secondary
issue was that there is no guarantee that these in-use items will be the
last things reached in the LRU chain, which actually recorded
"least-recently requested" rather than "least-recently used".
2. The sum of usages was decremented not when a (key,value) was evicted from
the cache, but when its reference count went to zero. Thus, when things
were removed from the cache, either by garbage collection or via Erase(),
the usage sum was not necessarily decreased. This allowed the cache to act
as though full when it was in fact not, reducing caching effectiveness, and
leading to more resources being consumed---the opposite of what the
evictions were intended to achieve.
3. (minor) The cache's clients insert items into it by first looking up the
key, and inserting only if no value is found. Although the cache has an
internal lock, the clients use no locking to ensure atomicity of the
Lookup/Insert pair. (see table/table.cc: block_cache->Insert() and
db/table_cache.cc: cache_->Insert()). Thus, if two threads Insert() at
about the same time, they can both Lookup(), find nothing, and both
Insert(). The second Insert() would evict the first value, leaving each
thread with a handle on its own version of the data, and with the second
version in the cache. It would be better if both threads ended up with a
handle on the same (key,value) pair, which implies it must be the first item
inserted. This suggests that Insert() should not replace an existing value.
This can be made safe with current usage inside LeveDB itself, but this is
not easy to change first because Cache is a public interface, so to change
the semantics of an existing call might break things, second because Cache
is an abstract virtual class, so adding a new abstract virtual method may
break other implementations, and third, the new method "insert without
replacing" cannot be implemented in terms of the existing methods, so cannot
be implemented with a non-abstract default. But fortunately, the effects
of this issue are minor, so this issue is not fixed by this change.
The changes:
The assumption in the fixes is that it is always better to cache
entries unless removal from the cache would lead to deallocation.
Cache entries now have an "in_cache" boolean indicating whether
the cache has a reference on the entry. The only ways that this can
become false without the entry being passed to its "deleter" are via
Erase(), via Insert() when an element with a duplicate key is inserted,
or on destruction of the cache.
The cache now keeps two linked lists instead of one. All items
in the cache are in one list or the other, and never both. Items
still referenced by clients but erased from the cache are in
neither list. The lists are:
- in-use: contains the items currently referenced by clients, in no particular
order. (This list is used for invariant checking. If we removed the check,
elements that would otherwise be on this list could be left as disconnected
singleton lists.)
- LRU: contains the items not currently referenced by clients, in LRU order
A new internal Ref() method increments the reference count. If
incrementing from 1 to 2 for an item in the cache, it is moved
from the LRU list to the in-use list.
The Unref() call now moves things from the in-use list to the LRU
list if the reference count falls to 1, and the item is in the
cache. It no longer adjusts the usage sum. The usage sum now
reflects only what is in the cache, rather than including
still-referenced items that have been evicted.
The LRU_Append() now takes a "list" parameter so that it can be
used to append either to the LRU list or the in-use list.
Lookup() is modified to use the new Ref() call, rather than
adjusting the reference count and LRU chain directly.
Insert() eviction code is also modified to adjust the usage sum and the
in_cache boolean of the evicted elements. Some LevelDB tests assume that there
will be no caching whatsoever if the cache size is set to zero, so this is
handled as a special case.
A new private method FinishErase() is factored out
with the common code from where items are removed from the cache.
Erase() is modified to adjust the usage sum and the in_cache
boolean of the erased elements, and to use FinishErase().
Prune() is modified to use FinishErase() also, and to make use of the fact that
the lru_ list now contains only items with reference count 1.
- EvictionPolicy is modified to test that an entry with an
outstanding handle is not evicted. This test fails with the old cache.cc.
- A new test case UseExceedsCacheSize verifies that even when the
cache is overfull of entries with outstanding handles, none are
evicted. This test fails with the old cache.cc, and is the key
issue that causes file descriptors to run out when the cache
size is set too small.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=123247237
The write operations in the table happens without holding the mutex
lock, but concurrent writes are avoided using "writers_" queue.
The Arena::MemoryUsage could access the blocks when write happens.
So, the memory usage is cached in atomic word and can be loaded
from any thread safely.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=107573379
The approximate RAM usage of the database is calculated from the memory
allocated for write buffers and the block cache. This is to give an
estimate of memory usage to leveldb clients.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=104222307
Prune() drops on-memory read cache of the database, so that the client can
relief its memory shortage.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=101335710
The create function took |num_keys| as an int, but callers and implementers wanted it to function as a size_t (e.g. passing std::vector::size() in, passing it to vector constructors as a size arg, indexing containers by it, etc.). This resulted in implicit conversions between the two types as well as warnings (found with Chromium's external copy of these sources, built with MSVC) about signed vs. unsigned comparisons.
The leveldb sources were already widely using size_t elsewhere, e.g. for key and filter lengths, so using size_t here is not inconsistent with the existing code. However, it does change the public C API.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=101074871
(Based on a suggestion by cmumford.)
"open" benchmark on my workstation speeds up significantly since we
can now avoid three fdatasync calls and a compaction per open:
Before: ~80000 microseconds
After: ~130 microseconds
Details:
(1) Added Options::reuse_logs (currently defaults to false) to control
new behavior. The intention is to change the default to true after some
baking.
(2) Added Env::NewAppendableFile() whose default implementation returns
a not-supported error.
(3) VersionSet::Recovery attempts to reuse the MANIFEST from which
it is recovering.
(4) DBImpl recovery attempts to reuse the last log file and memtable.
(5) db_test.cc now tests a new configuration that sets reuse_logs to true.
(6) fault_injection_test also tests a reuse_logs==true config.
(7) Added a new recovery_test.
Changes are:
* Update version number to 1.18
* Replace the basic fprintf call with a call to fwrite in order to
work around the apparent compiler optimization/rewrite failure that we are
seeing with the new toolchain/iOS SDKs provided with Xcode6 and iOS8.
* Fix ALL the header guards.
* Createed a README.md with the LevelDB project description.
* A new CONTRIBUTING file.
* Don't implicitly convert uint64_t to size_t or int. Either preserve it as
uint64_t, or explicitly cast. This fixes MSVC warnings about possible value
truncation when compiling this code in Chromium.
* Added a DumpFile() library function that encapsulates the guts of the
"leveldbutil dump" command. This will allow clients to dump
data to their log files instead of stdout. It will also allow clients to
supply their own environment.
* leveldb: Remove unused function 'ConsumeChar'.
* leveldbutil: Remove unused member variables from WriteBatchItemPrinter.
* OpenBSD, NetBSD and DragonflyBSD have _LITTLE_ENDIAN, so define
PLATFORM_IS_LITTLE_ENDIAN like on FreeBSD. This fixes:
* issue #143
* issue #198
* issue #249
* Switch from <cstdatomic> to <atomic>. The former never made it into the
standard and doesn't exist in modern gcc versions at all. The later contains
everything that leveldb was using from the former.
This problem was noticed when porting to Portable Native Client where no memory
barrier is defined. The fact that <cstdatomic> is missing normally goes
unnoticed since memory barriers are defined for most architectures.
* Make Hash() treat its input as unsigned. Before this change LevelDB files
from platforms with different signedness of char were not compatible. This
change fixes: issue #243
* Verify checksums of index/meta/filter blocks when paranoid_checks set.
* Invoke all tools for iOS with xcrun. (This was causing problems with the new
XCode 5.1.1 image on pulse.)
* include <sys/stat.h> only once, and fix the following linter warning:
"Found C system header after C++ system header"
* When encountering a corrupted table file, return Status::Corruption instead of
Status::InvalidArgument.
* Support cygwin as build platform, patch is from https://code.google.com/p/leveldb/issues/detail?id=188
* Fix typo, merge patch from https://code.google.com/p/leveldb/issues/detail?id=159
* Fix typos and comments, and address the following two issues:
* issue #166
* issue #241
* Add missing db synchronize after "fillseq" in the benchmark.
* Removed unused variable in SeekRandom: value (issue #201)
- switched from mmap based writing to simpler stdio based writing. Has a
minor impact (0.5 microseconds) on microbenchmarks for asynchronous
writes. Synchronous writes speed up from 30ms to 10ms on linux/ext4.
Should be much more reliable on diverse platforms.
- compaction errors now immediately put the database into a read-only
mode (until it is re-opened). As a downside, a disk going out of
space and then space being created will require a re-open to recover
from, whereas previously that would happen automatically. On the
plus side, many corruption possibilities go away.
- force the DB to enter an error-state so that all future writes fail
when a synchronous log write succeeds but the sync fails.
- repair now regenerates sstables that exhibit problems
- fix issue 218 - Use native memory barriers on OSX
- fix issue 212 - QNX build is broken
- fix build on iOS with xcode 5
- make tests compile and pass on windows
Fix issues 200, 201
Also,
* Fix link to bigtable paper in docs.
* New sstables will have the file extension .ldb. .sst files will
continue to be recognized.
* When building for iOS, use xcrun to execute the compiler. This may
affect issue 177.
Fix issues 77, 87, 182, 190.
Additionally, fix the bug described in
https://groups.google.com/d/msg/leveldb/yL6h1mAOc20/vLU64RylIdMJ
where a large contiguous keyspace of deleted data was not getting
compacted.
Also fix a bug where options.max_open_files was not getting clamped
properly.
Fixes issues
147 - thanks feniksgordonfreeman
153
156
166
Additionally,
* Remove calls to exit(1).
* Fix unused-variable warnings from clang.
* Fix possible overflow error related to num_restart value >= (2^32/4).
* Add leveldbutil to .gitignore.
* Add better log messages when Write is stalled on a compaction.
Highlights
----------
Mmap at most 1000 files on Posix to improve performance for large databases.
Support for more architectures (thanks to Alexander K.)
Building and porting
--------------------
HP/UX support (issue 126)
AtomicPointer for ia64 (issue 123)
Sparc v9 support (issue 124)
Atomic ops for powerpc
Use -fno-builtin-memcmp only when using g++
Simplify IOS build rules (issue 114)
Use CXXFLAGS instead of CFLAGS when invoking C++ compiler (issue 118)
Fix snappy shared library problem (issue 94)
Fix shared library installation path regression
Endian-ness detection tweak for FreeBSD
Bug fixes
---------
Stop ignoring FLAGS_open_files in db_bench
Make bloom test behavior agnostic to endian-ness
Performance
-----------
Limit number of mmapped files to 1000 to improve perf for large dbs
Do not delay for 1 second on shutdown path (issue 125)
Misc
----
Make InMemoryEnv return a no-op logger
C binding now has a wrapper for free (issue 117)
Add thread-safety annotations
Added an in-process lock table (issue 120)
Make RandomAccessFile and SequentialFile non-copyable
various platforms; improve android port speed.
Avoid static initializer by using a new portability interface for
thread-safe lazy initialization. Custom ports will need to be
extended to implement InitOnce/OnceType/LEVELDB_ONCE_INIT.
Fix endian-ness detection (fixes Powerpc builds).
Build related fixes:
- Support platforms that have unversioned shared libraries.
- Fix IOS build rules.
Android improvements
- Speed up atomic pointers
- Share more code with port_posix.
Do not spin in a tight loop attempting compactions if the file system
is inaccessible (e.g., if kerberos tickets have expired or if it is out
of space).
In particular, we add a new FilterPolicy class. An instance
of this class can be supplied in Options when opening a
database. If supplied, the instance is used to generate
summaries of keys (e.g., a bloom filter) which are placed in
sstables. These summaries are consulted by DB::Get() so we
can avoid reading sstable blocks that are guaranteed to not
contain the key we are looking for.
This change provides one implementation of FilterPolicy
based on bloom filters.
Other changes:
- Updated version number to 1.4.
- Some build tweaks.
- C binding for CompactRange.
- A few more benchmarks: deleteseq, deleterandom, readmissing, seekrandom.
- Minor .gitignore update.
- Pass system's values of CFLAGS,LDFLAGS.
Don't override OPT if it's already set.
Original patch by Alessio Treglia <alessio@debian.org>:
http://code.google.com/p/leveldb/issues/detail?id=27#c6
- Remove 1 exit time destructor from leveldb.
See http://crbug.com/101600
- Fix problem where sstable building code would pass an
internal key to the user comparator.
(Sync with uptream at 25436817.)
- Replace raw slice comparison with a call to user comparator.
Added test for custom comparators.
- Fix end of namespace comments.
- Fixed bug in picking inputs for a level-0 compaction.
When finding overlapping files, the covered range may expand
as files are added to the input set. We now correctly expand
the range when this happens instead of continuing to use the
old range. For example, suppose L0 contains files with the
following ranges:
F1: a .. d
F2: c .. g
F3: f .. j
and the initial compaction target is F3. We used to search
for range f..j which yielded {F2,F3}. However we now expand
the range as soon as another file is added. In this case,
when F2 is added, we expand the range to c..j and restart the
search. That picks up file F1 as well.
This change fixes a bug related to deleted keys showing up
incorrectly after a compaction as described in Issue 44.
(Sync with upstream @25072954)
- Added DB::CompactRange() method.
Changed manual compaction code so it breaks up compactions of
big ranges into smaller compactions.
Changed the code that pushes the output of memtable compactions
to higher levels to obey the grandparent constraint: i.e., we
must never have a single file in level L that overlaps too
much data in level L+1 (to avoid very expensive L-1 compactions).
Added code to pretty-print internal keys.
- Fixed bug where we would not detect overlap with files in
level-0 because we were incorrectly using binary search
on an array of files with overlapping ranges.
Added "leveldb.sstables" property that can be used to dump
all of the sstables and ranges that make up the db state.
- Removing post_write_snapshot support. Email to leveldb mailing
list brought up no users, just confusion from one person about
what it meant.
- Fixing static_cast char to unsigned on BIG_ENDIAN platforms.
Fixes Issue 35 and Issue 36.
- Comment clarification to address leveldb Issue 37.
- Change license in posix_logger.h to match other files.
- A build problem where uint32 was used instead of uint32_t.
Sync with upstream @24408625
Fix GCC -Wshadow warnings in LevelDB's public header files,
reported by Dustin.
Add in-memory Env implementation (helpers/memenv/*).
This enables users to create LevelDB databases in-memory.
Initialize ShardedLRUCache::last_id_ to zero.
This fixes a Valgrind warning.
(Also delete port/sha1_* which were removed upstream some time ago.)
- Fix for issue 33 (non-null-terminated result from
leveldb_property_value())
- Support for running multiple instances of a benchmark in parallel.
- Reduce lock contention on Get():
(1) Do not hold the lock while searching memtables.
(2) Shard block and table caches 16-ways.
Benchmark for evaluating this change:
$ db_bench --benchmarks=fillseq1,readrandom --threads=$n
(fillseq1 is a small hack to make sure fillseq runs once regardless
of number of threads specified on the command line).
git-svn-id: https://leveldb.googlecode.com/svn/trunk@49 62dab493-f737-651d-591e-8d6aee1b9529
- Fix bug in Iterator::Prev where it would return the wrong key.
Fixes issues 29 and 30.
- Added a tweak to testharness to allow running just some tests.
- Fixing two minor documentation errors based on issues 28 and 25.
- Cleanup; fix namespaces of export-to-C code.
Also fix one "const char*" vs "char*" mismatch.
git-svn-id: https://leveldb.googlecode.com/svn/trunk@48 62dab493-f737-651d-591e-8d6aee1b9529
- LevelDB patch for FreeBSD. This resolves Issue 22.
Contributed by dforsythe (thanks!).
- Removing Chromium-specific files.
They are now going to live in the Chromium repository.
- Adding a benchmark page comparing LevelDB performance
to SQLite and Kyoto Cabinet's TreeDB, along with
code to generate the benchmarks.
Thanks to Kevin Tseng for compiling the benchmarks,
and Scott Hess and Mikio Hirabayashi for their
help and advice.
git-svn-id: https://leveldb.googlecode.com/svn/trunk@40 62dab493-f737-651d-591e-8d6aee1b9529
- Removed one copy of an uncompressed block contents changing
the signature of Snappy_Uncompress() so it uncompresses into a
flat array instead of a std::string.
Speeds up readrandom ~10%.
- Instead of a combination of Env/WritableFile, we now have a
Logger interface that can be easily overridden applications
that want to supply their own logging.
- Separated out the gcc and Sun Studio parts of atomic_pointer.h
so we can use 'asm', 'volatile' keywords for Sun Studio.
git-svn-id: https://leveldb.googlecode.com/svn/trunk@39 62dab493-f737-651d-591e-8d6aee1b9529
- LevelDB patch for Sun Studio
Based on a patch submitted by Theo Schlossnagle - thanks!
This fixes Issue 17.
- Fix a couple of test related memory leaks.
git-svn-id: https://leveldb.googlecode.com/svn/trunk@38 62dab493-f737-651d-591e-8d6aee1b9529
Slight tweak to the no-overlap optimization: only push to
level 2 to reduce the amount of wasted space when the same
small key range is being repeatedly overwritten.
Fix for Issue 18: Avoid failure on Windows by avoiding
deletion of lock file until the end of DestroyDB().
Fix for Issue 19: Disregard sequence numbers when checking for
overlap in sstable ranges. This fixes issue 19: when writing
the same key over and over again, we would generate a sequence
of sstables that were never merged together since their sequence
numbers were disjoint.
Don't ignore map/unmap error checks.
Miscellaneous fixes for small problems Sanjay found while diagnosing
issue/9 and issue/16 (corruption_testr failures).
- log::Reader reports the record type when it finds an unexpected type.
- log::Reader no longer reports an error when it encounters an expected
zero record regardless of the setting of the "checksum" flag.
- Added a missing forward declaration.
- Documented a side-effects of larger write buffer sizes
(longer recovery time).
git-svn-id: https://leveldb.googlecode.com/svn/trunk@37 62dab493-f737-651d-591e-8d6aee1b9529
This revision adds two major changes:
1. build_detect_platform which generates build_config.mk
with platform-dependent flags for the build process
2. /port/atomic_pointer.h with anAtomicPointerimplementation
for platforms without <cstdatomic>
Some of this code is loosely based on patches submitted to the
LevelDB mailing list at https://groups.google.com/forum/#!forum/leveldb
Tip of the hat to Dave Smith and Edouard A, who both sent patches.
The presence of Snappy (http://code.google.com/p/snappy/) and
cstdatomic are now both detected in the build_detect_platform
script (1.) which gets executing during make.
For (2.), instead of broadly importing atomicops_* from Chromium or
the Google performance tools, we chose to just implement AtomicPointer
and the limited atomic load and store operations it needs.
This resulted in much less code and fewer files - everything is
contained in atomic_pointer.h.
git-svn-id: https://leveldb.googlecode.com/svn/trunk@34 62dab493-f737-651d-591e-8d6aee1b9529
* Patch LevelDB to build for OSX and iOS
* Fix race condition in memtable iterator deletion.
* Other small fixes.
git-svn-id: https://leveldb.googlecode.com/svn/trunk@29 62dab493-f737-651d-591e-8d6aee1b9529
* env_chromium.cc should not export symbols.
* Fix MSVC warnings.
* Removed large value support.
* Fix broken reference to documentation file
git-svn-id: https://leveldb.googlecode.com/svn/trunk@24 62dab493-f737-651d-591e-8d6aee1b9529