Commit Graph

109 Commits

Author SHA1 Message Date
sanjay
2372ac574f Fix file writing bug in CL 170738066.
If the file already existed, we should have truncated it. This was not
detected by leveldb tests since leveldb code avoids reusing same files,
but there was code elsewhere that was directly using leveldb files and
relying on this behavior.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=170769101
2017-10-03 11:32:02 -07:00
cmumford
1c75e88055 Fix use of uninitialized value in LRUHandle.
If leveldb::Options::block_cache is set to a cache of zero capacity
then it is possible for LRUHandle::next to be used without having been
set.

Conditional jump or move depends on uninitialised value(s):
  leveldb::(anonymous namespace)::LRUHandle::key() const (cache.cc:58)
  leveldb::(anonymous namespace)::LRUCache::Unref(leveldb::(anonymous namespace)::LRUHandle*) (cache.cc:234)
  leveldb::(anonymous namespace)::LRUCache::Release(leveldb::Cache::Handle*) (cache.cc:266)
  leveldb::(anonymous namespace)::ShardedLRUCache::Release(leveldb::Cache::Handle*) (cache.cc:375)
  leveldb::CacheTest::Insert(int, int, int) (cache_test.cc:59)

This bug forced a commit reversion in Chromium. For more information see
https://bugs.chromium.org/p/chromium/issues/detail?id=761398#c4

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=170749054
2017-10-03 11:30:48 -07:00
sanjay
7e12c00ecf Fix issue 474: a race between the f*_unlocked() STDIO calls in
env_posix.cc and concurrent application calls to fflush(NULL).

The fix is to avoid using stdio in env_posix.cc but add our own
buffering where we need it.

Added a test to reproduce the bug.

Added a test for Env reads/writes.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=170738066
2017-10-03 11:27:09 -07:00
costan
bcd9a8ea4a Use portable CRC32C from google/crc32c.
Benchmark results below. More results at
354d61ef97.

New, MacBookPro13,3 with Core i7 6920HQ:
LevelDB:    version 1.20
Keys:       16 bytes each
Values:     100 bytes each (50 bytes after compression)
Entries:    1000000
RawSize:    110.6 MB (estimated)
FileSize:   62.9 MB (estimated)
WARNING: Snappy compression is not enabled
------------------------------------------------
fillseq      :       2.952 micros/op;   37.5 MB/s
fillsync     :      43.932 micros/op;    2.5 MB/s (1000 ops)
fillrandom   :       3.856 micros/op;   28.7 MB/s
overwrite    :       4.053 micros/op;   27.3 MB/s
readrandom   :       4.234 micros/op; (1000000 of 1000000 found)
readrandom   :       3.923 micros/op; (1000000 of 1000000 found)
readseq      :       0.201 micros/op;  550.8 MB/s
readreverse  :       0.356 micros/op;  310.6 MB/s
compact      :  436800.000 micros/op;
readrandom   :       2.375 micros/op; (1000000 of 1000000 found)
readseq      :       0.151 micros/op;  734.3 MB/s
readreverse  :       0.298 micros/op;  370.7 MB/s
fill100K     :     554.075 micros/op;  172.1 MB/s (1000 ops)
crc32c       :       1.393 micros/op; 2805.0 MB/s (4K per op)
snappycomp   :    3902.000 micros/op; (snappy failure)
snappyuncomp :    3821.000 micros/op; (snappy failure)
acquireload  :      13.088 micros/op; (each op is 1000 loads)

Baseline, MacBookPro13,3 with Core i7 6920HQ:
LevelDB:    version 1.20
Keys:       16 bytes each
Values:     100 bytes each (50 bytes after compression)
Entries:    1000000
RawSize:    110.6 MB (estimated)
FileSize:   62.9 MB (estimated)
WARNING: Snappy compression is not enabled
------------------------------------------------
fillseq      :       3.000 micros/op;   36.9 MB/s
fillsync     :      46.721 micros/op;    2.4 MB/s (1000 ops)
fillrandom   :       3.922 micros/op;   28.2 MB/s
overwrite    :       4.080 micros/op;   27.1 MB/s
readrandom   :       4.409 micros/op; (1000000 of 1000000 found)
readrandom   :       3.895 micros/op; (1000000 of 1000000 found)
readseq      :       0.190 micros/op;  582.4 MB/s
readreverse  :       0.413 micros/op;  267.6 MB/s
compact      :  441076.000 micros/op;
readrandom   :       2.308 micros/op; (1000000 of 1000000 found)
readseq      :       0.170 micros/op;  651.2 MB/s
readreverse  :       0.302 micros/op;  366.2 MB/s
fill100K     :     614.289 micros/op;  155.3 MB/s (1000 ops)
crc32c       :       3.547 micros/op; 1101.2 MB/s (4K per op)
snappycomp   :    3393.000 micros/op; (snappy failure)
snappyuncomp :    3171.000 micros/op; (snappy failure)
acquireload  :      12.761 micros/op; (each op is 1000 loads)

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=170100372
2017-09-26 18:50:41 -07:00
cmumford
09a3c8e741 Switched variable type from int to uint64_t in ConsumeDecimalNumber.
An Android test was occasionally crashing with a SEGV in ConsumeDecimalNumber
Switching a local variable from an int to uint64_t eliminated these crashes.
Speculating this is either a compiler, runtime library, or emulator issue.

Switching this type to uint64_t also eliminates a compiler warning
about comparing an int with a uint64_t.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=166399695
2017-08-24 15:40:54 -07:00
costan
8415f00eee leveldb: Report missing CURRENT manifest file as database corruption.
BTRFS reorders rename and write operations, so it is possible that a filesystem crash and recovery results in a situation where the file pointed to by CURRENT does not exist. DB::Open currently reports an I/O error in this case. Reporting database corruption is a better hint to the caller, which can attempt to recover the database or erase it and start over.

This issue is not merely theoretical. It was reported as having showed up in the wild at https://github.com/google/leveldb/issues/195 and at https://crbug.com/738961. Also, asides from the BTRFS case described above, incorrect data in CURRENT seems like a possible corruption case that should be handled gracefully.

The Env API changes here can be considered backwards compatible, because an implementation that returns Status::IOError instead of Status::NotFound will still get the same functionality as before.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=161432630
2017-07-10 14:14:00 -07:00
果冻
5b817400a0 fix comment 2017-03-10 14:23:19 +08:00
costan
f3f139737c Separate Env tests from PosixEnv tests.
env_test.cc defines EnvPosixTest which tests the Env implementation returned by Env::Default(). The naming is a bit unfortunate, as the tests in env_test.cc are written against the Env contract, and therefore are applicable to any Env implementation. An instance of the confusion caused by the naming is [] which added a dependency from env_test.cc to EnvPosixTestHelper, which is closely coupled to EnvPosix.

This change disentangles EnvPosix-specific test code into a env_posix_test.cc file. The code there uses EnvPosixTestHelper and specifically targets the EnvPosix implementation. env_test.cc now implements EnvTest, and contains tests that are also applicable to other ports, which may define their own Env implementation.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=148914642
2017-03-01 13:53:23 -08:00
costan
ea175e28f8 Implement support for Intel crc32 instruction (SSE 4.2)
This change authored by vadimskipin and submitted via:

    https://github.com/google/leveldb/pull/309

Changes made to support iOS builds and other architectures
without support for SSE 4.2.

db_bench reports original crc32 speed at:

    crc32c : 3.610 micros/op; 1082.0 MB/s (4K per op)

with this change performance has increased to:

    crc32c : 0.843 micros/op; 4633.6 MB/s (4K per op)

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=148694935
2017-02-28 14:08:46 -08:00
cmumford
95cd743e5e Including <limits> for std::numeric_limits.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=146841327
2017-02-09 14:09:51 -08:00
cmumford
646c3588de Limit the number of read-only files the POSIX Env will have open.
Background compaction can create an unbounded number of
leveldb::RandomAccessFile instances. On 64-bit systems mmap is used and
file descriptors are only used beyond a certain number of mmap's.
32-bit systems to not use mmap at all. leveldb::RandomAccessFile does not
observe Options.max_open_files so compaction could exhaust the file
descriptor limit.

This change uses getrlimit to determine the maximum number of open
files and limits RandomAccessFile to approximately 20% of that value.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=143505556
2017-01-04 09:13:20 -08:00
corrado
a2fb086d07 Add option for max file size. The currend hard-coded value of 2M is inefficient in colossus.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=134391640
2016-09-28 10:52:24 -07:00
m3b
06a191b8de fix problems in LevelDB's caching code
Background:

LevelDB uses a cache (util/cache.h, util/cache.cc) of (key,value)
pairs for two purposes:
- a cache of (table, file handle) pairs
- a cache of blocks

The cache places the (key,value) pairs in a reference-counted
wrapper.  When it returns a value, it returns a reference to this
wrapper.  When the client has finished using the reference and
its enclosed (key,value), it calls Release() to decrement the
reference count.

Each (key,value) pair has an associated resource usage.  The
cache maintains the sum of the usages of the elements it holds,
and removes values as needed to keep the sum below a capacity
threshold.  It maintains an LRU list so that it will remove the
least-recently used elements first.

The max_open_files option to LevelDB sets the size of the cache
of (table, file handle) pairs.  The option is not used in any
other way.

The observed behaviour:

If LevelDB at any time used more file handles concurrently than
the cache size set via max_open_files, it attempted to reduce the
number by evicting entries from the table cache.  This could
happen most easily during compaction, and if max_open_files was
low.  Because the handles were in use, their reference count did
not drop to zero, and so the usage sum in the cache was not
modified by the evictions.  Subsequent Insert() calls returned
valid handles, but their entries were immediately evicted from
the cache, which though empty still acted as though full.  As a
result, there was effectively no caching, and the number of open
file handles rose []ly until it hit system-imposed limits and
the process died.

If one set max_open_files lower, the cache was more likely to
exhibit this beahviour, and cause the process to run out of file
descriptors.  That is, max_open_files acted in almost exactly the
opposite manner from what was intended.

The problems:

1. The cache kept all elements on its LRU list eligible for capacity
   eviction---even those with outstanding references from clients.  This was
   ineffective in reducing resource consumption because there was an
   outstanding reference, guaranteeing that the items remained.  A secondary
   issue was that there is no guarantee that these in-use items will be the
   last things reached in the LRU chain, which actually recorded
   "least-recently requested" rather than "least-recently used".

2. The sum of usages was decremented not when a (key,value) was evicted from
   the cache, but when its reference count went to zero.  Thus, when things
   were removed from the cache, either by garbage collection or via Erase(),
   the usage sum was not necessarily decreased.  This allowed the cache to act
   as though full when it was in fact not, reducing caching effectiveness, and
   leading to more resources being consumed---the opposite of what the
   evictions were intended to achieve.

3. (minor) The cache's clients insert items into it by first looking up the
   key, and inserting only if no value is found.  Although the cache has an
   internal lock, the clients use no locking to ensure atomicity of the
   Lookup/Insert pair.  (see table/table.cc:  block_cache->Insert() and
   db/table_cache.cc:  cache_->Insert()).  Thus, if two threads Insert() at
   about the same time, they can both Lookup(), find nothing, and both
   Insert().  The second Insert() would evict the first value, leaving each
   thread with a handle on its own version of the data, and with the second
   version in the cache.  It would be better if both threads ended up with a
   handle on the same (key,value) pair, which implies it must be the first item
   inserted.  This suggests that Insert() should not replace an existing value.

   This can be made safe with current usage inside LeveDB itself, but this is
   not easy to change first because Cache is a public interface, so to change
   the semantics of an existing call might break things, second because Cache
   is an abstract virtual class, so adding a new abstract virtual method may
   break other implementations, and third, the new method "insert without
   replacing" cannot be implemented in terms of the existing methods, so cannot
   be implemented with a non-abstract default.   But fortunately, the effects
   of this issue are minor, so this issue is not fixed by this change.

The changes:

The assumption in the fixes is that it is always better to cache
entries unless removal from the cache would lead to deallocation.

Cache entries now have an "in_cache" boolean indicating whether
the cache has a reference on the entry.  The only ways that this can
become false without the entry being passed to its "deleter" are via
Erase(), via Insert() when an element with a duplicate key is inserted,
or on destruction of the cache.

The cache now keeps two linked lists instead of one.  All items
in the cache are in one list or the other, and never both.  Items
still referenced by clients but erased from the cache are in
neither list.  The lists are:
- in-use:  contains the items currently referenced by clients, in no particular
  order.  (This list is used for invariant checking.  If we removed the check,
  elements that would otherwise be on this list could be left as disconnected
  singleton lists.)
- LRU:  contains the items not currently referenced by clients, in LRU order

A new internal Ref() method increments the reference count.  If
incrementing from 1 to 2 for an item in the cache, it is moved
from the LRU list to the in-use list.

The Unref() call now moves things from the in-use list to the LRU
list if the reference count falls to 1, and the item is in the
cache.  It no longer adjusts the usage sum.  The usage sum now
reflects only what is in the cache, rather than including
still-referenced items that have been evicted.

The LRU_Append() now takes a "list" parameter so that it can be
used to append either to the LRU list or the in-use list.

Lookup() is modified to use the new Ref() call, rather than
adjusting the reference count and LRU chain directly.

Insert() eviction code is also modified to adjust the usage sum and the
in_cache boolean of the evicted elements.  Some LevelDB tests assume that there
will be no caching whatsoever if the cache size is set to zero, so this is
handled as a special case.

A new private method FinishErase() is factored out
with the common code from where items are removed from the cache.

Erase() is modified to adjust the usage sum and the in_cache
boolean of the erased elements, and to use FinishErase().

Prune() is modified to use FinishErase() also, and to make use of the fact that
the lru_ list now contains only items with reference count 1.

- EvictionPolicy is modified to test that an entry with an
outstanding handle is not evicted.  This test fails with the old cache.cc.

- A new test case UseExceedsCacheSize verifies that even when the
cache is overfull of entries with outstanding handles, none are
evicted.  This test fails with the old cache.cc, and is the key
issue that causes file descriptors to run out when the cache
size is set too small.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=123247237
2016-07-06 09:15:53 -07:00
ssid
706b7f8d43 Resolve race when getting approximate-memory-usage property
The write operations in the table happens without holding the mutex
lock, but concurrent writes are avoided using "writers_" queue.
The Arena::MemoryUsage could access the blocks when write happens.
So, the memory usage is cached in atomic word and can be loaded
from any thread safely.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=107573379
2015-12-09 11:27:50 -08:00
ssid
528c2bc6ad Add "approximate-memory-usage" property to leveldb::DB::GetProperty
The approximate RAM usage of the database is calculated from the memory
allocated for write buffers and the block cache. This is to give an
estimate of memory usage to leveldb clients.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=104222307
2015-12-09 10:34:58 -08:00
tzik
359b6bcec2 Add leveldb::Cache::Prune
Prune() drops on-memory read cache of the database, so that the client can
relief its memory shortage.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=101335710
2015-12-09 10:34:58 -08:00
pkasting
50e77a8263 Fix size_t/int comparison/conversion issues in leveldb.
The create function took |num_keys| as an int, but callers and implementers wanted it to function as a size_t (e.g. passing std::vector::size() in, passing it to vector constructors as a size arg, indexing containers by it, etc.).  This resulted in implicit conversions between the two types as well as warnings (found with Chromium's external copy of these sources, built with MSVC) about signed vs. unsigned comparisons.

The leveldb sources were already widely using size_t elsewhere, e.g. for key and filter lengths, so using size_t here is not inconsistent with the existing code.  However, it does change the public C API.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=101074871
2015-12-09 10:34:58 -08:00
Sanjay Ghemawat
ac1d69da31 LevelDB now attempts to reuse the preceding MANIFEST and log file when re-opened.
(Based on a suggestion by cmumford.)

"open" benchmark on my workstation speeds up significantly since we
can now avoid three fdatasync calls and a compaction per open:

  Before: ~80000 microseconds
  After:    ~130 microseconds

Details:

(1) Added Options::reuse_logs (currently defaults to false) to control
new behavior.  The intention is to change the default to true after some
baking.

(2) Added Env::NewAppendableFile() whose default implementation returns
a not-supported error.

(3) VersionSet::Recovery attempts to reuse the MANIFEST from which
it is recovering.

(4) DBImpl recovery attempts to reuse the last log file and memtable.

(5) db_test.cc now tests a new configuration that sets reuse_logs to true.

(6) fault_injection_test also tests a reuse_logs==true config.

(7) Added a new recovery_test.
2015-08-11 14:56:39 -07:00
Chris Mumford
803d69203a Release 1.18
Changes are:

* Update version number to 1.18
* Replace the basic fprintf call with a call to fwrite in order to
  work around the apparent compiler optimization/rewrite failure that we are
  seeing with the new toolchain/iOS SDKs provided with Xcode6 and iOS8.
* Fix ALL the header guards.
* Createed a README.md with the LevelDB project description.
* A new CONTRIBUTING file.
* Don't implicitly convert uint64_t to size_t or int.  Either preserve it as
  uint64_t, or explicitly cast. This fixes MSVC warnings about possible value
  truncation when compiling this code in Chromium.
* Added a DumpFile() library function that encapsulates the guts of the
  "leveldbutil dump" command. This will allow clients to dump
  data to their log files instead of stdout. It will also allow clients to
  supply their own environment.
* leveldb: Remove unused function 'ConsumeChar'.
* leveldbutil: Remove unused member variables from WriteBatchItemPrinter.
* OpenBSD, NetBSD and DragonflyBSD have _LITTLE_ENDIAN, so define
  PLATFORM_IS_LITTLE_ENDIAN like on FreeBSD. This fixes:
   * issue #143
   * issue #198
   * issue #249
* Switch from <cstdatomic> to <atomic>. The former never made it into the
  standard and doesn't exist in modern gcc versions at all.  The later contains
  everything that leveldb was using from the former.
  This problem was noticed when porting to Portable Native Client where no memory
  barrier is defined.  The fact that <cstdatomic> is missing normally goes
  unnoticed since memory barriers are defined for most architectures.
* Make Hash() treat its input as unsigned.  Before this change LevelDB files
  from platforms with different signedness of char were not compatible. This
  change fixes: issue #243
* Verify checksums of index/meta/filter blocks when paranoid_checks set.
* Invoke all tools for iOS with xcrun. (This was causing problems with the new
  XCode 5.1.1 image on pulse.)
* include <sys/stat.h> only once, and fix the following linter warning:
  "Found C system header after C++ system header"
* When encountering a corrupted table file, return Status::Corruption instead of
  Status::InvalidArgument.
* Support cygwin as build platform, patch is from https://code.google.com/p/leveldb/issues/detail?id=188
* Fix typo, merge patch from https://code.google.com/p/leveldb/issues/detail?id=159
* Fix typos and comments, and address the following two issues:
  * issue #166
  * issue #241
* Add missing db synchronize after "fillseq" in the benchmark.
* Removed unused variable in SeekRandom: value (issue #201)
2014-09-16 14:19:52 -07:00
David Grogan
0cfb990d58 Release LevelDB 1.15
- switched from mmap based writing to simpler stdio based writing. Has a
  minor impact (0.5 microseconds) on microbenchmarks for asynchronous
  writes. Synchronous writes speed up from 30ms to 10ms on linux/ext4.
  Should be much more reliable on diverse platforms.
- compaction errors now immediately put the database into a read-only
  mode (until it is re-opened). As a downside, a disk going out of
  space and then space being created will require a re-open to recover
  from, whereas previously that would happen automatically. On the
  plus side, many corruption possibilities go away.
- force the DB to enter an error-state so that all future writes fail
  when a synchronous log write succeeds but the sync fails.
- repair now regenerates sstables that exhibit problems
- fix issue 218 - Use native memory barriers on OSX
- fix issue 212 - QNX build is broken
- fix build on iOS with xcode 5
- make tests compile and pass on windows
2013-12-10 10:36:31 -08:00
David Grogan
0b9a89f40e Release LevelDB 1.14
Fix issues 200, 201

Also,
* Fix link to bigtable paper in docs.
* New sstables will have the file extension .ldb. .sst files will
continue to be recognized.
* When building for iOS, use xcrun to execute the compiler. This may
affect issue 177.
2013-09-19 13:49:19 -07:00
David Grogan
748539c183 LevelDB 1.13
Fix issues 77, 87, 182, 190.

Additionally, fix the bug described in
https://groups.google.com/d/msg/leveldb/yL6h1mAOc20/vLU64RylIdMJ
where a large contiguous keyspace of deleted data was not getting
compacted.

Also fix a bug where options.max_open_files was not getting clamped
properly.
2013-08-21 11:12:47 -07:00
David Grogan
7b094f12e4 Release leveldb 1.11
Fixes issues
161
174
178

As well as the issue reported by edouarda14@gmail.com about
MissingSSTFile unit test failing on windows.
2013-06-13 16:14:06 -07:00
David Grogan
28dad918f2 Release leveldb 1.10
Fixes issues
147 - thanks feniksgordonfreeman
153
156
166

Additionally,
* Remove calls to exit(1).
* Fix unused-variable warnings from clang.
* Fix possible overflow error related to num_restart value >= (2^32/4).
* Add leveldbutil to .gitignore.
* Add better log messages when Write is stalled on a compaction.
2013-05-14 17:03:07 -07:00
David Grogan
514c943a8e Make DB::Open fail if sst files are missing.
Also, cleanup for Clang's -Wimplicit-fallthrough warning.
2013-02-06 18:03:32 -08:00
David Grogan
946e5b5a4c Update to leveldb 1.6
Highlights
----------
Mmap at most 1000 files on Posix to improve performance for large databases.
Support for more architectures (thanks to Alexander K.)

Building and porting
--------------------
HP/UX support (issue 126)
AtomicPointer for ia64 (issue 123)
Sparc v9 support (issue 124)
Atomic ops for powerpc
Use -fno-builtin-memcmp only when using g++
Simplify IOS build rules (issue 114)
Use CXXFLAGS instead of CFLAGS when invoking C++ compiler (issue 118)
Fix snappy shared library problem (issue 94)
Fix shared library installation path regression
Endian-ness detection tweak for FreeBSD

Bug fixes
---------
Stop ignoring FLAGS_open_files in db_bench
Make bloom test behavior agnostic to endian-ness

Performance
-----------
Limit number of mmapped files to 1000 to improve perf for large dbs
Do not delay for 1 second on shutdown path (issue 125)

Misc
----
Make InMemoryEnv return a no-op logger
C binding now has a wrapper for free (issue 117)
Add thread-safety annotations
Added an in-process lock table (issue 120)
Make RandomAccessFile and SequentialFile non-copyable
2012-10-12 11:53:12 -07:00
Sanjay Ghemawat
075a35a6d3 Remove static initializer; fix endian-ness detection; fix build on
various platforms; improve android port speed.

Avoid static initializer by using a new portability interface for
thread-safe lazy initialization.  Custom ports will need to be
extended to implement InitOnce/OnceType/LEVELDB_ONCE_INIT.

Fix endian-ness detection (fixes Powerpc builds).

Build related fixes:
- Support platforms that have unversioned shared libraries.
- Fix IOS build rules.

Android improvements
- Speed up atomic pointers
- Share more code with port_posix.

Do not spin in a tight loop attempting compactions if the file system
is inaccessible (e.g., if kerberos tickets have expired or if it is out
of space).
2012-05-30 09:45:46 -07:00
Sanjay Ghemawat
85584d497e Added bloom filter support.
In particular, we add a new FilterPolicy class.  An instance
of this class can be supplied in Options when opening a
database.  If supplied, the instance is used to generate
summaries of keys (e.g., a bloom filter) which are placed in
sstables.  These summaries are consulted by DB::Get() so we
can avoid reading sstable blocks that are guaranteed to not
contain the key we are looking for.

This change provides one implementation of FilterPolicy
based on bloom filters.

Other changes:
- Updated version number to 1.4.
- Some build tweaks.
- C binding for CompactRange.
- A few more benchmarks: deleteseq, deleterandom, readmissing, seekrandom.
- Minor .gitignore update.
2012-04-17 08:36:46 -07:00
Sanjay Ghemawat
9013f13b15 use mmap on 64-bit machines to speed-up reads; small build fixes 2012-03-15 09:14:00 -07:00
Sanjay Ghemawat
3c8be108bf fixed issues 66 (leaking files on disk error) and 68 (no sync of CURRENT file) 2012-01-25 14:56:52 -08:00
Hans Wennborg
42fb47f6ed Pass system's CFLAGS, remove exit time destructor, sstable bug fix.
- Pass system's values of CFLAGS,LDFLAGS.
  Don't override OPT if it's already set.
  Original patch by Alessio Treglia <alessio@debian.org>:
  http://code.google.com/p/leveldb/issues/detail?id=27#c6

- Remove 1 exit time destructor from leveldb.
  See http://crbug.com/101600

- Fix problem where sstable building code would pass an
  internal key to the user comparator.

(Sync with uptream at 25436817.)
2011-11-14 17:06:16 +00:00
Hans Wennborg
36a5f8ed7f A number of fixes:
- Replace raw slice comparison with a call to user comparator.
  Added test for custom comparators.

- Fix end of namespace comments.

- Fixed bug in picking inputs for a level-0 compaction.

  When finding overlapping files, the covered range may expand
  as files are added to the input set.  We now correctly expand
  the range when this happens instead of continuing to use the
  old range.  For example, suppose L0 contains files with the
  following ranges:

      F1: a .. d
      F2:    c .. g
      F3:       f .. j

  and the initial compaction target is F3.  We used to search
  for range f..j which yielded {F2,F3}.  However we now expand
  the range as soon as another file is added.  In this case,
  when F2 is added, we expand the range to c..j and restart the
  search.  That picks up file F1 as well.

  This change fixes a bug related to deleted keys showing up
  incorrectly after a compaction as described in Issue 44.

(Sync with upstream @25072954)
2011-10-31 17:22:06 +00:00
Gabor Cselle
299ccedfec A number of bugfixes:
- Added DB::CompactRange() method.

  Changed manual compaction code so it breaks up compactions of
  big ranges into smaller compactions.

  Changed the code that pushes the output of memtable compactions
  to higher levels to obey the grandparent constraint: i.e., we
  must never have a single file in level L that overlaps too
  much data in level L+1 (to avoid very expensive L-1 compactions).

  Added code to pretty-print internal keys.

- Fixed bug where we would not detect overlap with files in
  level-0 because we were incorrectly using binary search
  on an array of files with overlapping ranges.

  Added "leveldb.sstables" property that can be used to dump
  all of the sstables and ranges that make up the db state.

- Removing post_write_snapshot support.  Email to leveldb mailing
  list brought up no users, just confusion from one person about
  what it meant.

- Fixing static_cast char to unsigned on BIG_ENDIAN platforms.

  Fixes	Issue 35 and Issue 36.

- Comment clarification to address leveldb Issue 37.

- Change license in posix_logger.h to match other files.

- A build problem where uint32 was used instead of uint32_t.

Sync with upstream @24408625
2011-10-05 16:30:28 -07:00
Hans Wennborg
213a68eb68 Sync with upstream @23860137.
Fix GCC -Wshadow warnings in LevelDB's public header files,
reported by Dustin.

Add in-memory Env implementation (helpers/memenv/*).
This enables users to create LevelDB databases in-memory.

Initialize ShardedLRUCache::last_id_ to zero.
This fixes a Valgrind warning.

(Also delete port/sha1_* which were removed upstream some time ago.)
2011-09-12 10:21:10 +01:00
gabor@google.com
e3584f9c28 Bugfix for issue 33; reduce lock contention in Get(), parallel benchmarks.
- Fix for issue 33 (non-null-terminated result from
  leveldb_property_value())

- Support for running multiple instances of a benchmark in parallel.

- Reduce lock contention on Get():
  (1) Do not hold the lock while searching memtables.
  (2) Shard block and table caches 16-ways.

  Benchmark for evaluating this change:
  $ db_bench --benchmarks=fillseq1,readrandom --threads=$n
  (fillseq1 is a small hack to make sure fillseq runs once regardless
  of number of threads specified on the command line).



git-svn-id: https://leveldb.googlecode.com/svn/trunk@49 62dab493-f737-651d-591e-8d6aee1b9529
2011-08-22 21:08:51 +00:00
gabor@google.com
ab323f7e1e Bugfixes for iterator and documentation.
- Fix bug in Iterator::Prev where it would return the wrong key.
  Fixes issues 29 and 30.

- Added a tweak to testharness to allow running just some tests.

- Fixing two minor documentation errors based on issues 28 and 25.

- Cleanup; fix namespaces of export-to-C code.
  Also fix one "const char*" vs "char*" mismatch.



git-svn-id: https://leveldb.googlecode.com/svn/trunk@48 62dab493-f737-651d-591e-8d6aee1b9529
2011-08-16 01:21:01 +00:00
dgrogan@chromium.org
a05525d13b @23023120
git-svn-id: https://leveldb.googlecode.com/svn/trunk@47 62dab493-f737-651d-591e-8d6aee1b9529
2011-08-06 00:19:37 +00:00
gabor@google.com
f122c6dfbb Adding FreeBSD support, removing Chromium files, adding benchmark.
- LevelDB patch for FreeBSD. This resolves Issue 22.
  Contributed by dforsythe (thanks!).

- Removing Chromium-specific files.
  They are now going to live in the Chromium repository.

- Adding a benchmark page comparing LevelDB performance
  to SQLite and Kyoto Cabinet's TreeDB, along with
  code to generate the benchmarks.
  Thanks to Kevin Tseng for compiling the benchmarks,
  and Scott Hess and Mikio Hirabayashi for their
  help and advice.



git-svn-id: https://leveldb.googlecode.com/svn/trunk@40 62dab493-f737-651d-591e-8d6aee1b9529
2011-07-27 01:46:25 +00:00
gabor@google.com
60bd8015f2 Speed up Snappy uncompression, new Logger interface.
- Removed one copy of an uncompressed block contents changing
  the signature of Snappy_Uncompress() so it uncompresses into a
  flat array instead of a std::string.
        
  Speeds up readrandom ~10%.

- Instead of a combination of Env/WritableFile, we now have a
  Logger interface that can be easily overridden applications
  that want to supply their own logging.

- Separated out the gcc and Sun Studio parts of atomic_pointer.h
  so we can use 'asm', 'volatile' keywords for Sun Studio.




git-svn-id: https://leveldb.googlecode.com/svn/trunk@39 62dab493-f737-651d-591e-8d6aee1b9529
2011-07-21 02:40:18 +00:00
gabor@google.com
6872ace901 Sun Studio support, and fix for test related memory fixes.
- LevelDB patch for Sun Studio
  Based on a patch submitted by Theo Schlossnagle - thanks!
  This fixes Issue 17.

- Fix a couple of test related memory leaks.



git-svn-id: https://leveldb.googlecode.com/svn/trunk@38 62dab493-f737-651d-591e-8d6aee1b9529
2011-07-19 23:36:47 +00:00
gabor@google.com
6699c7ebe6 Small tweaks and bugfixes for Issue 18 and 19.
Slight tweak to the no-overlap optimization: only push to
level 2 to reduce the amount of wasted space when the same
small key range is being repeatedly overwritten.

Fix for Issue 18: Avoid failure on Windows by avoiding
deletion of lock file until the end of DestroyDB().

Fix for Issue 19: Disregard sequence numbers when checking for 
overlap in sstable ranges. This fixes issue 19: when writing 
the same key over and over again, we would generate a sequence 
of sstables that were never merged together since their sequence
numbers were disjoint.

Don't ignore map/unmap error checks.

Miscellaneous fixes for small problems Sanjay found while diagnosing
issue/9 and issue/16 (corruption_testr failures).
- log::Reader reports the record type when it finds an unexpected type.
- log::Reader no longer reports an error when it encounters an expected
  zero record regardless of the setting of the "checksum" flag.
- Added a missing forward declaration.
- Documented a side-effects of larger write buffer sizes
  (longer recovery time).



git-svn-id: https://leveldb.googlecode.com/svn/trunk@37 62dab493-f737-651d-591e-8d6aee1b9529
2011-07-15 00:20:57 +00:00
gabor@google.com
f57e23351f Platform detection during build, plus compatibility patches for machines without <cstdatomic>.
This revision adds two major changes:
1. build_detect_platform which generates build_config.mk
   with platform-dependent flags for the build process
2. /port/atomic_pointer.h with anAtomicPointerimplementation
   for platforms without <cstdatomic>

Some of this code is loosely based on patches submitted to the 
LevelDB mailing list at https://groups.google.com/forum/#!forum/leveldb
Tip of the hat to Dave Smith and Edouard A, who both sent patches.

The presence of Snappy (http://code.google.com/p/snappy/) and
cstdatomic are now both detected in the build_detect_platform
script (1.) which gets executing during make.

For (2.), instead of broadly importing atomicops_* from Chromium or
the Google performance tools, we chose to just implement AtomicPointer 
and the limited atomic load and store operations it needs. 
This resulted in much less code and fewer files - everything is 
contained in atomic_pointer.h.



git-svn-id: https://leveldb.googlecode.com/svn/trunk@34 62dab493-f737-651d-591e-8d6aee1b9529
2011-06-29 00:30:50 +00:00
dgrogan@chromium.org
740d8b3d00 Update from upstream @21551990
* Patch LevelDB to build for OSX and iOS
* Fix race condition in memtable iterator deletion.
* Other small fixes.

git-svn-id: https://leveldb.googlecode.com/svn/trunk@29 62dab493-f737-651d-591e-8d6aee1b9529
2011-05-28 00:53:58 +00:00
dgrogan@chromium.org
da79909507 sync with upstream @ 21409451
Check the NEWS file for details of what changed.

git-svn-id: https://leveldb.googlecode.com/svn/trunk@28 62dab493-f737-651d-591e-8d6aee1b9529
2011-05-21 02:17:43 +00:00
dgrogan@chromium.org
be9f061d2f pull in hans' mac build fix
git-svn-id: https://leveldb.googlecode.com/svn/trunk@26 62dab493-f737-651d-591e-8d6aee1b9529
2011-04-21 01:54:51 +00:00
dgrogan@chromium.org
ba6dac0e80 @20776309
* env_chromium.cc should not export symbols.
* Fix MSVC warnings.
* Removed large value support.
* Fix broken reference to documentation file

git-svn-id: https://leveldb.googlecode.com/svn/trunk@24 62dab493-f737-651d-591e-8d6aee1b9529
2011-04-20 22:48:11 +00:00
dgrogan@chromium.org
69c6d38342 reverting disastrous MOE commit, returning to r21
git-svn-id: https://leveldb.googlecode.com/svn/trunk@23 62dab493-f737-651d-591e-8d6aee1b9529
2011-04-19 23:11:15 +00:00
dgrogan@chromium.org
b743906eea Revision created by MOE tool push_codebase.
MOE_MIGRATION=


git-svn-id: https://leveldb.googlecode.com/svn/trunk@22 62dab493-f737-651d-591e-8d6aee1b9529
2011-04-19 23:01:25 +00:00
dgrogan@chromium.org
b409afe968 chmod a-x
git-svn-id: https://leveldb.googlecode.com/svn/trunk@21 62dab493-f737-651d-591e-8d6aee1b9529
2011-04-18 23:15:58 +00:00
dgrogan@chromium.org
f779e7a5d8 @20602303. Default file permission is now 755.
git-svn-id: https://leveldb.googlecode.com/svn/trunk@20 62dab493-f737-651d-591e-8d6aee1b9529
2011-04-12 19:38:58 +00:00
jorlow@chromium.org
4671a695fc Move include files into a leveldb subdir.
git-svn-id: https://leveldb.googlecode.com/svn/trunk@18 62dab493-f737-651d-591e-8d6aee1b9529
2011-03-30 18:35:40 +00:00
jorlow@chromium.org
4d66fd5af3 Upstream change.
git-svn-id: https://leveldb.googlecode.com/svn/trunk@17 62dab493-f737-651d-591e-8d6aee1b9529
2011-03-29 22:41:11 +00:00
jorlow@chromium.org
e2da744e12 Upstream changes.
git-svn-id: https://leveldb.googlecode.com/svn/trunk@16 62dab493-f737-651d-591e-8d6aee1b9529
2011-03-28 20:43:44 +00:00
jorlow@chromium.org
e11bdf1935 Upstream changes
git-svn-id: https://leveldb.googlecode.com/svn/trunk@15 62dab493-f737-651d-591e-8d6aee1b9529
2011-03-25 20:27:43 +00:00
jorlow@chromium.org
8303bb1b33 Pull from upstream.
git-svn-id: https://leveldb.googlecode.com/svn/trunk@14 62dab493-f737-651d-591e-8d6aee1b9529
2011-03-22 23:24:02 +00:00
jorlow@chromium.org
6d243ebf79 Make GetTestDirectory threadsafe within Chromium and make it work on Windows.
git-svn-id: https://leveldb.googlecode.com/svn/trunk@13 62dab493-f737-651d-591e-8d6aee1b9529
2011-03-22 19:07:54 +00:00
jorlow@chromium.org
4bcb231187 more upstream changes
git-svn-id: https://leveldb.googlecode.com/svn/trunk@10 62dab493-f737-651d-591e-8d6aee1b9529
2011-03-21 21:06:49 +00:00
jorlow@chromium.org
0e38925490 Sync in bug fixes
git-svn-id: https://leveldb.googlecode.com/svn/trunk@9 62dab493-f737-651d-591e-8d6aee1b9529
2011-03-21 19:40:57 +00:00
jorlow@chromium.org
f67e15e50f Initial checkin.
git-svn-id: https://leveldb.googlecode.com/svn/trunk@2 62dab493-f737-651d-591e-8d6aee1b9529
2011-03-18 22:37:00 +00:00