foundationdb / fdb-document-layer
A document data model on FoundationDB, implementing the MongoDB® wire protocol.
License: Apache License 2.0
Document Layer allows users to create compound indexes and updates them properly. But the query planner is not really using them yet: it uses them just like simple indexes. For example, if we have a compound index on fields `a` and `b`, the query planner treats the index as a simple index on `a`. So, for a query with the predicate `a == "foo" and b == "bar"`, the query planner scans the index with bounds `foo, foo0` and runs a FilterPlan on the results to look for `bar`. Instead, it should scan with bounds `foo:bar, foo:bar0`.
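To make those bounds concrete, here is a minimal sketch using the FDB Python tuple layer (illustrative only - the real index keys also carry collection and index prefixes):

import fdb
fdb.api_version(510)
import fdb.tuple

# Today: only the leading field of the compound index constrains the scan,
# so everything under ("foo", ...) is fetched and 'b' is filtered in memory.
scan_today = fdb.tuple.range(("foo",))

# Desired: both equality predicates tighten the range, so only keys under
# ("foo", "bar", ...) are read.
scan_desired = fdb.tuple.range(("foo", "bar"))

print(scan_today.start, scan_today.stop)      # the 'foo, foo0'-style bounds
print(scan_desired.start, scan_desired.stop)  # the 'foo:bar, foo:bar0'-style bounds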
Document Layer returns only secondary indexes in response to `listIndexes()`. Mongo drivers expect the primary index (on `_id`, which is maintained anyway) to also be part of the response. It's an easy fix; we might as well do it just to make the drivers happy.
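For example, a driver listing indexes through pymongo sees only the secondary indexes today (endpoint and collection names here are assumptions):

from pymongo import MongoClient

db = MongoClient('localhost', 27018).test   # a local Document Layer
for idx in db.coll.list_indexes():
    print(idx['name'])                       # should include '_id_' as well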
Document Layer is storing numeric field names as binary integers, which limits numeric field names to the integer range. I'm guessing this decision was taken to maintain the ordering of numeric fields, but that is not a valid assumption: all JSON field names are strings.
In [74]: db.coll.insert({'_id': 'foo', '12345678901234567890': 'test'})
Out[74]: 'foo'
In [75]: for row in db.coll.find():
...: print row
...:
{u'_id': u'foo', u'-1': u'test'}
Also, changing this behavior would make the array expansion code much cleaner. As array indexes are integers, it is hard to tell whether a key points to an array element or a numeric field.
The related code is here:
void insertElementRecursive(bson::BSONElement const& elem, Reference<IReadWriteContext> cx) {
	std::string fn = elem.fieldName();
	// Field names made up entirely of digits are converted to integers here,
	// so a numeric field name is indistinguishable from an array index. Note
	// also that atoi() silently overflows on names like "12345678901234567890",
	// which is presumably how the '-1' in the example above appears.
	if (std::all_of(fn.begin(), fn.end(), ::isdigit)) {
		const char* c_fn = fn.c_str();
		insertElementRecursive(atoi(c_fn), elem, cx);
	} else {
		insertElementRecursive(fn, elem, cx);
	}
}
It's just syntactic sugar on top of FDB transactions, but it makes the code much better and more readable by making it clear whether a transaction is ever used for updates. We can't do this with the current code, as the Flow binding classes are not virtual.
blocked on: apple/foundationdb#1027
The mongo-express admin web interface is broken with:
at /app/server/node_modules/mongo-express/lib/routes/database.js:40:49
at handleCallback (/app/server/node_modules/mongodb/lib/utils.js:95:56)
at /app/server/node_modules/mongodb/lib/db.js:313:5
at /app/server/node_modules/mongodb-core/lib/connection/pool.js:455:18
at process.internalTickCallback (internal/process/next_tick.js:70:11)
It is failing at this line of the code, where it is trying to show `numExtents` from `dbStats`:
https://docs.mongodb.com/manual/reference/command/dbStats/#dbStats.numExtents
Blocking #29
In quite a few places we use `std::string` to store byte buffers that are not printable strings. We should use `Standalone<StringRef>` instead, and use `std::string` only for printable strings.
This will help to keep the binary smaller.
As with FoundationDB itself, there's no official Docker image, despite the presence of a `Dockerfile`. It'd be nice to have an official image.
We have some smoke tests as part of correctness. Although we call them unit tests, they need FDB and Doc Layer instances to be set up, and the Python scripts then run some smoke tests against them. We should have some kind of verification with PRBs.
We don't support capped collections. Until we do, we should explicitly fail if someone tries to create one.
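For example, with pymongo (endpoint and names assumed), this should return an error rather than quietly creating an ordinary collection:

from pymongo import MongoClient

db = MongoClient('localhost', 27018).test   # a local Document Layer

# A capped-collection request; the Document Layer should fail this
# explicitly instead of ignoring the 'capped' option.
db.create_collection('events', capped=True, size=1048576)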
$ python test/correctness/document-correctness.py --doclayer-host localhost --doclayer-port 27018 forever doclayer mm --seed 9203356367461099619 --num-doc 300 --num-iter 1 --no-update --no-sort --no-numeric-fieldnames
Instance: 0459504932136
========================================================
ID : 35746 iteration : 1
========================================================
Query results didn't match!
Query: {'$and': [{u'E': None}, {'$and': [{u'C': None}, {u'A': {'$lte': 'c'}}]}]}
Projection: OrderedDict([(u'C', True), (u'D', True)])
pymongo.collection (0)
mongo_model (1): {u'_id': datetime.datetime(1970, 1, 22, 10, 7, 43)}
RESULT SET DIFFERENCES (as 'sets' so order within the returned results is not considered)
Only in mongo_model : {'_id': 1970-01-22 10:07:43}
python /Users/bmuppana/src/fdb-document-layer/test/correctness/document-correctness.py --mongo-host localhost --mongo-port 27018 --doclayer-host localhost --doclayer-port 27018 forever doclayer mm --seed 9203356367461099619 --num-doc 300 --num-iter 1 --no-update --no-sort --no-numeric-fieldnames
Found this against d2840e9. Consistently reproducible with the above seed.
`explain()` returns information about the query plan: whether an index is being used, and how the scan is being done. Right now, it looks something like this:
In [13]: db.correctness475041058659.find({'A': { 'A': None, 'C': {}}}).explain()
Out[13]:
{u'explanation': {u'source_plan': {u'projection': u'{}',
u'source_plan': {u'bounds': {u'begin': u'3\\x10\\x00\\xff\\x00\\xff\\x00\\xff\\x0aA\\x00\\xff\\x03C\\x00\\xff\\x05\\x00\\xff\\x00\\xff\\x00\\xff\\x00\\xff\\x00\\xff\\x00',
u'end': u'3\\x10\\x00\\xff\\x00\\xff\\x00\\xff\\x0aA\\x00\\xff\\x03C\\x00\\xff\\x05\\x00\\xff\\x00\\xff\\x00\\xff\\x00\\xff\\x00\\xff\\x00'},
u'index name': u'A_1_B_1_B_1',
u'type': u'index scan'},
u'type': u'projection'},
u'type': u'non-isolated'}}
Although this gives an overview of what's going on, it would be much more useful if the keys were user readable instead of raw FDB keys.
If you create a unique index and the fields of the index are missing from a document, the missing value is treated as a unique value for the index. This is not how MongoDB behaves.
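A minimal repro of the difference (names assumed): MongoDB indexes a missing field as null, so a non-sparse unique index rejects the second document, while the Document Layer currently accepts both:

from pymongo import MongoClient

db = MongoClient('localhost', 27018).test   # a local Document Layer
db.coll.create_index('email', unique=True)

db.coll.insert_one({'_id': 1})   # no 'email' field; MongoDB indexes it as null
db.coll.insert_one({'_id': 2})   # MongoDB: DuplicateKeyError; Document Layer: succeeds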
This is a follow-up feature request for #4.
Once we have fault-tolerant index builds, we should look into parallel builds. This can follow the task bucket pattern. Tasks should be small enough to fit in one transaction; if a task doesn't finish in a single transaction, it should be split. This makes each task atomic. Maintaining status could be as simple as tracking pending tasks. Reading shard keys between the index bounds would give us an estimate of the index size.
It would have been ideal to use the task buckets here, but they live in fdbclient and use ReadYourWritesTransaction; they don't go through fdb_flow. Also, index rebuild is a very specific case, so we are better off with custom code.
Build fails with "error: version 'fdb' requested but 'g++-fdb' not found and version '7.3.0' of default 'g++' does not match", and I don't know how to install g++-fdb. Here is the full log:
zhifan@ubuntu-zhifan:~/github$ curl -L -J -O https://dl.bintray.com/boostorg/release/1.67.0/source/boost_1_67_0.tar.gz && tar -xzf boost_1_67_0.tar.gz && cd boost_1_67_0 && ./bootstrap.sh --prefix=./ && echo "using gcc : fdb : ${CXX} ;" >> ./tools/build/src/user-config.jam && cat ./tools/build/src/user-config.jam && ./b2 toolset=gcc-fdb install --with-filesystem --with-system
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- 0:00:09 --:--:-- 0
100 98.5M 100 98.5M 0 0 3102k 0 0:00:32 0:00:32 --:--:-- 5666k
curl: Saved to filename 'boost_1_67_0.tar.gz'
Building Boost.Build engine with toolset gcc... tools/build/src/engine/bin.linuxx86_64/b2
Detecting Python version... 2.7
Detecting Python root... /usr
Unicode/ICU support for Boost.Regex?... /usr
Generating Boost.Build configuration in project-config.jam...
Bootstrapping is done. To build, run:
./b2
To adjust configuration, edit 'project-config.jam'.
Further information:
- Command line help:
./b2 --help
- Getting started guide:
http://www.boost.org/more/getting_started/unix-variants.html
- Boost.Build documentation:
http://www.boost.org/build/doc/html/index.html
using gcc : fdb : ;
/home/zhifan/github/boost_1_67_0/tools/build/src/tools/gcc.jam:125: in gcc.init from module gcc
error: toolset gcc initialization:
error: version 'fdb' requested but 'g++-fdb' not found and version '7.3.0' of default 'g++' does not match
error: initialized from /home/zhifan/github/boost_1_67_0/tools/build/src/user-config.jam:1
/home/zhifan/github/boost_1_67_0/tools/build/src/build/toolset.jam:44: in toolset.using from module toolset
/home/zhifan/github/boost_1_67_0/tools/build/src/build/project.jam:1052: in using from module project-rules
/home/zhifan/github/boost_1_67_0/tools/build/src/user-config.jam:1: in modules.load from module user-config
/home/zhifan/github/boost_1_67_0/tools/build/src/build-system.jam:255: in load-config from module build-system
/home/zhifan/github/boost_1_67_0/tools/build/src/build-system.jam:453: in load-configuration-files from module build-system
/home/zhifan/github/boost_1_67_0/tools/build/src/build-system.jam:607: in load from module build-system
/home/zhifan/github/boost_1_67_0/tools/build/src/kernel/modules.jam:295: in import from module modules
/home/zhifan/github/boost_1_67_0/tools/build/src/kernel/bootstrap.jam:139: in boost-build from module
/home/zhifan/github/boost_1_67_0/boost-build.jam:17: in module scope from module
My env:
zhifan@ubuntu-zhifan:~/github/boost_1_67_0$ g++ -v
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/7/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 7.3.0-27ubuntu1~18.04' --with-bugurl=file:///usr/share/doc/gcc-7/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --prefix=/usr --with-gcc-major-version-only --program-suffix=-7 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --enable-default-pie --with-system-zlib --with-target-system-zlib --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 7.3.0 (Ubuntu 7.3.0-27ubuntu1~18.04)
Currently, some constant values, like the system namespace or the wire protocol version, are defined as plain string literals at random places all over the code base. This will soon become tech debt and will come back to bite us. It's a good idea to consolidate these values into one or a few places (think of `error_definitions.h`, for example).
The `renameCollection` command is not supported:
> db.adminCommand( { renameCollection: "vishaltest.stores", to : "vishaltest.stores123"})
{
"errmsg" : "no such cmd: renamecollection",
"bad cmd" : "{ renameCollection: "vishaltest.stores", to: "vishaltest.stores123" }",
"ok" : 0
}
> use vishaltest
switched to db vishaltest
> db.stores.renameCollection("stores1234")
{
"errmsg" : "no such cmd: renamecollection",
"bad cmd" : "{ renameCollection: "vishaltest.stores", to: "vishaltest.stores1234", dropTarget: false }",
"ok" : 0
}
>
Right now, the Document Layer does not have any metric reporting plugin other than the default `ConsoleMetric` reporter, which simply logs the metrics to TraceFiles, an FDB-specific logging infrastructure. Thus we want a new metric plugin that is more familiar to the community. Prometheus has picked up a lot of momentum in recent years, so we would go with that.
Thanks to the design, providing a new metric reporting plugin is easy: when starting the `DocLayer` process, tell it to load the plugin library by passing the `--metric_plugin` and `--metric_plugin_config` arguments.
We think this will benefit the community in many ways.
Note: when using/creating the client code, keep in mind that DocLayer is written in Flow, whose model has a single process and a single thread running on a giant event loop. Thus the plugin code must NOT block.
The query planner uses indexes only to satisfy predicates; sorting is done in memory. For a query like `db.coll.find().sort('section').limit(5)`, without indexes it would have to bring all the documents into memory and sort them, whereas with an index on `section` it would just have to fetch 5 documents. The current behavior is bad both in the number of keys fetched from FDB and in memory utilization.
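A sketch of the scenario with pymongo (names assumed); today the query below still pulls the whole collection into memory even though a suitable index exists:

from pymongo import MongoClient

db = MongoClient('localhost', 27018).test   # a local Document Layer
db.coll.create_index('section')

# With an index-aware SortPlan this would touch only 5 index entries
# (plus the 5 documents); today it fetches and sorts everything.
top5 = list(db.coll.find().sort('section').limit(5))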
If the Document Layer instance bounces, ongoing index rebuild tasks stop and never restart. It is possible to find these stopped tasks by querying special keys; by maintaining index status we can restart these index rebuilds. To make sure two instances don't do the same work, we have to maintain some kind of locks based on FoundationDB keys.
We have a `.clang-format` style file committed to the repo, and all commits should follow that style. We should have a CMake target that fails if the formatting is off; that will make the style easy to enforce.
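A sketch of a checker (the script name and file globs are mine) that a `check-format` CMake target could run via `add_custom_target`; it exits non-zero whenever `clang-format` would change a file:

#!/usr/bin/env python
# check_format.py: fail the build if any source file is not clang-format clean.
import subprocess
import sys

files = subprocess.check_output(['git', 'ls-files', '*.cpp', '*.h']).split()
dirty = []
for f in files:
    formatted = subprocess.check_output(['clang-format', '-style=file', f])
    with open(f, 'rb') as fh:
        if fh.read() != formatted:
            dirty.append(f)
if dirty:
    print('clang-format check failed for:')
    for f in dirty:
        print('  ' + f)
    sys.exit(1)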
Even though the Document Layer implements the MongoDB API, it has completely different performance characteristics; the reasons are described here. We should do a standardized performance test and document the issues and procedure. We should also set up continuous performance test runs to identify regressions, but that is not in the scope of this issue.
The `collStats` command is used across many different tools and frameworks, like MongoExpress and Spark. Some Spark jobs won't even start without some stats in `collStats`. The following is the `collStats` response format:
{
"ns" : <string>,
"count" : <number>,
"size" : <number>,
"avgObjSize" : <number>,
"storageSize" : <number>,
"capped" : <boolean>,
"max" : <number>,
"maxSize" : <number>,
"wiredTiger" : {
},
"nindexes" : <number>, // number of indexes
"totalIndexSize" : <number>, // total index size in bytes
"indexSizes" : { // size of specific indexes in bytes
"_id_" : <number>,
"username" : <number>
},
// ...
"ok" : <number>
}
We don't have to implement all the fields as part of this issue.
`count` - This is important for Spark jobs to work reasonably well. We can use atomic operations to maintain the count. A trivial implementation would maintain a single counter, which would create a hot key. Considering write hot keys are not that bad, and atomic operations don't cause any conflict ranges, this could be a reasonable immediate solution.
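A minimal sketch of that counter with the FDB Python bindings (the key layout is made up; the real one would live under the collection's metadata): `add` is an atomic mutation, so concurrent writers neither read the key nor conflict on it:

import struct

import fdb
fdb.api_version(510)
db = fdb.open()

COUNT_KEY = 'test.coll/count'   # hypothetical metadata key

@fdb.transactional
def on_insert(tr, n=1):
    # AddValue mutation: no read and no conflict range on the hot key.
    tr.add(COUNT_KEY, struct.pack('<q', n))

@fdb.transactional
def get_count(tr):
    v = tr[COUNT_KEY]
    return struct.unpack('<q', str(v))[0] if v.present() else 0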
mongo-express is a popular web-based admin tool. This is an umbrella issue that covers all the tasks needed for compatibility with it.
Document Layer indexes are always ascending, irrespective of the direction given in the index specification. This is fine for simple indexes, as the index scan can call a reverse `getRange()` on FDB to get keys in descending order. But mixed directions in compound indexes can be very useful, especially for SortPlan.
According to the forum post here.
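For reference, this is the kind of index specification in question (pymongo, names assumed); today the `DESCENDING` part is effectively ignored at storage time:

import pymongo
from pymongo import MongoClient

db = MongoClient('localhost', 27018).test   # a local Document Layer

# Mixed-direction compound index, useful for queries like
# find({'section': ...}).sort('price', pymongo.DESCENDING)
db.coll.create_index([('section', pymongo.ASCENDING),
                      ('price', pymongo.DESCENDING)])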
$ python test/correctness/document-correctness.py --doclayer-host localhost --doclayer-port 27018 forever doclayer mm --seed 4994950075151235634 --num-doc 300 --num-iter 1 --no-update --no-sort --no-numeric-fieldnames
Instance: 621285605982
========================================================
ID : 44348 iteration : 1
========================================================
Key length exceeds limit
.......
python /Users/bmuppana/src/fdb-document-layer/test/correctness/document-correctness.py --mongo-host localhost --mongo-port 27018 --doclayer-host localhost --doclayer-port 27018 forever doclayer mm --seed 4994950075151235634 --num-doc 300 --num-iter 1 --no-update --no-sort --no-numeric-fieldnames
Stack trace from trace logs
<Event Severity="40" Time="1547842023.948421" Type="BD_doIndexUpdate" ID="0000000000000000" error="Key length exceeds limit" Backtrace="atos -o fdbdoc.debug -arch x86_64 -l 0x10e5c4000 0x10e968c91 0x10e968d6f 0x10e7eedb3 0x10e800482 0x10e7ee699 0x10e7f53ce 0x10e7f4de5 0x10e7f7437 0x10e7f3af7 0x10e5eed2e 0x10e7f4ee4 0x10e7f52a0 0x10e7dd43e 0x10e7dc125 0x10e7dc07d 0x10e7dd582 0x10e7dbc37 0x10e744cdc 0x10e77f1d8 0x10e7fabd3 0x10e7fb8d2 0x10e7f99d7 0x10e77ce6e 0x10e77cc95 0x10e77d157 0x10e77c947 0x10e7c3b2e 0x10e8d61b5 0x10e8d6a57 0x10e8d6177 0x10e77ce6e 0x10e8d28b5 0x10e8d3509 0x10e8d5672 0x10e8d1af7 0x10e7ca18c 0x10e7ca118 0x..." Machine="127.0.0.1:27018" LogGroup="default" />
Found this against d2840e9. Consistently reproducible with the above seed.
In [2]: db.coll.create_index('A', name='A_index')
Out[2]: 'A_index'
In [3]: db.coll.drop_index('A_index')
From server logs
S -> C: REPLY: documents=[ { ok: 0.0, err: "Range begin key larger than end key", code: 2005 } ], responseFlags=0, cursorID=0, startingFrom=0 (HEADER: messageLength=108, requestID=0, responseTo=114807987, opCode=1)
We had to disable this warning due to bad code in one place. It should be easy to reproduce by re-enabling this flag here.
Instance: 449970630051
Traceback (most recent call last):
File "/app/deploy/ensembles/20180925-094251-bmuppana-cc417819a7354838/correctness/document-correctness.py", line 550, in <module>
okay = ns['func'](ns)
File "/app/deploy/ensembles/20180925-094251-bmuppana-cc417819a7354838/correctness/document-correctness.py", line 438, in start_forever_test
return test_forever(ns)
File "/app/deploy/ensembles/20180925-094251-bmuppana-cc417819a7354838/correctness/document-correctness.py", line 378, in test_forever
(client1, client2, collection1, collection2) = get_clients_and_collections(ns)
File "/app/deploy/ensembles/20180925-094251-bmuppana-cc417819a7354838/correctness/document-correctness.py", line 43, in get_clients_and_collections
transactional_shim.remove(collection1)
File "/app/deploy/ensembles/20180925-094251-bmuppana-cc417819a7354838/correctness/transactional_shim.py", line 8, in func_wrapper
ret = func(*args, **kwargs)
File "/app/deploy/ensembles/20180925-094251-bmuppana-cc417819a7354838/correctness/transactional_shim.py", line 47, in func_wrapper
return func(*args, **kwargs)
File "/app/deploy/ensembles/20180925-094251-bmuppana-cc417819a7354838/correctness/transactional_shim.py", line 56, in _gen_func
return getattr(collection, name)(*args, **kwargs)
File "/app/.python2/lib/python2.7/site-packages/pymongo/collection.py", line 2996, in remove
spec_or_id, multi, write_concern, collation=collation)
File "/app/.python2/lib/python2.7/site-packages/pymongo/collection.py", line 1123, in _delete_retryable
_delete, session)
File "/app/.python2/lib/python2.7/site-packages/pymongo/mongo_client.py", line 1102, in _retryable_write
return self._retry_with_session(retryable, func, s, None)
File "/app/.python2/lib/python2.7/site-packages/pymongo/mongo_client.py", line 1079, in _retry_with_session
return func(session, sock_info, retryable)
File "/app/.python2/lib/python2.7/site-packages/pymongo/collection.py", line 1119, in _delete
retryable_write=retryable_write)
File "/app/.python2/lib/python2.7/site-packages/pymongo/collection.py", line 1099, in _delete
_check_write_command_response(result)
File "/app/.python2/lib/python2.7/site-packages/pymongo/helpers.py", line 207, in _check_write_command_response
_raise_last_write_error(write_errors)
File "/app/.python2/lib/python2.7/site-packages/pymongo/helpers.py", line 189, in _raise_last_write_error
raise WriteError(error.get("errmsg"), error.get("code"), error)
pymongo.errors.WriteError: "Collection metadata changed during operation."
This turned out to be a race condition in new collection metadata creation.
On every request, collection metadata is fetched with the function `assembleCollectionContext()`. This function creates a new collection if the collection is not already present. The collection context is also cached for the sake of performance. When a new collection is created, it is not immediately inserted into the cache, as the transaction might still fail; the next request inserts it into the cache.
The race condition happens as follows: while the transaction that created the new collection is still in flight, another request calls `assembleCollectionContext()`, which creates a new context again. Obviously, that doesn't match the old context that was created; hence the issue.
Among the things we need to do for this issue: change `assembleCollectionContext()` to create a new collection only when explicitly asked to.

`NonIsolatedPlan` is used for requests that can't guarantee atomic operations. For example, updates or queries can have predicates that match too many documents to read within 5 seconds, and consequently within one transaction. `NonIsolatedPlan` splits such a task into multiple transactions. The task is checkpointed after each transaction is committed, so in case of failure the task is retried from the last checkpoint.
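A sketch of the checkpointing idea in the FDB Python bindings (function and parameter names are mine; error handling and retry loops are elided): each committed batch is a checkpoint, so a failure resumes from the last committed key instead of from the beginning:

import fdb
fdb.api_version(510)

def non_isolated_scan(db, begin, end, process, batch=1000):
    cursor = fdb.KeySelector.first_greater_or_equal(begin)
    while True:
        tr = db.create_transaction()
        kvs = list(tr.get_range(cursor, end, limit=batch))
        for kv in kvs:
            process(tr, kv)        # this batch's reads and writes
        tr.commit().wait()         # checkpoint: the batch is now durable
        if len(kvs) < batch:
            return                 # scanned the whole range
        cursor = fdb.KeySelector.first_greater_than(kvs[-1].key)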
`NonIsolatedPlan` maintains its transactions itself, except for the first transaction, which is created before the metadata is read. The first transaction is usually read-only until it is passed into `NonIsolatedPlan`, especially for an RW plan. The only exception is when a new collection is created implicitly. We should maintain the transaction in a single location (function) - including creation, retries, and commit. We can make this possible by having read-only transactions and splitting collection creation into its own transaction (when we are not in an explicit transaction).
This cleanup needs the following subtasks.
Occasional failures, not reproducible with any specific seed, but they show up in about 1 out of 4000 runs.
The mongo driver I'm using: https://github.com/mongodb/mongo-go-driver
The UpdateOne and UpdateMany functions work with MongoDB itself but not with fdb-doc.
The error I get is: command failure: {"errmsg": "command [update] failed with err: An unknown error occurred","bad cmd": "{ update: \"clickdata\", updates: [ { q: { auction_id_with_imp_index: \"hohololiiii\" }, u: { $set: {abc: \"aaa\" } }, multi: false } ] }","ok": {"$numberInt":"0"}}
The only update function I can make work is FindOneAndUpdate, but that function always returns something even if the update didn't work.
The issue is reported on forums. I also tried this on 16.04, same issue there as well.
We have slow query logging, which logs the query and plan if the plan contains a table scan. We already measure query time for metrics; we should also log slow queries that take longer than a certain threshold.
At the moment, distribution packages are built by the default build (with no make targets); the `packages` target builds only the `tar.gz`. Ideally, we should build all packages with the `packages` target. This is cleaner and makes the default build faster, which is what we usually care about in development.
So far, `MongoModel`, in the correctness code, silently ignores indexes, as indexes are not important for verifying the correctness of the Document Layer. But some indexes can have an impact on functionality; we are doing the relevant work for unique indexes in #67.
There are other cases like this. We are seeing quite a few correctness failures due to multikey index limitations, as we don't allow compound indexes on arrays that would create more than 1000 index keys for a document. We should generate errors from MongoModel for these kinds of failures, to be consistent.
Blocking #67
Linking bin/fdbdoc
ld: warning: text-based stub file /System/Library/Frameworks//CoreFoundation.framework/CoreFoundation.tbd and library file /System/Library/Frameworks//CoreFoundation.framework/CoreFoundation are out of sync. Falling back to library file for linking.
ld: warning: text-based stub file /System/Library/Frameworks//IOKit.framework/IOKit.tbd and library file /System/Library/Frameworks//IOKit.framework/IOKit are out of sync. Falling back to library file for linking.
ld: warning: direct access in function 'boost::system::error_category::std_category::equivalent(int, std::__1::error_condition const&) const' from file '.objs/./ConsoleMetric.actor.g.cpp.o' to global weak symbol 'typeinfo for boost::system::error_category::std_category' from file '/Users/bmuppana/src/boost_1_67_0/stage/lib/libboost_filesystem.a(codecvt_error_category.o)' means the weak symbol cannot be overridden at runtime. This was likely caused by different translation units being compiled with different visibility settings.
ld: warning: direct access in function 'boost::system::error_category::std_category::equivalent(std::__1::error_code const&, int) const' from file '.objs/./ConsoleMetric.actor.g.cpp.o' to global weak symbol 'typeinfo for boost::system::error_category::std_category' from file '/Users/bmuppana/src/boost_1_67_0/stage/lib/libboost_filesystem.a(codecvt_error_category.o)' means the weak symbol cannot be overridden at runtime. This was likely caused by different translation units being compiled with different visibility settings.
ld: warning: direct access in function 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::system::system_error> >::rethrow() const' from file '.extdep/osx/flow-6.0.8-osx-x86_64/lib/libflow.a(Net2.actor.g.cpp.o)' to global weak symbol 'typeinfo for boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::system::system_error> >' from file '.objs/./IMetric.cpp.o' means the weak symbol cannot be overridden at runtime. This was likely caused by different translation units being compiled with different visibility settings.
The Document Layer doesn't implement any authentication mechanisms. As long as it is used as a sidecar this is not a problem, as it can be configured to accept only local connections. But if we want to offer the Document Layer as a service, we have to depend on other security mechanisms. Mutual TLS is the strongest of the authentication schemes supported by MongoDB drivers; this issue should address it.
FoundationDB does support mutual TLS. As the Document Layer is also written in Flow, we should be able to reuse that code.
To work seamlessly with drivers, the Document Layer's TLS implementation should be compatible with the MongoDB drivers.
`findAndModify` and `collStats` have been broken since #26, when we changed the command name check to all lower case. We should instead look at the first field in the command query BSON.
The downloads page has macOS and Ubuntu packages, but not CentOS. Test and publish CentOS packages.
The Document Layer has a correctness framework that depends on a deterministic comparison of query results between an in-memory simulation of MongoDB and our implementation. We should have some documentation on it, at the very least on how to run it.
At the moment, we only have unit tests for unique indexes. We should update correctness to test unique indexes too. MongoModel doesn't implement indexes, as they are not necessary to test the correctness of indexes in the Doc Layer. But unique indexes change behavior when duplicates are possible; we should update MongoModel to perform this check, to be consistent with the Doc Layer.
As reported here, the mongo shell shows version 2.4.0. We are compatible with 3.0.0, so it should return that instead.
The issue is that the mongo shell depends on the `buildInfo` command response to find out the version. We updated `isMaster` but not `buildInfo`; we should change it there too.
Note: not directly related to this, but with `buildInfo` we don't return anything about the git hash or branch, as we are not really MongoDB. So far that didn't seem to cause any issues, or we haven't noticed yet. We do have a custom command `getdoclayerversion` that gives more information on Document Layer versions.
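This is easy to check from a driver; pymongo's `server_info()` issues the `buildInfo` command under the hood (endpoint assumed):

from pymongo import MongoClient

client = MongoClient('localhost', 27018)   # a local Document Layer
print(client.server_info()['version'])     # shows 2.4.0 today; should be 3.0.0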
Wire protocol 4 adds a message that `mongos` doesn't implement either. The client never sends this message, so we don't have to implement it.
We have functional tests for deletes, and some basic deletes are done as part of deterministic correctness. We should add delete tests with randomized queries there.
The Document Layer stores a field under a single key in FDB, so fields are limited by the FDB value size limit (100KB). By splitting a field across multiple FDB keys we can support bigger fields.
`DataValue` defines the type of a value. Besides all the normal types, we also have arrays and objects. We could add another type, e.g. SPLIT_VALUE, and piggyback on the array code. We would have to change the code in inserts and updates, which assumes values are always stored under a single key. Path expansion should not treat the split-key component as part of the key, so the array expansion rules have to be carefully adjusted to work with this.
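A rough sketch of a split layout with the FDB Python bindings (the key encoding and chunk size here are assumptions, not the Document Layer's actual DataValue encoding); the chunk ordinal becomes one more tuple element under the field's key, much like an array element index:

import fdb
fdb.api_version(510)
import fdb.tuple

CHUNK = 90000   # stay safely below FDB's 100KB value limit

@fdb.transactional
def write_split_value(tr, field_key, value):
    # Drop any previous chunks, then write the value as numbered chunks.
    tr.clear_range_startswith(field_key)
    for i in range(0, len(value), CHUNK):
        tr[field_key + fdb.tuple.pack((i // CHUNK,))] = value[i:i + CHUNK]

@fdb.transactional
def read_split_value(tr, field_key):
    return ''.join(kv.value for kv in tr.get_range_startswith(field_key))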
Correctness implements unit tests with a custom framework. There would be a lot of benefits to using a standard testing framework instead - for example, for managing the `fdbserver` and `fdbdoc` processes.