ucbrise / fluent-old
Bloom + C++
When creating a driver, you have to specify the names and types of each collection. Then, you call Tick with a function that takes arguments of the same types. Can we avoid this redundancy?
Low priority, but some thoughts:
I think the less painful way might make more sense: both can include vendored projects that use CMake (not an issue, because we typically don't change those), and both can generate CMake files for those who don't want to deal with a more complex build system.
To support unordered statements, we should do fixpoint computation. This will:
Upon receipt of a new tuple, figure out the rules to fire (in order if we're preserving rule order).
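As a rough illustration of what fixpoint computation means here, the following is a minimal sketch of naive evaluation: every rule is re-applied until a full pass derives no new tuple, so the result does not depend on the order in which the rules are written. The names Edge, Relation, Rule, Fixpoint, and TransitiveStep are hypothetical and are not part of Fluent.

```cpp
#include <functional>
#include <set>
#include <utility>
#include <vector>

// Hypothetical types for illustration only; Fluent's collections differ.
using Edge = std::pair<int, int>;
using Relation = std::set<Edge>;
using Rule = std::function<Relation(const Relation&)>;

// Naive fixpoint: apply every rule repeatedly until a full pass over all
// rules adds no new tuple to the relation.
Relation Fixpoint(Relation facts, const std::vector<Rule>& rules) {
  bool changed = true;
  while (changed) {
    changed = false;
    for (const Rule& rule : rules) {
      // rule(facts) returns a separate set, so inserting into facts while
      // iterating over the result is safe.
      for (const Edge& e : rule(facts)) {
        changed |= facts.insert(e).second;
      }
    }
  }
  return facts;
}

// Example rule: path(a, c) :- path(a, b), path(b, c).
Relation TransitiveStep(const Relation& r) {
  Relation out;
  for (const Edge& x : r) {
    for (const Edge& y : r) {
      if (x.second == y.first) out.insert({x.first, y.second});
    }
  }
  return out;
}
```

Running Fixpoint over the chain 1→2, 2→3, 3→4 with TransitiveStep as the only rule yields the full transitive closure. A real implementation would more likely use semi-naive evaluation, so that only newly derived tuples are re-joined on each pass.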
Make sure that classes are not copyable if they should not be copied. We could use a DISALLOW_COPY macro or something similar that I've seen in some code elsewhere.
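One possible shape for such a macro, as a sketch: it deletes the copy constructor and copy assignment operator of the class it is used in. The macro body and the Connection class are assumptions for illustration, not existing Fluent code.

```cpp
#include <type_traits>

// Hypothetical macro: deletes the copy constructor and copy assignment
// operator of TypeName. Use it inside the class body.
#define DISALLOW_COPY(TypeName)       \
  TypeName(const TypeName&) = delete; \
  TypeName& operator=(const TypeName&) = delete

// Hypothetical class that owns a resource and must not be copied.
class Connection {
 public:
  Connection() = default;
  DISALLOW_COPY(Connection);

  // Moves stay available so the object can still be handed off.
  Connection(Connection&&) = default;
  Connection& operator=(Connection&&) = default;
};
```

Any attempt to copy a Connection is then rejected at compile time, while moving it still works.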
Figure out how to serialize things quickly and ergonomically. Ideally, we can support things like ProtoBuf and Arrow.
To support this, agg functions should have a signature vector<std::size_t>.
@cw75, if you could set up all this stuff, that would be great. The project is still young, so it's not incredibly urgent, but I suppose the sooner the better.
zmq_utils/zmq_utils.h has a bunch of global functions. I think these functions should maybe be put into a namespace.
I'll add this to the C++ project template too.
It can be a collection that is initially not empty but then is forever empty afterwards.
Right now, we rely on global ToString and FromString functions. We should change that to be more like the Hash and ToSql templates we use.
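A sketch of what the template-based version could look like, assuming Hash and ToSql are class templates specialized per type (the source does not show their definitions). The specializations and the RoundTrip helper below are hypothetical.

```cpp
#include <string>

// Primary templates are declared but not defined, so asking for a
// conversion on an unsupported type is a compile-time error.
template <typename T>
struct ToString;

template <typename T>
struct FromString;

template <>
struct ToString<int> {
  std::string operator()(int x) const { return std::to_string(x); }
};

template <>
struct FromString<int> {
  int operator()(const std::string& s) const { return std::stoi(s); }
};

template <>
struct ToString<std::string> {
  std::string operator()(const std::string& s) const { return s; }
};

template <>
struct FromString<std::string> {
  std::string operator()(const std::string& s) const { return s; }
};

// Hypothetical helper: generic code names the type once and gets the
// matching pair of conversions.
template <typename T>
T RoundTrip(const T& x) {
  return FromString<T>()(ToString<T>()(x));
}
```

With this layout, supporting a new column type means adding one pair of specializations next to that type instead of adding another overload to a global set.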
Right now, ZMQ sockets are REQ/REP, which means a client and server have to send messages back and forth in lockstep. Figure out which socket type allows sending any number of messages back and forth.
Status and StatusOr are incredibly useful classes used within Google, but there is not a standard public version. Projects like protobuf, tensorflow, etc. implement public facing ones. Lots of open source projects seem to implement their own versions too. See https://github.com/search?p=1&q=filename:statusor.h+StatusOr+language:C%2B%2B for some examples.
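For readers who have not seen these classes, here is a deliberately small sketch of the idea: a function returns either a value or an error status, and the caller checks ok() before reading the value. This is not the protobuf or tensorflow implementation; it assumes T is default-constructible, and ParsePort is a hypothetical example function.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Minimal status: either OK or an error carrying a message.
class Status {
 public:
  Status() = default;  // OK status.
  explicit Status(std::string message)
      : ok_(false), message_(std::move(message)) {}

  bool ok() const { return ok_; }
  const std::string& message() const { return message_; }

 private:
  bool ok_ = true;
  std::string message_;
};

// Holds either a value of type T or a non-OK Status.
// Simplification: T must be default-constructible.
template <typename T>
class StatusOr {
 public:
  StatusOr(Status status) : status_(std::move(status)) {
    assert(!status_.ok());  // An OK status must come with a value.
  }
  StatusOr(T value) : value_(std::move(value)) {}

  bool ok() const { return status_.ok(); }
  const Status& status() const { return status_; }
  const T& value() const {
    assert(ok());  // Reading the value of an error is a bug.
    return value_;
  }

 private:
  Status status_;
  T value_{};
};

// Hypothetical example: parse a TCP port without throwing exceptions.
StatusOr<int> ParsePort(const std::string& text) {
  if (text.empty() || text.size() > 5) {
    return Status("port must have 1 to 5 digits");
  }
  int port = 0;
  for (char c : text) {
    if (c < '0' || c > '9') return Status("port must be numeric");
    port = port * 10 + (c - '0');
  }
  if (port > 65535) return Status("port out of range");
  return port;
}
```

A caller would write something like `auto port = ParsePort(arg); if (!port.ok()) { /* report port.status().message() */ }` and only then read `port.value()`.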
Happy to submit PR (e.g., install_dependencies_arm.sh), but wasn't sure if you wanted to mix into this library.
libssl-dev version explicitly as libssl-dev/xenial
sudo apt install postgresql-9.5 postgresql-server-dev-9.5
./build/Debug/src/libpqxx_project/config/config.guess with the config.guess found in the automake installed on the system (for me /usr/share/automake-1.15/config.guess). Otherwise, automake complains that it can't recognize the build type.
Not totally relevant to this issue, but benchmarks on PX 2:
nvidia@nvidia:~/fluent/build/Debug$ ctest -L BENCHMARK
Test project /home/nvidia/fluent/build/Debug
Start 48: ra_physical_cross_bench
1/8 Test #48: ra_physical_cross_bench .......... Passed 1.96 sec
Start 49: ra_physical_filter_bench
2/8 Test #49: ra_physical_filter_bench ......... Passed 1.67 sec
Start 50: ra_physical_flat_map_bench
3/8 Test #50: ra_physical_flat_map_bench ....... Passed 1.59 sec
Start 51: ra_physical_group_by_bench
4/8 Test #51: ra_physical_group_by_bench ....... Passed 2.05 sec
Start 52: ra_physical_hash_join_bench
5/8 Test #52: ra_physical_hash_join_bench ...... Passed 2.52 sec
Start 53: ra_physical_iterable_bench
6/8 Test #53: ra_physical_iterable_bench ....... Passed 2.05 sec
Start 54: ra_physical_map_bench
7/8 Test #54: ra_physical_map_bench ............ Passed 1.53 sec
Start 55: ra_physical_project_bench
8/8 Test #55: ra_physical_project_bench ........ Passed 1.70 sec
100% tests passed, 0 tests failed out of 8
Label Time Summary:
BENCHMARK = 15.07 sec (8 tests)
Total Test time (real) = 15.08 sec
When a Fluent node writes into a channel, it expects the tuple to be sent to another Fluent node and placed in its channel. In order to know which channel the tuple should be placed in, each channel has to have a unique name. If all Fluent programs are running the same program, this can be checked at runtime when the program starts. If different Fluent programs are all communicating, then enforcing global uniqueness is probably impossible, but we can do something to make it harder to mess up.
See more discussion in #3 from @jhellerstein :
This may be something we metaprogram in Fluent. That is, we write a Fluent program deployer that deploys Fluent programs. The deployer checks invariants like unique naming, etc.
Alternatively, deployer could assign namespaces (name prefixes) for each program to be deployed, and install shim operators on the program's output and input that prepend and strip (respectively) the namespace to the Collections.
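The prepend-and-strip idea can be sketched with two small functions: one that qualifies a collection name on the way out and one that removes the prefix on the way in. The "/" separator and the names AddNamespace and StripNamespace are assumptions for illustration; the discussion above does not fix a format.

```cpp
#include <string>

// Output shim: qualify a collection name with the program's namespace.
// The "/" separator is an assumption, not something Fluent defines.
std::string AddNamespace(const std::string& ns,
                         const std::string& collection) {
  return ns + "/" + collection;
}

// Input shim: strip the namespace from a qualified name. Returns false and
// leaves *collection untouched if the name belongs to another namespace.
bool StripNamespace(const std::string& ns, const std::string& qualified,
                    std::string* collection) {
  const std::string prefix = ns + "/";
  if (qualified.compare(0, prefix.size(), prefix) != 0) return false;
  *collection = qualified.substr(prefix.size());
  return true;
}
```

Under this scheme, two programs that both declare a channel with the same name no longer collide once the deployer gives them different prefixes, and a tuple addressed to the wrong program is rejected by the input shim instead of landing in a same-named channel.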
Pending #68
remove zmq dependency from RA
Pending #57
Due to the set -e, the tmux new-window command fails trying to allocate a window that is already indexed to 0.
$ ./src/examples/kvs/launch_kvs.sh
+ main
+ [[ -z dummy ]]
++ tmux display-message -p '#S'
+ session=0
+ tmux new-window -t 0 -n kvs
create window failed: index in use: 0
Adding -t to the display-message command solves this:
session="$(tmux display-message -t -p '#S')"
Should be fixable according to https://stackoverflow.com/questions/14061605/override-option-in-cmake-subproject, but no dice on the first try.
@jhellerstein from #3:
Eventually we may want a way to control a fluent program from the outside. Useful for testing. So we may want some kind of check for external interrupts between Tick and Receive. Or perhaps we can have an interrupt handler as the first channel in every fluent program, so it can preempt execution of other channels.
@jhellerstein from #3:
We should benchmark the overhead of this lookup for dispatching messages. I'm concerned we may need to avoid it to achieve line speeds.
This includes:
@jhellerstein, feel free to make the repo public whenever you want. I don't remember why I made it private to start.
It would be cool to use Boost's Concept Check Library in the code.
Pick a library to perform logging and assertion checking. For example, Google has some nice libraries to do stuff like LOG(INFO) << "this is a logged message" and also ASSERT_NOTNULL(p). Stuff like that makes the code easier to read and easier to debug. A StatusOr type would also be useful.
When I figure this out, I can add it to the C++ project template.
#3 implements the collections and communication from Bloom. We also have to figure out how to get all the advantages of declarative programming.
Building Fluent tests can take a long time because of range-v3. Compiling an empty program that includes range-v3 takes about 4 seconds. There might be a way to tell CMake to build all of the tests at once, so that the overhead of including range-v3 is only incurred once rather than once for every test.
ubuntu@aria:/vagrant_data/fluent$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 16.04.2 LTS
Release: 16.04
Codename: xenial
ubuntu@aria:/vagrant_data/fluent$ ./scripts/build.sh Debug 4
-- Boost version: 1.58.0
-- [include directories]
-- /vagrant_data/fluent/build/Debug/src/aws_project/aws-cpp-sdk-core/include
-- /vagrant_data/fluent/build/Debug/src/aws_project/aws-cpp-sdk-s3/include
-- /usr/include
-- /vagrant_data/fluent/build/Debug/src/cassandra_project/include
-- /vagrant_data/fluent/build/Debug/src/cereal_project/include
-- /vagrant_data/fluent/build/Debug/src/fmt_project
-- /vagrant_data/fluent/build/Debug/src/googlebenchmark_project/include
-- /vagrant_data/fluent/build/Debug/src/googlelog_project/src
-- /vagrant_data/fluent/build/Debug/src/googlelog_project-build
-- /vagrant_data/fluent/build/Debug/src/googletest_project/googletest/include
-- /vagrant_data/fluent/build/Debug/src/googletest_project/googlemock/include
-- /vagrant_data/fluent/build/Debug/src/grpc_project/third_party/protobuf/src
-- /vagrant_data/fluent/build/Debug/src/grpc_project/third_party/protobuf/src
-- /vagrant_data/fluent/build/Debug/src/grpc_project/include
-- /vagrant_data/fluent/build/Debug/src/libpqxx_project/include
-- /vagrant_data/fluent/build/Debug/src/range-v3_project/include
-- /vagrant_data/fluent/build/Debug/src/redox_project/include
-- /vagrant_data/fluent/build/Debug/src/zeromq_project/include
-- /vagrant_data/fluent/build/Debug/src/zeromqcpp_project
-- /vagrant_data/fluent/src/.
-- /vagrant_data/fluent/build/Debug
-- [link directories]
-- /vagrant_data/fluent/build/Debug/src/aws_project-build/aws-cpp-sdk-core
-- /vagrant_data/fluent/build/Debug/src/aws_project-build/aws-cpp-sdk-s3
-- /vagrant_data/fluent/build/Debug/src/cassandra_project-build
-- /vagrant_data/fluent/build/Debug/src/fmt_project-build/fmt
-- /vagrant_data/fluent/build/Debug/src/googlebenchmark_project-build/src
-- /vagrant_data/fluent/build/Debug/src/googlelog_project-build
-- /vagrant_data/fluent/build/Debug/src/googletest_project-build/googlemock/gtest
-- /vagrant_data/fluent/build/Debug/src/googletest_project-build/googlemock
-- /vagrant_data/fluent/build/Debug/src/grpc_project/third_party/protobuf/src/.libs
-- /vagrant_data/fluent/build/Debug/src/grpc_project/libs/opt
-- /vagrant_data/fluent/build/Debug/src/libpqxx_project/src/.libs
-- /vagrant_data/fluent/build/Debug/src/redox_project-build
-- /vagrant_data/fluent/build/Debug/src/zeromq_project/.libs
-- Configuring done
-- Generating done
-- Build files have been written to: /vagrant_data/fluent/build/Debug
[ 2%] Built target zeromq_project
[ 4%] Built target zeromqcpp_project
[ 4%] Performing update step for 'redox_project'
[ 6%] Built target range-v3_project
[ 7%] Performing update step for 'googletest_project'
[ 8%] Built target grpc_project
[ 8%] Performing update step for 'cassandra_project'
[ 10%] Built target libpqxx_project
[ 10%] Performing configure step for 'googletest_project'
[ 12%] Built target cereal_project
[ 13%] Performing configure step for 'cassandra_project'
[ 14%] Performing update step for 'googlebenchmark_project'
-- Project version: 2.7.0
-- Using std::atomic implementation for atomic operations
-- Configuring done
[ 14%] Performing configure step for 'googlebenchmark_project'
-- Generating done
-- Build files have been written to: /vagrant_data/fluent/build/Debug/src/googletest_project-build
[ 15%] Performing build step for 'googletest_project'
-- git Version: v0.0.0
-- Version: 0.0.0
-- Performing Test HAVE_STD_REGEX
[ 36%] Built target gmock_main
Current branch master is up to date.
[ 15%] Performing configure step for 'redox_project'
[ 63%] Built target gmock
Building for x86_64
-- Configuring done
[ 81%] Built target gtest
-- Generating done
-- Build files have been written to: /vagrant_data/fluent/build/Debug/src/redox_project-build
[ 16%] Performing build step for 'redox_project'
[100%] Built target gtest_main
[ 16%] No install step for 'googletest_project'
[ 50%] Built target redox_static
[ 16%] Completed 'googletest_project'
[ 17%] Built target googletest_project
[100%] Built target redox
[ 17%] Performing update step for 'aws_project'
-- Using hash header and namespace "std"
-- Configuring done
[ 17%] No install step for 'redox_project'
[ 17%] Performing configure step for 'aws_project'
[ 18%] Completed 'redox_project'
-- TARGET_ARCH not specified; inferring host OS to be platform compilation target
-- Building AWS libraries as shared objects
-- Generating linux build config
[ 19%] Built target redox_project
-- Building project version: 1.1.7
[ 19%] Performing update step for 'fmt_project'
-- Zlib include directory: /usr/include
-- Zlib library: /usr/lib/x86_64-linux-gnu/libz.so
-- Encryption: Openssl
-- Openssl include directory: /usr/include
-- Openssl library: /usr/lib/x86_64-linux-gnu/libssl.so;/usr/lib/x86_64-linux-gnu/libcrypto.so
-- Http client: Curl
-- Curl include directory: /usr/include/x86_64-linux-gnu
-- Curl library: /usr/lib/x86_64-linux-gnu/libcurl.so
-- Considering s3
[ 19%] Performing configure step for 'fmt_project'
-- CMake version: 3.5.1
-- Build type: Release
-- Target 'doc' disabled (requires doxygen)
-- Configuring done
-- Generating done
-- Generating done
-- Build files have been written to: /vagrant_data/fluent/build/Debug/src/cassandra_project-build
[ 19%] Performing build step for 'cassandra_project'
-- Build files have been written to: /vagrant_data/fluent/build/Debug/src/fmt_project-build
[ 19%] Performing build step for 'fmt_project'
[ 2%] Linking CXX shared library libfmt.so
[ 10%] Built target fmt
[ 15%] Built target gmock
[ 20%] Built target noexception-test
-- Updating version info to 1.1.7
-- Performing Test HAVE_STD_REGEX -- success
-- Performing Test HAVE_GNU_POSIX_REGEX
-- Custom memory management enabled; stl objects now using custom allocators
[ 30%] Built target test-main
[ 45%] Built target posix-mock-test
-- Performing Test HAVE_GNU_POSIX_REGEX -- failed to compile
-- Performing Test HAVE_POSIX_REGEX
[ 47%] Linking CXX executable ../bin/posix-test
[ 50%] Built target posix-test
[ 52%] Linking CXX executable ../bin/ostream-test
-- Performing Test HAVE_POSIX_REGEX -- success
-- Performing Test HAVE_STEADY_CLOCK
-- Configuring done
[ 55%] Built target ostream-test
[ 57%] Linking CXX executable ../bin/printf-test
-- Performing Test HAVE_STEADY_CLOCK -- success
-- Configuring done
-- Generating done
-- Build files have been written to: /vagrant_data/fluent/build/Debug/src/googlebenchmark_project-build
[ 19%] Performing build step for 'googlebenchmark_project'
[ 60%] Built target printf-test
[ 62%] Linking CXX executable ../bin/util-test
[ 31%] Built target benchmark
[ 36%] Built target output_test_helper
[ 40%] Built target benchmark_test
-- Generating done
-- Build files have been written to: /vagrant_data/fluent/build/Debug/src/aws_project-build
[ 45%] Built target skip_with_error_test
[ 20%] Performing build step for 'aws_project'
[ 50%] Built target multiple_ranges_test
[ 54%] Built target register_benchmark_test
[ 65%] Built target util-test
[ 59%] Built target donotoptimize_test
[ 67%] Linking CXX executable ../bin/assert-test
[ 63%] Built target map_test
[ 68%] Built target fixture_test
[ 72%] Built target reporter_output_test
[ 77%] Built target complexity_test
[ 70%] Built target assert-test
[ 81%] Built target diagnostics_test
[ 72%] Linking CXX executable ../bin/macro-test
[ 1%] Building CXX object CMakeFiles/cpp-driver.dir/src/token_aware_policy.cpp.o
[ 86%] Built target cxx03_test
[ 90%] Built target filter_test
[ 95%] Built target basic_test
[100%] Built target options_test
[ 75%] Built target macro-test
[ 85%] Built target header-only-test
[ 21%] No install step for 'googlebenchmark_project'
[ 21%] Completed 'googlebenchmark_project'
[ 87%] Linking CXX executable ../bin/format-impl-test
[ 21%] Built target googlebenchmark_project
[ 22%] Performing update step for 'googlelog_project'
[ 22%] Performing configure step for 'googlelog_project'
[ 90%] Built target format-impl-test
[ 92%] Linking CXX executable ../bin/format-test
CMake Warning at CMakeLists.txt:52 (find_package):
By not providing "Findgflags.cmake" in CMAKE_MODULE_PATH this project has
asked CMake to find a package configuration file provided by "gflags", but
CMake did not find one.
Could not find a package configuration file provided by "gflags" with any
of the following names:
gflagsConfig.cmake
gflags-config.cmake
Add the installation prefix of "gflags" to CMAKE_PREFIX_PATH or set
"gflags_DIR" to a directory containing one of the above files. If "gflags"
provides a separate development package or SDK, be sure it has been
installed.
[ 95%] Built target format-test
[ 97%] Linking CXX executable ../bin/gtest-extra-test
[100%] Built target gtest-extra-test
[ 21%] Built target aws-cpp-sdk-core
[ 22%] No install step for 'fmt_project'
[ 22%] Completed 'fmt_project'
[ 24%] Built target fmt_project
[ 25%] Built target examples_distributed_kvs_proto
Scanning dependencies of target examples_file_system
[ 25%] Building CXX object examples/file_system/CMakeFiles/examples_file_system.dir/string_store.cc.o
[ 2%] Building CXX object CMakeFiles/cpp-driver.dir/src/latency_aware_policy.cpp.o
[ 3%] Building CXX object CMakeFiles/cpp-driver.dir/src/murmur3.cpp.o
[ 5%] Building CXX object CMakeFiles/cpp-driver.dir/src/whitelist_policy.cpp.o
[ 26%] Linking CXX static library libexamples_file_system.a
[ 26%] Built target examples_file_system
[ 26%] Generating api.grpc.pb.cc, api.grpc.pb.h
[ 26%] Generating api.pb.cc, api.pb.h
Scanning dependencies of target examples_grcp_proto
[ 6%] Building CXX object CMakeFiles/cpp-driver.dir/src/user_type_value.cpp.o
-- Configuring done
[ 26%] Building CXX object examples/grpc/CMakeFiles/examples_grcp_proto.dir/api.pb.cc.o
-- Generating done
-- Build files have been written to: /vagrant_data/fluent/build/Debug/src/googlelog_project-build
[ 27%] Performing build step for 'googlelog_project'
[ 4%] Linking CXX shared library libglog.so
[ 36%] Built target glog
[ 40%] Linking CXX executable signalhandler_unittest
[ 7%] Building CXX object CMakeFiles/cpp-driver.dir/src/address.cpp.o
[ 45%] Built target signalhandler_unittest
[ 50%] Linking CXX executable stl_logging_unittest
[ 54%] Built target stl_logging_unittest
[ 59%] Linking CXX executable stacktrace_unittest
[ 63%] Built target stacktrace_unittest
[ 68%] Linking CXX executable demangle_unittest
[ 9%] Building CXX object CMakeFiles/cpp-driver.dir/src/cluster.cpp.o
[ 72%] Built target demangle_unittest
[ 77%] Linking CXX executable logging_unittest
[ 81%] Built target logging_unittest
[ 86%] Linking CXX executable symbolize_unittest
[ 90%] Built target symbolize_unittest
[ 95%] Linking CXX executable utilities_unittest
[100%] Built target utilities_unittest
[ 27%] No install step for 'googlelog_project'
[ 28%] Completed 'googlelog_project'
[ 28%] Built target googlelog_project
Scanning dependencies of target testing
[ 29%] Building CXX object testing/CMakeFiles/testing.dir/mock_clock.cc.o
[ 22%] Building CXX object aws-cpp-sdk-s3/CMakeFiles/aws-cpp-sdk-s3.dir/source/S3Client.cpp.o
[ 29%] Linking CXX static library libtesting.a
[ 29%] Built target testing
[ 30%] Building CXX object examples/grpc/CMakeFiles/examples_grcp_proto.dir/api.grpc.pb.cc.o
[ 10%] Building CXX object CMakeFiles/cpp-driver.dir/src/round_robin_policy.cpp.o
[ 11%] Building CXX object CMakeFiles/cpp-driver.dir/src/timestamp_generator.cpp.o
[ 13%] Building CXX object CMakeFiles/cpp-driver.dir/src/get_time-unix.cpp.o
[ 14%] Building CXX object CMakeFiles/cpp-driver.dir/src/collection.cpp.o
[ 15%] Building CXX object CMakeFiles/cpp-driver.dir/src/tuple.cpp.o
Scanning dependencies of target ra_logical_to_debug_string_test
[ 17%] Building CXX object CMakeFiles/cpp-driver.dir/src/third_party/hdr_histogram/hdr_histogram.cpp.o
[ 31%] Building CXX object ra/logical/CMakeFiles/ra_logical_to_debug_string_test.dir/to_debug_string_test.cc.o
[ 18%] Building CXX object CMakeFiles/cpp-driver.dir/src/third_party/curl/hostcheck.cpp.o
[ 19%] Building CXX object CMakeFiles/cpp-driver.dir/src/ssl/ssl_openssl_impl.cpp.o
[ 21%] Building CXX object CMakeFiles/cpp-driver.dir/src/ssl/ring_buffer_bio.cpp.o
[ 97%] Built target cpp-driver
Scanning dependencies of target cassandra
[ 98%] Linking CXX shared library libcassandra.so
virtual memory exhausted: Cannot allocate memory
ra/logical/CMakeFiles/ra_logical_to_debug_string_test.dir/build.make:62: recipe for target 'ra/logical/CMakeFiles/ra_logical_to_debug_string_test.dir/to_debug_string_test.cc.o' failed
make[2]: *** [ra/logical/CMakeFiles/ra_logical_to_debug_string_test.dir/to_debug_string_test.cc.o] Error 1
CMakeFiles/Makefile2:4433: recipe for target 'ra/logical/CMakeFiles/ra_logical_to_debug_string_test.dir/all' failed
make[1]: *** [ra/logical/CMakeFiles/ra_logical_to_debug_string_test.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
Scanning dependencies of target cassandra_static
[ 98%] Built target cassandra
[ 22%] Building CXX object aws-cpp-sdk-s3/CMakeFiles/aws-cpp-sdk-s3.dir/source/model/FilterRuleName.cpp.o
[100%] Linking CXX static library libcassandra_static.a
c++: internal compiler error: Killed (program cc1plus)
Please submit a full bug report,
with preprocessed source if appropriate.
See file:///usr/share/doc/gcc-6/README.Bugs for instructions.
aws-cpp-sdk-s3/CMakeFiles/aws-cpp-sdk-s3.dir/build.make:138: recipe for target 'aws-cpp-sdk-s3/CMakeFiles/aws-cpp-sdk-s3.dir/source/S3Client.cpp.o' failed
make[5]: *** [aws-cpp-sdk-s3/CMakeFiles/aws-cpp-sdk-s3.dir/source/S3Client.cpp.o] Error 4
make[5]: *** Waiting for unfinished jobs....
Scanning dependencies of target testing-resources
CMakeFiles/Makefile2:117: recipe for target 'aws-cpp-sdk-s3/CMakeFiles/aws-cpp-sdk-s3.dir/all' failed
make[4]: *** [aws-cpp-sdk-s3/CMakeFiles/aws-cpp-sdk-s3.dir/all] Error 2
make[4]: *** Waiting for unfinished jobs....
[ 22%] Building CXX object testing-resources/CMakeFiles/testing-resources.dir/source/MemoryTesting.cpp.o
[ 23%] Building CXX object testing-resources/CMakeFiles/testing-resources.dir/source/TestingEnvironment.cpp.o
[ 31%] Linking CXX static library libexamples_grcp_proto.a
[ 23%] Building CXX object testing-resources/CMakeFiles/testing-resources.dir/source/external/gtest-all.cc.o
[ 31%] Built target examples_grcp_proto
[ 23%] Building CXX object testing-resources/CMakeFiles/testing-resources.dir/source/platform/linux-shared/PlatformTesting.cpp.o
[100%] Built target cassandra_static
[ 31%] No install step for 'cassandra_project'
[ 31%] Completed 'cassandra_project'
[ 32%] Built target cassandra_project
[ 24%] Linking CXX shared library libtesting-resources.so
[ 24%] Built target testing-resources
Makefile:127: recipe for target 'all' failed
make[3]: *** [all] Error 2
CMakeFiles/aws_project.dir/build.make:110: recipe for target 'src/aws_project-stamp/aws_project-build' failed
make[2]: *** [src/aws_project-stamp/aws_project-build] Error 2
CMakeFiles/Makefile2:437: recipe for target 'CMakeFiles/aws_project.dir/all' failed
make[1]: *** [CMakeFiles/aws_project.dir/all] Error 2
Makefile:94: recipe for target 'all' failed
make: *** [all] Error 2