Comments (5)
You can now find links to the datasets in the README.
The longitude and latitude values should be GPS coordinates from randomly selected locations in OSM, but I did not generate those values myself, so I don't know the exact selection procedure.
from alex.
Thanks a lot!
Dear Jialing,
I found that the lognormal dataset and the YCSB dataset cannot be bulk loaded properly. Could you double-check whether these two datasets are the original ones from the paper?
For the lognormal dataset, there are 190M keys.
- If you bulk load the lognormal dataset with fewer than 629,145 keys, everything is fine.
- But if I bulk load it with more than 629,145 keys, ALEX suddenly goes out of control.
To be specific, I have tested bulk loading the lognormal dataset with the following numbers of keys:
- With 600,000 keys, ALEX has 0 model nodes and 1 data node, and the maximum depth is 0.
- With 620,000 keys, ALEX has 0 model nodes and 1 data node, and the maximum depth is 0.
- With 630,000 keys, ALEX has 855 model nodes and 856 data nodes, and the maximum depth is 855.
- With 700,000 keys, ALEX crashes with the error message "Segmentation fault (core dumped)".
For the YCSB dataset, there are 200M keys.
- If you bulk load the YCSB dataset with fewer than 629,145 keys, everything is fine.
- But if I bulk load it with more than 629,145 keys, ALEX suddenly goes out of control.
To be specific, I have tested bulk loading the YCSB dataset with the following numbers of keys:
- With 600,000 keys, ALEX has 0 model nodes and 1 data node, and the maximum depth is 0.
- With 620,000 keys, ALEX has 0 model nodes and 1 data node, and the maximum depth is 0.
- With 630,000 keys, ALEX has 855 model nodes and 856 data nodes, and the maximum depth is 855.
- With 700,000 keys, ALEX crashes with the error message "Segmentation fault (core dumped)".
I then debugged the code to see what happened. The problem is at line 731 of alex.h: the if condition checks whether num_keys <= derived_params_.max_data_node_slots * data_node_type::kMinDensity_.
derived_params_.max_data_node_slots is 1,048,576 and data_node_type::kMinDensity_ is 0.6, so bulk loading fewer than 1,048,576 * 0.6 = 629,145.6 keys is fine, but when lognormal or YCSB has more keys than that, ALEX cannot handle it.
The weird thing is that when I tested the same numbers of keys on the longitudes and longlat datasets, everything was fine.
- bulk load the longitudes dataset with 630,000 keys, ALEX has 8 model nodes and 8823 data nodes, and the maximum depth is 2.
- bulk load the longlat dataset with 630,000 keys, ALEX has 791 model nodes and 22404 data nodes, and the maximum depth is 3.
I thus wonder whether the lognormal and YCSB datasets are correct, or whether I should set some parameters specifically for these two datasets. Thanks.
I can't reproduce these errors. Can you try running the benchmark executable, as described in the README? For example, to bulk load 700K keys from YCSB, change line 16 of src/benchmark/main.cpp to
#define KEY_TYPE uint64_t
and then run this command:
./build/benchmark \
--keys_file=[path to location of YCSB dataset, might need to be an absolute path] \
--keys_file_type=binary \
--init_num_keys=700000 \
--total_num_keys=1000000 \
--batch_size=100000 \
--insert_frac=0.5 \
--lookup_distribution=zipf \
--print_batch_stats
Thank you so much, the problem is solved now. I used int64_t before, which is why it did not work. After changing int64_t to uint64_t, everything is fine. Thank you.