electrum / ssb-dbgen
Star Schema Benchmark dbgen
Home Page: http://www.cs.umb.edu/~poneil/publist.html
Note: In our research paper we use SSB instead of SSBM.

Version of 2/28/10:
  Cardinality of supplier fixed to follow benchmark spec: now 2000*SF (previously was 10000*SF, in error): line 226, driver.c
  Type of time value changed from long to time_t (now 64 bits on Windows): line 688, build.c

Building in Visual Studio 2008:
  Use a Win32 console project, not using precompiled headers. In Properties>C/C++>CommandLine, additional options:
    /D "SSBM" /D "DBNAME" /D "DB2" (for DB2)

Building using makefile_win (set for DB2 build):
    nmake -f makefile_win
  (Change the DATABASE symbol for other databases.)

SSBM dbgen readme:

SSBM is based on the TPC-H dbgen source. The coding style and architecture follow the TPC-H dbgen. The original TPC-H dbgen code stays untouched, and all new code related to SSBM dbgen follows the "#ifdef SSBM" statements. For the original detailed TPC-H documentation, please refer to the TPCH_README document under the same directory. Here we list just a few things that are specific to SSBM.

1. How is SSBM DBGEN built?

Same idea as the TPC-H dbgen setup, which requires the user to create an appropriate makefile, using makefile.suite as a basis. Make sure to use "SSBM" for the workload variable. Type "make" to compile and to generate the SSBM dbgen executable. Please refer to Porting.Notes for more details and for suggested compile-time options.

Note: If you want to generate the data files to a different directory, you should copy the dbgen executable as well as the dists.dss file to that directory.

2. How to generate SSBM data files?

To generate the dimension tables:
    (customer.tbl)  dbgen -s 1 -T c
    (part.tbl)      dbgen -s 1 -T p
    (supplier.tbl)  dbgen -s 1 -T s
    (date.tbl)      dbgen -s 1 -T d

To generate the fact table:
    (lineorder.tbl) dbgen -s 1 -T l

To generate all SSBM tables:
    dbgen -s 1 -T a

To generate the refresh (insert/delete) data set (creates delete.[1-4] and lineorder.tbl.u[1-4] with refreshing fact 0.05%):
    dbgen -s 1 -r 5 -U 4
where
    "-r 5" specifies refreshing fact n/10000
    "-U 4" specifies 4 segments for deletes and inserts

At this moment there is no QGEN for SSBM, so the command line options related to those features won't apply.

3. What are the changes upon TPC-H dbgen?

Changes made upon the original TPC-H dbgen:
  1. removed snowflake tables such as nation and region (done)
  2. removed the partsupply table (done)
  3. removed the order table (done)
  4. renamed the fact table as Lineorder and added/removed many fields (done)
  5. added the date dimension table (done)
  6. adding and removing fields in dimension tables (done)
  7. have data cross reference for supplycost, revenue in lineorder (done)
  8. apply the refreshing only to lineorder table (done)

The command line options stay the same as in TPC-H dbgen. (The -T options are changed to reflect the different set of tables.)

===================== End of README ========================================
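The two numeric rules stated in the README (supplier cardinality of 2000*SF, and "-r n" meaning n/10000) can be checked with a small sketch. The function names below are hypothetical and are not part of dbgen; they only restate the arithmetic given above.

```c
#include <assert.h>

/* Hypothetical helper, not part of dbgen.
 * Supplier cardinality per the README's 2/28/10 fix: 2000 * SF. */
long ssb_supplier_rows(long scale_factor)
{
    assert(scale_factor > 0);
    return 2000L * scale_factor;
}

/* Hypothetical helper, not part of dbgen.
 * The README describes "-r n" as n/10000; this returns it as a percentage. */
double ssb_refresh_percent(int r_option)
{
    assert(r_option >= 0);
    return (double)r_option / 10000.0 * 100.0;
}
```

Under these rules, a scale factor of 1 gives 2000 supplier rows, and "-r 5" corresponds to roughly 0.05%, matching the figure quoted in the README.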
Please help,
When trying to generate data on a virtual machine, a problem occurs with the fact table (buffer overflow!):
$ dbgen -s 1 -T l
SSBM (Star Schema Benchmark) Population Generator (Version 1.0.0)
Copyright Transaction Processing Performance Council 1994 - 2000
Abort trap: 6
The resulting file appears incomplete:
$ cat lineorder.tbl
1|1|73801|465569|8273|19960102|5-LOW|0|17|2608718|21280402|4|2504369|92072|2|19960212|TRUCK|
1|2|73801|201928|1630|19960102|5-LOW|0|36|6587676|21280402|9|5994785|109794|6|19960228|MAIL|
1|3|73801|191100|709|19960102|5-LOW|0|8|952880|21280402|10|857592|71466|2|19960305|REG AIR|
1|4|73801|6395|9421|19960102|5-LOW|0|28|3643892|21280402|9|3315941|78083|6|19960330|AIR|
1|5|73801|72080|16246|19960102|5-LOW|0|24|2524992|21280402|10|2272492|63124|4|19960314|FOB|
1|6|73801|46904|13672|19960102|5-LOW|0|32|5922880|21280402|7|5508278|111054|2|19960207|MAIL|
2|1|156004|318510|10657|19961201|1-URGENT|0|38|5808300|6098715|0|5808300|91710|5|19970114|RAIL|
3|1|246628|12891|19585|19931014|5-LOW|0|45|8117505|23639715|6|7630454|108233|0|19940104|AIR|
3|2|246628|57107|16666|19931014|5-LOW|0|49|5214090|23639715|10|4692681|63846|0|19931220|RAIL|
3|3|246628|385345|14090|19931014|5-LOW|0|27|3861891|23639715|6|3630177|85819|7|19931122|SHIP|
3|4|246628|88139|7417|19931014|5-LOW|0|2|225426|23639715|1|223171|67627|6|19940107|TRUCK|
3|5|246628|549283|16067|19931014|5-LOW|0|28|3730328|23639715|4|3581114|79935|0|19940110|FOB|
3|6|246628|186428|17654|19931014|5-LOW|0|26|3937492|23639715|10|3543742|90865|2|19931218|RAIL|
4|1|273553|264105|18120|19951011|5-LOW|0|30|3207270|3359935|3|3111051|64145|8|19951214|REG AIR|
5|1|88970|325708|417|19940730|5-LOW|0|15|2600535|15086669|2|2548524|104021|4|19940831|AIR|
5|2|88970|371781|2590|19940730|5-LOW|0|26|4817202|15086669|7|4479997|111166|8|19940925|FOB|
5|3|88970|112591|8331|19940730|5-LOW|0|50|8017950|15086669|8|7376514|96215|3|19941013|AIR|
This is on OSX 10.12.6
While compiling on Debian, I get this error:
driver.c: In function ‘process_options’:
driver.c:892:35: error: ‘pid_t’ undeclared (first use in this function)
pids = malloc(children * sizeof(pid_t));
^
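A possible cause, offered as an assumption rather than a confirmed diagnosis of driver.c: on POSIX systems pid_t is declared in <sys/types.h>, so a translation unit that uses pid_t without that header (directly or indirectly) fails in exactly this way. The sketch below is a standalone illustration with a hypothetical function name; it is not the project's code.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>  /* POSIX: declares pid_t */

/* Hypothetical helper, not taken from driver.c.
 * Allocates a zero-initialized array with one pid_t slot per child process.
 * The caller owns the result and must free() it. */
pid_t *alloc_child_pids(size_t children)
{
    assert(children > 0);
    return calloc(children, sizeof(pid_t));
}
```

If the same include is missing where driver.c uses pid_t, adding it near the other system headers would be the analogous change, but that should be checked against the actual source.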
Hi, is there a qgen for this ssb test?
Hello,
I'm using Mac OS Catalina and was able to compile dbgen with the given makefile. However, I cannot generate the following tables:
Parts Table:
SSBM (Star Schema Benchmark) Population Generator (Version 1.0.0)
Copyright Transaction Processing Performance Council 1994 - 2000
No load routine has been defined for the part table
Segmentation fault: 11
SSBM (Star Schema Benchmark) Population Generator (Version 1.0.0)
Copyright Transaction Processing Performance Council 1994 - 2000
Illegal instruction: 4
Anyone facing similar issues?
I use this generator for ClickHouse, but I found that the date format it produces, YYYYMMDD, cannot be imported into ClickHouse.
How can I generate a specific date format such as YYYY-MM-DD?
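One way to approach this, assuming the generator itself is left unchanged and the dates are rewritten in a post-processing step, is a small conversion routine. The function below is a hypothetical helper, not part of dbgen; it only converts a single 8-digit field and does not validate calendar ranges.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of dbgen.
 * Converts "YYYYMMDD" to "YYYY-MM-DD".
 * Returns 0 on success, -1 if the input is not exactly 8 digits
 * or the output buffer is smaller than 11 bytes. */
int compact_to_iso_date(const char *compact, char *out, size_t out_size)
{
    size_t i;

    if (compact == NULL || out == NULL || out_size < 11)
        return -1;
    if (strlen(compact) != 8)
        return -1;
    for (i = 0; i < 8; i++) {
        if (!isdigit((unsigned char)compact[i]))
            return -1;
    }
    /* Copy year, month, and day with dashes in between. */
    snprintf(out, out_size, "%.4s-%.2s-%.2s", compact, compact + 4, compact + 6);
    return 0;
}
```

Applied to the order date field of the first lineorder row shown earlier, "19960102" would become "1996-01-02". A full converter would also need to split each line on "|" and rewrite only the date columns.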
Stderr: NOTE: Data generation for scale factors > 1000 GB is still in development,
and is not yet supported.
Your resulting data set MAY NOT BE COMPLIANT!
SSBM (Star Schema Benchmark) Population Generator (Version 1.0.0)
Copyright Transaction Processing Performance Council 1994 - 2000