cwida / duckdb-pgq


This project forked from duckdb/duckdb


DuckDB is an in-process SQL OLAP Database Management System

Home Page: http://www.duckdb.org

License: MIT License

Shell 0.09% C++ 86.95% Python 4.69% C 5.14% Java 1.14% R 0.01% Julia 0.71% Makefile 0.05% CMake 0.45% Swift 0.78%

duckdb-pgq's Issues

Ensure PGQ keywords can be used outside of match statement

Keywords such as GROUPS and PATH currently cannot be used outside of PGQ queries, but this should be possible.

Sync to the latest DuckDB version

Unaliased pattern leads to segfault

Loading the snb.duckdb dataset and then running the following queries leads to a segfault:

CREATE PROPERTY GRAPH pg 
VERTEX TABLES(
 Person     PROPERTIES(id,firstName) LABEL Person,
 Tag        PROPERTIES(id,name)      LABEL Tag,
 University PROPERTIES(id,name)      LABEL University)
EDGE TABLES(
 Person_knows_Person  
  SOURCE KEY(person1id) REFERENCES Person(id) 
  DESTINATION KEY(person2id) REFERENCES Person(id)
  PROPERTIES(creationDate,person1id,person2id)
  LABEL know, 
 Person_hasInterest_Tag 
  SOURCE KEY(personid) REFERENCES Person(id)
  DESTINATION KEY(tagid) REFERENCES Tag(id)
  PROPERTIES(personid,tagid) 
  LABEL hasInterest,
 Person_studyAt_University 
  SOURCE KEY(personid) REFERENCES Person(id)
  DESTINATION KEY(universityid) REFERENCES University(id)
  PROPERTIES(personid,universityid,classyear)
  LABEL studyAt)

SELECT study.university_name, study.id FROM GRAPH_TABLE (pg, 
  MATCH (:Person WHERE a.name='Bob')-[s:studyAt]->
        (u:University)
  COLUMNS (u.name as university_name, a.id as pid)) study

Remove comma between property graph name and match

Apparently not necessary to have a comma according to the official spec ¯_(ツ)_/¯

Improve Equals function

bool IterativeLengthFunctionData::Equals(const FunctionData &other_p) const {
	// TODO: Change this to check if both are on same CSR
	return true;
}
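
One way the TODO above could be addressed is sketched below. This is a self-contained illustration, not the extension's actual code: the FunctionData base class is a simplified stand-in, and the csr_id member and the IterativeLengthData name are assumptions made for the example. The idea is that two bind data objects compare equal only when they have the same type and refer to the same CSR.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for DuckDB's FunctionData base class (assumption).
struct FunctionData {
    virtual ~FunctionData() = default;
    virtual bool Equals(const FunctionData &other_p) const = 0;
};

// Hypothetical bind data that remembers which CSR it operates on.
struct IterativeLengthData : FunctionData {
    explicit IterativeLengthData(int32_t csr_id_p) : csr_id(csr_id_p) {
    }

    int32_t csr_id;

    bool Equals(const FunctionData &other_p) const override {
        // Equal only if the other object has the same type and uses the same CSR.
        auto other = dynamic_cast<const IterativeLengthData *>(&other_p);
        return other != nullptr && other->csr_id == csr_id;
    }
};
```

Comparing an identifier rather than the CSR contents keeps the check cheap, at the cost of assuming that CSR ids are unique for the lifetime of the query.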

Investigate old bidirectional implementation being very slow

The old bidirectional implementation is very slow on the graph500-22 dataset.

Implement parser

Can be done by using hooks (figure out how)

SIGKILL new bidirectional implementation

What happens?

The bidirectional-new implementation is killed with SIGKILL when run with 16 threads on the Graphalytics graph500-22 dataset.

Inheritance will work as follows:
A vertex table can be defined as:
Organisation LABEL Organisation_kind IN (Company, University)
This means that there is a BIGINT column Organisation_kind in which every row that is also part of the table Company has the value 1 (0001), and every row that is also a University has the value 2 (0010).
If the user then checks for an inherited label, we need to add a clause to the WHERE that is:
WHERE organisation_kind & <index of that label>

When this evaluates to true, the row is part of that label's table. An element can only be part of one other table.
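
The bitmask check described above can be sketched as follows. This is a minimal illustration under the assumption that the i-th label in the IN (...) list owns bit i, which is what the example values 0001 and 0010 suggest; the "index" in the WHERE clause is therefore treated as a one-bit mask. The names LabelMask and HasLabel are hypothetical and not part of duckdb-pgq.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Returns the one-bit mask for a sub-label, assuming the i-th label in the
// IN (...) list owns bit i (Company -> 0001, University -> 0010).
// Returns 0 when the label is not part of the list.
int64_t LabelMask(const std::vector<std::string> &sub_labels, const std::string &label) {
    // A BIGINT has 63 usable non-sign bits, so at most 63 sub-labels fit.
    assert(sub_labels.size() <= 63);
    for (std::size_t i = 0; i < sub_labels.size(); i++) {
        if (sub_labels[i] == label) {
            return int64_t(1) << i;
        }
    }
    return 0;
}

// Mirrors the predicate "organisation_kind & <mask>" from the description.
bool HasLabel(int64_t organisation_kind, int64_t mask) {
    return (organisation_kind & mask) != 0;
}
```

With the list (Company, University), a row whose Organisation_kind is 2 matches the University mask and does not match the Company mask.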

Create basic support for the match statement

Parser create property graph multiple properties

What happens?

On the following query:

CREATE PROPERTY GRAPH pg
VERTEX TABLES (
    Student PROPERTIES ( id, name ) LABEL Person
    )
EDGE TABLES (
    know    SOURCE KEY ( src ) REFERENCES Student ( id )
            DESTINATION KEY ( dst ) REFERENCES Student ( id )
            PROPERTIES ( createDate ) LABEL Knows
    )

The following error is thrown:

Parser Error: syntax error at or near "name"
LINE 3:     Student PROPERTIES ( id, name ) LABEL Person

To Reproduce

CREATE TABLE Student(id BIGINT, name VARCHAR);

CREATE TABLE know(src BIGINT, dst BIGINT, createDate BIGINT);

INSERT INTO Student VALUES (0, 'Daniel'), (1, 'Tavneet'), (2, 'Gabor'), (3, 'Peter');

INSERT INTO know VALUES (0,1, 10), (0,2, 11), (0,3, 12), (1,2, 14), (1,3, 15), (2,3, 16);

CREATE PROPERTY GRAPH pg
VERTEX TABLES (
Student PROPERTIES ( id, name ) LABEL Person
)
EDGE TABLES (
know SOURCE KEY ( src ) REFERENCES Student ( id )
DESTINATION KEY ( dst ) REFERENCES Student ( id )
PROPERTIES ( createDate ) LABEL Knows
)

OS:

macOS 13 - Apple M1 Pro

DuckDB Version:

6.0.1

DuckDB Client:

CLI

Full Name:

Daniel ten Wolde

Affiliation:

Centrum Wiskunde & Informatica

Have you tried this on the latest `master` branch?

I agree

Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?

I agree

Multithreaded iterativelength getting a segfault

See branch debug_generating_path_segfault
Running the test debug_generating_path.test without limiting the number of threads results in a segmentation fault, seemingly because started_searches goes out of range. With the number of threads set to 1, the segfault does not occur.

Change SQL/PGQ properties in client data to extension client data

Make labels case insensitive

Labels are not yet case-insensitive, though they should be, to avoid duplicate entries.
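
A common way to get case-insensitive labels is to normalize each label before storing or looking it up. The sketch below assumes ASCII labels and uses a plain set as the registry; NormalizeLabel and RegisterLabel are hypothetical names chosen for illustration, not functions from the extension.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <unordered_set>

// Lower-cases ASCII characters so that "Person" and "PERSON" map to the same key.
std::string NormalizeLabel(std::string label) {
    std::transform(label.begin(), label.end(), label.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return label;
}

// Inserts the normalized label; returns false when the label
// (ignoring case) was already registered.
bool RegisterLabel(std::unordered_set<std::string> &labels, const std::string &label) {
    return labels.insert(NormalizeLabel(label)).second;
}
```

Registering "Person" and then "PERSON" yields a single entry, so duplicates that differ only in case are rejected.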

Add edge ID for CSR

Add error handling when CSR has not been created

Add a check so that path length functions are not run when the CSR has not been initialized.
(This currently causes a segfault, see test_path_length.test.)

This was forgotten when migrating to the updated DuckDB repo.
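
The missing check could look roughly like the sketch below: look up the CSR and raise a descriptive error instead of dereferencing something that does not exist. The CSR struct, the map of CSRs keyed by id, and the GetCSROrThrow helper are all simplified assumptions for illustration; the real extension stores its CSRs differently and would throw a DuckDB exception type rather than std::runtime_error.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <stdexcept>
#include <string>
#include <unordered_map>

// Simplified stand-in for a compressed sparse row structure (assumption).
struct CSR {
    bool initialized = false;
};

// Hypothetical lookup: throws a descriptive error when the CSR is missing
// or not yet initialized, instead of letting the caller hit a segfault.
const CSR &GetCSROrThrow(const std::unordered_map<int32_t, std::unique_ptr<CSR>> &csrs, int32_t id) {
    auto entry = csrs.find(id);
    if (entry == csrs.end() || !entry->second || !entry->second->initialized) {
        throw std::runtime_error("CSR with id " + std::to_string(id) + " has not been created");
    }
    return *entry->second;
}
```

A path length function would call this helper first and only proceed with the returned reference.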

Edge table references a table that is not defined as a vertex table

What should happen in the following case:

CREATE TABLE person(id BIGINT, name VARCHAR);
CREATE TABLE know(src BIGINT, dst BIGINT, createDate BIGINT);
CREATE TABLE employer(id BIGINT);


CREATE PROPERTY GRAPH pg
VERTEX TABLES (
    person PROPERTIES ( id, name ) LABEL Person,
    )
EDGE TABLES (
    employs    SOURCE KEY ( src ) REFERENCES employer ( id )
            DESTINATION KEY ( dst ) REFERENCES Student ( id )
            PROPERTIES ( createDate ) LABEL Employs
    )

In this case the table employer is not defined as a vertex table, so should this definition be invalid?
According to the docs, the definition of an edge table includes:
The name of the destination vertex table. [p.12]

But in this case, the referenced table is not a vertex table as defined on page 11.
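
If the answer is that such a definition should be rejected, the validation could be sketched as below. The EdgeTableDef struct and the MissingVertexTables function are hypothetical and only model the names involved; how the binder actually represents property graph tables is not covered here. Names are compared exactly, so case handling is left out of this sketch.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified description of an edge table definition.
struct EdgeTableDef {
    std::string name;
    std::string source_table;
    std::string destination_table;
};

// Returns the referenced tables that are not registered as vertex tables,
// in the order they are encountered. An empty result means the graph is valid.
std::vector<std::string> MissingVertexTables(const std::set<std::string> &vertex_tables,
                                             const std::vector<EdgeTableDef> &edges) {
    std::vector<std::string> missing;
    for (const auto &edge : edges) {
        if (vertex_tables.count(edge.source_table) == 0) {
            missing.push_back(edge.source_table);
        }
        if (vertex_tables.count(edge.destination_table) == 0) {
            missing.push_back(edge.destination_table);
        }
    }
    return missing;
}
```

For the example above, with person as the only vertex table, both employer and Student would be reported as missing.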

Crash when * in column clause

What happens?

see title

To Reproduce

SELECT *
FROM GRAPH_TABLE(pg
    MATCH (p:Person)-[w:worksAt]->(u:university)
    COLUMNS (*)
    ) result

OS:

macOS 13 - Apple M1 Pro

DuckDB Version:

latest

DuckDB Client:

cli

Full Name:

Daniel ten Wolde

Affiliation:

CWI

Have you tried this on the latest `master` branch?

I agree

Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?

I agree

Create CSR leads to segfault in delete_csr.test

What happens?

see title

To Reproduce

Run the test/sql/sqlpgq/delete_csr.test file. Make sure to add `query I` or `statement ok` before the query, as it will crash in its current state.

OS:

macOS 13 - Apple M1 Pro

DuckDB Version:

latest

DuckDB Client:

CLI

Full Name:

Daniel ten Wolde

Affiliation:

CWI

Have you tried this on the latest `master` branch?

I agree

Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?

I agree

Support the EXCEPT list for a property graph table

Several constructs accepted by the parser are not yet supported in the transformer and binder, in particular the keywords that define which properties are part of a property graph table:

NO PROPERTIES
PROPERTIES ALL COLUMNS
PROPERTIES ALL COLUMNS EXCEPT (list of columns)
It should also be checked whether the optional keyword ARE is parsed correctly.
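
The three variants listed above boil down to one operation on the table's column list, sketched below. AllColumnsExcept is a hypothetical helper, not an existing function in the binder: PROPERTIES ALL COLUMNS corresponds to an empty EXCEPT list, and NO PROPERTIES would simply yield an empty property list without calling it.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Keeps every column of the table that is not named in the EXCEPT list,
// preserving the original column order.
std::vector<std::string> AllColumnsExcept(const std::vector<std::string> &columns,
                                          const std::vector<std::string> &except) {
    std::vector<std::string> properties;
    for (const auto &column : columns) {
        if (std::find(except.begin(), except.end(), column) == except.end()) {
            properties.push_back(column);
        }
    }
    return properties;
}
```

A real implementation would probably also want to report an error when the EXCEPT list names a column that does not exist in the table; this sketch silently ignores such names.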

Forgot to add UDF to sqlpgq_config.py

https://github.com/cwida/duckdb-pgq/blob/534854d530766cff9a48ff67ff4dac110ab81ec0/extension/sqlpgq/sqlpgq_config.py

When registering a new UDF, make sure to add it to the list of files in sqlpgq_config.py. @vlowingkloude

Also, you now have to make a pull request and (if correct) I have to approve the merge :)

duckdb gives an internal error when trying to select from multiple GRAPH_TABLE's

What happens?

When attempting to run a query that selects from multiple property graph tables, the following error is given: "Error: INTERNAL Error: Attempted to dereference unique_ptr that is NULL!"

According to the SQL standard for property graph queries, ISO/IEC 9075-16, selecting from multiple GRAPH_TABLEs (implicit cross join) should be possible.

To Reproduce

CREATE TABLE cities (
name VARCHAR,
lat DECIMAL,
lon DECIMAL
);

CREATE TABLE cities_are_adjacent (
city1name VARCHAR,
city2name VARCHAR
);

-CREATE PROPERTY GRAPH citymap
VERTEX TABLES (
cities PROPERTIES (name,lat,lon) LABEL city
)
EDGE TABLES (
cities_are_adjacent SOURCE KEY ( city1name ) REFERENCES cities ( name )
DESTINATION KEY ( city2name ) REFERENCES cities ( name )
LABEL adjacent
);

D -select * from GRAPH_TABLE (citymap MATCH (s:city)-[r:adjacent]->(t:city)) g1;

┌─────────┬───────────────┬───────────────┬───┬───────────────┬───────────────┐
│  name   │      lat      │      lon      │ … │     lat_1     │     lon_1     │
│ varchar │ decimal(18,3) │ decimal(18,3) │   │ decimal(18,3) │ decimal(18,3) │
├─────────────────────────────────────────────────────────────────────────────┤
│                                   0 rows                                    │
└─────────────────────────────────────────────────────────────────────────────┘
D -select * from GRAPH_TABLE (citymap MATCH (s:city)-[r:adjacent]->(t:city)) g1, GRAPH_TABLE (citymap MATCH (s:city)-[r:adjacent]->(t:city)) g2;
Error: INTERNAL Error: Attempted to dereference unique_ptr that is NULL!

OS:

Linux x86_64

DuckDB Version:

v0.10.1-dev17 bb9b820

DuckDB Client:

c++

Full Name:

Jeff Cavano

Affiliation:

eBay

What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.

I have tested with a stable release

Did you include all relevant data sets for reproducing the issue?

Yes

Did you include all code required to reproduce the issue?

Yes, I have

Did you include all relevant configuration (e.g., CPU architecture, Python version, Linux distribution) to reproduce the issue?

Yes, I have

Implement MatchRef::ToString()

It currently returns an empty string. Not sure what exactly it should contain.
