Comments (10)
Hi, @kszucs!
Integration with different data sources is a good idea, but it should live in wrapper packages, not in the core package. There are many sources to integrate with, and their communication interfaces can change really fast. Unfortunately I'm not familiar with all data sources and I can't keep up with these changes.
You can write your own wrapper and publish it on PyPI. Feel free to ask if you need any information about integration. I'll try to help.
Note: the latest release exposes the driver version:
from clickhouse_driver import VERSION
It might help with flawless integration.
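As a rough illustration of how a wrapper package might use this, here is a minimal sketch. It assumes `VERSION` is a tuple of integers such as `(0, 0, 8)`; the helper name `require_driver` and the minimum version are hypothetical, and a stand-in tuple is used so the snippet runs without the driver installed.

```python
def require_driver(version, minimum):
    """Raise if the driver version tuple is older than the wrapper's minimum."""
    if tuple(version) < tuple(minimum):
        have = '.'.join(str(part) for part in version)
        need = '.'.join(str(part) for part in minimum)
        raise RuntimeError(
            'clickhouse-driver %s is too old, need >= %s' % (have, need)
        )
    return True


# In a real wrapper this would be: from clickhouse_driver import VERSION
VERSION = (0, 0, 8)  # stand-in value for illustration only
require_driver(VERSION, minimum=(0, 0, 7))
```

Comparing tuples rather than strings keeps the ordering numeric, so `(0, 0, 10)` sorts after `(0, 0, 9)`.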
from clickhouse-driver.
Hi @xzkostyan
Actually I'm trying to implement a columnar version of `QueryResult`, but there are a couple of inconsistencies in the `Block` implementation. AFAIK when receiving, `block.data` stores column-wise data, whereas when sending it contains row-wise data.
I'm kinda blocked because the tests are running really slowly (I don't know why), so instead I'm sharing my findings: master...kszucs:columnar_block
from clickhouse-driver.
Hi, @kszucs.
Yup, you are right. There are some inconsistencies in how `Block.data` is stored:
- When you emit an `INSERT` query, the data for the insert is stored in the block row-wise.
- On a `SELECT` statement, the data received from CH is stored in `Block.data` column-wise.
That's why the `get_rows` method is used to transpose the received column-wise data to row-wise. This behavior should be split into different blocks later.
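To make the two layouts concrete, here is a minimal sketch of the transposition in plain Python. The function names are hypothetical and this is not the driver's actual `get_rows` implementation, only an illustration of the idea.

```python
def columns_to_rows(columns):
    """Transpose column-wise data (one sequence per column) into row tuples."""
    return list(zip(*columns))


def rows_to_columns(rows):
    """Transpose row-wise data (one tuple per row) into column tuples."""
    return list(zip(*rows))


# Column-wise, as received on SELECT: one sequence per column.
columns = [(1, 2, 3), ('a', 'b', 'c')]
rows = columns_to_rows(columns)
print(rows)

# Row-wise, as passed to INSERT: one tuple per row.
print(rows_to_columns(rows))
```

Both directions are the same `zip(*...)` operation, which is why a single block class can hold either layout and why mixing them up is easy.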
You can check this branch: https://github.com/mymarilyn/clickhouse-driver/tree/feature-deferred-rows-length-validation. There are some speed optimizations on `SELECT`.
If you want to do some research on performance, you can use the following profiling snippets:
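# Run these in IPython: %prun is an IPython magic, not plain Python.
from clickhouse_driver import Client
c = Client('localhost')
%prun c.execute('SELECT * FROM large_table')

from clickhouse_driver import Client
c = Client('localhost')
N = 100000  # number of rows to insert; pick a size that suits your test
%prun c.execute('INSERT INTO test (a, b, c) VALUES', [(x, x, x) for x in range(N)])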
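
Outside IPython, roughly the same measurement can be taken with the standard library's `cProfile`. The sketch below is an assumption-laden stand-in: `profile_call` and `build_rows` are hypothetical names, and `build_rows` only builds the insert payload so the snippet runs without a ClickHouse server. With the driver you would pass `c.execute` and its arguments instead.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Show the ten entries with the highest cumulative time.
    stats.sort_stats('cumulative').print_stats(10)
    return result, buffer.getvalue()


def build_rows(n):
    # Stand-in workload; replace with c.execute(...) against a real server.
    return [(x, x, x) for x in range(n)]


rows, report = profile_call(build_rows, 1000)
print(report)
```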
from clickhouse-driver.
If you only need to implement a columnar version of `QueryResult`, you can implement a `get_columns` method that picks the raw `block.data`. After that you can iteratively `.extend()` this data into the result.
That's it.
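A minimal sketch of that idea, assuming each received block exposes its data as one sequence per column. The class name `ColumnarResult` and its `store` method are hypothetical and do not mirror the driver's actual `QueryResult` API.

```python
class ColumnarResult:
    """Hypothetical result holder that accumulates data column by column."""

    def __init__(self):
        self.columns = []

    def store(self, block_data):
        # block_data holds one sequence of values per column.
        if not self.columns:
            # The first non-empty block defines the set of columns.
            self.columns = [list(col) for col in block_data]
            return
        if len(block_data) != len(self.columns):
            raise ValueError('block has a different number of columns')
        # Later blocks extend each existing column in place.
        for target, col in zip(self.columns, block_data):
            target.extend(col)


result = ColumnarResult()
result.store([(1, 2), ('a', 'b')])  # first block: two rows
result.store([(3,), ('c',)])        # second block: one more row
print(result.columns)
```

Because the data is never transposed, each block costs only one `extend` per column rather than one tuple per row.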
from clickhouse-driver.
I've created a PR according to your comment.
from clickhouse-driver.
@xzkostyan would you mind drafting a new release? I'd like to use the columnar result extending fix here.
from clickhouse-driver.
Sure! I'll make a new release on Saturday or Sunday.
from clickhouse-driver.
Great! Thanks Kostya!
from clickhouse-driver.
Hi, @kszucs!
Version 0.0.8 is released.
from clickhouse-driver.
Eventually pandas interop will be released in ibis, so I'm closing this ticket now. Thanks!
from clickhouse-driver.