Comments (11)
@FrancoisZim any ideas?
from hyper-api-samples.
@JMXAAS - will try to recreate this but I may need you to provide part of your debug.log file to understand the defect. Will allocate some time to this in the next three days.
from hyper-api-samples.
I've created a branch with the fix for this bug: https://github.com/FrancoisZim/hyper-api-samples/tree/clouddb-extractor-bugfix
Can you let me know if this resolves the issue and I will commit to the main repo.
from hyper-api-samples.
FYI, for smaller tables it actually works fine, but not when you reach 10M rows...
from hyper-api-samples.
Thanks a lot @FrancoisZim
Here's the debug of my latest run.
BigQuery
Tableau Online
10.9M rows
from hyper-api-samples.
Awesome! It worked @FrancoisZim!
Is there a setting for the number of rows per upload that can be tweaked? At the moment it only uploads ~400k rows per chunk to the extract.
from hyper-api-samples.
Thanks for testing. There is a constant that you can change to include more chunks in each extract.
Odd that you are only getting 400K rows, as I thought the default BigQuery behaviour was 1 GB per chunk, so there should be 5 GB of CSV per hyper file?
BLOBS_PER_HYPER_FILE: int = 5
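For illustration only, here is a minimal sketch of how a constant like this could group exported blobs into hyper files. The helper names `batch_blobs` and `hyper_file_count` and the blob names are hypothetical and are not taken from the extractor source; only the constant and its value come from this thread.

```python
import math
from typing import Iterator, List

BLOBS_PER_HYPER_FILE: int = 5  # value quoted in this thread


def batch_blobs(
    blob_names: List[str], batch_size: int = BLOBS_PER_HYPER_FILE
) -> Iterator[List[str]]:
    """Yield consecutive groups of at most batch_size blob names (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(blob_names), batch_size):
        yield blob_names[start:start + batch_size]


def hyper_file_count(blob_count: int, batch_size: int = BLOBS_PER_HYPER_FILE) -> int:
    """Number of hyper files needed if each one holds up to batch_size blobs."""
    return math.ceil(blob_count / batch_size)


if __name__ == "__main__":
    # Hypothetical blob names standing in for exported CSV chunks.
    blobs = [f"export-{i:03d}.csv.gz" for i in range(12)]
    for group in batch_blobs(blobs):
        print(len(group), group[0], "...", group[-1])
```

Under this assumption, raising the constant means fewer but larger hyper files for the same export.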
from hyper-api-samples.
That's what I read as well, but we get ~6 MB chunks of gzipped CSV?
Yes, I played around with BLOBS_PER_HYPER_FILE and you can clearly see it load several chunks into the hyper file. What I noticed when increasing the constant, though, was that once the hyper file exceeded the 64 MB limit, the upload was split into a multipart upload and failed every time. Running the script with a lower constant, so that the hyper files end up below 64 MB, worked fine.
I'll see if I can recreate the issue again when exceeding 64 MB.
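A rough way to tell in advance which hyper files would hit the multipart path is to compare their size on disk with the 64 MB limit. This is only a sketch: `needs_multipart_upload` is a hypothetical helper, and treating 64 MB as 64 * 1024 * 1024 bytes with an inclusive boundary is an assumption, not something confirmed in this thread.

```python
import os

# Assumption: the 64 MB limit mentioned above means 64 * 1024 * 1024 bytes.
MULTIPART_THRESHOLD_BYTES = 64 * 1024 * 1024


def needs_multipart_upload(
    path: str, threshold_bytes: int = MULTIPART_THRESHOLD_BYTES
) -> bool:
    """Return True if the file at path is at or above the size threshold.

    Hypothetical helper; the inclusive comparison is an assumption.
    """
    return os.path.getsize(path) >= threshold_bytes
```

Logging this result for each hyper file before publishing would show whether the failures line up with the files that cross the limit.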
from hyper-api-samples.
It would be really useful if you could send me one of the debug logs for the multi-part >64 MB bug. That should be handled by TSC, so I will include a fix for this as well.
from hyper-api-samples.
I'll probably manage to try it out again tomorrow.
from hyper-api-samples.
@FrancoisZim attaching debug.log from today's test, where I increased BLOBS_PER_HYPER_FILE to create a larger hyper file.
What's not in the log file is the error message below. No extract is published.
`Traceback (most recent call last):
  File "/hyper-api-samples/Community-Supported/clouddb-extractor/extractor_cli.py", line 354, in <module>
    main()
  File "/hyper-api-samples/Community-Supported/clouddb-extractor/extractor_cli.py", line 296, in main
    extractor.export_load(
  File "/hyper-api-samples/Community-Supported/clouddb-extractor/base_extractor.py", line 182, in execution_timer
    result = func(*args, **kw)
  File "/hyper-api-samples/Community-Supported/clouddb-extractor/base_extractor.py", line 776, in export_load
    self.publish_hyper_file(path_to_database, tab_ds_name, publish_mode)
  File "/hyper-api-samples/Community-Supported/clouddb-extractor/base_extractor.py", line 527, in publish_hyper_file
    datasource = self.tableau_server.datasources.publish(datasource, path_to_database, publish_mode)
  File "/usr/local/lib/python3.10/dist-packages/tableauserverclient/server/endpoint/endpoint.py", line 292, in wrapper
    return func(self, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/tableauserverclient/server/endpoint/endpoint.py", line 334, in wrapper
    return func(self, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/tableauserverclient/server/endpoint/endpoint.py", line 334, in wrapper
    return func(self, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/tableauserverclient/server/endpoint/datasources_endpoint.py", line 298, in publish
    server_response = self.post_request(url, xml_request, content_type)
  File "/usr/local/lib/python3.10/dist-packages/tableauserverclient/server/endpoint/endpoint.py", line 249, in post_request
    return self._make_request(
  File "/usr/local/lib/python3.10/dist-packages/tableauserverclient/server/endpoint/endpoint.py", line 166, in _make_request
    self._check_status(server_response, url)
  File "/usr/local/lib/python3.10/dist-packages/tableauserverclient/server/endpoint/endpoint.py", line 189, in _check_status
    raise ServerResponseError.from_response(server_response.content, self.parent_srv.namespace, url)
tableauserverclient.server.endpoint.exceptions.ServerResponseError:
    400011: Bad Request
        There was a problem publishing the file '26261:5027d532c029418681b095ae9b659165-0:0'.`
from hyper-api-samples.