Comments (4)
We need more details. Are any errors raised?
Run ".show ingestion failures" on your cluster and see if any errors show up.
Also, by default the ingestion batching policy batches the data for a few minutes before ingesting it, and the connector waits for the ingestion to complete.
Try running it with a shorter ingestion batching policy: https://learn.microsoft.com/en-us/azure/data-explorer/kusto/management/batchingpolicy
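As a rough illustration of what "a shorter ingestion batching policy" could look like, the Python sketch below only builds the text of an `.alter table ... policy ingestionbatching` management command; it does not connect to a cluster. The helper name `batching_policy_command`, the table name `MyTable`, and the chosen values are assumptions made for the example, not something taken from this thread. Check the batching policy page linked above for the allowed ranges before running the generated command.

```python
import json


def batching_policy_command(table, max_time="00:00:30", max_items=500, max_raw_mb=1024):
    """Build the text of an ingestion batching policy command (hypothetical helper).

    The table name is inserted as-is, so names with special characters
    would need bracket quoting that this sketch does not handle.
    """
    policy = {
        "MaximumBatchingTimeSpan": max_time,   # seal a batch after this long
        "MaximumNumberOfItems": max_items,     # ...or after this many blobs
        "MaximumRawDataSizeMB": max_raw_mb,    # ...or after this much raw data
    }
    return f".alter table {table} policy ingestionbatching '{json.dumps(policy)}'"


# Example values only: a 30-second batching window on a table called MyTable.
print(batching_policy_command("MyTable"))
```

The printed command can be pasted into the query editor of a test database. A shorter window makes data visible sooner but produces more, smaller ingestion batches.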
from azure-kusto-spark.
Thank you for the troubleshooting steps. Apparently this is a resource issue on the free-tier ADX cluster. The error reads:
The cluster has reached its full data capacity of 10,468,982,784 bytes, to ingest more data please reduce the data size by either deleting some of the data or adjusting the retention policy.
I thought (based on this page) that the limit was 100 GB, but in any case, the issue is not with the connector.
I appreciate the guidance on the batching too.
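For reference, the byte count in the error message can be converted to more familiar units with a few lines of Python. This is plain arithmetic on the number quoted above; it says nothing about how the free-tier quota is defined or why it differs from the 100 GB figure.

```python
# Capacity reported in the error message, in bytes.
limit_bytes = 10_468_982_784

GIB = 2**30   # binary gibibyte
GB = 10**9    # decimal gigabyte

limit_gib = limit_bytes / GIB
limit_gb = limit_bytes / GB

print(f"{limit_gib:.2f} GiB, {limit_gb:.2f} GB")
# prints "9.75 GiB, 10.47 GB"
```

So the reported capacity is exactly 9.75 GiB, roughly a tenth of the 100 GB mentioned in the comment.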
from azure-kusto-spark.
One thing that can be related to the connector, and is worth considering if your cluster is low on resources: the connector can create leftovers that are supposed to be cleaned up at the end of each run. If they are not (this can happen if a 'write' run is canceled before it has fully waited for completion), the service auto-cleans them after a week. Read about stagingResourcesAutoCleanupTimeout here:
https://github.com/Azure/azure-kusto-spark/blob/master/docs/KustoSink.md
This cannot happen in the connector's 'Queued' mode.
from azure-kusto-spark.
Would the following command be the "on demand" equivalent of the staging resources auto cleanup? It seems to trigger an IndependentGarbageCollectionRequested in the cluster.
.clean databases extentcontainers
from azure-kusto-spark.