druid-io / RDruid
Druid connector for R
Home Page: http://druid-io.github.io/RDruid/
License: Other
I have a query that works fine with a single interval, but when I define a list of intervals it fails.
intervals <- interval(ymd(20130701), ymd(20130720))
intervals <- list(interval(ymd(20130701), ymd(20130710)), interval(ymd(20130710), ymd(20130720)))
If I use the first one, it succeeds. If I use the second one, it fails with:
Error in UseMethod("toISO", t) :
no applicable method for 'toISO' applied to an object of class "list"
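One possible way to accept a list of intervals is to give the conversion generic a method for lists. The sketch below is a base-R illustration only: to_iso, simple_interval and day are hypothetical stand-ins, not RDruid's actual toISO implementation, and plain POSIXct start/end pairs are used in place of lubridate intervals.

```r
# Hypothetical stand-in for an S3 generic like toISO (not RDruid's code).
to_iso <- function(t) UseMethod("to_iso")

# A single timestamp becomes an ISO 8601 string in UTC.
to_iso.POSIXct <- function(t) format(t, "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")

# Hypothetical minimal interval class: a start and an end timestamp.
simple_interval <- function(start, end) {
  structure(list(start = start, end = end), class = "simple_interval")
}

# An interval becomes "start/end", the ISO 8601 interval notation.
to_iso.simple_interval <- function(t) {
  paste(to_iso(t$start), to_iso(t$end), sep = "/")
}

# A plain list is handled by converting each element in turn.
to_iso.list <- function(t) vapply(t, to_iso, character(1))

# Helper to build UTC timestamps from date strings.
day <- function(s) as.POSIXct(s, tz = "UTC")

intervals <- list(
  simple_interval(day("2013-07-01"), day("2013-07-10")),
  simple_interval(day("2013-07-10"), day("2013-07-20"))
)
iso <- to_iso(intervals)
print(iso)
```

With a list method in place, a single interval and a list of intervals go through the same call, so the query builder would not need to distinguish the two cases.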
ThetaSketch is a nice feature. Please support thetaSketches.
@xvrl
Druid 0.6 supports HyperUniques. Please update the RDruid client library to support them.
If something in the query specification causes the resulting query JSON to be invalid, the error reported by Druid is not bubbled up to the user.
For example,
"filter": {
"type": "and",
"fields": [
{
"type": "and",
"fields": [
true
]
}
]
}
]
}
should report an error like:
{"error":"Unexpected token (VALUE_TRUE), expected FIELD_NAME: missing property 'type' that is to contain type id (for class io.druid.query.filter.DimFilter)\n at [Source: HttpInputOverHTTP@46ca8094; line: 1, column: 967] (through reference chain: java.util.ArrayList[0]->java.util.ArrayList[0])"}
But instead it simply shows:
Error in httr::http_condition(res, "error", message = err$error, call = sys.call(-1)) :
unused argument (message = err$error)
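As a rough illustration of bubbling the server message up, the base-R sketch below builds a custom error condition that carries the text returned by Druid. The names druid_error and check_response are hypothetical and not part of RDruid or httr; the status code and message are passed in directly, so no HTTP request is made.

```r
# Hypothetical constructor for an error condition that keeps the server's
# message (base R only; not RDruid's or httr's API).
druid_error <- function(status, server_message, call = sys.call(-1)) {
  structure(
    class = c("druid_error", "error", "condition"),
    list(
      message = sprintf("Druid query failed (HTTP %d): %s",
                        as.integer(status), server_message),
      call = call,
      status = status
    )
  )
}

# Hypothetical check: signal the condition for any 4xx/5xx status.
check_response <- function(status, server_message) {
  if (status >= 400) stop(druid_error(status, server_message))
  invisible(TRUE)
}

# The caller sees the server's explanation, not a generic failure.
msg <- tryCatch(
  check_response(500L, "Unexpected token (VALUE_TRUE), expected FIELD_NAME"),
  druid_error = function(e) conditionMessage(e)
)
print(msg)
```

Because the condition has its own class, callers can catch Druid failures specifically while other errors propagate as usual.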
Installing RDruid fails with the following output:
"D:/Program Files/R/R-3.3.1/bin/x64/R" --no-site-file --no-environ --no-save --no-restore --quiet CMD INSTALL \
"C:/Users/CharlesAllen/AppData/Local/Temp/RtmpAzbBzo/devtools2de4790d6462/druid-io-RDruid-7e205cf" \
--library="D:/Program Files/R/R-3.3.1/library" --install-tests
'D:\Program' is not recognized as an internal or external command,
operable program or batch file.
Error: Command failed (1)
"Can not deserialize instance of java.util.ArrayList out of VALUE_STRING token\n at [Source: [B@1c968b4; line: 27, column: 20]\n"
I believe the problem is the json function, which calls toJSON with asIs = FALSE (the default), so a single-element vector is converted into a scalar, i.e. "dimensions": "country" instead of "dimensions": ["country"].
But I do not know whether setting asIs = TRUE, as in
json <- function(obj, ...) {
  toJSON(obj, digits=22, collapse="", asIs=T, ...)
}
will cause other things to fail!
query.js <- json(list(intervals = as.list(toISO(intervals)),
aggregations = renameagg(aggregations),
dataSource = dataSource,
filter = filter,
having = having,
granularity = granularity,
dimensions = dimensions,
postAggregations = renameagg(postAggregations),
limitSpec = limitSpec,
queryType = "groupBy",
context = context), pretty=verbose)
While running:
druid.query.dimensions(url = druid.url(host="my_hostname",
port=8082),dataSource = "my_data_source_name")
I get the following error:
Error in simplify(obj, simplifyVector = simplifyVector, simplifyDataFrame = simplifyDataFrame, :
unused argument (encoding = NULL)
jsonlite version: 0.9.19
httr version: 1.1.0
Will we get a working R package for Druid? Pydruid seems to develop quickly, in contrast to RDruid.
Hi,
since Druid's raw API has a search query, it would be good for RDruid to have a search method as well!
If there are missing fields, then laply fails (error: result must have same number of dimensions), e.g. when the country code is missing for a certain record.
This would be my change, but I am not an R expert:
druid.groupBytodf <- function(result) {...
query.R line 75:
df[, c] <- laply(result, function(x){ if(is.null(x$event[c][[1]])) NA else x$event[c][[1]] })
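The same idea can be sketched in base R without plyr. The example below is only an illustration under assumptions: the result list is made-up sample data shaped like a groupBy response, and get_field is a hypothetical helper, not a function in RDruid.

```r
# Made-up sample shaped like a groupBy result; the first record lacks "xyz".
result <- list(
  list(timestamp = "2015-01-13T00:00:00.000Z",
       event = list(mycounter = 576)),
  list(timestamp = "2015-01-13T00:00:00.000Z",
       event = list(mycounter = 1167, xyz = "1"))
)

# Hypothetical helper: return NA when a field is absent from the event.
get_field <- function(x, name) {
  v <- x$event[[name]]
  if (is.null(v)) NA else v
}

# vapply enforces exactly one value per record, so a missing field
# yields NA instead of a result of a different length.
xyz <- vapply(result, function(x) as.character(get_field(x, "xyz")),
              character(1))
counter <- vapply(result, function(x) as.numeric(get_field(x, "mycounter")),
                  numeric(1))

df <- data.frame(mycounter = counter, xyz = xyz, stringsAsFactors = FALSE)
print(df)
```

Replacing NULL with NA before the values are combined keeps every record the same length, which is what the row-wise conversion needs.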
I get the following error with the query below: "Error in x$timestamp : $ operator is invalid for atomic vectors".
start = ymd(20150113)
end = ymd(20150120)
r = druid.query.groupBy(
url = druid.url("master1.dw.xyz.com", port=8080),
dataSource = "mysource",
intervals = interval(start, end),
aggregations = list(
sum(metric("mycounter"))
),
granularity = granularity("all"),
dimensions = list('xyz')
)
If I run the query with rawData=T, it works and the result set looks like this:
   version                timestamp event.mycounter event.xyz
1       v1 2015-01-13T00:00:00.000Z             576      <NA>
2       v1 2015-01-13T00:00:00.000Z            1167         1
3       v1 2015-01-13T00:00:00.000Z               5        10
4       v1 2015-01-13T00:00:00.000Z             102     10080
5       v1 2015-01-13T00:00:00.000Z              24      1020
6       v1 2015-01-13T00:00:00.000Z              25      1080
7       v1 2015-01-13T00:00:00.000Z               2        11
8       v1 2015-01-13T00:00:00.000Z              29      1140
9       v1 2015-01-13T00:00:00.000Z              79     11520
10      v1 2015-01-13T00:00:00.000Z               6        12
11      v1 2015-01-13T00:00:00.000Z              55       120
12      v1 2015-01-13T00:00:00.000Z              25      1200
13      v1 2015-01-13T00:00:00.000Z              35      1260
14      v1 2015-01-13T00:00:00.000Z              65     12960
I looked into groupBytodf in query.R, where it fails at ts <- laply(result, function(x) { x$timestamp }),
even though class(x$timestamp) is "character".
Any ideas what is wrong here?
Ciao,
Martin