delta-rho / trelliscope
Detailed Visualization of Large Complex Data in R
License: Other
The group parameter does a good job of helping to organize panels, but I have an exploration that has grown to almost 20 individual panels and will continue to grow. As a suggestion for an enhancement, it might be nice if I could organize panels hierarchically.
This could probably be done with the existing makeDisplay interface, letting / indicate a sub group. The display pop-up could then show a collapsible group hierarchy.
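To make the suggestion concrete, here is a minimal sketch of how "/"-delimited group names could be split into hierarchy levels. The helper name splitGroupPath() and the example group names are hypothetical; nothing here is part of trelliscope.

```r
# Hypothetical helper (not part of trelliscope): split "/"-delimited
# group names into their hierarchy levels.
splitGroupPath <- function(group) {
  strsplit(group, "/", fixed = TRUE)
}

# Made-up group names for illustration
groups <- c("common", "housing/price", "housing/sales/monthly")
paths <- splitGroupPath(groups)

# Depth of each group in the hierarchy
depth <- lengths(paths)

# Top-level group under which each display would be nested
topLevel <- vapply(paths, function(p) p[1], character(1))
```

A collapsible list in the display pop-up could then be built by nesting displays under topLevel and recursing on the remaining path elements.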
I believe this is a typo
For greater user flexibility, I'd recommend adding 'logical' as one of the acceptable 'types' in cog(), and then just converting the TRUE or FALSE to a character string.
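The conversion being suggested could look roughly like this. This is only a sketch of a pre-processing step; logicalToCogValue() is a hypothetical name and not how cog() is actually implemented.

```r
# Hypothetical pre-processing step, not trelliscope's actual cog():
# logical values are stored as the strings "TRUE" / "FALSE",
# everything else is passed through unchanged.
logicalToCogValue <- function(val) {
  if (is.logical(val)) as.character(val) else val
}

flag <- logicalToCogValue(FALSE)
num <- logicalToCogValue(3.5)
```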
removeDisplay is defined twice: once in /R/makeDisplay.R and once in /R/displayObj.R. I don't know which one you want to keep.
Is there a good way to debug error messages that don't show up until interacting with the viewer? I'm getting the following error message only when opening the Table Sort/Filter tab.
Error: argument is of length zero.
The message in the terminal is below.
Error in if (!is.na(curInfo$type)) { : argument is of length zero
I suspect it's a problem with the cognostics I've defined but I'm not positive (the univariate and bivariate filters work fine).
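For what it's worth, this kind of message typically appears when the value being tested is NULL rather than NA. The sketch below reproduces that in isolation; curInfo here is a stand-in list, and the assumption that a cognostic is missing its type element is a guess, not something confirmed from the viewer code.

```r
# curInfo stands in for the viewer's internal object; here it simply
# lacks a `type` element, so curInfo$type is NULL.
curInfo <- list(name = "myCog")

# is.na(NULL) has length zero, and if() on a length-zero condition
# raises "argument is of length zero".
res <- tryCatch(
  if (!is.na(curInfo$type)) "has type" else "no type",
  error = function(e) conditionMessage(e)
)

# A defensive check that treats a missing type the same as NA
hasType <- function(info) {
  !is.null(info$type) && length(info$type) == 1 && !is.na(info$type)
}
hasType(curInfo)
hasType(list(type = "numeric"))
```

Checking each cognostic for a missing or NULL type before calling view() may help narrow down which one triggers the error.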
Add a "state" argument to makeDisplay() that specifies state info to create both the base display and a view of the display with the specified state. Also make a makeView() function that takes an existing display and adds a state to it for a new view of the display. state would be a list, something like this:
state = list(
  panelLayout = list(nrow = 2, ncol = 3),
  sort = list(var1 = "asc"),
  filter = list(var2 = list(from = 100, to = 200)),
  skip = 3,
  desc = "…"
)
The collapsible sidebar on the right of the viewer is a placeholder for a "CogMap". This is basically a scatterplot of all of the cognostics for the specified variables, with the cognostics of the currently-viewed panels being highlighted on the plot. This helps get bearings as to where what you are looking at falls with respect to specified cognostics.
Currently this sidebar covers up panels. It should not do this; the main panel area should resize to accommodate it.
I'm getting the intermittent error:
Error in ncol(data) :
error in evaluating the argument 'x' in selecting a method for function 'ncol': Error in `[.data.frame`(curRows, i, labelVars, drop = FALSE) :
undefined columns selected
The error can be reproduced by sourcing, https://raw.githubusercontent.com/kaneplusplus/akde-pubmed/master/ebola_vis/example.r, and changing between displays and changing the panel labels.
The code below produces the error
Error in list2env(cdo$relatedData) : first argument must be a named list
when trying to view individual displays.
library(trelliscope)
iris_small = iris[c(1:2, 51:52, 101:102),]
vdbConn("iris_small_test", name="iris_small_test", autoYes=TRUE)
by_species = divide(iris_small, by="Species", update=TRUE)
makeDisplay(by_species,
  name = "iris",
  panelFn = function(x) qplot(Sepal.Length, data = x, geom = "histogram"),
  cogFn = function(x) list(num_plants = cog(nrow(x), "Plants")))
https://github.com/tesseradata/trelliscope/blob/master/R/makeDisplay.R#L28
This line links to a thing about inputVars. I do not know where this is supposed to link to.
* checking DESCRIPTION meta-information ... NOTE
Deprecated license: BSD
Any different name to use?
In the "Visible Cognostics" menu, users should have the option of selecting no cognostics to display. It currently allows deselecting all of them, but then shows the default label again in the display.
A user may want to look at visual filters of cognostics given the current filtered state. I think it would be useful in the scatterplots and quantile plots to show filtered points in a different color (gray?). But then if the "conditional" button is clicked, all filtered points should be removed from the interactive plot.
It has also been suggested to change the naming of "conditional" and "marginal" as someone not familiar with statistical distribution terminology might not get it. Any ideas here are welcome. e.g. marginal=all, conditional=after filtering...
NA's for any cognostic cause the entire Table Sort/Filter view to throw an error, and they also cause the Univariate Filter to fail for that particular cognostic. It would be nice to add the following functionality:
When the cognostic function is applied in makeDisplay(), run some quick check over the resulting cognostics list, and if there are NA's, remove them from the list before passing them on to the display (so they never appear in the display), and issue a warning or note to the user explaining why the cognostic is not present in the viewer.
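A rough sketch of such a check is below. It assumes the cognostics have already been collected into a data frame with one row per panel and one column per cognostic; dropNACogs() and the example column names are hypothetical and not part of trelliscope.

```r
# Hypothetical check, not part of trelliscope: cogDF is assumed to be a
# data frame with one row per panel and one column per cognostic.
dropNACogs <- function(cogDF) {
  hasNA <- vapply(cogDF, anyNA, logical(1))
  if (any(hasNA)) {
    warning("Removing cognostics containing NA values: ",
            paste(names(cogDF)[hasNA], collapse = ", "))
  }
  # Keep only the cognostics that are complete for every panel
  cogDF[, !hasNA, drop = FALSE]
}

cogDF <- data.frame(meanWidth = c(3.4, 2.8, NA), nObs = c(50, 50, 50))
cleaned <- suppressWarnings(dropNACogs(cogDF))
names(cleaned)
```

The warning text names the dropped cognostics so the user can see why they are absent from the viewer.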
* checking R code for possible problems ... NOTE
Any ideas on the displayListNames, displayList, displayListDF, or logMsg ? I don't know where they're coming from.
In the cogDisplayHref function, the parameter displayGroup has a default value of NULL, but that leads to URLs that are invalid. The displayGroup parameter must be specified to get a valid URL. Recommend changing the default value to "common" instead, as this is the default group used by makeDisplay.
Currently it loads a full-resolution panel from each display, which is slow.
See this example. It runs without error or warning, but the xlims and ylims are not fixed.
library(trelliscope)
vdbConn("iris_test", name = "iris_test", autoYes = TRUE)
by_species <- divide(iris, by = "Species", update = TRUE)
makeDisplay(by_species,
  name = "iris",
  panelFn = function(x) {plot(x$Sepal.Length, x$Petal.Width); return(NULL)},
  cogFn = function(x) list(Sepal.Width.Mean = cogMean(x$Sepal.Width)),
  lims = list(x = "same", y = "same",
    prepanelFn = function(x) list(xlim = range(x$Sepal.Length),
                                  ylim = range(x$Petal.Width))))
view()
This will help organize possible cognostics to choose from when the number of cognostics is large.
This will allow the user to specify which cognostics they currently care about to reduce clutter in the other interactive cognostics controls.
Is the collect function supposed to come from the memisc package? It currently cannot be found.
Use data.tables for cognostics storage and interactions for faster performance.
The multivariate filter currently does not update the selected panels after being highlighted and "apply" is clicked. Fix this. This is a quick fix, just needs to be done.
For reference, the multivariate filter computes an independent components projection of 2 or more selected quantitative cognostics, in hopes to find an interesting projection of these variables, with the idea that interesting subsets of panels can be found through this mechanism.
If my panel function doesn't render anything for one subset, can we put in a filler image?
On Windows, the thumbnail view of the display in "Open Displays" is blank (at least on Chrome, haven't tested it on Firefox).
https://github.com/tesseradata/trelliscope/blob/master/R/conn_vdb.R#L14
This does not link to anything. :-(
As an alternative to viewing a display in the trelliscope viewer, create a print() method that will write the display to a multi-page pdf file.
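As a rough illustration of the idea using only base R graphics, the sketch below writes one panel per page to a pdf. A plain list of data subsets stands in for a display object, and the file name is arbitrary; a real print() method would need to work with trelliscope's display objects and panel layout.

```r
# Sketch only: a plain list of subsets stands in for a trelliscope display.
subsets <- split(iris, iris$Species)
panelFn <- function(x) plot(x$Sepal.Length, x$Sepal.Width)

outFile <- file.path(tempdir(), "display.pdf")
pdf(outFile)                   # each new plot starts a new page
for (s in subsets) panelFn(s)  # one page per subset
invisible(dev.off())

file.exists(outFile)
```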
Instead of rendering page by page, for a fixed state of filtering, sorting, sampling, and panel layout, start to build a cache of hidden divs containing each page's rendered content.
Thus when the user clicks next, if the next page has been pre-cached, it will simply load quickly, otherwise, it will trigger the output to be rendered the usual way. Users typically stop to study the current page for a small amount of time, so it would be good to remove the page changing latency by rendering it while they are viewing.
I think the best way to do this is by a specially-numbered div for each page and a special shiny output that keeps triggering more data to be sent as it keeps rendering.
Add the ability to save the sort/filter/sample/etc. state of a display, essentially creating a view of the display, with the ability to annotate what the state signifies. Views would show up in the display list as sub-items under the display they were created from. Selecting the view would show the display in the state it was saved in.
This will allow the user to specify an additional set of displays created against the same division of data to be shown alongside the currently-selected display in the Trelliscope viewer. The trick is to figure out the best way to position and size multiple displays of varying aspect ratio.
Behavior: using commands such as vdbInit with a directory nested into a directory not yet created will fail.
Suggestion:
grep all uses of dir.create and use rec=TRUE
e.g.:
vdbDir <- "/tmp/jrounds/vdbtest"
vdbInit(vdbDir, name="testVDB", autoYes=TRUE)
Error in vdbInit(vdbDir, name = "testVDB", autoYes = TRUE) :
Could not create directory.
In addition: Warning message:
In dir.create(path) :
cannot create dir '/tmp/jrounds/vdbtest', reason 'No such file or directory'
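The difference can be seen with dir.create() on its own, outside of trelliscope. The sketch uses a fresh temporary location rather than the path from the report.

```r
# Nested path whose parent directories do not exist yet
root <- tempfile("vdbroot")
vdbDir <- file.path(root, "jrounds", "vdbtest")

# Without recursive = TRUE this fails (returns FALSE with a warning)
# because the parent directory is missing.
okPlain <- suppressWarnings(dir.create(vdbDir))

# With recursive = TRUE the missing parent directories are created too.
okRecursive <- dir.create(vdbDir, recursive = TRUE)
```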
Add a minimized file that can be sourced.
Main issue was long file path for R CMD check
Is there an easy way we can automatically check the following conditions:
I was looking around at travis ci stuff, but it wasn't clear to me how that might work, since we'd be checking the code itself--not the execution of it.
Add options to export the data corresponding to a given panel as .rda or .csv, etc.
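In plain R, the export step itself might be as simple as the sketch below. Here panelData stands in for the subset behind a single panel and the file names are arbitrary; how the viewer would hand that subset to the user is the open part of this request.

```r
# Sketch only: panelData stands in for the data behind a single panel.
panelData <- subset(iris, Species == "setosa")

csvFile <- file.path(tempdir(), "panel_setosa.csv")
rdaFile <- file.path(tempdir(), "panel_setosa.rda")

write.csv(panelData, csvFile, row.names = FALSE)  # plain-text export
save(panelData, file = rdaFile)                   # R binary export
```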
On the last step in the housing data tutorial:
vdbPrefix is C:\Users\Brian\Documents\housingjunk\vdb
Warning in file(con, "rb") :
cannot open file 'C:\Users\Brian\Documents\housingjunk\vdb/displays/common/list_sold_vs_time/thumb_small.png': No such file or directory
Error in file(con, "rb") : cannot open the connection
The "list_sold_vs_time" directory is there, but no files are there. Then there is the error...
Can you fix this?
Error in `[.kvLocalDisk`(cdo$panelDataSource, cogDF$panelKey) :
It appears you are trying to retrive a subset of the data using a hash of the key. Key hashes have not been computed for this data. Please call updateAttributes() on this data.
This covers a few of the outstanding issues. The idea is to have an overall notion of state in Trelliscope displays. By state, we are talking about being able to specify how a display is being shown with respect to sorting, filtering, panel layout, panel labels, etc.
Places where we would like to specify the state include:
state argument to makeDisplay()
view()
The approach I have been working on uses a URL hash to store the state as interactions occur. Any time a state variable is changed (sorting, etc.), the URL will update. Thus at any point in the viewing process, the URL can be copied and shared with others to preserve the state (not everything is preserved, such as the current page you are on and related displays).
State is specified from the R console by a named list. Consider this example:
library(trelliscope)
vdbConn(file.path(tempdir(), "testVDB"), name = "test", autoYes = TRUE)
set.seed(1234)
iris$alpha <- sample(letters[1:5], 150, replace =TRUE)
bySpeciesAlpha <- divide(iris, by = c("Species", "alpha"))
makeDisplay(bySpeciesAlpha,
  name = "testBySpeciesAlpha",
  panelFn = function(x)
    plot(x$Sepal.Length, x$Sepal.Width)
)
view("testBySpeciesAlpha", state = list(
  layout = list(nrow = 2, ncol = 2),
  sort = list(Species = "asc", alpha = "desc"),
  filter = list(
    alpha = list(select = c("a", "b", "c")),
    Species = list(regex = "a$")),
  labels = c("Species", "alpha")
))
Here, we add a new categorical variable to the iris data, split on both species and this new variable, and then create a dummy plot. Then, when we call view, we tell the viewer to launch with that display showing, reflecting the specified state. The panels will be laid out in 2 rows and 2 columns. Note that we can also specify arrange="row" (default) or arrange="col" in the layout. The panels will be sorted by Species ascending and alpha descending. The panels will be filtered on alpha so that only panels for the letters a-c are showing, and on Species so that only species ending with the letter a are showing. For numeric variables, we can specify, for example with a variable called var, the following: filter = list(var = list(from = 0, to = 1)). Finally, we can specify labels by simply providing a character vector of cognostic variable names to display beneath each panel. By default (if labels is not specified), all splitting variables will be shown.
A function, validateState(), should be available to ensure the state is specified correctly and optionally check if it matches variables for a given display (provided through the display name).
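To give a feel for what such a check might cover, here is a minimal sketch. It is not the actual validateState(): the function name validateStateSketch(), the set of accepted state elements, and the cogNames argument (a character vector of variables available in the display, passed directly instead of being looked up by display name) are all assumptions made for illustration.

```r
# Hypothetical sketch of a state check; the real validateState() may
# differ. cogNames is an assumed character vector of the variable names
# available in the display.
validateStateSketch <- function(state, cogNames = NULL) {
  known <- c("layout", "sort", "filter", "labels")
  unknown <- setdiff(names(state), known)
  if (length(unknown) > 0)
    stop("unknown state element(s): ", paste(unknown, collapse = ", "))

  # Sort directions must be "asc" or "desc"
  dirs <- unlist(state$sort)
  if (!all(dirs %in% c("asc", "desc")))
    stop("sort directions must be \"asc\" or \"desc\"")

  # Optionally check that referenced variables exist in the display
  if (!is.null(cogNames)) {
    used <- unique(c(names(state$sort), names(state$filter), state$labels))
    absent <- setdiff(used, cogNames)
    if (length(absent) > 0)
      stop("variable(s) not in display: ", paste(absent, collapse = ", "))
  }
  invisible(TRUE)
}

state <- list(
  layout = list(nrow = 2, ncol = 2),
  sort = list(Species = "asc", alpha = "desc"),
  labels = c("Species", "alpha")
)
ok <- validateStateSketch(state, cogNames = c("Species", "alpha"))
```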
This display will launch and should have the following URL:
As you interact with the display, the URL should reflect what you have done, and if you paste this URL in a new window, the display should be shown in the state specified by the URL.
Another thing we would like to be able to do is make it easy to reference other displays through cognostics. For example, suppose that I also have a division of the iris data by species. For each display of species, I might want to have a link the user can click on that will open up our display by both species and alpha but only show the panels for the corresponding species. This should be done through a cognostics helper function, cogDisplayHref(). Here's an example:
bySpecies <- divide(iris, by = "Species")
makeDisplay(bySpecies,
  name = "testBySpecies",
  panelFn = function(x)
    plot(x$Sepal.Length, x$Sepal.Width),
  cogFn = function(x) {
    list(alphaLink = cogDisplayHref(
      desc = "species broken down by alpha",
      displayName = "testBySpeciesAlpha",
      displayGroup = "common",
      state = list(
        filter = list(Species = list(select = getSplitVar(x, "Species"))),
        layout = list(nrow = 1, ncol = 5)
      )
    ))
  }
)
view(name = "testBySpecies", state = list(labels = c("Species", "alphaLink"), layout = list(ncol = 3)))
Here, we create the display with a cognostic that builds a link to our previous display, but should filter on the species we would like to view them for. After calling the view statement, we should see the three panels with a link showing (since we specified it with labels =) that we can click on.
Comments would be stored, which would require thought on a portable database and how to keep it in sync with the web server, etc. This would also probably require a user login facility.
I'm getting errors when I try to use the hexbin view in the bivariate filter. The scatter view works fine, but when I switch to the hexbin view I see "Error: NA/NaN/Inf in foreign function call (arg 1)" in Trelliscope. In my R session, it put out
Error in hexbin:::hexbin(cogDF[, xVar], cogDF[, yVar], shape = shape, :
NA/NaN/Inf in foreign function call (arg 1)
The error persisted when I loaded the hexbin package in the global environment.
Here's my sessionInfo():
R version 3.1.1 (2014-07-10)
Platform: x86_64-apple-darwin13.1.0 (64-bit)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] grid parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] fastICA_1.2-0 hexbin_1.26.3 ggvis_0.3.0.9001 microbenchmark_1.3-0 mapproj_1.2-2
[6] maps_2.3-7 XML_3.98-1.1 ggmap_2.3 data.table_1.9.2 scagnostics_0.2-4
[11] rJava_0.9-6 base64enc_0.1-1 digest_0.6.4 jsonlite_0.9.11 shiny_0.10.0
[16] dplyr_0.2.0.99 lubridate_1.3.3 trelliscope_0.7.9.2 ggplot2_1.0.0 lattice_0.20-29
[21] datadr_0.7.3 Rcpp_0.11.2 devtools_1.5
loaded via a namespace (and not attached):
[1] assertthat_0.1 bitops_1.0-6 caTools_1.17 codetools_0.2-8 colorspace_1.2-4
[6] DBI_0.3.0 evaluate_0.5.5 formatR_0.10 gtable_0.1.2 htmltools_0.2.4
[11] httpuv_1.3.0 httr_0.3 knitr_1.6 labeling_0.2 magrittr_1.0.1
[16] markdown_0.7 MASS_7.3-33 memoise_0.2.1 munsell_0.4.2 plyr_1.8.1
[21] png_0.1-7 proto_0.3-10 R6_2.0 RCurl_1.95-4.1 reshape2_1.4
[26] RgoogleMaps_1.2.0.6 rjson_0.2.14 RJSONIO_1.2-0.2 scales_0.2.4 stringr_0.6.2
[31] testthat_0.8.1 tools_3.1.1 whisker_0.3-2 xtable_1.7-3
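The message suggests that non-finite values are reaching the binning call. If that is the cause (an assumption, not verified against the viewer code), one possible workaround is to drop those rows before binning. In the sketch below, cogDF, xVar and yVar mirror the names in the error message, but the data are made up.

```r
# cogDF stands in for the viewer's cognostics data frame; xVar and yVar
# mirror the names in the error message above. The values are made up.
cogDF <- data.frame(x = c(1, 2, NA, 4, Inf), y = c(2, NaN, 3, 5, 6))
xVar <- "x"
yVar <- "y"

# Keep only rows where both variables are finite (drops NA, NaN, Inf)
keep <- is.finite(cogDF[, xVar]) & is.finite(cogDF[, yVar])
binInput <- cogDF[keep, c(xVar, yVar)]
nrow(binInput)
```

The filtered binInput could then be passed to the hexbin call in place of the full cogDF columns.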
Is there a way to save the output displayed on the screen? Maybe being able to save the plot as a .png or .pdf such that when something is visually interesting it can be saved quickly to a file for future reference. Something other than a rough screenshot capture.
How are Rmarkdown documents supported in trelliscope? I remember that they are but I don't remember how.
The panel function essentially needs to take data for a given subset and turn that into html to put inside a div. Right now that content is simply a png file.
To make the panel function more generic, we can implement different types. For example, if the panel function wants to render a d3 plot, it should probably return a set of instructions for how to render the data (such as a link to d3 javascript code) as well as a json object of the data for the subset that is ready to be rendered. A useful one to consider would be generic support of RCharts.
However, one important consideration is for each potential rendering type, we would need to think about providing methods that allow us to rescale each panel. Scales are extremely important in trelliscope and if the user has specified "sliced" or "same", we want to modify the plots to have the appropriate axis limits.
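To illustrate what a rescaling method would have to compute, here is a sketch of deriving axis limits from per-panel ranges. It assumes "sliced" carries the same meaning as in lattice (equal range width in every panel, positioned at each panel's own data); this is not trelliscope's internal code.

```r
# Sketch of how axis limits might be derived from per-panel ranges;
# this is not trelliscope's internal code.
panelRanges <- lapply(split(iris$Sepal.Length, iris$Species), range)

# "same": every panel uses the overall range
sameLim <- range(unlist(panelRanges))

# "sliced" (assumed lattice-style meaning): every panel gets the same
# width (the widest panel range), centered on that panel's own midpoint
width <- max(vapply(panelRanges, diff, numeric(1)))
slicedLims <- lapply(panelRanges, function(r) mean(r) + c(-1, 1) * width / 2)
```

Any new rendering type (d3, rCharts, and so on) would need a way to accept limits like sameLim or slicedLims and apply them to its axes.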
If users have a parallel backend registered, Trelliscope should use it to render plots in parallel when they're called up in the interactive Shiny session. This would be especially nice for ggplot graphics since they take so much longer to render than trellis.
When the user specifies "cogGeo" cognostics, use a leaflet.js map in the bivariate filter area for interaction. There is some thought as to how to go about this in a way that will be useful vs. simply flashy.
The ordering of panels in the table layout is by row. It would be good to have an option to specify that you want them arranged by column as well.
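The two orderings can be pictured with a small matrix of panel indices. This is only an illustration of the layouts, not viewer code.

```r
nr <- 2
nc <- 3
panels <- 1:6

# Arranged by row: panels fill the first row left to right, then the next
byRow <- matrix(panels, nrow = nr, ncol = nc, byrow = TRUE)

# Arranged by column: panels fill the first column top to bottom first
byCol <- matrix(panels, nrow = nr, ncol = nc)
```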
When launching the display with view() (or when making the display), there should be an option to set an initial viewing state, including the rows and columns in the display and initial cognostics and filters.