Comments (4)
Just a quick note that I'm having memory issues with spread(..., drop=FALSE). If I use spread(..., drop=TRUE), everything works fine: the process takes just a few seconds, and the result is 0.2 MB in size.
My input dataset is 0.4 MB, with 6000 rows and 11 variables. It is the result of a filter on a dataset of size 200 MB. When running with spread(..., drop=FALSE), the rsession memory expands to over 20 GB.
Unfortunately I can't provide the exact dataset, but if there is anything I can provide to help, I'll be happy to do so.
from tidyr.
I have not. It might be possible to replace the vectorised R code with optimised C++ code that would need less memory.
How many unique values are there in the variables that you are spreading? It is easy to create very very large data frames with spread.
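A rough way to see how the size can blow up, as a minimal sketch in base R. It assumes that drop=FALSE expands every combination of unique values across the columns that are neither key nor value, which is an assumption about spread's behaviour and is not confirmed in this thread. The toy data frame and its column names are hypothetical.

```r
# Hypothetical toy data: two identifying columns, one key, one value.
df <- data.frame(
  id_a  = c(1, 2, 3, 4),
  id_b  = c(10, 20, 30, 40),
  key   = c("x", "y", "x", "y"),
  value = c(0.1, 0.2, 0.3, 0.4)
)

# Columns that identify a row in the spread result.
id_cols <- setdiff(names(df), c("key", "value"))

# Rows when only the observed combinations are kept (the drop = TRUE case).
observed_rows <- nrow(unique(df[id_cols]))

# Rows when every combination of unique values is generated
# (ASSUMED to be what drop = FALSE does).
n_unique <- vapply(df[id_cols], function(x) length(unique(x)), integer(1))
expanded_rows <- prod(n_unique)

print(c(observed = observed_rows, expanded = expanded_rows))
```

Under that assumption the expanded row count is a product of the per-column unique counts, so several columns with a few thousand unique values each would multiply quickly, even when the input has only a few thousand rows.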
There are some numeric variables with a few thousand unique values, but isn't spread just going to make a variable for each key? Also, by virtue of spread(..., drop=TRUE) working fine, the only variables remaining to spread only have one value: NA.