Comments (6)
Hi,
You're not doing anything wrong. I just haven't had the time to add the functionality that lets you create one data frame from a folder.
That's the next thing on the plan, but it's not currently supported.
from easycsv.
Hi, instead of assigning to the global environment, why don't you assign to a list and apply rbindlist on it?
fread_folder <-
function (directory = NULL, extension = "CSV", sep = "auto",
          nrows = -1L, header = "auto", na.strings = "NA", stringsAsFactors = FALSE,
          verbose = getOption("datatable.verbose"), skip = 0L, drop = NULL,
          colClasses = NULL, integer64 = getOption("datatable.integer64"),
          dec = if (sep != ".") "." else ",", check.names = FALSE,
          encoding = "unknown", quote = "\"", strip.white = TRUE,
          fill = FALSE, blank.lines.skip = FALSE, key = NULL, Names = NULL,
          prefix = NULL, showProgress = interactive(), data.table = TRUE)
{
  if ("data.table" %in% rownames(installed.packages()) ==
      FALSE) {
    stop("data.table needed for this function to work. Please install it.",
         call. = FALSE)
  }
  # No directory supplied: fall back to an interactive folder picker.
  if (is.null(directory)) {
    os = Identify.OS()
    if (tolower(os) == "windows") {
      directory <- utils::choose.dir()
    } else if (tolower(os) == "linux" | tolower(os) == "macosx") {
      directory <- choose_dir()
    } else {
      stop("Please supply a valid local directory")
    }
  }
  directory = paste(gsub(pattern = "\\", "/", directory, fixed = TRUE))
  # Build the file name patterns for the requested extension(s).
  endings = list()
  if (tolower(extension) == "txt") {
    endings[1] = "*\\.txt$"
  }
  if (tolower(extension) == "csv") {
    endings[1] = "*\\.csv$"
  }
  if (tolower(extension) == "both") {
    endings[1] = "*\\.txt$"
    endings[2] = "*\\.csv$"
  }
  if ((tolower(extension) %in% c("txt", "csv", "both")) ==
      FALSE) {
    stop("Please supply a valid value for 'extension',\n\n allowed values are: 'TXT','CSV','BOTH'.")
  }
  tempfiles = list()
  temppath = list()
  tempdf_list = list()
  num = 1
  for (i in endings) {
    temppath = paste(directory, list.files(path = directory,
                                           pattern = i), sep = "/")
    tempfiles = list.files(path = directory, pattern = i)
    num = num + 1
    if (length(temppath) < 1 | length(tempfiles) < 1) {
      num = num + 1
    } else {
      temppath = unlist(temppath)
      tempfiles = unlist(tempfiles)
      count = 0
      for (tbl in temppath) {
        count = count + 1
        DTname1 = paste0(gsub(directory, "", tbl))
        DTname2 = paste0(gsub("/", "", DTname1))
        if (!is.null(Names)) {
          if ((length(Names) != length(temppath)) ||
              !is.character(Names)) {
            stop("Names must be a character vector of the same length as the files to be read.")
          } else {
            DTname3 = Names[count]
          }
        } else {
          DTname3 = paste0(gsub(i, "", DTname2))
        }
        if (!is.null(prefix) && is.character(prefix)) {
          DTname4 = paste(prefix, DTname3, sep = "")
        } else {
          DTname4 = DTname3
        }
        DTable <- data.table::fread(input = tbl, sep = sep,
                                    nrows = nrows, header = header, na.strings = na.strings,
                                    stringsAsFactors = stringsAsFactors, verbose = verbose,
                                    skip = skip, drop = drop, colClasses = colClasses,
                                    dec = dec,
                                    check.names = check.names, encoding = encoding,
                                    quote = quote, strip.white = strip.white,
                                    fill = fill, blank.lines.skip = blank.lines.skip,
                                    key = key, showProgress = showProgress, data.table = data.table)
        # assign_to_global <- function(pos = 1) {
        #   assign(x = DTname4, value = DTable, envir = as.environment(pos))
        # }
        # assign_to_global()
        # Collect each table in a list instead of assigning it to the global environment.
        tempdf_list <- append(tempdf_list, list(DTable))
        rm(DTable)
      }
    }
  }
  # Stack all collected tables into one.
  tempdf = data.table::rbindlist(tempdf_list)
  if (!data.table) {
    tempdf = as.data.frame(tempdf)
  }
  return(tempdf)
}
@alexfun looks good. Do you want to create a pull request?
@bogind I would be more than happy to submit the function above; however, I am not sure whether you had something in mind with the code that assigns a variable name based on the file name in the folder. If you would like, I can add a new parameter combine taking one of the following values: c("data.frame", "global", "list"), so that:
"global" preserves existing behaviour.
"list" returns a named list of the csvs, using the currently used naming convention.
"data.frame" returns one data frame via rbindlist.
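To make the proposal concrete, here is a minimal sketch of how the three combine values could be dispatched once the files have been read into a named list. The helper name combine_tables and its envir argument are hypothetical and are not part of easycsv; the sketch only assumes data.table is installed.

```r
library(data.table)

# Hypothetical helper (not part of easycsv): decide what to do with a
# named list of tables that have already been read from a folder.
combine_tables <- function(tables,
                           combine = c("global", "list", "data.frame"),
                           envir = globalenv()) {
  # match.arg() picks "global" when the argument is left at its default.
  combine <- match.arg(combine)
  switch(combine,
    global = {
      # Existing behaviour: one variable per file, named after the file.
      for (nm in names(tables)) {
        assign(nm, tables[[nm]], envir = envir)
      }
      invisible(names(tables))
    },
    list = tables,
    data.frame = data.table::rbindlist(tables)
  )
}

# Usage with two small in-memory tables standing in for files.
tables <- list(a = data.table(x = 1:2), b = data.table(x = 3:4))
as_list <- combine_tables(tables, combine = "list")
stacked <- combine_tables(tables, combine = "data.frame")
```

Keeping "global" first in the vector makes it the default under match.arg(), so existing callers would see no change.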
@alexfun The combine parameter seems logical. I think the default behavior should be to use global as the value.
OK, I will write the code with global as the default behaviour and submit it to you for review.