bogind / easycsv
An R package for easy data loading from multiple tables
License: GNU Lesser General Public License v2.1
Hello, in our efforts to prepare data.table for an update on CRAN (-> v1.11.0), we noticed that recent changes will break your package:
install.packages('data.table', type = 'source', repos = 'http://Rdatatable.github.io/data.table')
pkg = 'easycsv'
install.packages(pkg, dependencies = TRUE)
tools::testInstalledPackage(pkg, outDir = "~/Desktop/dt_revdep")
library(pkg, character.only = TRUE)
fn = 'fread_folder'
example(fn, character.only = TRUE)
Error: isTrueFalse(showProgress) is not TRUE
This is because the default argument of showProgress has changed to be equivalent to the output of interactive(), and the default value of getOption('datatable.showprogress') has been deleted.
Please follow Rdatatable/data.table#2785 and feel free to provide input there if you have strong opinions about how to proceed.
Note that this issue also affects your SIRItoGTFS package; I'll file an official issue there once the aforementioned pull request is resolved, if needed.
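One way a wrapper around fread could guard against this kind of failure is to coerce whatever value it holds into a single TRUE or FALSE before forwarding it. The following is a minimal base R sketch; `normalize_show_progress` is a hypothetical helper name, not part of easycsv or data.table, and it assumes the wrapper previously relied on the option whose default was removed.

```r
# Hypothetical helper (not part of easycsv or data.table):
# return a single TRUE/FALSE suitable for fread's showProgress argument.
normalize_show_progress <- function(x = getOption("datatable.showprogress")) {
  # Accept only a length-one, non-NA logical as-is
  if (is.logical(x) && length(x) == 1L && !is.na(x)) {
    return(x)
  }
  # Otherwise (e.g. NULL because the option is unset) fall back to
  # interactive(), which the issue describes as the new default
  interactive()
}

normalize_show_progress(FALSE)  # FALSE
normalize_show_progress(NULL)   # same value as interactive()
```

The result could then be passed on explicitly, for example as `showProgress = normalize_show_progress()` in the call to fread.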
Hi, I am using the fread_folder function of your easycsv package, but I am unable to access the resulting data.table.
For example,
> vir <- fread_folder("folder_name")
> vir
NULL
How do you access the resulting data.table? Will you add some examples?
Thank you.
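A NULL return value is what one would see from a function that works by side effect, and another report on this page mentions that fread_folder leaves the tables in the environment. The sketch below illustrates that pattern in base R and how such objects can be fetched by name. `load_into_env` is a toy stand-in, not easycsv's code, and the object names are made up; the names easycsv actually assigns are not stated here.

```r
# Toy stand-in for a loader that works by side effect (NOT easycsv's code):
# it assigns one data frame per name into `env` and returns NULL.
load_into_env <- function(tables, env) {
  for (nm in names(tables)) {
    assign(nm, tables[[nm]], envir = env)
  }
  invisible(NULL)
}

env <- new.env()
res <- load_into_env(
  list(cities_a = data.frame(City = "A", Area = 1.5),
       cities_b = data.frame(City = "B", Area = 2.5)),
  env
)

is.null(res)                               # TRUE: nothing useful is returned
ls(env)                                    # "cities_a" "cities_b"
one <- get("cities_a", envir = env)        # fetch a single table by name
all_tables <- mget(ls(env), envir = env)   # fetch all tables as a named list
```

In an interactive session the same idea applies to the global environment: `ls()` lists the objects that were created, and `get()` or `mget()` retrieves them.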
After running fread_folder I'm left with a few hundred data frames in my environment, but no merged data frame is generated. I'm not sure whether it is just the CSV files I'm using; maybe I'm an edge case. This is the first time I've used easycsv.
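If a single combined table is the goal, one option is to read the files into a list and bind them yourself. The sketch below uses base R only and is not how easycsv works internally. The temporary files and the `Source` column are made up for illustration, and the tab separator is an assumption taken from the verbose output in this report, which shows sep='\t'.

```r
# Create two small tab-separated files in a temporary folder (made-up data)
folder <- file.path(tempdir(), "easycsv_demo")
dir.create(folder, showWarnings = FALSE)
writeLines(c("City\tArea", "A\t1.5", "B\t2"), file.path(folder, "1.csv"))
writeLines(c("City\tArea", "C\t3"), file.path(folder, "2.csv"))

# Read every .csv file in the folder into a named list of data frames
files <- list.files(folder, pattern = "\\.csv$", full.names = TRUE)
tables <- lapply(files, read.delim, stringsAsFactors = FALSE)
names(tables) <- basename(files)

# Tag each table with its source file, then stack them into one data frame
tables <- Map(function(d, nm) { d$Source <- nm; d }, tables, names(tables))
merged <- do.call(rbind, tables)
rownames(merged) <- NULL

nrow(merged)  # 3
```

With data.table installed, `data.table::rbindlist(tables, idcol = "Source")` applied to the untagged list should give a similar result without the `Map` step.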
...
sep=',' with 2 lines of 3 fields using quote rule 0
sep=0x9 with 3 lines of 13 fields using quote rule 0
Detected 13 columns on line 1. This line is either column names or first data row. Line starts as: <<City Area builtup75 builtup90 >>
Quote rule picked = 0
fill=false and the most number of columns found is 13
[07] Detect column types, good nrow estimate and whether first row is column names
Number of sampling jump points = 1 because (546 bytes from row 1 to eof) / (2 * 546 jump0size) == 0
Type codes (jump 000) : A775775555A5A Quote rule 0
'header' determined to be true due to column 2 containing a string on row 1 and a lower type (float64) in the rest of the 2 sample rows
All rows were sampled since file is small so we know nrow=2 exactly
[08] Assign column names
[09] Apply user overrides on column types
After 0 type and 0 drop user overrides : A775775555A5A
[10] Allocate memory for the datatable
Allocating 13 column slots (13 - 0 dropped) with 2 rows
[11] Read the data
jumps=[0..1), chunk_size=1048576, total_size=422
Read 2 rows x 13 columns from 547 bytes file in 00:00.001 wall clock time
[12] Finalizing the datatable
Type counts:
6 : int32 '5'
4 : float64 '7'
3 : string 'A'
=============================
0.000s ( 17%) Memory map 0.000GB file
0.001s ( 75%) sep='\t' ncol=13 and header detection
0.000s ( 3%) Column type detection using 2 sample rows
0.000s ( 2%) Allocation of 2 rows x 13 cols (0.000GB) of which 2 (100%) rows used
0.000s ( 3%) Reading 1 chunks (0 swept) of 1.000MB (each chunk 2 rows) using 1 threads
+ 0.000s ( 0%) Parse to row-major thread buffers (grown 0 times)
+ 0.000s ( 0%) Transpose
+ 0.000s ( 3%) Waiting
0.000s ( 0%) Rereading 0 columns due to out-of-sample type exceptions
0.001s Total
omp_get_max_threads() = 4
omp_get_thread_limit() = 2147483647
DTthreads = 0
Input contains no \n. Taking this to be a filename to open
[01] Check arguments
Using 4 threads (omp_get_max_threads()=4, nth=4)
NAstrings = [<<NA>>]
None of the NAstrings look like numbers.
skip num lines = 0
show progress = 1
0/1 column will be read as integer
[02] Opening the file
Opening file /home/xk/dataprojects/ghsl/73.csv
File opened, size = 54.40KB (55701 bytes).
Memory mapped ok
[03] Detect and skip BOM
[04] Arrange mmap to be \0 terminated
\n has been found in the input and different lines can end with different line endings (e.g. mixed \n and \r\n in one file). This is common and ideal.
[05] Skipping initial rows if needed
Positioned on line 1 starting: <<City Area builtup75 builtup90 >>
[06] Detect separator, quoting rule, and ncolumns
Detecting sep automatically ...
sep=',' with 99 lines of 3 fields using quote rule 0
sep=0x9 with 100 lines of 13 fields using quote rule 0
Detected 13 columns on line 1. This line is either column names or first data row. Line starts as: <<City Area builtup75 builtup90 >>
Quote rule picked = 0
fill=false and the most number of columns found is 13
[07] Detect column types, good nrow estimate and whether first row is column names
Number of sampling jump points = 1 because (55700 bytes from row 1 to eof) / (2 * 21245 jump0size) == 1
Type codes (jump 000) : A577775555A5A Quote rule 0
Type codes (jump 001) : A777775555A5A Quote rule 0
'header' determined to be true due to column 2 containing a string on row 1 and a lower type (float64) in the rest of the 149 sample rows
=====
Sampled 149 rows (handled \n inside quoted fields) at 2 jump points
Bytes from first data row on line 2 to the end of last row: 55576
Line length: mean=214.02 sd=8.98 min=192 max=230
Estimated number of rows: 55576 / 214.02 = 260
Initial alloc = 286 rows (260 + 10%) using bytes/max(mean-2*sd,min) clamped between [1.1*estn, 2.0*estn]
=====
[08] Assign column names
[09] Apply user overrides on column types
After 0 type and 0 drop user overrides : A777775555A5A
[10] Allocate memory for the datatable
Allocating 13 column slots (13 - 0 dropped) with 286 rows
[11] Read the data
jumps=[0..1), chunk_size=1048576, total_size=55576
Read 260 rows x 13 columns from 54.40KB (55701 bytes) file in 00:00.005 wall clock time
[12] Finalizing the datatable
Type counts:
5 : int32 '5'
5 : float64 '7'
3 : string 'A'
=============================
0.000s ( 5%) Memory map 0.000GB file
0.003s ( 67%) sep='\t' ncol=13 and header detection
0.000s ( 2%) Column type detection using 149 sample rows
0.000s ( 1%) Allocation of 286 rows x 13 cols (0.000GB) of which 260 ( 91%) rows used
0.001s ( 25%) Reading 1 chunks (0 swept) of 1.000MB (each chunk 260 rows) using 1 threads
+ 0.000s ( 3%) Parse to row-major thread buffers (grown 0 times)
+ 0.001s ( 21%) Transpose
+ 0.000s ( 1%) Waiting
0.000s ( 0%) Rereading 0 columns due to out-of-sample type exceptions
0.005s Total
omp_get_max_threads() = 4
omp_get_thread_limit() = 2147483647
DTthreads = 0
Input contains no \n. Taking this to be a filename to open
[01] Check arguments
Using 4 threads (omp_get_max_threads()=4, nth=4)
NAstrings = [<<NA>>]
None of the NAstrings look like numbers.
skip num lines = 0
show progress = 1
0/1 column will be read as integer
[02] Opening the file
Opening file /home/xk/dataprojects/ghsl/74.csv
File opened, size = 1.560KB (1597 bytes).
Memory mapped ok
[03] Detect and skip BOM
[04] Arrange mmap to be \0 terminated
\n has been found in the input and different lines can end with different line endings (e.g. mixed \n and \r\n in one file). This is common and ideal.
[05] Skipping initial rows if needed
Positioned on line 1 starting: <<City Area builtup75 builtup90 >>
[06] Detect separator, quoting rule, and ncolumns
Detecting sep automatically ...
sep=',' with 7 lines of 3 fields using quote rule 0
sep=0x9 with 8 lines of 13 fields using quote rule 0
Detected 13 columns on line 1. This line is either column names or first data row. Line starts as: <<City Area builtup75 builtup90 >>
Quote rule picked = 0
fill=false and the most number of columns found is 13
[07] Detect column types, good nrow estimate and whether first row is column names
Number of sampling jump points = 1 because (1596 bytes from row 1 to eof) / (2 * 1596 jump0size) == 0
Type codes (jump 000) : A777775555A5A Quote rule 0
'header' determined to be true due to column 2 containing a string on row 1 and a lower type (float64) in the rest of the 7 sample rows
All rows were sampled since file is small so we know nrow=7 exactly
[08] Assign column names
[09] Apply user overrides on column types
After 0 type and 0 drop user overrides : A777775555A5A
[10] Allocate memory for the datatable
Allocating 13 column slots (13 - 0 dropped) with 7 rows
[11] Read the data
jumps=[0..1), chunk_size=1048576, total_size=1472
Read 7 rows x 13 columns from 1.560KB (1597 bytes) file in 00:00.001 wall clock time
[12] Finalizing the datatable
Type counts:
5 : int32 '5'
5 : float64 '7'
3 : string 'A'
=============================
0.000s ( 27%) Memory map 0.000GB file
0.000s ( 53%) sep='\t' ncol=13 and header detection
0.000s ( 6%) Column type detection using 7 sample rows
0.000s ( 5%) Allocation of 7 rows x 13 cols (0.000GB) of which 7 (100%) rows used
0.000s ( 10%) Reading 1 chunks (0 swept) of 1.000MB (each chunk 7 rows) using 1 threads
+ 0.000s ( 1%) Parse to row-major thread buffers (grown 0 times)
+ 0.000s ( 3%) Transpose
+ 0.000s ( 6%) Waiting
0.000s ( 0%) Rereading 0 columns due to out-of-sample type exceptions
0.001s Total
omp_get_max_threads() = 4
omp_get_thread_limit() = 2147483647
DTthreads = 0
Input contains no \n. Taking this to be a filename to open
[01] Check arguments
Using 4 threads (omp_get_max_threads()=4, nth=4)
NAstrings = [<<NA>>]
None of the NAstrings look like numbers.
skip num lines = 0
show progress = 1
0/1 column will be read as integer
[02] Opening the file
Opening file /home/xk/dataprojects/ghsl/75.csv
File opened, size = 342 bytes.
Memory mapped ok
[03] Detect and skip BOM
[04] Arrange mmap to be \0 terminated
\n has been found in the input and different lines can end with different line endings (e.g. mixed \n and \r\n in one file). This is common and ideal.
[05] Skipping initial rows if needed
Positioned on line 1 starting: <<City Area builtup75 builtup90 >>
[06] Detect separator, quoting rule, and ncolumns
Detecting sep automatically ...
sep=',' with 1 lines of 3 fields using quote rule 0
sep=0x9 with 2 lines of 13 fields using quote rule 0
Detected 13 columns on line 1. This line is either column names or first data row. Line starts as: <<City Area builtup75 builtup90 >>
Quote rule picked = 0
fill=false and the most number of columns found is 13
[07] Detect column types, good nrow estimate and whether first row is column names
Number of sampling jump points = 1 because (341 bytes from row 1 to eof) / (2 * 341 jump0size) == 0
Type codes (jump 000) : A775575555A5A Quote rule 0
'header' determined to be true due to column 2 containing a string on row 1 and a lower type (float64) in the rest of the 1 sample rows
All rows were sampled since file is small so we know nrow=1 exactly
[08] Assign column names
[09] Apply user overrides on column types
After 0 type and 0 drop user overrides : A775575555A5A
[10] Allocate memory for the datatable
Allocating 13 column slots (13 - 0 dropped) with 1 rows
[11] Read the data
jumps=[0..1), chunk_size=1048576, total_size=217
Read 1 rows x 13 columns from 342 bytes file in 00:00.000 wall clock time
[12] Finalizing the datatable
Type counts:
7 : int32 '5'
3 : float64 '7'
3 : string 'A'
=============================
0.000s ( 31%) Memory map 0.000GB file
0.000s ( 54%) sep='\t' ncol=13 and header detection
0.000s ( 5%) Column type detection using 1 sample rows
0.000s ( 4%) Allocation of 1 rows x 13 cols (0.000GB) of which 1 (100%) rows used
0.000s ( 6%) Reading 1 chunks (0 swept) of 1.000MB (each chunk 1 rows) using 1 threads
+ 0.000s ( 0%) Parse to row-major thread buffers (grown 0 times)
+ 0.000s ( 1%) Transpose
+ 0.000s ( 6%) Waiting
0.000s ( 0%) Rereading 0 columns due to out-of-sample type exceptions
0.000s Total
omp_get_max_threads() = 4
omp_get_thread_limit() = 2147483647
DTthreads = 0
Input contains no \n. Taking this to be a filename to open
[01] Check arguments
Using 4 threads (omp_get_max_threads()=4, nth=4)
NAstrings = [<<NA>>]
None of the NAstrings look like numbers.
skip num lines = 0
show progress = 1
0/1 column will be read as integer
[02] Opening the file
Opening file /home/xk/dataprojects/ghsl/77.csv
File opened, size = 17.74KB (18166 bytes).
Memory mapped ok
[03] Detect and skip BOM
[04] Arrange mmap to be \0 terminated
\n has been found in the input and different lines can end with different line endings (e.g. mixed \n and \r\n in one file). This is common and ideal.
[05] Skipping initial rows if needed
Positioned on line 1 starting: <<City Area builtup75 builtup90 >>
[06] Detect separator, quoting rule, and ncolumns
Detecting sep automatically ...
sep=',' with 84 lines of 3 fields using quote rule 0
sep=0x9 with 85 lines of 13 fields using quote rule 0
Detected 13 columns on line 1. This line is either column names or first data row. Line starts as: <<City Area builtup75 builtup90 >>
Quote rule picked = 0
fill=false and the most number of columns found is 13
[07] Detect column types, good nrow estimate and whether first row is column names
Number of sampling jump points = 1 because (18165 bytes from row 1 to eof) / (2 * 18165 jump0size) == 0
Type codes (jump 000) : A777775555A5A Quote rule 0
'header' determined to be true due to column 2 containing a string on row 1 and a lower type (float64) in the rest of the 84 sample rows
All rows were sampled since file is small so we know nrow=84 exactly
[08] Assign column names
[09] Apply user overrides on column types
After 0 type and 0 drop user overrides : A777775555A5A
[10] Allocate memory for the datatable
Allocating 13 column slots (13 - 0 dropped) with 84 rows
[11] Read the data
jumps=[0..1), chunk_size=1048576, total_size=18041
Read 84 rows x 13 columns from 17.74KB (18166 bytes) file in 00:00.001 wall clock time
[12] Finalizing the datatable
Type counts:
5 : int32 '5'
5 : float64 '7'
3 : string 'A'
=============================
0.000s ( 12%) Memory map 0.000GB file
0.001s ( 72%) sep='\t' ncol=13 and header detection
0.000s ( 2%) Column type detection using 84 sample rows
0.000s ( 2%) Allocation of 84 rows x 13 cols (0.000GB) of which 84 (100%) rows used
0.000s ( 13%) Reading 1 chunks (0 swept) of 1.000MB (each chunk 84 rows) using 1 threads
+ 0.000s ( 2%) Parse to row-major thread buffers (grown 0 times)
+ 0.000s ( 9%) Transpose
+ 0.000s ( 2%) Waiting
0.000s ( 0%) Rereading 0 columns due to out-of-sample type exceptions
0.001s Total
omp_get_max_threads() = 4
omp_get_thread_limit() = 2147483647
DTthreads = 0
Input contains no \n. Taking this to be a filename to open
[01] Check arguments
Using 4 threads (omp_get_max_threads()=4, nth=4)
NAstrings = [<<NA>>]
None of the NAstrings look like numbers.
skip num lines = 0
show progress = 1
0/1 column will be read as integer
[02] Opening the file
Opening file /home/xk/dataprojects/ghsl/8.csv
File opened, size = 1.833KB (1877 bytes).
Memory mapped ok
[03] Detect and skip BOM
[04] Arrange mmap to be \0 terminated
\n has been found in the input and different lines can end with different line endings (e.g. mixed \n and \r\n in one file). This is common and ideal.
[05] Skipping initial rows if needed
Positioned on line 1 starting: <<City Area builtup75 builtup90 >>
[06] Detect separator, quoting rule, and ncolumns
Detecting sep automatically ...
sep=',' with 8 lines of 3 fields using quote rule 0
sep=0x9 with 9 lines of 13 fields using quote rule 0
Detected 13 columns on line 1. This line is either column names or first data row. Line starts as: <<City Area builtup75 builtup90 >>
Quote rule picked = 0
fill=false and the most number of columns found is 13
[07] Detect column types, good nrow estimate and whether first row is column names
Number of sampling jump points = 1 because (1876 bytes from row 1 to eof) / (2 * 1876 jump0size) == 0
Type codes (jump 000) : A777775555A5A Quote rule 0
'header' determined to be true due to column 2 containing a string on row 1 and a lower type (float64) in the rest of the 8 sample rows
All rows were sampled since file is small so we know nrow=8 exactly
[08] Assign column names
[09] Apply user overrides on column types
After 0 type and 0 drop user overrides : A777775555A5A
[10] Allocate memory for the datatable
Allocating 13 column slots (13 - 0 dropped) with 8 rows
[11] Read the data
jumps=[0..1), chunk_size=1048576, total_size=1752
Read 8 rows x 13 columns from 1.833KB (1877 bytes) file in 00:00.002 wall clock time
[12] Finalizing the datatable
Type counts:
5 : int32 '5'
5 : float64 '7'
3 : string 'A'
=============================
0.001s ( 39%) Memory map 0.000GB file
0.001s ( 47%) sep='\t' ncol=13 and header detection
0.000s ( 3%) Column type detection using 8 sample rows
0.000s ( 3%) Allocation of 8 rows x 13 cols (0.000GB) of which 8 (100%) rows used
0.000s ( 8%) Reading 1 chunks (0 swept) of 1.000MB (each chunk 8 rows) using 1 threads
+ 0.000s ( 1%) Parse to row-major thread buffers (grown 0 times)
+ 0.000s ( 3%) Transpose
+ 0.000s ( 4%) Waiting
0.000s ( 0%) Rereading 0 columns due to out-of-sample type exceptions
0.002s Total
omp_get_max_threads() = 4
omp_get_thread_limit() = 2147483647
DTthreads = 0
Input contains no \n. Taking this to be a filename to open
[01] Check arguments
Using 4 threads (omp_get_max_threads()=4, nth=4)
NAstrings = [<<NA>>]
None of the NAstrings look like numbers.
skip num lines = 0
show progress = 1
0/1 column will be read as integer
[02] Opening the file
Opening file /home/xk/dataprojects/ghsl/80.csv
File opened, size = 775 bytes.
Memory mapped ok
[03] Detect and skip BOM
[04] Arrange mmap to be \0 terminated
\n has been found in the input and different lines can end with different line endings (e.g. mixed \n and \r\n in one file). This is common and ideal.
[05] Skipping initial rows if needed
Positioned on line 1 starting: <<City Area builtup75 builtup90 >>
[06] Detect separator, quoting rule, and ncolumns
Detecting sep automatically ...
sep=',' with 3 lines of 3 fields using quote rule 0
sep=0x9 with 4 lines of 13 fields using quote rule 0
Detected 13 columns on line 1. This line is either column names or first data row. Line starts as: <<City Area builtup75 builtup90 >>
Quote rule picked = 0
fill=false and the most number of columns found is 13
[07] Detect column types, good nrow estimate and whether first row is column names
Number of sampling jump points = 1 because (774 bytes from row 1 to eof) / (2 * 774 jump0size) == 0
Type codes (jump 000) : A777755555A5A Quote rule 0
'header' determined to be true due to column 2 containing a string on row 1 and a lower type (float64) in the rest of the 3 sample rows
All rows were sampled since file is small so we know nrow=3 exactly
[08] Assign column names
[09] Apply user overrides on column types
After 0 type and 0 drop user overrides : A777755555A5A
[10] Allocate memory for the datatable
Allocating 13 column slots (13 - 0 dropped) with 3 rows
[11] Read the data
jumps=[0..1), chunk_size=1048576, total_size=650
Read 3 rows x 13 columns from 775 bytes file in 00:00.001 wall clock time
[12] Finalizing the datatable
Type counts:
6 : int32 '5'
4 : float64 '7'
3 : string 'A'
=============================
0.000s ( 43%) Memory map 0.000GB file
0.000s ( 45%) sep='\t' ncol=13 and header detection
0.000s ( 4%) Column type detection using 3 sample rows
0.000s ( 3%) Allocation of 3 rows x 13 cols (0.000GB) of which 3 (100%) rows used
0.000s ( 5%) Reading 1 chunks (0 swept) of 1.000MB (each chunk 3 rows) using 1 threads
+ 0.000s ( 0%) Parse to row-major thread buffers (grown 0 times)
+ 0.000s ( 1%) Transpose
+ 0.000s ( 3%) Waiting
0.000s ( 0%) Rereading 0 columns due to out-of-sample type exceptions
0.001s Total
omp_get_max_threads() = 4
omp_get_thread_limit() = 2147483647
DTthreads = 0
Input contains no \n. Taking this to be a filename to open
[01] Check arguments
Using 4 threads (omp_get_max_threads()=4, nth=4)
NAstrings = [<<NA>>]
None of the NAstrings look like numbers.
skip num lines = 0
show progress = 1
0/1 column will be read as integer
[02] Opening the file
Opening file /home/xk/dataprojects/ghsl/81.csv
File opened, size = 27.89KB (28557 bytes).
Memory mapped ok
[03] Detect and skip BOM
[04] Arrange mmap to be \0 terminated
\n has been found in the input and different lines can end with different line endings (e.g. mixed \n and \r\n in one file). This is common and ideal.
[05] Skipping initial rows if needed
Positioned on line 1 starting: <<City Area builtup75 builtup90 >>
[06] Detect separator, quoting rule, and ncolumns
Detecting sep automatically ...
sep=',' with 99 lines of 3 fields using quote rule 0
sep=0x9 with 100 lines of 13 fields using quote rule 0
Detected 13 columns on line 1. This line is either column names or first data row. Line starts as: <<City Area builtup75 builtup90 >>
Quote rule picked = 0
fill=false and the most number of columns found is 13
[07] Detect column types, good nrow estimate and whether first row is column names
Number of sampling jump points = 1 because (28556 bytes from row 1 to eof) / (2 * 21758 jump0size) == 0
Type codes (jump 000) : A777775555A5A Quote rule 0
Type codes (jump 001) : A777775555A5A Quote rule 0
'header' determined to be true due to column 2 containing a string on row 1 and a lower type (float64) in the rest of the 131 sample rows
=====
Sampled 131 rows (handled \n inside quoted fields) at 2 jump points
Bytes from first data row on line 2 to the end of last row: 28432
Line length: mean=217.04 sd=8.14 min=206 max=258
Estimated number of rows: 28432 / 217.04 = 131
Initial alloc = 144 rows (131 + 9%) using bytes/max(mean-2*sd,min) clamped between [1.1*estn, 2.0*estn]
=====
[08] Assign column names
[09] Apply user overrides on column types
After 0 type and 0 drop user overrides : A777775555A5A
[10] Allocate memory for the datatable
Allocating 13 column slots (13 - 0 dropped) with 144 rows
[11] Read the data
jumps=[0..1), chunk_size=1048576, total_size=28432
Read 131 rows x 13 columns from 27.89KB (28557 bytes) file in 00:00.002 wall clock time
[12] Finalizing the datatable
Type counts:
5 : int32 '5'
5 : float64 '7'
3 : string 'A'
=============================
0.000s ( 8%) Memory map 0.000GB file
0.001s ( 57%) sep='\t' ncol=13 and header detection
0.000s ( 2%) Column type detection using 131 sample rows
0.000s ( 2%) Allocation of 144 rows x 13 cols (0.000GB) of which 131 ( 91%) rows used
0.001s ( 32%) Reading 1 chunks (0 swept) of 1.000MB (each chunk 131 rows) using 1 threads
+ 0.000s ( 3%) Parse to row-major thread buffers (grown 0 times)
+ 0.001s ( 26%) Transpose
+ 0.000s ( 3%) Waiting
0.000s ( 0%) Rereading 0 columns due to out-of-sample type exceptions
0.002s Total
omp_get_max_threads() = 4
omp_get_thread_limit() = 2147483647
DTthreads = 0
Input contains no \n. Taking this to be a filename to open
[01] Check arguments
Using 4 threads (omp_get_max_threads()=4, nth=4)
NAstrings = [<<NA>>]
None of the NAstrings look like numbers.
skip num lines = 0
show progress = 1
0/1 column will be read as integer
[02] Opening the file
Opening file /home/xk/dataprojects/ghsl/82.csv
File opened, size = 1.156KB (1184 bytes).
Memory mapped ok
[03] Detect and skip BOM
[04] Arrange mmap to be \0 terminated
\n has been found in the input and different lines can end with different line endings (e.g. mixed \n and \r\n in one file). This is common and ideal.
[05] Skipping initial rows if needed
Positioned on line 1 starting: <<City Area builtup75 builtup90 >>
[06] Detect separator, quoting rule, and ncolumns
Detecting sep automatically ...
sep=',' with 5 lines of 3 fields using quote rule 0
sep=0x9 with 6 lines of 13 fields using quote rule 0
Detected 13 columns on line 1. This line is either column names or first data row. Line starts as: <<City Area builtup75 builtup90 >>
Quote rule picked = 0
fill=false and the most number of columns found is 13
[07] Detect column types, good nrow estimate and whether first row is column names
Number of sampling jump points = 1 because (1183 bytes from row 1 to eof) / (2 * 1183 jump0size) == 0
Type codes (jump 000) : A777775555A5A Quote rule 0
'header' determined to be true due to column 2 containing a string on row 1 and a lower type (float64) in the rest of the 5 sample rows
All rows were sampled since file is small so we know nrow=5 exactly
[08] Assign column names
[09] Apply user overrides on column types
After 0 type and 0 drop user overrides : A777775555A5A
[10] Allocate memory for the datatable
Allocating 13 column slots (13 - 0 dropped) with 5 rows
[11] Read the data
jumps=[0..1), chunk_size=1048576, total_size=1059
Read 5 rows x 13 columns from 1.156KB (1184 bytes) file in 00:00.001 wall clock time
[12] Finalizing the datatable
Type counts:
5 : int32 '5'
5 : float64 '7'
3 : string 'A'
=============================
0.000s ( 35%) Memory map 0.000GB file
0.000s ( 53%) sep='\t' ncol=13 and header detection
0.000s ( 4%) Column type detection using 5 sample rows
0.000s ( 3%) Allocation of 5 rows x 13 cols (0.000GB) of which 5 (100%) rows used
0.000s ( 5%) Reading 1 chunks (0 swept) of 1.000MB (each chunk 5 rows) using 1 threads
+ 0.000s ( 0%) Parse to row-major thread buffers (grown 0 times)
+ 0.000s ( 1%) Transpose
+ 0.000s ( 4%) Waiting
0.000s ( 0%) Rereading 0 columns due to out-of-sample type exceptions
0.001s Total
omp_get_max_threads() = 4
omp_get_thread_limit() = 2147483647
DTthreads = 0
Input contains no \n. Taking this to be a filename to open
[01] Check arguments
Using 4 threads (omp_get_max_threads()=4, nth=4)
NAstrings = [<<NA>>]
None of the NAstrings look like numbers.
skip num lines = 0
show progress = 1
0/1 column will be read as integer
[02] Opening the file
Opening file /home/xk/dataprojects/ghsl/84.csv
File opened, size = 9.14KB (9357 bytes).
Memory mapped ok
[03] Detect and skip BOM
[04] Arrange mmap to be \0 terminated
\n has been found in the input and different lines can end with different line endings (e.g. mixed \n and \r\n in one file). This is common and ideal.
[05] Skipping initial rows if needed
Positioned on line 1 starting: <<City Area builtup75 builtup90 >>
[06] Detect separator, quoting rule, and ncolumns
Detecting sep automatically ...
sep=',' with 44 lines of 3 fields using quote rule 0
sep=0x9 with 45 lines of 13 fields using quote rule 0
Detected 13 columns on line 1. This line is either column names or first data row. Line starts as: <<City Area builtup75 builtup90 >>
Quote rule picked = 0
fill=false and the most number of columns found is 13
[07] Detect column types, good nrow estimate and whether first row is column names
Number of sampling jump points = 1 because (9356 bytes from row 1 to eof) / (2 * 9356 jump0size) == 0
Type codes (jump 000) : A777775555A5A Quote rule 0
'header' determined to be true due to column 2 containing a string on row 1 and a lower type (float64) in the rest of the 44 sample rows
All rows were sampled since file is small so we know nrow=44 exactly
[08] Assign column names
[09] Apply user overrides on column types
After 0 type and 0 drop user overrides : A777775555A5A
[10] Allocate memory for the datatable
Allocating 13 column slots (13 - 0 dropped) with 44 rows
[11] Read the data
jumps=[0..1), chunk_size=1048576, total_size=9232
Read 44 rows x 13 columns from 9.14KB (9357 bytes) file in 00:00.001 wall clock time
[12] Finalizing the datatable
Type counts:
5 : int32 '5'
5 : float64 '7'
3 : string 'A'
=============================
0.000s ( 17%) Memory map 0.000GB file
0.001s ( 71%) sep='\t' ncol=13 and header detection
0.000s ( 2%) Column type detection using 44 sample rows
0.000s ( 2%) Allocation of 44 rows x 13 cols (0.000GB) of which 44 (100%) rows used
0.000s ( 8%) Reading 1 chunks (0 swept) of 1.000MB (each chunk 44 rows) using 1 threads
+ 0.000s ( 2%) Parse to row-major thread buffers (grown 0 times)
+ 0.000s ( 4%) Transpose
+ 0.000s ( 2%) Waiting
0.000s ( 0%) Rereading 0 columns due to out-of-sample type exceptions
0.001s Total
omp_get_max_threads() = 4
omp_get_thread_limit() = 2147483647
DTthreads = 0
Input contains no \n. Taking this to be a filename to open
[01] Check arguments
Using 4 threads (omp_get_max_threads()=4, nth=4)
NAstrings = [<<NA>>]
None of the NAstrings look like numbers.
skip num lines = 0
show progress = 1
0/1 column will be read as integer
[02] Opening the file
Opening file /home/xk/dataprojects/ghsl/86.csv
File opened, size = 3.249KB (3327 bytes).
Memory mapped ok
[03] Detect and skip BOM
[04] Arrange mmap to be \0 terminated
\n has been found in the input and different lines can end with different line endings (e.g. mixed \n and \r\n in one file). This is common and ideal.
[05] Skipping initial rows if needed
Positioned on line 1 starting: <<City Area builtup75 builtup90 >>
[06] Detect separator, quoting rule, and ncolumns
Detecting sep automatically ...
sep=',' with 15 lines of 3 fields using quote rule 0
sep=0x9 with 16 lines of 13 fields using quote rule 0
Detected 13 columns on line 1. This line is either column names or first data row. Line starts as: <<City Area builtup75 builtup90 >>
Quote rule picked = 0
fill=false and the most number of columns found is 13
[07] Detect column types, good nrow estimate and whether first row is column names
Number of sampling jump points = 1 because (3326 bytes from row 1 to eof) / (2 * 3326 jump0size) == 0
Type codes (jump 000) : A777775555A5A Quote rule 0
'header' determined to be true due to column 2 containing a string on row 1 and a lower type (float64) in the rest of the 15 sample rows
All rows were sampled since file is small so we know nrow=15 exactly
[08] Assign column names
[09] Apply user overrides on column types
After 0 type and 0 drop user overrides : A777775555A5A
[10] Allocate memory for the datatable
Allocating 13 column slots (13 - 0 dropped) with 15 rows
[11] Read the data
jumps=[0..1), chunk_size=1048576, total_size=3202
Read 15 rows x 13 columns from 3.249KB (3327 bytes) file in 00:00.001 wall clock time
[12] Finalizing the datatable
Type counts:
5 : int32 '5'
5 : float64 '7'
3 : string 'A'
=============================
0.000s ( 26%) Memory map 0.000GB file
0.000s ( 58%) sep='\t' ncol=13 and header detection
0.000s ( 4%) Column type detection using 15 sample rows
0.000s ( 4%) Allocation of 15 rows x 13 cols (0.000GB) of which 15 (100%) rows used
0.000s ( 9%) Reading 1 chunks (0 swept) of 1.000MB (each chunk 15 rows) using 1 threads
+ 0.000s ( 1%) Parse to row-major thread buffers (grown 0 times)
+ 0.000s ( 3%) Transpose
+ 0.000s ( 4%) Waiting
0.000s ( 0%) Rereading 0 columns due to out-of-sample type exceptions
0.001s Total
omp_get_max_threads() = 4
omp_get_thread_limit() = 2147483647
DTthreads = 0
Input contains no \n. Taking this to be a filename to open
[01] Check arguments
Using 4 threads (omp_get_max_threads()=4, nth=4)
NAstrings = [<<NA>>]
None of the NAstrings look like numbers.
skip num lines = 0
show progress = 1
0/1 column will be read as integer
[02] Opening the file
Opening file /home/xk/dataprojects/ghsl/88.csv
File opened, size = 758 bytes.
Memory mapped ok
[03] Detect and skip BOM
[04] Arrange mmap to be \0 terminated
\n has been found in the input and different lines can end with different line endings (e.g. mixed \n and \r\n in one file). This is common and ideal.
[05] Skipping initial rows if needed
Positioned on line 1 starting: <<City Area builtup75 builtup90 >>
[06] Detect separator, quoting rule, and ncolumns
Detecting sep automatically ...
sep=',' with 3 lines of 3 fields using quote rule 0
sep=0x9 with 4 lines of 13 fields using quote rule 0
Detected 13 columns on line 1. This line is either column names or first data row. Line starts as: <<City Area builtup75 builtup90 >>
Quote rule picked = 0
fill=false and the most number of columns found is 13
[07] Detect column types, good nrow estimate and whether first row is column names
Number of sampling jump points = 1 because (757 bytes from row 1 to eof) / (2 * 757 jump0size) == 0
Type codes (jump 000) : A777775555A5A Quote rule 0
'header' determined to be true due to column 2 containing a string on row 1 and a lower type (float64) in the rest of the 3 sample rows
All rows were sampled since file is small so we know nrow=3 exactly
[08] Assign column names
[09] Apply user overrides on column types
After 0 type and 0 drop user overrides : A777775555A5A
[10] Allocate memory for the datatable
Allocating 13 column slots (13 - 0 dropped) with 3 rows
[11] Read the data
jumps=[0..1), chunk_size=1048576, total_size=633
Read 3 rows x 13 columns from 758 bytes file in 00:00.000 wall clock time
[12] Finalizing the datatable
Type counts:
5 : int32 '5'
5 : float64 '7'
3 : string 'A'
=============================
0.000s ( 31%) Memory map 0.000GB file
0.000s ( 53%) sep='\t' ncol=13 and header detection
0.000s ( 5%) Column type detection using 3 sample rows
0.000s ( 4%) Allocation of 3 rows x 13 cols (0.000GB) of which 3 (100%) rows used
0.000s ( 7%) Reading 1 chunks (0 swept) of 1.000MB (each chunk 3 rows) using 1 threads
+ 0.000s ( 0%) Parse to row-major thread buffers (grown 0 times)
+ 0.000s ( 1%) Transpose
+ 0.000s ( 6%) Waiting
0.000s ( 0%) Rereading 0 columns due to out-of-sample type exceptions
0.000s Total
omp_get_max_threads() = 4
omp_get_thread_limit() = 2147483647
DTthreads = 0
Input contains no \n. Taking this to be a filename to open
[01] Check arguments
Using 4 threads (omp_get_max_threads()=4, nth=4)
NAstrings = [<<NA>>]
None of the NAstrings look like numbers.
skip num lines = 0
show progress = 1
0/1 column will be read as integer
[02] Opening the file
Opening file /home/xk/dataprojects/ghsl/89.csv
File opened, size = 793 bytes.
Memory mapped ok
[03] Detect and skip BOM
[04] Arrange mmap to be \0 terminated
\n has been found in the input and different lines can end with different line endings (e.g. mixed \n and \r\n in one file). This is common and ideal.
[05] Skipping initial rows if needed
Positioned on line 1 starting: <<City Area builtup75 builtup90 >>
[06] Detect separator, quoting rule, and ncolumns
Detecting sep automatically ...
sep=',' with 3 lines of 3 fields using quote rule 0
sep=0x9 with 4 lines of 13 fields using quote rule 0
Detected 13 columns on line 1. This line is either column names or first data row. Line starts as: <<City Area builtup75 builtup90 >>
Quote rule picked = 0
fill=false and the most number of columns found is 13
[07] Detect column types, good nrow estimate and whether first row is column names
Number of sampling jump points = 1 because (792 bytes from row 1 to eof) / (2 * 792 jump0size) == 0
Type codes (jump 000) : A777775555A5A Quote rule 0
'header' determined to be true due to column 2 containing a string on row 1 and a lower type (float64) in the rest of the 3 sample rows
All rows were sampled since file is small so we know nrow=3 exactly
[08] Assign column names
[09] Apply user overrides on column types
After 0 type and 0 drop user overrides : A777775555A5A
[10] Allocate memory for the datatable
Allocating 13 column slots (13 - 0 dropped) with 3 rows
[11] Read the data
jumps=[0..1), chunk_size=1048576, total_size=668
Read 3 rows x 13 columns from 793 bytes file in 00:00.001 wall clock time
[12] Finalizing the datatable
Type counts:
5 : int32 '5'
5 : float64 '7'
3 : string 'A'
=============================
0.000s ( 29%) Memory map 0.000GB file
0.000s ( 54%) sep='\t' ncol=13 and header detection
0.000s ( 5%) Column type detection using 3 sample rows
0.000s ( 3%) Allocation of 3 rows x 13 cols (0.000GB) of which 3 (100%) rows used
0.000s ( 9%) Reading 1 chunks (0 swept) of 1.000MB (each chunk 3 rows) using 1 threads
+ 0.000s ( 1%) Parse to row-major thread buffers (grown 0 times)
+ 0.000s ( 2%) Transpose
+ 0.000s ( 6%) Waiting
0.000s ( 0%) Rereading 0 columns due to out-of-sample type exceptions
0.001s Total
omp_get_max_threads() = 4
omp_get_thread_limit() = 2147483647
DTthreads = 0
Input contains no \n. Taking this to be a filename to open
[01] Check arguments
Using 4 threads (omp_get_max_threads()=4, nth=4)
NAstrings = [<<NA>>]
None of the NAstrings look like numbers.
skip num lines = 0
show progress = 1
0/1 column will be read as integer
[02] Opening the file
Opening file /home/xk/dataprojects/ghsl/9.csv
File opened, size = 14.28KB (14618 bytes).
Memory mapped ok
[03] Detect and skip BOM
[04] Arrange mmap to be \0 terminated
\n has been found in the input and different lines can end with different line endings (e.g. mixed \n and \r\n in one file). This is common and ideal.
[05] Skipping initial rows if needed
Positioned on line 1 starting: <<City Area builtup75 builtup90 >>
[06] Detect separator, quoting rule, and ncolumns
Detecting sep automatically ...
sep=',' with 66 lines of 3 fields using quote rule 0
sep=0x9 with 67 lines of 13 fields using quote rule 0
Detected 13 columns on line 1. This line is either column names or first data row. Line starts as: <<City Area builtup75 builtup90 >>
Quote rule picked = 0
fill=false and the most number of columns found is 13
[07] Detect column types, good nrow estimate and whether first row is column names
Number of sampling jump points = 1 because (14617 bytes from row 1 to eof) / (2 * 14617 jump0size) == 0
Type codes (jump 000) : A777775555A5A Quote rule 0
'header' determined to be true due to column 2 containing a string on row 1 and a lower type (float64) in the rest of the 66 sample rows
All rows were sampled since file is small so we know nrow=66 exactly
[08] Assign column names
[09] Apply user overrides on column types
After 0 type and 0 drop user overrides : A777775555A5A
[10] Allocate memory for the datatable
Allocating 13 column slots (13 - 0 dropped) with 66 rows
[11] Read the data
jumps=[0..1), chunk_size=1048576, total_size=14493
Read 66 rows x 13 columns from 14.28KB (14618 bytes) file in 00:00.001 wall clock time
[12] Finalizing the datatable
Type counts:
5 : int32 '5'
5 : float64 '7'
3 : string 'A'
=============================
0.000s ( 16%) Memory map 0.000GB file
0.001s ( 71%) sep='\t' ncol=13 and header detection
0.000s ( 2%) Column type detection using 66 sample rows
0.000s ( 2%) Allocation of 66 rows x 13 cols (0.000GB) of which 66 (100%) rows used
0.000s ( 9%) Reading 1 chunks (0 swept) of 1.000MB (each chunk 66 rows) using 1 threads
+ 0.000s ( 2%) Parse to row-major thread buffers (grown 0 times)
+ 0.000s ( 5%) Transpose
+ 0.000s ( 2%) Waiting
0.000s ( 0%) Rereading 0 columns due to out-of-sample type exceptions
0.001s Total
omp_get_max_threads() = 4
omp_get_thread_limit() = 2147483647
DTthreads = 0
Input contains no \n. Taking this to be a filename to open
[01] Check arguments
Using 4 threads (omp_get_max_threads()=4, nth=4)
NAstrings = [<<NA>>]
None of the NAstrings look like numbers.
skip num lines = 0
show progress = 1
0/1 column will be read as integer
[02] Opening the file
Opening file /home/xk/dataprojects/ghsl/90.csv
File opened, size = 536 bytes.
Memory mapped ok
[03] Detect and skip BOM
[04] Arrange mmap to be \0 terminated
\n has been found in the input and different lines can end with different line endings (e.g. mixed \n and \r\n in one file). This is common and ideal.
[05] Skipping initial rows if needed
Positioned on line 1 starting: <<City Area builtup75 builtup90 >>
[06] Detect separator, quoting rule, and ncolumns
Detecting sep automatically ...
sep=',' with 2 lines of 3 fields using quote rule 0
sep=0x9 with 3 lines of 13 fields using quote rule 0
Detected 13 columns on line 1. This line is either column names or first data row. Line starts as: <<City Area builtup75 builtup90 >>
Quote rule picked = 0
fill=false and the most number of columns found is 13
[07] Detect column types, good nrow estimate and whether first row is column names
Number of sampling jump points = 1 because (535 bytes from row 1 to eof) / (2 * 535 jump0size) == 0
Type codes (jump 000) : A757575555A5A Quote rule 0
'header' determined to be true due to column 2 containing a string on row 1 and a lower type (float64) in the rest of the 2 sample rows
All rows were sampled since file is small so we know nrow=2 exactly
[08] Assign column names
[09] Apply user overrides on column types
After 0 type and 0 drop user overrides : A757575555A5A
[10] Allocate memory for the datatable
Allocating 13 column slots (13 - 0 dropped) with 2 rows
[11] Read the data
jumps=[0..1), chunk_size=1048576, total_size=411
Read 2 rows x 13 columns from 536 bytes file in 00:00.001 wall clock time
[12] Finalizing the datatable
Type counts:
7 : int32 '5'
3 : float64 '7'
3 : string 'A'
=============================
0.000s ( 39%) Memory map 0.000GB file
0.000s ( 47%) sep='\t' ncol=13 and header detection
0.000s ( 4%) Column type detection using 2 sample rows
0.000s ( 4%) Allocation of 2 rows x 13 cols (0.000GB) of which 2 (100%) rows used
0.000s ( 6%) Reading 1 chunks (0 swept) of 1.000MB (each chunk 2 rows) using 1 threads
+ 0.000s ( 0%) Parse to row-major thread buffers (grown 0 times)
+ 0.000s ( 1%) Transpose
+ 0.000s ( 4%) Waiting
0.000s ( 0%) Rereading 0 columns due to out-of-sample type exceptions
0.001s Total
omp_get_max_threads() = 4
omp_get_thread_limit() = 2147483647
DTthreads = 0
Input contains no \n. Taking this to be a filename to open
[01] Check arguments
Using 4 threads (omp_get_max_threads()=4, nth=4)
NAstrings = [<<NA>>]
None of the NAstrings look like numbers.
skip num lines = 0
show progress = 1
0/1 column will be read as integer
[02] Opening the file
Opening file /home/xk/dataprojects/ghsl/91.csv
File opened, size = 2.021KB (2069 bytes).
Memory mapped ok
[03] Detect and skip BOM
[04] Arrange mmap to be \0 terminated
\n has been found in the input and different lines can end with different line endings (e.g. mixed \n and \r\n in one file). This is common and ideal.
[05] Skipping initial rows if needed
Positioned on line 1 starting: <<City Area builtup75 builtup90 >>
[06] Detect separator, quoting rule, and ncolumns
Detecting sep automatically ...
sep=',' with 9 lines of 3 fields using quote rule 0
sep=0x9 with 10 lines of 13 fields using quote rule 0
Detected 13 columns on line 1. This line is either column names or first data row. Line starts as: <<City Area builtup75 builtup90 >>
Quote rule picked = 0
fill=false and the most number of columns found is 13
[07] Detect column types, good nrow estimate and whether first row is column names
Number of sampling jump points = 1 because (2068 bytes from row 1 to eof) / (2 * 2068 jump0size) == 0
Type codes (jump 000) : A777775555A5A Quote rule 0
'header' determined to be true due to column 2 containing a string on row 1 and a lower type (float64) in the rest of the 9 sample rows
All rows were sampled since file is small so we know nrow=9 exactly
[08] Assign column names
[09] Apply user overrides on column types
After 0 type and 0 drop user overrides : A777775555A5A
[10] Allocate memory for the datatable
Allocating 13 column slots (13 - 0 dropped) with 9 rows
[11] Read the data
jumps=[0..1), chunk_size=1048576, total_size=1944
Read 9 rows x 13 columns from 2.021KB (2069 bytes) file in 00:00.001 wall clock time
[12] Finalizing the datatable
Type counts:
5 : int32 '5'
5 : float64 '7'
3 : string 'A'
=============================
0.000s ( 28%) Memory map 0.000GB file
0.000s ( 57%) sep='\t' ncol=13 and header detection
0.000s ( 4%) Column type detection using 9 sample rows
0.000s ( 4%) Allocation of 9 rows x 13 cols (0.000GB) of which 9 (100%) rows used
0.000s ( 7%) Reading 1 chunks (0 swept) of 1.000MB (each chunk 9 rows) using 1 threads
+ 0.000s ( 1%) Parse to row-major thread buffers (grown 0 times)
+ 0.000s ( 2%) Transpose
+ 0.000s ( 5%) Waiting
0.000s ( 0%) Rereading 0 columns due to out-of-sample type exceptions
0.001s Total
omp_get_max_threads() = 4
omp_get_thread_limit() = 2147483647
DTthreads = 0
Input contains no \n. Taking this to be a filename to open
[01] Check arguments
Using 4 threads (omp_get_max_threads()=4, nth=4)
NAstrings = [<<NA>>]
None of the NAstrings look like numbers.
skip num lines = 0
show progress = 1
0/1 column will be read as integer
[02] Opening the file
Opening file /home/xk/dataprojects/ghsl/94.csv
File opened, size = 5.780KB (5919 bytes).
Memory mapped ok
[03] Detect and skip BOM
[04] Arrange mmap to be \0 terminated
\n has been found in the input and different lines can end with different line endings (e.g. mixed \n and \r\n in one file). This is common and ideal.
[05] Skipping initial rows if needed
Positioned on line 1 starting: <<City Area builtup75 builtup90 >>
[06] Detect separator, quoting rule, and ncolumns
Detecting sep automatically ...
sep=',' with 27 lines of 3 fields using quote rule 0
sep=0x9 with 28 lines of 13 fields using quote rule 0
Detected 13 columns on line 1. This line is either column names or first data row. Line starts as: <<City Area builtup75 builtup90 >>
Quote rule picked = 0
fill=false and the most number of columns found is 13
[07] Detect column types, good nrow estimate and whether first row is column names
Number of sampling jump points = 1 because (5918 bytes from row 1 to eof) / (2 * 5918 jump0size) == 0
Type codes (jump 000) : A777775555A5A Quote rule 0
'header' determined to be true due to column 2 containing a string on row 1 and a lower type (float64) in the rest of the 27 sample rows
All rows were sampled since file is small so we know nrow=27 exactly
[08] Assign column names
[09] Apply user overrides on column types
After 0 type and 0 drop user overrides : A777775555A5A
[10] Allocate memory for the datatable
Allocating 13 column slots (13 - 0 dropped) with 27 rows
[11] Read the data
jumps=[0..1), chunk_size=1048576, total_size=5794
Read 27 rows x 13 columns from 5.780KB (5919 bytes) file in 00:00.001 wall clock time
[12] Finalizing the datatable
Type counts:
5 : int32 '5'
5 : float64 '7'
3 : string 'A'
=============================
0.000s ( 16%) Memory map 0.000GB file
0.001s ( 70%) sep='\t' ncol=13 and header detection
0.000s ( 3%) Column type detection using 27 sample rows
0.000s ( 4%) Allocation of 27 rows x 13 cols (0.000GB) of which 27 (100%) rows used
0.000s ( 7%) Reading 1 chunks (0 swept) of 1.000MB (each chunk 27 rows) using 1 threads
+ 0.000s ( 1%) Parse to row-major thread buffers (grown 0 times)
+ 0.000s ( 4%) Transpose
+ 0.000s ( 3%) Waiting
0.000s ( 0%) Rereading 0 columns due to out-of-sample type exceptions
0.001s Total
omp_get_max_threads() = 4
omp_get_thread_limit() = 2147483647
DTthreads = 0
Input contains no \n. Taking this to be a filename to open
[01] Check arguments
Using 4 threads (omp_get_max_threads()=4, nth=4)
NAstrings = [<<NA>>]
None of the NAstrings look like numbers.
skip num lines = 0
show progress = 1
0/1 column will be read as integer
[02] Opening the file
Opening file /home/xk/dataprojects/ghsl/97.csv
File opened, size = 342 bytes.
Memory mapped ok
[03] Detect and skip BOM
[04] Arrange mmap to be \0 terminated
\n has been found in the input and different lines can end with different line endings (e.g. mixed \n and \r\n in one file). This is common and ideal.
[05] Skipping initial rows if needed
Positioned on line 1 starting: <<City Area builtup75 builtup90 >>
[06] Detect separator, quoting rule, and ncolumns
Detecting sep automatically ...
sep=',' with 1 lines of 3 fields using quote rule 0
sep=0x9 with 2 lines of 13 fields using quote rule 0
Detected 13 columns on line 1. This line is either column names or first data row. Line starts as: <<City Area builtup75 builtup90 >>
Quote rule picked = 0
fill=false and the most number of columns found is 13
[07] Detect column types, good nrow estimate and whether first row is column names
Number of sampling jump points = 1 because (341 bytes from row 1 to eof) / (2 * 341 jump0size) == 0
Type codes (jump 000) : A777555555A5A Quote rule 0
'header' determined to be true due to column 2 containing a string on row 1 and a lower type (float64) in the rest of the 1 sample rows
All rows were sampled since file is small so we know nrow=1 exactly
[08] Assign column names
[09] Apply user overrides on column types
After 0 type and 0 drop user overrides : A777555555A5A
[10] Allocate memory for the datatable
Allocating 13 column slots (13 - 0 dropped) with 1 rows
[11] Read the data
jumps=[0..1), chunk_size=1048576, total_size=217
Read 1 rows x 13 columns from 342 bytes file in 00:00.000 wall clock time
[12] Finalizing the datatable
Type counts:
7 : int32 '5'
3 : float64 '7'
3 : string 'A'
=============================
0.000s ( 30%) Memory map 0.000GB file
0.000s ( 53%) sep='\t' ncol=13 and header detection
0.000s ( 5%) Column type detection using 1 sample rows
0.000s ( 4%) Allocation of 1 rows x 13 cols (0.000GB) of which 1 (100%) rows used
0.000s ( 8%) Reading 1 chunks (0 swept) of 1.000MB (each chunk 1 rows) using 1 threads
+ 0.000s ( 0%) Parse to row-major thread buffers (grown 0 times)
+ 0.000s ( 1%) Transpose
+ 0.000s ( 7%) Waiting
0.000s ( 0%) Rereading 0 columns due to out-of-sample type exceptions
0.000s Total
I'm not sure what I'm doing wrong; there is no error message. This is the call I used:
fread_folder(directory = "~/dataprojects/ghsl", extension = "CSV", check.names = TRUE, verbose = TRUE)
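Going by the reports in this thread, `fread_folder` seems to leave one data frame per file in the environment rather than returning a single merged table. A possible workaround, sketched here with `data.table` directly and not with easycsv itself, is to read the files into a list and stack them with `rbindlist`. The demo folder, file contents, and the `source_file` column name below are hypothetical and only make the sketch self-contained; a real directory would replace `demo_dir`.

```r
library(data.table)

# Hypothetical demo folder with two small tab-separated files, standing in
# for a real directory such as the one passed to fread_folder above.
demo_dir <- file.path(tempdir(), "easycsv_demo")
dir.create(demo_dir, showWarnings = FALSE)
fwrite(data.table(City = "A", Area = 1.5, builtup75 = 10L),
       file.path(demo_dir, "1.csv"), sep = "\t")
fwrite(data.table(City = c("B", "C"), Area = c(2, 3), builtup75 = c(20L, 30L)),
       file.path(demo_dir, "2.csv"), sep = "\t")

# List the files, read each one, and name each list element after its file.
files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE,
                    ignore.case = TRUE)
tables <- lapply(files, fread)
names(tables) <- basename(files)

# Stack everything into one data.table; idcol records which file each row
# came from, and fill = TRUE tolerates files with missing columns.
merged <- rbindlist(tables, use.names = TRUE, fill = TRUE,
                    idcol = "source_file")
print(merged)
```

Keeping the per-file tables in a named list also avoids filling the global environment with hundreds of objects, and the list can still be indexed by file name when a single table is needed.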