bogind / easycsv

An R package for easy data loading from multiple tables

License: GNU Lesser General Public License v2.1

cran csv-files easycsv r r-package utilities

easycsv's Issues

fread_folder is broken w.r.t. recent data.table

Hello, in our efforts to prepare data.table for an update on CRAN (-> v1.11.0), we noticed that recent changes will break your package:

install.packages('data.table', type = 'source', repos = 'http://Rdatatable.github.io/data.table')
pkg = 'easycsv'
install.packages(pkg, dependencies = TRUE)

tools::testInstalledPackage(pkg, outDir = "~/Desktop/dt_revdep")

library(pkg, character.only = TRUE)
fn = 'fread_folder'
example(fn, character.only = TRUE)

Error: isTrueFalse(showProgress) is not TRUE

This is because the default for showProgress is now interactive(), and the default value of getOption('datatable.showprogress') has been removed.

Please follow Rdatatable/data.table#2785 and feel free to provide input there if you have strong opinions about how to proceed.

Note that this issue also affects your SIRItoGTFS package; I'll file an official issue there once the aforementioned pull request is resolved, if needed.
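
For wrapper code that forwards this argument, one defensive pattern is to always hand fread() a single TRUE/FALSE value rather than the value of an option that may be unset. The sketch below is not easycsv's actual fix: read_one() is a hypothetical helper, and the temporary file exists only for illustration.

```r
library(data.table)

# Hypothetical wrapper (not part of easycsv): make sure showProgress is a
# length-one, non-NA logical before it reaches fread().
read_one <- function(path, showProgress = interactive(), ...) {
  stopifnot(is.logical(showProgress),
            length(showProgress) == 1L,
            !is.na(showProgress))
  fread(path, showProgress = showProgress, ...)
}

# Illustrative input written to a temporary file.
tmp <- tempfile(fileext = ".csv")
writeLines(c("a,b", "1,2", "3,4"), tmp)

dt <- read_one(tmp)
print(dt)
```

Defaulting to interactive() mirrors the new default described above, so the helper behaves the same in scripts and in checks run by R CMD check.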

How do you access the resulting data.table?

Hi, I am using the fread_folder function of your easycsv package, but I am unable to access the resulting data.table.

For example,

> vir <- fread_folder("folder_name")
> vir 
  NULL

How do you access the resulting data.table? Will you add some examples?

Thank you.
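
Judging from the report titled "merge into one dataframe?", fread_folder appears to create one object per file in the environment instead of returning a value, which would explain the NULL. A minimal base R sketch of how such objects could be listed and collected follows; the names cities_a and cities_b are stand-ins assigned by hand here, not output from easycsv, and the naming scheme is an assumption.

```r
# Stand-ins for tables that a folder reader might have assigned into the
# workspace (assumption: one object per file, named after the file).
env <- new.env()
assign("cities_a", data.frame(City = "X", Area = 1.5), envir = env)
assign("cities_b", data.frame(City = "Y", Area = 2.5), envir = env)

# List the objects that were created, then fetch one by name.
created <- ls(env)
print(created)
one <- get("cities_a", envir = env)

# Or collect all of them into a named list for further processing.
tables <- mget(created, envir = env)
str(tables, max.level = 1)
```

In an interactive session the environment would normally be the global one, so ls() and get("name") without the envir argument do the same job.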

merge into one dataframe?

After running fread_folder I'm left with a few hundred data frames in my environment, but no merged data frame is generated. I'm not sure if it is just the CSV files I'm using. Maybe I'm just an edge case. It's the first time I've used easycsv.
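
One way to end up with a single table is to read the files into a list and bind the list, rather than work with the per-file objects. The following is a sketch using data.table directly, not a feature of easycsv; the folder, the file names, and the two-column layout are invented for the example.

```r
library(data.table)

# Illustrative folder with two small tab-separated files.
folder <- file.path(tempdir(), "merge_demo")
dir.create(folder, showWarnings = FALSE)
writeLines(c("City\tArea", "X\t1"),           file.path(folder, "1.csv"))
writeLines(c("City\tArea", "Y\t2.5", "Z\t3"), file.path(folder, "2.csv"))

files <- list.files(folder, pattern = "\\.csv$", full.names = TRUE)

# Read every file into a list named after the file (without extension).
tables <- lapply(files, fread, showProgress = FALSE)
names(tables) <- tools::file_path_sans_ext(basename(files))

# Stack the list into one data.table; 'source' records the originating file.
merged <- rbindlist(tables, use.names = TRUE, fill = TRUE, idcol = "source")
print(merged)
```

The idcol column keeps track of which file each row came from. If the same column is detected as integer in some files and as double in others, it is worth checking the column types of the merged result on real data.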

...
  sep=','  with 2 lines of 3 fields using quote rule 0
  sep=0x9  with 3 lines of 13 fields using quote rule 0
  Detected 13 columns on line 1. This line is either column names or first data row. Line starts as: <<City	Area	builtup75	builtup90	>>
  Quote rule picked = 0
  fill=false and the most number of columns found is 13
[07] Detect column types, good nrow estimate and whether first row is column names
  Number of sampling jump points = 1 because (546 bytes from row 1 to eof) / (2 * 546 jump0size) == 0
  Type codes (jump 000)    : A775775555A5A  Quote rule 0
  'header' determined to be true due to column 2 containing a string on row 1 and a lower type (float64) in the rest of the 2 sample rows
  All rows were sampled since file is small so we know nrow=2 exactly
[08] Assign column names
[09] Apply user overrides on column types
  After 0 type and 0 drop user overrides : A775775555A5A
[10] Allocate memory for the datatable
  Allocating 13 column slots (13 - 0 dropped) with 2 rows
[11] Read the data
  jumps=[0..1), chunk_size=1048576, total_size=422
Read 2 rows x 13 columns from 547 bytes file in 00:00.001 wall clock time
[12] Finalizing the datatable
  Type counts:
         6 : int32     '5'
         4 : float64   '7'
         3 : string    'A'
=============================
   0.000s ( 17%) Memory map 0.000GB file
   0.001s ( 75%) sep='\t' ncol=13 and header detection
   0.000s (  3%) Column type detection using 2 sample rows
   0.000s (  2%) Allocation of 2 rows x 13 cols (0.000GB) of which 2 (100%) rows used
   0.000s (  3%) Reading 1 chunks (0 swept) of 1.000MB (each chunk 2 rows) using 1 threads
   +    0.000s (  0%) Parse to row-major thread buffers (grown 0 times)
   +    0.000s (  0%) Transpose
   +    0.000s (  3%) Waiting
   0.000s (  0%) Rereading 0 columns due to out-of-sample type exceptions
   0.001s        Total
omp_get_max_threads() = 4
omp_get_thread_limit() = 2147483647
DTthreads = 0
Input contains no \n. Taking this to be a filename to open
[01] Check arguments
  Using 4 threads (omp_get_max_threads()=4, nth=4)
  NAstrings = [<<NA>>]
  None of the NAstrings look like numbers.
  skip num lines = 0
  show progress = 1
  0/1 column will be read as integer
[02] Opening the file
  Opening file /home/xk/dataprojects/ghsl/73.csv
  File opened, size = 54.40KB (55701 bytes).
  Memory mapped ok
[03] Detect and skip BOM
[04] Arrange mmap to be \0 terminated
  \n has been found in the input and different lines can end with different line endings (e.g. mixed \n and \r\n in one file). This is common and ideal.
[05] Skipping initial rows if needed
  Positioned on line 1 starting: <<City	Area	builtup75	builtup90	>>
[06] Detect separator, quoting rule, and ncolumns
  Detecting sep automatically ...
  sep=','  with 99 lines of 3 fields using quote rule 0
  sep=0x9  with 100 lines of 13 fields using quote rule 0
  Detected 13 columns on line 1. This line is either column names or first data row. Line starts as: <<City	Area	builtup75	builtup90	>>
  Quote rule picked = 0
  fill=false and the most number of columns found is 13
[07] Detect column types, good nrow estimate and whether first row is column names
  Number of sampling jump points = 1 because (55700 bytes from row 1 to eof) / (2 * 21245 jump0size) == 1
  Type codes (jump 000)    : A577775555A5A  Quote rule 0
  Type codes (jump 001)    : A777775555A5A  Quote rule 0
  'header' determined to be true due to column 2 containing a string on row 1 and a lower type (float64) in the rest of the 149 sample rows
  =====
  Sampled 149 rows (handled \n inside quoted fields) at 2 jump points
  Bytes from first data row on line 2 to the end of last row: 55576
  Line length: mean=214.02 sd=8.98 min=192 max=230
  Estimated number of rows: 55576 / 214.02 = 260
  Initial alloc = 286 rows (260 + 10%) using bytes/max(mean-2*sd,min) clamped between [1.1*estn, 2.0*estn]
  =====
[08] Assign column names
[09] Apply user overrides on column types
  After 0 type and 0 drop user overrides : A777775555A5A
[10] Allocate memory for the datatable
  Allocating 13 column slots (13 - 0 dropped) with 286 rows
[11] Read the data
  jumps=[0..1), chunk_size=1048576, total_size=55576
Read 260 rows x 13 columns from 54.40KB (55701 bytes) file in 00:00.005 wall clock time
[12] Finalizing the datatable
  Type counts:
         5 : int32     '5'
         5 : float64   '7'
         3 : string    'A'
=============================
   0.000s (  5%) Memory map 0.000GB file
   0.003s ( 67%) sep='\t' ncol=13 and header detection
   0.000s (  2%) Column type detection using 149 sample rows
   0.000s (  1%) Allocation of 286 rows x 13 cols (0.000GB) of which 260 ( 91%) rows used
   0.001s ( 25%) Reading 1 chunks (0 swept) of 1.000MB (each chunk 260 rows) using 1 threads
   +    0.000s (  3%) Parse to row-major thread buffers (grown 0 times)
   +    0.001s ( 21%) Transpose
   +    0.000s (  1%) Waiting
   0.000s (  0%) Rereading 0 columns due to out-of-sample type exceptions
   0.005s        Total
omp_get_max_threads() = 4
omp_get_thread_limit() = 2147483647
DTthreads = 0
Input contains no \n. Taking this to be a filename to open
[01] Check arguments
  Using 4 threads (omp_get_max_threads()=4, nth=4)
  NAstrings = [<<NA>>]
  None of the NAstrings look like numbers.
  skip num lines = 0
  show progress = 1
  0/1 column will be read as integer
[02] Opening the file
  Opening file /home/xk/dataprojects/ghsl/74.csv
  File opened, size = 1.560KB (1597 bytes).
  Memory mapped ok
[03] Detect and skip BOM
[04] Arrange mmap to be \0 terminated
  \n has been found in the input and different lines can end with different line endings (e.g. mixed \n and \r\n in one file). This is common and ideal.
[05] Skipping initial rows if needed
  Positioned on line 1 starting: <<City	Area	builtup75	builtup90	>>
[06] Detect separator, quoting rule, and ncolumns
  Detecting sep automatically ...
  sep=','  with 7 lines of 3 fields using quote rule 0
  sep=0x9  with 8 lines of 13 fields using quote rule 0
  Detected 13 columns on line 1. This line is either column names or first data row. Line starts as: <<City	Area	builtup75	builtup90	>>
  Quote rule picked = 0
  fill=false and the most number of columns found is 13
[07] Detect column types, good nrow estimate and whether first row is column names
  Number of sampling jump points = 1 because (1596 bytes from row 1 to eof) / (2 * 1596 jump0size) == 0
  Type codes (jump 000)    : A777775555A5A  Quote rule 0
  'header' determined to be true due to column 2 containing a string on row 1 and a lower type (float64) in the rest of the 7 sample rows
  All rows were sampled since file is small so we know nrow=7 exactly
[08] Assign column names
[09] Apply user overrides on column types
  After 0 type and 0 drop user overrides : A777775555A5A
[10] Allocate memory for the datatable
  Allocating 13 column slots (13 - 0 dropped) with 7 rows
[11] Read the data
  jumps=[0..1), chunk_size=1048576, total_size=1472
Read 7 rows x 13 columns from 1.560KB (1597 bytes) file in 00:00.001 wall clock time
[12] Finalizing the datatable
  Type counts:
         5 : int32     '5'
         5 : float64   '7'
         3 : string    'A'
=============================
   0.000s ( 27%) Memory map 0.000GB file
   0.000s ( 53%) sep='\t' ncol=13 and header detection
   0.000s (  6%) Column type detection using 7 sample rows
   0.000s (  5%) Allocation of 7 rows x 13 cols (0.000GB) of which 7 (100%) rows used
   0.000s ( 10%) Reading 1 chunks (0 swept) of 1.000MB (each chunk 7 rows) using 1 threads
   +    0.000s (  1%) Parse to row-major thread buffers (grown 0 times)
   +    0.000s (  3%) Transpose
   +    0.000s (  6%) Waiting
   0.000s (  0%) Rereading 0 columns due to out-of-sample type exceptions
   0.001s        Total
omp_get_max_threads() = 4
omp_get_thread_limit() = 2147483647
DTthreads = 0
Input contains no \n. Taking this to be a filename to open
[01] Check arguments
  Using 4 threads (omp_get_max_threads()=4, nth=4)
  NAstrings = [<<NA>>]
  None of the NAstrings look like numbers.
  skip num lines = 0
  show progress = 1
  0/1 column will be read as integer
[02] Opening the file
  Opening file /home/xk/dataprojects/ghsl/75.csv
  File opened, size = 342 bytes.
  Memory mapped ok
[03] Detect and skip BOM
[04] Arrange mmap to be \0 terminated
  \n has been found in the input and different lines can end with different line endings (e.g. mixed \n and \r\n in one file). This is common and ideal.
[05] Skipping initial rows if needed
  Positioned on line 1 starting: <<City	Area	builtup75	builtup90	>>
[06] Detect separator, quoting rule, and ncolumns
  Detecting sep automatically ...
  sep=','  with 1 lines of 3 fields using quote rule 0
  sep=0x9  with 2 lines of 13 fields using quote rule 0
  Detected 13 columns on line 1. This line is either column names or first data row. Line starts as: <<City	Area	builtup75	builtup90	>>
  Quote rule picked = 0
  fill=false and the most number of columns found is 13
[07] Detect column types, good nrow estimate and whether first row is column names
  Number of sampling jump points = 1 because (341 bytes from row 1 to eof) / (2 * 341 jump0size) == 0
  Type codes (jump 000)    : A775575555A5A  Quote rule 0
  'header' determined to be true due to column 2 containing a string on row 1 and a lower type (float64) in the rest of the 1 sample rows
  All rows were sampled since file is small so we know nrow=1 exactly
[08] Assign column names
[09] Apply user overrides on column types
  After 0 type and 0 drop user overrides : A775575555A5A
[10] Allocate memory for the datatable
  Allocating 13 column slots (13 - 0 dropped) with 1 rows
[11] Read the data
  jumps=[0..1), chunk_size=1048576, total_size=217
Read 1 rows x 13 columns from 342 bytes file in 00:00.000 wall clock time
[12] Finalizing the datatable
  Type counts:
         7 : int32     '5'
         3 : float64   '7'
         3 : string    'A'
=============================
   0.000s ( 31%) Memory map 0.000GB file
   0.000s ( 54%) sep='\t' ncol=13 and header detection
   0.000s (  5%) Column type detection using 1 sample rows
   0.000s (  4%) Allocation of 1 rows x 13 cols (0.000GB) of which 1 (100%) rows used
   0.000s (  6%) Reading 1 chunks (0 swept) of 1.000MB (each chunk 1 rows) using 1 threads
   +    0.000s (  0%) Parse to row-major thread buffers (grown 0 times)
   +    0.000s (  1%) Transpose
   +    0.000s (  6%) Waiting
   0.000s (  0%) Rereading 0 columns due to out-of-sample type exceptions
   0.000s        Total
omp_get_max_threads() = 4
omp_get_thread_limit() = 2147483647
DTthreads = 0
Input contains no \n. Taking this to be a filename to open
[01] Check arguments
  Using 4 threads (omp_get_max_threads()=4, nth=4)
  NAstrings = [<<NA>>]
  None of the NAstrings look like numbers.
  skip num lines = 0
  show progress = 1
  0/1 column will be read as integer
[02] Opening the file
  Opening file /home/xk/dataprojects/ghsl/77.csv
  File opened, size = 17.74KB (18166 bytes).
  Memory mapped ok
[03] Detect and skip BOM
[04] Arrange mmap to be \0 terminated
  \n has been found in the input and different lines can end with different line endings (e.g. mixed \n and \r\n in one file). This is common and ideal.
[05] Skipping initial rows if needed
  Positioned on line 1 starting: <<City	Area	builtup75	builtup90	>>
[06] Detect separator, quoting rule, and ncolumns
  Detecting sep automatically ...
  sep=','  with 84 lines of 3 fields using quote rule 0
  sep=0x9  with 85 lines of 13 fields using quote rule 0
  Detected 13 columns on line 1. This line is either column names or first data row. Line starts as: <<City	Area	builtup75	builtup90	>>
  Quote rule picked = 0
  fill=false and the most number of columns found is 13
[07] Detect column types, good nrow estimate and whether first row is column names
  Number of sampling jump points = 1 because (18165 bytes from row 1 to eof) / (2 * 18165 jump0size) == 0
  Type codes (jump 000)    : A777775555A5A  Quote rule 0
  'header' determined to be true due to column 2 containing a string on row 1 and a lower type (float64) in the rest of the 84 sample rows
  All rows were sampled since file is small so we know nrow=84 exactly
[08] Assign column names
[09] Apply user overrides on column types
  After 0 type and 0 drop user overrides : A777775555A5A
[10] Allocate memory for the datatable
  Allocating 13 column slots (13 - 0 dropped) with 84 rows
[11] Read the data
  jumps=[0..1), chunk_size=1048576, total_size=18041
Read 84 rows x 13 columns from 17.74KB (18166 bytes) file in 00:00.001 wall clock time
[12] Finalizing the datatable
  Type counts:
         5 : int32     '5'
         5 : float64   '7'
         3 : string    'A'
=============================
   0.000s ( 12%) Memory map 0.000GB file
   0.001s ( 72%) sep='\t' ncol=13 and header detection
   0.000s (  2%) Column type detection using 84 sample rows
   0.000s (  2%) Allocation of 84 rows x 13 cols (0.000GB) of which 84 (100%) rows used
   0.000s ( 13%) Reading 1 chunks (0 swept) of 1.000MB (each chunk 84 rows) using 1 threads
   +    0.000s (  2%) Parse to row-major thread buffers (grown 0 times)
   +    0.000s (  9%) Transpose
   +    0.000s (  2%) Waiting
   0.000s (  0%) Rereading 0 columns due to out-of-sample type exceptions
   0.001s        Total
omp_get_max_threads() = 4
omp_get_thread_limit() = 2147483647
DTthreads = 0
Input contains no \n. Taking this to be a filename to open
[01] Check arguments
  Using 4 threads (omp_get_max_threads()=4, nth=4)
  NAstrings = [<<NA>>]
  None of the NAstrings look like numbers.
  skip num lines = 0
  show progress = 1
  0/1 column will be read as integer
[02] Opening the file
  Opening file /home/xk/dataprojects/ghsl/8.csv
  File opened, size = 1.833KB (1877 bytes).
  Memory mapped ok
[03] Detect and skip BOM
[04] Arrange mmap to be \0 terminated
  \n has been found in the input and different lines can end with different line endings (e.g. mixed \n and \r\n in one file). This is common and ideal.
[05] Skipping initial rows if needed
  Positioned on line 1 starting: <<City	Area	builtup75	builtup90	>>
[06] Detect separator, quoting rule, and ncolumns
  Detecting sep automatically ...
  sep=','  with 8 lines of 3 fields using quote rule 0
  sep=0x9  with 9 lines of 13 fields using quote rule 0
  Detected 13 columns on line 1. This line is either column names or first data row. Line starts as: <<City	Area	builtup75	builtup90	>>
  Quote rule picked = 0
  fill=false and the most number of columns found is 13
[07] Detect column types, good nrow estimate and whether first row is column names
  Number of sampling jump points = 1 because (1876 bytes from row 1 to eof) / (2 * 1876 jump0size) == 0
  Type codes (jump 000)    : A777775555A5A  Quote rule 0
  'header' determined to be true due to column 2 containing a string on row 1 and a lower type (float64) in the rest of the 8 sample rows
  All rows were sampled since file is small so we know nrow=8 exactly
[08] Assign column names
[09] Apply user overrides on column types
  After 0 type and 0 drop user overrides : A777775555A5A
[10] Allocate memory for the datatable
  Allocating 13 column slots (13 - 0 dropped) with 8 rows
[11] Read the data
  jumps=[0..1), chunk_size=1048576, total_size=1752
Read 8 rows x 13 columns from 1.833KB (1877 bytes) file in 00:00.002 wall clock time
[12] Finalizing the datatable
  Type counts:
         5 : int32     '5'
         5 : float64   '7'
         3 : string    'A'
=============================
   0.001s ( 39%) Memory map 0.000GB file
   0.001s ( 47%) sep='\t' ncol=13 and header detection
   0.000s (  3%) Column type detection using 8 sample rows
   0.000s (  3%) Allocation of 8 rows x 13 cols (0.000GB) of which 8 (100%) rows used
   0.000s (  8%) Reading 1 chunks (0 swept) of 1.000MB (each chunk 8 rows) using 1 threads
   +    0.000s (  1%) Parse to row-major thread buffers (grown 0 times)
   +    0.000s (  3%) Transpose
   +    0.000s (  4%) Waiting
   0.000s (  0%) Rereading 0 columns due to out-of-sample type exceptions
   0.002s        Total
omp_get_max_threads() = 4
omp_get_thread_limit() = 2147483647
DTthreads = 0
Input contains no \n. Taking this to be a filename to open
[01] Check arguments
  Using 4 threads (omp_get_max_threads()=4, nth=4)
  NAstrings = [<<NA>>]
  None of the NAstrings look like numbers.
  skip num lines = 0
  show progress = 1
  0/1 column will be read as integer
[02] Opening the file
  Opening file /home/xk/dataprojects/ghsl/80.csv
  File opened, size = 775 bytes.
  Memory mapped ok
[03] Detect and skip BOM
[04] Arrange mmap to be \0 terminated
  \n has been found in the input and different lines can end with different line endings (e.g. mixed \n and \r\n in one file). This is common and ideal.
[05] Skipping initial rows if needed
  Positioned on line 1 starting: <<City	Area	builtup75	builtup90	>>
[06] Detect separator, quoting rule, and ncolumns
  Detecting sep automatically ...
  sep=','  with 3 lines of 3 fields using quote rule 0
  sep=0x9  with 4 lines of 13 fields using quote rule 0
  Detected 13 columns on line 1. This line is either column names or first data row. Line starts as: <<City	Area	builtup75	builtup90	>>
  Quote rule picked = 0
  fill=false and the most number of columns found is 13
[07] Detect column types, good nrow estimate and whether first row is column names
  Number of sampling jump points = 1 because (774 bytes from row 1 to eof) / (2 * 774 jump0size) == 0
  Type codes (jump 000)    : A777755555A5A  Quote rule 0
  'header' determined to be true due to column 2 containing a string on row 1 and a lower type (float64) in the rest of the 3 sample rows
  All rows were sampled since file is small so we know nrow=3 exactly
[08] Assign column names
[09] Apply user overrides on column types
  After 0 type and 0 drop user overrides : A777755555A5A
[10] Allocate memory for the datatable
  Allocating 13 column slots (13 - 0 dropped) with 3 rows
[11] Read the data
  jumps=[0..1), chunk_size=1048576, total_size=650
Read 3 rows x 13 columns from 775 bytes file in 00:00.001 wall clock time
[12] Finalizing the datatable
  Type counts:
         6 : int32     '5'
         4 : float64   '7'
         3 : string    'A'
=============================
   0.000s ( 43%) Memory map 0.000GB file
   0.000s ( 45%) sep='\t' ncol=13 and header detection
   0.000s (  4%) Column type detection using 3 sample rows
   0.000s (  3%) Allocation of 3 rows x 13 cols (0.000GB) of which 3 (100%) rows used
   0.000s (  5%) Reading 1 chunks (0 swept) of 1.000MB (each chunk 3 rows) using 1 threads
   +    0.000s (  0%) Parse to row-major thread buffers (grown 0 times)
   +    0.000s (  1%) Transpose
   +    0.000s (  3%) Waiting
   0.000s (  0%) Rereading 0 columns due to out-of-sample type exceptions
   0.001s        Total
omp_get_max_threads() = 4
omp_get_thread_limit() = 2147483647
DTthreads = 0
Input contains no \n. Taking this to be a filename to open
[01] Check arguments
  Using 4 threads (omp_get_max_threads()=4, nth=4)
  NAstrings = [<<NA>>]
  None of the NAstrings look like numbers.
  skip num lines = 0
  show progress = 1
  0/1 column will be read as integer
[02] Opening the file
  Opening file /home/xk/dataprojects/ghsl/81.csv
  File opened, size = 27.89KB (28557 bytes).
  Memory mapped ok
[03] Detect and skip BOM
[04] Arrange mmap to be \0 terminated
  \n has been found in the input and different lines can end with different line endings (e.g. mixed \n and \r\n in one file). This is common and ideal.
[05] Skipping initial rows if needed
  Positioned on line 1 starting: <<City	Area	builtup75	builtup90	>>
[06] Detect separator, quoting rule, and ncolumns
  Detecting sep automatically ...
  sep=','  with 99 lines of 3 fields using quote rule 0
  sep=0x9  with 100 lines of 13 fields using quote rule 0
  Detected 13 columns on line 1. This line is either column names or first data row. Line starts as: <<City	Area	builtup75	builtup90	>>
  Quote rule picked = 0
  fill=false and the most number of columns found is 13
[07] Detect column types, good nrow estimate and whether first row is column names
  Number of sampling jump points = 1 because (28556 bytes from row 1 to eof) / (2 * 21758 jump0size) == 0
  Type codes (jump 000)    : A777775555A5A  Quote rule 0
  Type codes (jump 001)    : A777775555A5A  Quote rule 0
  'header' determined to be true due to column 2 containing a string on row 1 and a lower type (float64) in the rest of the 131 sample rows
  =====
  Sampled 131 rows (handled \n inside quoted fields) at 2 jump points
  Bytes from first data row on line 2 to the end of last row: 28432
  Line length: mean=217.04 sd=8.14 min=206 max=258
  Estimated number of rows: 28432 / 217.04 = 131
  Initial alloc = 144 rows (131 + 9%) using bytes/max(mean-2*sd,min) clamped between [1.1*estn, 2.0*estn]
  =====
[08] Assign column names
[09] Apply user overrides on column types
  After 0 type and 0 drop user overrides : A777775555A5A
[10] Allocate memory for the datatable
  Allocating 13 column slots (13 - 0 dropped) with 144 rows
[11] Read the data
  jumps=[0..1), chunk_size=1048576, total_size=28432
Read 131 rows x 13 columns from 27.89KB (28557 bytes) file in 00:00.002 wall clock time
[12] Finalizing the datatable
  Type counts:
         5 : int32     '5'
         5 : float64   '7'
         3 : string    'A'
=============================
   0.000s (  8%) Memory map 0.000GB file
   0.001s ( 57%) sep='\t' ncol=13 and header detection
   0.000s (  2%) Column type detection using 131 sample rows
   0.000s (  2%) Allocation of 144 rows x 13 cols (0.000GB) of which 131 ( 91%) rows used
   0.001s ( 32%) Reading 1 chunks (0 swept) of 1.000MB (each chunk 131 rows) using 1 threads
   +    0.000s (  3%) Parse to row-major thread buffers (grown 0 times)
   +    0.001s ( 26%) Transpose
   +    0.000s (  3%) Waiting
   0.000s (  0%) Rereading 0 columns due to out-of-sample type exceptions
   0.002s        Total
omp_get_max_threads() = 4
omp_get_thread_limit() = 2147483647
DTthreads = 0
Input contains no \n. Taking this to be a filename to open
[01] Check arguments
  Using 4 threads (omp_get_max_threads()=4, nth=4)
  NAstrings = [<<NA>>]
  None of the NAstrings look like numbers.
  skip num lines = 0
  show progress = 1
  0/1 column will be read as integer
[02] Opening the file
  Opening file /home/xk/dataprojects/ghsl/82.csv
  File opened, size = 1.156KB (1184 bytes).
  Memory mapped ok
[03] Detect and skip BOM
[04] Arrange mmap to be \0 terminated
  \n has been found in the input and different lines can end with different line endings (e.g. mixed \n and \r\n in one file). This is common and ideal.
[05] Skipping initial rows if needed
  Positioned on line 1 starting: <<City	Area	builtup75	builtup90	>>
[06] Detect separator, quoting rule, and ncolumns
  Detecting sep automatically ...
  sep=','  with 5 lines of 3 fields using quote rule 0
  sep=0x9  with 6 lines of 13 fields using quote rule 0
  Detected 13 columns on line 1. This line is either column names or first data row. Line starts as: <<City	Area	builtup75	builtup90	>>
  Quote rule picked = 0
  fill=false and the most number of columns found is 13
[07] Detect column types, good nrow estimate and whether first row is column names
  Number of sampling jump points = 1 because (1183 bytes from row 1 to eof) / (2 * 1183 jump0size) == 0
  Type codes (jump 000)    : A777775555A5A  Quote rule 0
  'header' determined to be true due to column 2 containing a string on row 1 and a lower type (float64) in the rest of the 5 sample rows
  All rows were sampled since file is small so we know nrow=5 exactly
[08] Assign column names
[09] Apply user overrides on column types
  After 0 type and 0 drop user overrides : A777775555A5A
[10] Allocate memory for the datatable
  Allocating 13 column slots (13 - 0 dropped) with 5 rows
[11] Read the data
  jumps=[0..1), chunk_size=1048576, total_size=1059
Read 5 rows x 13 columns from 1.156KB (1184 bytes) file in 00:00.001 wall clock time
[12] Finalizing the datatable
  Type counts:
         5 : int32     '5'
         5 : float64   '7'
         3 : string    'A'
=============================
   0.000s ( 35%) Memory map 0.000GB file
   0.000s ( 53%) sep='\t' ncol=13 and header detection
   0.000s (  4%) Column type detection using 5 sample rows
   0.000s (  3%) Allocation of 5 rows x 13 cols (0.000GB) of which 5 (100%) rows used
   0.000s (  5%) Reading 1 chunks (0 swept) of 1.000MB (each chunk 5 rows) using 1 threads
   +    0.000s (  0%) Parse to row-major thread buffers (grown 0 times)
   +    0.000s (  1%) Transpose
   +    0.000s (  4%) Waiting
   0.000s (  0%) Rereading 0 columns due to out-of-sample type exceptions
   0.001s        Total
omp_get_max_threads() = 4
omp_get_thread_limit() = 2147483647
DTthreads = 0
Input contains no \n. Taking this to be a filename to open
[01] Check arguments
  Using 4 threads (omp_get_max_threads()=4, nth=4)
  NAstrings = [<<NA>>]
  None of the NAstrings look like numbers.
  skip num lines = 0
  show progress = 1
  0/1 column will be read as integer
[02] Opening the file
  Opening file /home/xk/dataprojects/ghsl/84.csv
  File opened, size = 9.14KB (9357 bytes).
  Memory mapped ok
[03] Detect and skip BOM
[04] Arrange mmap to be \0 terminated
  \n has been found in the input and different lines can end with different line endings (e.g. mixed \n and \r\n in one file). This is common and ideal.
[05] Skipping initial rows if needed
  Positioned on line 1 starting: <<City	Area	builtup75	builtup90	>>
[06] Detect separator, quoting rule, and ncolumns
  Detecting sep automatically ...
  sep=','  with 44 lines of 3 fields using quote rule 0
  sep=0x9  with 45 lines of 13 fields using quote rule 0
  Detected 13 columns on line 1. This line is either column names or first data row. Line starts as: <<City	Area	builtup75	builtup90	>>
  Quote rule picked = 0
  fill=false and the most number of columns found is 13
[07] Detect column types, good nrow estimate and whether first row is column names
  Number of sampling jump points = 1 because (9356 bytes from row 1 to eof) / (2 * 9356 jump0size) == 0
  Type codes (jump 000)    : A777775555A5A  Quote rule 0
  'header' determined to be true due to column 2 containing a string on row 1 and a lower type (float64) in the rest of the 44 sample rows
  All rows were sampled since file is small so we know nrow=44 exactly
[08] Assign column names
[09] Apply user overrides on column types
  After 0 type and 0 drop user overrides : A777775555A5A
[10] Allocate memory for the datatable
  Allocating 13 column slots (13 - 0 dropped) with 44 rows
[11] Read the data
  jumps=[0..1), chunk_size=1048576, total_size=9232
Read 44 rows x 13 columns from 9.14KB (9357 bytes) file in 00:00.001 wall clock time
[12] Finalizing the datatable
  Type counts:
         5 : int32     '5'
         5 : float64   '7'
         3 : string    'A'
=============================
   0.000s ( 17%) Memory map 0.000GB file
   0.001s ( 71%) sep='\t' ncol=13 and header detection
   0.000s (  2%) Column type detection using 44 sample rows
   0.000s (  2%) Allocation of 44 rows x 13 cols (0.000GB) of which 44 (100%) rows used
   0.000s (  8%) Reading 1 chunks (0 swept) of 1.000MB (each chunk 44 rows) using 1 threads
   +    0.000s (  2%) Parse to row-major thread buffers (grown 0 times)
   +    0.000s (  4%) Transpose
   +    0.000s (  2%) Waiting
   0.000s (  0%) Rereading 0 columns due to out-of-sample type exceptions
   0.001s        Total
omp_get_max_threads() = 4
omp_get_thread_limit() = 2147483647
DTthreads = 0
Input contains no \n. Taking this to be a filename to open
[01] Check arguments
  Using 4 threads (omp_get_max_threads()=4, nth=4)
  NAstrings = [<<NA>>]
  None of the NAstrings look like numbers.
  skip num lines = 0
  show progress = 1
  0/1 column will be read as integer
[02] Opening the file
  Opening file /home/xk/dataprojects/ghsl/86.csv
  File opened, size = 3.249KB (3327 bytes).
  Memory mapped ok
[03] Detect and skip BOM
[04] Arrange mmap to be \0 terminated
  \n has been found in the input and different lines can end with different line endings (e.g. mixed \n and \r\n in one file). This is common and ideal.
[05] Skipping initial rows if needed
  Positioned on line 1 starting: <<City	Area	builtup75	builtup90	>>
[06] Detect separator, quoting rule, and ncolumns
  Detecting sep automatically ...
  sep=','  with 15 lines of 3 fields using quote rule 0
  sep=0x9  with 16 lines of 13 fields using quote rule 0
  Detected 13 columns on line 1. This line is either column names or first data row. Line starts as: <<City	Area	builtup75	builtup90	>>
  Quote rule picked = 0
  fill=false and the most number of columns found is 13
[07] Detect column types, good nrow estimate and whether first row is column names
  Number of sampling jump points = 1 because (3326 bytes from row 1 to eof) / (2 * 3326 jump0size) == 0
  Type codes (jump 000)    : A777775555A5A  Quote rule 0
  'header' determined to be true due to column 2 containing a string on row 1 and a lower type (float64) in the rest of the 15 sample rows
  All rows were sampled since file is small so we know nrow=15 exactly
[08] Assign column names
[09] Apply user overrides on column types
  After 0 type and 0 drop user overrides : A777775555A5A
[10] Allocate memory for the datatable
  Allocating 13 column slots (13 - 0 dropped) with 15 rows
[11] Read the data
  jumps=[0..1), chunk_size=1048576, total_size=3202
Read 15 rows x 13 columns from 3.249KB (3327 bytes) file in 00:00.001 wall clock time
[12] Finalizing the datatable
  Type counts:
         5 : int32     '5'
         5 : float64   '7'
         3 : string    'A'
=============================
   0.000s ( 26%) Memory map 0.000GB file
   0.000s ( 58%) sep='\t' ncol=13 and header detection
   0.000s (  4%) Column type detection using 15 sample rows
   0.000s (  4%) Allocation of 15 rows x 13 cols (0.000GB) of which 15 (100%) rows used
   0.000s (  9%) Reading 1 chunks (0 swept) of 1.000MB (each chunk 15 rows) using 1 threads
   +    0.000s (  1%) Parse to row-major thread buffers (grown 0 times)
   +    0.000s (  3%) Transpose
   +    0.000s (  4%) Waiting
   0.000s (  0%) Rereading 0 columns due to out-of-sample type exceptions
   0.001s        Total
omp_get_max_threads() = 4
omp_get_thread_limit() = 2147483647
DTthreads = 0
Input contains no \n. Taking this to be a filename to open
[01] Check arguments
  Using 4 threads (omp_get_max_threads()=4, nth=4)
  NAstrings = [<<NA>>]
  None of the NAstrings look like numbers.
  skip num lines = 0
  show progress = 1
  0/1 column will be read as integer
[02] Opening the file
  Opening file /home/xk/dataprojects/ghsl/88.csv
  File opened, size = 758 bytes.
  Memory mapped ok
[03] Detect and skip BOM
[04] Arrange mmap to be \0 terminated
  \n has been found in the input and different lines can end with different line endings (e.g. mixed \n and \r\n in one file). This is common and ideal.
[05] Skipping initial rows if needed
  Positioned on line 1 starting: <<City	Area	builtup75	builtup90	>>
[06] Detect separator, quoting rule, and ncolumns
  Detecting sep automatically ...
  sep=','  with 3 lines of 3 fields using quote rule 0
  sep=0x9  with 4 lines of 13 fields using quote rule 0
  Detected 13 columns on line 1. This line is either column names or first data row. Line starts as: <<City	Area	builtup75	builtup90	>>
  Quote rule picked = 0
  fill=false and the most number of columns found is 13
[07] Detect column types, good nrow estimate and whether first row is column names
  Number of sampling jump points = 1 because (757 bytes from row 1 to eof) / (2 * 757 jump0size) == 0
  Type codes (jump 000)    : A777775555A5A  Quote rule 0
  'header' determined to be true due to column 2 containing a string on row 1 and a lower type (float64) in the rest of the 3 sample rows
  All rows were sampled since file is small so we know nrow=3 exactly
[08] Assign column names
[09] Apply user overrides on column types
  After 0 type and 0 drop user overrides : A777775555A5A
[10] Allocate memory for the datatable
  Allocating 13 column slots (13 - 0 dropped) with 3 rows
[11] Read the data
  jumps=[0..1), chunk_size=1048576, total_size=633
Read 3 rows x 13 columns from 758 bytes file in 00:00.000 wall clock time
[12] Finalizing the datatable
  Type counts:
         5 : int32     '5'
         5 : float64   '7'
         3 : string    'A'
=============================
   0.000s ( 31%) Memory map 0.000GB file
   0.000s ( 53%) sep='\t' ncol=13 and header detection
   0.000s (  5%) Column type detection using 3 sample rows
   0.000s (  4%) Allocation of 3 rows x 13 cols (0.000GB) of which 3 (100%) rows used
   0.000s (  7%) Reading 1 chunks (0 swept) of 1.000MB (each chunk 3 rows) using 1 threads
   +    0.000s (  0%) Parse to row-major thread buffers (grown 0 times)
   +    0.000s (  1%) Transpose
   +    0.000s (  6%) Waiting
   0.000s (  0%) Rereading 0 columns due to out-of-sample type exceptions
   0.000s        Total
omp_get_max_threads() = 4
omp_get_thread_limit() = 2147483647
DTthreads = 0
Input contains no \n. Taking this to be a filename to open
[01] Check arguments
  Using 4 threads (omp_get_max_threads()=4, nth=4)
  NAstrings = [<<NA>>]
  None of the NAstrings look like numbers.
  skip num lines = 0
  show progress = 1
  0/1 column will be read as integer
[02] Opening the file
  Opening file /home/xk/dataprojects/ghsl/89.csv
  File opened, size = 793 bytes.
  Memory mapped ok
[03] Detect and skip BOM
[04] Arrange mmap to be \0 terminated
  \n has been found in the input and different lines can end with different line endings (e.g. mixed \n and \r\n in one file). This is common and ideal.
[05] Skipping initial rows if needed
  Positioned on line 1 starting: <<City	Area	builtup75	builtup90	>>
[06] Detect separator, quoting rule, and ncolumns
  Detecting sep automatically ...
  sep=','  with 3 lines of 3 fields using quote rule 0
  sep=0x9  with 4 lines of 13 fields using quote rule 0
  Detected 13 columns on line 1. This line is either column names or first data row. Line starts as: <<City	Area	builtup75	builtup90	>>
  Quote rule picked = 0
  fill=false and the most number of columns found is 13
[07] Detect column types, good nrow estimate and whether first row is column names
  Number of sampling jump points = 1 because (792 bytes from row 1 to eof) / (2 * 792 jump0size) == 0
  Type codes (jump 000)    : A777775555A5A  Quote rule 0
  'header' determined to be true due to column 2 containing a string on row 1 and a lower type (float64) in the rest of the 3 sample rows
  All rows were sampled since file is small so we know nrow=3 exactly
[08] Assign column names
[09] Apply user overrides on column types
  After 0 type and 0 drop user overrides : A777775555A5A
[10] Allocate memory for the datatable
  Allocating 13 column slots (13 - 0 dropped) with 3 rows
[11] Read the data
  jumps=[0..1), chunk_size=1048576, total_size=668
Read 3 rows x 13 columns from 793 bytes file in 00:00.001 wall clock time
[12] Finalizing the datatable
  Type counts:
         5 : int32     '5'
         5 : float64   '7'
         3 : string    'A'
=============================
   0.000s ( 29%) Memory map 0.000GB file
   0.000s ( 54%) sep='\t' ncol=13 and header detection
   0.000s (  5%) Column type detection using 3 sample rows
   0.000s (  3%) Allocation of 3 rows x 13 cols (0.000GB) of which 3 (100%) rows used
   0.000s (  9%) Reading 1 chunks (0 swept) of 1.000MB (each chunk 3 rows) using 1 threads
   +    0.000s (  1%) Parse to row-major thread buffers (grown 0 times)
   +    0.000s (  2%) Transpose
   +    0.000s (  6%) Waiting
   0.000s (  0%) Rereading 0 columns due to out-of-sample type exceptions
   0.001s        Total
omp_get_max_threads() = 4
omp_get_thread_limit() = 2147483647
DTthreads = 0
Input contains no \n. Taking this to be a filename to open
[01] Check arguments
  Using 4 threads (omp_get_max_threads()=4, nth=4)
  NAstrings = [<<NA>>]
  None of the NAstrings look like numbers.
  skip num lines = 0
  show progress = 1
  0/1 column will be read as integer
[02] Opening the file
  Opening file /home/xk/dataprojects/ghsl/9.csv
  File opened, size = 14.28KB (14618 bytes).
  Memory mapped ok
[03] Detect and skip BOM
[04] Arrange mmap to be \0 terminated
  \n has been found in the input and different lines can end with different line endings (e.g. mixed \n and \r\n in one file). This is common and ideal.
[05] Skipping initial rows if needed
  Positioned on line 1 starting: <<City	Area	builtup75	builtup90	>>
[06] Detect separator, quoting rule, and ncolumns
  Detecting sep automatically ...
  sep=','  with 66 lines of 3 fields using quote rule 0
  sep=0x9  with 67 lines of 13 fields using quote rule 0
  Detected 13 columns on line 1. This line is either column names or first data row. Line starts as: <<City	Area	builtup75	builtup90	>>
  Quote rule picked = 0
  fill=false and the most number of columns found is 13
[07] Detect column types, good nrow estimate and whether first row is column names
  Number of sampling jump points = 1 because (14617 bytes from row 1 to eof) / (2 * 14617 jump0size) == 0
  Type codes (jump 000)    : A777775555A5A  Quote rule 0
  'header' determined to be true due to column 2 containing a string on row 1 and a lower type (float64) in the rest of the 66 sample rows
  All rows were sampled since file is small so we know nrow=66 exactly
[08] Assign column names
[09] Apply user overrides on column types
  After 0 type and 0 drop user overrides : A777775555A5A
[10] Allocate memory for the datatable
  Allocating 13 column slots (13 - 0 dropped) with 66 rows
[11] Read the data
  jumps=[0..1), chunk_size=1048576, total_size=14493
Read 66 rows x 13 columns from 14.28KB (14618 bytes) file in 00:00.001 wall clock time
[12] Finalizing the datatable
  Type counts:
         5 : int32     '5'
         5 : float64   '7'
         3 : string    'A'
=============================
   0.000s ( 16%) Memory map 0.000GB file
   0.001s ( 71%) sep='\t' ncol=13 and header detection
   0.000s (  2%) Column type detection using 66 sample rows
   0.000s (  2%) Allocation of 66 rows x 13 cols (0.000GB) of which 66 (100%) rows used
   0.000s (  9%) Reading 1 chunks (0 swept) of 1.000MB (each chunk 66 rows) using 1 threads
   +    0.000s (  2%) Parse to row-major thread buffers (grown 0 times)
   +    0.000s (  5%) Transpose
   +    0.000s (  2%) Waiting
   0.000s (  0%) Rereading 0 columns due to out-of-sample type exceptions
   0.001s        Total
omp_get_max_threads() = 4
omp_get_thread_limit() = 2147483647
DTthreads = 0
Input contains no \n. Taking this to be a filename to open
[01] Check arguments
  Using 4 threads (omp_get_max_threads()=4, nth=4)
  NAstrings = [<<NA>>]
  None of the NAstrings look like numbers.
  skip num lines = 0
  show progress = 1
  0/1 column will be read as integer
[02] Opening the file
  Opening file /home/xk/dataprojects/ghsl/90.csv
  File opened, size = 536 bytes.
  Memory mapped ok
[03] Detect and skip BOM
[04] Arrange mmap to be \0 terminated
  \n has been found in the input and different lines can end with different line endings (e.g. mixed \n and \r\n in one file). This is common and ideal.
[05] Skipping initial rows if needed
  Positioned on line 1 starting: <<City	Area	builtup75	builtup90	>>
[06] Detect separator, quoting rule, and ncolumns
  Detecting sep automatically ...
  sep=','  with 2 lines of 3 fields using quote rule 0
  sep=0x9  with 3 lines of 13 fields using quote rule 0
  Detected 13 columns on line 1. This line is either column names or first data row. Line starts as: <<City	Area	builtup75	builtup90	>>
  Quote rule picked = 0
  fill=false and the most number of columns found is 13
[07] Detect column types, good nrow estimate and whether first row is column names
  Number of sampling jump points = 1 because (535 bytes from row 1 to eof) / (2 * 535 jump0size) == 0
  Type codes (jump 000)    : A757575555A5A  Quote rule 0
  'header' determined to be true due to column 2 containing a string on row 1 and a lower type (float64) in the rest of the 2 sample rows
  All rows were sampled since file is small so we know nrow=2 exactly
[08] Assign column names
[09] Apply user overrides on column types
  After 0 type and 0 drop user overrides : A757575555A5A
[10] Allocate memory for the datatable
  Allocating 13 column slots (13 - 0 dropped) with 2 rows
[11] Read the data
  jumps=[0..1), chunk_size=1048576, total_size=411
Read 2 rows x 13 columns from 536 bytes file in 00:00.001 wall clock time
[12] Finalizing the datatable
  Type counts:
         7 : int32     '5'
         3 : float64   '7'
         3 : string    'A'
=============================
   0.000s ( 39%) Memory map 0.000GB file
   0.000s ( 47%) sep='\t' ncol=13 and header detection
   0.000s (  4%) Column type detection using 2 sample rows
   0.000s (  4%) Allocation of 2 rows x 13 cols (0.000GB) of which 2 (100%) rows used
   0.000s (  6%) Reading 1 chunks (0 swept) of 1.000MB (each chunk 2 rows) using 1 threads
   +    0.000s (  0%) Parse to row-major thread buffers (grown 0 times)
   +    0.000s (  1%) Transpose
   +    0.000s (  4%) Waiting
   0.000s (  0%) Rereading 0 columns due to out-of-sample type exceptions
   0.001s        Total
omp_get_max_threads() = 4
omp_get_thread_limit() = 2147483647
DTthreads = 0
Input contains no \n. Taking this to be a filename to open
[01] Check arguments
  Using 4 threads (omp_get_max_threads()=4, nth=4)
  NAstrings = [<<NA>>]
  None of the NAstrings look like numbers.
  skip num lines = 0
  show progress = 1
  0/1 column will be read as integer
[02] Opening the file
  Opening file /home/xk/dataprojects/ghsl/91.csv
  File opened, size = 2.021KB (2069 bytes).
  Memory mapped ok
[03] Detect and skip BOM
[04] Arrange mmap to be \0 terminated
  \n has been found in the input and different lines can end with different line endings (e.g. mixed \n and \r\n in one file). This is common and ideal.
[05] Skipping initial rows if needed
  Positioned on line 1 starting: <<City	Area	builtup75	builtup90	>>
[06] Detect separator, quoting rule, and ncolumns
  Detecting sep automatically ...
  sep=','  with 9 lines of 3 fields using quote rule 0
  sep=0x9  with 10 lines of 13 fields using quote rule 0
  Detected 13 columns on line 1. This line is either column names or first data row. Line starts as: <<City	Area	builtup75	builtup90	>>
  Quote rule picked = 0
  fill=false and the most number of columns found is 13
[07] Detect column types, good nrow estimate and whether first row is column names
  Number of sampling jump points = 1 because (2068 bytes from row 1 to eof) / (2 * 2068 jump0size) == 0
  Type codes (jump 000)    : A777775555A5A  Quote rule 0
  'header' determined to be true due to column 2 containing a string on row 1 and a lower type (float64) in the rest of the 9 sample rows
  All rows were sampled since file is small so we know nrow=9 exactly
[08] Assign column names
[09] Apply user overrides on column types
  After 0 type and 0 drop user overrides : A777775555A5A
[10] Allocate memory for the datatable
  Allocating 13 column slots (13 - 0 dropped) with 9 rows
[11] Read the data
  jumps=[0..1), chunk_size=1048576, total_size=1944
Read 9 rows x 13 columns from 2.021KB (2069 bytes) file in 00:00.001 wall clock time
[12] Finalizing the datatable
  Type counts:
         5 : int32     '5'
         5 : float64   '7'
         3 : string    'A'
=============================
   0.000s ( 28%) Memory map 0.000GB file
   0.000s ( 57%) sep='\t' ncol=13 and header detection
   0.000s (  4%) Column type detection using 9 sample rows
   0.000s (  4%) Allocation of 9 rows x 13 cols (0.000GB) of which 9 (100%) rows used
   0.000s (  7%) Reading 1 chunks (0 swept) of 1.000MB (each chunk 9 rows) using 1 threads
   +    0.000s (  1%) Parse to row-major thread buffers (grown 0 times)
   +    0.000s (  2%) Transpose
   +    0.000s (  5%) Waiting
   0.000s (  0%) Rereading 0 columns due to out-of-sample type exceptions
   0.001s        Total
omp_get_max_threads() = 4
omp_get_thread_limit() = 2147483647
DTthreads = 0
Input contains no \n. Taking this to be a filename to open
[01] Check arguments
  Using 4 threads (omp_get_max_threads()=4, nth=4)
  NAstrings = [<<NA>>]
  None of the NAstrings look like numbers.
  skip num lines = 0
  show progress = 1
  0/1 column will be read as integer
[02] Opening the file
  Opening file /home/xk/dataprojects/ghsl/94.csv
  File opened, size = 5.780KB (5919 bytes).
  Memory mapped ok
[03] Detect and skip BOM
[04] Arrange mmap to be \0 terminated
  \n has been found in the input and different lines can end with different line endings (e.g. mixed \n and \r\n in one file). This is common and ideal.
[05] Skipping initial rows if needed
  Positioned on line 1 starting: <<City	Area	builtup75	builtup90	>>
[06] Detect separator, quoting rule, and ncolumns
  Detecting sep automatically ...
  sep=','  with 27 lines of 3 fields using quote rule 0
  sep=0x9  with 28 lines of 13 fields using quote rule 0
  Detected 13 columns on line 1. This line is either column names or first data row. Line starts as: <<City	Area	builtup75	builtup90	>>
  Quote rule picked = 0
  fill=false and the most number of columns found is 13
[07] Detect column types, good nrow estimate and whether first row is column names
  Number of sampling jump points = 1 because (5918 bytes from row 1 to eof) / (2 * 5918 jump0size) == 0
  Type codes (jump 000)    : A777775555A5A  Quote rule 0
  'header' determined to be true due to column 2 containing a string on row 1 and a lower type (float64) in the rest of the 27 sample rows
  All rows were sampled since file is small so we know nrow=27 exactly
[08] Assign column names
[09] Apply user overrides on column types
  After 0 type and 0 drop user overrides : A777775555A5A
[10] Allocate memory for the datatable
  Allocating 13 column slots (13 - 0 dropped) with 27 rows
[11] Read the data
  jumps=[0..1), chunk_size=1048576, total_size=5794
Read 27 rows x 13 columns from 5.780KB (5919 bytes) file in 00:00.001 wall clock time
[12] Finalizing the datatable
  Type counts:
         5 : int32     '5'
         5 : float64   '7'
         3 : string    'A'
=============================
   0.000s ( 16%) Memory map 0.000GB file
   0.001s ( 70%) sep='\t' ncol=13 and header detection
   0.000s (  3%) Column type detection using 27 sample rows
   0.000s (  4%) Allocation of 27 rows x 13 cols (0.000GB) of which 27 (100%) rows used
   0.000s (  7%) Reading 1 chunks (0 swept) of 1.000MB (each chunk 27 rows) using 1 threads
   +    0.000s (  1%) Parse to row-major thread buffers (grown 0 times)
   +    0.000s (  4%) Transpose
   +    0.000s (  3%) Waiting
   0.000s (  0%) Rereading 0 columns due to out-of-sample type exceptions
   0.001s        Total
omp_get_max_threads() = 4
omp_get_thread_limit() = 2147483647
DTthreads = 0
Input contains no \n. Taking this to be a filename to open
[01] Check arguments
  Using 4 threads (omp_get_max_threads()=4, nth=4)
  NAstrings = [<<NA>>]
  None of the NAstrings look like numbers.
  skip num lines = 0
  show progress = 1
  0/1 column will be read as integer
[02] Opening the file
  Opening file /home/xk/dataprojects/ghsl/97.csv
  File opened, size = 342 bytes.
  Memory mapped ok
[03] Detect and skip BOM
[04] Arrange mmap to be \0 terminated
  \n has been found in the input and different lines can end with different line endings (e.g. mixed \n and \r\n in one file). This is common and ideal.
[05] Skipping initial rows if needed
  Positioned on line 1 starting: <<City	Area	builtup75	builtup90	>>
[06] Detect separator, quoting rule, and ncolumns
  Detecting sep automatically ...
  sep=','  with 1 lines of 3 fields using quote rule 0
  sep=0x9  with 2 lines of 13 fields using quote rule 0
  Detected 13 columns on line 1. This line is either column names or first data row. Line starts as: <<City	Area	builtup75	builtup90	>>
  Quote rule picked = 0
  fill=false and the most number of columns found is 13
[07] Detect column types, good nrow estimate and whether first row is column names
  Number of sampling jump points = 1 because (341 bytes from row 1 to eof) / (2 * 341 jump0size) == 0
  Type codes (jump 000)    : A777555555A5A  Quote rule 0
  'header' determined to be true due to column 2 containing a string on row 1 and a lower type (float64) in the rest of the 1 sample rows
  All rows were sampled since file is small so we know nrow=1 exactly
[08] Assign column names
[09] Apply user overrides on column types
  After 0 type and 0 drop user overrides : A777555555A5A
[10] Allocate memory for the datatable
  Allocating 13 column slots (13 - 0 dropped) with 1 rows
[11] Read the data
  jumps=[0..1), chunk_size=1048576, total_size=217
Read 1 rows x 13 columns from 342 bytes file in 00:00.000 wall clock time
[12] Finalizing the datatable
  Type counts:
         7 : int32     '5'
         3 : float64   '7'
         3 : string    'A'
=============================
   0.000s ( 30%) Memory map 0.000GB file
   0.000s ( 53%) sep='\t' ncol=13 and header detection
   0.000s (  5%) Column type detection using 1 sample rows
   0.000s (  4%) Allocation of 1 rows x 13 cols (0.000GB) of which 1 (100%) rows used
   0.000s (  8%) Reading 1 chunks (0 swept) of 1.000MB (each chunk 1 rows) using 1 threads
   +    0.000s (  0%) Parse to row-major thread buffers (grown 0 times)
   +    0.000s (  1%) Transpose
   +    0.000s (  7%) Waiting
   0.000s (  0%) Rereading 0 columns due to out-of-sample type exceptions
   0.000s        Total

I'm not sure what I'm doing wrong; there is no error message. This is the call I ran:

fread_folder(directory = "~/dataprojects/ghsl", extension = "CSV", check.names = TRUE, verbose = TRUE)
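
A possible workaround, sketched under assumptions rather than taken from easycsv itself: judging by the reports on this page, fread_folder assigns one table per file into the environment and does not return a merged result. Reading the files directly with data.table::fread and stacking them with data.table::rbindlist gives a single table. The helper name read_folder_merged, the source_file column, and the demo folder below are hypothetical and only serve the illustration.

```r
library(data.table)

# Hypothetical helper (not part of easycsv): read every matching file in a
# folder with data.table::fread and stack the results into one data.table.
read_folder_merged <- function(directory, pattern = "\\.csv$", ...) {
  files <- list.files(directory, pattern = pattern, full.names = TRUE,
                      ignore.case = TRUE)
  if (length(files) == 0L) stop("no files matched in ", directory)
  # Name each path after its file so idcol can record where each row came from.
  names(files) <- basename(files)
  tables <- lapply(files, fread, ...)
  # use.names/fill align columns by name and pad missing columns with NA.
  rbindlist(tables, use.names = TRUE, fill = TRUE, idcol = "source_file")
}

# Demo on two throwaway tab-separated files (assumed layout, not the real data).
demo_dir <- file.path(tempdir(), "ghsl_demo")
dir.create(demo_dir, showWarnings = FALSE)
fwrite(data.table(City = "A", Area = 1.5),
       file.path(demo_dir, "1.csv"), sep = "\t")
fwrite(data.table(City = c("B", "C"), Area = c(2L, 3L)),
       file.path(demo_dir, "2.csv"), sep = "\t")

merged <- read_folder_merged(demo_dir)
print(merged)
```

For the folder in this report the call would be read_folder_merged("~/dataprojects/ghsl"). The verbose output above shows that some files parse a column as int32 where others parse it as float64, so the stacked column may end up as the wider type.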
