Comments (6)
Hmmm.....I'm not sure we can do anything here; we're calling Mmap.mmap from Julia Base, and I think that's just part of the behavior on Windows with mmapping (i.e. you're not allowed to delete a file that is currently mmapped). As long as the CSV.Source is in scope, I think we have to expect this behavior. But if you're just calling CSV.read, you can probably call gc(); gc(); afterwards and then be able to delete the file (this forces a garbage collection on the mmap and should release the file to be deleted).
from csv.jl.
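The behavior described above can be reproduced with the Mmap standard library alone. The following is a minimal sketch, not CSV.jl's actual code. It assumes Julia 0.7 or later, where gc() is spelled GC.gc(); the temporary file and the variable names are made up for illustration.

```julia
using Mmap

# Create a small throwaway file to stand in for "data.txt" (hypothetical data).
path, io = mktemp()
write(io, "a,b\n1,2\n")
close(io)

# Map the file into memory; `buf` is a Vector{UInt8} backed by the file.
buf = Mmap.mmap(path)

# Copy what we need out of the mapping so nothing refers to it afterwards.
text = String(copy(buf))

# Drop the only reference, then collect so the mapping's finalizer can run.
# On Windows, rm(path) is expected to fail while the mapping is still alive.
buf = nothing
GC.gc(); GC.gc()

rm(path)
println(text)
```

The copy before the collection matters: anything that still points into the mapped array keeps the mapping, and therefore the file, in use.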
Thanks. That workaround works.
CSV.read("data.txt")
if is_windows()
    gc()
    gc()
end
It does not look nice, though. If the user is never given a handle to the file, I think the memory should be unmapped after reading.
This works as well:
a = CSV.read("data.txt"; use_mmap=false)
readdlm in Julia Base sets use_mmap to false by default on Windows, by testing is_windows() ? false : true, probably because of the problem mentioned in this issue. Maybe the same approach could be used in the definition of Source, like:
Source(fullpath::Union{AbstractString,IO};
       delim=COMMA,
       quotechar=QUOTE,
       escapechar=ESCAPE,
       null::AbstractString="",
       header::Union{Integer,UnitRange{Int},Vector}=1, # header can be a row number, range of rows, or actual string vector
       datarow::Int=-1, # by default, data starts immediately after header or start of file
       types::Union{Dict{Int,DataType},Dict{String,DataType},Vector{DataType}}=DataType[],
       nullable::Bool=true,
       weakrefstrings::Bool=true,
       dateformat::Union{AbstractString,Dates.DateFormat}=Dates.ISODateFormat,
       footerskip::Int=0,
       rows_for_type_detect::Int=100,
       rows::Int=0,
       use_mmap::Bool=is_windows() ? false : true)
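To illustrate what a platform-dependent default like this amounts to, here is a small stand-alone sketch using only the standard library. The helper load_bytes is hypothetical and is not part of CSV.jl or Base; it also assumes Julia 0.7 or later, where is_windows() is spelled Sys.iswindows().

```julia
using Mmap

# Hypothetical helper, not part of CSV.jl or Base: return the bytes of `path`,
# memory-mapping the file only when `use_mmap` is true.
function load_bytes(path::AbstractString; use_mmap::Bool = !Sys.iswindows())
    if use_mmap
        return Mmap.mmap(path)   # file-backed; the file stays mapped while the array lives
    else
        return read(path)        # plain copy into memory; no mapping is kept
    end
end

# Throwaway file with made-up contents.
path, io = mktemp()
write(io, "x\n1\n")
close(io)

bytes = load_bytes(path; use_mmap = false)
rm(path)                         # nothing maps the file, so it can be deleted right away
text = String(bytes)
println(text)
```

With use_mmap = false the file can be removed immediately after reading, at the cost of copying its contents into memory.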
Here's the tradeoff: mmap is much faster, and this issue only comes up when you read a file and then quickly try to delete the underlying file. I could add some documentation around what needs to happen to safely delete the file, but it seems worth it to keep the speedups by using mmap by default on Windows.
I would find this additional documentation very useful (and maybe the conclusions could also be used in readdlm in Base). Is the problem only related to deletion of the file, or also to its modification?