node-klaw's People

Contributors

ad-m, gsf-sellis, hulkish, jimmywarting, jprichardson, manidlou, mceachen, pimterry, remcohaszing, ryanzim, tomhughes, trustedtomato


node-klaw's Issues

preserveSymlinks not following links

  • OS: Windows 10
  • Node.js v14.15.0
  • Klaw version: 3.0.0

Klaw is not actually following symlinks, at least on Windows.

If I include a symlink in the srcFolder structure, the link is only returned as a symlink and is not traversed.

Example code:

        const folders = []
        const files = []
        const links = []

        klaw(srcFolder, {'preserveSymlinks': true})
            .on('data', item => {
                if (item.stats.isDirectory()) {
                    folders.push( item.path )
                } else if ( item.stats.isSymbolicLink() )  {
                    links.push( item.path )
                } else if (item.stats.isFile()) {
                    files.push( item.path )
                }
            })
            .on('end', () => {
                console.log({folders, files, links})
            })
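
One possible reading, offered as an assumption rather than a confirmed diagnosis: the option name suggests that `preserveSymlinks: true` keeps links as items of their own instead of following them. If so, the behavior reported here would be expected, and leaving the option at its default would be the setting that traverses links. The sketch below does not use klaw at all. It only shows the underlying distinction in Node's `fs` module between describing a link (`lstat`) and following it (`stat`); the temporary directory names are made up for the demo.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Build a throwaway tree: real-dir/ plus a symlink "link" pointing at it.
const base = fs.mkdtempSync(path.join(os.tmpdir(), 'symlink-demo-'));
const realDir = path.join(base, 'real-dir');
const link = path.join(base, 'link');
fs.mkdirSync(realDir);
fs.symlinkSync(realDir, link, 'dir'); // may need elevated rights on Windows

// lstat describes the link itself; stat follows it to the target.
const viaLstat = fs.lstatSync(link);
const viaStat = fs.statSync(link);

console.log(viaLstat.isSymbolicLink(), viaLstat.isDirectory()); // true false
console.log(viaStat.isSymbolicLink(), viaStat.isDirectory());   // false true

// Clean up the demo tree.
fs.unlinkSync(link);
fs.rmdirSync(realDir);
fs.rmdirSync(base);
```

With the `lstat` view, the branch `item.stats.isDirectory()` in the example above is false for a link, so the item lands in `links` and nothing beneath it is visited.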

API to stop walking

I am using klaw to find files but I need to be able to stop walking the directories after a certain limit. I don't think there is currently an official way to do this. Here is my workaround:

function getFilesToProcess(directory, maxElements) {
    return new Promise((resolve, reject) => {
        const items = [];
        const walkStream = fs.walk(directory, {queueMethod: 'pop'});
        walkStream.close = function () {
            this.paths = [];
        };
        walkStream
            .on('data', function (item) {
                if (item.stats.isFile()) {
                    items.push(item.path);
                    if (maxElements > 0 && items.length >= maxElements) {
                        this.close();
                    }
                }
            })
            .on('end', () => resolve(items))
            .on('error', function (err) {
                this.close();
                reject(err);
            });
    });
}

I would rather not rely on an implementation detail to achieve this. What do you think about adding an official close method?
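
A possible alternative that avoids touching internal state, sketched under the assumption that the walker behaves like a standard Node.js Readable stream: call the stream's own `destroy()` method once the limit is reached. The helper name `collectUpTo` is hypothetical, and `Readable.from` stands in for the walker so the sketch runs without klaw installed.

```javascript
const { Readable } = require('stream');

// Hypothetical helper: collect at most maxItems items from an object-mode
// stream, then stop the stream with the standard destroy() method.
// A maxItems of 0 or less means "no limit".
function collectUpTo(stream, maxItems) {
  return new Promise((resolve, reject) => {
    const items = [];
    let done = false;
    stream.on('data', (item) => {
      if (done) return; // ignore anything that was already buffered
      items.push(item);
      if (maxItems > 0 && items.length >= maxItems) {
        done = true;
        stream.destroy(); // a destroyed stream normally emits 'close', not 'end'
        resolve(items);
      }
    });
    stream.on('end', () => resolve(items));
    stream.on('error', reject);
  });
}

// Stand-in for a walker stream of file paths.
const demo = collectUpTo(Readable.from(['a.txt', 'b.txt', 'c.txt', 'd.txt']), 2);
demo.then((items) => console.log(items));
```

Because the promise is resolved at the moment the limit is hit, the caller does not depend on an `end` event that may never arrive after the stream is destroyed.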

Maximum call stack size exceeded

RangeError: Maximum call stack size exceeded
at node_modules/klaw/src/index.js:45:23
at go$readdir$cb (node_modules/graceful-fs/graceful-fs.js:187:14)
at FSReqWrap.oncomplete (fs.js:135:15)

This is with version 2.1.1, but nothing in the newer versions seems related to this.

I have a folder with 200k+ files in it that crashes klaw.

        const crawl = async (subdir: string, cb) => {
            const done = resolvable();
            klaw(path.join(dir, subdir), {
                filter: (item) => {
                    return item.endsWith(".job");
                }
            }).on('data', (item) => {
                cb(item.path);               
            }).on('end', () => done.resolve());
            await done;
        };

Create LICENSE file

Would you please add a LICENSE file to your packages which includes your copyright information and the text of the MIT license? The MIT license states that the license text must accompany the source code. This also makes it easier for people like myself to package up your modules for Linux distributions.

Remove graceful-fs try-catch because of performance

There is an fs option on walk's options object, so the try-catch around graceful-fs is unnecessary.
The main problem with it is its performance overhead: with it, the library's loading time is around 20ms, and without it, it's 3ms on average.

Would you accept a PR?

Add only-file and only-dir support

Would you be interested in file-only and directory-only functionality?

That is, providing convenient functions to emit only files and only directories (excluding the root path) by piping a simple PassThrough stream to the walk function, or any other preferred approach?
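
A minimal sketch of what such helpers might look like, using only Node's built-in `stream` module. The names `onlyItems`, `onlyFiles` and `onlyDirs` are hypothetical, and the sketch assumes each item has the `{ path, stats }` shape seen in the other examples on this page. Excluding the root path is not handled here.

```javascript
const { Transform } = require('stream');

// Hypothetical helper: an object-mode stream that forwards only the items
// whose stats satisfy the given predicate.
function onlyItems(predicate) {
  return new Transform({
    objectMode: true,
    transform(item, encoding, next) {
      if (predicate(item.stats)) this.push(item);
      next();
    }
  });
}

// Convenience wrappers for the two cases named in the proposal.
const onlyFiles = () => onlyItems((stats) => stats.isFile());
const onlyDirs = () => onlyItems((stats) => stats.isDirectory());
```

Usage would then be a single pipe, for example `walker.pipe(onlyFiles()).on('data', ...)`, where `walker` is whatever stream the walk function returns.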

fs.remove inside the walk leads to errors

Hi,

Imagine that I want to walk through the directory contents and remove all the directories there.

Here is my directory:

tmp
├── dir1
│   └── hello.txt
├── dir2
├── dir3
└── hello.txt

3 directories, 2 files

Here is my code:

'use strict';

const klaw = require('klaw');
const through2 = require('through2');
const fs = require('fs-extra');


fs.walk('./tmp')
  .pipe(through2.obj(function (item, enc, next) {
    if (item.stats.isDirectory()) {
      fs.remove(item.path)
    }
    next();
  }))
  /*.on('data', function (item) {

  })*/
  .on('end', function () {
    console.log('end of everything');
  })

Here is the error I get during execution:
(screenshot of the error not preserved)

After execution, the directory looks like this:

tmp
└── dir1
    └── hello.txt

1 directory, 1 file

I have tried different variants, including marking the items as removed, but I haven't found anything that works yet.

Could you please clarify whether it's possible to clean up the directory while "klawling" through it?

Regards,

Not traversing subdirectories if some filter functions applied

I was playing with #11 to see if I could find a solution, and I noticed another issue when a filter function is applied. It differs from #11, where the root directory is part of the result array even though a filter function is applied. This issue is about subdirectories not being traversed at all when certain filter functions are applied, as in the following example.

Imagine we have

tmp
  |_dir1
  |   |_foo.md
  |_bar.md
  |_baz.md

and we want to get only .md files. If we use it like

var filterFunc = function (item) {
  return path.extname(item) === '.md'
}
var items = []
klaw('tmp', {filter: filterFunc}).on('data', function (item){
  items.push(item.path)
}).on('end', function () {
  console.dir(items)
})

the result array is ['tmp', 'bar.md', 'baz.md'].

So I checked the code again. As far as I understand, this happens because when a filter function is passed, all contents of the root directory go through it, and since path.extname(item) === '.md' is false for every subdirectory, none of them is read. Therefore only items in the root directory itself are returned. Please correct me if I am wrong.

Edit

However, if we run the same example using pipe for filtering, everything works fine. Apparently, the problem only arises when the filter option is used.

var filter = through2.obj(function (item, enc, next) {
  if (path.extname(item.path) === '.md') this.push(item)
  next()
})

var items = []
klaw('tmp')
  .pipe(filter)
  .on('data', function (item) {
    items.push(item.path)
  })
  .on('end', function () {
    console.dir(items) // => ['bar.md', 'baz.md', 'dir1/foo.md']
  })
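
If the explanation above is right, one workaround is a filter that also accepts directories, so that they are still read. The sketch below is an assumption-laden illustration, not a fix in the library: `mdOrDirectory` is a hypothetical name, it takes a path string like the filter functions shown in this issue, and it pays for an extra synchronous `stat` call per non-markdown entry. The demo recreates part of the tree from the issue in a temporary directory, plus one non-markdown file (`notes.txt`) added for contrast.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical filter: accept .md files, and also accept directories so
// that a walker which applies the filter before descending can enter them.
function mdOrDirectory(itemPath) {
  if (path.extname(itemPath) === '.md') return true;
  try {
    return fs.statSync(itemPath).isDirectory();
  } catch (err) {
    return false; // unreadable or vanished entries are skipped
  }
}

// Recreate a small version of the tree in a temporary location.
const tmp = fs.mkdtempSync(path.join(os.tmpdir(), 'filter-demo-'));
fs.mkdirSync(path.join(tmp, 'dir1'));
fs.writeFileSync(path.join(tmp, 'dir1', 'foo.md'), '');
fs.writeFileSync(path.join(tmp, 'bar.md'), '');
fs.writeFileSync(path.join(tmp, 'notes.txt'), '');

console.log(mdOrDirectory(path.join(tmp, 'dir1')));      // true
console.log(mdOrDirectory(path.join(tmp, 'bar.md')));    // true
console.log(mdOrDirectory(path.join(tmp, 'notes.txt'))); // false
```

Directories would then show up in the results as well, so the data handler would still need to drop them, for example by checking `item.stats.isDirectory()`.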

Document that symlinks not traversed

Spent a while trying to work out why my directory tree wasn't being traversed -- my root path was a symlink which isn't traversed.
It would be helpful to document explicitly that symlinks aren't traversed, or provide an option to permit this.

Support file URLs

The Node.js fs module supports file URLs wherever file paths are supported. It would be nice if this package accepted those too.
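
Until that is supported, a caller could convert the URL up front. This is a small sketch using `fileURLToPath` from Node's `url` module; the helper name `toPathString` is hypothetical, and treating any string that starts with `file:` as a URL is a simplifying assumption.

```javascript
const os = require('os');
const path = require('path');
const { fileURLToPath, pathToFileURL } = require('url');

// Hypothetical helper: accept a path string, a file: URL string, or a URL
// object, and always return a plain path string.
function toPathString(dir) {
  if (dir instanceof URL) return fileURLToPath(dir);
  if (typeof dir === 'string' && dir.startsWith('file:')) return fileURLToPath(dir);
  return dir;
}

// Round-trip the system temp directory through each accepted form.
const dir = path.resolve(os.tmpdir());
console.log(toPathString(pathToFileURL(dir)) === dir);      // true
console.log(toPathString(pathToFileURL(dir).href) === dir); // true
console.log(toPathString(dir) === dir);                     // true
```

The result of `toPathString` can be handed to any API that expects an ordinary path string.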

Update to mkdirp ≥ 1

Hi,

It looks easy to upgrade the mkdirp dependency:

--- a/tests/_test.js
+++ b/tests/_test.js
@@ -10,17 +10,16 @@
     var testDir = path.join(os.tmpdir(), 'klaw-tests')
     rimraf(testDir, function (err) {
       if (err) return t.end(err)
-      mkdirp(testDir, function (err) {
-        if (err) return t.end(err)
-
+      mkdirp(testDir).then(() => {
         var oldEnd = t.end
         t.end = function () {
           rimraf(testDir, function (err) {
             err ? oldEnd.apply(t, [err]) : oldEnd.apply(t, arguments)
           })
         }
-
         testFn(t, testDir)
+      }).catch((err) => {
+        return t.end(err)
       })
     })
   })

Continue walking despite error?

Is there a way to get klaw to continue walking through a directory when an error occurs, by simply ignoring the file or directory that caused it? I want to walk through an entire /home/, but when I encounter a permission error the walk ends without emitting end. Thanks

Klaw stops on error

Hello,

I really appreciate this awesome package, but I cannot configure it to continue on error.

If a directory access error occurs (I'm not sure about file errors), klaw just emits error and stops; end is never emitted.

I'd like to continue seeking through directories.

Currently I cannot use klaw to scan the whole readable filesystem, because it surely won't have access to some directories.

Any solutions?
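
For comparison, here is the skip-and-continue behavior being asked for, written as a plain synchronous walk over Node's `fs` module. This is explicitly not klaw and not a claim about how klaw could be configured; `walkSkippingErrors` is a hypothetical name. In the demo, a root that does not exist stands in for a directory that cannot be read.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Plain synchronous walk (not klaw): directories that cannot be read are
// reported through onError and skipped, and the walk carries on.
function* walkSkippingErrors(root, onError = () => {}) {
  const pending = [root];
  while (pending.length > 0) {
    const dir = pending.pop();
    let entries;
    try {
      entries = fs.readdirSync(dir, { withFileTypes: true });
    } catch (err) {
      onError(err, dir); // e.g. EACCES on a protected directory
      continue;
    }
    for (const entry of entries) {
      const full = path.join(dir, entry.name);
      if (entry.isDirectory()) pending.push(full);
      yield full;
    }
  }
}

// Demo: one readable tree, then a missing root that triggers the error path.
const tree = fs.mkdtempSync(path.join(os.tmpdir(), 'walk-demo-'));
fs.mkdirSync(path.join(tree, 'sub'));
fs.writeFileSync(path.join(tree, 'sub', 'a.txt'), '');
const errors = [];
const found = [...walkSkippingErrors(tree, (err) => errors.push(err.code))];
const missing = [...walkSkippingErrors(path.join(tree, 'nope'), (err) => errors.push(err.code))];
console.log(found.length, missing.length, errors);
```

The key design point is that the error is handled per directory, inside the loop, so one failure never ends the whole traversal.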

example of 'skip directories' does not actually skip the directories

I am looking for a way to skip directories, meaning: do not walk a specific directory's files or subfolders.

The example on the main page does not skip the folder; it simply filters out files.
I'd expect it to give only the files under the current folder.

So if I have

root
├── some-dir
│   └── some-file.txt
└── another-file.txt

I'd expect the output to contain only another-file.txt, with some-dir skipped.
However, I see that some-file.txt is also added to items.

Is there a way to actually skip a directory?

I tried not calling next(), but that seems to stop the entire process.
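
Another issue on this page reports that entries rejected by the `filter` option are never read, so their children are never visited. If that holds, a filter that rejects a directory by name would skip the whole subtree, unlike a piped stream, which only hides items after they have been walked. The sketch below is a guess along those lines; `skipDirectories` is a hypothetical name and the filter is assumed to receive a path string.

```javascript
const path = require('path');

// Hypothetical factory: build a filter that rejects any path whose final
// segment matches one of the given directory names.
function skipDirectories(names) {
  const skipped = new Set(names);
  return (itemPath) => !skipped.has(path.basename(itemPath));
}

const filter = skipDirectories(['some-dir']);
console.log(filter(path.join('root', 'some-dir')));         // false
console.log(filter(path.join('root', 'another-file.txt'))); // true
```

Note that this matches on the name alone, so a regular file called `some-dir` would be rejected as well.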

.walk includes root directory, regardless of filter

I am walking a directory, ./source, and even though I have specified a filter to include only markdown files via path.extname(), I am still receiving the root directory as an item in my final array.

let filterFn = function(item) {
    return path.extname(item) === ".md";
}

return new Promise((resolve, reject) => {
    let items = [];
    fs.walk('./source', { filter: filterFn }).on('data', item => {
        items.push(item);
    }).on('end', () => {
        return resolve(items);
    });

}).then(items => {
    items.forEach(item => {
        console.log(item.path); // ['foo.md', 'bar.md', 'baz.md', 'source'];
    });
});

I expected

['foo.md', 'bar.md', 'baz.md'];

I actually got

['foo.md', 'bar.md', 'baz.md', 'source'];
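
As a stopgap on the caller's side, the root entry can be dropped after the walk. This is a minimal sketch, assuming items have the `{ path }` shape used above; `withoutRoot` is a hypothetical name, and both sides are resolved to absolute paths so that `./source` and `source` compare equal.

```javascript
const path = require('path');

// Hypothetical helper: remove the entry for the walked root from a list of
// items shaped like { path }, comparing resolved absolute paths.
function withoutRoot(items, root) {
  const rootPath = path.resolve(root);
  return items.filter((item) => path.resolve(item.path) !== rootPath);
}

// Sample items standing in for the walk results.
const sample = [
  { path: path.resolve('source') },
  { path: path.resolve('source', 'foo.md') },
  { path: path.resolve('source', 'bar.md') }
];
console.log(withoutRoot(sample, './source').map((item) => path.basename(item.path)));
// [ 'foo.md', 'bar.md' ]
```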

Memory Usage

Does this module scale? If I have a directory with millions (or tens of millions) of files, will it iterate gracefully, or does it have to read the entire directory into memory?
