
node-pdf2img's People

Contributors

artcoding-git, ecostack, elgamala, elhigu, fitraditya, funkymusic, mgaeta

node-pdf2img's Issues

Bug in Multiple pages PDFs converted simultaneously

When I convert two multi-page PDFs at the same time, only one file is converted successfully; the other call receives the same response as the first instead of its own converted pages.
For example, say A.pdf (3 pages) and B.pdf (5 pages) are converted simultaneously. A.pdf is converted into A_1.jpg, A_2.jpg, and A_3.jpg, but no images are produced for B.pdf, and both calls receive the same response: A_1.jpg, A_2.jpg, A_3.jpg.
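
Until the root cause is identified, one possible workaround is to run conversions one at a time rather than in parallel. The following is a minimal sketch, not part of pdf2img: `convertSerially` and `convertOne` are hypothetical names, and `convertOne(file, cb)` is assumed to wrap whatever `setOptions` and `convert` calls the application makes. A fake converter stands in for the real one so the sketch runs on its own.

```javascript
// Runs callback-style conversions one after another instead of in parallel.
// `convertOne(file, cb)` is a hypothetical wrapper around setOptions + convert.
function convertSerially(files, convertOne, done) {
  const results = [];
  let index = 0;

  function next(err, info) {
    if (err) return done(err, results);
    if (index > 0) results.push(info);          // store the previous file's result
    if (index === files.length) return done(null, results);
    convertOne(files[index++], next);           // start the next file only now
  }
  next(null);
}

// Fake converter used only for this demonstration.
function fakeConvert(file, cb) {
  cb(null, { file: file, pages: file === 'A.pdf' ? 3 : 5 });
}

let finalResults = null;
convertSerially(['A.pdf', 'B.pdf'], fakeConvert, function (err, results) {
  finalResults = results;
  console.log(err || results);
});
```

Because each conversion starts only after the previous callback has fired, two files never share the converter at the same moment, at the cost of lower throughput.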

Issue while processing multiple page pdf

<--- Last few GCs --->

11144 ms: Scavenge 969.5 (1008.3) -> 969.5 (1008.3) MB, 0.3 / 0 ms (+ 2.2 ms in 1 steps since last GC) [allocation failure] [incremental marking delaying mark-sweep].
11256 ms: Mark-sweep 969.5 (1008.3) -> 585.1 (625.8) MB, 112.6 / 0 ms (+ 2.4 ms in 2 steps since start of marking, biggest step 2.2 ms) [last resort gc].
11329 ms: Mark-sweep 585.1 (625.8) -> 585.1 (623.8) MB, 72.9 / 0 ms [last resort gc].

<--- JS stacktrace --->

==== JS stack trace =========================================

Security context: 0x298005e3ac1
2: /* anonymous */ [/Documents/Nodedemo/pdf2image/node_modules/pdf2img/lib/pdf2img.js:~51] [pc=0x279bdc3b082](this=0x29800504101 ,pageCount=0x6097fb75189 <Number: 1.41414e+27>,callback=0x6097fb750f9 <JS Function (SharedFunctionInfo 0x6097fb200a1))
4: /* anonymous */ [/Documents/Nodedemo/pdf2image/node_modules/async/lib/async.js:638] [pc...

FATAL ERROR: invalid array length Allocation failed - process out of memory
Abort trap: 6

Conversion not working and failing tests

Under Node.js v0.10.36:

  1. Conversion does not work, using the example code from the README.

Error: [Error: File is not a PDF]

  2. Running the tests results in "[Error: File is not a PDF]" on the enclosed test file; see below.
    npm test

[email protected] test /[MY_PATH]/node-pdf2img
./node_modules/.bin/mocha --reporter spec
Split and covert pdf into images
/[MY_PATH]/node-pdf2img/test/test.pdf
[Error: File is not a PDF]

  1. Create png files
    /vagrant/node-pdf2img/test/test.pdf
    [Error: File is not a PDF]
  2. Create jpg files

0 passing (2m)
2 failing

  1. Split and covert pdf into images Create png files:
    Error: timeout of 60000ms exceeded. Ensure the done() callback is being called in this test.

  2. Split and covert pdf into images Create jpg files:
    Error: timeout of 60000ms exceeded. Ensure the done() callback is being called in this test.

npm ERR! weird error 2
npm ERR! not ok code 0

Could you please fix that?

Replace ImageMagick with GhostScript pagecount

I was having an issue with AWS Lambda where the ImageMagick line (72-73) was failing while trying to do a pagecount:

 gm(input).identify("%p ", function (err, value) {
	var pageCount = String(value).split(' ');

I removed ImageMagick (I also removed the import) and replaced line 72/73 with:

 gs()
  .executablePath('lambda-ghostscript/bin/./gs')
  .input(input)
  .pagecount(function (err, pageCount) {

and now it runs OK again.

Requirements need to install before using pdf2img

I just started using this to convert PDFs to images, but I am currently running into this error:

/bin/sh: identify: command not found
child_process.js:508
    throw err;
    ^

Error: Command failed: identify -format %n upload/2117c56efbd65908ae368201ec6517651468295928045.pdf
/bin/sh: identify: command not found

I've installed xpdf, I'm using Node v4.4.7, and I'm trying to convert a local PDF file. Are there any other requirements, such as ImageMagick?
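
Several reports on this page come down to an external command (`identify`, `gm`, `gs`, or `pdfinfo`) not being found. A quick way to check this before calling the library is to look for those commands on `PATH`. The sketch below uses only Node's standard library; `findOnPath` is a hypothetical helper, and the list of tool names is taken from the error messages on this page rather than from an official requirements list.

```javascript
const fs = require('fs');
const path = require('path');

// Returns the full path of `name` if it is found on PATH, otherwise null.
function findOnPath(name) {
  const dirs = (process.env.PATH || '').split(path.delimiter).filter(Boolean);
  // On Windows, executables carry an extension listed in PATHEXT.
  const exts = process.platform === 'win32'
    ? (process.env.PATHEXT || '.EXE;.CMD;.BAT').split(';')
    : [''];
  for (const dir of dirs) {
    for (const ext of exts) {
      const candidate = path.join(dir, name + ext);
      try {
        if (fs.statSync(candidate).isFile()) {
          fs.accessSync(candidate, fs.constants.X_OK);
          return candidate;
        }
      } catch (e) {
        // Not present or not executable here; keep looking.
      }
    }
  }
  return null;
}

// Tools named in the error messages on this page.
const status = {};
for (const tool of ['gm', 'identify', 'gs', 'pdfinfo']) {
  status[tool] = findOnPath(tool);
}
console.log(status);
```

A `null` entry means the command is not visible to the current process, which can also happen when a process manager starts the application with a different `PATH`.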

Path with spaces

Conversion does not work when the PDF path or file name contains spaces.
For a directory, the workaround is simply
dir.split(' ').join('\\ ')
but the same trick does not work when the file name contains a space.
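
A more general way to avoid escaping problems is to pass arguments as an array so that no shell parses them. The sketch below illustrates that idea with Node's `child_process.execFileSync`; it is not the pdf2img code, and that pdf2img builds its command as a single shell string is only inferred from the `Command failed: identify -format %n ...` messages on this page. The child process here is Node itself, echoing its last argument, and the path is made up for illustration.

```javascript
const { execFileSync } = require('child_process');

// Passing arguments as an array bypasses the shell, so spaces need no escaping.
// The child here is Node itself, writing back its last argument unchanged.
const input = '/tmp/dir with spaces/my file.pdf'; // hypothetical path
const echoed = execFileSync(
  process.execPath,
  ['-e', 'process.stdout.write(process.argv[process.argv.length - 1])', input],
  { encoding: 'utf8' }
);

console.log(echoed === input);
```

The same pattern would apply to any external command: the file name arrives as one argument whether or not it contains spaces.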

pdf2img, Error: Command failed: identify -format %n /docfire/EON.pdf

I've just installed pdf2img on my iMac:

Running: macOS Sierra version 10.12.4
Node version 7.8.0

I'm trying to write a robust PDF-to-text package. I first use pdf2json to parse the file; if it contains no text nodes with content, I then try the following:

    var pdf2img = require("pdf2img");
    var strFile = __dirname + "/" + aryOptions[0];
    console.log(strFile);
    pdf2img.setOptions({
        type: "png",
        size: 8192,
        density: 600,
        outputdir: __dirname + "/output",
        targetname: "pdf"
    });
    pdf2img.convert(strFile, function(err, info) {
        if (err) {
            console.log(err);
        } else {
            console.log(info);
        }
    });

Unfortunately this fails with:

    /bin/sh: identify: command not found
    child_process.js:524
    throw err;
    ^
    
    Error: Command failed: identify -format %n /docfire/EON.pdf
    /bin/sh: identify: command not found
    
    at checkExecSyncError (child_process.js:481:13)
    at execSync (child_process.js:521:13)
    at async.waterfall.pages (/docfire/node_modules/pdf2img/lib/pdf2img.js:47:32)
    at fn (/docfire/node_modules/async/lib/async.js:638:34)
    at Immediate.<anonymous> (/docfire/node_modules/async/lib/async.js:554:34)
    at runCallback (timers.js:672:20)
    at tryOnImmediate (timers.js:645:5)
    at processImmediate [as _immediateCallback] (timers.js:617:5)

[Edit] I have made some progress...after installing the dependencies which I missed initially, the output is now:

    execvp failed, errno = 2 (No such file or directory)
    gm identify: "gs" "-q" "-dBATCH" "-dSAFER" "-dMaxBitmap=50000000" "-dNOPAUSE" "-sDEVICE=ppmraw" "-dTextAlphaBits=4" "-dGraphicsAlphaBits=4" "-r72x72" "-sOutputFile=/var/folders/q7/hnxl054d71q277m5c3j25vdh0000gn/T/gmretZGf" "--" "/var/folders/q7/hnxl054d71q277m5c3j25vdh0000gn/T/gmdtH8SW" "-c" "quit".
    gm identify: Postscript delegate failed (/docfire/EON.pdf).
    gm identify: Request did not return an image.
    child_process.js:524
        throw err;
        ^
    Error: Command failed: gm identify -format %n /docfire/EON.pdf
    execvp failed, errno = 2 (No such file or directory)
    gm identify: "gs" "-q" "-dBATCH" "-dSAFER" "-dMaxBitmap=50000000" "-dNOPAUSE" "-sDEVICE=ppmraw" "-dTextAlphaBits=4" "-dGraphicsAlphaBits=4" "-r72x72" "-sOutputFile=/var/folders/q7/hnxl054d71q277m5c3j25vdh0000gn/T/gmretZGf" "--" "/var/folders/q7/hnxl054d71q277m5c3j25vdh0000gn/T/gmdtH8SW" "-c" "quit".
    gm identify: Postscript delegate failed (/docfire/EON.pdf).
    gm identify: Request did not return an image.
    
    at checkExecSyncError (child_process.js:481:13)
    at execSync (child_process.js:521:13)
    at async.waterfall.pages (/docfire/node_modules/pdf2img/lib/pdf2img.js:47:32)
    at fn (/docfire/node_modules/async/lib/async.js:638:34)
    at Immediate.<anonymous> (/docfire/node_modules/async/lib/async.js:554:34)
    at runCallback (timers.js:672:20)
    at tryOnImmediate (timers.js:645:5)
    at processImmediate [as _immediateCallback] (timers.js:617:5)

I've checked both the path and the file name, both are correct and exist.

After searching around I found that I had to also install:

    brew install ghostscript

Why does this error occur?

I am a beginner, please help me.
C:\Users\kaushal-pc\Desktop\nodejs\pdf2img>node pdf2img.js
'gm' is not recognized as an internal or external command,
operable program or batch file.
child_process.js:533
throw err;
^

Error: Command failed: gm identify -format "%p " "C:\Users\kaushal-pc\Desktop\nodejs\pdf2img/bharati.pdf"
'gm' is not recognized as an internal or external command,
operable program or batch file.

at checkExecSyncError (child_process.js:490:13)
at execSync (child_process.js:530:13)
at C:\Users\kaushal-pc\Desktop\nodejs\node_modules\pdf2img\lib\pdf2img.js:72:23
at fn (C:\Users\kaushal-pc\Desktop\nodejs\node_modules\pdf2img\node_modules\async\lib\async.js:638:34)
at Immediate.<anonymous> (C:\Users\kaushal-pc\Desktop\nodejs\node_modules\pdf2img\node_modules\async\lib\async.js:554:34)
at runCallback (timers.js:651:20)
at tryOnImmediate (timers.js:624:5)
at processImmediate [as _immediateCallback] (timers.js:596:5)

==========================pdf2img.js============================

var fs = require('fs');
var path = require('path');
var pdf2img = require('pdf2img');
var input = __dirname + '/bharati.pdf';
pdf2img.setOptions({
  type: 'png',
  size: 1024,
  density: 600,
  output: 'testx'
});
pdf2img.convert(input, function (err, info) {
  if (err)
    console.log(err);
  else
    console.log(info);
});

No error shown, but the image files don't open.

I'm on Ubuntu with xpdf installed. The PDF converts, but the resulting images don't open.

var express = require('express');
var router = express.Router();

var multer = require('multer');
var storage = multer.diskStorage({
  destination: function (req, file, cb) {
    cb(null, 'public/uploads/');
  },
  filename: function (req, file, cb) {
    cb(null, file.originalname);
  }
});
var upload = multer({ storage: storage });

var path = require("path");
var fs = require('fs');
var pdf2img = require('pdf2img');

/* POST upload. */
router.post('/', upload.single('pdfGabarito'), function(req,res,next){
  console.log(req.file);
  var input = path.join(__dirname, '../public/uploads/',req.file.originalname);
  //var input = __dirname + '/../public/uploads/'+req.file.originalname;
  console.log(input);
  pdf2img.setOptions({
    type: 'png',                      // png or jpeg, default png
    size: 1024,                       // default 1024
    density: 300,                     // default 600
    outputdir: path.join(__dirname, '../public/uploads/'+req.file.originalname.split('.')[0]+'/'), // mandatory, outputdir must be absolute path
    targetname: req.file.originalname.split('.')[0]              // the prefix for the generated files, optional
  });

  pdf2img.convert(input, function(err, info) {
    if (err) console.log(err)
    else console.log(info);
  });

  res.redirect('/');
});

Image in output is 0 KB

When I try to convert a PDF into a PNG or JPEG, the result is an empty image, with no errors in the console. Does anyone have the same problem or a solution? Thanks.

Bug in 1 page PDFs

Line 81 in lib/pdf2img.js :

if (options.page < pageCount.length) {

SHOULD BE

if (options.page <= pageCount.length) {
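
The difference between the two comparisons can be seen in isolation. This is only a sketch of the boundary case, assuming `pageCount` is an array with one entry per page and `options.page` is 1-based; neither assumption is confirmed by the code quoted above, and the function names are made up.

```javascript
// Assumption: `pages` has one entry per page and `page` is 1-based.
function shouldConvertStrict(page, pages) {
  return page < pages.length;     // the comparison quoted from line 81
}

function shouldConvertInclusive(page, pages) {
  return page <= pages.length;    // the proposed fix
}

const onePagePdf = ['1'];         // stand-in for a single-page document
console.log(shouldConvertStrict(1, onePagePdf));
console.log(shouldConvertInclusive(1, onePagePdf));
```

Under those assumptions the strict comparison rejects the only page of a one-page PDF, while the inclusive one accepts it.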

Move last changes to tag 0.1.1

The changes from the merged pull request have not arrived here, and they are not contained in tag 0.1.1.

Could you move the changes from issue #1 into tag 0.1.1, or create a new tag?

Thank you!

callback not working

My Node version is 7.3.
I want to convert a PDF file to a PNG file.
The PDF file contains only a single picture.

The problem is that the convert method does not succeed every time, and the convert callback is never called,
neither on success nor on failure, although sometimes I do get the correct PNG file.

I checked the code,
and my guess is that the "write" method of the "gm" module is the problem:
I put a console.log in the write method's callback function, and it never runs.
Has anybody had the same problem?
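
When a callback may never fire, a caller can at least turn the silent hang into a visible error. The following is a generic sketch, not a fix for pdf2img or gm: `withTimeout` is a hypothetical helper that wraps any function whose last argument is an error-first callback, and `fakeConvert` stands in for the real conversion call.

```javascript
// Wraps a callback-style function so the callback fires exactly once:
// either with the real result or with a timeout error.
function withTimeout(fn, ms) {
  return function (...args) {
    const callback = args.pop();
    let settled = false;
    const timer = setTimeout(function () {
      if (settled) return;
      settled = true;
      callback(new Error('no callback after ' + ms + ' ms'));
    }, ms);
    fn(...args, function (err, result) {
      if (settled) return;        // ignore late or duplicate callbacks
      settled = true;
      clearTimeout(timer);
      callback(err, result);
    });
  };
}

// Fake conversion that calls back immediately, for demonstration only.
function fakeConvert(input, callback) {
  callback(null, { input: input, pages: 1 });
}

const safeConvert = withTimeout(fakeConvert, 30000);
let outcome = null;
safeConvert('one-page.pdf', function (err, info) {
  outcome = err ? err.message : info;
});
console.log(outcome);
```

If the wrapped function never calls back, the caller receives an error after the chosen delay instead of waiting forever; the underlying conversion is not cancelled.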

pdf file size too large

Converting a PDF of about 5.4 MB to PNG works, but with a PDF of 6.1 MB or larger there is no response at all.

Output to base64?

Thanks for this, it works just fine if you follow the instructions for install!
Also note that ImageMagick does NOT work on other platforms than Linux (it works fine in Ubuntu subsystem on Windows 10 though).

I'd like to request a feature to return the created image as base64 instead of saving it to file on disk.
E.g.:

{ result: 'success',
  message: 
   [ { page: 1,
       name: 'test_1',
       size: 17.275,
       content: '/9j/7QBEUGhvdG9zaG9...base64-encoded-image-content...fXNWzvDEeYxxxzj/Coa6Bax//Z'
     },
     { page: 2,
       name: 'test_2',
       size: 24.518,
       content: '/9j/7QBEUGhvdG9zaG9...base64-encoded-image-content...fXNWzvDEeYxxxzj/Coa6Bax//Z'
     }
   ]
}
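
Until such a feature exists, a caller could add the base64 content after conversion by reading the generated files. The sketch below shows this with Node's `fs` module only. `withBase64Content` is a hypothetical helper, and the `path` field on each entry is an assumption: the result format quoted above shows only `page`, `name`, and `size`. A temporary file stands in for a generated image.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Adds a base64 `content` field to each entry, read from entry.path.
// The `path` field is an assumption about what the result entries contain.
function withBase64Content(entries) {
  return entries.map(function (entry) {
    const content = fs.readFileSync(entry.path).toString('base64');
    return Object.assign({}, entry, { content: content });
  });
}

// Demo with a temporary file standing in for a generated image.
const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'pdf2img-demo-'));
const tmpFile = path.join(tmpDir, 'test_1.png');
fs.writeFileSync(tmpFile, 'hello');

const encoded = withBase64Content([{ page: 1, name: 'test_1', path: tmpFile }]);
console.log(encoded[0].content);
```

This still writes the image to disk first, so it does not avoid the file I/O the request is trying to skip; it only reshapes the result.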

Example throws an error

When I test the example, it throws this error: Error: Command failed: gm identify -format "%p " "/Users/renzhiwen/Desktop/node_modules/pdf2img/test/test.pdf".

Error: Command failed: identify -format %n tmp/testpdf.pdf

Hi, I'm using this package and I get this error:

F:\parseWord>node pdf2img.js
'identify' is not recognized as an internal or external command,
operable program or batch file.
child_process.js:508
    throw err;
    ^

Error: Command failed: identify -format %n tmp/testpdf.pdf
'identify' is not recognized as an internal or external command,
operable program or batch file.

    at checkExecSyncError (child_process.js:465:13)
    at execSync (child_process.js:505:13)
    at async.waterfall.pages (F:\parseWord\node_modules\pdf2img\lib\pdf2img.js:47:32)
    at fn (F:\parseWord\node_modules\pdf2img\node_modules\async\lib\async.js:638:34)
    at Immediate._onImmediate (F:\parseWord\node_modules\pdf2img\node_modules\async\lib\async.js:554:34)
    at processImmediate [as _immediateCallback] (timers.js:383:17)

What can I do to fix this problem?

System requirements

Hi,

I tried using it on a free-tier EC2 instance and everything gets stuck.

What are the basic requirements for using this library?

Thanks

Is anyone still working on this?

There seem to be a whole bunch of people experiencing the same problem: PDFs will not be converted to images.

I have a bunch of 0 byte png files and the error:

    /docfire/node_modules/gm/lib/command.js:228
    proc.stdin.once('error', cb);
    ^
            
    TypeError: Cannot read property 'once' of undefined
        at gm._spawn (/docfire/node_modules/gm/lib/command.js:228:15)
        at /docfire/node_modules/gm/lib/command.js:140:19
        at series (/docfire/node_modules/array-series/index.js:11:36)
        at gm._preprocess (/docfire/node_modules/gm/lib/command.js:177:5)
        at gm.stream (/docfire/node_modules/gm/lib/command.js:138:10)
        at convertPdf2Img (/docfire/node_modules/pdf2img/lib/pdf2img.js:92:6)
        at /docfire/node_modules/pdf2img/lib/pdf2img.js:67:9
        at /docfire/node_modules/async/lib/async.js:246:17
        at /docfire/node_modules/async/lib/async.js:122:13
        at _each (/docfire/node_modules/async/lib/async.js:46:13)

I've modified command.js inserting:

    if ( !(typeof proc == "object"
    && typeof proc.stdin == "object"
    && typeof proc.stdin.once == "function") ) {
    	return cb(new Error("imageMagick, WTF is going on?"))
    }

Just before:

    proc.stdin.once('error', cb); 

Now the error and exception is:

    events.js:163
          throw er; // Unhandled 'error' event
          ^
    
    Error: imageMagick, WTF is going on?
        at gm._spawn (/docfire/node_modules/gm/lib/command.js:231:13)
        at /docfire/node_modules/gm/lib/command.js:140:19
        at series (/docfire/node_modules/array-series/index.js:11:36)
        at gm._preprocess (/docfire/node_modules/gm/lib/command.js:177:5)
        at gm.stream (/docfire/node_modules/gm/lib/command.js:138:10)
        at convertPdf2Img (/docfire/node_modules/pdf2img/lib/pdf2img.js:92:6)
        at /docfire/node_modules/pdf2img/lib/pdf2img.js:67:9
        at /docfire/node_modules/async/lib/async.js:246:17
        at /docfire/node_modules/async/lib/async.js:122:13
        at _each (/docfire/node_modules/async/lib/async.js:46:13)

For all those struggling with the same problem, you might want to try:

https://www.npmjs.com/package/pdf-image

Much easier, and it works!

Specify which pages to convert

Is there a way to do this? I've tried passing an array and a comma-separated string, but I get a page error. It would be a useful feature when only a few pages are needed out of a large PDF (for me, 25 pages out of a roughly 100-page PDF).
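
If such a feature were added, the comma-separated form could be parsed into a list of page numbers first. The following is a sketch of that parsing step only: `parsePageSpec` is a hypothetical function, the `1,3,5-7` syntax is an assumption, and nothing here is an existing pdf2img option.

```javascript
// Parses a spec such as "1,3,5-7" into a sorted list of unique 1-based pages.
// Throws when a part is malformed or falls outside 1..pageCount.
function parsePageSpec(spec, pageCount) {
  const pages = new Set();
  for (const part of spec.split(',')) {
    const match = /^\s*(\d+)\s*(?:-\s*(\d+))?\s*$/.exec(part);
    if (!match) throw new Error('invalid page spec: ' + part);
    const first = Number(match[1]);
    const last = match[2] === undefined ? first : Number(match[2]);
    if (first < 1 || last > pageCount || first > last) {
      throw new Error('page out of range: ' + part.trim());
    }
    for (let p = first; p <= last; p++) pages.add(p);
  }
  return Array.from(pages).sort(function (a, b) { return a - b; });
}

console.log(parsePageSpec('1,3,5-7', 100));
```

Validating against the page count up front means a bad request fails before any conversion work starts.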

Specific page image

Hi, is there a way to pass a parameter to convert just one specific page?
I only need 1 image from 1 page.

Thanks in advance.

Republish npm package

Hello :)

Please republish the latest code to the npm package. It seems that the version on the npm is not the latest.

e.g.
github - lib/pdf2img.js line 70 uses gm library

npm - lib/pdf2img.js line 70 uses terminal command for gm

Error running in Win7 x64

Encountered this error while trying to run in my system. Is there a solution for this issue?

    events.js:141
    throw er; // Unhandled 'error' event
    ^

    Error: spawn pdfinfo ENOENT
    at exports._errnoException (util.js:870:11)

Async mode ?

Hey, everything works, but I have a problem with async mode when I try to convert multiple files.

With the code below, all PDFs are converted at the same time and the result is wrong (images from different PDFs end up in the same folder):

    var traitementPDF = function(fichier, callback) {
      pdf2img.setOptions({
        type: 'png', // png or jpg, default jpg
        size: 1480, // default 1024
        density: 600, // default 600
        outputdir: fichier[1], // output folder, default null (if null given, then it will create folder name same as file name)
        outputname: fichier[0] // output file name, default null (if null given, then it will create image name same as input name)
      });

      pdf2img.convert(fichier[2], function(err, info) {
        if (err) {
          console.log(err);
          callback(false);
        } else {
          console.log(info);
          callback(true);
        }
      });
    };

and

    uploadedFiles.forEach(function(fichier) {
      traitementPDF(fichier, function(result) {
        console.log("Ouvrage: " + fichier[0] + "Repertoire: " + fichier[1] + "\nChemin: " + fichier[2]);
        console.log("Retour de callback: " + result);
      });
    });

Thanks
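
One possible explanation, which nothing on this page confirms, is that options passed to `setOptions` are kept in shared module-level state and read later, after another call has already replaced them. The sketch below reproduces that pattern with a toy object; `makeFakeConverter` and its `flush` method are hypothetical and are not the pdf2img source.

```javascript
// Toy stand-in for a converter that keeps its options in shared state.
// This is NOT the pdf2img source; it only illustrates the suspected race.
function makeFakeConverter() {
  let options = {};
  const pending = [];
  return {
    setOptions(opts) { options = opts; },
    // Defers the work, then reads the shared options when it finally runs.
    convert(input, callback) {
      pending.push(function () {
        callback(null, { input: input, outputname: options.outputname });
      });
    },
    // Runs all deferred conversions (stands in for the event loop).
    flush() { while (pending.length) pending.shift()(); },
  };
}

const converter = makeFakeConverter();
const results = [];

converter.setOptions({ outputname: 'A' });
converter.convert('A.pdf', function (err, info) { results.push(info); });
converter.setOptions({ outputname: 'B' }); // replaces A's options before A runs
converter.convert('B.pdf', function (err, info) { results.push(info); });
converter.flush();

console.log(results);
```

In this toy model both conversions end up using the options of the last `setOptions` call, which would be consistent with images from different PDFs landing in one folder. If the real library behaves this way, starting each conversion only after the previous callback has fired would avoid the overlap.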

[Error: Command failed: gm identify -format "%p "

Why do I get this error?
[Error: Command failed: gm identify -format "%p "

I have gm and gs packages installed.

I am running locally on a Windows machine, but installing GraphicsMagick's exe did not help.
Any suggestions? Ultimately my goal is to get it running in an Azure Function.
Thanks,
Donnie

Not working with pm2 in production

If I invoke the program directly with node app.js, it works fine.
But if I start it with a process manager like pm2, it says that the native dependency gm is not found.

Any idea why? Are there any known workarounds?

Cannot set property 'page' of null

I have found a file that fails conversion:

TypeError: Cannot set property 'page' of null
    at /[app_path]/node_modules/pdf2img/lib/pdf2img.js:66:23

If I print the error right after convertPdf2Img has executed, it shows that the "datasize < 127" check fails and an error is passed back, but that error is not handled and crashes the server.

Need some kind of error handling wrapper around

          result.page = page; // <-- This fails as result is not set when an error occurs in convertPdf2Img
          callbackmap(null, result);

An example of a problem file, a basic document converted from Word, is attached:
1455011779_56191.pdf

Every time an error is passed from convertPdf2Img, it causes this unhandled exception and crashes the server.

Pointers as to what is wrong with the file are also welcome :)
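
The kind of guard being asked for can be sketched as a small callback factory that checks for an error before touching `result`. This is an illustration only: `makePageCallback` is a hypothetical name, and how it would be wired into lib/pdf2img.js is not shown in this report.

```javascript
// Builds the per-page callback: forwards errors instead of touching `result`.
function makePageCallback(page, callbackmap) {
  return function (err, result) {
    if (err || !result) {
      return callbackmap(err || new Error('no result for page ' + page));
    }
    result.page = page;
    callbackmap(null, result);
  };
}

// Demonstration with hand-made calls in place of convertPdf2Img.
const seen = [];
function record(err, result) {
  seen.push(err ? 'error: ' + err.message : 'page ' + result.page);
}

makePageCallback(1, record)(null, { name: 'test_1' });       // success path
makePageCallback(2, record)(new Error('file too small'), null); // error path
console.log(seen);
```

With a guard like this, a failed page is reported through the callback chain instead of throwing a `TypeError` that takes the server down.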

Command failed: identify

Hi, I'm getting this error even when I try the "matteocontrin" solution...

Uncaught Error: Command failed: identify -format %n C:/Users/Username/Desktop/prueba.pdf
"identify" is not recognized as an internal or external command,
operable program or batch file.

Can you help me?

Versions of buildpack

@fitraditya thank you so much for this module. I wanted to know which exact versions of GM and IM should I be using. I'm using it on heroku's node app which runs on Linux 14.04. Would you be kind enough to specify a buildpack for GM and IM with the appropriate version or the correct versions of gm and IM. Much appreciated. Thanks!

Bad argument error

Node: v5.10.0
System: Yosemite
pdf2img: 0.1.2

code:

var target_path = path.resolve(__dirname, '../', '../', '../', 'builds/', 'development/', 'upload/', 'images/');
var input = req.files.file.path;

console.log('input: ' + input);
console.log('target_path: ' + target_path);

pdf2img.setOptions({
    type: 'png',                      // png or jpeg, default png 
    size: 1024,                       // default 1024 
    density: 600,                     // default 600 
    outputdir: target_path
});

pdf2img.convert(input, function(info) {
    console.log(info);
});

Error:

Uncaught Exception:
TypeError: Bad argument
    at TypeError (native)
    at ChildProcess.spawn (internal/child_process.js:274:26)
    at exports.spawn (child_process.js:343:9)
    at PDF.exec (/app/node_modules/pdfinfo/lib/pdfinfo.js:62:17)
    at async.waterfall.pages (/app/node_modules/pdf2img/lib/pdf2img.js:43:18)
    at fn (/app/node_modules/pdf2img/node_modules/async/lib/async.js:638:34)
    at Immediate._onImmediate (/app/node_modules/pdf2img/node_modules/async/lib/async.js:554:34)

Logs:

input: /app/builds/development/upload/images/e874b364713af5f7f7f1bc8e8a924fcb.pdf
target_path: /app/builds/development/upload/images

Output Filename not overridable

As seen in the code, the output filename is hard-wired to "test_" (+pagenumber).

This is somewhat inconvenient and should be configurable in the output options.

This feature is listed in the Todo's
