
Comments (24)

enferex avatar enferex commented on September 26, 2024

Scrubbing is only likely to work if you have a valid PDF that pdfresurrect can operate on. How many versions of the PDF are listed? Running pdfresurrect -q can tell you that, as can running pdfresurrect without any flags.

from pdfresurrect.

Hiteshsaai avatar Hiteshsaai commented on September 26, 2024

Yes, I tried "pdfresurrect -q". I have two edited PDFs: one reports 1 and the other reports 2. But scrubbing has no effect on either of them; it returns the same PDF.


enferex avatar enferex commented on September 26, 2024

Scrubbing might be broken. I'll need to do some scrub testing with one of my local PDFs.


Hiteshsaai avatar Hiteshsaai commented on September 26, 2024

Yeah, sure, let me know! If it is working on your side, I am happy to share the PDFs I am using to test.


enferex avatar enferex commented on September 26, 2024

> Yeah, sure, let me know! If it is working on your side, I am happy to share the PDFs I am using to test.

Thanks, I'll let you know! I'll see if I can get to this over the weekend!


Hiteshsaai avatar Hiteshsaai commented on September 26, 2024

> > Yeah, sure, let me know! If it is working on your side, I am happy to share the PDFs I am using to test.
>
> Thanks, I'll let you know! I'll see if I can get to this over the weekend!

Sure @enferex


enferex avatar enferex commented on September 26, 2024

@Hiteshsaai scrubbing is broken, I would only consider it as an experimental feature.

Edit: In some cases it does zero the PDF, but it will likely be unreadable by a viewer.


Hiteshsaai avatar Hiteshsaai commented on September 26, 2024

Is it possible for you to fix it, or is there any other way to retrieve the previous version of the PDF from the incremental updates?


enferex avatar enferex commented on September 26, 2024

You do not need the scrubbing feature if you just want to explore previous versions. That's what the '-w' option is for.


Hiteshsaai avatar Hiteshsaai commented on September 26, 2024

But I want to retrieve the previous version of the PDF, which is the unedited version.


Hiteshsaai avatar Hiteshsaai commented on September 26, 2024

I tried using the "-w" option, but I am not able to open the generated PDF. The viewer says the file has been damaged.


Hiteshsaai avatar Hiteshsaai commented on September 26, 2024

Can I share a PDF so you can try it on your side?


enferex avatar enferex commented on September 26, 2024

> But I want to retrieve the previous version of the PDF, which is the unedited version.

That's what the -w argument is for. If the PDF is corrupt, or pdfresurrect cannot read it, then pdfresurrect cannot handle it. Perhaps pdfresurrect can be extended to handle such a case.


Hiteshsaai avatar Hiteshsaai commented on September 26, 2024

I took an Amazon invoice, edited some values using Adobe's editor, and ran pdfresurrect with "-w" on that PDF to write out the different versions. The PDFs were generated, but they were broken.

Then I created my own PDF containing the text "hello world", added "How are you", and saved it. With "-w", the generated version 1 should contain "hello world" and version 2 should contain "hello world how are you". I am able to open version 2, but version 1 was again broken and could not be opened. Do you have any idea why? Are there specific conditions under which pdfresurrect works?

Summary of helloworld.pdf:
helloworld.pdf: This PDF contains potential cross reference streams.
helloworld.pdf: An object summary is not available.
---------- helloworld.pdf ----------
Versions: 1


enferex avatar enferex commented on September 26, 2024

If you modify a PDF by hand, then it will likely not work, as that will generate a corrupted document. If you use a tool to modify a PDF, such as Adobe's editor, then the PDF should be legal. In the dump you reported, the PDF has cross-reference streams, so pdfresurrect is not able to support it completely. Additionally, that dump shows that only one version was identified so no historical information will be produced.
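The two signals called out in that reply (the cross-reference stream warning and the version count) can be pulled out of a summary dump programmatically. The sketch below is only an illustration in Python: it assumes the summary text has the same shape as the dump quoted above, and parse_summary is a hypothetical helper name, not part of pdfresurrect.

```python
import re


def parse_summary(text: str) -> dict:
    """Extract the version count and the cross-reference-stream warning.

    Assumes the text looks like the summary dump quoted in this thread;
    other output layouts are not handled.
    """
    match = re.search(r"^Versions:\s*(\d+)\s*$", text, flags=re.MULTILINE)
    return {
        # None when no "Versions: N" line is present.
        "versions": int(match.group(1)) if match else None,
        # True when the warning text from the dump appears anywhere.
        "has_xref_streams": "cross reference streams" in text,
    }


# Sample text copied from the dump in this thread.
sample = """helloworld.pdf: This PDF contains potential cross reference streams.
helloworld.pdf: An object summary is not available.
---------- helloworld.pdf ----------
Versions: 1
"""

summary = parse_summary(sample)
print(summary)  # {'versions': 1, 'has_xref_streams': True}
```

A result with versions equal to 1, or with has_xref_streams set, matches the two situations described above in which no usable historical versions should be expected.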


Hiteshsaai avatar Hiteshsaai commented on September 26, 2024

No, I am using the Adobe Acrobat Pro DC PDF editor to edit the PDF. I am not doing any manual editing, but I am still facing that issue. One more thing: the raw PDF has two %%EOF markers, so that means the PDF has an updated version, right?


enferex avatar enferex commented on September 26, 2024

Two EOF markers can mean either two versions, or one version of a "linearized" PDF. A linearized PDF can have two EOF markers for the most recent version.
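For reference, counting the EOF markers in a file can be sketched in a few lines of Python. This is a rough byte-level count, not how pdfresurrect identifies versions: count_eof_markers is a hypothetical helper, the demo bytes are made up, and the byte sequence could in principle also occur inside stream data.

```python
from pathlib import Path


def count_eof_markers(data: bytes) -> int:
    """Count occurrences of the PDF end-of-file marker in raw bytes."""
    return data.count(b"%%EOF")


def count_eof_markers_in_file(path: str) -> int:
    """Convenience wrapper that reads the whole file into memory first."""
    return count_eof_markers(Path(path).read_bytes())


# Made-up bytes standing in for a PDF with one appended update.
demo = b"%PDF-1.4\n...original body...\n%%EOF\n...appended update...\n%%EOF\n"
print(count_eof_markers(demo))  # 2
```

As the comment above explains, a count of 2 on its own does not distinguish an updated PDF from a linearized one.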


Hiteshsaai avatar Hiteshsaai commented on September 26, 2024

Is there a way to find out if the PDF has a linearized EOF?


enferex avatar enferex commented on September 26, 2024

> Is there a way to find out if the PDF has a linearized EOF?

You should see the text "/Linearized" near the beginning of the pdf.
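That check can be sketched in Python as a search near the start of the file. The 1024-byte window is an assumption taken from the PDF specification's requirement that the linearization dictionary sit at the start of the file; is_linearized is a hypothetical helper name and the demo bytes are made up.

```python
def is_linearized(data: bytes, window: int = 1024) -> bool:
    """Return True if "/Linearized" appears within the first `window` bytes."""
    return b"/Linearized" in data[:window]


# Made-up headers for illustration only.
linear_demo = b"%PDF-1.5\n1 0 obj\n<< /Linearized 1 /L 1234 >>\nendobj\n"
plain_demo = b"%PDF-1.4\n1 0 obj\n<< /Type /Catalog >>\nendobj\n"

print(is_linearized(linear_demo))  # True
print(is_linearized(plain_demo))   # False
```

For a real file, pass in the bytes read from disk, for example the result of open(path, "rb").read(1024).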


Hiteshsaai avatar Hiteshsaai commented on September 26, 2024

So if a PDF is linearized, it means it is not an edited or updated PDF, right?


enferex avatar enferex commented on September 26, 2024

> So if a PDF is linearized, it means it is not an edited or updated PDF, right?

I do not know if that is true. I'm fairly certain there are modified linearized PDFs out there, but without looking and finding a sample, I cannot say this with 100% certainty.


Hiteshsaai avatar Hiteshsaai commented on September 26, 2024

I ask because some of my PDFs are emails saved as PDF via print (Ctrl + P), and some are scanned PDFs. All of these are linearized and have 2 EOF markers, and their Creation Date and Modification Date have changed. These are genuine PDFs that have not been edited, yet they are reported as modified or as having 2 EOF markers. So I was thinking whether to treat those linearized PDFs as not edited, to avoid false positives.


enferex avatar enferex commented on September 26, 2024

> I was thinking whether to treat those linearized PDFs as not edited, to avoid false positives.

The number of different versions detected should be reported as 1; that should be what the '-q' flag will report.
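For a quick pre-filter that does not rely on pdfresurrect, the two observations from this thread can be combined into a rough heuristic. This Python sketch is an assumption-laden illustration, not pdfresurrect's algorithm: triage is a hypothetical helper, the expected baseline of two markers for a linearized file and one otherwise is an assumption drawn from the discussion above, and the demo bytes are made up.

```python
def triage(data: bytes) -> dict:
    """Rough check for EOF markers beyond what the file layout explains.

    Assumption: a linearized PDF normally carries two %%EOF markers and a
    non-linearized PDF carries one; anything beyond that baseline hints at
    appended updates. This is a heuristic, not a version count.
    """
    eof_count = data.count(b"%%EOF")
    linearized = b"/Linearized" in data[:1024]
    baseline = 2 if linearized else 1
    return {
        "eof_count": eof_count,
        "linearized": linearized,
        "extra_eof_markers": max(eof_count - baseline, 0),
    }


# Made-up bytes: a linearized file with two markers, and a plain file with two.
linear_two = b"%PDF-1.5\n<< /Linearized 1 >>\n%%EOF\nrest of file\n%%EOF\n"
plain_two = b"%PDF-1.4\nbody\n%%EOF\nappended update\n%%EOF\n"

print(triage(linear_two)["extra_eof_markers"])  # 0
print(triage(plain_two)["extra_eof_markers"])   # 1
```

Under these assumptions, the printed emails and scans described above would show zero extra markers, while the version count reported by '-q' remains the more reliable signal.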


Hiteshsaai avatar Hiteshsaai commented on September 26, 2024

Yes, exactly, that is what I was thinking. By doing that we can avoid the false positives.

