officetopdf's People

Contributors

vittala

officetopdf's Issues

Memory leak?

I have a service written in Go that downloads a file from an HTTP link and then converts it to PDF with OfficeToPDF. The service leaks memory. Using poolmon.exe, I traced the leak to the driver Fltmgr.sys. Has anyone come across this problem?

Conversion fails with "Call was rejected by callee. (Exception from HRESULT: 0x80010001 (RPC_E_CALL_REJECTED))"

I am running into a problem where conversions fail with the output message "Call was rejected by callee. (Exception from HRESULT: 0x80010001 (RPC_E_CALL_REJECTED))". I am using the latest released version on a Windows 2012 system running Office 2016 Professional Plus.

I have searched around on this error, and it seems to refer to a timing issue.

The majority of our conversions are PPT/PPTX to PDF.

Any thoughts on what this could be and what I may need to address?
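
If the error really is a timing issue, one thing a caller could try is retrying the conversion after a pause. The following is only a minimal Python sketch under that assumption, not a confirmed fix: the function name convert_with_retry, the attempt count and the delay are all hypothetical, and the only thing taken from the report is the RPC_E_CALL_REJECTED text in the output.

```python
import subprocess
import time

# Text taken from the error message quoted in the report.
RPC_REJECTED_MARKER = "RPC_E_CALL_REJECTED"


def convert_with_retry(command, attempts=3, delay_seconds=5.0,
                       run=subprocess.run, sleep=time.sleep):
    """Run a conversion command, retrying only on RPC_E_CALL_REJECTED.

    `command` is the full argument list, for example
    ["OfficeToPDF.exe", "in.pptx", "out.pdf"].
    `run` and `sleep` are injectable so the logic can be tried without Office.
    """
    result = None
    for attempt in range(1, attempts + 1):
        result = run(command, capture_output=True, text=True)
        output = (result.stdout or "") + (result.stderr or "")
        if RPC_REJECTED_MARKER not in output:
            return result  # success, or a different failure: do not retry
        if attempt < attempts:
            sleep(delay_seconds * attempt)  # linear back-off before the next try
    return result  # still rejected after the last attempt
```

The wrapper returns the last result either way, so the caller still has to check the return code and output.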

Feature: Allow a limit on the number of columns to convert /excel_max_columns columns

Description
Please add a switch that limits the number of columns to convert. There is already a similar switch for rows, /excel_max_rows rows.
The suggested syntax is the following:

/excel_max_columns columns

Use case
I've faced the scenario where users fill ALL available columns and the generated PDF has way too many useless pages... I would like to set a limit so Office doesn't collapse.

/hidden param does not seem to work (or not as expected)

Hi Vittal,

When I pass the /hidden param on the command line, an Office window pops up and then disappears (like #20). The conversion goes well and produces the expected output file, but sometimes (it is hard to reproduce) a Word dialog appears asking whether to save the modified file. While this dialog is open, the follow-up conversion task fails and shows another dialog with a message like "Dialog Box Is Open, Click OK...".

I have also tested without passing /hidden, and then no Office window seems to flash at all... ???

So I have 2 questions:

  1. Is it normal for this dialog to appear? Is it because some keyboard input caused the file to be modified?
  2. Does /hidden have the reverse behavior?

Thanks for your work, by the way!

Support the =HYPERLINK formula in Excel

When Excel cells use the =HYPERLINK formula, the resulting PDF doesn't make the cell value clickable. We should have an option to run through these cells and set the hyperlink property before conversion

Page size

Hello,
is there a way to convert documents that have a non-default page size?
For example, a docx file in A3 page size is converted to an A4 page sized PDF and its content is cropped...
Thank you

Hangs when started from service

I want to use this tool to convert DOCX files to PDF files in my Windows service, but when it runs as a service the Word process hangs, uses CPU power, and never exits. If I start the process from the command line, everything works well.

I've also tried using a different user for the service, and enabling desktop interaction, but that did not help.

I'm using Office 2016.

Is there something that I need to do first in order to make it work from services?

Release version, ROADMAP?

Hello!

Are you going to add a [release] version soon?
It has been more than a year since you released version 1.8.

Currently you only have [pre-release] versions of OfficeToPDF (see the attached image).
Best,

Prevent Word field updates

Word (up to 2016) sometimes(?) ignores the option that disables field updates, that is, leaving Display->Update fields before printing unchecked. (I was misled for a long time because I had missed part of the text in Advanced->Print->Allow fields containing tracked changes to update before printing.)

The option /word_no_field_update doesn't prevent the field update; it is only its absence that makes OfficeToPDF update the fields explicitly.

So, currently it's impossible to prevent the field update at all.
The only solution is to unlink all fields, before they could be updated.

There could be a new option /word_prevent_field_updates, which does the same as UpdateDocumentFields but instead of updating the fields it unlinks them.
(Hint: Its use should be combined with /readonly!)

A workaround with AutoOpen-VBA code to unlink the fields looks like this:

Attribute VB_Name = "UnlinkFields"
' NOTE: If macros are deactivated, this module needs to be saved in the Normal.dotm!
' (c) ptar, 2018

' On opening a document, all fields are replaced by their text value (AKA 'unlinked')
' This way, they can't be updated during printing
Sub AutoOpen()
	unlinkFields
End Sub

' Replace all fields by their text value
Public Sub unlinkFields()

	' Word-BUG: ActiveDocument.Fields gives 0, if there are no fields in Main, but e.g. only in the Header :-(
	' That is, we have to search all StoryRanges on our own...
	
	Application.ScreenUpdating = False

	On Error Resume Next

	Dim storyRange As Range
	Dim fld As Field
	For Each storyRange In ActiveDocument.StoryRanges
		If (storyRange.Fields.Count > 0) Then
			For Each fld In storyRange.Fields
				fld.Unlink
			Next
		End If
	Next

	' Restore screen updating, which was switched off above
	Application.ScreenUpdating = True
End Sub

Please consider integrating this as a new option in OfficeToPDF. Thanks!

Missing horizontal lines in Word 2013

Hi,

I have the following Word document, which gets converted very accurately by OfficeToPDF on my machine using Word 2016:
https://typhoonhil-my.sharepoint.com/:w:/g/personal/victor_maryama_typhoon-hil_com/EZJLwMxZNRFOqWLTZdxbH5cB-RLDclB-nbGqITSKgEhacg?e=JfOG2L

However, when using Word 2013 on another machine, the horizontal bars below the titles and headers are not present:

https://typhoonhil-my.sharepoint.com/:b:/g/personal/victor_maryama_typhoon-hil_com/EayHGE47y1dGnwo7EUVg0NkBBzGmzmgiR9YtUg_cKZTE3w?e=T97Afm

In fact, in this case, the output is just as if we had exported the PDF with the minimum size option turned on (see the attached image of the export options).

This is the pdf output with the minimum size option:

https://typhoonhil-my.sharepoint.com/:b:/g/personal/victor_maryama_typhoon-hil_com/EUdrcXcWOD5LvEqtdbOyyTYBKkSnGcQrrFpvznmf_XYp1A?e=FSymHx

I tried using the /word_field_quick_update option, but it did not help.

Thanks!

Add an option /filename NameOfFile

In an automated environment, OfficeToPDF might be used on file copies in a temporary folder, with unique filenames that differ from the original file name. This breaks e.g. the Word field FILENAME, or Excel's header and footer options if they contain the filename.

Please add a new command line switch
/filename NameOfFile
which sets the internal filename after opening the file, or at least make it look like this.

In WordConverter.cs, there's already a special handling for filename fields, which might be easy to adapt.

I'm not sure if there's such an easy solution for Excel and the other Office programs.

Thanks in advance for implementing this!

Note: If my other issue about preventing field updates is implemented, the Word FILENAME field should be updated, before the field is unlinked!
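
Until such a switch exists, a possible workaround on the calling side is to make the temporary folder unique instead of the filename, so the copy keeps its original name. This is only a sketch in Python: the helper name stage_with_original_name is hypothetical, and it assumes that the FILENAME field and the header/footer filename reflect only the base name, not the folder.

```python
import shutil
import tempfile
from pathlib import Path


def stage_with_original_name(source, original_name):
    """Copy `source` into a fresh temporary directory under `original_name`.

    Returns the staged path. The caller should remove staged.parent
    (for example with shutil.rmtree) once the conversion is done.
    """
    staging_dir = Path(tempfile.mkdtemp(prefix="officetopdf_"))
    staged = staging_dir / Path(original_name).name  # drop any directory part
    shutil.copy2(source, staged)
    return staged
```

The staged path would then be passed to OfficeToPDF as the input file. Fields that include the full path would still show the temporary folder.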

Feature: Allow excel width/height configuration /excel_single_page

Description
Add a command switch for Excel files so they get exported into a single big PDF page. The suggested syntax is the following:

/excel_single_page

Use case
Currently most Excel files aren't meant to be printed, so when they are exported to PDF they are broken apart into many pages (due to print configurations). However, the Excel file will never be printed... the PDF is for viewing only.

Excel has a configuration for doing this (shown in the attached image).

Error when running through Jenkins

When I run OfficeToPDF on the command line in Windows Server 2016 with Microsoft Office Professional Plus 2019 installed, everything works perfectly.
If instead I let Jenkins run the same command on the same machine, I get the following message: 'Object reference not set to an instance of an object.' The task manager shows the process 'Microsoft Word' with 51% CPU load. It runs indefinitely and I need to stop it manually. It then adds 'Did not convert' to the output.

I tried versions 1.3.0.0, 1.8.0.0 and 1.8.22.0, which all show the same behaviour.

Can you tell me what's going on? What is the difference between me running the command and Jenkins running the command?

These seem to be similar problems, but there is no solution mentioned:
https://stackoverflow.com/questions/24860351/object-reference-not-set-to-an-instance-of-an-object-did-not-convert
https://stackoverflow.com/questions/11796414/convert-ms-office-to-pdf/11796615

Did not convert

The right and left margins, column spacing, or paragraph indents are too large for the page width in some sections.
Did not convert

The attached file did not convert. When I save it as PDF from Word it works, but it fails with OfficeToPDF.
28.docx

Thanks a lot, in advance.

Protect PDF against printing

I want to protect the generated PDF against printing. I use this syntax:

OfficeToPDF.exe "WORD CON IMAGEN.docx" xx.pdf /pdf_restrict_print /pdf_owner_pass "pass7"

But, when I open xx.pdf I can print it without entering the password.

Name conflict - Old name: Print_Area

Hello!

Today our team found that some old Excel files cannot be converted due to an old Office bug. You can find more detail here:

OFFICE BUG:

https://answers.microsoft.com/en-us/msoffice/forum/msoffice_excel-mso_mac-mso_365hp/name-conflict-old-name-printarea/4145041c-950c-4923-a4f9-68c459a1ce0e

The attached file is an example of a file that cannot be converted, the issue occurs only when using Office 2016.

EXPECTED RESULT:

The expected result would be either that the file gets converted or that a specific error message gets thrown, something like "Object names conflict".

ENVIRONMENT:

OfficeToPDF v1.8.21.0
Office Professional Plus 2016
Windows Server 2012 R2 Standard

Attached Excel file
Cannot be converted.xlsx
errors_thrown.zip
enviroment.zip

Concurrent mode

Hi.
Is it possible to use OfficeToPDF in concurrent mode? Probably not.
Right now, if I try to run more than one instance, it instantly fails with a popup asking to save an untitled document. Also, until I manually kill WINWORD.exe, every subsequent run fails with RPC_E_CALL_REJECTED.

Converting in a single thread takes too long.
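
If concurrent runs are not supported, a multi-threaded caller can at least queue its conversions so only one runs at a time. A minimal Python sketch, assuming all conversions are started from the same process; the name convert_serialized is hypothetical, and a lock like this does nothing for instances started by other processes.

```python
import subprocess
import threading

# One conversion at a time within this process.
_office_lock = threading.Lock()


def convert_serialized(command, run=subprocess.run):
    """Run one conversion command while holding a process-wide lock.

    `command` is the full argument list, for example
    ["OfficeToPDF.exe", "in.docx", "out.pdf"].
    Other threads calling this function wait until the lock is free.
    """
    with _office_lock:
        return run(command)
```

This does not make conversion any faster than a single thread; it only keeps the rest of the service multi-threaded while avoiding overlapping Office instances.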

Converted PDFs missing markup

Firstly, really useful piece of software so many thanks!

I am having an issue with markup not appearing in the converted PDF and have some suggested changes:

  • From what I have read, using word.Visible results in a faster render than word.ScreenUpdating, but I have not checked this carefully.
  • I have turned on inline markup (you may want to make this an option)
  • I switch various view options depending on if markup was requested which seems to fix the problem
  • Using Microsoft Office 365/2016, OfficeToPDF 1.8.23.0

git diff of changes below:

--- a/OfficeToPDF/WordConverter.cs
+++ b/OfficeToPDF/WordConverter.cs
@@ -90,7 +90,7 @@ namespace OfficeToPDF
                         try
                         {
                             // Try to set a property on the object
-                            word.ScreenUpdating = false;
+                            word.Visible = false;
                         }
                         catch (COMException)
                         {
@@ -286,16 +286,18 @@ namespace OfficeToPDF
                     }
                     catch (Exception) { }

-                    // Hide comments
+                    // Show revisons if required
                     try
                     {
+                        bool withMarkup = showMarkup == WdExportItem.wdExportDocumentWithMarkup;
                         word.PrintPreview = false;
+                        docWinView.MarkupMode = WdRevisionsMode.wdInLineRevisions;
                         docWinView.RevisionsView = WdRevisionsView.wdRevisionsViewFinal;
-                        docWinView.ShowRevisionsAndComments = false;
-                        docWinView.ShowComments = false;
-                        docWinView.ShowFormatChanges = false;
-                        docWinView.ShowInkAnnotations = false;
-                        docWinView.ShowInsertionsAndDeletions = false;
+                        docWinView.ShowRevisionsAndComments = withMarkup;
+                        docWinView.ShowComments = withMarkup;
+                        docWinView.ShowFormatChanges = withMarkup;
+                        docWinView.ShowInkAnnotations = withMarkup;
+                        docWinView.ShowInsertionsAndDeletions = withMarkup;
                     }
                     catch (SystemException e) {
                         Console.WriteLine("Failed to set revision settings {0}", e.Message);

RPC_E_CALL_REJECTED after converting powerpoint prevents further conversion of powerpoint

I'm running version 1.8.22.0

When I convert a PowerPoint document to pdf (running on an automation server), with the following options /print /readonly, the conversion succeeds, but I get the following exception and POWERPNT.EXE doesn't close. On subsequent runs, I get the same exception, but the conversion doesn't succeed until I shut down POWERPNT.EXE

The exception: Call was rejected by callee. (Exception from HRESULT: 0x80010001 (RPC_E_CALL_REJECTED))

error code: 3

PowerPoint file (just a dummy file):
testPPnt-3d.pptx

On the rare (very very rare) occasions when I don't get the exception everything runs fine and PowerPoint closes and I can rerun the conversion. The conversion is triggered by an email, with the pptx file attached, that I'm sending manually for testing so there are generally several minutes between tests. I wouldn't expect this to be the case in production.

This script will also be converting other doctypes, but I haven't seen any problems with other docs like Word.

Any help would be very much appreciated.

Headless mode

With LibreOffice, you can use the --headless option to convert a document from the command line without opening a LibreOffice window.
Is it possible to have the same option?

The hidden option is not enough, because you see an Office window popping up over the application for one or two seconds.

Problem starting from TFS Network service

Hi,

When I try to use OfficeToPDF within a TFS batch process (which runs under Network Service), I get the following error.

Retrieving the COM class factory for component with CLSID {000209FF-0000-0000-C000-000000000046} failed due to the following error: 80080005 Server execution failed (Exception from HRESULT: 0x80080005 (CO_E_SERVER_EXEC_FAILURE)).

I already gave the Network Service all the rights in the DCOM Config (Component Services/DCOM Config/Microsoft Word 97 - 2003-document).

Any help is appreciated!

Kind regards,

Adrie

.eml support

Currently OfficeToPDF supports conversion of .msg files to PDF.
Does it also support conversion of .eml files to PDF?
If not, would it be possible to add this?

Error loading type library

Unable to cast COM object of type 'Microsoft.Office.Interop.PowerPoint.ApplicationClass' to interface type 'Microsoft.Office.Interop.PowerPoint._Application'. This operation failed because the QueryInterface call on the COM component for the interface with IID '{91493442-5A91-11CF-8700-00AA0060263B}' failed due to the following error: Error loading type library/DLL. (Exception from HRESULT: 0x80029C4A (TYPE_E_CANTLOADLIBRARY)).

How about Multithreading?

Should it be possible to call the exe in a multithreaded environment, for example from a server process that does batch conversion?

Crash

I have MS Office 2016 and .Net 4.7.02 installed. I tried several document conversions and the error is:

Unable to cast COM object of type 'Microsoft.Office.Interop.Word.ApplicationClass' to interface type 'Microsoft.Office.Interop.Word._Application'. This operation failed because the QueryInterface call on the COM component for the interface with IID '{00020970-0000-0000-C000-000000000046}' failed due to the following error: Error loading type library/DLL. (Exception from HRESULT: 0x80029C4A (TYPE_E_CANTLOADLIBRARY)).

Unhandled Exception: System.InvalidCastException: Unable to cast COM object of type 'Microsoft.Office.Interop.Word.ApplicationClass' to interface type 'Microsoft.Office.Interop.Word._Application'. This operation failed because the QueryInterface call on the COM component for the interface with IID '{00020970-0000-0000-C000-000000000046}' failed due to the following error: Error loading type library/DLL. (Exception from HRESULT: 0x80029C4A (TYPE_E_CANTLOADLIBRARY)).
at System.StubHelpers.StubHelpers.GetCOMIPFromRCW(Object objSrc, IntPtr pCPCMD, IntPtr& ppTarget, Boolean& pfNeedsRelease)
at Microsoft.Office.Interop.Word.ApplicationClass.Quit(Object& SaveChanges, Object& OriginalFormat, Object& RouteDocument)
at OfficeToPDF.WordConverter.Convert(String inputFile, String outputFile, Hashtable options)
at OfficeToPDF.Program.Main(String[] args)

Tables become malformed in specific scenarios

There are two scenarios that cause the first table in the document to be malformed in the generated PDF. Documents from before and after conversion are attached. One scenario is an image immediately following another table (highlighted in Issue1.docx/Issue1.pdf); the other is another table containing a cell in its first column that only contains an image. Placing text between the table and the image solves issue 1; placing text in the cell alongside the image solves issue 2.
Issue1.docx
Issue2.pdf
Issue2.docx
Issue1.pdf

Other info:

  • Windows 10 64 bit OS
  • MS Word 2016 MSO (16.0.8730.2046) 32 bit

Feature: Is there a means to identify Office conversion process?

In some conversions the process fails due to concurrent executions, infinite sheets, password-protected documents and many other reasons. To prevent the process from hanging indefinitely, we have implemented a timeout that closes the OfficeToPDF process. That part works as expected, but the Office process "excel.exe" or "winword.exe" sometimes remains open. Could the process ID of the Office process be printed to the output, so that we are able to close that process?
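
Without the process ID, the timeout wrapper can only clean up by image name. The following Python sketch shows that blunt approach; the function name run_with_timeout is hypothetical, and taskkill with /F /IM ends every instance of the named image on the machine, including ones a user has open, which is why a printed PID would be the better solution.

```python
import subprocess


def run_with_timeout(command, timeout_seconds, office_image, run=subprocess.run):
    """Run the converter; on timeout, force-kill the named Office image.

    `office_image` is an image name such as "excel.exe" or "winword.exe".
    Returns True if the command finished in time, False otherwise.
    """
    try:
        # subprocess.run kills the child itself when the timeout expires.
        run(command, timeout=timeout_seconds)
        return True
    except subprocess.TimeoutExpired:
        # taskkill is the standard Windows tool: /F forces termination,
        # /IM selects processes by image name (ALL matching instances).
        run(["taskkill", "/F", "/IM", office_image])
        return False
```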

Conversion pptx with officetopdf

Hello,
When I convert a pptx file, the conversion works, but I get return code 3 and PowerPoint does not close afterwards.
Do you have a solution, please?
Sébastien

x86 compatibility

Hello.
I am trying to use the OfficeToPDF tool on Linux (CentOS 7), running it under Wine.
But OfficeToPDF is x86_64 and I've got an x86 compatibility problem.
Do you have an x86 version of the tool?

Thank you.

Error when trying to use this tool on a large number of PDF files and BAT script

The issue comes when I'm trying to iterate through a group of files and convert them all to PDF. Here is the BAT script I am trying to use:

for %%f in (*.docx) do (
    echo %%~nf
    OfficeToPDF.exe "%%~nf.docx" "%%~nf.pdf"
)

The first file is generated, but subsequent files are not, and the command line gives the following error:
Call was rejected by callee. (Exception from HRESULT: 0x80010001 (RPC_E_CALL_REJECTED))
Did not convert

Other times when I run the script it creates a few files then hangs. And sometimes when I run the script MS Word opens with a message that document1 has been modified, would you like to save it.

Let me know your thoughts.

Make it Chocolatey package

Dear OfficeToPDF developers, thank you for this wonderful software!
It would be nice if I can install it with:

 choco install officetopdf -y

Error codes

The error codes received no longer match the list shown in the documentation.
Can you please update the list? I got error codes like 35, 65 ...
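
One possible reading, which is an assumption and not confirmed here against the OfficeToPDF source, is that the return value is a sum of power-of-two flags rather than a single code. Under that assumption, a short Python sketch can split a value into its parts (the function name set_bits is made up for illustration):

```python
def set_bits(code):
    """Return the powers of two that sum to a non-negative exit code."""
    return [1 << i for i in range(code.bit_length()) if code & (1 << i)]


for code in (35, 65):
    # Prints "35 = 1 + 2 + 32" and "65 = 1 + 64".
    print(code, "=", " + ".join(str(bit) for bit in set_bits(code)))
```

What each individual value means would still have to come from the project's documentation.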

OfficeToPDF doesn't work when there's no available printer

OfficeToPDF is returning the message "Did not convert" when there's no available printer.

Steps to reproduce:

  1. The test environment must not have any printer installed.
  2. Try to convert any Excel file (I only tested Excel) by using the command line: OfficeToPDF /verbose "test.xlsx" "test.pdf"
  3. The error message is shown.

RPC server unavailable

hi, I use OfficeToPDF

and I found this error :
RPC server unavailable (HRESULT: 0x800706BA)
location : Microsoft.Office.Interop.Word.ApplicationClass.Quit(Object& SaveChanges, Object& OriginalFormat, Object& RouteDocument)

OS : windows 8.1
Office : MS Office 2013

This error seems to occur when I try to convert a MS Word file to pdf.

Any help?
thanks.
