gen2brain / go-fitz
Golang wrapper for the MuPDF Fitz library
License: GNU Affero General Public License v3.0
I receive this when attempting to build with this package. This was after switching to Go 1.18. I have CGO_ENABLED=1, which is required to make arm64 work (M1).
# github.com/gen2brain/go-fitz
vendor/github.com/gen2brain/go-fitz/fitz.go:6:10: fatal error: 'mupdf/fitz.h' file not found
#include <mupdf/fitz.h>
^~~~~~~~~~~~~~
1 error generated.
make: *** [build] Error 2
Bazel is not able to build the package:
DEBUG: /private/var/tmp/_bazel_mgenov/8cac0968a0c17cc631135ceb621314f4/external/bazel_gazelle/internal/go_repository.bzl:189:18:
com_github_gen2brain_go_fitz:
gazelle: /private/var/tmp/_bazel_mgenov/8cac0968a0c17cc631135ceb621314f4/external/com_github_gen2brain_go_fit/fitz_cgo_extlib_pkgconfig.go:
error reading go file: /private/var/tmp/_bazel_mgenov/8cac0968a0c17cc631135ceb621314f4/external
/com_github_gen2brain_go_fitz/fitz_cgo_extlib_pkgconfig.go: pkg-config not supported: #cgo pkg-config: mupdf
Please consider adding colorspace support.
I need gray/mono conversion.
error: no builtin cmap file: UniGB-UCS2-H
warning: unrecoverable error; ignoring rest of page
err 0
error: no builtin cmap file: UniGB-UCS2-H
warning: unrecoverable error; ignoring rest of page
error: no builtin cmap file: UniGB-UCS2-H
warning: unrecoverable error; ignoring rest of page
Hi, I ran into a problem when building my Go program. Here are the details:
github.com/gen2brain/[email protected]/libs/libmupdfthird_linux_amd64.a(one.o): relocation R_X86_64_PC32 against symbol `stdout@@GLIBC_2.2.5' can not be used when making a PDE object; recompile with -fPIE
I am confused by this and would appreciate any solution. Thanks!
Hi, in my app I'm reading a list of PDF files and listing those that contain specific words.
Is there a way to modify the file itself? Let's say I want to see if the file contains the word 'fitz', and if it does, I need to go to the file itself and highlight this word in yellow.
I know how to read the text and find out whether it contains this word, but how can I go back to the file and highlight it, if that is possible? Thanks.
In one of my projects, we are using go-fitz (v1.18.0) for pdf generation. However, recently the build package generation fails when we run the below command -
GOOS=linux GOARCH=amd64 go build -o main
The error that I get is -
go build github.com/gen2brain/go-fitz: build constraints exclude all Go files in /<path>/gen2brain/[email protected]
This used to work before. Has something changed recently in the past few months?
I am raising the issue here because I couldn't find a solution. Please take a look.
Apparently the Image method is not safe for concurrent use when rendering especially big images, i.e. 1 MB. Any hints?
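One common workaround for a method that is not safe for concurrent use is to serialize calls with a mutex. The following is only a minimal sketch of that idea, not go-fitz code: pageRenderer, lockedRenderer, fakeDoc and renderAll are hypothetical names, and fakeDoc stands in for a real document so the example runs without MuPDF.

```go
package main

import (
	"fmt"
	"sync"
)

// pageRenderer is a hypothetical interface for this sketch; it is not
// part of the go-fitz API.
type pageRenderer interface {
	NumPage() int
	Render(n int) ([]byte, error)
}

// lockedRenderer lets only one goroutine at a time use the document.
type lockedRenderer struct {
	mu  sync.Mutex
	doc pageRenderer
}

func (l *lockedRenderer) Render(n int) ([]byte, error) {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.doc.Render(n)
}

// fakeDoc stands in for a real document so the sketch runs without MuPDF.
// It counts calls that overlap in time; Render is only reached under the mutex.
type fakeDoc struct {
	pages    int
	busy     bool
	overlaps int
}

func (d *fakeDoc) NumPage() int { return d.pages }

func (d *fakeDoc) Render(n int) ([]byte, error) {
	if d.busy {
		d.overlaps++
	}
	d.busy = true
	out := []byte(fmt.Sprintf("page %d", n))
	d.busy = false
	return out, nil
}

// renderAll renders every page from its own goroutine through the lock.
func renderAll(l *lockedRenderer) [][]byte {
	n := l.doc.NumPage()
	out := make([][]byte, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			if b, err := l.Render(i); err == nil {
				out[i] = b // each goroutine writes only its own slot
			}
		}(i)
	}
	wg.Wait()
	return out
}

func main() {
	d := &fakeDoc{pages: 4}
	out := renderAll(&lockedRenderer{doc: d})
	fmt.Println(len(out), d.overlaps)
}
```

The lock trades throughput for safety: rendering becomes sequential per document, while the rest of each request can still run concurrently.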
Whenever I run code on a malformed or corrupted PDF file, it throws the fatal error shown below. I have also attached the corrupted PDF file:
test.pdf
error: cannot find startxref
warning: trying to repair broken xref
warning: repairing PDF document
error: array not closed before end of file
uncaught error: array not closed before end of file
exit status 1
> [6/6] RUN CC=x86_64-w64-mingw32-gcc CGO_ENABLED=1 GOOS=linux GOARCH=amd64 go build:
#10 1.090 # runtime/cgo
#10 1.090 gcc_linux_amd64.c: In function '_cgo_sys_thread_start':
#10 1.090 gcc_linux_amd64.c:61:2: error: unknown type name 'sigset_t'; did you mean '_sigset_t'?
#10 1.090 61 | sigset_t ign, oset;
#10 1.090 | ^~~~~~~~
#10 1.090 | _sigset_t
#10 1.090 gcc_linux_amd64.c:66:2: error: implicit declaration of function 'sigfillset' [-Werror=implicit-function-declaration]
#10 1.090 66 | sigfillset(&ign);
#10 1.090 | ^~~~~~~~~~
#10 1.090 gcc_linux_amd64.c:61:16: error: unused variable 'oset' [-Werror=unused-variable]
#10 1.090 61 | sigset_t ign, oset;
#10 1.090 | ^~~~
#10 1.090 cc1: all warnings being treated as errors
------
executor failed running [/bin/sh -c CC=x86_64-w64-mingw32-gcc CGO_ENABLED=1 GOOS=linux GOARCH=amd64 go build]: exit code: 2
Hello,
Any plans for implementing more of the API such as for getting and setting metadata? My main need is to extract the table of contents and metadata fields such as title, author, and keywords. I'm moving from Python (PyMuPDF) to Go and this seems like the most developed library. I haven't tried the C bindings but it seems like it might be all there.
This is more of a question than an issue.
I am using v1.20.0 of go-fitz and build the Docker image using this Dockerfile:
FROM golang:1.17-alpine3.14 as builder
RUN apk add --no-cache build-base \
mupdf-dev \
freetype-dev \
harfbuzz-dev \
jbig2dec-dev \
jpeg-dev \
openjpeg-dev \
zlib-dev
WORKDIR /dist
COPY . .
RUN export CGO_LDFLAGS="-lmupdf -lm -lmupdf-third -lfreetype -ljbig2dec -lharfbuzz -ljpeg -lopenjp2 -lz" \
&& go mod download \
&& go build -tags musl -o pdf-transcoder
FROM alpine:3.14
RUN apk add --no-cache mupdf \
freetype \
harfbuzz \
jbig2dec \
jpeg \
openjpeg \
zlib
WORKDIR /app
COPY --from=builder /dist/pdf-transcoder ./
CMD ["./pdf-transcoder"]
The build succeeds, but when I run the image I get the error below:
cannot create context: incompatible header (1.20.0) and library (1.18.0) versions
2022/06/26 08:20:41 fitz: cannot create context
I don't understand what I am doing wrong. @gen2brain, thanks in advance.
Hello!
I'm trying to create a bazel pipeline in a project that uses the go-fitz, but I'm getting the following error:
INFO: Build option --define has changed, discarding analysis cache.
INFO: Analyzed 2 targets (0 packages loaded, 7401 targets configured).
INFO: Found 2 targets...
ERROR: /home/eduardo.cardozo/.cache/bazel/_bazel_eduardo.cardozo/298e3f4ef742d30c2aa0ed162984106b/external/com_github_gen2brain_go_fitz/BUILD.bazel:3:11: GoCompilePkg external/com_github_gen2brain_go_fitz/go-fitz.a failed: (Exit 1): builder failed: error executing command bazel-out/k8-opt-exec-2B5CBBC6/bin/external/go_sdk/builder compilepkg -sdk external/go_sdk -installsuffix linux_amd64 -tags extlib,pkgconfig,extlib,pkgconfig -src ... (remaining 29 arguments skipped)
Use --sandbox_debug to see verbose messages from the sandbox
/usr/bin/ld.gold: error: cannot find -lmupdf_linux_amd64
/usr/bin/ld.gold: error: cannot find -lmupdfthird_linux_amd64
/tmp/rules_go_work-002962491/_cgo_main.o:_cgo_main.c:_cgohack_fz_default_color_params: error: undefined reference to 'fz_default_color_params'
/tmp/rules_go_work-002962491/_cgo_main.o:_cgo_main.c:_cgohack_fz_identity: error: undefined reference to 'fz_identity'
...
Basically, it looks like libmupdf is not being found by the gold linker, which Bazel uses to link its binaries.
Link to an example project:
https://github.com/LuizEduardoCardozo/bazel-go-fitz
You can reproduce it by running the following command
bazel build ...
Does anyone know how I can fix this?
Excuse me, how do I build the libmupdf_xxx_xxx.a files from the MuPDF source code?
We've been encountering some build issues on Travis since the recent commits to go-fitz:
https://travis-ci.com/RTradeLtd/Lens/builds/99836987 with -tags nopie enabled:
$ go vet ./...
# github.com/gen2brain/go-fitz
/usr/bin/ld: ../../gen2brain/go-fitz/libs/libmupdf_linux_amd64.a(colorspace.o): unrecognized relocation (0x2a) in section `.text.fz_init_cached_color_converter'
/usr/bin/ld: final link failed: Bad value
collect2: error: ld returned 1 exit status
# github.com/RTradeLtd/Lens/vendor/github.com/otiai10/gosseract
tessbridge.cpp: In function ‘int Init(TessBaseAPI, char*, char*, char*, char*)’:
tessbridge.cpp:46:36: warning: ignoring return value of ‘FILE* freopen(const char*, const char*, FILE*)’, declared with attribute warn_unused_result [-Wunused-result]
freopen("/dev/null", "a", stderr);
^
tessbridge.cpp:60:36: warning: ignoring return value of ‘FILE* freopen(const char*, const char*, FILE*)’, declared with attribute warn_unused_result [-Wunused-result]
freopen("/dev/null", "a", stderr);
^
The command "go vet ./..." failed and exited with 2 during .
without tags: https://travis-ci.com/RTradeLtd/Lens/builds/99567591
11.54s$ go get -u github.com/gen2brain/go-fitz
# github.com/gen2brain/go-fitz
/usr/bin/ld: ../../gen2brain/go-fitz/libs/libmupdf_linux_amd64.a(colorspace.o): unrecognized relocation (0x2a) in section `.text.fz_init_cached_color_converter'
/usr/bin/ld: final link failed: Bad value
collect2: error: ld returned 1 exit status
The command "go get -u github.com/gen2brain/go-fitz" failed and exited with 2 during .
Any help would be appreciated. Thanks!
Hello,
Thanks for the great work,
here is my issue
Code :
package main
import "C"
import "github.com/gen2brain/go-fitz"
//export GetKey
func GetKey() *C.char {
fitz.New("test.pdf")
return C.CString("test")
}
func main() {
}
when building c-shared
go build -buildmode=c-shared -o lib.a main.go
I got this error :
/usr/bin/ld: /home/near/go/pkg/mod/github.com/gen2brain/[email protected]/libs/libmupdf_linux_amd64_musl.a(buffer.o): in function `fz_new_buffer':
buffer.c:(.text.fz_new_buffer+0x47): undefined reference to `sigsetjmp'
/usr/bin/ld: /home/near/go/pkg/mod/github.com/gen2brain/[email protected]/libs/libmupdf_linux_amd64_musl.a(buffer.o): in function `fz_new_buffer_from_data':
buffer.c:(.text.fz_new_buffer_from_data+0x27): undefined reference to `sigsetjmp'
/usr/bin/ld: /home/near/go/pkg/mod/github.com/gen2brain/[email protected]/libs/libmupdf_linux_amd64_musl.a(buffer.o): in function `fz_new_buffer_from_base64':
buffer.c:(.text.fz_new_buffer_from_base64+0xf6): undefined reference to `sigsetjmp'
/usr/bin/ld: /home/near/go/pkg/mod/github.com/gen2brain/[email protected]/libs/libmupdf_linux_amd64_musl.a(colorspace.o): in function `fz_cached_color_convert':
colorspace.c:(.text.fz_cached_color_convert+0xa0): undefined reference to `sigsetjmp'
/usr/bin/ld: /home/near/go/pkg/mod/github.com/gen2brain/[email protected]/libs/libmupdf_linux_amd64_musl.a(colorspace.o): in function `fz_new_colorspace':
colorspace.c:(.text.fz_new_colorspace+0x6f): undefined reference to `sigsetjmp'
/usr/bin/ld: /home/near/go/pkg/mod/github.com/gen2brain/[email protected]/libs/libmupdf_linux_amd64_musl.a(colorspace.o):colorspace.c:(.text.fz_new_icc_colorspace+0x76): more undefined references to `sigsetjmp' follow
collect2: error: ld returned 1 exit status
The PDF converts to images normally, but error messages are still printed. How can I turn them off?
error: cannot recognize xref format
warning: trying to repair broken xref
warning: repairing PDF document
Hi there. I'm using go-fitz within a microservice application with a potentially high number of concurrent network requests. I am aware of the concurrency issues one might have with fitz, e.g. documented in #4. However, I am creating a new fitz.Document for each uploaded document, so it seems to be fine that way.
While load testing the application with k6, I started to observe cryptic error messages from fitz when there are many concurrent user requests for NumPage(). Sometimes these messages accumulated quickly and my application crashed. Just to show you some of these error messages:
error: expected object number
warning: repairing PDF document
warning: object missing 'endobj' token
error: cannot find object in xref (28 0 R)
warning: cannot load object (28 0 R) into cache
error: invalid key in dict
warning: ignoring broken object (21 0 R)
error: cannot recognize version marker
warning: trying to repair broken xref
warning: repairing PDF document
error: cannot recognize version marker
It appears that something is messing with our input data, but only under heavy load. After some research I found an issue for PyMuPDF akin to the subject and also this one particular comment:
It seems the problem can be solved, if I prevent Python freeing the area via its garbage collection, as long as the Python object fitz.Document lives (i.e. is not closed or deleted). This can be done by recording a reference to the bytes / bytearray object in the Document object.
Originally posted by @JorjMcKie in pymupdf/PyMuPDF#173 (comment)
fitz.Document uses *C.fz_stream, which is just a memory range over a Go byte slice. So potentially Go may also GC the original byte slice too early, i.e. before NumPage() has returned. fitz is then trying to read this memory area and understandably fails or even segfaults. If we add the original byte slice to the fitz.Document struct, it should prevent this from happening.
I will provide a minimal application and a test PDF file in the comment below.
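The proposal of holding the byte slice inside the document struct can be sketched on the Go side alone. This is a minimal illustration, assuming a hypothetical memDocument type; it is not the real fitz.Document and involves no cgo, so it only shows how a stored reference keeps the bytes reachable until Close.

```go
package main

import "fmt"

// memDocument is a hypothetical wrapper type, not the real fitz.Document.
type memDocument struct {
	// data is held only so the garbage collector cannot reclaim the
	// backing bytes while the document is open.
	data []byte
}

// openFromMemory records a reference to b for the document's lifetime.
func openFromMemory(b []byte) *memDocument {
	return &memDocument{data: b}
}

// Len stands in for any operation that reads the backing memory.
func (d *memDocument) Len() int { return len(d.data) }

// Close drops the reference; after this the bytes may be collected.
func (d *memDocument) Close() { d.data = nil }

func main() {
	d := openFromMemory([]byte("%PDF-1.7 ..."))
	fmt.Println(d.Len())
	d.Close()
	fmt.Println(d.Len())
}
```

As long as the memDocument value itself is reachable, so is the slice it refers to. Whether that is sufficient for memory handed to C also depends on the cgo pointer-passing rules, which this sketch does not cover.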
go version go1.17.5 linux/amd64
go mod vendor
go run main.go
../../vendor/github.com/gen2brain/go-fitz/fitz.go:6:24: fatal error: mupdf/fitz.h: No such file or directory
#include <mupdf/fitz.h>
I am new to Go, and I need to generate a thumbnail from a PDF when it is uploaded. I am using the go-fitz library with Docker.
But I am having trouble building the Docker image, including ImageMagick.
Docker setup:
FROM golang:alpine AS build
RUN apk --no-cache add gcc g++ make git
WORKDIR /go/src/app
COPY go.mod .
COPY go.sum .
RUN go mod download
COPY . .
RUN GOOS=linux go build -tags extlib -ldflags="-s -w" -o ./bin/web-app ./main.go
FROM alpine:3.13
RUN apk --no-cache add ca-certificates
WORKDIR /usr/bin
COPY --from=build /go/src/app/bin /go/bin
EXPOSE 2053
ENTRYPOINT /go/bin/web-app --port 2053
/go/pkg/mod/github.com/gen2brain/[email protected]/fitz.go:8:10: fatal error: mupdf/fitz.h: No such file or directory
8 | #include <mupdf/fitz.h>
| ^~~~~~~~~~~~~~
compilation terminated.
Thanks
Following the suggestions in the readme, downloading the package with go get -u github.com/gen2brain/go-fitz gives me the -fPIC issue seen here https://gist.github.com/postables/4b603bb0e82ac4203b35dc49a50622e2
Downloading the package with go get -u -tags gcc7 github.com/gen2brain/go-fitz doesn't give me any errors, however running go vet ./... produces the error seen here https://gist.github.com/postables/9eca025ea57f79d22c64acd945bee2fe. The same issue is produced with go build and go test.
uname -a:
Linux dark 4.15.0-39-generic #42-Ubuntu SMP Tue Oct 23 15:48:01 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
/etc/lsb-release:
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="Pop!_OS 18.04 LTS"
go version:
go version go1.11.1 linux/amd64
I have replicated the issue on a separate machine with the following specs:
uname -a:
Linux ipfs-node-1 4.15.0-38-generic #41-Ubuntu SMP Wed Oct 10 10:59:38 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
/etc/lsb-release:
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="Ubuntu 18.04.1 LTS"
go version:
go version go1.10.3 linux/amd64
Update:
It appears that running with go vet -tags gcc7 ./... solves the issue. However, I'm unclear as to how one can get around this issue while using this library on systems which require this workaround.
After upgrading to version v1.18.0, my program crashes with an internal go-fitz error:
warning: cannot load object (19 0 R) into cache
error: aborting process from uncaught error!
In version v0.0.0-20210316172528-f0a07eb93909 this did not happen.
Hello,
I'm running the following Dockerfile on my OSX:
FROM golang:alpine
WORKDIR /app
RUN apk add --no-cache git gcc musl-dev
COPY go.mod go.sum ./
COPY example.go test.pdf ./
RUN go mod download
RUN go build -tags musl -o /example
CMD [ "/example" ]
Which results in the following outcome:
Step 1/8 : FROM golang:alpine
---> f5ae5d299f4c
Step 2/8 : WORKDIR /app
---> Using cache
---> faab50477730
Step 3/8 : RUN apk add --no-cache git gcc musl-dev
---> Using cache
---> 0fa39dfd6012
Step 4/8 : COPY go.mod go.sum ./
---> Using cache
---> 99ba26559e57
Step 5/8 : COPY example.go test.pdf ./
---> Using cache
---> f8076516ab72
Step 6/8 : RUN go mod download
---> Using cache
---> 3f6abaf51608
Step 7/8 : RUN go build -tags musl -o /example
---> Running in 14372ec812da
# github.com/gen2brain/go-fitz
/usr/lib/gcc/aarch64-alpine-linux-musl/11.2.1/../../../../aarch64-alpine-linux-musl/bin/ld: /go/pkg/mod/github.com/gen2brain/[email protected]/libs/libmupdf_linux_arm64.a(context.o): in function `fz_new_context_imp':
context.c:(.text.fz_new_context_imp+0x284): undefined reference to `__fprintf_chk'
I have also tried to run it with platform targets:
docker build . --platform=linux/arm64 - doesn't work, same error
docker build . --platform=linux/amd64 - works
I would want it to actually work properly for multiple targets.
Finally, I figured out the problem I initially had with go-iup! If go-fitz is imported, too, then running the executable results in the dreaded Pango error:
Pango-ERROR **: 17:38:29.868: Harfbuzz version too old (1.3.2)
or a segmentation violation. Since you're the maintainer of both packages, it might be worth looking at a way to fix this problem.
To repeat the problem:
package main
import (
"github.com/gen2brain/iup-go/iup"
// _ "github.com/gen2brain/go-fitz"
)
func main() {
iup.Open()
defer iup.Close()
lbl := iup.Label("This is a test label.")
dlg := iup.Dialog(
lbl,
)
dlg.SetAttribute("TITLE", "Label")
iup.Show(dlg)
iup.MainLoop()
}
It works. Uncomment the line importing go-fitz, and the recompiled executable crashes. (My executable shows the Pango error instead of crashing when go-fitz is imported but I wasn't able to repeat that in the minimal example.) I'm putting the issue here although I can't say which package is the real culprit. In any case, they seem to require incompatible Pango versions.
Tested on Linux Mint 20.3, which is based on Ubuntu 20.04.
The following code is modified from the README, simplified to the greatest extent possible.
package main
import "github.com/gen2brain/go-fitz"
func main() {
doc, err := fitz.New("test.pdf")
if err != nil {
panic(err)
}
defer doc.Close()
// Extract pages as images
for n := 0; n < doc.NumPage(); n++ {
_, err := doc.Image(n) // <- Memory leak here
if err != nil {
panic(err)
}
}
}
The above code produced a memory leak at the call to doc.Image. When executed on a longer PDF where rendering each slide results in an image ranging from 1 to 5 MB (averaging 1.9 MB in size), the program easily uses an entire gigabyte of RAM before 30 seconds has passed.
A similar pattern can be observed when calling doc.SVG and doc.HTML, although both are generally speaking lighter on the RAM usage. I haven't observed this leak when calling doc.Text, but this may just as well be because the operation occurs much faster due to its simplicity.
MuPDF uses a setjmp-based exception handling system, i.e. fz_try/fz_always/fz_catch. To detect whether a file is broken, this needs to be somehow implemented in Go.
In the NewFromMemory function you read the byte slice into a data variable (*C.uchar) and then turn it into a stream. When the job is done that variable lingers in memory. I don't know a lot about C or Cgo integration, but I fixed it in my fork of go-fitz by putting the pointer into the document struct and made sure it got freed in the document close method. When I tried to defer its deletion inside the NewFromMemory function I ran into a panic on the stream.
To test the problem I read a pdf into a byte slice and had a for loop open it with NewFromMemory, dump its text, and then close the document. You'll see the build up in ram immediately.
I figure you might have a better solution than I came up with, because I'm literally guessing at the C commands! ;)
When running:
func Example_NonPDF() {
buf := bytes.NewBufferString("Non pdf")
_, err := fitz.NewFromReader(buf)
// This line is never executed. How can the uncaught error be caught?
fmt.Printf("fitz.NewFromReader error: %v\n", err)
// Output: some error expected
}
this is printed in stdout:
error: cannot recognize version marker
warning: trying to repair broken xref
warning: repairing PDF document
error: no objects found
uncaught error: no objects found
Execution is interrupted and no error is returned.
I suppose these are cgo errors. How can I handle them?
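Until such errors are surfaced as Go errors, one defensive option is to reject obviously non-PDF input before it reaches the library. The following is a minimal sketch under stated assumptions: looksLikePDF is a hypothetical helper, and the 1024-byte search window is an assumed leniency rather than anything go-fitz defines. It does not detect PDFs that are corrupted further into the file, and it would also reject other formats such as EPUB.

```go
package main

import (
	"bytes"
	"fmt"
)

// looksLikePDF reports whether the "%PDF-" marker appears within the
// first 1024 bytes of b. The window size is an assumption of this sketch.
func looksLikePDF(b []byte) bool {
	head := b
	if len(head) > 1024 {
		head = head[:1024]
	}
	return bytes.Contains(head, []byte("%PDF-"))
}

func main() {
	fmt.Println(looksLikePDF([]byte("Non pdf")))
	fmt.Println(looksLikePDF([]byte("%PDF-1.4\n")))
}
```

A caller would return its own error when the check fails instead of handing the bytes to the reader-based constructor.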
I would like to use this library, however go get -u is failing and I'm unsure how to fix it. I'm experienced with Go, yet I haven't done much with shared libraries and cgo. A sample of errors is below. I assume I just need to recompile them with the flag as suggested, however, I'm not sure where they came from.
/usr/bin/ld: /src/github.com/gen2brain/go-fitz/libs/libmupdfthird_linux_amd64.a(tgt.o): relocation R_X86_64_32 against `.rodata.str1.8' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: /src/github.com/gen2brain/go-fitz/libs/libmupdfthird_linux_amd64.a(inffast.o): relocation R_X86_64_32S against `.rodata.str1.1' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: /src/github.com/gen2brain/go-fitz/libs/libmupdfthird_linux_amd64.a(cmsalpha.o): relocation R_X86_64_32S against `.rodata.FormattersAlpha.7892' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: /src/github.com/gen2brain/go-fitz/libs/libmupdfthird_linux_amd64.a(cmscnvrt.o): relocation R_X86_64_32 against `.rodata.str1.1' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: /src/github.com/gen2brain/go-fitz/libs/libmupdfthird_linux_amd64.a(cmsgamma.o): relocation R_X86_64_32 against `.data.DefaultCurves' can not be used when making a shared object; recompile with -fPIC
Which version of MuPDF is it based on?
Hi
I'm getting the following issue after following #35
Image looks like
FROM golang:1.17-alpine3.14 AS build_api
WORKDIR /go/src/gitlab.com/repo
COPY . .
RUN apk add --no-cache build-base \
mupdf mupdf-dev \
freetype freetype-dev \
harfbuzz harfbuzz-dev \
jbig2dec jbig2dec-dev \
jpeg jpeg-dev \
openjpeg openjpeg-dev \
zlib zlib-dev && \
go mod download && \
go mod vendor && \
CGO_LDFLAGS="-lmupdf -lm -lmupdf-third -lfreetype -ljbig2dec -lharfbuzz -ljpeg -lopenjp2 -lz" && \
GOOS=linux CGO_ENABLED=1 GOARCH=amd64 go build -o /tmp/fitz scan.go
error: cannot find builtin CJK font
How do I solve it?
Can you change file fitz_cgo_extlib.go:
#cgo LDFLAGS: -lmupdf
to
#cgo LDFLAGS: -lmupdf -lm -lmupdfthird
It fails with an error without mupdfthird.
Thank you very much.
Hi,
I'm running go-fitz on windows, it compiles fine, and runs fine, until it tries to open a PDF. The line in question is:
fmt.Println("and.... getting ready to read the pdf " + filepath.Join(origDirPath, subject) + ".pdf")
doc, err := fitz.New(filepath.Join(origDirPath, subject) + ".pdf")
if err != nil {
fmt.Println("error: " + err.Error())
}
The println before fitz.New prints:
getting ready to read the directory C:\Users\xxxxx\Desktop\pdfs\ML-xxxx_REV1A.pdf
which is a valid path to a PDF.
I get the error however:
A third of the way down in the stack trace you can see interface.go:157. That's the fitz.New line above.
Any suggestions as to what may have caused this would be very useful, thanks!
EDIT:
package main
import (
"fmt"
fitz "github.com/gen2brain/go-fitz"
)
func main() {
doc, err := fitz.New("C:\\Users\\xxxx\\Desktop\\pdfs\\ML-MT-PH-xxxx.pdf")
if err != nil {
fmt.Println("rror new " + err.Error())
}
defer doc.Close()
return
}
Returns with:
Segmentation fault
EDIT: Managed to get it to spit out exit status 3221225477, which I think is a buffer overflow....
Note this is Windows 64 (Virtual Machine)
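Large Windows exit codes like this are easier to read in hexadecimal. As a small worked example (ntstatusHex is a hypothetical helper name), 3221225477 is 0xC0000005, which is the Windows status code for an access violation, i.e. an invalid memory access comparable to the segmentation fault above rather than specifically a buffer overflow.

```go
package main

import "fmt"

// ntstatusHex formats a Windows exit status as an 8-digit hex value.
func ntstatusHex(code uint32) string {
	return fmt.Sprintf("0x%08X", code)
}

func main() {
	fmt.Println(ntstatusHex(3221225477)) // prints 0xC0000005
}
```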
The Go image/png encoder is quite slow. Would you be able to provide an example with mupdf native png writer?
I am converting a 4x6 inch PDF to a JPG image using the README code for rendering a PDF as an image. The image is generated without problems, but its size is 1200 x 1800 pixels. I didn't find any size-related options. Can you please help with this?
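The reported size is consistent with rendering at 300 DPI, since 4 in x 300 = 1200 px and 6 in x 300 = 1800 px. The sketch below only shows that arithmetic; pixels and dpiForWidth are hypothetical helpers, and whether the installed go-fitz version exposes a DPI parameter is an assumption to check against its documentation.

```go
package main

import (
	"fmt"
	"math"
)

// pixels returns the rendered size in pixels for a length in inches.
func pixels(inches, dpi float64) int {
	return int(math.Round(inches * dpi))
}

// dpiForWidth returns the DPI needed to reach a target pixel width.
func dpiForWidth(targetPx int, inches float64) float64 {
	return float64(targetPx) / inches
}

func main() {
	fmt.Println(pixels(4, 300), pixels(6, 300)) // prints 1200 1800
	fmt.Println(dpiForWidth(600, 4))            // prints 150
}
```

So a 600 x 900 pixel image of the same page would correspond to 150 DPI, and 72 DPI would give 288 x 432 pixels.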
The lib is working fine locally on my golang setup, however when I try to pack it inside a docker container based on Alpine, I got some error logs.
I would like to try to use Mupdf as external lib, I see in the Readme you have some flags to use external lib but for some reason it seems to not work on my machine:
# github.com/gen2brain/go-fitz
/go/pkg/mod/github.com/gen2brain/[email protected]/libs/libmupdfthird_linux_amd64.a(one.o): In function `fmtdate':
one.c:(.text.fmtdate+0x76): undefined reference to `__sprintf_chk'
I build my project using
go build -tags extlib
I'm not used to golang build tags so I don't know if I use it correctly, any kind of help would be appreciated.
Hi:
My OS is macOS, and I need to compile a binary to run on CentOS 7. I use the command GOOS=linux GOARCH=amd64 go build, and there is an error:
go build github.com/gen2brain/go-fitz: build constraints exclude all Go files in /Users/allen/works/go/src/github.com/gen2brain/go-fitz.
Look forward to your reply.
Trying to test it on Windows, but got:
# github.com/gen2brain/go-fitz
C:/Users/hasan/Documents/GoWorkPlace/pkg/mod/github.com/gen2brain/[email protected]/libs/libmupdf_windows_amd64.a(output.o):output.c:(.text$file_tell+0xe): undefined reference to `__imp__ftelli64'
C:/Users/hasan/Documents/GoWorkPlace/pkg/mod/github.com/gen2brain/[email protected]/libs/libmupdf_windows_amd64.a(output.o):output.c:(.text$file_seek+0x14): undefined reference to `__imp__fseeki64'
C:/Users/hasan/Documents/GoWorkPlace/pkg/mod/github.com/gen2brain/[email protected]/libs/libmupdf_windows_amd64.a(stream-open.o):stream-open.c:(.text$seek_file+0x1d): undefined reference to `__imp__fseeki64'
C:/Users/hasan/Documents/GoWorkPlace/pkg/mod/github.com/gen2brain/[email protected]/libs/libmupdf_windows_amd64.a(stream-open.o):stream-open.c:(.text$seek_file+0x2e): undefined reference to `__imp__ftelli64'
collect2.exe: error: ld returned 1 exit status
I'm building a simple microservice based on go-fitz using the official Golang Docker image. When executing the go build step within my container, I'm getting a bunch of error messages from ld. For the Alpine-based image it looks like this:
# github.com/gen2brain/go-fitz
/usr/lib/gcc/x86_64-alpine-linux-musl/10.3.1/../../../../x86_64-alpine-linux-musl/bin/ld: /usr/lib/gcc/x86_64-alpine-linux-musl/10.3.1/../../../../lib/libmupdf.so: undefined reference to `FT_Select_Charmap'
...
I tried the Bullseye image instead, but got almost identical errors:
# github.com/gen2brain/go-fitz
/usr/bin/ld: /usr/lib/gcc/x86_64-linux-gnu/10/../../../../lib/libmupdf.a(font.o): in function `ft_char_index':
(.text.ft_char_index+0x10): undefined reference to `FT_Get_Char_Index'
...
My Dockerfile for Alpine:
FROM golang:1.16-alpine
RUN apk add --no-cache build-base mupdf mupdf-dev
WORKDIR /app
COPY go.mod ./
COPY go.sum ./
COPY src/*.go ./
RUN go mod download && go build -tags extlib -o /fitz-rest
Bullseye (may also be changed to Buster, the issue still persists):
FROM golang:1.16-bullseye
RUN apt update && apt -y install build-essential mupdf libmupdf-dev
WORKDIR /app
COPY go.mod ./
COPY go.sum ./
COPY src/*.go ./
RUN go mod download && go build -tags extlib -o /fitz-rest
I do understand that the errors are coming from unresolved dependencies, e.g. FT_Get_Char_Index obviously originates from the freetype API. However, these dependencies are expected to be brought in by the mupdf and the corresponding dev packages, and if I look into the container filesystem they are indeed there, including header files.
Now I'm running out of ideas how to solve this issue. Maybe you can help me.
P.S. I've already seen #32 but this issue seems to be completely different. The OP of #13 has apparently somehow got the Docker image to work, unfortunately we don't know how exactly.
Thank you very much for your nice work!
It works for us for PDF files but throws the following error for some EPUB files:
error: FT_New_Memory_Face((null)): unknown file format
error: aborting process from uncaught error!
The document loading step seems to work fine, but any operation after that (doc.NumPage(), doc.ImagePNG()) throws.
Here is a file to recreate the issue. FYI the MuPDF desktop app works fine for this exact file.
https://drive.google.com/file/d/1Fu3wZ4iablY-35c9MqzIDRqBN5SzrxOe/view?usp=sharing
Some context about the book: language: Japanese
layout: RTL
writing direction: vertical
Would you please give us some pointers to deal with this?
Hi, this is not an issue in the package itself, but something you may be able to help with.
While reading PDF files, I sometimes get stuck on a file that is a scan rather than a clean PDF. In that case it appears as an image and cannot be read with this package.
My thought is that I could use another package to read the image, like tesseract, if reading with go-fitz fails.
Thanks
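The fallback idea can be sketched as a simple heuristic: if the text extracted from a page contains almost no letters or digits, treat the page as a probable scan and hand it to OCR. This is only a sketch; needsOCR and its threshold are hypothetical, and the OCR step itself is left out.

```go
package main

import (
	"fmt"
	"unicode"
)

// needsOCR reports whether text has fewer than minChars letters or
// digits, which hints that the page may be a scanned image.
func needsOCR(text string, minChars int) bool {
	n := 0
	for _, r := range text {
		if unicode.IsLetter(r) || unicode.IsDigit(r) {
			n++
		}
	}
	return n < minChars
}

func main() {
	fmt.Println(needsOCR(" \n\t", 10))
	fmt.Println(needsOCR("Invoice number 12345", 10))
}
```

The threshold is a judgment call: a value that is too low misses scans carrying a few stray characters, while one that is too high sends nearly empty but genuine text pages to OCR.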
Hello there
Maybe this case has already been handled, but I can't find a way around it. I am trying to compile a basic example using this library but nothing seems to work. This is the command I use:
go build -tags "extlib" -o bin/sample main.go
Of course I installed mupdf first using Homebrew (I am on macOS), but the build does not work. I also tried adding environment variables such as LIBRARY_PATH or LDFLAGS, with no result. The basic example I am trying to compile is the one on the project page. From what I have understood, there could be a way to build just by using the bundled library.
Any leads that can point me in the right direction?
thanks for the help.
Hello, I'm currently getting the following error when running go get on this package:
verifying github.com/gen2brain/[email protected]: checksum mismatch
downloaded: h1:2fa6dEQmSv1utU4Zb8NaY8bjzhfATyiKrs4nffGel7M=
go.sum: h1:vZQRGgQZqHzZRGRnvSUDwz26YD0ZnIyprAvR2C+Hh24=
SECURITY ERROR
This download does NOT match an earlier download recorded in go.sum.
The bits may have been replaced on the origin server, or an attacker may
have intercepted the download attempt.
For more information, see 'go help module-auth'.
My go version is go version go1.16.13 linux/amd64. There seems to be a tagging error of some sort.
/usr/bin/ld: BFD version 2.20.51.0.2-5.44.el6 20100205 internal error, aborting at reloc.c line 443 in bfd_get_reloc_size
/usr/bin/ld: Please report this bug.
Here is my system information:
Linux version 2.6.32-754.35.1.el6.x86_64 ([email protected]) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-23) (GCC) ) #1 SMP Sat Nov 7 12:42:14 UTC 2020
Hello,
When converting a PDF to JPG, the console reports:
error: no builtin cmap file: Adobe-GB1-UCS2
warning: unrecoverable error; ignoring rest of page
Is the character encoding missing?
Can I set a default encoding to use when one such as Adobe-GB1-UCS2 or others is not supported?
Or how else can I fix it?