dj2 / ruby-rtf
A Ruby RTF Library
Home Page: http://github.com/dj2/ruby-rtf
License: Other
Hi,
I was using this gem to parse an RTF file with Ruby 2.4.0 and Rails '~> 5.1.4'. When I execute
doc = RubyRTF::Parser.new.parse(File.open(path).read)
where "path" points to an RTF file in the public folder (Ruby's native File loads the file), I get a CompatibilityError in
ruby-rtf-0.0.3/lib/ruby-rtf/parser.rb:187:in `handle_control'
Any help, or a pointer to an RTF parser without Java dependencies? I know there are 2 such gems.
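If the CompatibilityError above is Ruby's Encoding::CompatibilityError (an assumption; the report does not include the full message), it usually means two strings with incompatible encodings were combined. A minimal sketch for inspecting the source string before parsing follows. The helper name `describe_source` is hypothetical and not part of ruby-rtf.

```ruby
# Hypothetical diagnostic helper; not part of ruby-rtf.
# Reports how Ruby sees the string that would be handed to the parser.
def describe_source(src)
  {
    encoding: src.encoding.name,    # encoding label attached to the string
    valid: src.valid_encoding?,     # false if the bytes do not fit that label
    ascii_only: src.ascii_only?     # false if any byte is >= 0x80
  }
end

# Possible usage, assuming `path` points at the RTF file:
#   src = File.binread(path)  # read the raw bytes without transcoding
#   p describe_source(src)
```

If `ascii_only` comes back false, the file contains raw high bytes, and the encoding label on the string becomes relevant to whatever the parser appends it to.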
Try to parse the attached file:
test.zip
LibreOffice and every other editor I know of open it correctly.
But parsing it with ruby-rtf gives the following result:
require_relative 'lib/ruby-rtf'
rtf = RubyRTF::Parser.new.parse(File.read('/home/lobashov/temp/test.rtf'))
rtf.sections.each do |sec|
  puts sec[:text]
end
Output (ignoring warnings about unknown controls):
�đėĔ
ĖāĠĖ
ĄāĔā
�āĕĖ
Tested on ruby-rtf v0.0.5 with ruby 2.7
I could not use the gem. After debugging the problem, I found that the code below does not run as it should.
lib/ruby-rtf/parse.rb: 39
while (current_pos < len)
  char = src[current_pos]
  current_pos += 1
  case(char)
  when '\\' then
    name, val, current_pos = parse_control(src, current_pos)
    current_pos = handle_control(name, val, src, current_pos)
I found that src[current_pos] returns an integer, so it never matches any of the character branches of the case statement.
After changing line 39 to "case (char.chr)", it worked properly.
I use Ruby 1.8.7 on Windows 7. Could this be an incompatibility?
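For context, in Ruby 1.8 `String#[]` with a single integer index returns the character code as an integer, while Ruby 1.9 and later return a one-character string. The two-argument form `src[pos, 1]` returns a one-character string on both. The sketch below shows that form; `char_at` is a hypothetical helper name, not something in ruby-rtf.

```ruby
# Hypothetical helper; not part of ruby-rtf.
# src[pos, 1] yields a one-character String on Ruby 1.8 and on 1.9+,
# so a `case` against string literals such as '\\' behaves the same on both.
def char_at(src, pos)
  src[pos, 1]
end

src = '{\rtf1}'
current_pos = 1
case char_at(src, current_pos)
when '\\' then kind = :control
when '{'  then kind = :group_start
else           kind = :text
end
```

This avoids calling `.chr` on a value that is already a string on newer Ruby versions.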
Ran into this while parsing some RTFs in the wild; here's a minimal test case:
{\rtf1 \~}
$ RubyRTF::VERSION
=> "0.0.5"
$ RubyRTF::Parser.new.parse('{\rtf1 \~}')
NoMethodError: undefined method `[]' for nil:NilClass
from [...]/gems/ruby-rtf-0.0.5/lib/ruby-rtf/parser.rb:107:in `parse_control'
Seems like \~ should be a valid directive according to https://www.biblioscape.com/rtf15_spec.htm
My workaround for now is just to replace instances of \~ with whitespace characters before passing the source to RubyRTF::Parser.
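The workaround described above could be sketched roughly as follows. The helper name `replace_nonbreaking_space_controls` is hypothetical. The sketch also assumes that an escaped backslash followed by a tilde (`\\~` in the RTF source) should be left alone, which the original workaround does not mention.

```ruby
# Hypothetical helper; not part of ruby-rtf.
# Replaces the RTF control symbol \~ with a plain space before parsing.
# The first alternative consumes escaped backslashes (\\) so that the
# sequence "\\~" (literal backslash, then a tilde) is not rewritten.
def replace_nonbreaking_space_controls(rtf_source)
  rtf_source.gsub(/\\\\|\\~/) { |match| match == '\\~' ? ' ' : match }
end

cleaned = replace_nonbreaking_space_controls('{\rtf1 a\~b}')
# `cleaned` can then be passed to RubyRTF::Parser.new.parse
```

Note that this turns a nonbreaking space into an ordinary one, so the distinction is lost in the parsed output.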
RTF has the concept of hidden text. While the library has a flag for text that has been struck through, it doesn't seem to have anything to indicate whether a given piece of text is hidden:
2.4.3 :269 > doc.sections[0]
=> {:text=>"HIDDEN TEXT", :modifiers=>{:justification=>:left, :left_indent=>0.0, :right_indent=>0.0, :font=>#<RubyRTF::Font:0x00007fdbdd024ef8 @family_command=:swiss, @name="Calibri", @alternate_name="", @non_tagged_name="", @panose="020f0502020204030204", @number=31506, @character_set=0, @pitch=:variable, @theme=:himinor>, :font_size=>12.0}}
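In the RTF specification, hidden text is switched on with the control word `\v` and off with `\v0`. Until the parser exposes this, one rough way to check whether a document uses hidden text at all is to scan the raw source. This is only a sketch: `hidden_text_toggles` is a hypothetical helper, not part of ruby-rtf, and it ignores group scoping and binary data.

```ruby
# Hypothetical helper; not part of ruby-rtf.
# Matches an escaped backslash (skipped) or the control word \v with an
# optional numeric parameter. The lookahead keeps longer control words
# that merely start with "v" (for example \vertalt) from matching.
HIDDEN_CONTROL = /\\\\|\\v(?![a-zA-Z])(-?\d+)?/

# Returns one boolean per \v occurrence: true turns hidden text on,
# false (\v0) turns it off.
def hidden_text_toggles(rtf_source)
  toggles = []
  rtf_source.scan(HIDDEN_CONTROL) do
    match = Regexp.last_match
    next if match[0] == '\\\\'      # literal backslash, not a control word
    toggles << (match[1] != '0')
  end
  toggles
end

toggles = hidden_text_toggles('{\rtf1 shown {\v HIDDEN TEXT}\v0 shown again}')
```

A non-empty result only indicates that the source toggles hidden text somewhere; it does not map the toggles onto the sections the parser returns.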