orbitalquark / scintillua

Scintillua enables Scintilla lexers to be written in Lua, particularly using LPeg. It can also be used as a standalone Lua library for syntax highlighting support.

Home Page: https://orbitalquark.github.io/scintillua

License: MIT License

Languages: C++ 3.20%, C 0.08%, Makefile 0.76%, Lua 95.97%
Topics: scintilla, lua, lpeg, lexer, syntax-highlighting, scintillua, lexers, scite

scintillua's People

Contributors

orbitalquark

scintillua's Issues

Heredoc formatting in the bash lexer is not correct

Currently vis carries a set of patches contributed by @silasdb that aim to improve the formatting of heredocs: one fixes the ending delimiter, and a set of others recognizes <<- (here and here).

However, even with these patches the formatting is not yet correct for all valid uses of heredoc, as it doesn't deal correctly with spaces between << and the delimiter. Moreover, the formatting fails when the first delimiter is quoted to prevent variable expansion. As a result, fixing this does not appear to be trivial.

I've attached a test file that highlights the above problems, as this is easier to show than to explain. It is based in part on this tutorial and covers cases that are worth fixing; handling them is probably a useful step toward getting this working correctly.

File including valid usage of heredoc: test-bash.zip
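
The cases described above (optional `-`, optional whitespace before the delimiter, and a quoted or escaped delimiter) can be sketched in plain Lua without LPeg. This is only an illustration of what a heredoc opener has to recognize, not the bash lexer's actual implementation; `parse_heredoc_start` is a hypothetical helper name.

```lua
-- Hypothetical helper: parse the start of a heredoc on a single line.
-- Returns the delimiter word, whether '<<-' was used (leading tabs are
-- stripped), and whether the delimiter was quoted or escaped (which
-- disables expansion in the heredoc body).
local function parse_heredoc_start(line)
  -- '<<', an optional '-', then optional spaces or tabs before the delimiter.
  local dash, rest = line:match('<<(%-?)[ \t]*(.*)')
  if not dash then return nil end
  -- Quoted delimiter: 'EOF' or "EOF".
  local quote, word = rest:match('^([\'"])(.-)%1')
  if not word then
    -- Bare or backslash-escaped delimiter: EOF or \EOF.
    local escaped
    escaped, word = rest:match('^(\\?)([%w_]+)')
    if not word then return nil end -- e.g. a '<<<' here-string
    quote = escaped ~= '' and '\\' or nil
  end
  return word, dash == '-', quote ~= nil
end

print(parse_heredoc_start('cat << EOF'))
print(parse_heredoc_start("cat <<-'END'"))
```

A lexer would additionally need to carry the captured delimiter forward to find the closing line, which is the part the existing patches address.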

Update legacy lexers

Scintillua 6.0 considers lexers from Scintillua 5.x to be legacy lexers. While legacy lexers should still function properly, they ought to be migrated using this migration guide.

In addition to the steps outlined in the migration guide, programming language lexers should ideally distinguish between functions, builtin functions, and methods. They should also distinguish between constants and builtin constants, and variables and builtin variables. For example, the Lua lexer does so here and here. The Makefile lexer does so here.

Lexers that use custom tokens should try to pick from the updated list of tags. If that is not possible or reasonable, it may be worth considering adding to that list. Otherwise, the token should keep its custom tag name, and a style should be set for it in every theme in themes/. That signals to applications that they need to add styling for it.

The following unchecked Scintillua lexers still need to be migrated:

  • Actionscript
  • Ada
  • ANTLR
  • APDL
  • APL
  • Applescript
  • ASM (NASM)
  • ASP
  • AutoIt
  • AWK
  • Batch
  • BibTeX
  • Boo
  • C
  • C++
  • C#
  • ChucK
  • Clojure
  • CMake
  • Coffeescript
  • ConTeXt
  • CSS
  • CUDA
  • D
  • Dart
  • Desktop Entry
  • Diff
  • Django
  • Dockerfile
  • Dot
  • Eiffel
  • Elixir
  • Elm
  • Erlang
  • F#
  • Fantom
  • Faust
  • Fennel
  • Fish
  • Forth
  • Fortran
  • fstab
  • GAP
  • Gemini
  • gettext
  • Gherkin
  • git-rebase
  • Gleam
  • GLSL
  • Gnuplot
  • Go
  • Groovy
  • Gtkrc
  • Hare
  • Haskell
  • HTML
  • Icon
  • IDL
  • Inform
  • ini
  • Io
  • Java
  • Javascript
  • jq
  • JSON
  • JSP
  • Julia
  • LaTeX
  • Ledger
  • LESS
  • LilyPond
  • Lisp
  • Literate Coffeescript
  • Logtalk
  • Lua
  • Makefile
  • Man Page
  • Markdown
  • MATLAB
  • MediaWiki
  • Meson
  • MoonScript
  • Myrddin
  • Nemerle
  • Networkd
  • Nim
  • NSIS
  • Objective-C
  • OCaml
  • Pascal
  • Perl
  • PHP
  • PICO-8
  • Pike
  • PKGBUILD
  • Pony
  • Postscript
  • PowerShell
  • Prolog
  • Properties
  • Pure
  • Python
  • R
  • rc
  • Reason
  • REBOL
  • ReStructuredText
  • Rexx
  • RHTML
  • RouterOS
  • RPM Spec
  • Ruby
  • Ruby on Rails
  • Rust
  • Sass
  • Scala
  • Scheme
  • Shell
  • Smalltalk
  • strace
  • Standard ML
  • SNOBOL4
  • Spin
  • SQL
  • Systemd
  • TaskPaper
  • Tcl
  • TeX
  • Texinfo
  • TOML
  • txt2tags
  • TypeScript
  • Vala
  • vCard
  • Verilog
  • VHDL
  • Visual Basic
  • Windows Script File
  • XML
  • Xs
  • Xtend
  • YAML
  • Zig

html syntax highlight misses on single tags

Using vis, single HTML tags like hr, input, and meta (etc.) aren't getting highlighted.
My Lua knowledge is pretty sparse, but here's a workaround that has been working for me so far:

From 69243d5a88623f5e7d08eb79b827ad31aa893ece Mon Sep 17 00:00:00 2001
From: jvvv
Date: Mon, 17 Jun 2024 13:16:17 -0400
Subject: [PATCH] fix syntax highlight for html single tags

---
 lua/lexers/html.lua | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lua/lexers/html.lua b/lua/lexers/html.lua
index 9146b63..a0ce2e7 100644
--- a/lua/lexers/html.lua
+++ b/lua/lexers/html.lua
@@ -16,7 +16,7 @@ lex:add_rule('doctype',
 
 -- Tags.
 local paired_tag = lex:tag(lexer.TAG, lex:word_match(lexer.TAG, true))
-local single_tag = lex:tag(lexer.TAG .. '.single', lex:word_match(lexer.TAG .. '.single', true))
+local single_tag = lex:tag(lexer.TAG, lex:word_match(lexer.TAG .. '.single', true))
 local known_tag = paired_tag + single_tag
 local unknown_tag = lex:tag(lexer.TAG .. '.unknown', (lexer.alnum + '-')^1)
 local tag = lex:tag(lexer.TAG .. '.chars', '<' * P('/')^-1) * (known_tag + unknown_tag) * -P(':')
-- 
2.45.2

I'm running vis built from git HEAD on alpinelinux edge.
I compared the html lexers in scintillua and vis, and they do not differ.
I also submitted a downstream issue with vis: https://github.com/martanne/vis/issues/1196, which may get closed since I was asked to report the issue upstream.

Merge vis lexers

Missing lexers from vis should be migrated here instead.

  • dsv
  • gemini
  • git-rebase
  • strace
  • vbscript

Can't disable Lua lexer through properties

With Scintillua installed in SciTE (Sc1, single file), I can't disable a Lua lexer by resetting the property lexer.$(file.patterns.name)=name to the original Lexilla lexer.
It seems to me that a recent commit in Lexilla, ScintillaOrg/lexilla@e4817a1, breaks the Scintillua behavior: a default Lua lexer is always created and set after being called by Lexilla::MakeLexer.
Any suggestions on how to get around this?

Markdown lexer, code_inline not closed for empty backtick pairs

Open textadept 12.4, on the menu go to Tools > Quick Open > Quickly Open Textadept Home.
Select docs/api.md from the filter list and click OK to open it.
Go to line 7533, which is under the heading for textadept.editing.auto_pairs.
On line 7533 there's a pair of empty backticks, after them the text is styled incorrectly, as though the rest of the text is inside a code block. It's easier to see if you're using a dark theme.

The problem is in the code_inline LPeg pattern in lexers/markdown.lua on line 36:

local code_inline = lpeg.Cmt(lpeg.C(P('`')^1), function(input, index, bt)

The pattern matches any number of leading backticks and then calls a function to determine where the inline code ends. However, it interprets any even-length run of backticks with nothing between them as an unclosed region of inline code. Adding an if statement to handle those edge cases fixes the issue. See the new line in the example fix below.

local code_inline = lpeg.Cmt(lpeg.C(P('`')^1), function(input, index, bt)
  -- `foo`, ``foo``, ``foo`bar``, `foo``bar` are all allowed.
  local _, e = input:find('[^`]' .. bt .. '%f[^`]', index)
  if not e and (#bt % 2 == 0) then return index end  --<<<<< New line <<<<<<
  return (e or #input) + 1
end)
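
To check the fix without loading LPeg, the body of the match-time function can be pulled out into an ordinary function and called directly. This is a sketch for experimentation only: `inline_code_end` is a hypothetical name, and `index` is assumed to be the position just after the opening backticks, which is what lpeg.Cmt passes to its function.

```lua
-- Same logic as the patched match-time function, extracted so it can be
-- called without LPeg. Returns the position at which the match would end.
local function inline_code_end(input, index, bt)
  local _, e = input:find('[^`]' .. bt .. '%f[^`]', index)
  -- No closing run and an even number of backticks: treat it as an empty
  -- pair and consume nothing after the backticks.
  if not e and (#bt % 2 == 0) then return index end
  return (e or #input) + 1
end

print(inline_code_end('`foo` bar', 2, '`'))       -- closed span
print(inline_code_end('`` rest of line', 3, '``')) -- empty pair
print(inline_code_end('`foo', 2, '`'))             -- unclosed span
```

With the extra line, the empty pair returns the position right after the backticks, so the rest of the line is no longer swallowed as inline code.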

When you test this, interactions with the code_line and code_block LPeg patterns can make it difficult to determine which pattern is causing which styling, but the above fix does work as intended as far as I can tell.

Thanks for all the time you've spent developing Textadept.

lexer.FOLD_HEADER, lexer.FOLD_BASE and lexer.FOLD_BLANK are nil

lexer.FOLD_HEADER, lexer.FOLD_BASE and lexer.FOLD_BLANK always evaluate to nil because they are never assigned to at the module level in lexer.lua.

They are only assigned to locals within lexer.fold(), here:

local FOLD_HEADER, FOLD_BLANK = M.FOLD_HEADER or 0x2000, M.FOLD_BLANK or 0x1000

This also causes an error when using the rest lexer, which overrides lexer.fold(). See here:

local FOLD_BASE = lexer.FOLD_BASE
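
One possible workaround, shown here only as a sketch, is to apply the same `or` fallback that lexer.fold() uses wherever the constants are read. The empty `lexer` table below is a stand-in for the real module in an environment where Scintilla has not defined the constants; the default values are the ones that appear elsewhere on this page (0x2000 and 0x1000 in lexer.fold(), 0x400 in tests.lua).

```lua
-- Stand-in for the lexer module when the fold constants are undefined
-- (an assumption for illustration; the real module is lexer.lua).
local lexer = {}

-- Fall back to the usual Scintilla values when the fields are nil.
local FOLD_BASE = lexer.FOLD_BASE or 0x400
local FOLD_HEADER = lexer.FOLD_HEADER or 0x2000
local FOLD_BLANK = lexer.FOLD_BLANK or 0x1000

print(FOLD_BASE, FOLD_HEADER, FOLD_BLANK)
```

Assigning the defaults once at module level in lexer.lua would presumably have the same effect for every lexer that reads them.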

lua utf8 library is a lua5.3+ feature

Hi,

In the README you say that only lua5.1 is required for using scintillua
as a standalone library but the utf8 library, in use since (possibly)
3f7febe, is lua5.3+ only. Was this an intentional change?

Correctly formatting nesting variables in bash lexer

Vis currently carries the following patch to correctly format nested strings in bash, as in the following test case:

${FOO:="${bar}/baz"}

Currently the bash lexer in scintillua formats the trailing '"}' incorrectly, so that bug is still present. The fix in vis added the balanced flag to the delimited_range function, but adding that same flag to the range function that replaces it doesn't work.

I suspect something is missing in the range() function if I compare it to the same part in delimited_range?
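
For reference, the difference between balanced and non-balanced matching on this test case can be shown with Lua's built-in string patterns. This is not how lexer.range() is implemented; it only illustrates the behavior the balanced flag is expected to provide.

```lua
local input = '${FOO:="${bar}/baz"}'

-- Non-balanced: stop at the first closing brace.
local unbalanced = input:match('%${.-}')
-- Balanced: %b{} counts nested braces and stops at the matching one.
local balanced = input:match('%$%b{}')

print(unbalanced) -- ends after the inner ${bar}
print(balanced)   -- covers the whole expansion
```

The symptom described above, where the last '"}' is styled incorrectly, is consistent with the first, non-balanced behavior.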

Lexer Error for ERB (rhtml) files

Since updating to v12, I've been getting errors when trying to edit ERB files. I reduced it down to the following process:

  1. Create a new blank file. I named mine test.html.erb
  2. Insert an extra blank line
  3. A new tab appears outputting the following error: ...ations/Textadept.app/Contents/Resources/lexers/lexer.lua:1205: grammar has no initial rule

I was able to work around this by creating a lexers/rhtml.lua file in my ~/.textadept directory to override the stock lexer. For now I've just made it use the html lexer via return lexer.load('html') so I at least get that portion of the syntax highlighting.

As an additional quirk, if tabs are disabled via ui.tabs = false then the entire Textadept does a hard crash. I am doing this on a Mac and haven't tested if this is reproducible on other platforms. I'm guessing it's trying to create that new tab with the error but can't or something? This additional bit sounds more like a bug in the Textadept program proper so let me know if you prefer me to report that separately to that project.

Bash Lexer incorrectly treats '#' as start of comment in variable pattern match

Hi I was reviewing a patchset to update vis to use scintillua 6.2
(devel_scintillua branch here) and I noticed that the bash lexer
can no longer handle pattern matching in variables. For example any text
like this snippet from our configure script:

--prefix=*) PREFIX=${arg#*=} ;;
--exec-prefix=*) EXEC_PREFIX=${arg#*=} ;;
--bindir=*) BINDIR=${arg#*=} ;;
--sharedir=*) SHAREDIR=${arg#*=} ;;
--docdir=*) DOCDIR=${arg#*=} ;;
--mandir=*) MANDIR=${arg#*=} ;;

It seems like the lexer can't find the closing } because the #
is being treated as the start of a comment.

I bisected the issue and it was introduced in b909d43 when the lexer
was updated to the new format. Unfortunately, I am not familiar with how
the lexers are implemented here, so I was unable to make a patch, but it
seems like there used to be a lexer.range('{', '}', true, false, true)
for variables that is no longer there.
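
The expected behavior can be sketched in plain Lua: a `#` inside a `${...}` expansion should not count as the start of a comment. The helper below is hypothetical and deliberately simplified (it ignores quoting and the rule that `#` must begin a word); it only demonstrates skipping balanced `${...}` ranges before looking for a comment.

```lua
-- Hypothetical helper: find where a comment starts on a line, ignoring any
-- '#' that appears inside a balanced ${...} expansion.
local function find_comment_start(line)
  -- Mask each ${...} with same-length filler so positions stay valid.
  local masked = line:gsub('%$%b{}', function(s) return string.rep('x', #s) end)
  return (masked:find('#', 1, true))
end

local line = '--prefix=*) PREFIX=${arg#*=} ;; # set prefix'
local pos = find_comment_start(line)
print(line:sub(pos))
print(find_comment_start('--bindir=*) BINDIR=${arg#*=} ;;'))
```

In the lexer itself, the equivalent would be a variable rule that consumes the whole `${...}` range before the comment rule gets a chance to match.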

LexLPeg.dll crash

I'm new to Scintillua and I was trying the 5.1 release with SciTE 5.1.1 64bit (single file 64-bit executable). I just added the line import lexers/lpeg to an empty SciTEGlobal.properties and it crashes immediately. I also tried the 32bit version (with SciTE 32), but it doesn't seem to work (doesn't crash, but no styles rendered).
Here are my findings till now:

  • Scintillua 5.1, SciTE 5.1.1: 64bit crash, 32bit doesn't work (both SciTE single executable and SciTE full)
  • Scintillua 5.0, SciTE 5.0.3: 64bit crash, 32bit doesn't work
  • Scintillua 4.4.5-2, SciTE 4.4.5: 64bit crash, 32bit does work

Can I do some tests to help solve the problem? I would have tried to debug, but I only have a Windows system with VS 2019.

Custom Python styles not working in SciTE

The Python lexer's 'self' and 'decorator' styles are not working properly in SciTE 5.1.4 and Scintillua 5.2. They work when adding rules using lexer.TYPE and lexer.PREPROCESSOR tokens, respectively, but not when adding lexer.styles.type and lexer.styles.preprocessor styles.

latest SciTE v5.3.0 not working with scintillua 5.3 (Windows)

Hi,

Thanks for Scintillua!

SciTE v5.3.0 seems to have broken something?

SciTE v5.2.4 & Scintillua v5.3 worked together perfectly.

before,
2022-09-19 01_13_30-D__Coding_Work_Bezier_C_vec2d_005 h - SciTE

after (simply updating SciTE.exe, Scintilla.dll & Lexilla.dll),
2022-09-19 01_15_37-D__Coding_Work_Bezier_C_vec2d_005 h - SciTE

I'm using SciTE 64bit on Windows 8.1.

best regards,
Rafael.

Markdown lexer: two blockcode errors

Fenced blockcode with tildes (CommonMark spec) is not supported

I fixed this (I'm not a programmer and I don't know how to open a PR) by adding this line after local code_block =

local code_block_tilde = lexer.range(lexer.starts_line('~~~'), '\n~~~' * hspace^0 * ('\n' + P(-1)))

And then changed this line to include tilde block type:

lex:add_rule('block_code', token('code', code_line + code_block + code_block_tilde + code_inline))

Blockcode (line, not fenced) error (edge case with lists inside)

Markdown example (indent using Tabs or 4 spaces to get blockcode line):

	This
	* Whole
	+ Paragraph
	- Should be
	a Blockquote

But only the first and last lines are styled as blockcode; I think the other lines are recognized as list items.

I've tried to fix this, but it is tied to the list rules, and fixing one breaks the other.


Tested expected behavior with Github and:

Bash lexer: test (`[`, `[[`) flags check doesn't work and is also very slow

One very common shell idiom is to have a test on a single line like
the following:

[ $# -lt 3 ] && ...

Currently this will be missed since the pattern requires a space before
the match (ie. if [ ...). This can be trivially fixed by:

diff --git a/lexers/bash.lua b/lexers/bash.lua
index f1365ae..439347b 100644
--- a/lexers/bash.lua
+++ b/lexers/bash.lua
@@ -71,7 +71,7 @@ local shell_op = '-o'
 local var_op = '-' * S('vR')
 local string_op = '-' * S('zn') + S('!=')^-1 * '=' + S('<>')
 local num_op = '-' * lexer.word_match('eq ne lt le gt ge')
-local in_cond_expr = in_expr{[' [[ '] = ' ]]', [' [ '] = ' ]'}
+local in_cond_expr = in_expr{['[[ '] = ' ]]', ['[ '] = ' ]'}
 local conditional_op = (num_op + file_op + shell_op + var_op + string_op) * #lexer.space *
   in_cond_expr

The slowness problem arises when there are many places in the input data
where the in_expr() function needs to be called. For example when
the input data is the configure script from vis. I think the
main problem is that using match to split up a block of input data
into lines is really inefficient. I'm not sure how to improve it while
maintaining functionality. Perhaps there is a way of using lexer.range()
to make it more efficient.
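
One direction that might help, offered only as a sketch, is to locate just the line containing the current position instead of splitting the whole input into lines on every call. How in_expr() actually works is not shown here, so `line_at` is a hypothetical helper and its fit with the existing code is an assumption.

```lua
-- Hypothetical helper: return the line that contains `index`, plus its start
-- position, by scanning outward from `index` only.
local function line_at(input, index)
  local s = index
  -- Walk back to just after the previous newline (byte 10).
  while s > 1 and input:byte(s - 1) ~= 10 do s = s - 1 end
  -- Find the next newline at or after index, or use the end of input.
  local e = input:find('\n', index, true) or (#input + 1)
  return input:sub(s, e - 1), s
end

local input = 'a=1\nif [ $# -lt 3 ]; then\nfi\n'
local index = input:find('-lt', 1, true)
print(line_at(input, index))
```

The cost of each call is then proportional to the length of one line rather than to the size of the input.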

Syncing changes in the scheme lexer

The scheme lexer in vis currently has a pretty extensive patch on top of the legacy lexer. Although it is clearly an improvement I don't feel confident enough to merge this on top of the current lexer. Any chance you could take a look at how much is actually relevant?

Thanks in advance.

how to set colors ?

This library is great. I have used Scintillua.dll in my application to highlight Lua script syntax, and it currently works normally. After reading the manual, I am still not sure how to set the colors. Can I set them directly using the theme file? Thank you very much if you can reply.

Additional highlighting for c++ lexer

I'd like to know your thoughts on some possible additional highlighting options.

Currently the lexer will highlight namespaces if they are part of an STL type (e.g. std::vector). Other editors seem to highlight all namespaces regardless.
So in foo::bar::foobar() foo and bar would be highlighted in a to-be-defined namespace color and foobar would be a normal function highlight. Would this be a possibility for TA as well?

Should values in angle brackets always be highlighted as type? Consider std::vector<Foo>, here Foo is a custom type and could be highlighted as such.

Trying to upgrade vis lexers to scintillua master but it doesn't seem to be valid Lua anymore (or …???)

While trying to update the lexers in vis to https://github.com/orbitalquark/scintillua/releases/tag/scintillua_6.1, I have hit a strange issue that I don't understand. It seems that all lexers have something similar to this change:

-local lexer = require('lexer')
-local lex = lexer.new('ansi_c')
+local lexer = lexer
+local lex = lexer.new(...)

I really don’t understand the new code. Is local lexer = lexer even valid Lua? And what does lexer.new(...) actually mean (varargs are supposed to be used in the definition of a function, not when calling it, right)?

Is there some Textpad magic involved which is not available for us poor Lua users?
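
Both constructs are plain Lua. A file (chunk) is compiled as a vararg function, so `...` at the top level holds whatever arguments the caller passed when running the chunk, and `local lexer = lexer` just creates a local alias for an existing global of the same name. The sketch below demonstrates this with a stand-in `lexer` global; that Scintillua's loader passes the lexer name to the chunk in this way is an assumption here.

```lua
-- Stand-in global for illustration; the real one would be Scintillua's lexer module.
lexer = {new = function(name) return {name = name} end}

local source = [[
local lexer = lexer        -- local alias of the global 'lexer'
local lex = lexer.new(...) -- '...' is the chunk's own argument list
return lex
]]

-- loadstring exists in Lua 5.1; load accepts strings in Lua 5.2 and later.
local chunk = assert((loadstring or load)(source))
local lex = chunk('ansi_c') -- the caller supplies the lexer name
print(lex.name)
```

Running a lexer file directly with require would leave `...` holding whatever require passes, so such files are presumably meant to be loaded through the lexer module's own loader.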

tests.lua fails with Lua 5.3

In #27 it was noted that tests.lua fails with Lua 5.3. @moesasji confirms that, and @orbitalquark argues that defining _G.lpeg works. Indeed, the patch below made it work for Lua 5.3. Defining lpeg as a global is equivalent to defining _G.lpeg, no?

diff --git a/tests.lua b/tests.lua
index 5e0af5d..bb19711 100644
--- a/tests.lua
+++ b/tests.lua
@@ -5,7 +5,7 @@ package.path = 'lexers/?.lua;'..package.path
 
 local lexer = require('lexer')
 local token, word_match = lexer.token, lexer.word_match
-local lpeg = require('lpeg')
+lpeg = require('lpeg')
 -- Scintilla normally defines these.
 lexer.FOLD_BASE, lexer.FOLD_HEADER, lexer.FOLD_BLANK = 0x400, 0x2000, 0x1000
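
On the question of equivalence: in the default environment, assigning to an undeclared name does set a field in _G, so the two forms behave the same. A short check, using made-up variable names:

```lua
-- Assigning without 'local' creates a field in the global table.
example_global = 'set without local'
assert(_G.example_global == 'set without local')

-- A local is not visible through _G.
local example_local = 'local only'
assert(_G.example_local == nil)

print('global assignment and _G field are the same variable')
```

The equivalence only breaks if the chunk runs with a different environment, for example after setfenv in Lua 5.1 or with a custom _ENV in Lua 5.2 and later.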

Syncing changes in ansi_c lexer

As you have probably guessed, I'm trying to sync the lexers used by vis so that both communities can benefit from each other's efforts.

Unfortunately, adding this commit on top of the current ansi_c lexer is less straightforward because of the range of things it changed on top of the legacy lexer.

The key changes are in how it recognises numbers, as well as the addition of non-standard compiler keywords. Any chance that you could take a look at how best to merge this one without me preparing a pull request?
