Comments (12)
Man, it seems to be specifically news sites that are the worst offenders when it comes to breaking the URL spec.
Fix should be out in the alpha version
from nostrudel.
I'm with you. Let's just add "_" and "," to your last version:
https://regex101.com/r/GOWi8J/4
This should fix most issues.
from nostrudel.
This is something that's been on my todo list for a while. "@" also breaks links. The issue is how I have the link RegExp written.
To fix this I need to find a better link RegExp that supports non-English characters and other symbols.
from nostrudel.
Can you post this regex (or code location)? Maybe I can find something.
from nostrudel.
The RegExp is located here: https://github.com/hzrd149/nostrudel/blob/master/src/helpers/embeds.ts#L56
It needs to be simplified and to gain Unicode support. However, it also needs to avoid false positives as much as possible.
An example of a false positive would be http://sub.example.verylongingaliddomain/index.html
or https://example.com/???test=0
or even two URLs back to back without a space: http://example.comhttp://example.com
I haven't figured out a good RegExp yet, but I know one has to exist. If not, how would other social media sites auto-detect links?
from nostrudel.
Thanks, I'll try some stuff over the next few days.
from nostrudel.
So, I tried some stuff - best I came up with is this:
https?:\/\/([\w \.-]+\.\w+)(\S*)
Check it here: https://regex101.com/r/J4hHHn/1
http://sub.example.verylongingaliddomain/index.html can be valid, as there are TLDs like ".cancerresearch" which is already 15 chars long.
https://example.com/???test=0 seems to be a special case. You can prevent it, but I guess it's not worth the effort.
http://example.comhttp://example.com must be valid because of this: https://example.com?host=https://example.com
What do you think?
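For anyone who wants to try the pattern outside regex101, here is a minimal TypeScript sketch. The pattern is copied verbatim from this comment (including the space inside the character class). The helper name findLinks and the sample sentence with parentheses are illustrative assumptions, not code from the noStrudel repository.

```typescript
// Pattern copied as-is from the comment above; only the "g" flag is added
// so that String.prototype.match returns every match.
const candidate = /https?:\/\/([\w \.-]+\.\w+)(\S*)/g;

// Hypothetical helper: returns every substring the candidate pattern treats as a link.
function findLinks(text: string): string[] {
  return text.match(candidate) ?? [];
}

// A long, unknown TLD is accepted as a whole.
console.log(findLinks("http://sub.example.verylongingaliddomain/index.html"));
// → ["http://sub.example.verylongingaliddomain/index.html"]

// Two URLs written back to back come out as a single match.
console.log(findLinks("http://example.comhttp://example.com"));
// → ["http://example.comhttp://example.com"]

// Because the path part is \S*, a closing bracket right after the URL is included.
console.log(findLinks("see (https://example.com) now"));
// → ["https://example.com)"]
```

The last case is the trade-off of using \S* for the path: it accepts every character a path or query can contain, but also any punctuation that directly follows the link.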
from nostrudel.
That's a good start, although the use of \S (not white space) picks up some characters like ")" or "," after the URL, which are pretty common when putting the URL in brackets.
Also, \w includes "_", which technically is invalid for domains.
I replaced \w with a-zA-Z0-9 so it would not include "_", and \S with \p{Letter}\p{Number}, which should include any Unicode letter or number characters (not just English): https://www.regular-expressions.info/unicode.html#category
I also added a lot more examples of URLs and false positives:
https://regex101.com/r/GOWi8J/1
What do you think? Can you think of any other strange URL formats that might need to be considered?
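One thing to keep in mind when moving this from regex101 into JavaScript or TypeScript: \p{Letter} and \p{Number} are only interpreted as Unicode categories when the RegExp has the u flag. The sketch below compares an ASCII-only class with the Unicode one. The names asciiRun, unicodeRun and tokens are made up for this illustration and are not taken from embeds.ts.

```typescript
// ASCII-only class, as in the a-zA-Z0-9 replacement for \w.
const asciiRun = new RegExp("[a-zA-Z0-9]+", "g");

// Unicode-aware class; the "u" flag is what enables \p{...}.
const unicodeRun = new RegExp("[\\p{Letter}\\p{Number}]+", "gu");

// Hypothetical helper: returns every run of characters matched by the pattern.
function tokens(text: string, pattern: RegExp): string[] {
  return text.match(pattern) ?? [];
}

// Cyrillic letters are invisible to the ASCII class...
console.log(tokens("пример_тест", asciiRun)); // → []

// ...but are matched by the Unicode class, and "_" still splits the runs.
console.log(tokens("пример_тест", unicodeRun)); // → ["пример", "тест"]

// Without the "u" flag, \p is just an escaped "p", so no letter category is matched.
console.log(new RegExp("\\p{Letter}").test("\u00e9")); // → false
```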
from nostrudel.
I think you're right on the domain part with \w, but we need \S* for the path, because it's perfectly fine to use ();,._[] etc. in URL params.
Please check https://regex101.com/r/GOWi8J/2 - I added just two real links, which aren't working.
When I put back the \S part, all links work, but markdown or comma-separated links won't. -> https://regex101.com/r/GOWi8J/3
I'll try to find a way to make at least markdown work (but it's not supported in Nostr anyways, right?).
from nostrudel.
Fixing both links is pretty easy, I just needed to add "_" and "," to the list of accepted characters, although it will break the ","-separated URLs (but that's fine because GitHub does not even support that).
I'm not sure about using ();,._[] in URL params though. I know they can be used, but I believe they have to be escaped. Either way, I've seen more markdown and links surrounded by () than I've seen those characters used in URL params.
I'm hesitant to use \S because it covers too much, and I think it would be better for a few links to be broken than to have it select some of the text after the link.
Test
https://example.com,https://example.com
https://example.com)
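For comparison, a different approach from the one released here would be to match greedily with \S and then trim trailing punctuation in a second step. The following is only a sketch of that alternative under stated assumptions: the function names, the set of punctuation characters and the bracket-balancing rule are all invented for illustration and are not part of noStrudel.

```typescript
// Hypothetical helper: strips punctuation that most likely belongs to the
// surrounding sentence rather than to the URL.
function trimTrailingPunctuation(url: string): string {
  let end = url.length;
  while (end > 0) {
    const last = url.charAt(end - 1);
    if (",.;:!?".indexOf(last) !== -1) {
      end--; // sentence punctuation directly after the link
      continue;
    }
    if (last === ")") {
      const body = url.slice(0, end);
      const opens = body.split("(").length - 1;
      const closes = body.split(")").length - 1;
      if (closes > opens) {
        end--; // unbalanced ")" most likely closes a bracket around the link
        continue;
      }
    }
    break;
  }
  return url.slice(0, end);
}

// Hypothetical helper: greedy match first, trim afterwards.
function findLinksTrimmed(text: string): string[] {
  const greedy = /https?:\/\/\S+/g;
  return (text.match(greedy) ?? []).map(trimTrailingPunctuation);
}

console.log(findLinksTrimmed("see (https://example.com) and https://example.com, ok"));
// → ["https://example.com", "https://example.com"]

console.log(findLinksTrimmed("https://en.wikipedia.org/wiki/Rust_(programming_language)"));
// → ["https://en.wikipedia.org/wiki/Rust_(programming_language)"]
```

This keeps balanced brackets inside a path, but it does not split the comma-separated test case above: https://example.com,https://example.com would still come out as one match.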
from nostrudel.
Forgot to close this issue, but the fix for this was released a few days ago.
from nostrudel.
I have to reopen this. Several news sites use tildes in image links. Can we include "~" in the regex?
Sample:
https://nostrudel.ninja/#/n/note1fquun6a9hjcsv0lcd8fafx53zqepqwf5xm6arez7sdzxuscpz28sq79c2g
from nostrudel.