markdown-link-extractor's Introduction

markdown-link-extractor

Extracts links from markdown texts.

Installation

$ npm install --save markdown-link-extractor

API

markdownLinkExtractor(markdown)

Parameters:

  • markdown: text in markdown format.

Returns:

  • an array containing the URLs from the links found.

Examples

const { readFileSync } = require('fs');
const markdownLinkExtractor = require('markdown-link-extractor');

const markdown = readFileSync('README.md', {encoding: 'utf8'});

const links = markdownLinkExtractor(markdown);
links.forEach(link => console.log(link));

Upgrading to v4.0.0

  • anchor link extraction no longer supported

Code that looked like this:

const { links } = markdownLinkExtractor(str);

Should change to this:

const links = markdownLinkExtractor(str);
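
For code that has to run against both older and newer releases during a migration, a small shim can accept either return shape. This is a minimal sketch only: normalizeLinks is a hypothetical helper, not part of the package, and it assumes the older shape is an object with a links array, as the destructuring above suggests.

```javascript
// Hypothetical helper (not part of markdown-link-extractor): accepts either
// the pre-v4 return shape ({ links: [...] }) or the v4 shape (a plain array).
function normalizeLinks(result) {
  if (Array.isArray(result)) {
    return result;
  }
  if (result && Array.isArray(result.links)) {
    return result.links;
  }
  // Anything else is treated as "no links found".
  return [];
}

// Stand-in values for the two return shapes; no package call is made here.
const fromOlderShape = normalizeLinks({ links: ['https://example.com'] });
const fromNewerShape = normalizeLinks(['https://example.com']);
console.log(fromOlderShape, fromNewerShape);
```

In real use the argument would be the value returned by markdownLinkExtractor(str).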

Upgrading to v3.0.0

  • extended mode no longer supported
  • embedded image size parameters in ![]() no longer supported

Testing

npm test

License

See LICENSE.md

markdown-link-extractor's People

Contributors

aepfli, dependabot[bot], dklimpel, nicolasmassart, nschonni, scrum, smellems, tcort, timmkrause

markdown-link-extractor's Issues

Does not extract URLs from HTML

HTML hyperlinks can be part of markdown.

It would be nice to handle cases such as:

<a href='http://foo.bar'>foo</a>

Expected:

[
  "http://foo.bar"
]
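
One way the requested behavior could be approximated outside the package is a regular expression over the raw text. The sketch below is an assumption, not how markdown-link-extractor works: extractHtmlHrefs is a hypothetical helper, and a regex only approximates HTML parsing (it would, for example, also match attributes such as data-href).

```javascript
// Hypothetical helper, not part of markdown-link-extractor.
// Collects href values from <a ...> tags with a regular expression.
// This is an approximation; a real HTML parser would be more robust.
function extractHtmlHrefs(markdown) {
  // Matches <a ... href="..."> or <a ... href='...'> and captures the value.
  const anchorHref = /<a\b[^>]*?\bhref\s*=\s*(?:"([^"]*)"|'([^']*)')/gi;
  const urls = [];
  for (const match of markdown.matchAll(anchorHref)) {
    // Group 1 holds double-quoted values, group 2 single-quoted ones.
    urls.push(match[1] !== undefined ? match[1] : match[2]);
  }
  return urls;
}

console.log(extractHtmlHrefs("<a href='http://foo.bar'>foo</a>"));
```

For the anchor in this report, the helper returns an array holding the single URL http://foo.bar, which matches the expected result above.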

RangeError: Maximum call stack size exceeded - for content with emojis

Marked version:
"marked": "^2.0.5"

Used as a dependency by this package: https://github.com/tcort/markdown-link-extractor

Describe the bug
Getting an error for content with emojis: RangeError: Maximum call stack size exceeded

To Reproduce
Steps to reproduce the behavior:

We use the markdown-link-extractor package to extract links from the following sample content, and marked throws the RangeError:

Hi, Patrick! 👋
Did you hear that group activities like posting a discussion in a group can earn you 10x credits than the regular offer? Learn more about the special offers to look forward to this 10.10 Promo, from October 10 to 13 only!

Check the topic: **[📣 Promo Catalog & Many Ways to Earn Bigger Credits this 10.10 Promo!](https://1pt.ee/c/ea2c93)**

Stack trace

RangeError: Maximum call stack size exceeded
Please report this to https://github.com/markedjs/marked.
    at String.match (<anonymous>)
    at Tokenizer.link (/home/bountee/bundle/programs/server/npm/node_modules/markdown-link-extractor/index.js:14:31)
    at Tokenizer.tokenizer.<computed> (/home/bountee/bundle/programs/server/npm/node_modules/marked/src/marked.js:165:45)
    at Tokenizer.tokenizer.<computed> (/home/bountee/bundle/programs/server/npm/node_modules/marked/src/marked.js:167:31)
    at Tokenizer.tokenizer.<computed>

Expected behavior
No error

Originally posted here:
markedjs/marked#2220
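
Until the underlying problem is fixed, callers could guard the extraction call so that one problematic document does not abort a whole batch. This is a workaround sketch, not a fix: extractLinksSafely is a hypothetical wrapper, and the failing extractor below is a stand-in that only imitates the error from this report.

```javascript
// Hypothetical wrapper, not part of markdown-link-extractor or marked.
// `extract` is whatever extraction function the caller uses.
function extractLinksSafely(extract, markdown) {
  try {
    return { links: extract(markdown), error: null };
  } catch (err) {
    if (err instanceof RangeError) {
      // For example "Maximum call stack size exceeded": report it, keep going.
      return { links: [], error: err };
    }
    throw err; // unrelated failures still propagate
  }
}

// Stand-in extractor that fails the same way as the report above.
const failingExtract = () => {
  throw new RangeError('Maximum call stack size exceeded');
};
const guarded = extractLinksSafely(failingExtract, 'Hi, Patrick! 👋');
console.log(guarded.links.length, guarded.error.message);
```

Returning the error alongside an empty list lets the caller log which document failed rather than silently dropping it.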

Marked.js doesn't parse links in front matter headers correctly

Description of the issue

As indicated in tcort/markdown-link-check#128, links in front matter YAML are parsed incorrectly: the extracted value keeps going past the end of the link, so it includes the closing quote (quotes are valid in YAML for delimiting string values).
Not supporting this seems to be a deliberate choice on the Marked.js side: markedjs/marked#485

Solving leads

First, we need to check whether the latest Marked.js behaves better.

Then there are two options:

  1. exclude the front matter header parsing from Marked.js parsing and parse it separately for links
  2. switch to a parser that handles front matter and would provide the correct result

The first option is clearly the easiest, in my opinion, as we don't know how switching to a new parser would affect existing user projects.
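
The first option could look roughly like the sketch below. Everything in it is an assumption made for illustration: splitFrontMatter and extractFrontMatterUrls are hypothetical helpers, the front matter is assumed to be a leading block delimited by --- lines, and only http(s) URLs are matched.

```javascript
// Hypothetical helpers, not part of markdown-link-extractor.
// Splits a leading YAML front matter block (delimited by --- lines) from the body.
function splitFrontMatter(text) {
  const match = /^---\r?\n([\s\S]*?)\r?\n---[ \t]*(?:\r?\n|$)/.exec(text);
  if (!match) {
    return { frontMatter: '', body: text };
  }
  return { frontMatter: match[1], body: text.slice(match[0].length) };
}

// Finds http(s) URLs in the front matter, stopping at whitespace or quotes
// so that YAML string delimiters are not included in the result.
function extractFrontMatterUrls(frontMatter) {
  return frontMatter.match(/https?:\/\/[^\s"'<>]+/g) || [];
}

const doc = '---\nhomepage: "https://example.com/docs"\n---\n# Title\n';
const { frontMatter, body } = splitFrontMatter(doc);
console.log(extractFrontMatterUrls(frontMatter));
console.log(body);
```

The body would then go to the existing Marked.js-based extraction unchanged, and the two result arrays could be concatenated.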

Expectations

markdown-link-extractor is expected to extract all the links in markdown files, including those in a front matter header.

Linked issue

#7 also asks for links to be extracted from HTML code included in markdown. This is the same kind of request, so both could perhaps be handled at the same time.

Need to update dependency

Running npm audit with this package installed reports that the marked dependency needs upgrading to >=4.0.10.
