laptobbe / tsmarkdownparser

License: MIT License

Objective-C 96.76% Ruby 2.19% Swift 1.05%

tsmarkdownparser's Introduction

TSMarkdownParser

TSMarkdownParser is a markdown to NSAttributedString parser for iOS, implemented using NSRegularExpression. It supports many of the standard tags laid out by John Gruber on his site Daring Fireball. It is also very extensible via regular expressions, making it easy to add your own custom tags or a totally different parsing syntax if you like.

Supported tags

Below is a list of tags supported by the parser out of the box. To add your own tags, see "Adding custom parsing".

Escaping
\`
`code`
``code``

Headings
# H1
## H2
### H3
#### H4
##### H5
###### H6

Lists
* item
** item
+ item
++ item
- item
-- item

Quotes
> text
>> text

Images
![Alternative text](image.png)

URL
[Link text](https://www.example.net)

Autodetection
https://www.example.net

Emphasis
*Em*
_Em_
**Strong**
__Strong__

Requirements

TSMarkdownParser 2.x requires Xcode 7 or newer.

Installation

TSMarkdownParser is distributed via CocoaPods

pod 'TSMarkdownParser'

Alternatively, you can clone the project and build the static library set up in the project, or drag the source files into your project.

Usage

The standardParser class method provides a new instance of the parser configured to parse the tags listed above. You can also just create a new instance of TSMarkdownParser and add your own parsing. See "Adding custom parsing" for information on how to do this.

NSAttributedString *string = [[TSMarkdownParser standardParser] attributedStringFromMarkdown:markdown];

Customizing appearance

You can configure how the markdown is to be displayed by changing the different properties on a TSMarkdownParser instance. Alternatively you could implement the parsing yourself and add custom attributes to the attributed string. You can also alter the attributed string returned from the parser.

Adding custom parsing

Below is an example of how parsing of the bold tag is implemented. You can add your own parsing using the same addParsingRuleWithRegularExpression:block: method. You can add a parsing rule to the standardParser or to your own instance of the parser. If you want to use any of the configuration properties within the block, make sure you use a weak reference to the parser so you don't create a retain cycle.

NSRegularExpression *boldParsing = [NSRegularExpression regularExpressionWithPattern:@"(\\*\\*|__)(.+?)(\\1)" options:kNilOptions error:nil];
__weak TSMarkdownParser *weakSelf = self;
[self addParsingRuleWithRegularExpression:boldParsing block:^(NSTextCheckingResult *match, NSMutableAttributedString *attributedString) {
    [attributedString deleteCharactersInRange:[match rangeAtIndex:3]];
    [attributedString addAttributes:weakSelf.strongAttributes range:[match rangeAtIndex:2]];
    [attributedString deleteCharactersInRange:[match rangeAtIndex:1]];
}];
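
The block above deletes the closing delimiter (group 3) before the opening one (group 1), so that the earlier range is still valid when it is used. The same idea can be tried without the parser at all. The following is a minimal sketch using only Foundation; the helper name TSExampleStripBoldDelimiters is hypothetical and not part of TSMarkdownParser, and iterating the matches back to front is an assumption about how to keep ranges valid across several matches, not a description of the library's internals.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of TSMarkdownParser): removes the ** or __
// delimiters around every bold span in a plain string.
static NSString *TSExampleStripBoldDelimiters(NSString *markdown) {
    NSRegularExpression *bold =
        [NSRegularExpression regularExpressionWithPattern:@"(\\*\\*|__)(.+?)(\\1)"
                                                  options:0
                                                    error:NULL];
    NSMutableString *result = [markdown mutableCopy];
    NSArray *matches = [bold matchesInString:markdown
                                     options:0
                                       range:NSMakeRange(0, markdown.length)];
    // Walk the matches back to front so ranges of earlier matches stay valid.
    for (NSTextCheckingResult *match in [matches reverseObjectEnumerator]) {
        // Closing delimiter first, then the opening one, as in the block above.
        [result deleteCharactersInRange:[match rangeAtIndex:3]];
        [result deleteCharactersInRange:[match rangeAtIndex:1]];
    }
    return result;
}
```

Deleting group 1 first would shift group 3 by the length of the delimiter, which is why the order matters.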

License

TSMarkdownParser is distributed under the MIT license; see the license file for more info.

tsmarkdownparser's Issues

deprecation warning for iOS/tvOS: `stringByAddingPercentEscapesUsingEncoding:`

- WARN  | [tvOS] xcodebuild:  TSMarkdownParser/TSMarkdownParser/TSMarkdownParser.m:253:77: warning: 'stringByAddingPercentEscapesUsingEncoding:' is deprecated: first deprecated in tvOS 9.0 - Use -stringByAddingPercentEncodingWithAllowedCharacters: instead, which always uses the recommended UTF-8 encoding, and which encodes for a specific URL component or subcomponent since each URL component or subcomponent has different rules for what characters are valid. [-Wdeprecated-declarations]

But because we are trying to do some error handling here, we would need to manually decompose the string to be aware of the components of the URL to percent-escape.

Double formatting doesn't parse

If you have both strong and emphasis on the same text, i.e.:

"Here is **some text that is both _bold and italic_**"

Then it should be able to parse it as follows:

Here is some text that is both bold and italic

But right now it parses it as

Here is some text that is both bold and italic

How to support plain urls (without markdown syntax)?

When the text has [links like this](http://example.com) everything works fine.

But in the case the text is just a link: http://example.com then the link isn't detected, like it would be on GitHub for example. Is this easy to add?
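
One possible approach is Foundation's NSDataDetector, which finds bare links in a string. The sketch below is independent of TSMarkdownParser; the helper name TSExampleDetectLinks is hypothetical, and how the detected ranges would be turned into link attributes inside the parser is left open.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the NSURLs that NSDataDetector finds in text.
static NSArray *TSExampleDetectLinks(NSString *text) {
    NSError *error = nil;
    NSDataDetector *detector =
        [NSDataDetector dataDetectorWithTypes:NSTextCheckingTypeLink error:&error];
    if (detector == nil) {
        return @[];
    }
    NSMutableArray *urls = [NSMutableArray array];
    NSArray *matches = [detector matchesInString:text
                                         options:0
                                           range:NSMakeRange(0, text.length)];
    for (NSTextCheckingResult *match in matches) {
        // match.range is the span that would receive a link attribute.
        if (match.URL != nil) {
            [urls addObject:match.URL];
        }
    }
    return urls;
}
```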

Inject the bullet point character

Right now the bullet point character is set to a fixed "•\t". I would suggest to use a public variable here so other bullet points could be used.

tvOS broken due to unavailable font

Commit da3aea3 introduced a bug that prevents TSMarkdownParser from working on tvOS. The font "Georgia-Italic", or Georgia in general, is not available, so tvOS apps crash when instantiating a TSMarkdownParser.

Specifically, [UIFont fontWithName:size:] returns nil, but nil cannot be the value in an NSDictionary.

The standard parser doesn't support nested lists.

The following markdown doesn't render correctly.

- 1
  - 1.1
- 2

1
- 1.1
2

The regular expression needs updating, so I'll submit a pull request.

Naming question for new classes

For the next major version, many pre-built parsers will be available. Which names would be best, @laptobbe ?

TSStandardParser, TSGithubParser, TSStackOverflowParser, ...
TSMStandardParser, TSMGithubParser, TSMStackOverflowParser, ...
TSMarkdownStandardParser, TSMarkdownGithubParser, TSMarkdownStackOverflowParser, ...

URLs containing # are invalid

Because of #22, the following code:

[link](http://example.com/path#section)

Generates the following url:

http://example.com/path%23section

Which resolves to a different path than path, leading to an invalid URL.

Since strings are converted to NSURL under the hood, maybe it would be better to ensure that a valid URL string is passed (and potentially show a warning) instead of automatically altering the URL with unknown consequences.
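
To illustrate the suggestion, here is a minimal sketch that passes the link string to NSURL unchanged and only warns when no URL can be built. The helper name TSExampleURLFromLinkString is hypothetical; this is not how the library currently behaves.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: builds an NSURL from the raw link string without
// percent-escaping it first, and logs a warning if that fails.
static NSURL *TSExampleURLFromLinkString(NSString *link) {
    NSURL *url = [NSURL URLWithString:link];
    if (url == nil) {
        NSLog(@"Warning: '%@' is not a valid URL string", link);
    }
    return url;
}
```

With http://example.com/path#section the fragment stays a fragment (url.fragment is "section" and url.path is "/path"), instead of being folded into the path as %23section.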

Parsing issue with multiple links

I'm using your library and I came across some markdown that crashed the parser:

Markdown with a [link](http://google.com) and [phone number](tel:+441234567890).\n\nNew paragraph!

What was happening was that the first link was being matched, but the regex was too greedy and also matched all the text between the start of the first link and the end of the last link. When the URL was parsed it was obviously invalid, and hence url was nil. Once we attempted to add the attribute with the nil url value, it crashed the application.

The regex I came up with to fix this was:

static NSString *CNMarkdownLinkRegex = @"(?<!\\!)\\[.*?\\]\\(\\S*\\)";

The change is the addition of the ? inside the \\[.*?\\] part. This means it will match lazily, rather than greedily consuming as much as it can. I also changed the link part to use \S because URLs should not contain whitespace. Ideally we'd only match valid URL characters.
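
The difference between greedy and lazy matching can be checked with a small Foundation-only sketch. The lazy pattern is the one quoted above; the greedy variant is constructed here purely for comparison (it is the same pattern without the ?) and is not claimed to be the library's original regex. The helper name is hypothetical.

```objc
#import <Foundation/Foundation.h>

// Greedy variant built for comparison only; lazy variant as quoted above.
static NSString *const TSExampleGreedyLink = @"(?<!\\!)\\[.*\\]\\(\\S*\\)";
static NSString *const TSExampleLazyLink   = @"(?<!\\!)\\[.*?\\]\\(\\S*\\)";

// Hypothetical helper: counts how many times pattern matches in text.
static NSUInteger TSExampleCountLinkMatches(NSString *pattern, NSString *text) {
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:NULL];
    return [regex numberOfMatchesInString:text
                                  options:0
                                    range:NSMakeRange(0, text.length)];
}
```

For the string "Markdown with a [link](http://google.com) and [phone number](tel:+441234567890)." the greedy pattern yields a single match spanning both links, while the lazy pattern yields two separate matches.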

Em regex fails for 1 character words

Current Em regex is:

static NSString *const TSMarkdownEmRegex        = @"(?<=[^\\*_]|^)(\\*|_)[^\\*_]+[^\\*_\\n]+(\\*|_)(?=[^\\*_]|$)";

But it fails for "*a* nice *boy*":

UIFont *font = [UIFont italicSystemFontOfSize:12];
NSAttributedString *attributedString = [self.standardParser attributedStringFromMarkdown:@"*a* nice *boy*"];
XCTAssertNotEqualObjects([attributedString attribute:NSFontAttributeName atIndex:1 effectiveRange:NULL], font);

Why not use the regex "([\\*|_]{1}).+?\\1" instead?
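
Both patterns can be compared outside the parser with a short Foundation-only sketch. The two patterns are copied from this issue; the helper name TSExampleMatchedSubstrings is hypothetical.

```objc
#import <Foundation/Foundation.h>

// Patterns copied from the issue text above.
static NSString *const TSExampleCurrentEm =
    @"(?<=[^\\*_]|^)(\\*|_)[^\\*_]+[^\\*_\\n]+(\\*|_)(?=[^\\*_]|$)";
static NSString *const TSExampleProposedEm = @"([\\*|_]{1}).+?\\1";

// Hypothetical helper: returns every substring of text matched by pattern.
static NSArray *TSExampleMatchedSubstrings(NSString *pattern, NSString *text) {
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:NULL];
    NSMutableArray *found = [NSMutableArray array];
    NSArray *matches = [regex matchesInString:text
                                      options:0
                                        range:NSMakeRange(0, text.length)];
    for (NSTextCheckingResult *match in matches) {
        [found addObject:[text substringWithRange:match.range]];
    }
    return found;
}
```

On "*a* nice *boy*" the current pattern needs at least two characters between the markers, so it skips "*a*" and matches "* nice *" instead; the proposed pattern matches "*a*" and "*boy*". Note that the | inside the character class of the proposed pattern is a literal character, so it would also treat |text| as emphasis.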

Feature request: ignore spaces before a list bullet (was: does not parse bullets very well)

The parser seems to have issues with spacing.
For example, this does not work:
Draw a large face of a clock and place in the numbers \n\n + position the hands for 5 minutes after 11 o'clock \n\n + On your clock, label "L" for the long hand and "S" for the short hand

Removing the spaces before the + makes it work. P.S. This should be some text followed by 2 bullet points.

I also had issues when using a * for bullet points. I don't have an example handy, but the parser seems to confuse it with emphasis, which also uses a *.
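
A list pattern that tolerates leading whitespace and requires a space after the bullet would address both points. The following is a sketch under stated assumptions: the pattern and the helper name TSExampleCountListItems are hypothetical and are not the regex used by the library.

```objc
#import <Foundation/Foundation.h>

// Hypothetical pattern: optional leading spaces or tabs, one bullet
// character, at least one space or tab, then the item text.
static NSString *const TSExampleListPattern = @"^[ \\t]*([*+-])[ \\t]+(.+)$";

// Hypothetical helper: counts the lines of markdown that look like list items.
static NSUInteger TSExampleCountListItems(NSString *markdown) {
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:TSExampleListPattern
                                                  options:NSRegularExpressionAnchorsMatchLines
                                                    error:NULL];
    return [regex numberOfMatchesInString:markdown
                                  options:0
                                    range:NSMakeRange(0, markdown.length)];
}
```

Because a space is mandatory after the bullet, a line such as "*this* is a star" is not treated as a list item and is left for the emphasis rule.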

Links are always blue

It does not matter what color I supply to _linkAttributes, the URL stays blue, e.g.

_linkAttributes = @{ NSForegroundColorAttributeName: [UIColor greenColor],
NSUnderlineStyleAttributeName: @(NSUnderlineStyleSingle) };

➔ Blue

New Pod release with last commits

Hi,

we're using this library for an appleTV app and pointing directly to the latest commit of the repo to be using the latest improvements like this: 01be140
Are you planning to create a new Pod release?

Thanks

Crash reported parsing markdown

We have had a significant number of crashes reported from the field via Crashlytics. The crash is:
Fatal Exception: NSInvalidArgumentException
*** -[__NSPlaceholderDictionary initWithObjects:forKeys:count:]: attempt to insert nil object from objects[0]

and the call stack looks like:
Fatal Exception: NSInvalidArgumentException
0 CoreFoundation 0x7fffc5cb72b3 (Missing)
1 libobjc.A.dylib 0x7fffdaace48d (Missing)
2 CoreFoundation 0x7fffc5bb43f0 (Missing)
3 CoreFoundation 0x7fffc5bb425b (Missing)
4 TSMarkdownParser 0x10a1707ad (Missing)
5 TSMarkdownParser 0x10a170991 (Missing)
6 Self Service 0x109af1fde specialized SSHistoryContentCellView.(content.didset).(closure #1) (SSHistoryContentCellView.swift:37)

Here is the SSHistoryContentCellView code in question:

              let attrDescription = TSMarkdownParser.standard().attributedString(fromMarkdown: self.content.jssDescription ?? "", attributes: [NSAttributedStringKey.foregroundColor.rawValue : NSColor.descriptionTextColor()])
                self.descriptionLabel.isHidden = attrDescription.length == 0
                self.descriptionLabel.textStorage?.setAttributedString(attrDescription)
                self.descriptionHeightConstraint.constant = self.self.descriptionLabel.heightForString()

Looking at the pod code and our own code, the only usage of NSDictionary is for the attributes. But I am passing in a constant for the attributes dictionary, so that should never go null. I am at a loss of what might be causing this, and I am unable to reproduce it in-house.

Carthage Framework Not Working

I added TSMarkdownParser to my project using Carthage. The problem is that I'm getting "Could not build module TSMarkdownParser" when I try to include it. The issue it's giving is that TSMarkdownParser.h has TSBaseParser.h as an import but the header file is not publicly available.

The error it gives there is "TSBaseParser.h file not found"

It assumes that this isn't there because that file is not public. Might just be a matter of adding that file to the public section of the framework?

CodeEscaping is conflicting with Escaping

The standard parser has both CodeEscaping (with `) and Escaping (with \), and the current implementation makes them conflict with each other:

parsing @"\\a" works as expected
parsing @"\a" doesn't work as expected

In Simulator the markdown parser has a max character limit

Using swift

let markdownParser = TSMarkdownParser.standardParser()
markdownParser.attributedStringFromMarkdown(markdownString)

If the markdownString is too long, the simulator doesn't parse the markdown. This isn't critical, but I'm wondering if this is an architecture issue.

Sometimes app crashes in TSMarkdownParser init

Sometimes app crashes in TSMarkdownParser init:

Fatal Exception: NSInvalidArgumentException
0  CoreFoundation                 0x2584985b __exceptionPreprocess
1  libobjc.A.dylib                0x37422dff objc_exception_throw
2  CoreFoundation                 0x2576734b -[__NSPlaceholderDictionary initWithObjects:forKeys:count:]
3  CoreFoundation                 0x257671cf +[NSDictionary dictionaryWithObjects:forKeys:count:]
4  Service                        0x1dd60f -[TSMarkdownParser init] (TSMarkdownParser.m:35)
5  Service                        0x1dd921 +[TSMarkdownParser standardParser] (TSMarkdownParser.m:55)

From the code I can see that it is possible that it crashes in:

_quoteAttributes = @[@{NSFontAttributeName: [UIFont fontWithName:@"HelveticaNeue-Italic" size:defaultSize]}];

    _monospaceAttributes = @{ NSFontAttributeName: [UIFont fontWithName:@"Courier New" size:defaultSize],
                              NSForegroundColorAttributeName: [UIColor colorWithRed:0.95 green:0.54 blue:0.55 alpha:1] };

On Stack Overflow there is a discussion about HelveticaNeue-Italic:
http://stackoverflow.com/questions/19527962/what-happened-to-helveticaneue-italic-on-ios-7-0-3

I can see these crashes on iOS 9.1.0 and even 9.3.1, but cannot replicate them on my iPads or in the simulator.
Maybe it would be safer to add something like:

    UIFont *quoteFont = [UIFont fontWithName:@"HelveticaNeue-Italic" size:defaultSize];
    if (quoteFont == nil) {
        quoteFont = [UIFont italicSystemFontOfSize:defaultSize];
    }
    if (quoteFont == nil) {
        quoteFont = [UIFont systemFontOfSize:defaultSize];
    }
    _quoteAttributes = @[@{NSFontAttributeName: quoteFont}];

Incorrect handling of URL with parentheses in the title text

I found that when Markdown contains something like this:
[Something something (something)](http://someurl.com)

The processing code finds the first open paren rather than the last one. A simple fix for this I made in my copy was to change line 195 of TSMarkdownParser.m to use NSBackwardsSearch when doing the search for the open parenthesis character.
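
The effect of NSBackwardsSearch can be shown on a plain string. This is only a sketch of the idea, not the code at line 195 of TSMarkdownParser.m; the helper name TSExampleLinkURLString is hypothetical.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: extracts the URL part of a "[text](url)" string by
// searching for the opening parenthesis from the end of the string.
static NSString *TSExampleLinkURLString(NSString *link) {
    NSRange open = [link rangeOfString:@"(" options:NSBackwardsSearch];
    if (open.location == NSNotFound || ![link hasSuffix:@")"]) {
        return nil;
    }
    NSUInteger start = NSMaxRange(open);
    // Drop the trailing ")" as well.
    return [link substringWithRange:NSMakeRange(start, link.length - start - 1)];
}
```

A forward search would stop at the parenthesis inside the link text. The backward search has the opposite limitation: a URL that itself contains "(" would be cut short, so searching for "](" may be a more robust variant.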

_ in image name does not work

the above image filename does not work, the _ is recognized as a metacharacter and stripped out.

Feature request: Github release with travis for Carthage

Adding some configuration to the .travis.yml like this https://github.com/Carthage/Carthage#use-travis-ci-to-upload-your-tagged-prebuild-frameworks would make this even better.

SPM Release?

We are moving away from CocoaPods to SPM, so we need a Package.swift in here.
Can you guys create a release with the Package.swift file in the tag?

Link parsing conflicts with autodetection when the link text happens to be recognized as something to link as well

Consider the following markdown text:

[[email protected]](mailto://[email protected])

In the above case, it appears that auto-detection overrides the original link (which does correctly identify the URL as mailto://[email protected] with the text of the link ie ([email protected]).

[Mr. Foo](mailto://[email protected])

therefore works.

Not sure how this should be fixed, but one thought is that if a link text already has a link associated with it, then don't override with autodetect link... There may be more complicated use cases like:

[send a mail to [email protected]](mailto://[email protected])

ie the link text range and the autodetect link text range may be different. I am going to attempt to fix this, but would like to hear feedback on the above suggestion or other ideas on a solution.

Reference-style links

The parser currently does not support reference-style links. Is there any plan to support that syntax?

Feature request: Mandatory space after a Header/List markup

Let's assume we want to be able to generate both examples below in Markdown:

Example 1
this is a star '*'

Example 2

this is a star '*'

The initial asterisk will need to be interpreted differently. This is solved by http://spec.commonmark.org/ by requiring the use of a space if you want a list item:

Example 1 in Markdown
*this* is a star '*'

Example 2 in Markdown
* this is a star '*'

The same goes for an initial + or - sign, to allow the use of explicit math signs when you don't put a space:
+1
-1

In Markdown:
+ a list
+not a list

And the same goes again for headers, to allow the use of an explicit sharp sign (examples from CommonMark) when you don't put a space:
#0

foobar

In Markdown:
# a header
#not a header

As a general rule, I think we should require a space after any block markup. Note: it would break current implementation.

Parsing is extremely fragile?

I have a little snippet for you to test and I am curious to know if you have any insights about why the parsing breaks down so easily before I go in and try to fix it up.

This markdown below should render a simple while loop

        ``\nwhile (YES) {\n    printf(\"Help, I am stuck in an infinite loop!\\n\");\n}\n``

But it does not render using your parser. Interestingly enough though this does:

       `` while (YES) { \n printf(\"Help, I am stuck in an infinite loop! \\n\"); } ``

I found that the initial newline and the last two newline chars are breaking it.
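
One possible explanation, stated here as an assumption since the library's code-escaping regex is not quoted in this issue, is that . in NSRegularExpression does not match line separators unless NSRegularExpressionDotMatchesLineSeparators is set. The sketch below uses a hypothetical pattern and helper name to show the effect.

```objc
#import <Foundation/Foundation.h>

// Hypothetical pattern for a double-backtick code span; not the library's regex.
static NSString *const TSExampleCodePattern = @"``(.+?)``";

// Hypothetical helper: counts code spans found with the given regex options.
static NSUInteger TSExampleCountCodeSpans(NSString *text,
                                          NSRegularExpressionOptions options) {
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:TSExampleCodePattern
                                                  options:options
                                                    error:NULL];
    return [regex numberOfMatchesInString:text
                                  options:0
                                    range:NSMakeRange(0, text.length)];
}
```

With the default options, a span whose content contains newlines is not matched; with NSRegularExpressionDotMatchesLineSeparators it is.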

Any ideas?

Parsing issue with links in the end of the string

The parser crashes when there's a link in the end of the string, like in @"Hello\n Men att [Pär](http://www.google.com/)".

I just wrote a test case and I'm working on a fix.

Feature request: compatibility with NSAttributedStringKey

Swift 4 uses NSAttributedStringKey instead of String for attribute names, which causes some compatibility issues with TSMarkdownParser.

Multiple new line characters are ignored

This markdown:

# Header\nParagraph

will generate correct output:

Header

Paragraph

However, when we use \n\n instead of \n:

# Header\n\nParagraph

the output will be incorrect:

Header Paragraph

I tried looking at source code and propose some fix or workaround, but I am not sure where TSMarkdownParser is generating NSAttributedString paragraphs from markdown.

Issue: Crash in MacOS 10.10

The following code works on 10.11 - 10.13 but crashes in 10.10. Any thoughts?

let attrDescription = TSMarkdownParser.standard().attributedString(fromMarkdown: content.jssDescription ?? "", attributes: [NSForegroundColorAttributeName : NSColor.descriptionTextColor()])

descriptionLabel.textStorage?.setAttributedString(attrDescription)

I'm attaching a crash report
com.jamfsoftware.selfservice.mac_issue_31_crash_403910d2cd4246e4b6b74e53054674a5_e9557cad5d0611e7811256847afe9799_0_v2.txt

Incorrect handling of line breaks

In Markdown, paragraphs are separated by two line breaks. Two spaces at the end of a line produce a newline (or rather a <br /> tag in HTML).
For example (I've replaced spaces with • for clarity):

First•paragraph,•first•line.
First•paragraph,•still•first•line.

Second•paragraph,•first•line.••
Second•paragraph,•second•line.

The expected output is:

<p>First•paragraph,•first•line.
First•paragraph,•still•first•line.</p>

<p>Second•paragraph,•first•line.<br />
Second•paragraph,•second•line.</p>

The output produced by TSMarkdownParser is something more like:

<p>First•paragraph,•first•line.</p>
<p>First•paragraph,•still•first•line.</p>
<p></p>
<p>Second•paragraph,•first•line.</p>
<p>Second•paragraph,•second•line.</p>

Swift Compatibility: NSString * to NSAttributedStringKey *

NSAttributedStringKey differs between Swift and Objective-C: in Objective-C it is a typedef of NSString *, while in Swift it is a struct that is not derived from String.

Explicitly declaring the required key type would allow the property to be used from Swift without issues, for example by changing the first declaration below into the second:

@property (nonatomic, strong, nullable) NSDictionary<NSString *, id> *defaultAttributes;
@property (nonatomic, strong, nullable) NSDictionary<NSAttributedStringKey, id> *defaultAttributes;

Release 1.0.10 with Paragraph Formatting Support

Would be wonderful if you have the time.

Thanks!

groomsy

Get image from a custom bundle

Feature request: Get image from a custom bundle.

[UIImage imageNamed:@"markdown" inBundle:[NSBundle bundleForClass:[self class]] compatibleWithTraitCollection:nil];

Bold at the beginning of the line is converted to list symbol

If a string starts with an emphasized or strong text, it's converted to list symbol. E.g. "bold text" ends up as "• bold* text"

It looks like moving addListParsingWithFormattingBlock to the very end of standardParser method fixes this. I'm not very familiar with markdown, so I'm not sure if it doesn't introduce some new issues.

Escaped periods are not parsed

If you have a markdown string with an escaped period i.e. \. it should parse it to a period:

This
.

Not
.

There may be other issues with other escaped characters.

Feature request: support for an escaping sequence

Thank you for bringing customised Markdown to iOS.

Can you also add support to partially disable Markdown interpretation somehow? Suggested way of doing it would be by using an escape sequence, like '\x' (C-like), '&x;' (xml-like) or '%x' (url-like). I personally recommend the first option, using a backslash, as we do not want to conflict with HTML support.

I will quote the example from http://genius.com/3057216 : "Type M*A*S*H to get MAS*H".

To make the regexp simple, suggested behaviour would just be:

backslash + any character = character without markdown interpretation
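
The suggested rule is simple to express as a regular expression. The sketch below only removes the backslashes from a plain string; how a real parser would also protect the escaped characters from later rules is not addressed here. The helper name TSExampleUnescape is hypothetical.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: replaces "backslash + any character" with the
// character itself.
static NSString *TSExampleUnescape(NSString *markdown) {
    NSRegularExpression *escape =
        [NSRegularExpression regularExpressionWithPattern:@"\\\\(.)"
                                                  options:0
                                                    error:NULL];
    return [escape stringByReplacingMatchesInString:markdown
                                            options:0
                                              range:NSMakeRange(0, markdown.length)
                                       withTemplate:@"$1"];
}
```

Applied to M\*A\*S\*H this gives M*A*S*H, and an escaped backslash (two backslashes) collapses to a single one.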

Add a changelog

I couldn't find a way to know what's new in 1.0.20 without looking at the diff (which is often verbose and not easy to understand for people who don't know the codebase).

Would you mind adding a changelog or add tag descriptions for each release?

Feature request: attributedString to markdown

Complex feature, but it would be nice to be able to convert an attributedString back to Markdown.

Apply NSParagraphStyleAttributeName to list items via listAttributes

I'm trying to add consistent indentation to ordered and unordered list items coming down in a Markdown string. If I change the defaultAttributes property on the TSMarkdownParser instance, adding an NSParagraphStyleAttributeName, the indentation settings work but are applied to all of the Markdown text. What I'm hoping to do is only apply the NSParagraphStyleAttributeName to the text that's associated with a Markdown list.

Pared down code here:

NSMutableParagraphStyle *listStyle;
listStyle = [[NSParagraphStyle defaultParagraphStyle] mutableCopy];
[listStyle setDefaultTabInterval:18];
[listStyle setFirstLineHeadIndent:0];
[listStyle setHeadIndent:18];
[listStyle setLineSpacing:7.5];
[listStyle setTabStops:@[ [[NSTextTab alloc] initWithTextAlignment:NSTextAlignmentLeft location:18 options:nil] ]];
NSString *markdown = @"## Title\n\n- Item 1\n- Item 2\n- Item 3\n\n1. Item 1\n2. Item 2\n3. Item 3\n\nLorem ipsum dolor sit amet, consectetur adipiscing elit. Sed interdum tortor odio, vitae porttitor lacus dapibus non. Etiam sodales euismod sapien a porta. Vivamus lectus ipsum, ultrices vel efficitur eget, sollicitudin in libero. Duis posuere felis velit, sit amet sodales nulla pharetra gravida. Aenean non justo venenatis, auctor nibh ut, hendrerit tortor.";
TSMarkdownParser *parser = [TSMarkdownParser standardParser];

parser.defaultAttributes = @{
    // This works, but affects all of the text, including text that isn't part of a list.
    // NSParagraphStyleAttributeName: listStyle
};
parser.listAttributes = @[
    // This does not affect the styling of the list items. Perhaps it's getting overridden?
    @{ NSParagraphStyleAttributeName: listStyle }
];

NSAttributedString *string = [parser attributedStringFromMarkdown:markdown];

Hopefully this explanation makes sense. I'm an Objective-C newb.

Feature request: add remote URL to images

Here is the code:
if (image == nil) {
    // Fall back to loading the image from a remote URL.
    NSURL *url = [NSURL URLWithString:imagePath];
    NSData *imageData = [NSData dataWithContentsOfURL:url];
    image = [UIImage imageWithData:imageData];
}

Add the above code in addImageParsingWithImageFormattingBlock after
UIImage *image = [UIImage imageNamed:imagePath];

Basically this checks for the remote url path if the existing path is not a local path.