Git Product home page Git Product logo

Comments (27)

GoogleCodeExporter avatar GoogleCodeExporter commented on June 27, 2024
I wanted this too and created a patch for it. The problems I saw were:

1. Constants used for options => No way to use this as a library without 
compiling
from source.

2. Needed to be able to set options from configuration file.

So what this patch does is the following:

1. New class, MarkdownOptions, which contains all the options that were 
previously
constants. The class is immutable, the options can be set explicitly at 
construction
time, or it can use the defaults that were used before, or use the defaults +
overrides from configuration files. The config options are placed in 
appSettings and
take the form Markdown.PropertyName, e.g. Markdown.AutoHyperlink. (Maybe later 
this
should use its own config section but that seemed overkill for now).

Having the options per instance creates some other changes as well. The regular
expressions that use the options are now per instance, not static anymore. The 
ones
that don't use any options are still static. 

2. Realistically for most cases you only need a single instance, and probably 
most of
the time you'd want defaults + config overrides. So there is a static property,
Default, on the Markdown class so you can just go Markdown.Default.Transform(s) 
if
that's all you want.

3. The Markdown class is immutable so if you want to change options later you 
must
create a new instance. This avoids having to keep track of option changes to
recompile regular expressions. Immutable == good :)

And that's about it. Added bin/obj to the ignore list as well.

I'll probably look at the failing unit tests later this week if I have a chance.

Original comment by [email protected] on 4 Jan 2010 at 11:26

Attachments:

from markdownsharp.

GoogleCodeExporter avatar GoogleCodeExporter commented on June 27, 2024
Whoops, accidentally left some debugging code in there. Here's a clean version 
of the
patch. Sorry about that.

Original comment by [email protected] on 4 Jan 2010 at 12:19

Attachments:

from markdownsharp.

GoogleCodeExporter avatar GoogleCodeExporter commented on June 27, 2024
I appreciate the contribution, but after applying the patch locally I am seeing 
a 15%
increase in benchmark times.

I assume this is because the affected regular expressions are not static?

I had always intended this to be used as a library, compiled from source code 
.. what
specific scenario did you have where you cannot do that?

Original comment by [email protected] on 5 Jan 2010 at 1:14

from markdownsharp.

GoogleCodeExporter avatar GoogleCodeExporter commented on June 27, 2024
[deleted comment]

from markdownsharp.

GoogleCodeExporter avatar GoogleCodeExporter commented on June 27, 2024
ouch -- on top of the 15% overall performance decline, the single call 
performance
got 10 times worse:

before patch:

input string length: 11075
1 iteration in 16 ms
input string length: 88607
1 iteration in 196 ms
input string length: 354431
1 iteration in 1635 ms


after patch:

1 iteration in 174 ms
input string length: 88607
1 iteration in 345 ms
input string length: 354431
1 iteration in 2000 ms

Original comment by [email protected] on 5 Jan 2010 at 1:27

from markdownsharp.

GoogleCodeExporter avatar GoogleCodeExporter commented on June 27, 2024
Performance aside, I think this is valuable because it would allow people to 
use the 
library without needing to compile it.  We could conceivably distribute a .dl 
users link 
into their project instead.  This is a pretty standard model for distributing 
library 
code.

Second, and this is what actually triggered me to log this bug, is that we 
could 
actually test the functionality of the options flags.  Right now it's 
impossible to test 
whether they work without recompiling the code.  I would hate to have to 
recompile 
the code a bunch of times as part of an automated test suite.

All that said, a 15% performance degration is pretty serious.  But I think some 
of that 
may be an artifact of how the tests are executed (do the tests create many 
instances 
of the markdown class? do we expect users to?).

Original comment by [email protected] on 5 Jan 2010 at 3:10

from markdownsharp.

GoogleCodeExporter avatar GoogleCodeExporter commented on June 27, 2024
I am not sure a dinky little markdown.cs justifies an entire DLL.. does it?

I understand the desire for configuration, but a 15% (and up to 10x, depending 
on how
you call it) performance penalty seems like an extremely steep cost for
configurability not everyone will need.

Original comment by [email protected] on 5 Jan 2010 at 3:31

from markdownsharp.

GoogleCodeExporter avatar GoogleCodeExporter commented on June 27, 2024
Yes, I agree.

We should not do this unless we find a way to do it without sacrificing 
performance.

The .dll bit I'm not overly concerned about, but in order to write unit tests 
for the 
options functionality, we need this.

Original comment by [email protected] on 5 Jan 2010 at 3:38

from markdownsharp.

GoogleCodeExporter avatar GoogleCodeExporter commented on June 27, 2024
The performance issue is weird, since the regular expressions themselves didn't
change at all, they only went from being static to instance variables. And they 
are
compiled in the constructor which is executed outside the benchmark itself. 

All in all I'm seeing a slightly worse performance on most tests (0.0x ms per
iteration) except for the really weard single call case. I'll look at that and 
see if
I can see what the issue is.

As for whether this makes sense or not, well, personally I like to pull in
dependencies for my projects as .dll's, not source code. But yes, maybe it's 
overkill
for a single Markdown.cs.

Original comment by [email protected] on 5 Jan 2010 at 8:08

from markdownsharp.

GoogleCodeExporter avatar GoogleCodeExporter commented on June 27, 2024
Hmmm, after running some tests it seems that the single call performance is only
significantly better in the static case when the other benchmarks have been run
before. If I comment out the first three benchmarks (that run thousands of 
times) and
only run the last three (single iteration ones) I get these results:

WITH OPTIONS:
input string length: 11075
1 iteration in 134 ms

ORIGINAL:
input string length: 11075
1 iteration in 132 ms
DIFFERENCE: 1,52%
-----------------------------------------------

WITH OPTIONS:
input string length: 88607
1 iteration in 445 ms

ORIGINAL:
input string length: 88607
1 iteration in 399 ms
DIFFERENCE: 11,53%
-----------------------------------------------

WITH OPTIONS:
input string length: 354431
1 iteration in 4134 ms

ORIGINAL:
input string length: 354431
1 iteration in 4028 ms
DIFFERENCE: 2,63%
-----------------------------------------------

As far as I can tell this has something to do with how the Regex implementation
caches commonly used patterns, perhaps even compiling them behind the scenes? 
Most of
the regexes used are compiled but not all. 

Anyway, I can accept that the change is not worth it from a performance 
standpoint.
But I would still argue that if the class only has static members then the class
should be static as well. Creating many instances of a class that has no 
instance
data seems useless. And perhaps config options could be read in the static
initialization?



Original comment by [email protected] on 5 Jan 2010 at 12:06

from markdownsharp.

GoogleCodeExporter avatar GoogleCodeExporter commented on June 27, 2024
[deleted comment]

from markdownsharp.

GoogleCodeExporter avatar GoogleCodeExporter commented on June 27, 2024
let's try this again.. I made the entire class static, can we tackle the
configuration once more? new patch?

Original comment by [email protected] on 6 Jan 2010 at 1:21

from markdownsharp.

GoogleCodeExporter avatar GoogleCodeExporter commented on June 27, 2024
ok, I reverted the all-static change -- as you pointed out, this isn't 
thread-safe

Original comment by [email protected] on 6 Jan 2010 at 11:19

from markdownsharp.

GoogleCodeExporter avatar GoogleCodeExporter commented on June 27, 2024
Well, here is a new patch for this. The difference is that the configuration is
static, that is, you configure the options with a static method, the regular
expressions are all static so essentially every instance you create of the 
Markdown
class will use the same options. It still gives you the ability to read the 
options
from a config file, and reconfigure it later through code 
(Markdown.Configure(new
MarkdownOptions(...)), which will just recompile all the static regular 
expressions.

As for performance, it seems comparable with the older code. I ran the 
benchmarks a
few times, sometimes the newer is a bit slower, sometimes a bit faster. It's a 
couple
of milliseconds here and there.

Original comment by [email protected] on 7 Jan 2010 at 8:08

Attachments:

from markdownsharp.

GoogleCodeExporter avatar GoogleCodeExporter commented on June 27, 2024
[deleted comment]

from markdownsharp.

GoogleCodeExporter avatar GoogleCodeExporter commented on June 27, 2024
[deleted comment]

from markdownsharp.

GoogleCodeExporter avatar GoogleCodeExporter commented on June 27, 2024
speed wise it looks OK (excellent!), but I'm not crazy about the way the 
regexes are
moved so far away from the code they are relevant to. 

It kind of utterly destroys the physical ordering and logical locality of the 
code..
and makes it a PITA to work on. Because when you're troubleshooting, say,
DoAutoLinks() .. and you want to check _autolinkBare, it's wayyyyyy up at the 
top of
the file. In a big jumble of code that has no relationship to each other.

It kind of stops being enjoyable to work on, for me, at that point.

Original comment by [email protected] on 7 Jan 2010 at 8:44

from markdownsharp.

GoogleCodeExporter avatar GoogleCodeExporter commented on June 27, 2024
[deleted comment]

from markdownsharp.

GoogleCodeExporter avatar GoogleCodeExporter commented on June 27, 2024
[deleted comment]

from markdownsharp.

GoogleCodeExporter avatar GoogleCodeExporter commented on June 27, 2024
[deleted comment]

from markdownsharp.

GoogleCodeExporter avatar GoogleCodeExporter commented on June 27, 2024
ok, I made changes in r88 that should move us closer -- we just need a way to 
rebuild
all the regular expressions when those properties change .. perhaps reflection 
or
something?

we think only .TabWidth will be problematic. (I made .NestingDepth private, as 
to me
that is 100% an implementation detail and should never REALLY need to be 
changed.. we
might even be able to use the proprietary .NET regex stack balancing extensions 
to
remove it entirely, eventually..)

The rest of the options, with the sole exception of .TabWidth, should work, as 
they
don't rely on being compiled into the regular expressions .. take a look and 
see what
you think?

Original comment by [email protected] on 7 Jan 2010 at 10:18

from markdownsharp.

GoogleCodeExporter avatar GoogleCodeExporter commented on June 27, 2024
Well I would say the easiest way to get around that would be to just have a 
single
CompileXXX method for each of the regexes that need the options. Just have the 
method
be where the static initialization used to be, and then have one main
CompileRegularExpressions that calls all the others. Then you can just as 
easily look
at the regex when you need to.

Original comment by [email protected] on 7 Jan 2010 at 11:07

from markdownsharp.

GoogleCodeExporter avatar GoogleCodeExporter commented on June 27, 2024
Ok, I just totally missed your point that everything except _tabwidth and 
_nestDepth
doesn't rely on the regexes being compiled. Then I'd just say leave those two as
constants and have the rest as properties, that's fine. _tabWidth is arguably 
more
appropriate as a constant anyway.

Original comment by [email protected] on 7 Jan 2010 at 11:28

from markdownsharp.

GoogleCodeExporter avatar GoogleCodeExporter commented on June 27, 2024
yes, I'd say go ahead and adapt your cool load-config-from-file for this 
version -- I
think all the config properties should work with the exception of TabWidth

Original comment by [email protected] on 7 Jan 2010 at 12:49

from markdownsharp.

GoogleCodeExporter avatar GoogleCodeExporter commented on June 27, 2024
Ok, I've created a new patch. Actually, since removing nestDepth and tabWidth 
from
the equation the whole thing just became a lot simpler. All the other config 
things
were only used in instance methods, so I could just make them normal properties 
not
static ones. So, you can create a new object, change its properties and do what 
you
want without affecting anything else.

Also added a constructor that takes in a bool loadOptionsFromConfigFile, the 
default
is still not to do it so no changes have to be made to existing tests or code.

And finally added unit tests to make sure the configuration is read from 
appsettings
and that changing the properties on an instance has the desired effect. 

No performance problems compared to a clean copy from SVN.

Original comment by [email protected] on 8 Jan 2010 at 8:46

Attachments:

from markdownsharp.

GoogleCodeExporter avatar GoogleCodeExporter commented on June 27, 2024
[deleted comment]

from markdownsharp.

GoogleCodeExporter avatar GoogleCodeExporter commented on June 27, 2024
ok checked in as r95

Original comment by [email protected] on 11 Jan 2010 at 2:10

  • Changed state: Fixed

from markdownsharp.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.