Comments (6)

andreineculau commented on August 9, 2024

Generating XML should be fine and rather easy. Generation is a low-level hack atm (look in the dev branch). I guess you can go either md -> xml or json -> xml, but I'd prefer the former since md is the source.

PS: I went with Markdown as the primary source: it will be humans making additions, and I don't plan on adding more information than the "title, description, link" rule of thumb, so the Markdown will be rather easy to parse with brute-force regexps.
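
As an illustration of the md -> xml route, here is a minimal sketch. It assumes a hypothetical row layout of `| [title](link) | description |`; the element names (`entries`, `entry`) and the sample row are made up for the example and are not taken from the repository.

```python
import re
import xml.etree.ElementTree as ET

# Assumed (hypothetical) row layout: | [title](link) | description |
ROW = re.compile(
    r"^\|\s*\[(?P<title>[^\]]+)\]\((?P<link>[^)]+)\)\s*\|\s*(?P<description>.*?)\s*\|\s*$"
)


def markdown_to_xml(markdown: str) -> str:
    """Turn matching table rows into <entry> elements and return the XML text."""
    root = ET.Element("entries")
    for line in markdown.splitlines():
        match = ROW.match(line.strip())
        if match is None:
            continue  # header, separator and prose lines do not match and are skipped
        entry = ET.SubElement(root, "entry")
        for field in ("title", "description", "link"):
            ET.SubElement(entry, field).text = match.group(field)
    return ET.tostring(root, encoding="unicode")


# Illustrative input only; not a row from the real Markdown files.
SAMPLE = """\
| title | description |
|-------|-------------|
| [Example-Header](https://example.org/spec#header) | An illustrative entry |
"""

if __name__ == "__main__":
    print(markdown_to_xml(SAMPLE))
```

The regexp is deliberately strict: any row that deviates from the assumed layout is silently dropped, which is the kind of brittleness discussed further down the thread.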

from know-your-http-well.

andreineculau commented on August 9, 2024

I refactored the master and dev branches a bit to better reflect the intention.

dret commented on August 9, 2024

"let people do what they want and some regexes will parse that into robust structures" is among the more famous last words before something went down in flames. of course entirely your decision, but i think i'd rather stay away from writing regexes that probably break every now and then.
over at https://github.com/dret/HTML5-overview i have decided to go the opposite route and start from XML and drive MD from that (still need to work on that... :-), but of course that's also because i am an XML guy and have no issues with editing XML, which is something that maybe many people just don't want to do.
anyway, great initiative, and good luck!

andreineculau commented on August 9, 2024

Shame on me for expecting a boring reference to "Now you have two problems" :)

FWIW I have obviously started with the same reasoning locally (YML actually, not JSON; no visible commits) but I quickly switched to this "primitive" alternative. Just to lay down some thoughts leading to this outcome:

  1. a project switching from structured data to MD&regexes -> never (say never). It just feels stupid. But if I ever sense that the current setup is creating grief, be sure I will switch to structured data. Not sure if I will go through the trouble of trying a Markdown2AST parser first.
  2. I wanted to make use of github's MD rendering
  3. I wanted the (github's rendered) MD to always be the-latest-version because that's what people will read
  4. it's ok for the structured data to be out-of-date because it is for machines and they will be targeting a tag/hash, so they'll be out-of-date anyways.

Nice project you have as well, and
repeat after me: This data is "beautifully, unapologetically XML" :)

dret commented on August 9, 2024

http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 would be the most appropriate reference here. i'll be happily managing my XML over at https://github.com/dret/HTML5-overview and contribute to HTTP in cthulhu markup ;-)
if you're still mildly interested: for HTML5, the XML is the master, but refreshing really is nothing more than running the xml2ms.xslt XSLT, which takes around 0.1 sec on my machine. done, all MDs refreshed, and no brittle regex magic required for anything. and i think it's more the other way around: if you provide an easily consumable starting point, you might find others (such as myself) using it to do interesting things. if you don't, these things are simply less likely to happen. so waiting for them to happen and then making the switch is kind of backwards.
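
To make the "structured master, generated Markdown" direction concrete, here is a rough Python stand-in for that refresh step. It is not the XSLT mentioned above; the XML vocabulary (`entries`, `entry`, `title`, `description`, `link`) and the sample data are assumptions made up for the sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical master document; element names are assumptions for this sketch.
MASTER_XML = """\
<entries>
  <entry>
    <title>Example-Header</title>
    <description>An illustrative entry</description>
    <link>https://example.org/spec#header</link>
  </entry>
</entries>
"""


def xml_to_markdown(xml_text: str) -> str:
    """Render every <entry> as one row of a Markdown table."""
    root = ET.fromstring(xml_text)
    lines = ["| title | description |", "|-------|-------------|"]
    for entry in root.iter("entry"):
        title = entry.findtext("title", default="")
        description = entry.findtext("description", default="")
        link = entry.findtext("link", default="")
        lines.append(f"| [{title}]({link}) | {description} |")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(xml_to_markdown(MASTER_XML), end="")
```

Because the Markdown is regenerated from the master on every run, nothing ever has to parse hand-edited Markdown back into structure.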

andreineculau commented on August 9, 2024

FWIW

> the most appropriate reference here

I'm not using regex to parse HTML. I'm using regex to parse some very simple MD (specifically rows only, meaning column=pipe delimited tokens).

> if you provide an easily consumable starting point

But I do - it's not MD, it's JSON atm. That's what is intended as a published package. I don't expect anyone else to consume MD.
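
A minimal sketch of the "rows only, pipe-delimited tokens" idea, ending in JSON. The column names and sample rows are illustrative assumptions, and this is not the project's actual generator script.

```python
import json
import re

# Matches a table separator row such as |------|:-----:|
SEPARATOR = re.compile(r"^\|?\s*:?-+:?\s*(\|\s*:?-+:?\s*)*\|?$")


def parse_rows(markdown: str) -> list:
    """Return one dict per table row, keyed by the header row's cell names."""
    header = None
    records = []
    for line in markdown.splitlines():
        line = line.strip()
        if not line.startswith("|") or SEPARATOR.match(line):
            continue  # ignore prose lines and the |---|---| separator
        inner = line[1:-1] if line.endswith("|") else line[1:]
        # Split on pipes that are not escaped with a backslash.
        cells = [cell.strip() for cell in re.split(r"(?<!\\)\|", inner)]
        if header is None:
            header = cells  # the first table row names the columns
            continue
        records.append(dict(zip(header, cells)))
    return records


# Illustrative input only; the real files may use different columns.
SAMPLE = """\
| code | description |
|------|-------------|
| 200 | OK |
| 404 | Not Found |
"""

if __name__ == "__main__":
    print(json.dumps(parse_rows(SAMPLE), indent=2))
```

This only works while every row sticks to the simple pipe-delimited shape; inline code containing `|` or multi-line cells would need a real Markdown parser.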
