Comments (7)

ljwolf commented on July 19, 2024

huh, wild. Neat package.

IIRC, @jreades had suggested:

@include{resource = https://imamarkdownnotebook.com/txt.md, 
start = # Inserting data into lists,
stop = #
**options
}
@include{resource = ../me_too.ipynb, 
start = ## Including raw notebooks by converting them to markdown first
stop = ###
}

where
  • resource supported a URL or a local file path.
  • start was a specific target tag in the markdown.
  • stop was a more general stop criterion: either a specific section number or a relative/computed criterion, like the next header at equal level, the next h1 tag, etc.

Thinking about this, I like that syntax a lot. The semantics of stop and start, like how valid options are encoded, is the tricky part of it.

What seems simplest is case- and leading-space-insensitive string matching for start. This would be the default, and would match on everything after the equals sign up to the newline. This'd make it easy to do the generic "Grab this section" action, and avoids repetitive typing of quotes.

Harder targets, such as a substring/subsection match, a multiline target, or a raw regexp, could be handled by a special starting delimiter, like =sub, =multi, =re, maybe?

If you use the same semantics for stop, then, you get automatic "to next X" behavior. Like, in the statements above, the first include would go from the level-1 header "inserting data into lists" to the next level-1 header. The second include would go from the level-2 header "including raw notebooks by converting them to markdown first", to the next level-3 header.
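
One possible reading of these start/stop semantics, as a minimal Python sketch. It assumes the resource has already been fetched and split into lines, and that a stop made up only of # characters means "the next header of exactly that level". The names extract and normalise are made up for illustration; nothing here is existing geopyter code.

```python
import re

def normalise(line):
    """Case- and leading-whitespace-insensitive form of a line."""
    return line.strip().casefold()

def extract(lines, start, stop):
    """Return the lines from the start header up to (not including) the stop.

    Assumption: a stop made only of '#' characters means "the next header
    of exactly that level"; any other stop is matched the same way as start.
    """
    start_key = normalise(start)
    stop_key = normalise(stop)
    stop_is_level = re.fullmatch(r"#+", stop_key) is not None
    out, inside = [], False
    for line in lines:
        key = normalise(line)
        if not inside:
            if key == start_key:
                inside = True
                out.append(line)
            continue
        if stop_is_level:
            marker = re.match(r"#+", key)
            if marker and marker.group() == stop_key:
                break
        elif key == stop_key:
            break
        out.append(line)
    return out

doc = """# Intro
text
# Inserting data into lists
body
## Sub
more
# Next
tail""".splitlines()

section = extract(doc, "  # inserting data into LISTS", "#")
# section == ["# Inserting data into lists", "body", "## Sub", "more"]
```

The sketch treats any line beginning with # as a header, so a comment line inside a code cell would also end the section; a real parser would need to track code fences.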

The stuff in options might make more sense if compared to supported preprocessing options in notedown or something like knitr/rmarkdown.

from geopyter.

jreades commented on July 19, 2024

Having given it some more thought, I am definitely leaning towards including (probably a better term than importing!) on the basis of structure, not start/stop… A little bit like CSS selectors:

@include {
'resource' = '…',
'select' = 'h1.Lists'
}

or

@include {
'resource' = '…',
'select' = "h1.Lists h3.Let's Code"
}

So in the first case you would just get everything sitting under the ‘#Lists’ header, whereas in the second you would get whatever comes under the “###Let’s Code” header that is itself under the ‘#Lists’ section. That allows you to disambiguate subsections with the same name (e.g. #Dictionaries … ###Let’s Code) and also means that you don’t need to think about start/stop semantics, just “Grab everything at this level or ‘below’ structurally”. And if stuff gets moved around inside the main files your imports don’t fail either!

In the long run this could be extended with a suppression syntax like:
'deselect' = "h3.Let's Code"

Such that the ###Let’s Code would be dropped (or ‘suppress’ed) from the #Lists import.

And, still sticking with the CSS ‘metaphor’:

@include {
'resource' = '…',
'select' = "h1.Lists h3.Let's Code, h1.Dictionaries h3.Let's Code"
}

That would select multiple subsections from the resource.
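
To make the selector idea concrete, here is a rough Python sketch of how such a select string might be broken down: commas separate statements, and each statement is a chain of h<level>.<title> steps. The function name parse_select and the tuple representation are assumptions made for illustration, and the sketch does not handle titles that themselves contain a comma.

```python
import re

def parse_select(select):
    """Split a select string into statements, each a chain of (level, title).

    Commas separate statements; inside a statement every step starts with
    'h<digit>.' and its title runs until the next step or the end.
    """
    statements = []
    for statement in select.split(","):
        steps = re.findall(r"h([1-6])\.(.*?)(?=\s+h[1-6]\.|$)", statement.strip())
        if steps:
            statements.append([(int(level), title.strip()) for level, title in steps])
    return statements

chains = parse_select("h1.Lists h3.Let's Code, h1.Dictionaries h3.Let's Code")
# chains == [[(1, 'Lists'), (3, "Let's Code")],
#            [(1, 'Dictionaries'), (3, "Let's Code")]]
```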

jon

On 20 Sep 2016, at 19:42, Levi John Wolf wrote:

> huh, wild. Neat package. [...]

jreades commented on July 19, 2024

Also, there’s this interesting section on customising the cell metadata associated with a notebook (which could definitely be useful for selection and/or formatting):

https://nbconvert.readthedocs.io/en/latest/customizing.html#Templates-that-use-cell-metadata

sjsrey commented on July 19, 2024

Let's say the source atom has something like the following structure

h1 Lists
h3 Let's Code
h3 Other

h1 Dictionaries
h3 Let's Code
h3 Other

then a suppression syntax could be something like

@include {
'resource' = '…',
'select' = "h1.Lists -h3.Let's Code, h1.Dictionaries h3.Let's Code"
}

which would result in the selection:

h1 Lists
h3 Other

h1 Dictionaries
h3 Let's Code

whereas

@include {
'resource' = '…',
'select' = "h1.Lists, h1.Dictionaries, h3.Let's Code"
}

would give:

h1 Lists
h3 Let's Code
h3 Other

h1 Dictionaries
h3 Let's Code

and

@include {
'resource' = '…',
'select' = "-h1.Lists, h3.Other, h1.Dictionaries, h3.Let's Code"
}

gets:

h3 Let's Code

h1 Dictionaries
h3 Let's Code

In other words the rules would be

  1. If you specify only the parent, you get the parent and all children
  2. If you specify a parent and a child, you get only the parent and the specified child (other children are suppressed).
  3. If you specify a parent and negate a child, you get the parent and any other children but not the negated child
  4. If you negate a parent, you must specify one or more children to include. If you want to suppress a given level (h1), you simply do not include it in select; all of its children are then omitted as well.
  5. If you had a parent with say 5 h3s and you wanted 4 of the h3s but not the parent it would be something like: select=-h1 -h3.Not wanted
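
Rules 1 to 3 can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the document is reduced to a hypothetical dict mapping each h1 title to its h3 titles, the select string is taken as already parsed into include and exclude lists, and the negated-parent cases (rules 4 and 5) are left out.

```python
def apply_rules(children, include=(), exclude=()):
    """Resolve which children of one selected parent survive.

    children: titles of the parent's subsections, in document order.
    include:  children named explicitly (rule 2).
    exclude:  children negated with a leading '-' (rule 3).
    """
    if include:
        # Rule 2: only the named children are kept.
        return [c for c in children if c in include]
    # Rule 1 (nothing named) and rule 3 (negated children dropped).
    return [c for c in children if c not in exclude]

sections = {
    "Lists": ["Let's Code", "Other"],
    "Dictionaries": ["Let's Code", "Other"],
}

# Corresponds to: "h1.Lists -h3.Let's Code, h1.Dictionaries h3.Let's Code"
result = {
    "Lists": apply_rules(sections["Lists"], exclude=["Let's Code"]),
    "Dictionaries": apply_rules(sections["Dictionaries"], include=["Let's Code"]),
}
# result == {'Lists': ['Other'], 'Dictionaries': ["Let's Code"]}
```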

jreades commented on July 19, 2024

On 21 Sep 2016, at 13:42, Sergio Rey wrote:

> In other words the rules would be [...]

Yes, that sounds good to me.

One thing that I think you've actually got right in your mental model and just typed out differently (and just to be particular) has to do with the placement of commas. Let's say you have your document:

#Lists
###Let's code

#Dictionaries
###Let's code

#Lists-of-Lists
###Let's code

If the bit you want is only the Let's Code in Lists then your selection statement should be "h1.Lists h3.Let's Code" with no intervening comma. The comma distinguishes between 'statements', so if you had "h1.Lists, h3.Let's Code" then I would expect that to include everything under #Lists and all three ###Let's Code sections regardless of where they are in the notebook. That style gives maximum flexibility and specificity. I guess it also means we need to look out for potentially duplicate included sections...
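
The difference the comma makes can be shown with a small Python sketch. It assumes the select string has already been parsed into chains of (level, title) pairs, matches titles case-insensitively, and keeps each line at most once so that overlapping statements do not produce duplicates. The function name select_lines and the filler lines in the sample document are made up for illustration.

```python
import re

def select_lines(lines, statements):
    """Return the lines matched by any statement, each line at most once.

    statements: list of chains; a chain is a list of (level, title) pairs,
    e.g. [(1, "Lists"), (3, "Let's Code")] for "h1.Lists h3.Let's Code".
    """
    path = {}  # header level -> title of the currently open section
    selected = []
    for line in lines:
        header = re.match(r"(#{1,6})\s*(.+)", line)
        if header:
            level = len(header.group(1))
            # A new header closes every open section at the same or deeper level.
            path = {lv: t for lv, t in path.items() if lv < level}
            path[level] = header.group(2).strip().casefold()
        # Each line is tested once, so overlapping statements cannot add it twice.
        if any(all(path.get(lv) == t.casefold() for lv, t in chain)
               for chain in statements):
            selected.append(line)
    return selected

doc = """#Lists
intro to lists
###Let's code
list code
#Dictionaries
intro to dicts
###Let's code
dict code
#Lists-of-Lists
###Let's code
nested code""".splitlines()

# "h1.Lists h3.Let's Code": one statement, no comma
narrow = select_lines(doc, [[(1, "Lists"), (3, "Let's Code")]])
# narrow == ["###Let's code", "list code"]

# "h1.Lists, h3.Let's Code": two statements
broad = select_lines(doc, [[(1, "Lists")], [(3, "Let's Code")]])
# broad has 8 lines: all of #Lists plus the other two ###Let's code sections
```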

Jon

sjsrey commented on July 19, 2024

Good catch, I was overlooking the kind of flexibility that using the comma to delimit statements brings.

sjsrey commented on July 19, 2024

After some exploration, it seems parsing the notebooks is pretty straightforward (see #8 and here).

Because of this, I think having a template approach where the template notebook is the skeleton that has cells with the @include syntax makes a lot of sense.

Fleshing this out, the question of what cell type we should use for the @includes comes to mind.

If we use raw and specify the include as a dict, then using the json module inside a parser would handle this. But maybe we should split this issue up, as this thread might be getting too horizontal?
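
As a rough illustration of the raw-cell idea, the sketch below scans an already-loaded notebook dict for raw cells and hands the part after @include to json.loads. It assumes each include sits in its own raw cell and is written as a valid JSON object (double-quoted keys, colons rather than equals signs). The function name find_includes and the path atoms/lists.ipynb are hypothetical.

```python
import json

def find_includes(notebook):
    """Collect the include specifications from the raw cells of a notebook.

    Assumption: each include lives in its own raw cell whose source is the
    word '@include' followed by a JSON object.
    """
    includes = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "raw":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):  # source may be a string or a list of lines
            source = "".join(source)
        source = source.strip()
        if source.startswith("@include"):
            includes.append(json.loads(source[len("@include"):]))
    return includes

# A tiny template notebook; 'atoms/lists.ipynb' is a made-up path.
template = {
    "cells": [
        {"cell_type": "markdown", "source": ["# My course\n"]},
        {"cell_type": "raw",
         "source": ['@include {"resource": "atoms/lists.ipynb",\n',
                    ' "select": "h1.Lists h3.Let\'s Code"}']},
    ]
}

specs = find_includes(template)
# specs == [{'resource': 'atoms/lists.ipynb', 'select': "h1.Lists h3.Let's Code"}]
```

A notebook on disk could be loaded into such a dict with json.load before calling the function.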

Going to begin to split this off into separate issues

  • #11 Specification of @include (basically what we were discussing here before this split)
  • #10 Reading, writing, sub-setting notebooks
  • #9 Structure of the atom notebooks

Let's keep this open, and add any separate issues into the previous list. Once the granularity is clear we can close this one.
