fomid's People
fomid's Issues
Pop-up/Collapsible annotations in PDFs (e.g. structured proofs, links to math concepts)
Here and here are discussion on how to implement collapsible/expandable sections in LaTeX. To do that with arbitrary content will require some fiddling with tikz, to avoid the collapsed section taking up empty space in the pdf, but will enable structured proofs. Not sure how well that "nests", will need investigation.
Here is a discussion how to do tooltips/hoverboxes in TeX, which we might exploit for showing definitions on hovering over notations and symbol references.
Variable bindings in binding operators
Problem: FLaTeX needs support for binders. Bound variables can represent simple variables (e.g. \sum_{n\in N}n
, where N
is a "type"), but also sequences (e.g. \forall{x_1,...,x_n}...
, see #7) or more complicated patterns (e.g. \sum_{(a,b)\in S}a^2\cdot b^2
).
In the latter case, MMT probably has to take care of converting this to something tangible.
Related to #3
Special `\symdef`-variants for constants/functions/predicates
Problem:
Lots of boilerplate, multiple arities for different notations of the same symbol.
Example (SMGloM): A frequent pattern is of the form (see #3)
\symdef[name=foo]{fooOp}{op}
\symdef{foo}[2]{\fooOp(#1,#2)}
where \fooOp
allows to reference the operator foo rather than the application of foo to arguments.
=> two LOC for a very common pattern. More of those, if variants are defined that only differ in the operator symbol itself, e.g.
\notation[variant=x]{timesOp}{\times}
\notation[variant=x]{times}[2]{#1 \timesOp[x] #2}
\notation[variant=star]{timesOp}{*}
\notation[variant=star]{times}[2]{#1 \timesOp[star] #2}
Question: are the more patterns like this? Out-of-the-box support for operator-macros or something more general?
Also: often the operator notation is just an application operator applied to "dummy arguments" (usually \cdot
), e.g. \foo\cdot\cdot
- is this the default operator notation? (MMT: dummy arguments are lambda abstracted away) Then the dummy arguments need to be changeable, e.g. Hom(-,A)
is more common than Hom(\cdot,A)
(TODO)
Multi-sentence statements
Example
"Let a an A with b a B(a) such that P(a,b). If Q(a,b) and there is some c with R(a,b,c), then every D(a,b) is an E(a,b)"
=> \forall a\in A,b\in B(a). P(a,b) \to \Q(a,b) \to (\exists c\in C. R(a,b,c)) \to \forall d \in D(a,b). d\in E(a,b)
Idea: Multiple declarations in a structural feature that elaborates into a single proposition / definition. Retains both presentations.
Additional idea: LaTeX/LaTeXML can implement such features generically, by adding a <span>
whose property
-attribute reflects the name of the structural feature. That way, we can add additional features by providing a structural feature and a LaTeX macro in additional packages, istead of hardcoding them in the FLaTeX-package.
Natural language annotations in general
Example sentence
"A natural number
$N$ is even, if its square is even."
- Let
\operator
a semantic macro that takes n arguments.
e.g.:\implication
, n=2. - In mathmode,
\operator
behaves as usual (e.g.$\implication{A}{B}$
expands to (ignoring semantics)$A \Rightarrow B$
- see https://github.com/KWARC/FoMID/issues/3) - Let p=(p_1,...,p_m) a phrase that represents an application of
\operator
to arguments arg_1,...,arg_n
e.g. p = "A natural number $N$ is even, if its square is even."
arg_1="its square is even",
arg_2="A natural number $N$ is even"
Requirements
-
\operator
in text mode needs n arguments. The n arguments are subsequences of p in arbitrary order, interleaved with subsequences that do not represent an argument.
e.g. first argument of\implication
is "its square is even",
second argument is "A natural number $N$ is even", and
", if" does not represent an argument - these are considered verbalization components of\operator
. - arguments can be invisible.
e.g. the second argument of\natpow
in "its square" is 2, which is not explicitly represented in the text. - each argument has to contain a uniquely identifiable subsequence that represents the actual term
e.g. in "A natural number $N$", the argument to\even
is "$N$", not "natural number"! - variables: If declared, not compositional. If not declared, the scope needs determining ("is this still the same
$N$ as this one?"),
its quantification (universally/existentially) and "type" (optionally) provided, and all of these ideally at most once, even if the variable
occurs multiple times.
Ideas
- Macro
\operator
takes n{}
arguments, representing the operator arguments, interleaved with optional[]
arguments in between (including leading and trailing) representing verbalization components.
e.g.\even{its square}[is even]
=> "its square" is the argument, "is even" is just a verbalization component. - An argument prefixed by
*[
k]
indicates that what follows is the k-th argument rather than the next one.
e.g.\implication*[2]{A natural number $N$ is even}[, if]{its square is even}
indicates that the second argument comes first. - An argument prefixed by
*
is invisible
e.g.\natpow{its}[ square]*{2}
==\natpow{its}*{2}[ square]
==\natpow*[2]*{2}{its}[ square]
=> \implication*[2]{\even{A \NaturalNumbers[natural number] $N$}[ is even]}[, if]{\even{\natpow{its}*{2}[ square]}[ is even]}
.
Remaining questions
- Both "A natural number $N$" and "its" represent
OMV(N)
, whereN
has type\NaturalNumbers
and is universally quantified.
How to represent this? (see https://github.com/KWARC/FoMID/issues/5 ) - Binding applications (see https://github.com/KWARC/FoMID/issues/8 )
- Sequences (see https://github.com/KWARC/FoMID/issues/7 )
XHTML/LaTeXML/FLOMDoc
XHTML/LaTeXML
- For declarations we can generate
<script type="application/
(xml | json)">
nodes (for xml: Requires prefixing all inner nodes withxhtml:
, but they're not checked against the rnc and stripped away in post processing anyway). - For annotations, we probably need at most an annotation type (OMS, OMA etc.) and a URI/ID/name. For OMS, OMV etc. a URI will do, for OMAs. Applications of complex terms need an apply-operator anyway, so the head of an OMA is always a Symbol (?). For more complex attributes, we can add a
<script>
as the first child that holds complex terms etc. - In text mode, we can use
<div>
or<span>
with attributesresource
(URI) andproperty
(annotation type). - In math mode, the only valid attribute that survives post processing in LaTeXML is
class
- might even be "semantically valid" in that both annotation type and a URI could even meaningfully be used as CSS classes. Then we can addproperty
andresource
orid
as (as usual space separated) class attributes. Complex terms (if necessary) can be moved to a<script>
after the<math>
node and linked to via an id in the classes (inelegant, but functional, I guess).
Examples
Concept | property | resource |
---|---|---|
Theory | stex:theory | URI |
OMA | stex:oma | URI of head symbol notation |
OMV | stex.omv | URI of variable declaration if declared, name otherwise |
Changing notations
On the latex side: a macro to set a default notation option, globally (for the whole document) or locally.
On the viewer side: we can give notations (including arbitrary value combinations for variant
, lang
, arity
, see #3) unique URIs and annotate with those, so that MMT can parse out how a symbolic expression (i.e. MathML) came about. Conversely, \symdef
/\notation
s can write their notations as MathML into the xhtml (in a <script>
), so MMT should be able to easily substitute notations when serving a document, or even custom notations input via a latex field(?).
Question: how do we treat multiple notations for the same symbol on the MMT side? Multiple notations should be easily implementable, but what if new notations are added later on in separate modules - introduce defined constants? Keep FLaTeX notations separately in the flatex-Plugin for MMT?
Sequences and sequence arguments for flexary (associative) operators
Problem: We need systematic treatment of sequences and flexary arguments, in particular with respect to ellipses.
e.g. a_1 + ... + a_n
, or a_0 + a_2 + a_4 + ... + a_n
.
Needs to be compatible with earlier declarations, e.g. Let $(a_i)_{i\in I}$ a sequence in $A$ such that...
.
Sequence declarations?
Problem gets even trickier if you think of e.g. matrices like
\begin{pmatrix}
a_{11} & \cdots & a_{1n} \\
\vdots & \ddots & \vdots \\
a_{n1} & \cdots & a_{nn}
\end{pmatrix}
...how do we treat this systematically?
Variables (scopes, quantification,...)
Possible approach 1: Variable declarations
\vardecl[name=N,type={\NaturalNumbers}]{varN}{N}
(syntax analogous to\symdef
, see #3) introduces a macro\varN
for the variableN
, and behaves exactly like a semantic macro,
i.e. in mathmode,$\varN$
simply yields "N" (its notation), in textmode it takes 0{}
-arguments and one optional[]
-argument:
=> \implication*[2]{\even{\varN[A \NaturalNumbers[natural number] $\varN$]}[ is even]}[, if]{\even{\natpow{\varN[its]}*{2}[ square]}[ is even]}
Possible approach 2: Scope environments
Can optionally declare variables, e.g. \varscope
([name=
name, type=
t, notation=
not,
(universal
|existential
)]
)*{
...}
- with everything optional. If an occuring variable is not "declared in a \varscope
, its scope is the most inner \varscope
in which it occurs. A variable occurence is whatever LaTeXML's math parser considers a variable, or a \var{
name}
(for not explicitly declared variables in text, e.g. "its" in the example), or \name
if a variable with name name is declared in the current \varscope
, e.g.:
\varscope{\implication*[2]{\even{\var{N][A \NaturalNumbers[natural number] $N$]}[ is even]}[, if]{\even{\natpow{\var{N}[its]}*{2}[ square]}[ is even]}}
or
\varscope[name=varN, type=\NaturalNumbers, not=N]{\implication*[2]{\even{\varN[A \NaturalNumbers[natural number] $N$]}[ is even]}[, if]{\even{\natpow{\varN[its]}*{2}[ square]}[ is even]}}
Advantage: compatible with fewer explicit annotations, scope still has to be provided, though
Disadvantage: might require an additional \apply
-operator for applications of (function) variables, but we probably can't avoid that anyway.
Deprecating signature modules
Instead of having a signature module foo
and multiple natural language modules foo.en
, foo.de
as in SMGLoM, the library should not have to rely on "purely formal" modules. Instead, the formal content of all NL moduls should either coincide (entails duplication, but if definitions are extracted from annotated natural language, it will do that to some extent anyway), or ideally, \symdecl
s should only occur in some default language (probably english, given the expected coverage), and that module is included/extended by other NL modules.
TeX will have to be smarter about NL modules - e.g. pick the "default" language (according to babel package?) automatically, resolve e.g. ?foo
to ?foo.<lang>
automatically
Translation / Alignments
- We might want a syntax for alignments in LaTeX
- We might want users to be able to add new alignmentes on MathHub
- => We probably need a better syntax and interfaces for users to specify/provide alignments including their arguments
- MMT needs a better way to store/retrive/know about alignments => ULO, relational? (Work in progress, once relational is reimplemented as ULO/rdf)? Where/how should they be stored in the future to be persistent (if provided by a standalone user interface?)
(commutative) diagrams
...need annotations for tikz/tikzcd/xy? All of them? Some of them?
Might require annotations for nodes and arrows as "universally" and "existentially" quantified for the usual categorial definitions
(Algebraic) structures
Problem
Structured concepts (e.g. groups, topological spaces, probability spaces etc.) in informal mathematics act as both "types" ("Let G a group") as well as bundles of declarations (a group is a tuple (G,o,e,⁻¹)). They can be "instantiated" with an arbitrary number of explicit concepts (e.g. "ℤ is a group", "(ℤ,+) is a group", "(ℤ,+,0,-) is a group",...)
=> They need to be both symbols (to act as types) as well as theories (for extension/inheritance: "A group is a monoid such that...")
Ideas
Obvious solution for symbol/theory duality: Mod-types => structures as records
Inheritance/Instantiation: Parametric theories: fields are theory parameters, declared individually, treated like variables (OMVs), e.g. (prototypical syntax:)
\begin{theory}[name=Magma]
\parameter[type=\set]{universe}{M}
\parameter[type={\funtype{\universe,\universe}{\universe}}]{op}[2]{#1 \circ #2}
\end{theory}
\begin{theory}[name=Monoid]
\parameter[type=\set]{universe}{M}
\parameter[type={\funtype{\universe,\universe}{\universe}}]{op}[2]{#1 \circ #2}
\parameter[type=\universe]{unit}{e}
\extendmodule[universe=\universe,op=\op]{Magma}
(axioms)
\end{theory}
where \parameter
follows the syntax of \symdef
(see #3) and \extendmodule[
...]{
T}
behaves like \importmodule
, except that it \let
s the parameter declarations in T to their assignments in the optional arguments (if given).
Note: this is probably best implemented by having the macro that stores all declarations in a model contain the unexpanded macro definitions, the actual semantic-macro definition macros (\notation
,\symdef
etc.) expanding notations, and \protect
ing the inner-most semantic-macro expansion (the one that we need e.g. LaTeXML bindings for, which has access to the full URI of a symbol). That way, \extendmodule[
foo=\bar]
really should automatically replace \foo
by \bar
everywhere in the relevant scope, and similar for nested \extendmodule
s.
Instantiation
Question: Do we need to distinguish between the structure and the universe? Are there "structures" that don't have a set universe as its primary component? Does something like Let \mathcal G = (G,\circ,e,{}^{-1})
ever occur?
Either way: \instantiate[
...]{
T}{
name }
creates a macro \name
. MMT can set the type of \name
as Mod(T) and define it as a record with the provided assignments or undefined "dummy" constants for the missing ones. \name[
p]
yields the parameter declaration p in module T, which (analogous to \extendmodule
) is being \let
as \foo
if p=\foo
is part of the optional arguments of \instantiate
. MMT can then lambda-abstract away the not-provided assignments (i.e. dummy constants), which corresponds to universally quantified variables or (equivalently) proof arguments (in the case of axioms) in the current scope. \instantiate
then basically behaves like \extendmodule
in tex, except that the names of the parameters are \name[*p*]
instead of \name
, and MMT can handle them differently.
If we additionally let \name
without parameters be \name[p]
for the first parameter p in T, and (in some cases) the macroname allowed to be empty, then this allows for both e.g. Let \instantiate{Monoid}{monoidM} $\monoidM$ a group with operation $\monoidM[op]{\cdot}{\cdot}
as well as
\istantiate[universe=\IntegerNumbers,op=\natplus,unit=0]{Monoid}{}
(introducing proof obligations for the monoid axioms, by abstracting away the missing declarations)
Question: It occurs to me, that we may then not even make a distinction between \parameter
and \symdef
- the only difference is then between \importmodule
- which is an import on the MMT side - and \extendmodule
and \instiate
- which are an implicit structure and a named structure (or somewhat equivalently a record) respectively.
MMT needs to take care that all recursively included modules in the structures are assigned to plain includes.
Realms
FLaTeX/FLOMDoc needs support for realms, see here.
Examples:
- Groups. Face: theory of groups. Pillars: e.g. "Monoid with inverses" and "associative loop". This is fully solved by
\extendmodule
, see #2 - where the theory of groups would extend both monoid and loop => No two separate pillar modules even needed...? - Real Numbers. Face: Theory of archimedian topologically closed totally ordered fields. Pillars: e.g. Dedekind cuts, equivalence classes on Cauchy sequences. Maybe
\extendmodule
also solves these instances?
Question: What do we need on the MMT side to support this systematically?
Guided tours
Easy to implement in theory: Take the topologically sorted list of dependencies of a concept/document and present to reader.
Desirable: Have a user permanently mark modules as trivial/known, so that they are omitted from guided tours for that specific user. Easy to do in an IDE, on MathHub that would require accounts, I guess?
Implicit/default arguments/notations with differing arities
e.g. the general linear group of dimension n over a field K is usually just denoted as GL(n), omitting the field. Should we expect users to provide them each time?
Alternatively: Semantic abbreviations (I think sTeX has an \abbrdef
, but it's not documented afaik?), possibly with an \implicit
macro that signals MMT that an implicit argument occurs here, e.g.
\symdef{gln}[2]{GL(#1,#2)}
\abbrdef{glni}[1]{\gln{\implicit}{#1}}{GL(#1)}
Then MMT can attempt to do type inference and search the context for the most likely value. We can also allow for
local \abbrdef
s that explicitly state a fixed argument for the current section, in which case we might want to mark these as not-exported in \includemodule
s.
verbalizations with arguments
We need a general scheme for verbalizations of functional concepts. Marcel Dreier's B.Sc Thesis has some thoughts on this, which are not quite implemented yet.
But the is more than discussed there. I have been stumbling over the treatment of "$n$-ary" again, which I had marked up as \defi[name=nary]{$n$-ary}
or as $n$\defi[name=arity]{ary}
. All of this muddles the fact that the word ary
is functional and takes $n$
as an argument. Other examples are $\mathal{C}$-derivable
and $C^\infty$-manifold
or $n$ to $m$ relation
.
Library structuring
Desirable features:
- Multilinguality
- Multiple differing definitions for the same concepts (e.g. using realms, see #12)
- Remarks, Examples
- Glossary-like entry points? On MathHub, should be followed by definitions, remarks and examples
- Provenance/Confidence? ("Definition published here", "proof published here", "proof implemented in system X here", "added by user Y")
=> SMGloM structure as smglom/
area/
conceptname.
language.tex
likely won't do anymore. Archive structure is questionable anyway, but also the addition of several "types" of entries and multiple entries for "the same" concepts put this into question. Also, once the library grows, the individual archives will grow by largely irrelevant (for a given user) content. E.g. everyone needs basic algebra, no one needs highly specialized concepts in higher commutative algebra.
=> GIT integration in MMT could maybe solve that by LMH specifically cherry picking (dependency trees of) individual modules? In that case, the "backend" organizational structures becomes largely irrelevant
Semantic Search
Should largely work out of the box using MathWebSearch, but we might need to figure out how to generate an index from FLOMDoc-XHTML, that maybe also makes use of semantic information?
(I notice that the word semantic is somewhat overlaoded here: Semantic serach in the sense of MathWebSearch operates on syntax trees only, so it's more syntactic search in the logical sense and only semantic in the sense of "syntax-tree aware")
Symbolic notations
\symdecl[name=foo,type=
tp]{bar}
introduces a new symbol with name foo
and macro \bar
. If no name is given, the name is bar
. The (optional) type is an arbitrary symbolic expression.
\notation
... introduces a new notation for a symbol (more details below). \symdef[
...]{bar}
... combines \symdecl
and \notation
.
Notation schema
\notation[prec=
p;
p1x
...x
pn,args=
args,
vt=
name]{foo}[
n]{
not}
...
- n is the (optional) arity, subsumed by:
- args is a list of
i
s,a
s andb
s. If given, the length of * args* is the arity. If no args is given, the arity is n and all arguments are normal (i
). If no n is given, then n=0.i
represents a normal argumenta
represents an associative argument listb
represents a bound variable argument
- p is the operator precedence, the pi are the individual argument precedences. Default: pi=p, p=0, except if n=0, in which case p=
\infprec
(see bracketing below). foo
is the name or macroname or URI of the symbol that gets the new notation.- not is the actual notation, with the usual
#
i representing the ith argument. \notation
takes an additional argument for each associative argument list, that provides a notation for the concatenation of two list elements (i.e. a binary macro).- vt is one of (
variant
,lang
,arity
), name is a string identifying the notation variant. A notation for\foo
withvariant=bar
is used via\foo[variant=bar]
or\foo[bar]
. Variants in the three dimensions (variant
,lang
,arity
) can be composed (if a notation for the composition exist), e.g.\foo[bar,lang=de]
.variant
is generic,lang
is used for language-specifc variants, arity is for variants of operators where commond "default" arguments are often suppressed -- e.g. least common multiple lcm(n,m) is called kleinstes gemeinsames Vielfaches in german, and correspondingly represented as kgV(n,m).
Question: how to handle binders and bounc variable arguments (separate issue: #8)
Example:
\notation{plus}[2]{#1 + #2}
: \plus{a}{b}
yields a+b
\notation[args=a]{plus}{#1}{#1 + #2}
: \plus{a,b,c,d}
yields a+b+c+d
\notation[args=ai]{sumlethan}{#1 < #2}{#1+#2}
: \sumlethan{a,b,c}{A}
yields a+b+c<A
Automatic bracketing
Automatic bracketing works by comparing a downwards precedence p_d with an upwards precedence p_u. The initial downward precedence is p_d:=-\infprec
.
A semantic macro \foo{
a1}
...{
an}
sets the current downward precedence for each argument ai to its argument precedence p_d:=pi in the ith position. When expanding a semantic macro \bar
, the operator precedence p of \bar
is taken as upwards precedence p_u=p. Compare p_d and p_u. If p_u≤p_d, parentheses are wrapped around \bar
, otherwise not.
Example:
\notation[prec=500]{plus}[2]{#1 + #2}
\notation[prec=600]{times}[2]{#1 * #2}
In both cases, argument precedences are operator precedences.
Consider \times{a}{\plus{b}{c}}
. Then \times
compares its precedence 600
with the (initial) downwards precedence -\infprec
, where \infprec
:=1000000. No parentheses are inserted. \times
then sets the downwards precedence to 600
and expands its arguments.
\plus
compares its precedende 500
with the downwards precedence 600
and inserts parentheses.
Result: a*(b+c)
Conversely: \plus{a}{\times{b,c}}
. Here, \plus
sets the downwards precedence to 500
. \times
compares its operator precedence 600
and does not insert parenthesis. Hence: a+b*c
.
The default precedence \infprec
for 0-ary semantic macros means no parentheses are ever automatically wrapped around constant symbols.
Question: do we need argument precedences in MMT?
Customization
The algorithm uses \notation@lparen
and \notation@rparen
as opening and closing brackets. By default, \notation@lparen:=(
and \notation@rparen=)
. In displaymode, \left\notation@lparen
and \right\notation@rparen
are used.
\dobrackets{...}
wraps brackets around its arguments and is used by the bracketing algorithm, hence allows for forcing brackets. \withbrackets{a}{b}{t}
sets \notation@lparen:=a
and \notation@rparen:=b
in t
<- allows customizing which brackets to use.
Question: Multiple notations in MMT, see #11
Typechecking expressions / Translating to foundations
Question: Can we typecheck FLaTeX expressions based on weak type annotations directly or do we need to translate them to a foundation?
Obviously, translation to e.g. MitM allows for type checking, but also requires decent alignment coverage.
Additionally/Alternatively: Maybe we can use realms (see #12) for this?
E.g. both a module for set theory and a module for type theory could provide pillars for an interface theory for implication, the former instantiates it with a set of pairs, the latter by a primitive constructor with inference rules. Then an expression containing only symbols for which pillars in type theory (or symbols which have definitions <- recurse) exist could be trivially translated to LF/MitM/LATIN2. "Type theory modules" in FLaTeX/FLOMDoc could be trivially aligned with LF/LATIN2/MitM concepts.
Similarly, a module for natural numbers could have a pillar for the Von Neumann construction, indicating how to translate it to ZF (or a weaker set theory - we only need infinity, extensionality and restricted comprehension, I think?), whereas the peano-axiomatic face-theory naturally yields a "type theoretic" pillar.
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.