#40 discusses the identification of satellite adjectives, which in PWN differs from the identification scheme used by other kinds of synsets. in mill we tried to make this identification scheme more regular by having the satellite adjectives use the same scheme as every other kind of synset (which ultimately led to the issue discussed in #40).
a more radical idea is to stop trying to have satellite adjectives behave like other synsets, and have the other synsets behave like satellite adjectives. what follows is a proposal for a new wordsense ID scheme.
lexical ids have no meaning whatsoever; they are solely an ad hoc way of preventing ID clashes, because the combination (lexical form, lexicographer file) is not enough to uniquely determine a wordsense. we could get rid of lexical ids by generalizing a version of the ID scheme formerly used by adjective satellites, which can be uniquely identified by (lexical form, head synset).
nouns and verbs could be identified by (lexfile, lexical form, hypernym) (or hyponym?).
pertainyms (adjectives or adverbs) could be identified by the wordsense they pertain to (plus lexfile and lexical form).
all in all, we define a 'core' relation for each 'kind' of wordsense/synset, and use the relation's target + the lexical form of the source to identify the wordsense/synset. naturally, mill would have to be able to verify that this naming scheme actually yields unique IDs. we also wouldn't have to identify the core target beyond its lexical form unless that's not sufficient to satisfy the uniqueness constraint.
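to make the uniqueness check concrete, here is a minimal sketch in Python. it is not how mill works or would have to work; the `Sense` record, the `core_target` field, and the function names are all hypothetical, and the sample senses are illustrative rather than checked against PWN. it only shows that a check of this kind amounts to grouping senses by their candidate ID and reporting any group with more than one member.

```python
from collections import defaultdict
from typing import NamedTuple


class Sense(NamedTuple):
    lexfile: str      # lexicographer file, e.g. "noun.artifact"
    form: str         # lexical form of the wordsense
    core_target: str  # lexical form of the core relation's target (hypothetical field)


def find_clashes(senses, key):
    """Group senses by key(sense); return only the groups with more than one member."""
    groups = defaultdict(list)
    for sense in senses:
        groups[key(sense)].append(sense)
    return {k: v for k, v in groups.items() if len(v) > 1}


def short_id(sense):
    # (lexfile, lexical form): the pair that needs a lexical id to disambiguate
    return (sense.lexfile, sense.form)


def proposed_id(sense):
    # (lexfile, lexical form, core target): the scheme proposed here
    return (sense.lexfile, sense.form, sense.core_target)


# illustrative data only; for nouns the core relation is assumed to be hypernymy
SENSES = [
    Sense("noun.artifact", "bank", "depository"),
    Sense("noun.artifact", "bank", "container"),
    Sense("noun.object", "bank", "slope"),
]

if __name__ == "__main__":
    print("clashes without core target:", find_clashes(SENSES, short_id))
    print("clashes with core target:", find_clashes(SENSES, proposed_id))
```

if `find_clashes(SENSES, proposed_id)` came back non-empty, that would be the case where the core target has to be identified by more than its lexical form.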
but this is all very radical, so I don't know if it should be implemented.