Comments (8)
Hi @masakiowari, thanks for this. The first thing to check is whether the parser creates a valid CCG tree. You can do this with the following code:
tree = parser.sentence2tree("your sentence")
if tree is None:
    print("Failure")
else:
    print(tree.deriv())
If the sentence has a CCG derivation, then this is probably a lambeq problem. However, if the sentence fails to parse, then this is a problem of the DepCCG parser. Let us know the result.
from lambeq.
Hello! Thank you very much.
For the "bad" sentence, the result is as follows.
The code
import sys
sys.path.append("/lambeq")
sys.path.append("/depccg")
from lambeq import DepCCGParser
from discopy import grammar
tree = DepCCGParser.sentence2tree("上品な 表現 を する")
if tree is None:
    print("Failure")
else:
    print(tree.deriv())
gives the following output:
Traceback (most recent call last):
  File "c:\Users\bi21008\Downloads\depccg-master\import sys.py", line 16, in <module>
    parser = DepCCGParser(verbose='suppress')
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\bi21008\Downloads\depccg-master\lambeq\text2diagram\depccg_parser.py", line 155, in __init__
    raise ValueError('DepCCGParser only supports
ValueError: DepCCGParser only supports "progress" level of verbosity. suppress was given.
For a "good" sentence such as "これはテストです", similar code gives the following output:
これ >> Id(n) @ は >> Id(n @ n.r) @ テスト >> Id(n @ n.r) @ です >> Id(n @ n.r @ n.r @ s) @ Cup (n.l, n) >> Id(n @ n.r @ s) @ Cup(n.l, n) >> Cup(n, n.r) @ Id(s)
As you say, this result may suggest that the problem is with the parser.
However, when we use the depccg parser directly, as described on the following page:
https://github.com/masashi-y/depccg
a bad sentence such as "上品な 表現 を する" gets a proper output like
ID=2, Prob=-53.02713191278883
{< S[mod=nm,form=base,fin=t] {< S[mod=nm,form=base,fin=f] {< NP[case=nc,mod=nm,fin=f] {NP[case=nc,mod=nm,fin=f] 上品な/上品な/} {NP[case=nc,mod=nm,fin=f]\NP[case=nc,mod=nm,fin=f] 表現/表現/}} {< S[mod=nm,form=base,fin=f]\NP[case=nc,mod=nm,fin=f] {< NP[case=nc,mod=nm,fin=f] {< NP[case=nc,mod=nm,fin=f] {NP[case=nc,mod=nm,fin=f] を/を/} {NP[case=nc,mod=nm,fin=f]\NP[case=nc,mod=nm,fin=f] する/する/}} {S[mod=nm,form=base,fin=t]\S[mod=nm,form=base,fin=f] 。/。/**}}
Here, we should emphasize that a "bad sentence" means a sentence for which depccg + lambeq does not give an output.
These bad sentences are grammatically correct in Japanese.
By the way, we have modified "ja.py" of depccg to make depccg + lambeq work well.
The above error is the one we get even after this modification.
The modification of ja.py is as follows.
For example,
def generalized_backward_composition1(x: Category, y: Category) -> Optional[CombinatorResult]:
    # Backslashes are escaped so that "\b" is not read as a backspace character
    uni = Unification("b\\c", "a\\b")
    if uni(x, y):
        result = x if _is_modifier(y) else uni['a'] | uni['c']
        return CombinatorResult(
            cat=result,
            op_string="bx",
            op_symbol="<B1",
            head_is_left=False,
        )
    return None
is modified to
def generalized_backward_composition1(x: Category, y: Category) -> Optional[CombinatorResult]:
    # Backslashes are escaped so that "\b" is not read as a backspace character
    uni = Unification("b\\c", "a\\b")
    if uni(x, y):
        result = x if _is_modifier(y) else uni['a'] | uni['c']
        return CombinatorResult(
            cat=result,
            op_string="bc",
            op_symbol="<B",
            head_is_left=False,
        )
    return None
So here we applied the change op_string bx -> bc, op_symbol <B1 -> <BC.
Similarly, we applied the following modifications to ja.py:
- generalized_backward_composition2: op_string bx -> gbc, op_symbol <B2 -> <Bⁿ
- generalized_backward_composition3: op_string bx -> gbc, op_symbol <B3 -> <Bⁿ
- generalized_backward_composition4: op_string bx -> gbc, op_symbol <B4 -> <Bⁿ
- generalized_forward_composition1: op_string fx -> gfc, op_symbol >Bx1 -> >Bⁿ
- generalized_forward_composition2: op_string fx -> gfc, op_symbol >Bx2 -> >Bⁿ
- generalized_forward_composition3: op_string fx -> gfc, op_symbol >Bx3 -> >Bⁿ
That is all.
Before this modification, many more errors occurred.
The problems with adjectival verbs are the ones that remain even after this modification.
@masakiowari Hi, I can't seem to replicate this issue. Could you run this code fragment on your system and show us the full output please?
from lambeq import DepCCGParser
parser = DepCCGParser(lang='ja')
sentences = [
'感動的な映画を見る',
'曖昧な表現をする',
'静かな海を見る',
'健康な男性が歩く',
'親切な男性がいる',
'元気な男性が歩く',
'上品な表現をする',
'きれいな海を見る',
'健やかな男性が歩く',
'和やかな雰囲気を感じる',
'穏やかな笑顔を浮かべる',
'正直な男性がいる',
'有名な男性がいる',
'にぎやかな雰囲気を感じる',
'特別な表現をする',
'複雑な表現をする',
'まじめな男性がいる',
'下手な表現をする',
'便利な本を買う',
'朗らかな笑顔を浮かべる',
'幸せな笑顔を浮かべる',
'好きなスープを食べる',
'無理な計画を立てる',
'暇な男性がいる',
'必要な計画を立てる',
'邪魔なものをどかす',
'変な表現をする',
'自由な表現をする'
]
for sentence in sentences:
    print(parser.sentence2tree(sentence))
Thank you.
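If it helps when reporting back, here is a minimal sketch of a helper that sorts sentences into parsed, unparsed, and errored groups, so that one failing sentence does not stop the whole run. The name `triage` and the stand-in `fake_parse` are hypothetical and not part of lambeq; in practice you would pass something like `parser.sentence2tree`. Depending on the library version and options, a failed parse may show up either as `None` or as an exception, so the sketch handles both.

```python
from typing import Callable, Iterable, Optional


def triage(parse: Callable[[str], Optional[object]],
           sentences: Iterable[str]) -> dict:
    """Run `parse` on each sentence and group the outcomes."""
    report = {"ok": [], "no_parse": [], "error": []}
    for sentence in sentences:
        try:
            tree = parse(sentence)
        except Exception as exc:  # keep going and record what went wrong
            report["error"].append((sentence, f"{type(exc).__name__}: {exc}"))
            continue
        if tree is None:
            report["no_parse"].append(sentence)
        else:
            report["ok"].append(sentence)
    return report


# Stand-in parser for demonstration only (an assumption, not real parser behaviour).
def fake_parse(sentence: str):
    if "いる" in sentence:
        raise ValueError("unsupported rule")
    return None if not sentence else ("tree", sentence)


report = triage(fake_parse, ["静かな海を見る", "親切な男性がいる", ""])
for group, items in report.items():
    print(group, items)
```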
@ianyfan Thank you very much for your great suggestion!
We have now rebuilt our lambeq and depccg environment without the modified ja.py file.
Now we can process sentences like
'感動的な映画を見る',
'曖昧な表現をする',
'静かな海を見る',
without errors, so we can handle adjectival verbs.
Unfortunately, however, it seems that there still exist sentences which cannot be processed,
e.g. "ボブはおいしくないカレーが嫌いではない"
This sentence can be processed in the old environment with the modified ja.py file.
Hi, I've had a look and the issue seems to be due to depccg returning a parse that cannot be drawn under standard CCG rules.
From your initial list of sentences, there are 4 that lambeq cannot draw:
- 親切な男性がいる
- 正直な男性がいる
- まじめな男性がいる
- 暇な男性がいる
They all have the same issue. For example, for the first sentence, depccg returns a parse that contains this problematic sub-parse:
親切     な
-----  -----
  S     S\S     男性
-----------(BA) -----
     S            N
---------------------(UNK)
          N
depccg tells us which rule it uses at each step, e.g. BA for backwards application. For the bottom rule, depccg provides the rule "other" which clearly isn't a standard CCG rule. Therefore, we cannot draw this tree as a diagram.
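As a rough illustration of how such a node can be located, here is a minimal sketch that walks a derivation and reports every node labelled UNK. The nested-dict encoding, the helper names, and the "LEX" label for leaves are assumptions made for this example; they are not depccg's or lambeq's own tree types.

```python
# Hypothetical nested-dict encoding of the sub-parse above.
def leaf(word, cat):
    return {"rule": "LEX", "cat": cat, "text": word, "children": []}


def node(rule, cat, *children):
    text = "".join(child["text"] for child in children)
    return {"rule": rule, "cat": cat, "text": text, "children": list(children)}


def find_unknown(tree):
    """Return (text, category) for every node whose rule label is UNK."""
    found = [(tree["text"], tree["cat"])] if tree["rule"] == "UNK" else []
    for child in tree["children"]:
        found.extend(find_unknown(child))
    return found


sub_parse = node(
    "UNK", "N",
    node("BA", "S", leaf("親切", "S"), leaf("な", "S\\S")),
    leaf("男性", "N"),
)
unknown_nodes = find_unknown(sub_parse)
print(unknown_nodes)
```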
The example in your comment "ボブはおいしくないカレーが嫌いではない" has a different issue, where depccg tries to perform backwards cross composition (BX) on the types S\N + S\S -> S\N, which are not valid types to perform backwards cross composition on. This results in an error when trying to draw the diagram.
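To make the type mismatch concrete, here is a minimal sketch of the two textbook rule schemas, backward composition (Y\Z X\Y => X\Z) and backward crossed composition (Y/Z X\Y => X/Z). The `Functor` class and function names are made up for this example and do not reflect how lambeq or depccg represent categories. Under these schemas, S\N + S\S fits the non-crossed rule but not the crossed one, because the left category has no forward slash.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Functor:
    result: "Cat"
    slash: str  # "/" or "\\"
    arg: "Cat"


Cat = Union[str, Functor]  # atomic categories are plain strings


def backward_composition(left: Cat, right: Cat) -> Optional[Cat]:
    """Y\\Z  X\\Y  =>  X\\Z"""
    if (isinstance(left, Functor) and isinstance(right, Functor)
            and left.slash == "\\" and right.slash == "\\"
            and right.arg == left.result):
        return Functor(right.result, "\\", left.arg)
    return None


def backward_cross_composition(left: Cat, right: Cat) -> Optional[Cat]:
    """Y/Z  X\\Y  =>  X/Z"""
    if (isinstance(left, Functor) and isinstance(right, Functor)
            and left.slash == "/" and right.slash == "\\"
            and right.arg == left.result):
        return Functor(right.result, "/", left.arg)
    return None


s_bs_n = Functor("S", "\\", "N")  # S\N
s_bs_s = Functor("S", "\\", "S")  # S\S

harmonic = backward_composition(s_bs_n, s_bs_s)
crossed = backward_cross_composition(s_bs_n, s_bs_s)
print(harmonic == s_bs_n, crossed)
```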
So I'm afraid I'm not sure if we can help you on the lambeq side; this seems to be an issue with how depccg parses these sentences.
I hope that helps. Let me know if you have any more questions.
@ianyfan, thank you very much for your detailed explanation.
Now I understand the reason for this problem.
So, lambeq can only understand standard CCG, and depccg-ja sometimes outputs something which does not obey standard CCG.
A possible solution may be to modify depccg so that it only outputs standard CCG.
I will try to solve the problem along these lines.
We'll convert this to a Discussion since it might be useful for other users as well.