Comments (4)
If a program could not be learned from the examples given, adding more examples usually will not help. Normally, every program expressible in the DSL that satisfies the examples is learned, so learning no programs means that no program in the DSL satisfies all of the examples, and adding more examples would only further constrain the learning problem. (I say "normally" because, using the escape hatches of the learning procedure, you could write your own non-monotonic learning sub-procedure, but that's generally a bad idea because of the confusion you bring up.) As you say, this means that the grammar would likely have to be extended to express the desired operation.
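To make the monotonicity point concrete, here is a minimal sketch in Python. It does not use the PROSE API; the "DSL" is a made-up toy in which every program is a fixed slice `text[start:end]`, and the sample dates are invented for illustration. It only shows that the set of consistent programs can shrink, never grow, as examples are added.

```python
from itertools import product

# Toy program space (an assumption for illustration, not the PROSE grammar):
# a program is a pair (start, end) meaning text[start:end].
def all_programs(max_len=10):
    return [(s, e) for s, e in product(range(max_len + 1), repeat=2) if s < e]

def run(program, text):
    start, end = program
    # A slice that runs past the end of the input is treated as a failure.
    return text[start:end] if end <= len(text) else None

def learn(examples):
    """Return every program consistent with all (input, output) examples."""
    return [p for p in all_programs()
            if all(run(p, inp) == out for inp, out in examples)]

examples = [("20202020", "2020")]
one = learn(examples)                      # several slices produce "2020"

examples.append(("20211999", "2021"))
two = learn(examples)                      # only the leading slice survives

examples.append(("01/31/2024", "2024"))    # a second date format
three = learn(examples)                    # nothing satisfies all three

print(one, two, three)
```

Here `one` is `[(0, 4), (2, 6), (4, 8)]`, `two` is `[(0, 4)]`, and `three` is empty: the slice `(6, 10)` handles the third example on its own but contradicts the first two, so no further example can bring a program back.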
We know the error reporting is poor and it's an issue we intend to address.
If you are comfortable sharing your data, it would be helpful to see your inputs, both to determine whether the task is in fact not expressible and, if so, to help us know how we might want to extend the language to cover your scenario. You can e-mail me at [email protected] if you don't want to share it publicly.
from prose.
@danpere Thanks for the feedback. I am not sure, but I think one of the problems I might be running into is that there are (at least) two date formats mingled in the documents, yyyymmdd and mm/dd/yyyy, either of which could be the accepted 'output', and the grammar may be failing to generalize across them.
For clarification, you are using the Extraction.Text language? (That's the sample that has that exact text as the error message.)
The differing formats might be the issue. Extraction.Text usually ends up being able to use context when the formats are different, but that might not apply in this case. Also, there is a regular expression internally for matching "dates" which is fairly flexible, but it can't cover everything. Extraction.Text does not currently support conditionals, but one way to work around that is to make multiple fields for the different date formats/contexts. @vuminhle may be able to give more tips on getting it to work on difficult scenarios.
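The multiple-fields workaround can be pictured with a small Python sketch. This is not Extraction.Text code: the two regular expressions, the field names, and the sample lines are all hypothetical, and the internal "dates" pattern mentioned above is not reproduced here. The idea is just that each format gets its own field, and the fields are merged afterwards.

```python
import re

# Hypothetical patterns, one per date format (not the internal PROSE regex).
COMPACT = re.compile(r"(?<!\d)\d{8}(?!\d)")             # yyyymmdd
SLASHED = re.compile(r"(?<!\d)\d{2}/\d{2}/\d{4}(?!\d)")  # mm/dd/yyyy

def extract_fields(line):
    """Return one field per date format; a field is None when it is absent."""
    compact = COMPACT.search(line)
    slashed = SLASHED.search(line)
    return {
        "date_yyyymmdd": compact.group(0) if compact else None,
        "date_mm_dd_yyyy": slashed.group(0) if slashed else None,
    }

def merged_date(fields):
    """Combine the fields downstream by taking whichever one matched."""
    return fields["date_yyyymmdd"] or fields["date_mm_dd_yyyy"]

first = extract_fields("Invoice 20240131 paid")
second = extract_fields("Invoice 01/31/2024 paid")
print(merged_date(first), merged_date(second))
```

Each field only ever sees one format, so neither has to generalize across both; the cost is that the merge step lives outside the learned program.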
@danpere has covered all the main points.
If there is no program, it means that your task cannot be expressed in the current grammar. We could have given you back the problematic examples (or a maximal subset of working examples), but I'm not sure if that information is useful. Furthermore, there may be more than one such set.
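As a rough illustration of why there can be more than one such set, here is a brute-force Python sketch over a made-up toy program space (fixed slices `text[start:end]`). It is not how PROSE works internally, the examples are invented, and the search is exponential in the number of examples, so it is only meant for a handful of them.

```python
from itertools import combinations

# Toy program space (illustration only, not the PROSE grammar):
# a program is a fixed slice text[start:end].
PROGRAMS = [(s, e) for s in range(11) for e in range(s + 1, 11)]

def run(program, text):
    start, end = program
    return text[start:end] if end <= len(text) else None

def satisfiable(examples):
    """True if at least one toy program is consistent with every example."""
    return any(all(run(p, i) == o for i, o in examples) for p in PROGRAMS)

def largest_satisfiable_subsets(examples):
    """Return all satisfiable subsets of the largest possible size."""
    for size in range(len(examples), 0, -1):
        found = [list(c) for c in combinations(examples, size) if satisfiable(c)]
        if found:
            return found
    return []

examples = [
    ("20240131", "2024"),
    ("19991231", "1999"),
    ("01/31/2024", "2024"),
    ("12/31/1999", "1999"),
]
subsets = largest_satisfiable_subsets(examples)
print(subsets)
```

With these four examples the result is two subsets of size two, one per date format, and neither is more "right" than the other.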
We do give more indicative messages if your examples are conflicting or duplicated.
As you rightly observed, we can solve this by extending the grammar to support the task. @danpere mentioned learning conditionals, which basically partitions your inputs into different clusters (each of which shares the same format) and learns a program for each of them. This is ongoing work.
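The partition-then-learn idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the clustering key (digits replaced by `d`), the toy slice learner, and the sample data are all invented here and say nothing about how the conditional learning in PROSE is or will be implemented.

```python
import re
from collections import defaultdict

def shape(text):
    """Cluster key (an assumed heuristic): digits become 'd', the rest is kept."""
    return re.sub(r"\d", "d", text)

def learn_slice(examples):
    """Toy learner: first fixed slice text[s:e] consistent with all examples."""
    longest = max(len(inp) for inp, _ in examples)
    for s in range(longest):
        for e in range(s + 1, longest + 1):
            if all(e <= len(inp) and inp[s:e] == out for inp, out in examples):
                return (s, e)
    return None

def learn_conditional(examples):
    """Partition the examples by shape, then learn one program per cluster."""
    clusters = defaultdict(list)
    for inp, out in examples:
        clusters[shape(inp)].append((inp, out))
    return {key: learn_slice(group) for key, group in clusters.items()}

def run_conditional(branches, text):
    program = branches.get(shape(text))
    if program is None:
        return None  # unseen format, or no program was learned for its cluster
    s, e = program
    return text[s:e]

examples = [
    ("20240131", "2024"),
    ("19991231", "1999"),
    ("01/31/2024", "2024"),
    ("12/31/1999", "1999"),
]
branches = learn_conditional(examples)
print(branches, run_conditional(branches, "04/15/2007"))
```

No single slice satisfies all four examples, but each cluster gets its own: `(0, 4)` for `dddddddd` and `(6, 10)` for `dd/dd/dddd`. An input in a format that was never seen, such as `2007-04-15`, falls through to `None`.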
Which API did you use? Did you extract a substring out of a string, or a sequence of substrings out of a string?
It would be great if you can share one or two lines of your (anonymized) data, together with the fields you are extracting, so that we can analyze what is going on.