samtecspg / articulate
A platform for building conversational interfaces with intelligent agents (chatbots)
Home Page: http://spg.ai/projects/articulate/
License: Apache License 2.0
These are open bugs in the UI that need to be fixed.
In order to solve problems with system entities, we added a "use original" attribute to the slots object in the scenario.
This value allows us to indicate that we want to use the original value rather than the parsed value of a recognized entity.
Nonetheless, this shouldn't be so rigid: sometimes we may want to use the original value and sometimes not. It could be better if, rather than having an attribute in the slots, we made use of a notation like api.ai does. For example, if we want to use the original value of a slot, we could specify it like `The {transaction} for {date.original} are...`.
On the other hand, right now the UI hardcodes every slot as not using the original value, because there is some UI work to do. The first part of this issue is optional; this latter statement isn't.
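As an illustration of the proposed notation, here is a minimal sketch of how a response template could resolve an optional `.original` suffix (a hypothetical helper, not the current Articulate implementation):

```javascript
// Hypothetical api.ai-style slot resolution: each recognized slot keeps both
// the parsed value and the user's original text; the response template picks
// one via an optional ".original" suffix.
function fillResponse(template, slots) {
  return template.replace(/\{(\w+)(\.original)?\}/g, (match, name, useOriginal) => {
    const slot = slots[name];
    if (!slot) return match; // leave unknown slots untouched
    return useOriginal ? slot.original : slot.value;
  });
}

// Example: Duckling parses "tomorrow" into an ISO date, but the agent may
// prefer to echo the user's original wording.
const slots = {
  date: { value: '2017-11-27T00:00:00.000Z', original: 'tomorrow' },
  transaction: { value: 'deposits', original: 'deposits' }
};

console.log(fillResponse('The {transaction} for {date.original} are...', slots));
// -> "The deposits for tomorrow are..."
```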
The uniqueness of the elements is being managed by the `_id` field in ES. That means that agents, domains, intents, scenarios, and entities are being identified by the `_id` field in the API.
The output of the parse or converse endpoint is hard to read, given that the GUIDs present in the output aren't developer friendly.
The idea here is to implement a beautify parameter that the developer can set to true in order to view the output of the method with the names of the elements instead of the `_id` field. This approach is useful for debugging purposes, but the developer must still use those `_id`s to implement his webhook.
The other option is to use the field names for both the output and the programming of webhooks. This could be solved by overloading the `_id` in ES with the name of the element, or by adding a new field like API.ai does to identify the action. If we choose the latter option, the entity and scenario would also need a system name, given that the current name is more of a user label that could have spaces or any other format.
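A minimal sketch of what a `beautify=true` post-processing step could look like (the lookup table and field names here are assumptions, not the actual Articulate schema):

```javascript
// Hypothetical "beautify" step: replace _id references in a parse result
// with human-readable names using a lookup table built from the stored
// agents/domains/intents.
function beautify(parseResult, namesById) {
  return Object.assign({}, parseResult, {
    domain: namesById[parseResult.domain] || parseResult.domain,
    intent: namesById[parseResult.intent] || parseResult.intent
  });
}

// Illustrative ids; real ES _id values would go here.
const namesById = {
  'AV8qyjFhaT3sYJd': 'Corporate',
  'AV8qyk2PaT3sYJe': 'Acronym Lookup'
};

const raw = { domain: 'AV8qyjFhaT3sYJd', intent: 'AV8qyk2PaT3sYJe', confidence: 0.93 };
console.log(beautify(raw, namesById));
// -> { domain: 'Corporate', intent: 'Acronym Lookup', confidence: 0.93 }
```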
Rasa included a `regex_features` section in their training data. We want to make use of this feature to improve the recognition of entities.
The idea here is that in the front end the user will have an optional field to enter a regex that helps the model to improve the recognition of the entity.
In the API we are going to store that as an attribute of the entity. When we create the training set we need to take that into account to send it over to Rasa.
For more information go to: https://rasa-nlu.readthedocs.io/en/latest/dataformat.html#regular-expression-features
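For reference, the linked Rasa NLU JSON format places `regex_features` alongside `common_examples`; a minimal example (the intent and entity names are illustrative):

```json
{
  "rasa_nlu_data": {
    "regex_features": [
      {
        "name": "zipcode",
        "pattern": "[0-9]{5}"
      }
    ],
    "common_examples": [
      {
        "text": "show me offers near 40202",
        "intent": "find_offers",
        "entities": [
          { "start": 20, "end": 25, "value": "40202", "entity": "zipcode" }
        ]
      }
    ]
  }
}
```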
From Aug 17 on language-api, @dcalvom entered:
Right now we are using Rasa and Duckling as our parsers. Both run inside an async parallel process, and once finished both results are merged into a final result. The idea of this change is to allow this endpoint to use whatever ML engine the developer decides to integrate into the parser.
This selection of ML engines should be flexible for each developer.
Multi-language support
Responsive platform
A component called input warning already exists in the UI project. Use this component to validate each form of the UI in order to prevent bad requests to the API.
Related to this functionality is error management. The API returns different errors depending on whether a field is missing or bad data was sent. The idea is to show the proper alert to the user based on the error returned by the API.
Implement the screens that would allow the user to edit data.
Here is some food for thought: whenever you add or update an intent, the whole model is retrained. Depending on the size of the agent, the training could last some time. We need to think about how we are going to present that training time to the user.
I created a scenario like this:
{
"agent": "Samson Test",
"domain": "Corporate",
"intent": "Acronym Lookup",
"scenarioName": "Acronym Lookup",
"slots": [
{
"slotName": "Acronym",
"entity": "Acronyms",
"isList": false,
"isRequired": true,
"textPrompts": [
"Which acronym do you want me to lookup?",
"I think you want me to look up an acronym for you, but I am not sure which one, can you tell me?"
],
"useWebhook": false
}
],
"intentResponses": [
"I will lookup the meaning of {Acronym} for you.",
"Give me one second and I will get that for you."
],
"useWebhook": false,
"webhookUrl": "http://localhost:7500"
}
yet when I converse I get this:
{
"timestamp": "2017-11-26T15:40:29.169Z",
"currentContext": {
"id": "1",
"name": "Acronym Lookup",
"scenario": "Acronym Lookup",
"slots": {
"Acronyms": "SOG",
"sys": {
"spacy_org": "SOG"
}
}
},
"timezone": "America/Kentucky/Louisville",
"textResponse": "I will lookup the meaning of {Acronym} for you."
}
Notice that under slots it is Acronyms instead of the slot name I defined which was Acronym.
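A sketch of the likely fix (assumed logic, not the actual converse code): key the context slots by the scenario's `slotName` instead of the entity name returned by the parser.

```javascript
// When filling the conversation context, use slot.slotName ("Acronym") as
// the key rather than the parser's entity name ("Acronyms"), so that
// response templates like "{Acronym}" can resolve.
function fillSlots(scenarioSlots, recognizedEntities) {
  const slots = {};
  scenarioSlots.forEach(slot => {
    const match = recognizedEntities.find(e => e.entity === slot.entity);
    if (match) {
      slots[slot.slotName] = match.value;
    }
  });
  return slots;
}

const scenarioSlots = [{ slotName: 'Acronym', entity: 'Acronyms' }];
const recognized = [{ entity: 'Acronyms', value: 'SOG' }];
console.log(fillSlots(scenarioSlots, recognized)); // -> { Acronym: 'SOG' }
```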
We need to show confirm dialogs for the deletion.
Consider the time of retraining after intent is deleted.
From Aug 17 on language-api, @dcalvom entered:
We are trying to avoid having to look up the _id first, but rather just be able to use the name of the parsed intent for lookup. One example of how a call can look is this:
/agent/samson/domain/corporate/intent/acronyms/scenario
Besides the improvement in style, this would make the calls more developer friendly and would also help in the future with the distribution of data.
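Since user-facing names may contain spaces ("Samson Test"), name-based routes would likely need a URL-safe system name derived from each label. A hypothetical sketch:

```javascript
// Derive a URL-safe system name from a user-facing label (assumed scheme,
// not the actual Articulate convention).
function toSystemName(label) {
  return label.trim().toLowerCase().replace(/[^a-z0-9]+/g, '-');
}

// Build a name-based scenario path from the three parent names.
function scenarioPath(agent, domain, intent) {
  return `/agent/${toSystemName(agent)}/domain/${toSystemName(domain)}` +
         `/intent/${toSystemName(intent)}/scenario`;
}

console.log(scenarioPath('Samson Test', 'Corporate', 'Acronym Lookup'));
// -> /agent/samson-test/domain/corporate/intent/acronym-lookup/scenario
```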
Developers would like to check the output of the converse endpoint to check what has been sent by the system for debugging purposes.
Currently, we are just training domains whenever an intent is added, updated or deleted.
As we move to name references, whenever a domain name, agent name, or entity name gets updated it is going to affect the references to the trained models.
I was thinking about creating an endpoint at `domain/{id}/train` to call manual training for a given domain. This endpoint can then be called when an agent, domain, or entity is updated or deleted.
This is a missing feature that is present in the wireframes. It helps developers navigate back and forth in the UI.
The following are console warnings or errors found while testing the agent_context feature.
Warning: Failed prop type: Invalid prop `value` supplied to `StringCell`.
Warning: A component contains an input of type checkbox with both checked and defaultChecked props. Input elements must be either controlled or uncontrolled (specify either the checked prop, or the defaultChecked prop, but not both). Decide between using a controlled or uncontrolled input element and remove one of these props. More info: https://fb.me/react-controlled-components
Uncaught TypeError: Cannot read property 'value' of null
VM11500:1 GET http://127.0.0.1:8000/domain/default/intent 400 (Bad Request)
Uncaught (in promise) TypeError: Cannot read property 'toJS' of undefined
Warning: flattenChildren(...): Encountered two children with the same key, `newEntity`. Child keys must be unique; when two children share a key, only the first child will be used.
warning.js:33 Warning: Failed prop type: Invalid prop `currentAgent` supplied to `ConversationBar`.
From May 23 on language-api, @wrathagom entered:
API.ai had the idea of context. We're looking for something similar here, but in a much more targeted manner.
Internal to the NLU agent we are looking for something that can have basic back and forth communication to gather missing entities or clarify unknown ones.
This one still needs a lot of design :)
Jun 14 @dcalvom replied
Researching on this...
An issue has been opened in RASA repo RasaHQ/rasa#423
Jun 14 @wrathagom replied
I'm already in talks with the RASA guys to try and get early access to their dialogue manager. Supposed to know something by end of this week.
Jul 3 @wrathagom replied
Just updating in case I/we forget. We've got access to the Rasa DM Repo, so this one is on hold pending research of their offerings.
Aug 4 @wrathagom replied
This is all we've been doing for the past two days, I will add a link here once I transcribe as much of our conversation as I can.
Though the implementation of this is going to get split into multiple, smaller issues.
From May 16 on language-api, @wrathagom entered:
At the moment we are just going to kick off training at each POST/PUT that modifies a model, but this is less than ideal. We should have control over when we train, so that all modifications can be completed first.
We also need to implement a training test and feedback loop. When a model is first created, we make sure that we can parse all intents and entities supplied in the training data. Whenever that model is modified, we need to make sure it still passes.
From Aug 31 on language-api, @wrathagom entered:
The `/parse` endpoint should respond with full intent and entity names so that I can avoid having to make extra calls to figure out what the UID values are. This parameter will modify the output to show names instead of UUIDs in the result of the parse endpoint.
We are just managing one context. We need to create persistent storage for the context. This is highly related to issue #24, but the important thing here is to add an identifier that lets the system identify the user that is interacting with the agent.
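A minimal in-memory sketch of per-user context storage keyed by a session identifier (illustration only; the issue calls for persistent storage, e.g. a database keyed the same way):

```javascript
// Keep one context history per session/user id instead of a single global
// context, so multiple users can converse with the same agent concurrently.
class ContextStore {
  constructor() {
    this.contexts = new Map(); // sessionId -> array of context frames
  }
  push(sessionId, frame) {
    if (!this.contexts.has(sessionId)) this.contexts.set(sessionId, []);
    this.contexts.get(sessionId).push(frame);
  }
  current(sessionId) {
    const frames = this.contexts.get(sessionId) || [];
    return frames.length ? frames[frames.length - 1] : null;
  }
}

const store = new ContextStore();
store.push('user-1', { scenario: 'Acronym Lookup', slots: {} });
store.push('user-2', { scenario: 'Pizza Order', slots: {} });
console.log(store.current('user-1').scenario); // -> "Acronym Lookup"
```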
Define sample agents to provide developers with a base for their new agents.
The entity colors are used to identify entities in the UI. Currently, the color used by an entity is randomly selected when the screen loads.
We need to add a color attribute to the entity. This is going to be the hex value of a color from the material color palette. The goal is that each entity in the agent uses a different color.
Check the wireframe that exists for this feature.
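A sketch of one possible deterministic assignment (an assumption, not the final design): cycle through material palette colors in entity order so colors are stable across page loads and distinct within an agent.

```javascript
// A few hex values from the material design palette (red, purple, indigo,
// light blue, teal, light green, amber, brown).
const MATERIAL_COLORS = [
  '#f44336', '#9c27b0', '#3f51b5', '#03a9f4',
  '#009688', '#8bc34a', '#ffc107', '#795548'
];

// Assign each entity a color attribute, cycling through the palette.
function assignColors(entities) {
  return entities.map((entity, i) => Object.assign({}, entity, {
    color: MATERIAL_COLORS[i % MATERIAL_COLORS.length]
  }));
}

const colored = assignColors([{ entityName: 'Acronyms' }, { entityName: 'Toppings' }]);
console.log(colored.map(e => e.color)); // -> ['#f44336', '#9c27b0']
```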
Using an `a` element will cause the page to refresh, dropping the current state. By using the `Link` component instead, the action is delegated to react-router, which will update the state and re-render the correct page. See this example.
From Sept 7 on language-api, @wrathagom entered:
I had a domain with only 2 intents, when I deleted one of them I don't believe that the domain classifier got re-trained.
Then when I tried to delete the 2nd the domain classifier tried to re-train, but failed because it only had one domain.
I'm still trying to verify this and reproduce it.
From Sept 5 on language-api, @wrathagom entered:
Right now trying to use /converse with just a single domain fails. The API returns:
{
"statusCode": 500,
"message": "An internal server error occurred",
"error": "Internal Server Error"
}
and the error message is:
nlu_1 | Debug: internal, implementation, error
nlu_1 | TypeError: Uncaught error: Cannot read property 'intent_ranking' of undefined
nlu_1 | at getBestRasaResult (/usr/src/app/modules/agent/tools/respond.agent.tool.js:52:48)
nlu_1 | at module.exports (/usr/src/app/modules/agent/tools/respond.agent.tool.js:147:28)
nlu_1 | at /usr/src/app/node_modules/async/dist/async.js:1268:19
nlu_1 | at nextTask (/usr/src/app/node_modules/async/dist/async.js:5274:14)
nlu_1 | at next (/usr/src/app/node_modules/async/dist/async.js:5281:9)
nlu_1 | at /usr/src/app/node_modules/async/dist/async.js:906:16
nlu_1 | at Async.parallel (/usr/src/app/modules/agent/controllers/converse.agent.controller.js:25:24)
nlu_1 | at /usr/src/app/node_modules/async/dist/async.js:3838:9
nlu_1 | at /usr/src/app/node_modules/async/dist/async.js:421:16
nlu_1 | at iterateeCallback (/usr/src/app/node_modules/async/dist/async.js:928:24)
It seems like the domain classifier should only kick in when `domain.count > 1`.
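A sketch of the suggested guard (assumed control flow, not the actual `converse.agent.controller.js`): skip the domain classifier entirely when there is a single domain, which avoids reading `intent_ranking` from a classifier that was never trained.

```javascript
// If only one domain exists there is nothing to rank: return it directly
// instead of calling the (untrainable) domain classifier.
function selectDomain(domains, classifyFn) {
  if (domains.length === 1) {
    return domains[0];
  }
  return classifyFn(domains);
}

const only = [{ name: 'Corporate' }];
console.log(selectDomain(only, () => { throw new Error('should not run'); }));
// -> { name: 'Corporate' }
```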
Oct 24 follow-up:
@dcalvom this one too, I was thinking you fixed it, but I can test if needed?
Reply from @dcalvom on Oct 24
Mmmm, I think it requires some testing
Enable the feature of delete for tags in user sayings, slots, and responses.
When I tried to parse against an agent, I got this message:
{
"message": "There aren't domains in the database please create a domain first",
"statusCode": 400,
"error": "Bad Request"
}
which wasn't too helpful, since there were indeed domains; the problem was that I had mis-copied the agent ID. So the error should have been:
{
"message": "There aren't domains in this agent please create a domain first",
"statusCode": 400,
"error": "Bad Request"
}
Right now I can create the same scenario over and over. I can POST the below payload and I just keep getting a higher and higher ID returned.
{
"agent": "Samson Test",
"domain": "Corporate",
"intent": "Acronym Lookup",
"scenarioName": "Acronym Lookup",
"slots": [
{
"slotName": "Acronym",
"entity": "Acronyms",
"isList": false,
"isRequired": true,
"textPrompts": [
"Which acronym do you want me to lookup?",
"I think you want me to look up an acronym for you, but I am not sure which one, can you tell me?"
],
"useWebhook": false
}
],
"intentResponses": [
"I will lookup the meaning of {Acronym} for you.",
"Give me one second and I will get that for you."
],
"useWebhook": false,
"webhookUrl": "http://localhost:7500"
}
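A sketch of a duplicate check (assumed logic): before creating a scenario, reject payloads whose (agent, domain, intent, scenarioName) tuple already exists, so repeated POSTs return an error instead of a new ID.

```javascript
// Treat (agent, domain, intent, scenarioName) as a composite key and
// refuse to create a second scenario with the same tuple.
function isDuplicate(existing, payload) {
  return existing.some(s =>
    s.agent === payload.agent &&
    s.domain === payload.domain &&
    s.intent === payload.intent &&
    s.scenarioName === payload.scenarioName
  );
}

const existing = [{ agent: 'Samson Test', domain: 'Corporate',
                    intent: 'Acronym Lookup', scenarioName: 'Acronym Lookup' }];
const payload = { agent: 'Samson Test', domain: 'Corporate',
                  intent: 'Acronym Lookup', scenarioName: 'Acronym Lookup' };
console.log(isDuplicate(existing, payload)); // -> true
```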
Whenever the user is on a screen that has a form and tries to navigate outside that screen a message of "Are you sure you want to leave?" should be shown.
Knowing where we're heading, we'll need converse and potentially parse to support `POST` requests. Let's go ahead and add them.
Don't worry about anything else I talked about, those will come in a different issue. Let's just add the endpoints for now.
It is necessary to implement an import/export endpoint, and the corresponding UI functionality, to let developers gain speed in the process of managing their agents.
There exist some cloud services that manage users like firebase and Auth0.
Auth0 offers free services for open source projects.
"Have an open source project? Get Auth0 for free with our Open Source Program."
Once the authentication issue is solved, then we need to manage the sessions in the DB for two purposes. The first one to allow the platform to be multi-user, and the second one to enable session management in the converse side.
It would be nice to recognize the entities in the text of the user sayings to help users with the task of tagging entities.
From Aug 9 on language-api, @wrathagom entered:
This one shouldn't be a high priority, but I don't want to get too far behind the latest release of Rasa.
I do know that they've changed the training data format quite a bit for entities and synonyms.
Whenever a resource is created, the result is plain text with the JSON response from the service.
The idea is to show some component that reflects the loading status along with some messages. For this, I was thinking of using the toast component and the preloader component.
https://react-materialize.github.io/#/toast
https://react-materialize.github.io/#/preloader
From Aug 17 on language-api, @dcalvom entered:
The current slots structure allows the slots to be used as lists. This would be useful in cases like pizza toppings or if you are asking for multiple machines at the same time or scenarios like that.
This feature hasn't been implemented in the converse API to manage the response. We need to implement it if we keep this structure.
Right now we are mostly calling an action, waiting for the response, and then either updating the UI or redirecting.
There are a few cases where another saga must be called, and I had to perform some workarounds, but there are some cases where this just created unexpected behavior.
According to the Redux page we could use `redux-thunk` or other libraries.
Redux Saga is supposed to be able to handle this, but so far I haven't been successful.
The purpose of this issue is to investigate how to perform this with sagas or other libraries. The implementation should also stop the rest of the actions after an error.
Here are the scenarios that I found so far:
`/` (currently the redirect doesn't work and sometimes the agent list doesn't populate)
From Sept 5 on language-api, @wrathagom entered:
Right now we can have two (or more) of any item (agents/domains/intents/entities) with the same name. This is because we are keying off of IDs instead of the name. Should we go through the effort to make agents unique, then domains inside agents unique, intents inside domains, etc.?
Oct 1 follow-up:
@malave how did you implement both names and _ids to be unique? We may need to do something similar here. Was it a lot of work? You only have two (maybe 3) indices, but @dcalvom has 5 or 6.
We moved the API from ES to Redis. We then need to adjust some of the changes that were made, both to improve the API and to take advantage of Redis features.
The history of the context is being managed in memory. This was developed in this way for test purposes.
Now that we have tested that it works, it is time to move this to a better place. A whole discussion about a parallel DB is taking place right now, given that ES has a delay that could affect the behavior of the agent.
We need to talk with @milutz, @wrathagom, and @ewoo to define what we are going to do to store the context.
If you use only one option in the dropdown lists that implement the `Input` component of react-materialize, a bug will arise.
For example, if you leave just one option in the timezone dropdown the bug will show up.
We need to fix this because it also appears when no agents have been created and you enter other screens.
A temporary fix was done by adding more options to the timezone dropdown. The problem in the other screens is fixed whenever you add a new agent.
From Sept 4 on language-api, @dcalvom entered:
Currently the test data in the repo isn't complete. By adding better test data, users would be able to test the API more easily.
Some system entities that are returned by Rasa and Duckling are already managed, but, this is not totally implemented yet.
We need to finish that implementation on the API side and after that design the UI for those entities.
Consider that these system entities need to be shown in the tagging process of user sayings.
This issue still requires design.
The idea is that the current `+ Create Agent` button in the side navbar, used to create agents, becomes a dropdown list where you can pick from the list of agents. Whenever you pick an agent, all the context changes to the data related to that agent.
Whenever the user changes the agent, a new screen with the details of the agent is shown. The details are the same as the ones present in the create agent screen.
With this change, all the dropdowns need to be removed from the create screens of the domain, intent, entity, and webhook.
The UI form for agents includes language, timezone, and description. Language is not too important right now, as we just support English, but we can leave it as is; it's not a big effort given that it is already implemented.
So, this issue consists of adding the language, timezone, and description attributes to the agent in order to use them whenever the parse endpoint is used.
Most of these changes are on the API side, but some minimal UI work is required.
From Aug 17 on language-api, @wrathagom entered:
We shouldn't let a user create a domain with fewer than 2 intents.
Aug 17 @dcalvom replied:
Isn't that a Rasa constraint? Is there something we can do... this is just my first thought; I'm going to research.
Aug 17 @wrathagom replied:
It's a spaCy requirement. A model with fewer than two intents is kind of pointless...
I'm saying our API shouldn't let it make it to Rasa; we know Rasa will error out, so we should prevent the creation of a domain with only a single intent.
Aug 17 @dcalvom replied:
Got it!
Sept 4 @wrathagom replied:
@dcalvom, do you handle this one in your new logic? Essentially only re-training if the number of intents is > 1?
Sept 4 @dcalvom replied:
Yes I did; errors are handled, but we need to handle domains with just one intent in the converse and parse endpoints.
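The validation discussed above could be sketched like this (assumed shape of the check, not the actual API code): block training when a domain has fewer than two intents, since the spaCy-backed classifier cannot train on a single class.

```javascript
// Reject training requests for domains with fewer than 2 intents, before
// the request ever reaches Rasa.
function canTrainDomain(domain) {
  if (!Array.isArray(domain.intents) || domain.intents.length < 2) {
    return { ok: false, error: 'A domain needs at least 2 intents to be trained' };
  }
  return { ok: true };
}

console.log(canTrainDomain({ intents: ['Acronym Lookup'] }).ok);          // -> false
console.log(canTrainDomain({ intents: ['Acronym Lookup', 'Greet'] }).ok); // -> true
```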