oklahomer / go-sarah
Simple yet customizable bot framework written in Go.
Home Page: https://pkg.go.dev/github.com/oklahomer/go-sarah/v4
License: MIT License
Currently, Runner creates one channel per Bot so that the Bot can pass incoming messages to Runner. Runner receives messages via this channel and enqueues a job to the worker, which may block if the worker's job queue is not large enough.
Raising the number of workers and the queue length can be a tentative solution, but it does not solve the fundamental issue of the blocking channel. Employ a non-blocking mechanism similar to the one implemented in #10.
Runner, not Bot, should preferably provide this kind of safety mechanism, since Bot may have multiple implementations by multiple developers.
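A non-blocking enqueue is commonly written in Go with a select statement that has a default case. The following is a minimal sketch of that idea, not the implementation referenced in #10; the names enqueue and ErrQueueOverflow are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrQueueOverflow is a hypothetical error returned when the queue has no room.
var ErrQueueOverflow = errors.New("job queue is full")

// enqueue tries to put the job into the queue and returns immediately
// with an error instead of blocking when the queue is full.
func enqueue(queue chan<- func(), job func()) error {
	select {
	case queue <- job:
		return nil
	default:
		// The send would block, so give up and let the caller decide what to do.
		return ErrQueueOverflow
	}
}

func main() {
	queue := make(chan func(), 1)
	fmt.Println(enqueue(queue, func() {})) // <nil>
	fmt.Println(enqueue(queue, func() {})) // job queue is full
}
```

With this shape, the caller can log or count the dropped job rather than stalling the goroutine that receives messages from the Bot.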
Basic design principle is as below:
Currently Runner.Run is a non-blocking method; change it to a blocking one that blocks until all registered Bots stop.
From the caller's perspective, the code should look somewhat like the below:
runnerStop := make(chan struct{})
go func() {
	runner.Run(ctx)
	runnerStop <- struct{}{}
}()

sig := make(chan os.Signal, 1)
signal.Notify(sig, os.Interrupt)
signal.Notify(sig, syscall.SIGTERM)

// Block until a signal comes
select {
case <-sig:
	// Stop by interruption
case <-runnerStop:
	// Stop because all bots stopped
}
The Command interface obligates developers to implement an InputExample method that returns a string representation of an input example or a help message. This, however, is not utilized, since neither the Bot interface nor defaultBot/Adapter provides a way to receive a "help" message from users and return a help message.
The spec below should be included:
MAY receive help message, and return list of input examples for registered Commands.
MUST receive help message and return list of input examples for registered Commands IF supplied Adapter implementation requires to do so.
Commands, iterate over them to list all input examples, and send them back to the user. An Adapter implementation should not depend on the Commands type; instead, it should receive a slice of input examples as part of Output via Adapter.SendMessage.
The current constructor for defaultLogger is defined for internal use as below:
func newDefaultLogger() Logger {
	return &defaultLogger{
		logger: log.New(os.Stdout, "sarah ", log.LstdFlags|log.Llongfile),
	}
}
This is called on package load, and the returned Logger is set as the default.
To modify the behavior, a developer may implement a struct that satisfies the Logger interface and set its instance via SetLogger, which is a somewhat cumbersome task when the only goal is to switch the wrapped standard logger.
Define func NewWithStandardLogger(l *log.Logger) Logger that receives Go's standard logger and sets it to defaultLogger via &defaultLogger{logger: l}.
CommandBuilder and its related registration methods are provided; a way to register a Command implementation directly is not.
The easier and more common way to register a command is to use CommandBuilder -- provide the required arguments step by step and let Runner build it -- but if a developer prefers to define her own Command implementation type and initialize it at her preferred timing all by herself, there is no reason to prohibit this.
Add Runner.RegisterCommand or a similarly named instance method whose signature is func (*Runner) RegisterCommand(BotType, Command). This method adds the given instance of a Command implementation to the stored Bot instance that corresponds to the given BotType.
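A minimal sketch of such a method follows. BotType, Command, and Bot are reduced stand-ins for sarah's types, and returning an error for an unknown BotType is a choice of this sketch that goes beyond the signature proposed above.

```go
package main

import (
	"fmt"
	"strings"
)

// BotType, Command, and Bot are simplified stand-ins (assumptions).
type BotType string

type Command interface {
	Identifier() string
	Match(input string) bool
}

type Bot interface {
	BotType() BotType
	AppendCommand(Command)
}

type simpleBot struct {
	botType  BotType
	commands []Command
}

func (b *simpleBot) BotType() BotType        { return b.botType }
func (b *simpleBot) AppendCommand(c Command) { b.commands = append(b.commands, c) }

type Runner struct {
	bots []Bot
}

// RegisterCommand adds the given Command to the stored Bot that corresponds to botType.
func (r *Runner) RegisterCommand(botType BotType, command Command) error {
	for _, bot := range r.bots {
		if bot.BotType() == botType {
			bot.AppendCommand(command)
			return nil
		}
	}
	return fmt.Errorf("no bot registered for type %q", botType)
}

// echoCommand is a hand-written Command implementation.
type echoCommand struct{}

func (echoCommand) Identifier() string      { return "echo" }
func (echoCommand) Match(input string) bool { return strings.HasPrefix(input, ".echo") }

func main() {
	bot := &simpleBot{botType: "slack"}
	runner := &Runner{bots: []Bot{bot}}
	fmt.Println(runner.RegisterCommand("slack", echoCommand{}), len(bot.commands))
	fmt.Println(runner.RegisterCommand("xmpp", echoCommand{}))
}
```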
Travis CI is failing because a dependency dropped Go 1.6 support. There is a possible workaround to remove the dependency and still support Go 1.6. However, Go 1.11 is coming, and it seems reasonable to just remove 1.6 support. After all, the dependency is located under "x/", which is *semi-*standard, and even this is removing 1.6 support.
Ever since go-sarah's initial release two years ago, this has been a huge help to the author's ChatOps. During these years, some minor improvements and bug fixes mostly without interface changes were introduced. However, a desire to add some improvements that involve drastic interface changes is still not fulfilled, leaving the author in a dilemma of either maintaining the API or adding breaking changes.
Version 2 development is one such project to introduce API changes that improve go-sarah as a whole. Developers who wish to keep using the previous APIs should use v1.X releases. Issues and pull requests with a v2 label should be included in the v2 release unless otherwise closed individually.
In this project, sarah.Runner takes care of a subordinate Bot's critical state. A sarah.Bot implementation calls a function to escalate its critical state to sarah.Runner so that sarah.Runner can safely cancel the failing sarah.Bot's context and notify administrators of the state via one or more sarah.Alerter implementations.
Lines 565 to 584 in a7408dc
In this way, sarah.Bot's implementation can be separated from its state handling and alert sending; sarah.Bot's lifecycle can be solely managed by sarah.Runner. So the responsibility of a sarah.Bot implementation and its developer is minimal.
However, only BotNonContinuableError is received and handled by sarah.Runner; other noteworthy states are not handled or notified to developers. Define an error type that represents a noteworthy state change and let sarah.Runner handle it. This handling should not cancel the sarah.Bot context but only notify administrators of the event.
Sarah provides a mechanism to update the configuration struct for both Command and ScheduledTask when the corresponding configuration file is updated. A race condition may occur in the situations below:
Command or ScheduledTask replaces the existing one.
The first one occurs because updated configuration values are set to the existing config instance; the config struct is not cloned or instantiated every time a live update runs. This is by design. CommandPropsBuilder.ConfigurableFunc and ScheduledTaskPropsBuilder.ConfigurableFunc are designed to receive a config instance instead of reflect.Type so that a struct instance with non-zero-value fields can be passed. e.g. 1) NewConfig returns a config with pre-set default values, and other fields are read from the configuration file on Command (re-)build. 2) A DB connection is set to the config struct beforehand and is expected to stay while other fields with json/yaml tags are updated on live update. So cloning or instantiating the config struct on live update is not an option in this case. In addition, cloning or deep copying can be complicated when a field value is an IO-related object such as a file handle. A locking mechanism seems appropriate.
The second case is rather a typical read/write collision. The Commands type is merely an extension of a slice, and therefore a locking mechanism is required when replacing its element.
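For the second case, a sync.RWMutex around the slice is one conventional option. The sketch below assumes a struct-based Commands type with Append and Find methods; these names and the struct layout are illustrative and differ from the slice-based type described above.

```go
package main

import (
	"fmt"
	"sync"
)

// Command is a reduced stand-in for sarah.Command (assumption).
type Command interface {
	Identifier() string
	Execute(input string) string
}

// Commands guards its slice with a RWMutex so that a replacement on live
// update cannot collide with concurrent reads.
type Commands struct {
	mu   sync.RWMutex
	cmds []Command
}

// Append replaces an existing Command with the same identifier, or appends a new one.
func (c *Commands) Append(cmd Command) {
	c.mu.Lock()
	defer c.mu.Unlock()
	for i, existing := range c.cmds {
		if existing.Identifier() == cmd.Identifier() {
			c.cmds[i] = cmd
			return
		}
	}
	c.cmds = append(c.cmds, cmd)
}

// Find returns the Command with the given identifier under a read lock.
func (c *Commands) Find(id string) (Command, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	for _, cmd := range c.cmds {
		if cmd.Identifier() == id {
			return cmd, true
		}
	}
	return nil, false
}

type greet struct{ word string }

func (g greet) Identifier() string          { return "greet" }
func (g greet) Execute(input string) string { return g.word + ", " + input }

func main() {
	cmds := &Commands{}
	cmds.Append(greet{word: "Hello"})

	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(2)
		go func() { defer wg.Done(); cmds.Append(greet{word: "Hi"}) }()
		go func() {
			defer wg.Done()
			if cmd, ok := cmds.Find("greet"); ok {
				_ = cmd.Execute("sarah")
			}
		}()
	}
	wg.Wait()

	cmd, _ := cmds.Find("greet")
	fmt.Println(cmd.Execute("sarah")) // Hi, sarah
}
```

The first case could be approached in a similar way, with a lock held while the config struct is being overwritten and while the Command reads it, though that is not shown here.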
Is there an xmpp adapter for sarah? I can't see one.
Writing one would probably be a little beyond my current competence with Go, but I may have a go OR (more likely) pay a bounty for this. @oklahomer would you be interested?
I assume the task would involve taking the /slack folder and creating an /xmpp package with the same API but using an xmpp library rather than golack, so we then just use the adapter like the snippet below, and everything else in Sarah would work the same as with the slack adapter. Do I understand this correctly?
func setupXmpp(config *slack.Config, storage sarah.UserContextStorage) (sarah.Bot, error) {
	adapter, err := xmpp.NewAdapter(config)
	if err != nil {
		return nil, err
	}
	return sarah.NewBot(adapter, sarah.BotWithStorage(storage))
}
Define an interface that alerts on critical Bot state and let Runner have one or more.
The interface below or a similar one should do the work:
type AlertSender interface {
	Send(BotType, error)
}
Runner can have more than one AlertSender by declaring something like []AlertSender to utilize all possible means of notifying the administrator.
To start with, implement one for LINE Notify.
It has a minimal interface, yet is a powerful notification tool.
https://notify-bot.line.me/en/
Not a big issue, since the main process usually also ends when Runner stops, but one of the purposes of this project is to prepare the author to write Go-ish code, which means this has to be fixed as part of the training.
On runConfigWatcher, a new goroutine is created, and this listens to the Runner context's context.Done for an exit cue.
When a supervised directory is added via dirWatcher.watch, a new goroutine is created each time, and it listens to the corresponding bot context's context.Done. When the bot context is canceled, this goroutine sends a signal so that dirWatcher can unsubscribe from the Bot-specific configuration directory. The sent signal is consumed by the goroutine created in runConfigWatcher, which exits on Runner stop.
Since Bots' contexts are derived from the Runner context, Runner context cancellation is propagated to all bot contexts, which leads to simultaneous unsubscription cues and goroutine exits.
Below is simplified code with some comments.
func (dw *dirWatcher) receiveEvent(runnerCtx context.Context) {
	...
	for {
		select {
		case <-runnerCtx.Done():
			// Exits on Runner cancellation
			dw.fsWatcher.Close()
			log.Info("stop subscribing to file system event due to context cancel")
			return
		case botType := <-dw.cancel:
			// Comes when bot context is canceled
		}
	}
}

func (dw *dirWatcher) watch(botCtx context.Context, botType BotType, path string, callback func(string)) error {
	...
	go func() {
		<-botCtx.Done()
		log.Infof("stop directory watch due to context cancel: %s.", botType)
		// Tries to send unsubscription cue, but the Runner context may be closed at this point, and hence the target channel may block.
		dw.cancel <- botType
	}()
	...
}
Currently, user context is stored in memory with go-cache, where the key is the user ID and the value is the function to be executed on the next user input. This cache mechanism is easy to use, but it has both pros and cons.
Cached values are stored in a simple map whose value type is interface{}. Since Go supports first-class functions, a developer can simply put an arbitrary function that satisfies the func(context.Context, Input) (*CommandResponse, error) signature. This function can be built on the fly on user access and may refer to local variables that were declared outside of its scope.
Since this is stored in process memory space, the cached user contexts cannot be shared among multiple processes. This matters when a bot is running on multiple processes, e.g. when the bot serves as an HTTP server and is balanced over multiple servers. In most cases a bot runs on a single process as a "client" to receive user inputs via HTTP streaming, WebSocket, or another form of single connection. But some messenger platforms send user inputs via HTTP requests to the bot server. To scale the bot server architecture, a bot may consist of multiple servers, and hence the cache must be shared over multiple processes.
To share the cache over multiple processes, a developer should be able to store it in a KVS such as Redis, where the key is the user ID and the value contains a function ID and a function argument.
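The "function ID plus argument" idea can be illustrated as below. Everything here is hypothetical: the KVS interface is an in-memory stand-in for a store such as Redis, and ContextualFunc takes a plain string input instead of the real signature so that the sketch stays short.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// ContextualFunc continues a conversation from a serialized argument and the next input.
type ContextualFunc func(arg json.RawMessage, input string) (string, error)

// StoredContext is the serializable value kept in the KVS for each user.
type StoredContext struct {
	FuncID string          `json:"func_id"`
	Arg    json.RawMessage `json:"arg"`
}

// KVS is a stand-in for a shared key-value store.
type KVS interface {
	Set(key string, value []byte) error
	Get(key string) ([]byte, error)
}

type memoryKVS map[string][]byte

func (m memoryKVS) Set(key string, value []byte) error { m[key] = value; return nil }
func (m memoryKVS) Get(key string) ([]byte, error) {
	v, ok := m[key]
	if !ok {
		return nil, errors.New("no stored context for " + key)
	}
	return v, nil
}

type guessArg struct {
	Answer string `json:"answer"`
}

func guess(arg json.RawMessage, input string) (string, error) {
	var a guessArg
	if err := json.Unmarshal(arg, &a); err != nil {
		return "", err
	}
	if input == a.Answer {
		return "Correct!", nil
	}
	return "Try again.", nil
}

// funcRegistry maps function IDs to functions. Every process must register the same IDs.
var funcRegistry = map[string]ContextualFunc{"guess": guess}

// store serializes the function ID and its argument under the user ID.
func store(kvs KVS, userID, funcID string, arg interface{}) error {
	rawArg, err := json.Marshal(arg)
	if err != nil {
		return err
	}
	value, err := json.Marshal(StoredContext{FuncID: funcID, Arg: rawArg})
	if err != nil {
		return err
	}
	return kvs.Set(userID, value)
}

// resume loads the stored context and executes the registered function.
func resume(kvs KVS, userID, input string) (string, error) {
	value, err := kvs.Get(userID)
	if err != nil {
		return "", err
	}
	var stored StoredContext
	if err := json.Unmarshal(value, &stored); err != nil {
		return "", err
	}
	fn, ok := funcRegistry[stored.FuncID]
	if !ok {
		return "", fmt.Errorf("unknown function id: %s", stored.FuncID)
	}
	return fn(stored.Arg, input)
}

func main() {
	kvs := memoryKVS{}
	if err := store(kvs, "user1", "guess", guessArg{Answer: "42"}); err != nil {
		panic(err)
	}
	fmt.Println(resume(kvs, "user1", "42"))
}
```

The trade-off compared with the in-memory approach is that closures can no longer capture arbitrary local variables; whatever the next step needs must be serializable.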
Currently, the worker reports its queue length via worker.superviseQueueLength with log.Info. Provide a notifier interface and the ability to set/switch the notifier so that developers can define their desired behavior, e.g. export the worker queue length to Prometheus.
New.*Config returns a config struct with default setting values.
Developers may feed this config struct's pointer to json.Unmarshal or yaml.Unmarshal to update one or more settings. The default setting stays for a given key if the corresponding setting is not provided by the json/yaml input.
Two steps -- sarah.NewRunner() and Runner.Run() -- are required to initialize and execute sarah.Runner with the current implementation, which is somewhat error-prone and cumbersome.
First, because the initialization and execution of Runner are separated, there is a risk that developers may call the Run() method multiple times without regard to Runner's state. Sarah's worker solves this problem with a simple approach as below:
Lines 95 to 98 in a7408dc
Run() does not belong to a Worker instance, but belongs to the workers package as a global function. When Run() is called, this package initializes a new worker instance, runs it, and returns it as a workers.Worker interface. This simplifies the initialization process and, furthermore, prevents developers from calling the execution method of an already running worker. sarah.Runner can also benefit from employing the same approach to become less error-prone and cumbersome.
One more modification is to stash the RunnerOption values in a package-scope variable of sarah and refer to them on execution. Currently, developers must pass options to sarah.NewRunner() explicitly, so the code before sarah.Runner initialization tends to become relatively long. The use of sarah.RunnerOptions to stash a group of sarah.RunnerOption helped with this, but there was still room to improve:
Lines 106 to 140 in a7408dc
However, this also leads to a new limitation. A single process cannot initialize and run multiple sarah.Runner instances with different settings, because RunnerOption values are going to be stored in a package-scope variable and referenced on runner execution. The author thinks this is acceptable because multiple runner executions can simply be achieved by running multiple processes with one runner in each of them.
In exchange for such a minor limitation, option registration can be much easier, as below:
package hello

func init() {
	// Directly register sarah.CommandProps
	sarah.RegisterCommandProps(
		sarah.NewCommandPropsBuilder().
			BotType(slack.SLACK).
			Identifier("hello").
			InputExample(".hello").
			MatchPattern(regexp.MustCompile(`\.hello`)).
			Func(func(_ context.Context, _ sarah.Input) (*sarah.CommandResponse, error) {
				return slack.NewStringResponse("Hello!"), nil
			}).
			MustBuild(),
	)
}

package main

import (
	// Since sarah.CommandProps is directly registered in hello package, this works by simply importing this.
	_ "hello"
)

func main() {
	config := sarah.NewConfig()
	sarah.Run(context.Background(), config)
}
When Bot runs, a goroutine that receives the Bot's critical errors is created. This goroutine listens to neither the Bot context nor the Runner context, so it never exits. This is the original issue.
One more concern is that when a critical error is given and this goroutine calls stopBot(), the running Bot receives context.Done and tries to finish all ongoing tasks. At this point, the stopping Bot may feed another critical error to this channel. If the listening goroutine has already exited, the send to the channel blocks, and hence Bot cancellation may block accordingly.
A call to a *regexp.Regexp method locks with sync.Mutex and hence has a performance impact on concurrent execution. Use Regexp.Copy to create a copied *regexp.Regexp instance where concurrent executions from multiple goroutines are expected.
e.g. Command.Match
The current sarah.Input interface is defined as below:
// Input defines interface that each incoming message must satisfy.
// Each Bot/Adapter implementation may define customized Input implementation for each messaging content.
//
// See slack.MessageInput.
type Input interface {
	// SenderKey returns the text form of sender identifier.
	// This value can be used internally as a key to store the sender's conversational context in UserContextStorage.
	// Generally, when connecting chat service has the concept of group or chat room,
	// this sender key should contain the group/room identifier along with user identifier
	// so the user's conversational context is only applied in the exact same group/room.
	//
	// e.g. senderKey := fmt.Sprintf("%d_%d", roomID, userID)
	SenderKey() string

	// Message returns the text form of user input.
	// This may return empty string when this Input implementation represents non-text payload such as photo,
	// video clip or file.
	Message() string

	// SentAt returns the timestamp when the message is sent.
	// This may return a message reception time if the connecting chat service does not provide one.
	// e.g. XMPP server only provides timestamp as part of XEP-0203 when delayed message is delivered.
	SentAt() time.Time

	// ReplyTo returns the sender's address or location to be used to reply message.
	// This may be passed to Bot.SendMessage() as part of Output value to specify the sending destination.
	// This typically contains chat room, member id or mail address.
	// e.g. JID of XMPP server/client.
	ReplyTo() OutputDestination
}
This was meant to be a representation of an incoming event in general, which means the implementation could be a text message, photo, video, or any other form that may or may not include text. As a matter of fact, recent chat services employ many messaging events that do not include any text message, and indeed the document for Message() says "This may return empty string when this Input implementation represents non-text payload such as photo, video clip or file." It would then be more natural to remove Message(), as the incoming event may not represent a text-based event at all.
However, sarah.Command would then need the extra effort of a type assertion to see whether the incoming event is a text-based one before extracting the text message from the event. This should be planned with care.
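The extra type assertion could look like the sketch below. TextInput and extractText are hypothetical names, and Input is cut down to a single method to keep the example short.

```go
package main

import "fmt"

// Input is a reduced stand-in for sarah.Input without Message() (assumption).
type Input interface {
	SenderKey() string
}

// TextInput is a hypothetical interface for events that carry a text message.
type TextInput interface {
	Input
	Message() string
}

type textMessage struct{ sender, text string }

func (m textMessage) SenderKey() string { return m.sender }
func (m textMessage) Message() string   { return m.text }

type photoMessage struct{ sender, url string }

func (m photoMessage) SenderKey() string { return m.sender }

// extractText returns the text of the input and true when the input is text-based.
func extractText(input Input) (string, bool) {
	t, ok := input.(TextInput)
	if !ok {
		return "", false
	}
	return t.Message(), true
}

func main() {
	inputs := []Input{
		textMessage{sender: "user1", text: ".hello"},
		photoMessage{sender: "user1", url: "https://example.com/cat.png"},
	}
	for _, input := range inputs {
		text, ok := extractText(input)
		fmt.Printf("%q %t\n", text, ok)
	}
}
```

A Command that only handles text would call such a helper first and return early for other kinds of events.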
In the current implementation, Command may or may not return *CommandResponse. The current defaultBot.Respond checks the return value of Command, and if the response is nil, it returns immediately with no further execution; if a response is given, it expects CommandResponse.Content to be non-nil and proceeds to handle the nil-able CommandResponse.Next.
This is designed in such a way based on the assumption below:
If Command tries to return CommandResponse.Next to let the user continue with the current user context, the command should also return some message content via CommandResponse.Content so that the user can be aware of this behavior.
However, a request was made for the ability to return a nil CommandResponse.Content with a non-nil CommandResponse.Next. In this way, the bot returns no message to the user but still lets the user stay in the current "user context" and continue from where CommandResponse.Next specifies. The author thinks this may damage user experience, since the user sees no message while she stays in a context specified by the currently executed Command, but there is no reason to deny this request as a bot framework's functionality. Developers should be aware of the risk, though.
The plan is to add sarah.NewSuppressedResponseWithNext, where Next, a ContextualFunc, is the only argument, to indicate a nil response with a continuing context. This may be added right under the sarah package since it has no typed CommandResponse.Content, and hence can be used by any Bot/Command implementation.
func NewSuppressedResponseWithNext(next ContextualFunc) *CommandResponse {
	return &CommandResponse{
		Content: nil,
		Next:    next,
	}
}
Instead of simply employing the original github.com/robfig/cron, this project uses a forked and customized version of it. The customization was required to assign an identifier to each registered cron job so that the job can be removed or replaced on live configuration update. This customization, however, lowers maintainability, and hence a suitable replacement with a well-maintained equivalent package is wanted.
The replacing package must have abilities to:
When two or more packages meet the above requirements and have equal popularity, the one with minimal features and a simple implementation should have higher priority.
For God's sake, please.
Not for everybody, but basically for author himself.
This is to make its API organized.
With the current scheduler implementation, a logger is not explicitly set for robfig/cron. It required Golang's standard logger -- *log.Logger -- to be set in the past, but Sarah's logger did not necessarily hold a standard logger instance, as this was a generalized interface and the implementation could vary. Therefore, go-sarah could not pass a developer's preferred logging method to robfig/cron.
Lines 54 to 59 in af6e976
The change introduced in robfig/cron#161 enables developers to pass a cron.Logger implementation. With this change, go-sarah can implement a cron.Logger interface that wraps go-sarah's logger and proxies calls to the underlying logger.
Note that a later commit -- robfig/cron@43863da -- changed the cron.Logger interface. This change also provides a guide as below:
robfig/cron@43863da#diff-a20b1b3b4b2bca5bef2b853a6e3f19def513381f9c7ee68d6979f0435c885dcfR161-R168
Consider passing a log function with cron.VerbosePrintfLogger.
Rename the directory and let each example project have its own go.mod file to better manage project-specific dependencies.
With the current implementation, Match() is called against a user's input to see if the sarah.Command should be executed. This not only checks whether command invocation text such as "/hello" is included in the input, but also works as an easy authentication mechanism, because Match() can internally check who the input sender is and in what chat group the event takes place. e.g. Only return true to execute the command when an administrator inputs a "/reboot" command in the #admin group; return false when other users input the command in other groups.
On the other hand, when a help command is given and input examples are returned to users, go-sarah currently returns the input examples for all registered commands. So even though certain commands are hidden from users for execution, their input examples can be displayed to all users in all groups. A workaround is to modify the example text to include a note such as "This command reboots the chatbot instance. This is only available for administrator users." However, this is not helpful when a command should be entirely hidden from users in some groups. e.g. NSFW commands should never be exposed in a public group.
The design should be defined with care because this involves both sarah.Command implementations given by developers and those built with sarah.CommandProps.
A new error handling library, "xerrors," is now available and is going to be incorporated into the standard library in Go 1.13. Although some other libraries, including "pkg/errors," enable developers to propagate an original error value in a hierarchical manner, go-sarah's author has been wondering whether such a non-standard library should be employed. This project is a third-party library for most developers, and it is usually not preferred that such a library includes additional dependencies. Now that "xerrors" is officially confirmed to become a standard library in the near future and is ported to older Golang versions, the author believes it is safe to employ it to express errors in a more informative manner.
Use xerrors.Errorf("some message: %w", err) and xerrors.New("some message") for error initialization instead of fmt.Errorf() and errors.New() where it is appropriate.
The name SentAt() is closely tied to the act of message sending. If this name can be replaced with a more general one such as TimeStamp(), a TimeStamper interface can be introduced. Such an interface may be used to represent an event with a particular time of occurrence. This could also make things easier for developers, because any event associated with a timestamp can be treated in the same way to retrieve it.
With the current definitions of some Config structs, fields are typed as time.Duration to express time intervals. The underlying type of time.Duration is merely an int64 and its minimal unit is a nanosecond, so JSON/YAML mapping is always painful.
type Example struct {
	RetryInterval time.Duration `json:"retry_interval"`
}

{
	"retry_interval": 1000000000 // 1 sec!
}
With the above example, although the type definition correctly reflects the author's intention, the JSON structure and its value significantly lack readability. A human-readable value such as the one time.ParseDuration accepts should also be allowed.
Never. This breaks backwards compatibility.
To convert a time.ParseDuration-friendly value to time.Duration, let Config structs implement json.Unmarshaler and yaml.Unmarshaler. Some improvements may be applied, but the example below seems to work.
type Config struct {
	Token          string        `json:"token" yaml:"token"`
	RequestTimeout time.Duration `json:"timeout" yaml:"timeout"`
}

func (config *Config) UnmarshalJSON(raw []byte) error {
	tmp := &struct {
		Token string `json:"token"`
		// Keep the raw value so that both a number and a string such as "1s" are accepted.
		RequestTimeout json.RawMessage `json:"timeout"`
	}{}
	err := json.Unmarshal(raw, tmp)
	if err != nil {
		return err
	}
	config.Token = tmp.Token
	// Strip the surrounding quotes of a JSON string, if any.
	timeout := strings.Trim(string(tmp.RequestTimeout), `"`)
	i, err := strconv.Atoi(timeout)
	if err == nil {
		// A bare integer is treated as nanoseconds.
		config.RequestTimeout = time.Duration(i)
	} else {
		duration, err := time.ParseDuration(timeout)
		if err != nil {
			return fmt.Errorf("failed to parse timeout field: %s", err.Error())
		}
		config.RequestTimeout = duration
	}
	return nil
}
func (config *Config) UnmarshalYAML(unmarshal func(interface{}) error) error {
	tmp := &struct {
		Token          string `yaml:"token"`
		RequestTimeout string `yaml:"timeout"`
	}{}
	err := unmarshal(tmp)
	if err != nil {
		return err
	}
	config.Token = tmp.Token
	i, err := strconv.Atoi(tmp.RequestTimeout)
	if err == nil {
		// A bare integer is treated as nanoseconds.
		config.RequestTimeout = time.Duration(i)
	} else {
		duration, err := time.ParseDuration(tmp.RequestTimeout)
		if err != nil {
			return fmt.Errorf("failed to parse timeout field: %s", err.Error())
		}
		config.RequestTimeout = duration
	}
	return nil
}
Currently log levels are defined as Debug, Info, Warn, and Error. However there is no option to filter outputs by given level.
Currently, an interface called watchers.Watcher and its implementation are provided to subscribe to changes of files under a predefined directory. When the configuration file of a Command or ScheduledTask under the directory is updated, this implementation calls a callback function so that the latest content of the file is read and mapped to a struct, and then the corresponding Command/ScheduledTask is updated.
With a distributed server architecture, however, file-based configuration management is sometimes less preferred. Multiple servers may subscribe to a centralized configuration management system and reflect changes in a more real-time manner. HashiCorp's Consul, LINE's Central Dogma, or a similar system can be used for such a purpose.
The current watchers.Watcher should be redefined so that the current implementation and other implementations that subscribe to systems such as those introduced above can co-exist.
Create Mattermost adapter so that the project can be used on premises with Mattermost.
This may be easy through an adaptation of existing go-mattermost integrations.
i.e. https://github.com/mattermost/mattermost-bot-sample-golang
Other than the core functionality, go-sarah hosts some general utilities.
Those are now located at https://github.com/oklahomer/go-kasumi.