marceloboeira / bojack
🐴 The unreliable key-value store
Home Page: http://medium.com/@marceloboeira/why-you-should-build-your-own-nosql-database-9bbba42039f5
License: MIT License
I'm having trouble compiling bojack for my system:
Linux lawliet 4.8.0-51-generic #54~16.04.1-Ubuntu SMP Wed Apr 26 16:00:28 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
Error in src/bojack/bootstrap.cr:3: instantiating 'BoJack::CLI:Class#run(Array(String))'
BoJack::CLI.run(ARGV)
^~~
in src/bojack/cli.cr:10: instantiating 'Commander::Command:Class#new()'
cli = Commander::Command.new do |command|
^~~
in src/bojack/cli.cr:10: instantiating 'Commander::Command:Class#new()'
cli = Commander::Command.new do |command|
^~~
in src/bojack/cli.cr:71: instantiating 'Commander::Commands#add()'
command.commands.add do |command|
^~~
in src/bojack/cli.cr:71: instantiating 'Commander::Commands#add()'
command.commands.add do |command|
^~~
in src/bojack/cli.cr:91: instantiating 'BoJack::Console:Class#new(String, (Int32 | Int64))'
BoJack::Console.new(options.string["hostname"], options.int["port"]).start
^~~
instance variable '@client' of BoJack::Console must be BoJack::Client, not Nil
Error: instance variable '@client' is initialized inside a begin-rescue, so it can potentially be left uninitialized if an exception is raised and rescued
Makefile:6: recipe for target 'build' failed
make: *** [build] Error 1
The compiler alerts about a problem with:
@client = BoJack::Client.new(@hostname, @port) ## src/bojack/console.cr ##
Since the variable is initialized inside a begin block, the compiler assumes it could be used before being initialized; it does not take the exit -1 in the rescue into account.
Just to get past the problem and move on, I changed the line inside the begin from @client = BoJack::Client.new(@hostname, @port) to BoJack::Client.new(@hostname, @port), so that it only tests the connection. After the rescue, assuming that the connection succeeded, I assign the variable with the same parameters used in the test: @client = BoJack::Client.new(@hostname, @port)
For example:
def initialize(@hostname : String = "127.0.0.1", @port : Int8 | Int16 | Int32 | Int64 = 5000)
  begin
    BoJack::Client.new(@hostname, @port)
  rescue exception
    puts exception.message
    exit -1
  end
  @client = BoJack::Client.new(@hostname, @port)
end
This solved my problem despite knowing that it isn't the best solution.
So, anyway, I hope this helps in some way.
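A possible alternative that opens only one connection is to assign the value of the whole begin/rescue expression. This is a minimal sketch, not the project's code: FakeClient and Console are hypothetical stand-ins so that it runs on its own. Because the rescue branch ends in exit, which never returns, the compiler should be able to infer that @client is never nil.

```crystal
# FakeClient is a hypothetical stand-in for BoJack::Client.
class FakeClient
  getter hostname : String
  getter port : Int32

  def initialize(@hostname : String, @port : Int32)
    # Simulates a connection failure for an impossible port.
    raise ArgumentError.new("invalid port: #{@port}") unless 0 < @port < 65_536
  end
end

class Console
  getter client : FakeClient

  def initialize(@hostname : String = "127.0.0.1", @port : Int32 = 5000)
    # The assignment sits outside the begin block and receives its value.
    # The rescue branch exits, so only a FakeClient can ever be assigned.
    @client = begin
      FakeClient.new(@hostname, @port)
    rescue exception
      puts exception.message
      exit 1
    end
  end
end

console = Console.new
puts console.client.port
```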
The set command needs 2 params, key and value. Any attempt to execute it without both params results in this:
e.g.: set a
Server log
Unhandled exception:
Missing hash key: :value (KeyError)
[4516704689] *raise<KeyError>:NoReturn +81
[4516754682] *Hash(Symbol, Array(String) | String)@Hash(K, V)#[]<Symbol>:(Array(String) | String) +3818
[4516734926] ~procProc(Nil)@./src/bojack/server.cr:29 +9566
[4516567462] *Fiber#run:(Int64 | Nil) +54
Already working on it
Hey @marceloboeira
I understood what you were trying to achieve in commit 555a0e4. You wanted to let each command identify itself and hold its own keyword, avoiding the long switch condition. I think it makes things a little magic, and you may forget to add the keyword when creating a new command. (It has already happened :P a8d3403)
The common implementation of the Factory Pattern generally uses a Map to hold all the instances. I think it makes the algorithm cleaner, and the lookup costs O(1) instead of O(n). It also keeps the whole logic in the factory, so if you need to know which keyword calls a specific command, there is one place to look.
So I propose the following implementation:
COMMANDS = {
  "get"    => Bojack::Commands::Get,
  "set"    => Bojack::Commands::Set,
  "delete" => Bojack::Commands::Delete,
  "size"   => Bojack::Commands::Size,
}

# Returns a new instance of the command registered for the keyword,
# or nil when the keyword is unknown.
def self.from(keyword) : Bojack::Commands::Command?
  COMMANDS[keyword].new if COMMANDS.has_key?(keyword)
end
What do you think @mauricioabreu, @hugoabonizio ?
Create a BoJack::Logger singleton class that accepts the same options as the standard Logger but holds a single Logger instance, which can be accessed from every part of the project without injecting the dependency into each component that needs it.
This class would probably know about the formatter and the file outputs, implemented in #22 #23 #26.
This way all this code goes into a single class, instead of the workarounds we currently have.
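The singleton idea could look roughly like the sketch below. AppLogger, configure and the message prefix are hypothetical names chosen for illustration, and the class writes to a plain IO instead of wrapping the standard Logger, so that the example does not depend on a particular Crystal version.

```crystal
# AppLogger is a hypothetical name for the proposed singleton.
class AppLogger
  @@instance : AppLogger? = nil

  # Returns the shared instance, creating a STDOUT logger on first use.
  def self.instance : AppLogger
    @@instance ||= new(STDOUT)
  end

  # Replaces the shared instance, e.g. once the CLI options are parsed.
  def self.configure(io : IO) : AppLogger
    @@instance = new(io)
  end

  def initialize(@io : IO)
  end

  def info(message : String) : Nil
    @io.puts "[bojack][INFO] #{message}"
  end
end

# Any component can log without receiving the logger as a dependency.
buffer = IO::Memory.new
AppLogger.configure(buffer)
AppLogger.instance.info("server started")
puts buffer.to_s
```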
Error reading file: Connection reset by peer (Errno)
0x103ae3250: *raise<Errno>:NoReturn at ??
0x103b321c6: *TCPSocket+@IO::FileDescriptor#unbuffered_read<Slice(UInt8)>:Int64 at ??
0x103b3133d: *TCPSocket+@IO#gets:(String | Nil) at ??
0x103b3101f: ~procProc(Nil)@src/bojack/event_loop/message.cr:10 at ??
0x103ae5dcd: *Fiber#run:(IO::FileDescriptor | Nil) at ??
I am 100% for using waffle, but would like your suggestion/input on that.
Please: @cristianoliveira / @mauricioabreu / @hugoabonizio / @joaocv3
If you are not familiar with it, please see: http://waffle.io
In order to orchestrate the flow and prevent any unhandled error, it is important to create levels of possible errors.
BoJack::Exceptions::Runtime
-> A Runtime exception should return an error to the client, but should not affect any other client or request. Examples: invalid/missing parameters.
BoJack::Exceptions::Fatal
-> Unexpected low-level exceptions, close/kill signals. This one closes the connection with the current socket and reports the error.
Use a global request timeout setup
That would prevent a connection from locking up because a Request is stuck.
Currently:
Main Loop
-> Message
-> Channel(Request)
Channel Loop
-> Request
-> Command
-> Response
After:
Main Loop
-> Message
-> Channel(Request)
Channel Loop
-> Timeout Handler
-> Request
-> Command
-> Response
Then the loop on the channel waits for the Request to finish, or raises an exception if it runs out of time.
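One way the Timeout Handler step could be sketched is with a channel and a select that includes a timeout branch. with_timeout is a hypothetical helper, not part of BoJack, and this version returns nil on timeout instead of raising, to keep the example short.

```crystal
# Runs the block in its own fiber and waits for its result for at most `limit`.
# Returns nil when the time runs out. `with_timeout` is a hypothetical helper.
def with_timeout(limit : Time::Span, &block : -> String) : String?
  # Capacity 1 lets a late fiber deliver its result without blocking forever.
  result = Channel(String).new(1)
  spawn { result.send(block.call) }

  select
  when value = result.receive
    value
  when timeout(limit)
    nil
  end
end

fast = with_timeout(1.second) { "pong" }
slow = with_timeout(50.milliseconds) do
  sleep 500.milliseconds
  "too late"
end

puts fast.inspect
puts slow.inspect
```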
Initially, the timeout setup will be defined only on server startup and will be shared among all connections:
bojack server [params] --timeout <value>
At some point it might be interesting to have a specific timeout per client, making it more flexible.
bojack console --timeout <value>
Or, as a command:
timeout <value>
@hugoabonizio @mauricioabreu @joaocv3 any thoughts?
Hi! First of all, congratulations on this project, I love BoJack Horseman show too!
I was writing a client and I think I found a bug on the server when a connection isn't properly closed. The server keeps trying to read from the socket inside the loop and CPU usage reaches 100%. Breaking out of the loop when the request is nil will probably fix this problem.
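The suggested fix can be illustrated with a small sketch. read_requests is a hypothetical helper and IO::Memory stands in for the TCPSocket; the point is only that IO#gets returns nil at end of stream, so a while loop on its result stops instead of spinning.

```crystal
# Reads requests until the peer closes the connection.
# IO#gets returns nil at end of stream, which ends the `while` loop
# instead of spinning on a dead socket.
def read_requests(io : IO) : Array(String)
  requests = [] of String
  while line = io.gets
    requests << line
  end
  requests
end

# IO::Memory stands in for a TCPSocket so the sketch runs without a network.
io = IO::Memory.new("set a 1\nget a\n")
requests = read_requests(io)
puts requests
```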
After being compiled, BoJack looks for the logo file in the current folder and raises an exception if it is not available.
Solution: Don't read the logo from a file; embed it in the source file.
Currently the logger format is not very nice. We should make it more human readable.
[bojack][host:port][%DATE%][%LEVEL%] Message
It is a suggestion, so I ask the contributors to share their thoughts on the default logger message pattern.
Defining the severity is already a Crystal::Logger feature, but we must provide a public API in order to define the severity of the log output.
e.g.:
bojack server --log-level debug || bojack server --log-level 0
bojack server --log-level info || bojack server --log-level 1
bojack server --log-level warn || bojack server --log-level 2
bojack server --log-level error || bojack server --log-level 3
bojack server --log-level fatal || bojack server --log-level 4
Reference: https://crystal-lang.org/api/0.18.7/Logger/Severity.html
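Accepting both the name and the number could be sketched as below. The Severity enum and parse_log_level are hypothetical and only mirror the five levels listed above; a real implementation would presumably map onto the logger's own severity type.

```crystal
# A local enum mirroring the five levels above (hypothetical, for illustration).
enum Severity
  Debug
  Info
  Warn
  Error
  Fatal
end

# Accepts either a level name ("warn") or its number ("2").
# Returns nil for anything that is not a known level.
def parse_log_level(value : String) : Severity?
  if number = value.to_i?
    Severity.from_value?(number)
  else
    Severity.parse?(value)
  end
end

puts parse_log_level("warn").inspect
puts parse_log_level("4").inspect
puts parse_log_level("verbose").inspect
```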
Add a safe way to increment counters with BoJack.
Our "users" should not have to deal with concurrency when trying to create a safe counter with BoJack.
e.g.:
counter = client.get("counter")
if counter
counter = counter.to_i
counter += 1
client.set("counter", counter)
end
That is completely unsafe under concurrent access; imagine this in a multi-threaded web server.
For this I propose the command increment $key so that users can rely on us to increment the key's value by 1.
Important:
Only valid keys can be incremented. To be valid, the key MUST already exist and its value MUST be castable to Integer, meaning that it cannot be an Array or a non-numerical String.
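A server-side increment that follows these rules might look like the sketch below. CounterMemory is a hypothetical stand-in for the server's storage, values are kept as strings for simplicity, and the error messages are assumptions. The Mutex keeps the read, add and write steps together so that concurrent fibers cannot interleave them.

```crystal
# CounterMemory is a hypothetical stand-in for the server's storage.
class CounterMemory
  def initialize
    @data = {} of String => String
    @lock = Mutex.new
  end

  def set(key : String, value : String) : String
    @lock.synchronize { @data[key] = value }
  end

  def get(key : String) : String?
    @lock.synchronize { @data[key]? }
  end

  # Reads, increments and writes back while holding the lock.
  def increment(key : String) : Int64
    @lock.synchronize do
      current = @data[key]? || raise KeyError.new("'#{key}' is not a valid key")
      number = current.to_i64? || raise ArgumentError.new("'#{key}' cannot be incremented")
      number += 1
      @data[key] = number.to_s
      number
    end
  end
end

memory = CounterMemory.new
memory.set("counter", "0")

done = Channel(Nil).new
1000.times do
  spawn do
    memory.increment("counter")
    done.send(nil)
  end
end
1000.times { done.receive }

puts memory.get("counter")
```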
When the server is not available we raise an error; we could handle it and show only a message instead of the stack trace.
It would be great if we could set the port using a settings file.
As discussed #17
It seems that we are not correctly implementing the "secure" routine for the TCP connection/request handler.
https://crystal-lang.org/docs/guides/concurrency.html
Currently we only spawn new connections and new requests, but we are not entirely sure whether they run, when they run, or even how they run. If one of those raises an error, we will probably just ignore it. (Before, we were crashing everything for any sort of error; now we just ignore it as an Unhandled exception.) Also, if the program finishes or crashes, we don't make sure to handle or close incoming requests.
The next step should be implementing something with Fibers and Channels, to achieve a safe way of handling the concurrency.
One or more fibers will handle the BoJack::Memory and the new connections, and a channel will handle the requests themselves, since we don't need to share the memory between them but only access the memory instance.
Crystal's guide already has a very nice example, with TCP Sockets:
require "socket"

channel = Channel(String).new
# The address and port are assumptions; the original snippet passed none.
server = TCPServer.new("127.0.0.1", 5000)

spawn do
  socket = server.accept
  while request = socket.gets
    channel.send(request)
  end
end

loop do
  request = channel.receive
  # handle the request
  # ...
end
I believe this way we can achieve something more reliable, because we will create something similar to a queue to handle the requests. Even though we may hold a connection open for a long time under very heavy concurrent access, at least we ensure its execution.
The way we look up commands and handle params today is a bit messy, and not scalable.
Mostly because we have created a contract for the command, establishing that every command needs the memory, a key and a value. That was valid at some point, but not anymore.
That causes several problems. Among them, we have some 'lost' commands here, the ones that don't match the signature:
params = Bojack::Params.from(request)
bjcommand = Bojack::Command.from(params.command)
if bjcommand
  response = bjcommand.execute(memory, params.key, params.value)
  socket.puts(response)
elsif params.command == "ping"
  socket.puts("pong")
elsif params.command == "close"
  socket.puts("closing...")
  socket.close
  break
else
  socket.puts("error: '#{params.command}' is not a valid command")
end
For further development I believe that handling this is important. New commands, for instance delete *, time ..., will also not match the pattern, which in the future will lead to a huge effort to make this scalable.
My suggestion is to create an open structure to transport params from the server's 'network layer' to any given command. The command itself should declare its dependencies (params) internally and validate at runtime for missing or invalid params. This way we can manage them in a more elegant way.
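A rough sketch of that idea, under the assumption that params travel as a Hash(Symbol, String): every command declares the params it requires, and a shared execute method validates them before running. BaseCommand, SetCommand, PingCommand and the error wording are hypothetical names for illustration, not the project's classes.

```crystal
alias Params = Hash(Symbol, String)

abstract class BaseCommand
  # Each command lists the params it needs; the default is none.
  def required : Array(Symbol)
    [] of Symbol
  end

  abstract def perform(memory : Hash(String, String), params : Params) : String

  # Validates the declared params before running the command.
  def execute(memory : Hash(String, String), params : Params) : String
    missing = required.reject { |name| params.has_key?(name) }
    return "error: missing params #{missing.join(", ")}" unless missing.empty?
    perform(memory, params)
  end
end

class SetCommand < BaseCommand
  def required : Array(Symbol)
    [:key, :value]
  end

  def perform(memory : Hash(String, String), params : Params) : String
    memory[params[:key]] = params[:value]
  end
end

# A command without params fits the same contract.
class PingCommand < BaseCommand
  def perform(memory : Hash(String, String), params : Params) : String
    "pong"
  end
end

memory = {} of String => String
puts SetCommand.new.execute(memory, {:key => "a", :value => "1"})
puts SetCommand.new.execute(memory, {:key => "a"})
puts PingCommand.new.execute(memory, {} of Symbol => String)
```

With this shape, set a no longer raises a KeyError on the server; the missing value is reported to the client as an error string.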
Currently the logger is hardcoded to STDOUT, which may not be a good option for every user. What about defining this in the CLI?
e.g.: bojack server --log ./shared/logs/bojack.log
BoJack would have to create a new file with the given prefix and add a timestamp:
e.g.: ./shared/logs/bojack_201609201010.log
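Deriving the timestamped name from the given path could be sketched as follows. timestamped_log_path is a hypothetical helper, and the year-month-day-hour-minute format is inferred from the example file name above.

```crystal
# Inserts a timestamp between the file's base name and its extension.
# `timestamped_log_path` is a hypothetical helper.
def timestamped_log_path(path : String, time : Time) : String
  extension = File.extname(path)
  prefix = File.basename(path, extension)
  stamp = time.to_s("%Y%m%d%H%M")
  File.join(File.dirname(path), "#{prefix}_#{stamp}#{extension}")
end

log_path = timestamped_log_path("./shared/logs/bojack.log", Time.utc(2016, 9, 20, 10, 10, 0))
puts log_path
```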
I'm getting a strange result when benchmarking the BoJack Client: the performance when I reuse the same TCPSocket is much worse than when opening a new one for each command. The code is here.
user system total real
shared 0.000000 0.030000 0.030000 ( 7.918069)
new 0.000000 0.010000 0.010000 ( 0.010263)
I noticed that @marceloboeira is running some tests on the benchmarks branch; are you having the same problem?
BoJack Client is going to be the default client to access BoJack servers. It will be a built in client, on the same binary as the server.
Usage examples:
bojack client <server:127.0.0.1> <port:5000>
It should have a flow very much like telnet, but should be an extensible wrapper, so we can evolve it with syntax, autocomplete, colors and such in the near future.
References:
@hugoabonizio is going to handle the main development, however any pull-request is very much welcome.
This is going to be a blocker for the first release of BoJack.
I think we should implement a Time To Live for keys. Adding a timeout to a key is an important feature for using BoJack as a caching backend, for example.
The increment command isn't safe across multiple connections for now. Since Crystal doesn't let users manage threads for IO operations, the correct term may be concurrent-safe, but the truth is that there is no mutex-protected access for writing data yet. Running this sample:
require "bojack-client"

client = BoJack::Client.new
client.set "a", 1

1000.times do
  spawn do
    client.increment "a"
  end
end

puts client.get "a"
The result that I get is something between 320 ~ 360. Maybe this is related to #19.
Both the help and hostname flags have the same short version, -h.
Usually the priority is for the help command.
We can either change the param name, or the short name for the flag.
Hi!
Following the idea in #39, I was playing with RESP protocol parsing and I think that it would be pretty easy to add a Redis compatible mode to BoJack!
We can start the server with a bojack server --resp flag to start a server that understands the Redis protocol. I've done some work here: https://github.com/hugoabonizio/resp.cr
WDYT @marceloboeira?
Use something more like 2018-01-24 22:08:09.382 (always UTC) instead of 2018-01-24 22:08:09 +01:00
This is a relevant thing we need for the first release.
The client code sets 127.0.0.1:5000 as default connection address instead of the given params from CLI.
set a 1
1
append a 2
["1", "2"]
pop a
2
pop a
1
pop a
error: 'a' is not a valid key
get a
[]
When a has no elements, pop reports that a is an invalid key. However, if you use the get command with the same key, the return is an empty array, proving that the key IS valid, but empty.
The error message should be different when a list is empty.
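The distinction could be sketched as below. ListMemory is a hypothetical stand-in for the server's storage, and the wording of the "is empty" message is only a suggestion; the "is not a valid key" message is the one shown in the session above.

```crystal
# ListMemory is a hypothetical stand-in for the server's list storage.
class ListMemory
  def initialize
    @lists = {} of String => Array(String)
  end

  def append(key : String, value : String) : Array(String)
    list = (@lists[key] ||= [] of String)
    list << value
  end

  # Distinguishes a missing key from an existing but empty list.
  def pop(key : String) : String
    list = @lists[key]?
    return "error: '#{key}' is not a valid key" if list.nil?
    list.pop? || "error: '#{key}' is empty"
  end
end

lists = ListMemory.new
lists.append("a", "1")
puts lists.pop("a")
puts lists.pop("a")
puts lists.pop("b")
```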
Hello @marceloboeira :)
If you plan to make bojack production ready, you may want to also support RESP so you will get a full compatible GUI like Redsmin that will offer bojack deployments free out-of-the-box real-time monitoring and alerting (on top of administrative tasks depending on your support of Redis commands) :)
For instance tile38 will be doing this to get a free monitoring system from day one :)
We need a special repository for the BoJack client shard, so that Crystal developers can install only the client, as we already have the Python client.
Client only, the console should remain under this repository.
@hugoabonizio If you want to be in charge of this, you could create the bojack.cr repository that would be a client for BoJack, move the content from this repository to yours, and also use the shard here to point to the Client class.
If you do so, I would also like to ask you to rename the CLI command from client to console, to avoid any future misunderstandings.
I would like to ask both @mauricioabreu and @hugoabonizio that the clients follow the release number of the server, so it is easy to identify compatible versions.
It would be nice to have a BoJack Horse ASCII logo when booting the server.
See:
http://www.asciiworld.com/-Horses-.html
Please show us the logo here before creating the code so we can decide.
Also remember to put the name and version of the project. (BoJack v0.0.0 for now).
UNIX sockets can provide a performance boost over TCP/IP loopback.
One of the reasons for BoJack is my wish to study TCP and SSL.
Right now SSL is at a very early (alpha) stage of development in Crystal. There are the standard library bindings as well as a shard; I'm not sure which would be best.
As @cristianoliveira already mentioned, we should leave on the EXIT signal and not with the close command.
I have created this issue for us to discuss whether we should keep or not the close command and how should we handle EXIT signal from telnet connections.
Relevant information:
http://redis.io/commands/QUIT
Add all the possible flags and their explanations to the README file