
siridb / siridb-server


SiriDB is a highly scalable, robust and super fast time series database. Built from the ground up, SiriDB uses a unique mechanism to operate without a global index and allows server resources to be added on the fly. SiriDB's unique query language includes dynamic grouping of time series for easy analysis over large amounts of time series.

Home Page: https://siridb.com

License: MIT License

C 82.88% Python 14.43% Shell 0.09% Makefile 2.48% Dockerfile 0.12%
ticker-data siridb-server time-series timeseries database siridb

siridb-server's Issues

A shard with an invalid header size at the end is not marked as corrupt.

When a shard has bytes left at the end and the number of bytes is less than one header size, we do not mark the shard as corrupt. The end result is that we start writing to the shard and, unless the shard is optimized before SiriDB is actually turned off, the new points are lost because we cannot read the shard after the invalid bytes. (Optimizing solves this because only valid data is written to the optimized shard.)
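
The check described above can be sketched as follows. This is only an illustration under stated assumptions: the chunk layout (a 4-byte length header followed by that many payload bytes), `HEADER_SIZE`, `shard_state_t` and `shard_check` are all hypothetical and do not reflect SiriDB's real shard format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: each chunk is a uint32_t payload length + payload. */
#define HEADER_SIZE 4u

typedef enum { SHARD_OK, SHARD_CORRUPT } shard_state_t;

/*
 * Walk all complete chunks in buf. *valid_end receives the offset just past
 * the last complete chunk. Any leftover bytes (too few for a header, or a
 * truncated payload) make the shard SHARD_CORRUPT.
 */
shard_state_t shard_check(const unsigned char *buf, size_t size,
                          size_t *valid_end)
{
    size_t pos = 0;

    assert(buf != NULL && valid_end != NULL);

    while (size - pos >= HEADER_SIZE) {
        uint32_t len;
        memcpy(&len, buf + pos, sizeof len); /* host byte order, for brevity */
        if (len > size - pos - HEADER_SIZE) {
            break; /* payload is truncated */
        }
        pos += HEADER_SIZE + len;
    }
    *valid_end = pos;
    return pos == size ? SHARD_OK : SHARD_CORRUPT;
}
```

With a result like this, a caller could flag the shard instead of appending new points after the unreadable tail.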

Insert rights are not checked.

All queries are checked for the correct permissions, but inserts can be done even with a user account that has no insert permissions.
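
A minimal sketch of the missing check, assuming access rights are stored as a bitmask. The names `ACCESS_SELECT`, `ACCESS_INSERT`, `user_t`, `user_has_access` and `handle_insert` are hypothetical and only illustrate the idea of testing the insert bit before accepting points.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical access bits. */
enum { ACCESS_SELECT = 1u << 0, ACCESS_INSERT = 1u << 1 };

typedef struct {
    const char *name;
    uint32_t access_bit;
} user_t;

/* True when the user holds every bit in `required`. */
bool user_has_access(const user_t *user, uint32_t required)
{
    assert(user != NULL);
    return (user->access_bit & required) == required;
}

/* Returns 0 when the insert may proceed, -1 when it must be rejected. */
int handle_insert(const user_t *user, size_t n_points)
{
    (void) n_points; /* the actual insert is out of scope for this sketch */
    if (!user_has_access(user, ACCESS_INSERT)) {
        return -1;
    }
    return 0;
}
```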

Possible loss of data points when creating a replica server.

Since inserts are done asynchronously, it can happen that data points are missing when creating a new replica. We decide to send series to the replica and then start the asynchronous task.

One possible solution is to check if inserts are busy and only send the next series when no insert tasks are running. Another option is to decide to write data for the replica on each insert iteration.

Creating imap copy is not thread safe

Creating an imap slist is not thread safe and should have an appropriate lock when the list can be accessed by multiple threads. The main thread was missing a lock, which could cause the database to crash.
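
The idea of guarding the copy with a lock can be sketched like this. It is not SiriDB's imap code: `guarded_map_t` and `map_snapshot` are hypothetical, and a plain POSIX mutex stands in for whatever lock the server actually uses.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container: a lock plus a flat array of item pointers. */
typedef struct {
    pthread_mutex_t lock;
    size_t len;
    void **items;
} guarded_map_t;

/*
 * Return a malloc'ed copy of the item pointers, taken while holding the lock
 * so no other thread can change the array halfway through the copy.
 * The caller frees the result. On failure or when empty, returns NULL and
 * sets *len to 0.
 */
void **map_snapshot(guarded_map_t *map, size_t *len)
{
    void **copy = NULL;

    assert(map != NULL && len != NULL);
    *len = 0;

    pthread_mutex_lock(&map->lock);
    if (map->len > 0) {
        copy = malloc(map->len * sizeof *copy);
        if (copy != NULL) {
            memcpy(copy, map->items, map->len * sizeof *copy);
            *len = map->len;
        }
    }
    pthread_mutex_unlock(&map->lock);

    return copy;
}
```

Every thread that modifies the array, the main thread included, would have to take the same lock for the snapshot to be safe.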

DNS is not resolved correctly.

When connecting to another server we do not resolve DNS correctly, and for some reason localhost is used as a fallback.

The correct behavior should be to first test for an IPv4 address, next an IPv6 address, then try DNS, and if all fail we should simply write a log and free the socket. (On the next heart-beat, SiriDB will then try to connect again.)
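
The order described above (IPv4 literal, IPv6 literal, then DNS, with no localhost fallback) might look roughly like this. The sketch uses plain POSIX calls (`inet_pton`, `getaddrinfo`); the server's own networking layer may differ, and `addr_kind_t` and `resolve_address` are hypothetical names.

```c
#define _POSIX_C_SOURCE 200112L
#include <arpa/inet.h>
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

typedef enum { ADDR_IPV4, ADDR_IPV6, ADDR_DNS, ADDR_FAILED } addr_kind_t;

/* Fill *out for host and report which step produced the address. */
addr_kind_t resolve_address(const char *host, struct sockaddr_storage *out)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;
    struct sockaddr_in *v4 = (struct sockaddr_in *) out;
    struct sockaddr_in6 *v6 = (struct sockaddr_in6 *) out;

    assert(host != NULL && out != NULL);
    memset(out, 0, sizeof *out);

    /* Step 1: literal IPv4 address. */
    if (inet_pton(AF_INET, host, &v4->sin_addr) == 1) {
        v4->sin_family = AF_INET;
        return ADDR_IPV4;
    }
    /* Step 2: literal IPv6 address. */
    if (inet_pton(AF_INET6, host, &v6->sin6_addr) == 1) {
        v6->sin6_family = AF_INET6;
        return ADDR_IPV6;
    }
    /* Step 3: DNS lookup. */
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    if (getaddrinfo(host, NULL, &hints, &res) != 0 || res == NULL) {
        /* No fallback: the caller logs, frees the socket and retries later. */
        return ADDR_FAILED;
    }
    memcpy(out, res->ai_addr, res->ai_addrlen);
    freeaddrinfo(res);
    return ADDR_DNS;
}
```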

Big-endian bug in series ref count.

A series object reference counter is saved in a uint16_t, which is enough for a series object. Shards on the other hand can have more references and are stored in a uint32_t. We need to map a general function to increment and decrement the references on these objects. On little-endian systems this works, but on big-endian systems a problem might occur since bytes are stored in the reverse order.
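
The byte-order dependence can be illustrated with a small sketch. It assumes the generic helper treats the start of every object as a 16-bit counter; `wide_ref_t`, `ref_seen_as_u16` and `host_is_little_endian` are hypothetical names, not SiriDB code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical shard-like object whose counter is 32 bits wide. */
typedef struct {
    uint32_t ref;
} wide_ref_t;

/*
 * What a generic "the first field is a uint16_t counter" helper would read
 * from the wide object: only its first two bytes.
 */
uint16_t ref_seen_as_u16(const wide_ref_t *obj)
{
    uint16_t narrow;
    assert(obj != NULL);
    memcpy(&narrow, &obj->ref, sizeof narrow);
    return narrow;
}

/* Returns 1 on a little-endian host, 0 otherwise. */
int host_is_little_endian(void)
{
    const uint32_t one = 1;
    unsigned char first;
    memcpy(&first, &one, 1);
    return first == 1;
}
```

On a little-endian host the first two bytes hold the low half of the counter, so small counts look correct. On a big-endian host they hold the high half, so a helper that increments them would change the counter by 65536 instead of 1.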

DNS request should honor the ip_support setting

With issue #14 we made DNS requests compatible with both IPv4 and IPv6. Since we now have a property ip_support which can be set to IPV4ONLY, IPV6ONLY or ALL, we should honor this setting in DNS requests.
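
One way to honor the setting, assuming the lookup accepts an address family hint as POSIX `getaddrinfo` does through `ai_family`, is a direct mapping from the option to a family. The enum `ip_support_t` and the function `ip_support_to_family` are hypothetical names for this sketch.

```c
#include <assert.h>
#include <sys/socket.h>

typedef enum {
    IP_SUPPORT_ALL,
    IP_SUPPORT_IPV4ONLY,
    IP_SUPPORT_IPV6ONLY
} ip_support_t;

/* Map the ip_support option to the family hint used for a DNS request. */
int ip_support_to_family(ip_support_t support)
{
    switch (support) {
    case IP_SUPPORT_IPV4ONLY:
        return AF_INET;
    case IP_SUPPORT_IPV6ONLY:
        return AF_INET6;
    case IP_SUPPORT_ALL:
        return AF_UNSPEC;
    }
    assert(!"unknown ip_support value");
    return AF_UNSPEC;
}
```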

Add ip_support flag to the configuration file.

You should be able to enable the IPv6 stack only, and the same is true for IPv4 only.

We should add the option ip_support with the values ALL, IPV4ONLY and IPV6ONLY, with ALL as the default.

We should also update the grammar to view the ip_support setting on each server.
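
Reading the option and showing it again could be sketched as below. Only the option values ALL, IPV4ONLY and IPV6ONLY come from the text above; `ip_support_parse` and `ip_support_str` are hypothetical helpers, and treating a missing value as ALL is how this sketch models the default.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum {
    IP_SUPPORT_ALL,
    IP_SUPPORT_IPV4ONLY,
    IP_SUPPORT_IPV6ONLY
} ip_support_t;

/* Parse a configuration value; NULL (option absent) falls back to ALL.
 * Returns 0 on success, -1 on an unknown value. */
int ip_support_parse(const char *value, ip_support_t *out)
{
    assert(out != NULL);
    if (value == NULL || strcmp(value, "ALL") == 0) {
        *out = IP_SUPPORT_ALL;
    } else if (strcmp(value, "IPV4ONLY") == 0) {
        *out = IP_SUPPORT_IPV4ONLY;
    } else if (strcmp(value, "IPV6ONLY") == 0) {
        *out = IP_SUPPORT_IPV6ONLY;
    } else {
        return -1;
    }
    return 0;
}

/* Text form of the setting, e.g. for showing it per server. */
const char *ip_support_str(ip_support_t support)
{
    switch (support) {
    case IP_SUPPORT_IPV4ONLY:
        return "IPV4ONLY";
    case IP_SUPPORT_IPV6ONLY:
        return "IPV6ONLY";
    default:
        return "ALL";
    }
}
```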

Build depends on msgpack.h

An unused include existed in auth.h and clserver.c for msgpack. MsgPack was used in early development but is now replaced with QPack.

Add Source IP to logging when an invalid package is received.

SiriDB writes a log entry when an invalid package (or a package that is too large) is received. This line currently does not include the source IP, which makes it hard to debug such packages.

Include the source IP so we can see who sent the illegal package.
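
Formatting the peer address for such a log line can be sketched with POSIX `inet_ntop`. The helper name `addr_to_str` and the output format are assumptions made for this example; how the server obtains the peer's `sockaddr` is left out.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>

/*
 * Write "ip:port" (or "[ip]:port" for IPv6) into buf.
 * Returns buf, or NULL for an unsupported family or a conversion error.
 */
const char *addr_to_str(const struct sockaddr *addr, char *buf, size_t size)
{
    char ip[INET6_ADDRSTRLEN];

    assert(addr != NULL && buf != NULL);

    if (addr->sa_family == AF_INET) {
        const struct sockaddr_in *a = (const struct sockaddr_in *) addr;
        if (inet_ntop(AF_INET, &a->sin_addr, ip, sizeof ip) == NULL) {
            return NULL;
        }
        snprintf(buf, size, "%s:%u", ip, (unsigned) ntohs(a->sin_port));
    } else if (addr->sa_family == AF_INET6) {
        const struct sockaddr_in6 *a = (const struct sockaddr_in6 *) addr;
        if (inet_ntop(AF_INET6, &a->sin6_addr, ip, sizeof ip) == NULL) {
            return NULL;
        }
        snprintf(buf, size, "[%s]:%u", ip, (unsigned) ntohs(a->sin6_port));
    } else {
        return NULL;
    }
    return buf;
}
```

The returned string can then be appended to the existing "invalid package" log message.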

Get shard size is slow and not thread safe

We can list, query or count shards and their size property. SiriDB currently uses a function to read the current shard size. This is rather slow since we read the file size from the operating system or, in case the file is open, return the file size using an fseeko function call. This last function call is not thread safe and can conflict with the optimize thread.

We should instead keep a lightweight size property, so we can return the size much faster and in a thread-safe way.
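
A cached size property could look like the sketch below. It uses a C11 atomic counter so that a reader and the writing thread do not race. The type `shard_size_t` and the three functions are hypothetical, and the real fix may use a plain field protected by an existing lock instead.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical: only the cached size of a shard is modelled here. */
typedef struct {
    atomic_size_t size;
} shard_size_t;

/* Set the cached size once, e.g. from the file size when the shard loads. */
void shard_size_init(shard_size_t *shard, size_t initial)
{
    assert(shard != NULL);
    atomic_init(&shard->size, initial);
}

/* Call after every successful write of n bytes to the shard file. */
void shard_size_add(shard_size_t *shard, size_t n)
{
    assert(shard != NULL);
    atomic_fetch_add(&shard->size, n);
}

/* Cheap read for list, query and count; touches no file handle. */
size_t shard_size_get(shard_size_t *shard)
{
    assert(shard != NULL);
    return atomic_load(&shard->size);
}
```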

SiriDB listens to a client address/port but the address field is ignored.

Instead of fixing this bug, we should change the configuration option from listen_client to listen_client_port and accept only a port number.

Additionally, we can rename listen_server to server_name and make clear in the description that we listen on any address (0.0.0.0). We still need the address, because this is the address and port that other servers use to connect.

Aggregation methods median, median_low and median_high are slow

When using median, median_high or median_low on an increasing data set, for example the values 1, 2, 3, 4, 5 and so on, the current algorithm hits its worst-case scenario.

Since such series are quite common (for example counter series), we should improve the algorithm for these types of series.
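
One possible improvement, sketched here for median_low only, is a linear pre-check: if the values are already in non-decreasing order, the median is simply the middle element. The fallback below sorts a copy with `qsort` and is only a stand-in for the existing algorithm; the function signature is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

static int cmp_i64(const void *a, const void *b)
{
    const int64_t x = *(const int64_t *) a;
    const int64_t y = *(const int64_t *) b;
    return (x > y) - (x < y);
}

/* Store the low median of values[0..n-1] in *out.
 * Returns 0 on success, -1 on allocation failure. */
int median_low(const int64_t *values, size_t n, int64_t *out)
{
    const size_t mid = (n - 1) / 2;
    size_t i;
    int64_t *tmp;

    assert(values != NULL && out != NULL && n > 0);

    /* Fast path: one O(n) pass detects already sorted input. */
    for (i = 1; i < n && values[i - 1] <= values[i]; i++) {
    }
    if (i >= n) {
        *out = values[mid];
        return 0;
    }

    /* Fallback for unsorted input: sort a copy and take the middle. */
    tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL) {
        return -1;
    }
    memcpy(tmp, values, n * sizeof *tmp);
    qsort(tmp, n, sizeof *tmp, cmp_i64);
    *out = tmp[mid];
    free(tmp);
    return 0;
}
```

The pre-check adds at most one extra pass for unsorted input and makes counter-like series cost a single scan.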
