SiriDB is a highly scalable, robust and fast time series database. Built from the ground up, SiriDB uses a unique mechanism to operate without a global index and allows server resources to be added on the fly. SiriDB's unique query language includes dynamic grouping of time series for easy analysis over large numbers of time series.
One should be able to use an IPv6 address as the server_name in the configuration file.
The IPv6 address should then be wrapped in square brackets.
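For illustration, a bracketed IPv6 server_name could look like this (the section name follows the existing siridb.conf layout; address and port are placeholders):

```ini
[siridb]
# IPv6 addresses are wrapped in square brackets; address and port are examples.
server_name = [2001:db8::1]:9010
```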
List, count and select statements should return an error message when a series is queried by name and the series does not exist. This works correctly when the receiving server should have the series but fails if the series should exist in another pool.
When using a select statement with merge and a string filter, an error message is generated because string filters are not allowed on number series. The message is returned correctly most of the time, but sometimes SiriDB crashes instead.
When a shard has bytes left at the end and the number of remaining bytes is less than one header size, we do not mark the shard as corrupt. As a result we keep writing to the shard, and unless the shard is optimized before SiriDB is actually turned off, the new points are lost since we cannot read the shard past the invalid bytes. (Optimizing solves this because only valid data is written to the optimized shard.)
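As a sketch of the intended check (the header size and function name are illustrative, not SiriDB's actual ones): when the bytes after the last readable block are non-zero but smaller than one header, no valid block can follow, so the shard should be flagged as corrupt instead of being appended to.

```c
#include <stdbool.h>
#include <stdint.h>

#define SHARD_HEADER_SIZE 20  /* illustrative header size, not SiriDB's real value */

/* file_size: total bytes in the shard file; valid_end: offset just past the
 * last block that could be read successfully. A tail shorter than one header
 * can never start a valid block, so the shard must be marked corrupt. */
static bool shard_tail_is_corrupt(uint64_t file_size, uint64_t valid_end)
{
    uint64_t rest = file_size - valid_end;
    return rest > 0 && rest < SHARD_HEADER_SIZE;
}
```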
On the Google Cloud Platform a case showed up where a request sent to another SiriDB server was received by its own process. We currently accept this request because the UUID is valid. We should however reject this request.
SiriDB will try to load databases from all directories inside the configured database path.
The first thing SiriDB does is create a lock file; only then does it find out that the directory is not a valid database path.
Since inserts are done asynchronously, it can happen that data points are missing when creating a new replica. We currently send a series to the replica and then start the asynchronous task.
One possible solution is to check whether inserts are busy and only send the next series when no insert tasks are running. Another option is to write the data for the replica on each insert iteration.
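The first option can be sketched as follows (counter and function names are hypothetical): inserts bump a counter while running, and the replica task only sends the next series when the counter reads zero.

```c
#include <stdbool.h>

/* Hypothetical gate for the replica task: each insert task increments the
 * counter when it starts and decrements it when it finishes. */
static int active_insert_tasks = 0;

static void insert_task_start(void)  { active_insert_tasks++; }
static void insert_task_finish(void) { active_insert_tasks--; }

/* The replica task polls this before sending the next series. */
static bool replica_may_send_next_series(void)
{
    return active_insert_tasks == 0;
}
```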
Creating an imap slist is not thread safe and requires an appropriate lock when the list can be accessed by multiple threads. The main thread was missing a lock, which could cause the database to crash.
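A minimal sketch of the missing lock (mutex and names are illustrative, not SiriDB's actual API): every thread that reads from or builds a list out of the imap takes the same mutex.

```c
#include <pthread.h>

/* Illustrative: one mutex protects access to the imap so the main thread and
 * worker threads cannot walk or copy it concurrently. */
static pthread_mutex_t imap_mutex = PTHREAD_MUTEX_INITIALIZER;
static int imap_n_items = 0;  /* stand-in for the real imap content */

static int imap_copy_size_locked(void)
{
    pthread_mutex_lock(&imap_mutex);
    int n = imap_n_items;  /* a real slist copy would be built here */
    pthread_mutex_unlock(&imap_mutex);
    return n;
}
```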
When connecting to another server we do not resolve DNS correctly and for some reason localhost is used as a fallback.
The correct behavior should be to first test for an IPv4 address, next an IPv6 address, then try DNS, and if all fail we should simply write a log message and free the socket. (On the next heartbeat, SiriDB will then try to connect again.)
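The first two steps of that order can be sketched with inet_pton(), which accepts only literal addresses and never falls back to a hostname lookup (the enum and function name are illustrative):

```c
#include <arpa/inet.h>

typedef enum { ADDR_IPV4, ADDR_IPV6, ADDR_DNS } addr_kind_t;

/* Classify an address string: literal IPv4 first, then literal IPv6; anything
 * else is handed to DNS resolution. On DNS failure the caller should log and
 * free the socket instead of falling back to localhost. */
static addr_kind_t classify_address(const char *s)
{
    unsigned char buf[16];
    if (inet_pton(AF_INET, s, buf) == 1)
        return ADDR_IPV4;
    if (inet_pton(AF_INET6, s, buf) == 1)
        return ADDR_IPV6;
    return ADDR_DNS;
}
```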
A series object reference counter is stored in a uint16_t, which is enough for a series object. Shards, on the other hand, can have more references, which are stored in a uint32_t. We need to map a general function to increment and decrement the references on these objects. On little-endian systems this works, but on big-endian systems a problem can occur since the bytes are stored in reverse order.
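The endian hazard comes from accessing a narrow counter through a wider pointer: on little-endian hosts the low-order bytes happen to overlap, on big-endian hosts they do not. A width-aware helper (a sketch, not SiriDB's actual API) avoids the cast entirely:

```c
#include <stddef.h>
#include <stdint.h>

/* Increment a reference counter given its real width. Reinterpreting a
 * uint16_t as a uint32_t only works by accident on little-endian systems,
 * so the width is dispatched explicitly instead. */
static void ref_incr(void *counter, size_t width)
{
    if (width == sizeof(uint16_t))
        ++*(uint16_t *)counter;
    else
        ++*(uint32_t *)counter;
}
```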
We should replace the default value 'localhost' in the configuration file with a variable, for example %HOSTNAME, which will be replaced with the system's host name. Localhost is never a good choice since this is the address we send to 'other' SiriDB servers, which will then try to connect to this address.
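A sketch of the substitution (the function name is hypothetical); at startup the real host name would come from gethostname():

```c
#include <stdio.h>
#include <string.h>

/* Expand a %HOSTNAME placeholder in a configuration value into `out`.
 * If the placeholder is absent, the value is copied unchanged. */
static void expand_hostname(const char *value, const char *hostname,
                            char *out, size_t n)
{
    const char *p = strstr(value, "%HOSTNAME");
    if (p == NULL)
        snprintf(out, n, "%s", value);
    else
        snprintf(out, n, "%.*s%s%s",
                 (int)(p - value), value,      /* text before the placeholder */
                 hostname,                     /* the system host name        */
                 p + strlen("%HOSTNAME"));     /* text after the placeholder  */
}
```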
With issue #14 we made DNS requests compatible with both IPv4 and IPv6. Since we now have a property ip_support, which can be set to IPV4ONLY, IPV6ONLY or ALL, we should honor this setting in DNS requests.
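Honoring the setting can be done by mapping it onto the getaddrinfo() address-family hint, so a lookup never returns addresses the server is not configured to use (the enum names here are illustrative):

```c
#include <sys/socket.h>
#include <netdb.h>

enum ip_support { IP_SUPPORT_ALL, IP_SUPPORT_IPV4ONLY, IP_SUPPORT_IPV6ONLY };

/* Translate the ip_support setting into the value assigned to
 * hints.ai_family before calling getaddrinfo(). */
static int ip_support_to_family(enum ip_support s)
{
    switch (s)
    {
    case IP_SUPPORT_IPV4ONLY: return AF_INET;
    case IP_SUPPORT_IPV6ONLY: return AF_INET6;
    default:                  return AF_UNSPEC;  /* ALL: both families */
    }
}
```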
A double free on a socket can occur when receiving data on a socket which has already been closed by the on_data function. We should prevent on_data from closing a socket twice.
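One common way to prevent the double close (a sketch; the struct and function names are not SiriDB's actual ones) is a flag on the socket object that turns any second close attempt into a no-op:

```c
#include <stdbool.h>

/* Illustrative guard: remember that a close is already in progress so a
 * second close, e.g. from on_data after an error, does nothing. */
typedef struct
{
    bool closing;
} sock_t;

static bool sock_close_once(sock_t *sock)
{
    if (sock->closing)
        return false;        /* already being closed; skip */
    sock->closing = true;
    /* the actual close call (e.g. uv_close) would happen here */
    return true;
}
```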
When using a select query, some work must be done by uv_queue_work and therefore needs space in the thread pool. The default UV_THREADPOOL_SIZE is 4, but each database requires at least one thread.
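The pool can be enlarged through the environment before the server process starts, since libuv reads the variable once at startup (the value 8 is illustrative; size it to the number of databases plus headroom for queries):

```shell
# libuv reads UV_THREADPOOL_SIZE once at process startup; the default is 4.
export UV_THREADPOOL_SIZE=8
```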
SiriDB writes a log entry when an invalid (or too large) package is received. This log line currently does not include the source IP, which makes it hard to debug such packages.
Include the source IP so we can see who sent the illegal package.
We can list, query or count shards and their size property. SiriDB currently uses a function to read the current shard size. This is rather slow since we read the file size from the operating system or, in case the file is open, return the file size using an fseeko() call. This last call is not thread safe and can conflict with the optimize thread.
It is better to keep a lightweight size property so we can return the value much faster and in a thread-safe way.
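A sketch of the bookkeeping (struct and function names are illustrative): the writer updates a cached size whenever bytes are appended, so listing the size never touches the file and involves no syscall.

```c
#include <stdint.h>

typedef struct
{
    uint64_t size;  /* cached shard file size in bytes */
} shard_t;

/* Called by the writer after appending n bytes. Reading the size property is
 * then just a plain read of shard->size, with no fseeko() and no conflict
 * with the optimize thread. */
static void shard_account_write(shard_t *shard, uint64_t n)
{
    shard->size += n;
}
```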
Some objects like series, shards and promises (maybe servers and groups?) are incremented and decremented a lot, so their reference counting can be written as macros. We still need a normal function available since we sometimes need to pass the decref function as a callback.
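The pattern can look like this (a sketch with hypothetical names, not SiriDB's actual macros): hot paths expand the macro inline, while a thin function wrapper keeps a real address that can be passed as a callback.

```c
#include <stdint.h>

typedef struct { uint32_t ref; } object_t;

static int freed_count = 0;
static void object_free(object_t *obj) { (void)obj; freed_count++; }

/* Macros for the hot paths: no function-call overhead. */
#define object_incref(o) ((o)->ref++)
#define object_decref(o) do { if (!--(o)->ref) object_free(o); } while (0)

/* A real function stays available so the decref can still be passed as a
 * callback pointer, e.g. to a list-destroy routine. */
static void object_decref_cb(void *obj)
{
    object_decref((object_t *)obj);
}
```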
When we drop a series, we can reclaim the allocated buffer space. This is faster compared to extending the buffer and saves space on a running SiriDB instance.
Instead of solving this bug, it is better to change the configuration option from listen_client to listen_client_port and just accept a port number.
Additionally, we can change listen_server to server_name and make clear in the description that we listen on any address (0.0.0.0). We still need the address because this is the address and port which other servers use to connect to this server.
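The renamed options could then look like this (the section name follows the existing siridb.conf layout; host name and port numbers are examples):

```ini
[siridb]
# Client connections: only a port number is needed; we listen on any
# address (0.0.0.0).
listen_client_port = 9000
# Address and port that other SiriDB servers use to connect to this server.
server_name = db01.example.org:9010
```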
When using median, median_high or median_low on an increasing data set, say for example the values 1, 2, 3, 4, 5 and so on, the current algorithm hits its worst-case scenario.
Since such series are quite common (for example counter series) we should improve the algorithm for these types of series.
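One possible improvement, shown here as a sketch rather than the actual SiriDB implementation: a quickselect that pivots on the middle element instead of the first, so already-sorted counter-like input no longer degrades to quadratic time.

```c
#include <stddef.h>

static void swap_d(double *a, double *b) { double t = *a; *a = *b; *b = t; }

/* Return the k-th smallest of v[0..n-1] (0-based), partially reordering v.
 * Picking the pivot from the middle of the range keeps sorted input, the
 * common counter-series case, away from the O(n^2) worst case. */
static double quickselect(double *v, size_t n, size_t k)
{
    size_t lo = 0, hi = n - 1;
    while (lo < hi)
    {
        double pivot = v[lo + (hi - lo) / 2];
        size_t i = lo, j = hi;
        while (i <= j)
        {
            while (v[i] < pivot) i++;
            while (v[j] > pivot) j--;
            if (i <= j)
            {
                swap_d(&v[i], &v[j]);
                i++;
                if (j) j--;  /* guard against size_t underflow */
            }
        }
        if (k <= j)      hi = j;
        else if (k >= i) lo = i;
        else             return v[k];
    }
    return v[lo];
}
```

For an odd-length series, median and median_low both reduce to `quickselect(v, n, n / 2)`.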