libfastjson's Introduction

libfastjson

NOTE: libfastjson is a fork from json-c, and is currently under development.

The aim of this project is not to provide a slightly modified clone of json-c. Its aim is to provide

  • a small library with essential json handling functions
  • sufficiently good json support (not 100% standards compliant)
  • very fast processing

In order to reach these goals, we reduce the features of json-c. For similarities and differences, see the file DIFFERENCES.

IMPORTANT: The current API is not stable and will change until version 1.0.0 is reached. We plan to reach it by summer 2016 at the latest. With 1.0.0, the API will be stable. Until then, everything may change. Of course, we will not deliberately break things, but we need the freedom to restructure.

Building on Unix with git, gcc and autotools

Prerequisites:

  • gcc, clang, or another C compiler
  • libtool

If you're not using a release tarball, you'll also need:

  • autoconf (autoreconf)
  • automake

Make sure you have a complete libtool install, including libtoolize.

libfastjson GitHub repo: https://github.com/rsyslog/libfastjson

$ git clone https://github.com/rsyslog/libfastjson.git
$ cd libfastjson
$ sh autogen.sh

followed by

$ ./configure
$ make
$ make install

To build and run the test programs:

$ make check

Linking to libfastjson

If your system has pkg-config, you can just add this to your makefile:

CFLAGS += $(shell pkg-config --cflags libfastjson)
LDFLAGS += $(shell pkg-config --libs libfastjson)

Without pkg-config, you would do something like this:

LIBFASTJSON_DIR=/path/to/libfastjson/install
CFLAGS += -I$(LIBFASTJSON_DIR)/include/libfastjson
LDFLAGS+= -L$(LIBFASTJSON_DIR)/lib -lfastjson

libfastjson's People

Contributors

bachp, bwijen, cryogen, deweerdt, emielbruijntjes, ford-prefect, ghazel, haneefmubarak, hawicz, hikari-no-yume, jamesmyatt, janmejay, jehiah, jgerhards, jubalh, lespocky, mbiebl, michaeljclark, mjchinn, mloskot, mvdwerve, pkoretic, remicollet, rgerhards, rossburton, rouault, sixlettervariables, thecount, weltling, willdignazio

libfastjson's Issues

linkhash.c does not use atomic macros from atomic.h

0.99.2 release

linkhash.c, in function lh_char_hash, calls __sync_val_compare_and_swap directly instead of the ATOMIC_CAS_VAL compatibility macro defined in atomic.h. This causes link-time errors in rsyslog when cross-compiling for any architecture that lacks the __sync_val_compare_and_swap function.
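
The fix amounts to routing the call through a compatibility macro. The following is a minimal sketch of that idea, not the contents of the real atomic.h: the switch HAVE_ATOMIC_BUILTINS, the three-argument form of ATOMIC_CAS_VAL, the fallback helper, and get_seed_once() are all assumptions made for illustration.

```c
/* Assumption: HAVE_ATOMIC_BUILTINS is a hypothetical configure-time switch;
 * the real atomic.h may be organized differently. */
#ifdef HAVE_ATOMIC_BUILTINS
#  define ATOMIC_CAS_VAL(ptr, oldval, newval) \
	__sync_val_compare_and_swap((ptr), (oldval), (newval))
#else
/* Non-atomic fallback: only safe while a single thread touches *ptr. */
static inline int cas_val_fallback(int *ptr, int oldval, int newval)
{
	int seen = *ptr;
	if (seen == oldval)
		*ptr = newval;
	return seen;
}
#  define ATOMIC_CAS_VAL(ptr, oldval, newval) \
	cas_val_fallback((ptr), (oldval), (newval))
#endif

/* Hypothetical stand-in for the one-time seed setup in lh_char_hash(). */
static int random_seed = -1;

static int get_seed_once(int candidate)
{
	/* Install the candidate only if no seed has been stored yet. */
	ATOMIC_CAS_VAL(&random_seed, -1, candidate);
	return random_seed;
}
```

With this shape, platforms without the builtin still link, at the cost of losing atomicity in the fallback path.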

problem linking library. README incorrect?

I'm trying to replace json-c with this lib (0.99.2) to compile rsyslog 8.18.
The readme says to link with:
LDFLAGS+= -L$(LIBFASTJSON_DIR)/lib -llibfastjson

This gave me:
/usr/bin/ld: cannot find -llibfastjson

I looked in the pkgconfig file and it says:
Libs: -L${libdir} -lfastjson

Changing -llibfastjson to -lfastjson fixed the error for me.

check fjson_c_get_random_seed() #ifdef's

they may not work correctly, due to removed checks in configure.ac. HOWEVER, we may no longer need hash tables, and thus this function, so do not check before that decision is made.

fjson_object_iter_begin() invalid if first entry is deleted

The function does not check if the first entry inside the object table is empty. If it is, it invalidly returns a pointer to the non-existing (NULL) entry.

Bug was introduced after 0.99.2 while working on the 0.99.3 version. It does not exist in any released version.
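
A minimal sketch of the missing check, assuming a simplified table in which a deleted slot is marked by a NULL key. The struct layouts and function names below are hypothetical and do not mirror the library's internal data structures.

```c
#include <stddef.h>

/* Hypothetical, simplified object table: a deleted slot has key == NULL. */
struct slot { const char *key; int value; };
struct obj  { struct slot *slots; int nslots; };
struct iter { const struct obj *o; int idx; };

/* Advance idx to the next occupied slot at or after the current position. */
static void skip_empty(struct iter *it)
{
	while (it->idx < it->o->nslots && it->o->slots[it->idx].key == NULL)
		it->idx++;
}

/* Begin iteration at the first occupied slot, not blindly at index 0. */
static struct iter iter_begin(const struct obj *o)
{
	struct iter it = { o, 0 };
	skip_empty(&it);
	return it;
}

/* True once the iterator has moved past the last slot. */
static int iter_at_end(const struct iter *it)
{
	return it->idx >= it->o->nslots;
}
```

If every slot is empty, iter_begin() returns an iterator that is already at the end, so the caller never dereferences a non-existing entry.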

change iterator API

Currently, the full iterator structure is returned. This is inefficient if it is larger than a pointer. We can alternatively change it so that the caller always needs to pass in a pointer. However, this means implementation details will not be properly abstracted. In any case, we need to find a good way to handle this. Maybe the right one is to keep things as they are...

subobj: use small keys without malloc

We may use an 8-char (sizeof(char*)) buffer to store small keys (via a union). However, we must think about how we notice that this was done. Most importantly, we should NOT copy constant keys. On the other hand: if the string is smaller, isn't that even better (also from a data spatial locality PoV)?
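
One way to "notice this was done" is a flag next to the union. The sketch below assumes exactly that; struct key, key_set() and key_get() are hypothetical names, and long keys are borrowed rather than copied, which is only one of the options discussed above.

```c
#include <string.h>

/* Hypothetical key storage: short keys live inline, long keys are referenced. */
struct key {
	union {
		char inl[sizeof(char *)];  /* inline buffer, including the NUL */
		const char *ptr;           /* borrowed pointer, not copied */
	} u;
	unsigned char is_inline;       /* records which union member is valid */
};

/* Store s inline if it fits (with its terminator), otherwise keep a pointer. */
static void key_set(struct key *k, const char *s)
{
	size_t len = strlen(s);
	if (len < sizeof(k->u.inl)) {
		memcpy(k->u.inl, s, len + 1);
		k->is_inline = 1;
	} else {
		k->u.ptr = s;              /* caller must keep the string alive */
		k->is_inline = 0;
	}
}

/* Return the key as a C string, whichever way it is stored. */
static const char *key_get(const struct key *k)
{
	return k->is_inline ? k->u.inl : k->u.ptr;
}
```

The flag costs one byte per key (plus padding), which has to be weighed against the saved malloc and the better locality for short keys.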

remove "foreach" loop API

These require knowledge of implementation details. Instead, json-c already contained a clean iterator class (json_object_iterator.[ch]). They are functionally equivalent.

This also eases #35 (but that is not the main point).

do not call exit()

It's a terrible idea for a library to unconditionally terminate its caller. This should be left as the choice of the caller.
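
As a minimal sketch of the alternative, a library routine can report failure through its return value and leave the reaction to the caller. buf_grow() is a hypothetical helper written for this example, not a libfastjson function.

```c
#include <stdlib.h>

/* Hypothetical helper: grow a buffer to at least `want` bytes.
 * Returns 0 on success and -1 on allocation failure. On failure the
 * original buffer is left untouched and nothing calls exit(). */
static int buf_grow(char **buf, size_t *cap, size_t want)
{
	char *p;

	if (want <= *cap)
		return 0;            /* already large enough */
	p = realloc(*buf, want);
	if (p == NULL)
		return -1;           /* the caller decides what to do */
	*buf = p;
	*cap = want;
	return 0;
}
```

A caller can then log, retry, or abort as it sees fit, for example with `if (buf_grow(&b, &cap, 4096) != 0) { /* handle error */ }`.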

Re-enable test suite

It is always a bad feeling to add a new package with a disabled test suite, especially when you read that the test suite was disabled due to problems.

I read in f189a25 that you plan to bring back the test suite with v1.0.0 so keep this as tracker bug to not forget :)

do not assert(ptr != NULL)

current hardware and testing tools reliably detect NULL access, so these asserts only slow down the code and make it less likely to detect performance-related issues (races).

Drop windows specific bits

If no-one is actively using/testing the windows specific bits, as it looks like, I think windows support is better removed, including stuff like
README-WIN32.html
config.h.win32
json-c.vcproj

(and the references to those files)

math_compat.h incorrectly tests declarations

In 0.99.2, HAVE_DECL_* is set to 0 or 1 by autoconf, but the tests in math_compat.h check whether the macro is defined. Change these to:

# if !HAVE_DECL_INFINITY    /* for example */
# ....

rather than the existing #ifndef tests.
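
The difference between the two kinds of test can be shown with the preprocessor alone. In this sketch, HAVE_DECL_EXAMPLE is a hypothetical macro that stands in for what autoconf's AC_CHECK_DECLS emits when a declaration is missing: the macro is defined, but with the value 0.

```c
/* Simulate the autoconf result for a missing declaration. */
#define HAVE_DECL_EXAMPLE 0

/* Definition test: does not fire, because the macro IS defined. */
#ifndef HAVE_DECL_EXAMPLE
#  define IFNDEF_TEST_FIRED 1
#else
#  define IFNDEF_TEST_FIRED 0
#endif

/* Value test: fires, because the macro's value is 0. */
#if !HAVE_DECL_EXAMPLE
#  define VALUE_TEST_FIRED 1
#else
#  define VALUE_TEST_FIRED 0
#endif
```

Only the value test selects the compatibility code when the declaration is absent.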

please use consistent indentation

Currently, the libfastjson sources use a wild mixture of 2-space, 4-space, and tab indentation, sometimes even combined (tab + spaces).
This makes the code unnecessarily hard to read and inconsistent.

Please consider adopting a single indentation scheme.

tautology in `fjson_object_iter_equal`

d4b3a2d introduced a tautology in fjson_object_iter_equal

            if ( (iter1->curr_idx == iter2->curr_idx) &&
                 (iter2->pg == iter2->pg)                ) {

resulting in:

[   68s] json_object_iterator.c: In function 'fjson_object_iter_equal':
[   68s] json_object_iterator.c:176:20: error: self-comparison always evaluates to true [-Werror=tautological-compare]
[   68s]          (iter2->pg == iter2->pg)                ) {
[   68s]                     ^~

Did you mean...

            if ( (iter1->curr_idx == iter2->curr_idx) &&
                 (iter1->pg == iter2->pg)                ) {

check testbench with clang and -fsanitize=address

The clang address sanitizer finds quite a few problems during the testbench run which valgrind does not detect. But they point into areas of code which will mostly be removed anyway, so wait for other work to continue.

Please use a consistent name space for exported symbols

Please consider using a consistent name space for the exported symbols, i.e. the ABI and the API.

Currently exported symbols which don't use json_:
array_list_add@Base 0.99.2
array_list_bsearch@Base 0.99.2
array_list_free@Base 0.99.2
array_list_get_idx@Base 0.99.2
array_list_length@Base 0.99.2
array_list_new@Base 0.99.2
array_list_put_idx@Base 0.99.2
array_list_sort@Base 0.99.2
lh_abort@Base 0.99.2
lh_char_equal@Base 0.99.2
lh_kchar_table_new@Base 0.99.2
lh_kptr_table_new@Base 0.99.2
lh_ptr_equal@Base 0.99.2
lh_table_delete@Base 0.99.2
lh_table_delete_entry@Base 0.99.2
lh_table_free@Base 0.99.2
lh_table_insert@Base 0.99.2
lh_table_insert_w_hash@Base 0.99.2
lh_table_length@Base 0.99.2
lh_table_lookup@Base 0.99.2
lh_table_lookup_entry@Base 0.99.2
lh_table_lookup_entry_w_hash@Base 0.99.2
lh_table_lookup_ex@Base 0.99.2
lh_table_new@Base 0.99.2
lh_table_resize@Base 0.99.2
mc_debug@Base 0.99.2
mc_error@Base 0.99.2
mc_get_debug@Base 0.99.2
mc_info@Base 0.99.2
mc_set_debug@Base 0.99.2
mc_set_syslog@Base 0.99.2
printbuf_free@Base 0.99.2
printbuf_memappend@Base 0.99.2
printbuf_memappend_char@Base 0.99.2
printbuf_memappend_no_nul@Base 0.99.2
printbuf_memset@Base 0.99.2
printbuf_new@Base 0.99.2
printbuf_reset@Base 0.99.2
printbuf_terminate_string@Base 0.99.2
sprintbuf@Base 0.99.2

think about the "null" object representation

currently (inherited from json-c), the null object is implemented differently from any other objects. It's not a struct json_object (ptr), but rather C NULL. This is inconsistent, and causes some trouble with the API (e.g. json_object_object_get(), which was deprecated for that reason). This also means that all callers need to handle the null object differently.

We should at least consider representing the null object differently. A natural choice would be to do it just like any other object, with only the type set to fjson_type_null.
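
A minimal sketch of what this would buy the caller: with a real null object, C NULL can be reserved for "key not present". All sk_ names below are hypothetical and exist only for this illustration; they are not part of the libfastjson API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical object model in which JSON null is a regular object. */
enum sk_type { SK_TYPE_NULL, SK_TYPE_INT };
struct sk_object { enum sk_type type; int ival; };
struct sk_member { const char *key; struct sk_object *val; };

/* A single shared instance is enough to represent JSON null. */
static struct sk_object sk_null = { SK_TYPE_NULL, 0 };

/* Returns the member's value, or C NULL only when the key is absent. */
static struct sk_object *sk_get(struct sk_member *m, int n, const char *key)
{
	for (int i = 0; i < n; i++)
		if (strcmp(m[i].key, key) == 0)
			return m[i].val;
	return NULL;
}
```

With this layout, `{"a": null}` and `{}` are distinguishable from the return value alone, which is what the C NULL representation cannot offer.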

see also json-c/json-c#226

Feature: Option to (not) escape forward slashes

Feature request:
Escaping the forward slash (or solidus, U+002F) should be optional, not required.

Background:
My organization is moving towards logging events directly in JSON. We have found and/or created several tools that work better with the data if it is already formatted in the structured form, one that can use standard libraries to easily load the data without needing custom rules to parse it.

We prefer to use common tools, like grep, to easily search over the logs and extract events that meets our search criteria to later do further analysis and troubleshooting. Some of our logs contain file paths (Unix style as well as Windows) as well as HTTP content types (e.g. application/x-www-form-data, application/json, etc.).

Problem:
json-c defaults to escaping the forward slash during serialization. This makes our logs less intuitive to search, since the analyst/engineer must now remember to escape their queries even though the presented data will be unescaped. This is problematic because the analyst/engineer is often pivoting from other data they have seen and should be able to copy-paste the search term into grep (i.e. "grep application/x-www-form-data app.log" makes more sense than "grep application\/x-www-form-data app.log").

Reasoning:
Reading the JSON specification and other discussions on this subject makes me conclude that my request is the best direction forward.

Escaping the forward slash, according to the JSON spec, is optional. From ECMA-404 (http://www.ecma-international.org/publications/files/ECMA-ST/ECMA-404.pdf) (the specification referenced from http://www.json.org), Section 9: "All characters may be placed within the quotation marks except for the characters that must be escaped: quotation mark (U+0022), reverse solidus (U+005C), and the control characters U+0000 to U+001F." The forward slash (called here the solidus) is U+002F, outside the range of the control characters, and thus exempt from this list of must-escape characters. Further evidence is found further down the section, where the spec provides an example using the forward slash. The fourth case is clearly the forward slash without the escape.

The above point is further corroborated by IETF's RFC 4627 (http://www.ietf.org/rfc/rfc4627.txt), specifying The application/json Media Type for JavaScript Object Notation (JSON). It uses the same definition for defining which characters must be escaped. It also provides a grammar for constructing JSON strings, which contains the rule "unescaped = %x20-21 / %x23-5B / %x5D-10FFFF", which includes the forward slash (%x2F).

The purpose for allowing the escaping of the forward slash at all is so that JSON can be embedded within HTML's JavaScript <script></script> tags without having to alter it (see the answer to this question on StackOverflow http://stackoverflow.com/questions/1580647/json-why-are-forward-slashes-escaped/1580664#1580664). This is a specific use case. Systems writing JSON within this use case should support and use the escaping, but other systems need not alter the strings to conform to this added requirement from this limited environment.

Conclusion:
Always escaping the forward slash, just to support the limited case in which the JSON might appear within an HTML document's script tags, has undesired consequences. It adds an unnecessary character to the data, increasing its storage footprint (which is easily exacerbated over many events of similar structure). It prevents intuitive searching over stored JSON log events for common character sequences (Unix file paths, HTTP content types, etc.). It is better that this escaping be optional, so that it can be turned on if the JSON will or may end up within an HTML document, or off if it will not, to save space and to be consistent in escaping with other languages such as C.

libfastjson:
json-c does provide an option to not escape the forward slash. The flag JSON_C_TO_STRING_NOSLASHESCAPE has to be added to the flags parameter passed to the json_object_to_json_string_ext() function call. libfastjson should carry this option forward as well to support the main JSON specification.
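
To illustrate how such a flag typically gates the escaping, here is a minimal sketch. SKETCH_NOSLASHESCAPE and escape_str() are hypothetical names for this example and are not the json-c or libfastjson implementation; control characters are left out for brevity.

```c
#include <string.h>

/* Hypothetical flag for this sketch; not the library's definition. */
#define SKETCH_NOSLASHESCAPE (1 << 0)

/* Copy src into dst with minimal JSON string escaping.
 * Returns the number of bytes written (excluding the NUL), or -1 if dst
 * is too small. Only '"', '\\' and '/' are handled here. */
static int escape_str(char *dst, size_t cap, const char *src, int flags)
{
	size_t n = 0;

	if (cap == 0)
		return -1;
	for (; *src; src++) {
		/* '/' is escaped only when the caller did not opt out. */
		int esc = (*src == '"' || *src == '\\') ||
		          (*src == '/' && !(flags & SKETCH_NOSLASHESCAPE));
		if (n + (esc ? 2 : 1) >= cap)
			return -1;       /* keep room for the terminating NUL */
		if (esc)
			dst[n++] = '\\';
		dst[n++] = *src;
	}
	dst[n] = '\0';
	return (int)n;
}
```

Called with flags 0, "application/json" becomes `application\/json`; with SKETCH_NOSLASHESCAPE it is copied unchanged, which is the grep-friendly form requested above.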

replace json_inttypes.h by stdint.h?

As we request C99 mode, we should be safe with stdint.h. So in theory, we can use stdint.h and are done. @mbiebl any concerns? Am I overlooking something? Can we enforce this on callers? (json-c has long forced callers to use C99 if the "foreach" constructs are used.)

handle escaped utf-8

The library does not seem to handle UTF-8 escaped strings correctly.

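#include <stdio.h>
#include <json.h>   /* json-c header; the include path depends on the installation */

int main(void)
{
  char *s;
  json_object *json;

  /* In C99, \u00a9 inside a string literal becomes the character itself */
  s = "{ \"foo\" : \"\u00a9\" }";

  printf("string = %s\n", s);
  json = json_tokener_parse(s);
  printf("json = %s\n", json_object_to_json_string_ext(json, JSON_C_TO_STRING_PRETTY));

  json_object_put(json);   /* release the parsed object */
  return 0;
}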

Output:

string = { "foo" : "©" }
json = {
  "foo":"\u00

Maybe I'm doing something wrong, or is it a bug?

encode chinese error!

libfastjson can't encode Chinese correctly.

I changed test_parse.c like this:
new_obj = fjson_tokener_parse("// hello\n\"张龙\"");
It prints:
new_obj.to_string()============================"\u00E50.0\u00E9099"

From this, you can clearly see that \u00E50.0\u00E9099 is not a correct Unicode escape sequence.
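
The faulty output looks as if each UTF-8 byte were escaped on its own. If non-ASCII characters are to be written as \uXXXX at all, the multi-byte sequence has to be decoded to a code point first. The following sketch shows that step under stated assumptions: utf8_decode_bmp() and escape_non_ascii() are hypothetical helpers, only 1 to 3 byte sequences are handled (4-byte sequences would need a surrogate pair), overlong encodings are not rejected, and quotes and backslashes are not escaped.

```c
#include <stdio.h>
#include <string.h>

/* Decode one UTF-8 sequence of 1 to 3 bytes starting at s. Stores the code
 * point in *cp and returns the number of bytes consumed, or 0 on input this
 * sketch does not handle. */
static int utf8_decode_bmp(const unsigned char *s, unsigned *cp)
{
	if (s[0] < 0x80) {
		*cp = s[0];
		return 1;
	}
	if ((s[0] & 0xE0) == 0xC0 && (s[1] & 0xC0) == 0x80) {
		*cp = ((s[0] & 0x1Fu) << 6) | (s[1] & 0x3Fu);
		return 2;
	}
	if ((s[0] & 0xF0) == 0xE0 && (s[1] & 0xC0) == 0x80 &&
	    (s[2] & 0xC0) == 0x80) {
		*cp = ((s[0] & 0x0Fu) << 12) | ((s[1] & 0x3Fu) << 6) | (s[2] & 0x3Fu);
		return 3;
	}
	return 0;
}

/* Write src to dst, replacing each non-ASCII character with one \uXXXX
 * escape. Returns 0 on success, -1 on bad input or insufficient space. */
static int escape_non_ascii(char *dst, size_t cap, const char *src)
{
	const unsigned char *p = (const unsigned char *)src;
	size_t n = 0;

	if (cap == 0)
		return -1;
	while (*p) {
		unsigned cp;
		int len = utf8_decode_bmp(p, &cp);
		if (len == 0 || n + 7 > cap)   /* 6 escape chars plus the NUL */
			return -1;
		if (cp < 0x80)
			dst[n++] = (char)cp;
		else
			n += (size_t)sprintf(dst + n, "\\u%04X", cp);
		p += len;
	}
	dst[n] = '\0';
	return 0;
}
```

For the bytes E5 BC A0 E9 BE 99 (the UTF-8 encoding of 张龙) this produces `\u5F20\u9F99`, one escape per character rather than one per byte.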

macro rename

change FJSON_C_ to FJSON_ ; the "_C" part stems from json-c, so there is no need to keep this (and it actually looks somewhat strange). This affects the interface changes.

check version code

The functions associated with versions should be based on the version code in configure.ac. Check version.[ch].

replace hash tables by linked list (or some other?)

I think that we may get overall better performance if we replace the hash tables by linked lists. Performance profiler (callgrind/kcachegrind) data suggests that the hash functions require a lot of computation time, and that access is not frequent enough to make up for this. We need to try an alternative implementation and compare the performance of the two.

It might make sense to keep both ways inside libfastjson, and permit the caller to specify which one should be used (e.g. linked lists if few fields per object are expected and hash table if many are).

In any case, the API should not assume that internally a hash table is used (there currently even is a function which returns a hash table).
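
For comparison, a linked-list lookup can be as small as the sketch below. struct field and field_find() are hypothetical names, and the int value stands in for a child object pointer. Whether this actually beats the hash table for typical objects is exactly what the proposed measurement would have to show.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical object representation: a singly linked list of fields. */
struct field {
	const char *key;
	int value;               /* stand-in for a child object pointer */
	struct field *next;
};

/* Linear search: no hash computation, just one strcmp per visited field. */
static struct field *field_find(struct field *head, const char *key)
{
	for (struct field *f = head; f != NULL; f = f->next)
		if (strcmp(f->key, key) == 0)
			return f;
	return NULL;
}
```

The cost grows linearly with the number of fields, so this only pays off when objects have few members, which matches the idea of letting the caller choose the representation.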

replace json_object_get_object with a better solution

[f]json_object_get_object() returns a hash table, which exposes an implementation detail that a caller should never know about. We need to remove and replace it.

For objects with multiple attributes, the iterator family of functions can be used. For objects with just a single attribute, we should provide a peek function for key and value, plus probably a function to obtain the number of attributes inside the object (maybe such a function already exists).

remove support for user-defined data types

We only support the standard json data types. While user-defined extensions are nice to have, they are heavy in regard to the logic to support them. We should only revise this decision if during implementation we find there is limited value in removing them.
