beam-community / avro_ex Goto Github PK
View Code? Open in Web Editor NEWAn Avro Library that emphasizes testability and ease of use.
Home Page: https://hexdocs.pm/avro_ex/AvroEx.html
An Avro Library that emphasizes testability and ease of use.
Home Page: https://hexdocs.pm/avro_ex/AvroEx.html
UPDATE: I just realized this issue was recently fixed for arrays. It's still happening for maps but I might be able to fix it.
Hi,
I'm having an issue when trying to decode a previously encoded empty array or map.
Using the following schema:
{
"type": "record",
"name": "Minimal",
"fields": [{
"name": "my_list",
"type": {"type": "array", "items": "string"}
}]
}
When I try to encode an emtpy list, and then decode it again, the following happens:
iex> {:ok, encoded} = AvroEx.encode(schema, %{"my_list" => []})
{:ok, <<0>>}
iex> {:ok, decoded} = AvroEx.decode(schema, encoded)
** (FunctionClauseError) no function clause matching in AvroEx.Decode.variable_integer_decode/3
The following arguments were given to AvroEx.Decode.variable_integer_decode/3:
# 1
""
# 2
""
# 3
%AvroEx.Schema.Primitive{metadata: %{}, type: :long}
Attempted function clauses (showing 3 out of 3):
def variable_integer_decode(<<0::integer()-size(1), n::integer()-size(7), rest::bitstring()>>, acc, %AvroEx.Schema.Primitive{type: :integer}) when is_bitstring(acc) and is_bitstring(acc)
def variable_integer_decode(<<0::integer()-size(1), n::integer()-size(7), rest::bitstring()>>, acc, %AvroEx.Schema.Primitive{type: :long}) when is_bitstring(acc) and is_bitstring(acc)
def variable_integer_decode(<<1::integer()-size(1), n::integer()-size(7), rest::bitstring()>>, acc, type) when is_bitstring(acc) and is_bitstring(acc)
(avro_ex) lib/avro_ex/decode.ex:180: AvroEx.Decode.variable_integer_decode/3
(avro_ex) lib/avro_ex/decode.ex:77: AvroEx.Decode.do_decode/3
(avro_ex) lib/avro_ex/decode.ex:92: AvroEx.Decode.do_decode/3
(avro_ex) lib/avro_ex/decode.ex:101: AvroEx.Decode.do_decode/3
(avro_ex) lib/avro_ex/decode.ex:144: anonymous fn/4 in AvroEx.Decode.do_decode/3
(elixir) lib/enum.ex:2949: Enum.reduce_range_dec/4
(avro_ex) lib/avro_ex/decode.ex:143: AvroEx.Decode.do_decode/3
(avro_ex) lib/avro_ex/decode.ex:112: anonymous fn/3 in AvroEx.Decode.do_decode/3
If I switch the array
for a map
in the schema and try again with an empty map, the same behavior occurs.
Hi,
In confluent (and perhaps also in other AVRO schema check tools) this following schema passes the validation check.
{
"type": "record",
"name": "Level0",
"namespace": "root.namespace",
"fields": [
{
"name": "Level1",
"type": {
"type": "record",
"name": "Level1",
"fields": [
{
"name": "Lavel2",
"type": [
"null",
{
"type": "record",
"name": "Lavel2",
"fields": [
{
"name": "Lavel3",
"type": {
"type": "array",
"items": {
"type": "record",
"name": "Lavel4",
"fields": [
{
"name": "Lavel5",
"type": {
"type": "enum",
"name": "Lavel5",
"symbols": [
"EnumOption1",
"EnumOption2",
"EnumOption3",
"EnumOptionDefault"
],
"default": "EnumOptionDefault"
},
"default": "EnumOptionDefault"
}
]
}
},
"default": []
}
]
}
]
},
{
"name": "Lavel21",
"type": [
"null",
{
"type": "record",
"name": "Lavel21",
"fields": [
{
"name": "Lavel31",
"type": {
"type": "array",
"items": {
"type": "record",
"name": "Lavel41",
"fields": [
{
"name": "Lavel5",
"type": "Lavel5",
"default": "EnumOption1"
}
]
}
},
"default": []
}
]
}
]
}
]
}
}
]
}
But AvroEx.decode_schema/1
is throwing the following error.
(MatchError) no match of right hand side value:
{:error, %FunctionClauseError{module: AvroEx.Schema, function: :type_name, arity: 1, kind: nil, args: nil, clauses: nil}}
Am I missing something here, or is this a bug / missing support etc?
Can we get a new tag / release that includes the latest fixes? It looks like there as been several bugs fixed since the last release which was about a year ago.
If you have a record without a name
field, you'll get an error like
** (FunctionClauseError) no function clause matching in Ecto.Changeset.add_error/4
The following arguments were given to Ecto.Changeset.add_error/4:
# 1
#Ecto.Changeset<action: nil, changes: %{name: "match", type: %{"fields" => [%{"name" => "uuid", "type" => "string"}, %{"name" => "type", "type" => "string"}, %{"name" => "league", "type" => "string"}], "type" => "record"}}, errors: [], data: #AvroEx.Schema.Record.Field<>, valid?: true>
# 2
:type
# 3
%{name: ["can't be blank"]}
# 4
[]
Attempted function clauses (showing 1 out of 1):
def add_error(%Ecto.Changeset{errors: errors} = changeset, key, message, keys) when is_binary(message)
(ecto 3.7.1) lib/ecto/changeset.ex:1665: Ecto.Changeset.add_error/4
(ecto 3.7.1) lib/ecto/changeset.ex:851: anonymous fn/4 in Ecto.Changeset.on_cast_default/2
(ecto 3.7.1) lib/ecto/changeset/relation.ex:137: Ecto.Changeset.Relation.do_cast/6
(ecto 3.7.1) lib/ecto/changeset/relation.ex:351: Ecto.Changeset.Relation.map_changes/9
(ecto 3.7.1) lib/ecto/changeset/relation.ex:121: Ecto.Changeset.Relation.cast/5
(ecto 3.7.1) lib/ecto/changeset.ex:822: Ecto.Changeset.cast_relation/4
(avro_ex 1.2.0) lib/avro_ex/schema.ex:283: AvroEx.Schema.cast_schema/3
(avro_ex 1.2.0) lib/avro_ex/schema.ex:39: AvroEx.Schema.parse/2
Hey,
first of all thanks for a great project! I would like to request the latest master release as a package version because you did a great job and remove Poison
dependency in favour of Jason
.
I think it's a great improvement and I'm building a library based on this project, but unfortunately, I can't publish it with a dependency from git.
UPD: Do you plan to release stable?
Thanks ❤️
Hello! I'm getting an exception from the library when trying to decode nine digit integers greater than or equal to 140_000_000
. I've pasted the exception and its trace below:
16:35:46.379 [error] GenServer #PID<0.516.0> terminating
** (ArgumentError) argument error
(avro_ex) lib/avro_ex/decode.ex:220: AvroEx.Decode.variable_integer_decode/3
(avro_ex) lib/avro_ex/decode.ex:47: AvroEx.Decode.do_decode/3
(avro_ex) lib/avro_ex/decode.ex:139: anonymous fn/3 in AvroEx.Decode.do_decode/3
(elixir) lib/enum.ex:1940: Enum."-reduce/3-lists^foldl/2-0-"/3
(avro_ex) lib/avro_ex/decode.ex:138: AvroEx.Decode.do_decode/3
(avro_ex) lib/avro_ex/decode.ex:11: AvroEx.Decode.decode/2
(transactional_messaging_service) lib/avro/decoder.ex:9: Avro.Decoder.decode_datum/1
(transactional_messaging_service) lib/transactional_messaging_service/kafka_consumer.ex:17: anonymous fn/2 in TransactionalMessagingService.KafkaConsumer.handle_message_set/2
Last message: :timeout
Here's the pertinent snippet of the .avsc
schema file in question:
"fields": [
{
"name": "id",
"type": "int"
},
I've verified that 130_000_000 can be successfully decoded, but 140_000_000
cannot be successfully decoded.
There are a few things I'm not quite getting, and they're probably a result of my admitted lack of deep elixir debugging experience:
AvroEx.decode
into a function that matches against either {:ok, decoded_message}
and a default clause, but these exceptions are not being handled by the default function clause. Do you know why that might be?I'd love to help solve this issue in any way I can, so please let me know how I can be a helpful collaborator and contributor. Thank you so much in advance!
#70 added a better API for decoding messages, including a new exception AvroEx.DecodeError
. However, when issues occur, most of the time they do not generate help error messages. We should fix that!
Hey, thanks for putting together this library!
We are using AvroEx
to process messages sent over Kafka, and are running into some problems with a sufficiently complex avro model.
I've put together a small repo that isolates the issue we're running into here: https://github.com/mphfish/avro_ex_issue
Basically, what I've been able to isolate it down to, is what I think is an off-by-one error when you have an empty array that is behind nullable keys. If there are items in the array (as in the third unit test in that repo), the decode process works just fine. When we step through the code, do_decode/3
seems to be processing the empty byte arrays an additional time.
We've forked the code internally, and by adding:
def do_decode(%Record{}, _context, <<0>>), do: {%{}, <<>>}
def do_decode(%Record{}, _context, <<>>), do: {%{}, <<>>}
def do_decode(%Record{} = record, %Context{} = context, data) when is_binary(data) do
We're able to get it to properly decode messages, but we end up putting either %{}
or <<>>
into the arrays.
Unsure if you might have any ideas why this issue is occurring, but I'm happy to try anything we can with a much more complex Avro model that I can't have out in public.
Ecto is a big library and makes avro_ex
heavy-weight for what it is. I wanted to rewrite the schema validation portion to not use ecto, which I believe would be a breaking change and possibly controversial. I wanted to get some opinions before I took a crack at it
Hi! I'm having an issue when trying to decode a previously encoded message from another library.
Schema:
{
"type": "record",
"namespace": "lead",
"name": "CreateLead",
"fields": [
{
"name": "lead",
"type": {
"type": "record",
"name": "Lead",
"fields": [
{
"name": "id",
"type": {
"name": "uuid",
"type": "fixed",
"size": 36
}
},
{
"name": "phone",
"type": {
"name": "phone",
"type": "fixed",
"size": 11
}
},
{
"name": "name",
"type": "string"
},
{
"name": "utm_source",
"type": [
"null",
"string"
]
},
{
"name": "utm_medium",
"type": [
"null",
"string"
]
},
{
"name": "utm_campaign",
"type": [
"null",
"string"
]
},
{
"name": "utm_content",
"type": [
"null",
"string"
]
},
{
"name": "utm_term",
"type": [
"null",
"string"
]
},
{
"name": "referrer",
"type": [
"null",
"string"
]
}
]
}
}
]
}
Encoded message:
payload = "Obj\u0001\u0004\u0014avro.codec\bnull\u0016avro.schema\xEE\t{\"type\":\"record\",\"name\":\"CreateLead\",\"namespace\":\"lead\",\"fields\":[{\"name\":\"lead\",\"type\":{\"type\":\"record\",\"name\":\"Lead\",\"namespace\":\"lead\",\"fields\":[{\"name\":\"id\",\"type\":{\"type\":\"fixed\",\"name\":\"uuid\",\"namespace\":\"lead\",\"size\":36}},{\"name\":\"phone\",\"type\":{\"type\":\"fixed\",\"name\":\"phone\",\"namespace\":\"lead\",\"size\":11}},{\"name\":\"name\",\"type\":\"string\"},{\"name\":\"utm_source\",\"type\":[\"null\",\"string\"]},{\"name\":\"utm_medium\",\"type\":[\"null\",\"string\"]},{\"name\":\"utm_campaign\",\"type\":[\"null\",\"string\"]},{\"name\":\"utm_content\",\"type\":[\"null\",\"string\"]},{\"name\":\"utm_term\",\"type\":[\"null\",\"string\"]},{\"name\":\"referrer\",\"type\":[\"null\",\"string\"]}]}}]}\u0000\xE7ƶ\xEAҀ\xE4\u001E\xBA\xB3\x972ޚ\xA1\xC6\u0002\x94\u000190ff2231-68fe-43fe-a1c2-ac24e12bd38279854538053\u0018Сергей\u0002\u000Eorganic\u0000\u0000\u0000\u0000\u0000\xE7ƶ\xEAҀ\xE4\u001E\xBA\xB3\x972ޚ\xA1\xC6"
When I try to dencode this object:
{:ok, json_schema} = @schema_path |> Path.absname |> File.read
{:ok, parsed_schema} = AvroEx.parse_schema(json_schema)
message = AvroEx.decode(parsed_schema, payload)
** (MatchError) no match of right hand side value: <<114, 100, 34, 44, 34, 110, 97, 109, 101, 34, 58, 34, 80, 101, 114, 102, 111, 114, 109, 101, 114, 67, 114, 101, 97, 116, 101, 76, 101, 97, 100, 34, 44, 34, 110, 97, 109, 101, 115, 112, 97, 99, 101, 34, 58, 34, 114, 117, 46, 115, ...>> lib/avro_ex/decode.ex:121: AvroEx.Decode.do_decode/3 lib/avro_ex/decode.ex:127: AvroEx.Decode.do_decode/3 lib/avro_ex/decode.ex:139: anonymous fn/3 in AvroEx.Decode.do_decode/3
(elixir) lib/enum.ex:1925: Enum."-reduce/3-lists^foldl/2-0-"/3
lib/avro_ex/decode.ex:138: AvroEx.Decode.do_decode/3
lib/avro_ex/decode.ex:139: anonymous fn/3 in AvroEx.Decode.do_decode/3
(elixir) lib/enum.ex:1925: Enum."-reduce/3-lists^foldl/2-0-"/3
lib/avro_ex/decode.ex:138: AvroEx.Decode.do_decode/3
How do I fix this error?
The Avro spec is pretty loose in what you're allow to include in a schema. This can be unhelpful when you're trying to write a schema and want the parser to give you detailed information on what is a recognized field versus one that will be ignored.
In #62 I began to add a strict check that any additional fields on a schema would raise an exception. We cannot do this by default as it may cause issues with interop between languages. Instead, we can support this feature by add a strict mode that can be turned on as an option to the parser strict: true
.
I need to put a record into a array as an optional item.
It seems the optional key must be presented in this case.
Is this an expected behavior?
Elixir 1.8
OTP 22
:avro_ex 2.1
profile_schema = %{
"type" => "record",
"name" => "profile",
"fields" => [
%{"name" => "name", "type" => "string"},
%{"name" => "age", "type" => ["null", "int"]},
]
}
array_schmea = %{
"type" => "array",
"items" => [
"null",
"string",
profile_schema
]
}
final_schema = AvroEx.decode_schema!(array_schmea, strict: true)
profile = %{
name: "Alice"
}
example = ["1","2", profile]
example_bin = AvroEx.encode!(final_schema, example) |> IO.inspect(label: "encode")
AvroEx.decode!(final_schema, example_bin) |> IO.inspect(label: "decode")
** (AvroEx.EncodeError) Schema Mismatch: Expected value of Union<possibilities=null|string|Record<name=profile>>, got %{name: "Alice"}
(avro_ex 2.1.0) lib/avro_ex.ex:163: AvroEx.encode!/3
#cell:k6mfb4za2ebwu6sux5ix32zdwrvnacar:27: (file)
profile = %{
name: "Alice",
age: nil
}
example = ["1","2", profile]
example_bin = AvroEx.encode!(final_schema, example) |> IO.inspect(label: "encode")
AvroEx.decode!(final_schema, example_bin) |> IO.inspect(label: "decode")
for example, if a Date
is provided in a default, e.g. when passing an elixir map to AvroEx.decode_schema/2
, it must be encoded back to json. Not sure what the solution is here but one to dig into
I have an issue when I try to encode a schema using the AvroEx.encode(schema, payload)
This is my schema object:
%AvroEx.Schema{
context: %AvroEx.Schema.Context{
names: %{
"command.Payload" => %AvroEx.Schema.Record{
aliases: [],
doc: nil,
fields: [
%AvroEx.Schema.Record.Field{
aliases: [],
default: nil,
doc: nil,
id: nil,
name: "message",
type: %AvroEx.Schema.Primitive{metadata: %{}, type: :string}
}
],
id: nil,
metadata: %{},
name: "Payload",
namespace: nil,
qualified_names: ["command.Payload"]
},
"command.test" => %AvroEx.Schema.Record{
aliases: [],
doc: "A Test Publish Event",
fields: [
%AvroEx.Schema.Record.Field{
aliases: [],
default: nil,
doc: "standard event metadata",
id: nil,
name: "meta",
type: "seven.cloud.services.event_metadata"
},
%AvroEx.Schema.Record.Field{
aliases: [],
default: nil,
doc: "payload record",
id: nil,
name: "payload",
type: %AvroEx.Schema.Record{
aliases: [],
doc: nil,
fields: [
%AvroEx.Schema.Record.Field{
aliases: [],
default: nil,
doc: nil,
id: nil,
name: "message",
type: %AvroEx.Schema.Primitive{metadata: %{}, type: :string}
}
],
id: nil,
metadata: %{},
name: "Payload",
namespace: nil,
qualified_names: ["command.Payload"]
}
}
],
id: nil,
metadata: %{},
name: "command.test",
namespace: "command",
qualified_names: ["command.test"]
},
"seven.cloud.services.callback_topic" => %AvroEx.Schema.Record{
aliases: [],
doc: nil,
fields: [
%AvroEx.Schema.Record.Field{
aliases: [],
default: nil,
doc: nil,
id: nil,
name: "topic",
type: %AvroEx.Schema.Union{
possibilities: [
%AvroEx.Schema.Primitive{metadata: %{}, type: nil},
%AvroEx.Schema.Primitive{metadata: %{}, type: :string}
]
}
},
%AvroEx.Schema.Record.Field{
aliases: [],
default: nil,
doc: nil,
id: nil,
name: "partition",
type: %AvroEx.Schema.Union{
possibilities: [
%AvroEx.Schema.Primitive{metadata: %{}, type: nil},
%AvroEx.Schema.Primitive{metadata: %{}, type: :integer}
]
}
}
],
id: nil,
metadata: %{},
name: "callback_topic",
namespace: nil,
qualified_names: ["seven.cloud.services.callback_topic"]
},
"seven.cloud.services.event_metadata" => %AvroEx.Schema.Record{
aliases: [],
doc: nil,
fields: [
%AvroEx.Schema.Record.Field{
aliases: [],
default: nil,
doc: "id of this mesage",
id: nil,
name: "message_id",
type: %AvroEx.Schema.Primitive{metadata: %{}, type: :string}
},
%AvroEx.Schema.Record.Field{
aliases: [],
default: nil,
doc: "name of the emitting service",
id: nil,
name: "emitter_service",
type: %AvroEx.Schema.Primitive{metadata: %{}, type: :string}
},
%AvroEx.Schema.Record.Field{
aliases: [],
default: nil,
doc: "uuid of the emitting service",
id: nil,
name: "emitter_service_id",
type: %AvroEx.Schema.Primitive{metadata: %{}, type: :string}
},
%AvroEx.Schema.Record.Field{
aliases: [],
default: nil,
doc: "uuid of waiting callback in the emitting service",
id: nil,
name: "callback_id",
type: %AvroEx.Schema.Union{
possibilities: [
%AvroEx.Schema.Primitive{metadata: %{}, type: nil},
%AvroEx.Schema.Primitive{metadata: %{}, type: :string}
]
}
},
%AvroEx.Schema.Record.Field{
aliases: [],
default: nil,
doc: "name of the event. Even if the name is defined elsewhere meta should always include the event name",
id: nil,
name: "message_name",
type: %AvroEx.Schema.Primitive{metadata: %{}, type: :string}
},
%AvroEx.Schema.Record.Field{
aliases: [],
default: nil,
doc: "UTC time stamp of event publication",
id: nil,
name: "timestamp",
type: %AvroEx.Schema.Primitive{metadata: %{}, type: :string}
},
%AvroEx.Schema.Record.Field{
aliases: [],
default: "null",
doc: "assert callback message topic & partition",
id: nil,
name: "callback_topic",
type: %AvroEx.Schema.Union{
possibilities: [
%AvroEx.Schema.Primitive{metadata: %{}, type: nil},
%AvroEx.Schema.Record{
aliases: [],
doc: nil,
fields: [
%AvroEx.Schema.Record.Field{
aliases: [],
default: nil,
doc: nil,
id: nil,
name: "topic",
type: %AvroEx.Schema.Union{
possibilities: [
%AvroEx.Schema.Primitive{metadata: %{}, type: nil},
%AvroEx.Schema.Primitive{metadata: %{}, type: :string}
]
}
},
%AvroEx.Schema.Record.Field{
aliases: [],
default: nil,
doc: nil,
id: nil,
name: "partition",
type: %AvroEx.Schema.Union{
possibilities: [
%AvroEx.Schema.Primitive{metadata: %{}, type: nil},
%AvroEx.Schema.Primitive{
metadata: %{},
type: :integer
}
]
}
}
],
id: nil,
metadata: %{},
name: "callback_topic",
namespace: nil,
qualified_names: ["seven.cloud.services.callback_topic"]
}
]
}
}
],
id: nil,
metadata: %{},
name: "event_metadata",
namespace: "seven.cloud.services",
qualified_names: ["seven.cloud.services.event_metadata"]
}
}
},
schema: %AvroEx.Schema.Union{
possibilities: [
%AvroEx.Schema.Record{
aliases: [],
doc: nil,
fields: [
%AvroEx.Schema.Record.Field{
aliases: [],
default: nil,
doc: "id of this mesage",
id: nil,
name: "message_id",
type: %AvroEx.Schema.Primitive{metadata: %{}, type: :string}
},
%AvroEx.Schema.Record.Field{
aliases: [],
default: nil,
doc: "name of the emitting service",
id: nil,
name: "emitter_service",
type: %AvroEx.Schema.Primitive{metadata: %{}, type: :string}
},
%AvroEx.Schema.Record.Field{
aliases: [],
default: nil,
doc: "uuid of the emitting service",
id: nil,
name: "emitter_service_id",
type: %AvroEx.Schema.Primitive{metadata: %{}, type: :string}
},
%AvroEx.Schema.Record.Field{
aliases: [],
default: nil,
doc: "uuid of waiting callback in the emitting service",
id: nil,
name: "callback_id",
type: %AvroEx.Schema.Union{
possibilities: [
%AvroEx.Schema.Primitive{metadata: %{}, type: nil},
%AvroEx.Schema.Primitive{metadata: %{}, type: :string}
]
}
},
%AvroEx.Schema.Record.Field{
aliases: [],
default: nil,
doc: "name of the event. Even if the name is defined elsewhere meta should always include the event name",
id: nil,
name: "message_name",
type: %AvroEx.Schema.Primitive{metadata: %{}, type: :string}
},
%AvroEx.Schema.Record.Field{
aliases: [],
default: nil,
doc: "UTC time stamp of event publication",
id: nil,
name: "timestamp",
type: %AvroEx.Schema.Primitive{metadata: %{}, type: :string}
},
%AvroEx.Schema.Record.Field{
aliases: [],
default: "null",
doc: "assert callback message topic & partition",
id: nil,
name: "callback_topic",
type: %AvroEx.Schema.Union{
possibilities: [
%AvroEx.Schema.Primitive{metadata: %{}, type: nil},
%AvroEx.Schema.Record{
aliases: [],
doc: nil,
fields: [
%AvroEx.Schema.Record.Field{
aliases: [],
default: nil,
doc: nil,
id: nil,
name: "topic",
type: %AvroEx.Schema.Union{
possibilities: [
%AvroEx.Schema.Primitive{metadata: %{}, type: nil},
%AvroEx.Schema.Primitive{metadata: %{}, type: :string}
]
}
},
%AvroEx.Schema.Record.Field{
aliases: [],
default: nil,
doc: nil,
id: nil,
name: "partition",
type: %AvroEx.Schema.Union{
possibilities: [
%AvroEx.Schema.Primitive{metadata: %{}, type: nil},
%AvroEx.Schema.Primitive{
metadata: %{},
type: :integer
}
]
}
}
],
id: nil,
metadata: %{},
name: "callback_topic",
namespace: nil,
qualified_names: ["seven.cloud.services.callback_topic"]
}
]
}
}
],
id: nil,
metadata: %{},
name: "event_metadata",
namespace: "seven.cloud.services",
qualified_names: ["seven.cloud.services.event_metadata"]
},
%AvroEx.Schema.Record{
aliases: [],
doc: "A Test Publish Event",
fields: [
%AvroEx.Schema.Record.Field{
aliases: [],
default: nil,
doc: "standard event metadata",
id: nil,
name: "meta",
type: "seven.cloud.services.event_metadata"
},
%AvroEx.Schema.Record.Field{
aliases: [],
default: nil,
doc: "payload record",
id: nil,
name: "payload",
type: %AvroEx.Schema.Record{
aliases: [],
doc: nil,
fields: [
%AvroEx.Schema.Record.Field{
aliases: [],
default: nil,
doc: nil,
id: nil,
name: "message",
type: %AvroEx.Schema.Primitive{metadata: %{}, type: :string}
}
],
id: nil,
metadata: %{},
name: "Payload",
namespace: nil,
qualified_names: ["command.Payload"]
}
}
],
id: nil,
metadata: %{},
name: "command.test",
namespace: "command",
qualified_names: ["command.test"]
}
]
}
}
This is my payload object:
%{
"meta" => %{
"callback_id" => nil,
"callback_topic" => nil,
"emitter_service" => "SampleService",
"emitter_service_id" => "SampleHost",
"message_id" => "XDITQOMvUW5Fjzj6~COay",
"message_name" => "command.test",
"timestamp" => "2019-03-06 21:06:42.149636Z"
},
"payload" => %{"message" => "Hello World"}
}
And this is the error that I get:
** (Protocol.UndefinedError) protocol String.Chars not implemented for {:error, :data_does_not_match_schema, "null", %AvroEx.Schema.Union{possibilities: [%AvroEx.Schema.Primitive{metadata: %{}, type: nil}, %AvroEx.Schema.Record{aliases: [], doc: nil, fields: [%AvroEx.Schema.Record.Field{aliases: [], default: nil, doc: nil, id: nil, name: "topic", type: %AvroEx.Schema.Union{possibilities: [%AvroEx.Schema.Primitive{metadata: %{}, type: nil}, %AvroEx.Schema.Primitive{metadata: %{}, type: :string}]}}, %AvroEx.Schema.Record.Field{aliases: [], default: nil, doc: nil, id: nil, name: "partition", type: %AvroEx.Schema.Union{possibilities: [%AvroEx.Schema.Primitive{metadata: %{}, type: nil}, %AvroEx.Schema.Primitive{metadata: %{}, type: :integer}]}}], id: nil, metadata: %{}, name: "callback_topic", namespace: nil, qualified_names: ["seven.cloud.services.callback_topic"]}]}}. This protocol is implemented for: Atom, BitString, Date, DateTime, Decimal, Ecto.Date, Ecto.DateTime, Ecto.Time, Float, Integer, List, NaiveDateTime, Time, URI, Version, Version.Requirement
(elixir) /private/tmp/elixir-20180825-41637-j0b7x7/elixir-1.7.3/lib/elixir/lib/string/chars.ex:3: String.Chars.impl_for!/1
(elixir) /private/tmp/elixir-20180825-41637-j0b7x7/elixir-1.7.3/lib/elixir/lib/string/chars.ex:22: String.Chars.to_string/1
(elixir) lib/enum.ex:2777: anonymous fn/3 in Enum.join/2
(elixir) lib/enum.ex:1925: Enum."-join/2-lists^foldl/2-0-"/3
(elixir) lib/enum.ex:1925: Enum.join/2
(elixir) lib/enum.ex:1314: Enum."-map/2-lists^map/1-0-"/2
lib/avro_ex/encode.ex:125: AvroEx.Encode.do_encode/3
lib/avro_ex/encode.ex:146: AvroEx.Encode.do_encode/3
lib/avro_ex/encode.ex:16: AvroEx.Encode.encode/2
(stdlib) erl_eval.erl:677: :erl_eval.do_apply/6
(iex) lib/iex/evaluator.ex:249: IEx.Evaluator.handle_eval/5
(iex) lib/iex/evaluator.ex:229: IEx.Evaluator.do_eval/3
(iex) lib/iex/evaluator.ex:207: IEx.Evaluator.eval/3
(iex) lib/iex/evaluator.ex:94: IEx.Evaluator.loop/1
(iex) lib/iex/evaluator.ex:24: IEx.Evaluator.init/4
(iex) lib/iex/pry.ex:64: IEx.Pry.pry/2
lib/test_support/mock_schema_registry.ex:36: KaufmannEx.TestSupport.MockSchemaRegistry.encode_event/2
lib/test_support/mock_bus.ex:107: KaufmannEx.TestSupport.MockBus.given_event/3
test/event_handler_test.exs:7: Sample.EventHandlerTest."test Events Can Be published & observed"/1
AvroEx.Decode.decode
is very useful to partially decode an avro binary, and keep the remaining bytes for further decoding using another schema. We use it to decode Avro OCF files.
Would it be possible to officially expose it as part of Avro's API, either directly or with a function like AvroEx.decode_partial
?
I can submit a PR if it helps !
Mix.install([:avrora, :avro_ex])
import ExUnit.Assertions
template = """
{
"type": "record",
"name": "format",
"fields": [
{
"name": "amounts",
"type": {
"type": "array",
"items": {
"type": "record",
"name": "amount",
"fields": [
{
"name": "amount",
"type": "long"
},
{
"name": "type",
"type": {
"type": "enum",
"name": "amount_type",
"symbols": [
"amount_type_a",
"amount_type_b"
]
}
}
]
}
}
}
]
}
"""
Avrora.start_link()
{:ok, schema_avrora} = Avrora.Schema.Encoder.from_json(template)
{:ok, schema_avro_ex} = AvroEx.decode_schema(template)
data = %{
amounts: [
%{type: "amount_type_a", amount: 10_000_000_000},
%{type: "amount_type_b", amount: 12_000_000_000}
]
}
{:ok, payload_avrora} = Avrora.Codec.Plain.encode(data, schema: schema_avrora)
{:ok, payload_avro_ex} = AvroEx.encode(schema_avro_ex, data)
assert payload_avrora == payload_avro_ex
** (ExUnit.AssertionError)
Assertion with == failed
code: assert payload_avrora == payload_avro_ex
left: <<3, 24, 128, 144, 223, 192, 74, 0, 128, 224, 139, 180, 89, 2, 0>>
right: <<4, 128, 144, 223, 192, 74, 0, 128, 224, 139, 180, 89, 2, 0>>
The difference seems to be in the first few bytes.
I want to call out that this might also be an issue with avrora, but we're considering switching and I ran a chunk of our existing data against avro_ex to see what breaks.
There's a bunch of dialyzer issues. They need to be fixed and dialyzer should be run in CI
lib/avro_ex/decode.ex:135:call
The function call will not succeed.
AvroEx.Decode.do_decode(
%AvroEx.Schema.Primitive{:metadata => %{}, :type => :long},
_context :: %AvroEx.Schema.Context{_ => _},
_data :: binary()
)
breaks the contract
(
binary()
| %{
:__struct__ =>
AvroEx.Schema.Array
| AvroEx.Schema.Enum
| AvroEx.Schema.Fixed
| AvroEx.Schema.Map
| AvroEx.Schema.Primitive
| AvroEx.Schema.Record
| AvroEx.Schema.Record.Field
| AvroEx.Schema.Union
},
AvroEx.Schema.Context.t(),
binary()
) :: {any(), any()}
________________________________________________________________________________
lib/avro_ex/decode.ex:144:call
The function call will not succeed.
AvroEx.Decode.do_decode(
%AvroEx.Schema.Primitive{:metadata => %{}, :type => :bytes},
_context :: %AvroEx.Schema.Context{_ => _},
_data :: binary()
)
breaks the contract
(
binary()
| %{
:__struct__ =>
AvroEx.Schema.Array
| AvroEx.Schema.Enum
| AvroEx.Schema.Fixed
| AvroEx.Schema.Map
| AvroEx.Schema.Primitive
| AvroEx.Schema.Record
| AvroEx.Schema.Record.Field
| AvroEx.Schema.Union
},
AvroEx.Schema.Context.t(),
binary()
) :: {any(), any()}
________________________________________________________________________________
lib/avro_ex/decode.ex:178:call
The function call will not succeed.
AvroEx.Decode.do_decode(
%AvroEx.Schema.Primitive{:metadata => %{}, :type => :long},
_context :: %AvroEx.Schema.Context{_ => _},
_data :: binary()
)
breaks the contract
(
binary()
| %{
:__struct__ =>
AvroEx.Schema.Array
| AvroEx.Schema.Enum
| AvroEx.Schema.Fixed
| AvroEx.Schema.Map
| AvroEx.Schema.Primitive
| AvroEx.Schema.Record
| AvroEx.Schema.Record.Field
| AvroEx.Schema.Union
},
AvroEx.Schema.Context.t(),
binary()
) :: {any(), any()}
________________________________________________________________________________
lib/avro_ex/decode.ex:185:call
The function call will not succeed.
AvroEx.Decode.do_decode(
%AvroEx.Schema.Primitive{:metadata => %{}, :type => :long},
_context :: %AvroEx.Schema.Context{_ => _},
_data :: binary()
)
breaks the contract
(
binary()
| %{
:__struct__ =>
AvroEx.Schema.Array
| AvroEx.Schema.Enum
| AvroEx.Schema.Fixed
| AvroEx.Schema.Map
| AvroEx.Schema.Primitive
| AvroEx.Schema.Record
| AvroEx.Schema.Record.Field
| AvroEx.Schema.Union
},
AvroEx.Schema.Context.t(),
binary()
) :: {any(), any()}
________________________________________________________________________________
lib/avro_ex/decode.ex:201:call
The function call will not succeed.
AvroEx.Decode.do_decode(
%AvroEx.Schema.Primitive{:metadata => %{}, :type => :long},
_context :: %AvroEx.Schema.Context{_ => _},
_data :: binary()
)
breaks the contract
(
binary()
| %{
:__struct__ =>
AvroEx.Schema.Array
| AvroEx.Schema.Enum
| AvroEx.Schema.Fixed
| AvroEx.Schema.Map
| AvroEx.Schema.Primitive
| AvroEx.Schema.Record
| AvroEx.Schema.Record.Field
| AvroEx.Schema.Union
},
AvroEx.Schema.Context.t(),
binary()
) :: {any(), any()}
________________________________________________________________________________
lib/avro_ex/decode.ex:219:call
The function call will not succeed.
AvroEx.Decode.do_decode(
%AvroEx.Schema.Primitive{:metadata => %{}, :type => :long},
_context :: %AvroEx.Schema.Context{_ => _},
_data :: binary()
)
breaks the contract
(
binary()
| %{
:__struct__ =>
AvroEx.Schema.Array
| AvroEx.Schema.Enum
| AvroEx.Schema.Fixed
| AvroEx.Schema.Map
| AvroEx.Schema.Primitive
| AvroEx.Schema.Record
| AvroEx.Schema.Record.Field
| AvroEx.Schema.Union
},
AvroEx.Schema.Context.t(),
binary()
) :: {any(), any()}
________________________________________________________________________________
lib/avro_ex/encode.ex:208:invalid_contract
The @spec for the function does not match the success typing of the function.
Function:
AvroEx.Encode.zigzag_encode/2
Success typing:
@spec zigzag_encode(%AvroEx.Schema.Primitive{:type => :integer | :long, _ => _}, integer()) :: integer()
________________________________________________________________________________
lib/avro_ex/schema/union.ex:21:no_return
Function cast/1 has no local return.
________________________________________________________________________________
lib/avro_ex/schema/union.ex:22:call
The function call will not succeed.
AvroEx.Schema.Union.changeset(%AvroEx.Schema.Union{:possibilities => nil}, %{:possibilities => maybe_improper_list()}) ::
:ok
def a() do
:ok
end
breaks the contract
(t(), :invalid | %{:__struct__ => none(), (atom() | binary()) => any()}) :: any()
________________________________________________________________________________
done (warnings were emitted)
Halting VM with exit status 2
Support canonicalizing schemas to their parsing canonical form
https://avro.apache.org/docs/current/spec.html#Transforming+into+Parsing+Canonical+Form
Only logical type fields should be allowed
This will support a path to defining schemas directly in Elixir that then can be encoded and registered with a schema registry
Currently, encodeable?
delegates to the Schema
module, which in turn delegates to match?
functions on each type. The problem with this implementation is that it is hard to guarantee its correctness without some serious property testing (See #51 )
With that being said, the recommended usage of this function is just for testing purposes. With that being said, performance is not important and we can just try to encode the data. If it succeeds, true
, otherwise, false
Top-level unions cannot be named (I think?). The current behavior needs to be adjusted accordingly to what the spec says.
Possibly helpful fastavro/fastavro#388
test/encode_test.exs:393
** (FunctionClauseError) no function clause matching in AvroEx.Schema.cast/1
The following arguments were given to AvroEx.Schema.cast/1:
# 1
%{"name" => "maybe_null", "type" => ["null", "string"]}
Attempted function clauses (showing 10 out of 25):
def cast(nil)
def cast("null")
def cast("boolean")
def cast("int")
def cast("long")
def cast("float")
def cast("double")
def cast("bytes")
def cast("string")
def cast(%{"type" => nil} = data)
...
(15 clauses not shown)
code: schema = AvroEx.parse_schema!(~S({"type": ["null", "string"], "name": "maybe_null"}))
stacktrace:
https://github.com/beam-community/avro_ex/runs/7027232352?check_suite_focus=true
1) test encoding of large integer as a union of int and long (AvroEx.PropertyTest)
Error: test/property_test.exs:5
** (MatchError) no match of right hand side value: false
code: assert {:ok, ^data} = AvroEx.decode(schema, encoded)
stacktrace:
(avro_ex 2.0.1) lib/avro_ex/decode.ex:254: AvroEx.Decode.variable_integer_decode/4
Warning: (avro_ex 2.0.1) lib/avro_ex/decode.ex:68: AvroEx.Decode.do_decode/4
(avro_ex 2.0.1) lib/avro_ex/decode.ex:189: AvroEx.Decode.do_decode/4
Warning: (avro_ex 2.0.1) lib/avro_ex/decode.ex:[19](https://github.com/beam-community/avro_ex/runs/7027232352?check_suite_focus=true#step:6:20): AvroEx.Decode.decode/3
(avro_ex 2.0.1) lib/avro_ex.ex:176: AvroEx.decode/3
Warning: test/property_test.exs:13: (test)
Hi - the current release (avro_ex 0.1.0-beta.6
) has a dependency on ecto 2.1.0 or 2.2.0
, but ecto is now on v3.2.5. My project depends on ecto 3.x - are there plans to make a release with updated dependencies soon?
Hi,
I believe I'm having an issue when the lib tries to decode a float like this 2.56238833228314e-39
.
When calling AvroEx.Decode.decode(schema, avro_binary)
get this error:
** (FunctionClauseError) no function clause matching in :lists.nth/2
The following arguments were given to :lists.nth/2:
# 1
-39
# 2
[
%AvroEx.Schema.Primitive{metadata: %{}, type: nil},
%AvroEx.Schema.Primitive{metadata: %{}, type: :double}
]
(stdlib) lists.erl:170: :lists.nth/2
(avro_ex) lib/avro_ex/decode.ex:162: AvroEx.Decode.do_decode/3
(avro_ex) lib/avro_ex/decode.ex:139: anonymous fn/3 in AvroEx.Decode.do_decode/3
(elixir) lib/enum.ex:1940: Enum."-reduce/3-lists^foldl/2-0-"/3
(avro_ex) lib/avro_ex/decode.ex:138: AvroEx.Decode.do_decode/3
(avro_ex) lib/avro_ex/decode.ex:11: AvroEx.Decode.decode/2
I believe this is the bit causing the issue (as I don't have -39
on its own anywhere else in the file):
{
...
"additional": 2.56238833228314e-39,
...
}
Passing the following into AvroEx.decode_schema!/1
{
"type": "record",
"name": "record0",
"namespace": "namespace",
"fields": [
{
"name": "double_test",
"type": "double",
"default": 1
}
]
}
gives this error:
** (AvroEx.Schema.DecodeError) Invalid default in Field<name=double_test> Schema Mismatch: Expected value of double, got 1
Other validators, like the confluence cloud schema validator, validates this and to me it seems reasonable that an value without decimals should be parsed as a double.
With the following "fix" I was able to make AvroEx do as I wanted: https://github.com/beam-community/avro_ex/compare/master...larshesel:avro_ex:larshesel-double-integer-default?expand=1
I'm sure this is not the final solution - if you think this is an actual error I'd love to create a proper PR.
Heyo 👋
Ran into an edge case today where a schema defined in JSON containing nullable unions would have the "default": null
portion stripped out if it was run through a sequence of AvroEx.decode_schema
-> AvroEx.encode_schema
.
Example:
schema = AvroEx.decode_schema!(~S({
"type": "record",
"name": "Record",
"fields": [
{"type": ["null", "int"], "name": "nullable", "default": null}
]
}))
AvroEx.encode_schema(schema)
Output:
"{\"fields\":[{\"name\":\"nullable\",\"type\":[{\"type\":\"null\"},{\"type\":\"int\"}]}],\"name\":\"Record\",\"type\":\"record\"}"
I haven't dug too deep, but other primitives seem okay:
schema = AvroEx.decode_schema!(~S({
"type": "record",
"name": "Record",
"fields": [
{"type": ["null", "int"], "name": "nullable", "default": 5}
]
}))
AvroEx.encode_schema(schema)
"{\"fields\":[{\"default\":5,\"name\":\"nullable\",\"type\":[{\"type\":\"null\"},{\"type\":\"int\"}]}],\"name\":\"Record\",\"type\":\"record\"}"
Using v2.0.1
Hello,
We have encountered an error when encoding a DateTime
in a Union
:
** (Protocol.UndefinedError) protocol String.Chars not implemented for {:error, :data_does_not_match_schema, ~U[2020-09-17 12:56:50.438Z], %AvroEx.Schema.Union{possibilities: [%AvroEx.Schema.Primitive{metadata: %{}, type: nil}, %AvroEx.Schema.Primitive{metadata: %{"connect.name" => "org.apache.kafka.connect.data.Timestamp", "connect.version" => 1, "logicalType" => "timestamp-millis"}, type: :long}]}} of type Tuple. This protocol is implemented for the following type(s): Legacy.ReplayWorker.ReplayState, Common.Messages.DatasetsOnKafkaChanged, Common.Messages.WatchedKafkaConnectStatusChanged, ConnectorStateSummary, Postgrex.Query, Postgrex.Copy, Decimal, Atom, BitString, Date, DateTime, Float, Integer, List, NaiveDateTime, Time, URI, Version.Requirement, Version
code: new_avro_binary = AvroOCF.write(record_schema, new_blocks)
stacktrace:
(elixir 1.10.4) lib/string/chars.ex:3: String.Chars.impl_for!/1
(elixir 1.10.4) lib/string/chars.ex:22: String.Chars.to_string/1
(elixir 1.10.4) lib/enum.ex:1396: Enum."-map/2-lists^map/1-0-"/2
(elixir 1.10.4) lib/enum.ex:1396: Enum."-map/2-lists^map/1-0-"/2
(elixir 1.10.4) lib/enum.ex:1359: Enum.join/2
(elixir 1.10.4) lib/enum.ex:1396: Enum."-map/2-lists^map/1-0-"/2
(elixir 1.10.4) lib/enum.ex:1396: Enum."-map/2-lists^map/1-0-"/2
lib/avro_ex/encode.ex:124: AvroEx.Encode.do_encode/3
(elixir 1.10.4) lib/enum.ex:1396: Enum."-map/2-lists^map/1-0-"/2
(elixir 1.10.4) lib/enum.ex:1396: Enum."-map/2-lists^map/1-0-"/2
lib/avro_ex/encode.ex:124: AvroEx.Encode.do_encode/3
lib/avro_ocf.ex:47: Common.AvroOCF.encode_record/2
(elixir 1.10.4) lib/enum.ex:1396: Enum."-map/2-lists^map/1-0-"/2
test/avro_archive_file_test.exs:32: (test)
We were in the process of encoding a record as follows::
{:ok, record_encoded} = AvroEx.Encode.do_encode(record_schema.schema, record_schema.context, record)
Knowing that the schema comes from a decoding of an Avro file with avro_ex
.
After some research, it seems that something is missing here:
Line 108 in 75d683e
What do you think about it?
Change the encoder and decoder to throw exceptions rather than push error tuple a up the call stack. This will enable more efficient code and also will
result in better error messages
defmodule MyType do
use AvroEx.GeneratedSchema, file: "priv/avro_schemas/my_type.avsc"
def changeset(_, _) do
...
end
end
Alternatively, a mix task which generate .ex
files from a .avsc
file.
mix avro.gen.schema priv/schemas/my_schema.avsc MyApp.Context.SomeFile
test "works with logicalType field values" do
schema = AvroEx.parse_schema!(~S({"type": "record", "name": "Record", "fields": [
{"type": "long", "name": "timestamp", "logicalType": "timestamp-millis"}
]}))
timestamp = ~U[2022-02-23 20:28:13.498428Z]
assert {:ok, _} = @test_module.encode(schema, %{timestamp: timestamp})
end
Adding a test like this results in
code: assert {:ok, _} = @test_module.encode(schema, %{timestamp: timestamp})
left: {:ok, _}
right: {:error, %AvroEx.EncodeError{message: "Schema Mismatch: Expected value of long, got ~U[2022-02-23 20:28:13.498428Z]"}}
Hi,
Thank your for this amazing library !
I think I have found a small bug in the latest version of AvroEx: the empty string is not accepted as a valid value for a namespace, but I think that it should.
Indeed, while reading Avro's specification, it seems that an empty value is a valid value for the field namespace
(emphasis is mine):
A namespace is a dot-separated sequence of such names. The empty string may also be used as a namespace to indicate the null namespace.
Here is a small reproducible example:
Mix.install([{:avro_ex, "2.0.1"}])
schema = """
{
"type": "record",
"name": "something",
"namespace": "",
"fields" : [{"name": "count", "type": {"type": "string"}}]
}
"""
schema |> AvroEx.decode_schema() |> IO.inspect()
I get the following error:
{:error,
%AvroEx.Schema.DecodeError{
message: "Invalid name `` for `namespace` in %{\"fields\" => [%{\"name\" => \"count\", \"type\" => %{\"type\" => \"string\"}}], \"name\" => \"something\", \"namespace\" => \"\", \"type\" => \"record\"}"
}}
I stumbled upon this problem while trying to upgrade from AvroEx 1.2 to AvroEx 2.0.1. The avro files I'm decoding are produced by Avro's Java implementation.
Using the latest version of the code (commit 75d683e) I got this compilation error:
== Compilation error in file lib/avro_ex/schema/record/field.ex ==
** (CompileError) lib/avro_ex/schema/record/field.ex:20: type t/0 undefined (no such type in AvroEx.Schema.Record.Field)
(elixir 1.10.4) lib/kernel/typespec.ex:898: Kernel.Typespec.compile_error/2
(stdlib 3.13) lists.erl:1354: :lists.mapfoldl/3
(elixir 1.10.4) lib/kernel/typespec.ex:950: Kernel.Typespec.fn_args/5
(elixir 1.10.4) lib/kernel/typespec.ex:936: Kernel.Typespec.fn_args/6
(elixir 1.10.4) lib/kernel/typespec.ex:377: Kernel.Typespec.translate_spec/8
(stdlib 3.13) lists.erl:1354: :lists.mapfoldl/3
(elixir 1.10.4) lib/kernel/typespec.ex:229: Kernel.Typespec.translate_typespecs_for_module/2
could not compile dependency :avro_ex, "mix compile" failed. You can recompile this dependency with "mix deps.compile avro_ex", update it with "mix deps.update avro_ex" or clean it with "mix deps.clean avro_ex"
Elixir version: 1.10.4 (OTP 23)
The fix is to add this in the module as explained here:
@type t() :: %__MODULE__{}
I might make a PR when I have time.
Property test the library to guarantee correctness across encoding, decoding, and encodable?
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.