yfakariya commented on August 17, 2024

Thank you for your suggestion.

Can you give me some sample code that describes your use case?

from msgpack-cli.

Fristi commented on August 17, 2024

I used it in conjunction with memory-mapped files. I saw there is also a StreamAccessor for that, but you have to specify up front how many bytes you want to persist. When persisting through a stream there is no information about that, so I am using the ByteAccessor, for which I can calculate how many bytes I'll be persisting :)

I've done some more profiling and it seems the overhead is not that much, but in the end it's extra work, so it would be nice to have :)

yfakariya commented on August 17, 2024

What version do you use? Unfortunately, there are several independent
.NET implementations of MessagePack, so I guess you might be using an older
(or different) one.

By the way, I think you can use MsgPack for CLI with memory-mapped files
by using a System.IO.MemoryMappedFiles.MemoryMappedViewStream object and
the existing API, as follows:

// ---- Packing ----
// Assuming "mmvStream" and "serializingValue" (of type T) are given.
var serializer = MessagePackSerializer.Create<T>();
// Pack(Stream, T) is a convenience method that creates a Packer internally.
// Pass false for the ownsStream parameter to prevent Packer.Dispose from
// closing the stream.
using( var packer = Packer.Create( mmvStream, ownsStream: false ) )
{
    serializer.PackTo( packer, serializingValue );
}

// ---- Unpacking ----
// Assuming "mmvStream" is given and "deserializedValue" (of type T) is declared.
// Pass false for the ownsStream parameter to prevent Unpacker.Dispose from
// closing the stream.
var serializer = MessagePackSerializer.Create<T>();
// Unpack(Stream) is a convenience method that creates an Unpacker internally.
using( var unpacker = Unpacker.Create( mmvStream, ownsStream: false ) )
{
    deserializedValue = serializer.UnpackFrom( unpacker );
}

Fristi commented on August 17, 2024

.NET 4/4.5.

Yes, I saw that you could use MemoryMappedViewStream, but I prefer using MemoryMappedViewAccessor. I want to know the size of each serialized entity so I can get its position in the MMF and build indices :)

yfakariya commented on August 17, 2024

I see, you prefer MMVA.
Do you want a convenience method that returns a byte array without explicitly creating an intermediate Stream (usually a MemoryStream)? I don't quite understand whether "a lot of overhead" means developer overhead or object-creation overhead.

Fristi commented on August 17, 2024

The overhead is not that much, but it matters if the method is in the hot execution path. I don't know how much this will affect performance, though (not that much, I believe). Would this be a lot of work? From what I saw, you would have to write some new implementations of the packers/unpackers, right?

yfakariya commented on August 17, 2024

Yes, I would have to design a new API and implement it with higher performance.
That is hard for me right now, but a pull request is welcome :)

d4g33z commented on August 17, 2024

I have a similar issue. I would like to feed the deserializer with raw bytes off the wire, but I can't. This is the best I could come up with and it doesn't work either.

I should be able to deserialize a stream of \r\n delimited serialized tuples:

MessagePackSerializer<ArrayList> argsSerializer = MessagePackSerializer.Create<ArrayList>();
ArrayList rv = new ArrayList();
string line;
using(StreamReader reader = new StreamReader(stm)) {
    while((line = reader.ReadLine()) != null) {
        using(Stream s = GenerateStreamFromString(line)){
            rv.Add(argsSerializer.Unpack(s));
        }
    }
}

It throws an 'Unpacker is not in the array header' error. I know there must be a better way, but I can't seem to find it. In Python I can do this with amazing speed and I need to do the same in C#. Your library is awesome (and I appreciate the 3.5 build too), but I'm having trouble using it effectively for multiple serialized objects in one stream. I really don't want to redesign my server code.

Really, I want to put this in a coroutine-style iterator object and yield the deserialized tuples one at a time, which is trivial in Python and should be simple in C# as well. In Python, it's just:

def emit(self):
    """byte handling and yield"""
    self.socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    self.socket.connect((self.host, self.port))
    self.socket.sendall(self.request())
    self.delimiter="\r"
    tmp=[]
    while True:
        data = self.socket.recv(1)
        if data == self.delimiter:
            data = self.socket.recv(2)
            yield msgpack.unpackb(''.join(tmp))
            tmp = []
            if len(data) == 1:
                break
            data = data.lstrip()
        tmp.append(data)
    self.socket.close()
    return

Could we strip the C# API down and expose some of the basic functionality?

yfakariya commented on August 17, 2024

MessagePackSerializer assumes that the source stream contains an array of MessagePackObject, but you are supplying a SINGLE MessagePackObject MULTIPLE times. This behavior is by design. Alternatively, you can do it with the Unpacking API as follows:

MessagePackSerializer<List<MessagePackObject>> argsSerializer = MessagePackSerializer.Create<List<MessagePackObject>>();
List<MessagePackObject> rv = new List<MessagePackObject>();
string line;
using(StreamReader reader = new StreamReader(stm)) {
    while((line = reader.ReadLine()) != null) {
        // Note: if the tuple is [key, value], you can use Unpacking.UnpackArray instead.
        rv.Add(Unpacking.UnpackObject(GenerateBytesFromString(line)));
    }
}

Does this help?
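
To illustrate the "array header" the serializer is looking for: in the MessagePack format an array starts with a header byte, which is 0x90 to 0x9f for a "fixarray" of up to 15 elements, or 0xdc / 0xdd for longer arrays. The following is only a sketch in Python, not part of msgpack-cli; `is_array_header` and `fixarray_length` are hypothetical helper names chosen for this example.

```python
def is_array_header(first_byte: int) -> bool:
    """Return True if the byte starts a MessagePack array
    (fixarray, array 16 or array 32)."""
    return 0x90 <= first_byte <= 0x9F or first_byte in (0xDC, 0xDD)


def fixarray_length(first_byte: int) -> int:
    """For a fixarray header, the low four bits hold the element count."""
    if not 0x90 <= first_byte <= 0x9F:
        raise ValueError("not a fixarray header")
    return first_byte & 0x0F


packed_array = bytes([0x93, 0x01, 0x02, 0x03])  # the array [1, 2, 3]
packed_int = bytes([0x01])                      # the single integer 1

print(is_array_header(packed_array[0]))  # True
print(fixarray_length(packed_array[0]))  # 3
print(is_array_header(packed_int[0]))    # False
```

A serializer created for a list type expects the first case; bytes that start with anything else are rejected before any element is read.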

d4g33z commented on August 17, 2024

Yes, an Unpack(byte[]) that mimics Python's msgpack.unpackb() (and a matching Pack too, of course) in the C# API would make my design pattern more workable across all my code.

I believe that my iterator pattern of receiving individual MessagePack objects and yielding them immediately is much more efficient for my application, as it involves near-realtime updating of HUD info and lets me close sockets in the middle of a stream and still have yielded useful deserialized data.

Thanks for the advice on my C# code; I'll try your suggestions. But the Unpack(byte[]) code would really make me happy.

Keep up the good work.

On 2013-02-15, at 22:59, Yusuke Fujiwara wrote:

I recognize that an Unpack(byte[]) API can simplify your code even if it is
only a simple, naive wrapper around Unpack(Stream).
If the tiny overhead of wrapping does not matter, it is easy to add
Pack/Unpack(byte[]) overloads. Am I right?

BTW, d4g33z, you may have another issue related to msgpack arrays. I think
you can resolve your issue with one of the following:

  1. MessagePackSerializer assumes that the source stream contains an
    array of MessagePackObject, but you are supplying a SINGLE MessagePackObject
    MULTIPLE times. This behavior is by design. Alternatively, you can do it with
    the Unpacking API as follows:

    MessagePackSerializer<List<MessagePackObject>> argsSerializer =
        MessagePackSerializer.Create<List<MessagePackObject>>();
    List<MessagePackObject> rv = new List<MessagePackObject>();
    string line;
    using(StreamReader reader = new StreamReader(stm)) { // Note: I think there is another issue...
        while((line = reader.ReadLine()) != null) {
            rv.Add(Unpacking.UnpackObject(GenerateBytesFromString(line)));
        }
    }

  2. You can fix this issue by serializing args as a proper MsgPack array
    stream like {0x93, 0x1, 0x2, 0x3}. If the source stream can be a msgpack
    array, you can just deserialize it as follows:

    Stream stm = new MemoryStream(new byte[]{ 0x93, 0x1, 0x2, 0x3 });

    MessagePackSerializer<List<MessagePackObject>> argsSerializer =
        MessagePackSerializer.Create<List<MessagePackObject>>();
    List<MessagePackObject> rv = argsSerializer.Unpack(stm);


yfakariya commented on August 17, 2024

I pushed a new API to the master branch. Could you check the new API and send feedback? Although I think Unpack(byte[], ref int) is right, because the caller has to know how many bytes were consumed by deserialization, it does not match your scenario very well.

public abstract class MessagePackSerializer<T> {
    ...
    public byte[] Pack(T objectTree);
    public int Pack(byte[] buffer, T objectTree);
    public int Pack(byte[] buffer, int offset, T objectTree);
    public ArraySegment<byte> Pack(ArraySegment<byte> buffer, T objectTree);
    public T Unpack(byte[] buffer, ref int offset);
    public T Unpack(ref ArraySegment<byte> buffer);
}

If the design is OK, this improvement will be included in the next release.
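
The role of the offset in Unpack(byte[], ref int offset) can be sketched outside of C#. The Python snippet below is a minimal illustration, not msgpack-cli code: `unpack_uint` is a hypothetical function that handles only three unsigned integer encodings from the MessagePack format and returns the new offset alongside the value, which is what lets a caller walk through several objects packed back to back in one buffer.

```python
import struct


def unpack_uint(buffer: bytes, offset: int = 0):
    """Decode one unsigned integer starting at offset.

    Returns (value, new_offset). Only positive fixint, uint 8 and uint 16
    are handled; any other type tag raises ValueError.
    """
    tag = buffer[offset]
    if tag <= 0x7F:    # positive fixint: the tag itself is the value
        return tag, offset + 1
    if tag == 0xCC:    # uint 8: one payload byte
        return buffer[offset + 1], offset + 2
    if tag == 0xCD:    # uint 16: two big-endian payload bytes
        (value,) = struct.unpack_from(">H", buffer, offset + 1)
        return value, offset + 3
    raise ValueError("unsupported type tag 0x%02x" % tag)


# Three objects packed back to back: 5, 200 (uint 8) and 1000 (uint 16).
buffer = bytes([0x05, 0xCC, 0xC8, 0xCD, 0x03, 0xE8])

offset = 0
values = []
while offset < len(buffer):
    value, offset = unpack_uint(buffer, offset)
    values.append(value)

print(values)  # [5, 200, 1000]
```

A caller who already knows that each buffer holds exactly one object can ignore the returned offset, which is the simpler shape requested later in this thread.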

d4g33z commented on August 17, 2024

The byte[] array going into Unpack should be a representation of a single MessagePack object, so why can't the deserializer just count the bytes in it and start its read at the 0th byte? I don't think the consumer of the API should have to pass in a ref int to track an array index; the Python msgpack.unpackb() function doesn't require one. I don't really care where the deserializer's final read in the byte array is, because I've already split my stream into msgpack objects by simple \r\n delimiters. The final read is always byte[-1], so to speak, the last byte.

I just want the tuple that each byte[] array represents, as fast as possible, which I will then yield in an iterator. Then I grab the next line from the stream and do the same. Each tuple's deserialization step is independent of the other. It's in a co-routine, with an entry point when a serialized tuple comes off the wire.

I'll try with the new API tomorrow and post the code and some feedback, but I think there's a final simplification of the Unpack(byte[], ref int) call to be made.

I'm not experienced in C# so I could be missing something. The idea is to keep the socket connection as simple as possible and make the time delta between yields of a deserialized tuple in a stream of delimited serialized tuples as small as possible, and to allow breaking off the stream with a socket.close() anytime...

Thanks again. I wish I could help more.

yfakariya commented on August 17, 2024

I misunderstood your use case. I now understand that the API should assume
that a single object tree starts at the head of the bytes, and that the API
should not care about extra bytes; it is the consumer's responsibility to
pass the bytes of a single serialized object correctly.
I will fix it tonight (JST). Thank you for the detailed response.

d4g33z commented on August 17, 2024

Awesome. I think it's a good pattern for your API to hook into. Can't wait to try the new code.

Thanks.

yfakariya commented on August 17, 2024

I've just pushed the new API. Note that I renamed the methods to PackSingleObject and UnpackSingleObject to emphasize their use case.
Thanks.

d4g33z commented on August 17, 2024

Perfect, I see it.

d4g33z commented on August 17, 2024

I've made an attempt to use UnpackSingleObject(byte[]) within an iterator. The stream connections are meant to time out on a Read() and go away when the \r\n-delimited serialized records are exhausted.

        private MessagePackSerializer<List<MessagePackObject>> resultSerializer = MessagePackSerializer.Create<List<MessagePackObject>>();

        public IEnumerable<List<MessagePackObject>> GetResult(TcpClient tcpclnt)
        {
            byte[] msgpkb = new byte[0];
            byte[] data = new byte[1];
            Stream stm = tcpclnt.GetStream();
            while (true)
            {
                Print("Reading socket...");

                try
                {
                    stm.Read(data, 0, 1);
                }
                catch
                {
                    yield break;
                }

                if (data[0] == delimiter)
                {
                    try
                    {
                        stm.Read(data, 0, 1);
                    }
                    catch
                    {
                        yield break;
                    }

                    yield return resultSerializer.UnpackSingleObject(msgpkb);
                    msgpkb = new byte[0];
                }
                else
                {
                    try
                    {
                        msgpkb = Combine(msgpkb, data);
                    }
                    catch
                    {
                        yield break;
                    }
                }
            }
        }

And the use is:

foreach (List<MessagePackObject> record in GetResult(tcpclnt)) Process(record);

It randomly throws a MsgPack.InvalidMessagePackStreamException:

Reading socket... [repeated 27 times]
3324433 97.22 1 3 634968909000000000 
Reading socket... [repeated 27 times]
3324431 97.21 2 2 634968909000000000 
Reading socket... [repeated 6 times]
   at MsgPack.ItemsUnpacker.ReadSubtreeObject(MessagePackObject& result) in c:\Users\x\Documents\msgpack\msgpack-cli35\src\MsgPack\ItemsUnpacker.Unpacking.cs:line 7145
   at MsgPack.ItemsUnpacker.ReadCore() in c:\Users\x\Documents\msgpack\msgpack-cli35\src\MsgPack\ItemsUnpacker.cs:line 104
   at MsgPack.Unpacker.Read() in c:\Users\x\Documents\msgpack\msgpack-cli35\src\MsgPack\Unpacker.cs:line 289
   at MsgPack.Serialization.UnpackHelpers.UnpackCollectionTo[T](Unpacker unpacker, MessagePackSerializer`1 serializer, IEnumerable`1 collection, Action`1 addition) in c:\Users\x\Documents\msgpack\msgpack-cli35\src\MsgPack\Serialization\UnpackHelpers.cs:line 203
   at MsgPack.Serialization.EmittingSerializers.Generated.System_Collections_Generic_List_1_MsgPack_MessagePackObject_Serializer0.UnpackFromCore(Unpacker )
   at MsgPack.Serialization.MessagePackSerializer`1.UnpackFrom(Unpacker unpacker) in c:\Users\x\Documents\msgpack\msgpack-cli35\src\MsgPack\Serialization\MessagePackSerializer`1.cs:line 199
   at MsgPack.Serialization.AutoMessagePackSerializer`1.UnpackFromCore(Unpacker unpacker) in c:\Users\x\Documents\msgpack\msgpack-cli35\src\MsgPack\Serialization\AutoMessagePackSerializer`1.cs:line 113
   at MsgPack.Serialization.MessagePackSerializer`1.UnpackFrom(Unpacker unpacker) in c:\Users\x\Documents\msgpack\msgpack-cli35\src\MsgPack\Serialization\MessagePackSerializer`1.cs:line 199
   at MsgPack.Serialization.MessagePackSerializer`1.Unpack(Stream stream) in c:\Users\x\Documents\msgpack\msgpack-cli35\src\MsgPack\Serialization\MessagePackSerializer`1.cs:line 112
   at MsgPack.Serialization.MessagePackSerializer`1.UnpackSingleObject(Byte[] buffer) in c:\Users\x\Documents\msgpack\msgpack-cli35\src\MsgPack\Serialization\MessagePackSerializer`1.cs:line 336
   at xxx.muRpc.<GetResult>d__0.MoveNext() in c:\Users\x\Documents\xxxxxxxxx\muRpc.cs:line 186

Reading socket... [repeated 7 times]
.
.
.

Any ideas? Is there a better (but still fast) way to do this, or is there something to be fixed in UnpackSingleObject(byte[])?

Thanks in advance.

yfakariya commented on August 17, 2024

I think it might be an API bug, or the inbound bytes are actually invalid.
Could you show me your input data? If you can't, would you add tracing
code that dumps the byte array before the UnpackSingleObject call and show
the output? I cannot analyze any further from your post.

d4g33z commented on August 17, 2024

I think I got it. Seems to be flawless now. Thanks for sticking with me on this, it's pretty useful to me.

\r\n should never be part of a msgpack byte sequence, right?

        public IEnumerable<List<MessagePackObject>> GetResult(TcpClient tcpclnt)
        {
            byte[] msgpkb = new byte[0];
            byte[] data = new byte[1];
            byte[] tmp = new byte[1];

            Stream stm = tcpclnt.GetStream();

            while (true)
            {
                //Print("Reading socket...");

                try
                {
                    stm.Read(data, 0, 1);
                    Print("Got byte "+data[0]+" "+b2s(data));
                }
                catch
                {
                    Print("Break");
                    yield break;
                }

                if (data[0] == delimiter)
                {
                    Print("Got delimiter byte "+data[0]+" "+b2s(data));
                    tmp[0] = data[0];
                    try
                    {
                        stm.Read(data, 0, 1);
                        Print("Got byte, should be second delimiter "+data[0]+" "+b2s(data));
                    }
                    catch
                    {
                        Print("Break");
                        yield break;
                    }
                    if (data[0] == secondDelimiter)
                    {
                        Print("Yes, it was second delimiter");
                        Print("About to deserialize "+b2s(msgpkb));
                        yield return resultSerializer.UnpackSingleObject(msgpkb);
                        msgpkb = new byte[0];
                    }
                    else
                    {
                        try
                        {
                            Print("No, it was not second delimiter");
                            //Print("Old msgpkb bytes "+b2s(msgpkb));
                            msgpkb = Combine(msgpkb, tmp);
                            msgpkb = Combine(msgpkb, data);
                            //Print("New msgpkb bytes "+b2s(msgpkb));
                        }
                        catch
                        {
                            Print("Break");
                            yield break;
                        }
                    }
                }
                else
                {
                    try
                    {
                        //Print("Old msgpkb bytes "+b2s(msgpkb));
                        msgpkb = Combine(msgpkb, data);
                        //Print("New msgpkb bytes "+b2s(msgpkb));
                    }
                    catch
                    {
                        Print("Break");
                        yield break;
                    }
                }
            }
        }

yfakariya commented on August 17, 2024

As far as I know, there is no safe delimiter unless the application places
some restriction on the values of msgpack objects, because integers, reals,
and raw byte bodies can contain ANY byte sequence. Although I do not know
msgpack for Python very well, I have an idea.
Can you try sending the objects without any delimiters and use a Stream as
the data source? Unpacker.Create(Stream, bool) could help you.
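
A small concrete case shows why \r\n is not a safe delimiter. In the MessagePack format a uint 16 is written as the tag byte 0xcd followed by the value in two big-endian bytes, so the integer 3338 (0x0D0A) carries the bytes \r\n in its payload. The sketch below builds that encoding by hand with Python's standard library only; `pack_uint16` is a hypothetical helper for this example, not a library function.

```python
import struct


def pack_uint16(value: int) -> bytes:
    """Encode value as a MessagePack uint 16:
    tag 0xcd followed by two big-endian payload bytes."""
    return b"\xcd" + struct.pack(">H", value)


encoded = pack_uint16(0x0D0A)   # the integer 3338
print(encoded)                  # b'\xcd\r\n'
print(b"\r\n" in encoded)       # True

# A naive splitter cuts this single object apart.
stream = encoded + b"\r\n"      # one object followed by the delimiter
print(stream.split(b"\r\n"))    # [b'\xcd', b'', b'']
```

The same collision can occur inside floating-point payloads and raw byte strings, so a delimiter-based reader can only be made reliable by restricting the values that are sent.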

d4g33z commented on August 17, 2024

Interesting idea. I like the idea of perfect data transfer, but I've become very attached to the idea of asynchronous control via co-routines. Can Unpacker.Create(Stream, bool) be used to build an iterator that can yield control after emitting a single msgpack object in the Stream?

I think I might stick with the current pattern and trap an error every now and then. I was already filtering bad msgpack deserializations in Python and doing fine. Now that my byte-handling loop is more correct, I'll have even fewer errors. I ran one stream for about an hour and didn't hit a single \r\n sequence that wasn't a delimiter.

So, it's correct enough, empirically. For the record, here's the python code:

    def get(self):
        """byte handling and yield"""
        self.socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.socket.connect((self.host, self.port))
        self.socket.sendall(self.request())
        tmp = []
        while True:
            data = self.socket.recv(1)
            if data == self.delimiter:
                data = self.socket.recv(2)
                if data[0] == self.seconddelim:
                    yield msgpack.unpackb(''.join(tmp))
                    if len(data) == 1: break
                    tmp =[]
                    data = data[1]
                else:
                    tmp.append(self.delimiter)
                    data = data[1]
            tmp.append(data)
        self.socket.close()
        return

yfakariya commented on August 17, 2024

I'm sorry for the late reply.

I do not know Python very well, but I guess you can write similar code in C# as follows:

var listener = new TcpListener( IPAddress.IPv6Any, port  );
listener.Start();
try
{
    using ( var tcp = listener.AcceptTcpClient() )
    using ( var stream = tcp.GetStream() )
    {
        while ( true )
        {
            MessagePackObject mpo;
            try
            {
                mpo = Unpacking.UnpackObject( stream );
            }
            catch( UnpackException )
            {
                if( stream.DataAvailable )
                {
                    throw;
                }
                else
                {
                    // End of stream.
                    yield break;
                }
            }

            yield return mpo;
        }
    }
}
finally
{
    listener.Stop();
}

Unfortunately there is no UnpackSingleObject here, but it may work well.
Anyway, does the above code represent what you want to write?

...And if you are interested in RPC, please visit https://github.com/yfakariya/msgpack-rpc-cli/

d4g33z commented on August 17, 2024

Thanks, I'll try it out.

d4g33z commented on August 17, 2024

Of course, the stream interface works much better than taking a chance with byte delimiters, if I add an explicit 'end of stream' token to the data being sent. Thanks for all the info.
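
For the record, the delimiter-free pattern can be sketched in Python as well. This is a minimal illustration under stated assumptions, not the msgpack or msgpack-cli API: `read_object` is a hypothetical decoder that understands only nil, positive fixint, uint 16 and fixarray, and the choice of nil (0xc0) as the end-of-stream token is an assumption made for this example. Because every MessagePack object announces its own type and length, the generator can yield one tuple at a time without any delimiter.

```python
import io
import struct

END_OF_STREAM = None  # assumption: the sender packs nil (0xc0) as the last object


def read_exact(stream, count: int) -> bytes:
    """Read exactly count bytes, or raise EOFError if the stream ends early."""
    data = stream.read(count)
    if len(data) != count:
        raise EOFError("stream ended in the middle of an object")
    return data


def read_object(stream):
    """Decode one object (nil, positive fixint, uint 16 or fixarray only)."""
    tag = read_exact(stream, 1)[0]
    if tag <= 0x7F:            # positive fixint
        return tag
    if tag == 0xC0:            # nil
        return None
    if tag == 0xCD:            # uint 16, big-endian payload
        return struct.unpack(">H", read_exact(stream, 2))[0]
    if 0x90 <= tag <= 0x9F:    # fixarray: low four bits are the length
        return [read_object(stream) for _ in range(tag & 0x0F)]
    raise ValueError("unsupported type tag 0x%02x" % tag)


def iter_objects(stream):
    """Yield objects one at a time until the end-of-stream token arrives."""
    while True:
        obj = read_object(stream)
        if obj is END_OF_STREAM:
            return
        yield obj


# Two tuples sent back to back with no delimiter, then nil as the terminator.
# The second tuple contains 3338, whose payload bytes are "\r\n".
wire = io.BytesIO(bytes([0x92, 0x01, 0x02,
                         0x92, 0x03, 0xCD, 0x0D, 0x0A,
                         0xC0]))
print(list(iter_objects(wire)))  # [[1, 2], [3, 3338]]
```

Here `io.BytesIO` stands in for the network connection; with a real socket, a buffered file object such as the one returned by `socket.makefile("rb")` would likely play the same role.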

yfakariya commented on August 17, 2024

@d4g33z

Do you have any additional questions?

yfakariya commented on August 17, 2024

Closing because this has been open for a long time with no response.
Feel free to reopen this or create a new issue for any additional questions.
