Comments (14)

cocoa-xu commented on June 15, 2024

Hi @zacky1972, thank you very much for proposing this issue!

I agree with you that evision should have better support for Nx. And I just had a quick look at Excv. I can see that you've put a lot of hard work into verifying the validity of parameters and creating convenient functions for bridging between Nx and OpenCV. Quite impressive work!

And I'm more than grateful that you would like to contribute to this project! The #Reference structure(s) is defined here. It is a C++ template struct because there are a large number of classes in OpenCV.

template<typename R>
struct evision_res {
    R val;
    static ErlNifResourceType * type;
};
template<typename R> ErlNifResourceType * evision_res<R>::type = nullptr;

The template type R is decided in gen2.py using the ClassInfo.issimple variable. The value of this boolean is based on whether the class is marked as simple in OpenCV's source code with CV_EXPORTS_W_SIMPLE. This marking is parsed in hdr_parser.py (https://github.com/cocoa-xu/evision/blob/main/py_src/hdr_parser.py#L259). If the class is marked as simple, then R=<class name>; otherwise, R=cv::Ptr<class name>.
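The selection rule can be sketched in isolation. This is a minimal illustration, not code from evision: resource_sketch, resource_payload_t, SimpleClass and RegularClass are hypothetical names, and std::shared_ptr stands in for cv::Ptr so that the snippet compiles without OpenCV or erl_nif.h.

```cpp
#include <memory>
#include <type_traits>

// Hypothetical stand-ins for two wrapped classes.
struct SimpleClass { int x; };   // imagine this one is marked CV_EXPORTS_W_SIMPLE
struct RegularClass { int y; };  // imagine this one is not

// Same shape as evision_res<R>, minus the ErlNifResourceType member.
template <typename R>
struct resource_sketch {
    R val;
};

// Picks R: the class itself when it is simple, otherwise a shared pointer to it
// (std::shared_ptr is an assumption standing in for cv::Ptr).
template <typename T, bool IsSimple>
using resource_payload_t =
    typename std::conditional<IsSimple, T, std::shared_ptr<T>>::type;

using SimpleRes  = resource_sketch<resource_payload_t<SimpleClass, true>>;
using RegularRes = resource_sketch<resource_payload_t<RegularClass, false>>;

static_assert(std::is_same<decltype(SimpleRes::val), SimpleClass>::value,
              "simple classes are stored by value");
static_assert(std::is_same<decltype(RegularRes::val), std::shared_ptr<RegularClass>>::value,
              "other classes are stored behind a smart pointer");
```

Storing simple classes by value avoids an extra allocation, while the pointer case keeps reference-counted ownership of heavier objects.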

As for Mat, it is a special case and I'm using evision_res<cv::Mat *> (R=cv::Mat *). It is registered here, the function that converts from #Reference to Mat can be found here, and lastly, here is the function that converts Mat to corresponding Erlang #Reference.

from evision.

cocoa-xu commented on June 15, 2024

Hi @zacky1972, thank you so much for contributing to this project. I'd like to add something which I hope will make it easier for you to implement anything in your plan. :)

If you would like to add NIFs, you may have a look at c_src/modules/opencv_mat.h. This file contains my custom NIFs for cv::Mat. Let's take the first custom NIF, evision_cv_mat_type, as an example.

static ERL_NIF_TERM evision_cv_mat_type(ErlNifEnv *env, int argc, const ERL_NIF_TERM argv[]) {
    using namespace cv;
    ERL_NIF_TERM error_term = 0;
    std::map<std::string, ERL_NIF_TERM> erl_terms;
    int nif_opts_index = 0;
    evision::nif::parse_arg(env, nif_opts_index, argv, erl_terms);

    {
        Mat img;

        if (evision_to_safe(env, evision_get_kw(env, erl_terms, "img"), img, ArgInfo("img", 0)))
        {
            // ...
        }
    }
}
  1. The comment that starts with // @evision c: (placed above the function in the source file) is parsed by gen2.py and becomes F(evision_cv_mat_type, 1) in the nif_function table.
  2. The comment that starts with // @evision nif: is parsed by gen2.py and becomes an Elixir function declaration in erl_cv_nif.ex.

The content below is outdated by 6286725.

  1. In this example, the function will expect a keyword list to be passed to it from Erlang/Elixir.

    :erl_cv_nif.evision_cv_mat_type([img: mat])

    hence the keywords in C++,

    const char *keywords[] = {"img", NULL};
  2. evision::nif::parse_arg takes the following function args:

    1. env
    2. the index of the opts arg. It should be 0 because we are using evision_cv_mat_type(_opts \\ []).
    3. argv
    4. the keywords
    5. a format string terminated by ":" that has n characters before the ":", where n is the number of entries in keywords. I'd like to improve this later. It should not be needed anymore because it is a convention carried over from Python.
    6. from the 6th argument onward, the addresses of the ERL_NIF_TERMs, in the same order as they are defined in the keywords list. I'd like to improve this later, too. It's not very easy to use.

Lastly, I will try my best to solve any problem you encounter. :)

cocoa-xu commented on June 15, 2024

I implemented it temporarily in OpenCV.Mat as follows:

https://github.com/zacky1972/evision/blob/feature/nx/lib/opencv_mat.ex#L20-L34

  def to_nx(mat) do
    unless is_nil(mat) do
      case {type(mat), shape(mat), to_binary(mat)} do
        {{:ok, type}, {:ok, shape}, {:ok, binary}} ->
          {
            :ok,
            Nx.from_binary(binary, type)
            |> Nx.reshape(shape)
          }
        _ -> {:error, "Unknown Mat type"}
      end
    else
      {:ok, nil}
    end
  end

But I haven't found a feature that generates a Mat from a binary, which is required to implement from_nx.

You are right. I haven't added the function to generate Mat from a binary.

Moreover, I have a question:

Does to_binary assume that the given binary is in order BGR?

I believe to_nx and from_nx should convert to and from an Nx tensor in RGB order instead of BGR, so if that is the case I should add cvtColor to the code of to_nx.

For this question, it can be tricky because we have to consider the following parameters:

  1. colorspace (RGB / YUV / YCbCr / etc.). If it is in the RGB colorspace, is it BGR / RGB / RGB565 / RGB555 / etc.?
  2. type of the data (float, double, int). Easy to deal with.
  3. number of channels, rows, cols. Easy to deal with.

Item 1 is the most difficult part. If our goal is to simplify it, we can require the input data to be either BGR or RGB. Or, we can pass detailed information so that OpenCV can correctly process the data.
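The simplified option (accept only BGR or RGB) could look roughly like the following on the native side. This is only a sketch under assumptions: ChannelOrder, parse_channel_order and to_bgr_in_place are hypothetical names that do not exist in evision, and the buffer is assumed to hold interleaved 8-bit, 3-channel pixels.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical: the two channel orders the simplified design would accept.
enum class ChannelOrder { BGR, RGB };

// Maps a colorspace name to a channel order; anything else is rejected.
inline std::optional<ChannelOrder> parse_channel_order(const std::string& name) {
    if (name == "BGR") return ChannelOrder::BGR;
    if (name == "RGB") return ChannelOrder::RGB;
    return std::nullopt;
}

// Reorders interleaved 3-channel 8-bit pixels in place so they end up as BGR.
// Returns false when the buffer length is not a multiple of 3.
inline bool to_bgr_in_place(std::vector<std::uint8_t>& pixels, ChannelOrder order) {
    if (pixels.size() % 3 != 0) return false;
    if (order == ChannelOrder::RGB) {
        for (std::size_t i = 0; i < pixels.size(); i += 3) {
            std::swap(pixels[i], pixels[i + 2]);  // swap the R and B bytes
        }
    }
    return true;
}
```

Rejecting unknown names up front keeps the harder cases (YUV, RGB565, and so on) out of this code path instead of guessing at their layout.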

cocoa-xu commented on June 15, 2024

If we want to simplify this problem, perhaps we can have something like

def to_mat(tensor, colorspace) do
    {rows, cols, channels} = Nx.shape(tensor)
    type = Nx.type(tensor)
    to_mat(Nx.to_binary(tensor), type, cols, rows, channels, colorspace)
end
def to_mat(binary, type, cols, rows, channels = 3, "RGB") do
    :erl_cv_nif.evision_cv_mat_to_mat([binary: binary, type: type, cols: cols, rows: rows, channels: channels, colorspace: "RGB"])
end
def to_mat(binary, type, cols, rows, channels = 3, "BGR") do
    :erl_cv_nif.evision_cv_mat_to_mat([binary: binary, type: type, cols: cols, rows: rows, channels: channels, colorspace: "BGR"])
end

The content below is outdated by 6286725.

As for the code on the C++ side,

// @evision c: evision_cv_mat_to_mat, 1
// @evision nif: def evision_cv_mat_to_mat(_opts \\ []), do: :erlang.nif_error("Mat::to_mat not loaded")
static ERL_NIF_TERM evision_cv_mat_to_mat(ErlNifEnv *env, int argc, const ERL_NIF_TERM argv[]) {
    using namespace cv;
    ERL_NIF_TERM error_term = 0;

    {
        ERL_NIF_TERM erl_term_data = erlang::nif::atom(env, "nil");
        ErlNifBinary data; // binary
        // special handling for erl_term_data
        ERL_NIF_TERM erl_term_type = erlang::nif::atom(env, "nil");
        // special handling for erl_term_type
        ERL_NIF_TERM erl_term_cols = erlang::nif::atom(env, "nil");
        int img_cols = 0;
        ERL_NIF_TERM erl_term_rows = erlang::nif::atom(env, "nil");
        int img_rows = 0;
        ERL_NIF_TERM erl_term_channels = erlang::nif::atom(env, "nil");
        int img_channels = 0;
        ERL_NIF_TERM erl_term_colorspace = erlang::nif::atom(env, "nil");
        std::string colorspace;

        Mat retval;

        const char *keywords[] = {"binary", "type", "cols", "rows", "channels", "colorspace", NULL};
        // Six keywords, so six characters before the ':' and six term addresses, in keyword order.
        if (0 < argc &&
            erlang::nif::parse_arg(env, 0, argv, (char **) keywords, "OOOOOO:",
                                   &erl_term_data, &erl_term_type, &erl_term_cols,
                                   &erl_term_rows, &erl_term_channels, &erl_term_colorspace)
            /* && convert erl_term* to corresponding C++ variables. */
        ) {
            // ...
        }
    }
}

What do you think?

cocoa-xu commented on June 15, 2024

I've updated the way evision parses arguments passed from Elixir.

erl_terms is the same as the keyword list opts in Elixir.

std::map<std::string, ERL_NIF_TERM> erl_terms;

evision_get_kw is basically the same as opts[:img] in Elixir.

evision_get_kw returns nil (as an ERL_NIF_TERM) if the key is not present.

evision_get_kw(env, erl_terms, "img")

Therefore, evision_cv_mat_to_mat can be written as below.

// @evision c: evision_cv_mat_to_mat, 1
// @evision nif: def evision_cv_mat_to_mat(_opts \\ []), do: :erlang.nif_error("Mat::to_mat not loaded")
static ERL_NIF_TERM evision_cv_mat_to_mat(ErlNifEnv *env, int argc, const ERL_NIF_TERM argv[]) {
    using namespace cv;
    ERL_NIF_TERM error_term = 0;
    std::map<std::string, ERL_NIF_TERM> erl_terms;
    int nif_opts_index = 0;
    evision::nif::parse_arg(env, nif_opts_index, argv, erl_terms);

    {
        Mat img;
        int img_cols = 0;
        int img_rows = 0;
        int img_channels = 0;
        std::string colorspace;

        Mat retval;

        if (evision_to_safe(env, evision_get_kw(env, erl_terms, "img"), img, ArgInfo("img", 0)) && 
            evision_to_safe(env, evision_get_kw(env, erl_terms, "cols"), img_cols, ArgInfo("img_cols", 0)) && 
            evision_to_safe(env, evision_get_kw(env, erl_terms, "rows"), img_rows, ArgInfo("img_rows", 0)) && 
            evision_to_safe(env, evision_get_kw(env, erl_terms, "channels"), img_channels, ArgInfo("img_channels", 0)) && 
            evision_to_safe(env, evision_get_kw(env, erl_terms, "colorspace"), colorspace, ArgInfo("colorspace", 0)))
        {
            ERL_NIF_TERM erl_binary = evision_get_kw(env, erl_terms, "binary");
            ErlNifBinary data;
            if (enif_inspect_binary(env, erl_binary, &data)) {
                // valid binary data

                // convert to Mat
                // not implemented ...
            } else {
                // invalid binary data
                return enif_make_badarg(env);
            }
        }
    }

    if (error_term != 0) return error_term;
    else return evision::nif::atom(env, "nil");
}
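The lookup behaviour described for evision_get_kw (return the stored term, or nil when the key is absent) can be illustrated without the Erlang headers. The sketch below is not the evision implementation: term_sketch stands in for ERL_NIF_TERM, and kNilTerm and get_kw_sketch are hypothetical names.

```cpp
#include <map>
#include <string>

// Stand-in for ERL_NIF_TERM so the sketch compiles without erl_nif.h.
using term_sketch = unsigned long;

// Hypothetical sentinel playing the role of the nil atom.
constexpr term_sketch kNilTerm = 0;

// Returns the term stored under `key`, or the nil sentinel when the key is
// absent, similar in spirit to opts[:key] on the Elixir side.
inline term_sketch get_kw_sketch(const std::map<std::string, term_sketch>& terms,
                                 const std::string& key) {
    const auto it = terms.find(key);
    return it == terms.end() ? kNilTerm : it->second;
}
```

Using find rather than operator[] keeps the map unchanged when a key is missing, so a failed lookup has no side effects.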

cocoa-xu commented on June 15, 2024

Hi @zacky1972, sorry for the ping. Just to let you know that I've added the OpenCV.Mat.from_binary function. Below is a tiny example.

{:ok, mat} = OpenCV.imread("/path/to/img.jpg")
{:ok, type} = OpenCV.Mat.type(mat)
{:ok, {rows, cols, channels}} = OpenCV.Mat.shape(mat)
{:ok, binary_data} = OpenCV.Mat.to_binary(mat)
{:ok, reconstructed} = OpenCV.Mat.from_binary(binary_data, type, cols, rows, channels)

Please feel free to let me know if I was missing anything in this implementation. Also, please don't hesitate to let me know if this was a good design when you use it in practice.

zacky1972 commented on June 15, 2024

Hi,

I implemented NxEvision: https://github.com/zeam-vm/nx_evision
This works well. Thank you.

cocoa-xu commented on June 15, 2024

This looks splendid!

zacky1972 commented on June 15, 2024

I began implementing it.

zacky1972 commented on June 15, 2024

I implemented it temporarily in OpenCV.Mat as follows:

https://github.com/zacky1972/evision/blob/feature/nx/lib/opencv_mat.ex#L20-L34

  def to_nx(mat) do
    unless is_nil(mat) do
      case {type(mat), shape(mat), to_binary(mat)} do
        {{:ok, type}, {:ok, shape}, {:ok, binary}} ->
          {
            :ok,
            Nx.from_binary(binary, type)
            |> Nx.reshape(shape)
          }
        _ -> {:error, "Unknown Mat type"}
      end
    else
      {:ok, nil}
    end
  end

But I haven't found a feature that generates a Mat from a binary, which is required to implement from_nx.

zacky1972 commented on June 15, 2024

Moreover, I have a question:

Does to_binary assume that the given binary is in order BGR?

I believe to_nx and from_nx should convert to and from an Nx tensor in RGB order instead of BGR, so if that is the case I should add cvtColor to the code of to_nx.

zacky1972 commented on June 15, 2024

I see.

May I create an additional new hex library bridging evision and Nx?

cocoa-xu commented on June 15, 2024

Sure, of course! We can have more flexibility in that way :)

As long as we have the necessary information passed to Mat, it will return a valid Mat instance.
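One piece of that necessary information can be checked before a Mat is built: the binary should be exactly as long as the shape and type imply. The sketch below only illustrates that arithmetic. expected_bytes and binary_matches_shape are hypothetical helpers, and it is an assumption, not something stated in this thread, that from_binary performs a check like this.

```cpp
#include <cstddef>

// Expected byte length of a dense image buffer. bits_per_element follows the
// {:u, 8} / {:f, 32} style of type tuples that appear in this thread.
inline std::size_t expected_bytes(std::size_t rows, std::size_t cols,
                                  std::size_t channels, std::size_t bits_per_element) {
    return rows * cols * channels * (bits_per_element / 8);
}

// True when the element size is a whole number of bytes and the binary
// has exactly the length implied by the shape.
inline bool binary_matches_shape(std::size_t binary_size, std::size_t rows, std::size_t cols,
                                 std::size_t channels, std::size_t bits_per_element) {
    return bits_per_element % 8 == 0 &&
           binary_size == expected_bytes(rows, cols, channels, bits_per_element);
}
```

For a 32x32 image with 3 channels of 8-bit data this gives 3072 bytes, which matches the binary-3072 segment used for CIFAR100 images elsewhere in this thread.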

vans163 commented on June 15, 2024

Hitting the same problem. CIFAR100 has the BGR/RGB channels reversed:

[image: test]
but needs to be
[image: test]

There should be a way when using to/from binary to set the channel order instead of having to do a bunch of Nx transformations.

To get it right
[image: test]
we need to do the following (for 32x32x3), but this is the same as just swapping RGB and BGR.

            r = Nx.slice(img, [0,0,0], [32,32,1])
            g = Nx.slice(img, [0,0,1], [32,32,1])
            b = Nx.slice(img, [0,0,2], [32,32,1])
            img = Nx.stack([b,g,r]) |> Nx.reshape({3,32,32}) |> Nx.transpose(axes: [1,2,0])
            img = Nx.to_binary(img)

Full code for completeness, for the first CIFAR100 image (produces RGB24 and grayscale from the original):

            <<coarse::8, fine::8, img::binary-3072, bin::binary>> = bin
            
            #CIFAR100 is in R::1024 G::1024 B::1024
            #turn it into BGR24
            img24 = Nx.reshape(Nx.from_binary(img, {:u,8}), {3, 32, 32})
            |> Nx.transpose(axes: [1,2,0])
            r = Nx.slice(img24, [0,0,0], [32,32,1])
            g = Nx.slice(img24, [0,0,1], [32,32,1])
            b = Nx.slice(img24, [0,0,2], [32,32,1])
            img24 = Nx.stack([b,g,r]) |> Nx.reshape({3,32,32}) |> Nx.transpose(axes: [1,2,0])
            img24 = Nx.to_binary(img24)

            grey = Evision.Mat.from_binary!(img24, {:u, 8}, 32, 32, 3)
            |> Evision.cvtColor!(Evision.cv_COLOR_BGR2GRAY)
            |> Evision.Mat.to_binary!()
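For comparison, the same planar-to-interleaved reordering can be written as a single pass over the bytes. This is a sketch of how a channel-order option might be handled natively, not an existing evision function: planar_rgb_to_bgr24 is a hypothetical name, and the input is assumed to use the CIFAR100 layout noted in the code comment (all R bytes, then all G, then all B).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Converts a planar image (all R, then all G, then all B) into interleaved
// BGR bytes (B, G, R for each pixel), the order that BGR2GRAY expects.
inline std::vector<std::uint8_t> planar_rgb_to_bgr24(const std::vector<std::uint8_t>& planar,
                                                     std::size_t rows, std::size_t cols) {
    const std::size_t pixels = rows * cols;
    assert(planar.size() == 3 * pixels);  // precondition: exactly three full planes

    std::vector<std::uint8_t> out(3 * pixels);
    for (std::size_t i = 0; i < pixels; ++i) {
        out[3 * i + 0] = planar[2 * pixels + i];  // B plane
        out[3 * i + 1] = planar[1 * pixels + i];  // G plane
        out[3 * i + 2] = planar[0 * pixels + i];  // R plane
    }
    return out;
}
```

For a 32x32 CIFAR100 image, rows and cols would both be 32 and the input would be the 3072-byte img segment.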
