Comments (85)
Thanks for reporting this issue! I hadn't really noticed that problem before,
but yes, you are right, this is a problem with C++ functions.
Actually, the same thing can happen with the *Buffer fields. I'm not sure how
to handle this efficiently. Any bright ideas?
Original comment by [email protected]
on 26 May 2013 at 11:34
- Changed title: Cached value of CvMat.fullSize and *Buffer fields might get incorrect
- Changed state: Accepted
from javacv.
I would remove all the fields, because they can't be updated reliably. The
getXxxBuffer() methods should always return a fresh Buffer instance, with a
freshly calculated capacity. We could encourage users to reuse the *Buffer
instance. I would also deprecate the fullSize() and reset() methods.
By the way, the CvMat.size() calculation is also incorrect in the case of a
sub-matrix. It should be:
public int size() {
    int rows = rows();
    return cols()*elemSize()*channels() + (rows > 1 ? step()*(rows-1) : 0);
}
For example consider:
CvMat mat = CvMat.create(10, 10, CV_8U);
CvMat matSub = CvMat.createHeader(5, 5, CV_8U);
cvGetSubRect(mat, matSub, cvRect(5, 5, 5, 5));
The size of mat's data is 100 bytes and its step is 10 (let's ignore
alignment). matSub's data points to mat.data + 55, that is, the same data
offset by the upper-left corner of the rectangle. The current version of the
size() method will return 50 for matSub, which goes beyond the mat.data
length, because 50 + 55 > 100. This problem is minor, as it does not affect
correctly implemented programs; it only allows buffer overflow errors that
are otherwise prevented.
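The arithmetic of the corrected formula can be checked against this example in plain Java (the variable values below are taken from the scenario above):

```java
// Check of the corrected size() arithmetic, using the values from the example
// above: a 10x10 CV_8U parent matrix (step = 10, elemSize = 1, channels = 1)
// and a 5x5 sub-rect at (5, 5).
public class SubMatSizeCheck {
    public static void main(String[] args) {
        int rows = 5, cols = 5, step = 10, elemSize = 1, channels = 1;
        // corrected formula: rows are step bytes apart, but the last row
        // only needs cols*elemSize*channels bytes
        int size = cols * elemSize * channels + (rows > 1 ? step * (rows - 1) : 0);
        int offset = 5 * step + 5; // matSub.data starts at mat.data + 55
        System.out.println(size);          // 45
        System.out.println(offset + size); // 100, exactly the parent's data length
    }
}
```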
Original comment by [email protected]
on 28 May 2013 at 4:52
Creating a Buffer is pretty expensive, and isn't that great for the garbage
collector either, so creating one for each call to CvMat.put() isn't an option.
That's why I'm doing this caching in the first place. It wasn't a problem with
the simpler (IMHO better) C API... Any thoughts?
Thanks for the size() fix! I'll put that in :)
Original comment by [email protected]
on 2 Jun 2013 at 1:04
I would deprecate all get() and put() methods, as they are slow and might
return incorrect results without reset().
When I started to use JavaCV, I began with the CvMat.get() and .put() methods,
but soon stopped, as they are much slower compared to using the Buffer
directly. JNI calls are expensive too: put(int i, double v) makes one call,
and put(int i, int j, double v) makes three more. Next, there is an
unnecessary conversion from double and back, which is not free (maybe it is
optimized out by the JIT; I did not test).
Compare these snippets. They each do one get, one multiply, and one put per
pixel. The first uses the (i, j) version, the second uses the (i) version,
the third uses the Buffer directly, and the fourth copies the buffer to a
Java array and back:
public static void testIJ() {
    long start = System.nanoTime();
    CvMat mat = CvMat.create(1000, 1000, CV_8U);
    for (int i = 0; i < 1000; i++)
        for (int j = 0; j < 1000; j++) {
            byte val = (byte) mat.get(i, j);
            val = (byte) (2*val);
            mat.put(i, j, val);
        }
    System.out.println((System.nanoTime() - start) / 1000000 + " ms");
}
public static void testI() {
    long start = System.nanoTime();
    CvMat mat = CvMat.create(1000, 1000, CV_8U);
    int step = mat.step() / mat.elemSize();
    int channels = mat.nChannels();
    for (int i = 0; i < 1000; i++)
        for (int j = 0; j < 1000; j++) {
            byte val = (byte) mat.get(i*step + j*channels);
            val = (byte) (2*val);
            mat.put(i*step + j*channels, val);
        }
    System.out.println((System.nanoTime() - start) / 1000000 + " ms");
}
public static void testDirect() {
    long start = System.nanoTime();
    CvMat mat = CvMat.create(1000, 1000, CV_8U);
    int step = mat.step() / mat.elemSize();
    int channels = mat.nChannels();
    ByteBuffer buffer = mat.getByteBuffer();
    for (int i = 0; i < 1000; i++)
        for (int j = 0; j < 1000; j++) {
            byte val = buffer.get(i*step + j*channels);
            val = (byte) (2*val);
            buffer.put(i*step + j*channels, val);
        }
    System.out.println((System.nanoTime() - start) / 1000 + " us");
}
public static void testJavaArray() {
    long start = System.nanoTime();
    CvMat mat = CvMat.create(1000, 1000, CV_8U);
    int step = mat.step() / mat.elemSize();
    int channels = mat.nChannels();
    ByteBuffer buffer = mat.getByteBuffer();
    byte[] jbuffer = new byte[mat.size()];
    // I assume the buffer is continuous, as mat.isContinuous() always
    // returns false
    buffer.get(jbuffer);
    for (int i = 0; i < 1000; i++)
        for (int j = 0; j < 1000; j++) {
            byte val = jbuffer[i*step + j*channels];
            val = (byte) (2*val);
            jbuffer[i*step + j*channels] = val;
        }
    buffer.position(0);
    buffer.put(jbuffer);
    System.out.println((System.nanoTime() - start) / 1000 + " us");
}
Resulting times (after 10 iterations):
TEST             HOTSPOT VM 1.6, 3 GHz ATHLON X2    ANDROID 2.3.5, 1 GHz 1-CORE (no JIT)
testIJ           280 ms                             9800 ms
testI            65 ms                              4500 ms
testDirect       3.5 ms                             1450 ms
testJavaArray    5.1 ms                             60 ms
The Android result, which I added for information, is surprising. Probably
due to the lack of a JIT, even Buffer access is very slow. On the desktop VM,
the JavaArray version is a bit slower than going through the Buffer, due to
the unnecessary copying of the buffer data.
If you want to simplify pixel access for users, you could add an interface,
say "PixelAccessor", which would do the expensive operations of creating the
Buffer and calculating the size upon initialization; users would be asked to
reuse the instance. I attach BytePixelAccessor as an example; a similar class
should be created for the other types. You could add a
createBytePixelAccessor() method to CvMat and IplImage (not
getBytePixelAccessor(), so that the name implies something is created upon
every invocation).
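The attached BytePixelAccessor isn't reproduced in this export, but a minimal sketch of the idea (class and method names assumed, using only java.nio) could look like:

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: the expensive steps (obtaining the Buffer, reading
// step and channels over JNI) happen once in the constructor; get/put then
// only compute plain offsets, with no further JNI calls.
public class BytePixelAccessor {
    private final ByteBuffer buffer;
    private final int step;     // row stride in bytes
    private final int channels;

    public BytePixelAccessor(ByteBuffer buffer, int step, int channels) {
        this.buffer = buffer;
        this.step = step;
        this.channels = channels;
    }

    public byte get(int i, int j) {
        return buffer.get(i * step + j * channels);
    }

    public void put(int i, int j, byte value) {
        buffer.put(i * step + j * channels, value);
    }
}
```

In JavaCV itself the constructor arguments would presumably come from mat.getByteBuffer(), mat.step(), and mat.nChannels(), so the instance can be reused across many pixel accesses.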
Viliam
Original comment by [email protected]
on 3 Jun 2013 at 5:30
Attachments:
They are not slow with floating-point matrices (a native function call isn't
slow, BTW), but I agree they are not ideal and shouldn't be used for integer
matrices...
Anyway, thanks for all the recommendations! I'm not sure how this is going to
turn out. The thing is, OpenCV itself switched to "cv::Mat", and I'm currently
working on something to help me wrap this kind of thing more easily (not just
for OpenCV), without having to create all the .java files by hand... When and
if I succeed, I will start looking into better ways of accessing these
structures in Java, i.e. your accessor class.
But it might take a long time before I come up with anything :(
What do you think about all this?
What do you think about all this?
Original comment by [email protected]
on 8 Jun 2013 at 1:15
I don't understand your statement that floating-point matrices aren't slow. I
did a test with 32F matrices, and here are the results:
TEST                 HOTSPOT VM 1.6, 3 GHz ATHLON X2    ANDROID 2.3.5, 1 GHz 1-CORE (no JIT)
testIJ_8u            231 ms                             9715 ms
testI_8u             59 ms                              4467 ms
testDirect_8u        3.4 ms                             1460 ms
testJavaArray_8u     13.0 ms                            120 ms
testIJ_32f           233 ms                             10267 ms
testI_32f            65 ms                              5050 ms
testDirect_32f       2.8 ms                             2199 ms
testJavaArray_32f    13.0 ms                            1143 ms
I also tested 64F matrices, and the times were roughly the same as for 32F.
testJavaArray_32f is much slower than testJavaArray_8u on Android, probably
because some memory limit was approached (I saw a "grow heap" operation after
every test run with 32F, but not with 8U). Each test was run 10 times; the
result was taken by averaging the last five runs.
There *is* JNI call overhead. If you call, say, five operations on a Mat, the
overhead is negligible, but if you call CvMat.get(i, j) 5 million times, each
of which makes 5 JNI calls, it is not. I first noticed this issue when
processing the HoughLines result. I iterated the returned collection of lines
many times, and just by copying the lines to a Java array instead of calling
CvPoint.x() and .y(), it ran noticeably faster.
I can contribute the PixelAccessor enhancement as a patch, if you wish.
Original comment by [email protected]
on 10 Jun 2013 at 5:03
FYI, here is the benchmark: http://pastebin.com/DtGBkzyk
Original comment by [email protected]
on 10 Jun 2013 at 5:06
Well, "slow" is relative; not as slow as creating a Buffer object :) But of
course it's still not optimal. And yes, Android/Dalvik does suck however one
looks at it.
BTW, your "accessor" solution reminds me of this article:
Java Programming for High-Performance Numerical Computing
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.13.1554
Actually, I feel this is something that should be included in JavaCPP. The way
I see it, we'd have a collection of classes, e.g. ByteAccessor, ShortAccessor,
etc., with constructors that accept a Buffer and the dimensions, and a
collection of appropriate get() and put() methods for 2D and 3D access.
Then we could include factory methods in the corresponding Pointer classes, as
well as in others like CvMat, CvPoint, etc.
What do you think? Is this something you'd like to contribute? It sounds
really useful in any case :)
And to cope with Android, instead of a Buffer, the accessors could also accept
a Pointer, slurping and dumping native memory on user request only.
Original comment by [email protected]
on 13 Jun 2013 at 7:24
I'd like to contribute. Let's summarize what has to be done:
- deprecate getXxxBuffer and fullSize methods on CvMat
- add XxxPixelAccessor classes
- add factory methods for PixelAccessors to CvMat and IplImage
- add CopyingPixelAccessor for Android (and others without JIT)
We should consider the PixelAccessor classes convenience classes for
developers. Direct Buffer access is still the fastest, if only slightly. This
is about hiding complexity from the developer and avoiding possible errors
when calculating indexes. There is no point in adding it to the Pointer
class, as it has no internal structure.
Regarding the CvPoint class, there are bulk put and get methods to and from
Java arrays (get and put) which could be rewritten using buffers too:
public final CvPoint put(int[] pts, int offset, int length) {
    asByteBuffer().asIntBuffer().put(pts, offset, length);
    return this;
}
public CvPoint get(int[] pts, int offset, int length) {
    asByteBuffer().asIntBuffer().get(pts, offset, length);
    return this;
}
These are approximately 10 times faster than the current versions (tested on
an array of 10,000 points on desktop Java). Also, a PointAccessor could be
written that uses Buffer access; it is about 30 times faster than using the
x() and y() methods. See the benchmark: http://pastebin.com/bUC6u5R7
Actually, such a class could be written for any C++ class, and, as you see,
it strongly depends on the internal structure of the class. These files
should be generated...
I read a JNI performance article, and one of the recommendations was to keep
the number of JNI calls low, because each has overhead. In my experience, such
algorithms should be written in C/C++ and called through JNI. Java should be
used only to invoke complex operations (for example, operations on whole
Mats), not to do pixel processing. Or for prototyping.
By the way, we are rewriting our entire algorithm in C++, as we do a lot of
pixel and point processing, and on Android it will be a lot faster.
Furthermore, we plan an iOS version too, so the code can be reused.
Original comment by [email protected]
on 14 Jun 2013 at 5:55
I understand there are a lot of little details to take care of, but first
things first: We need those "accessor" classes. Now, we could have a
BytePixelAccessor, a BytePointerAccessor, etc. but I feel that having one
ByteAccessor class usable on *any* structure would be more useful. Do you not
feel this would be better?
Original comment by [email protected]
on 14 Jun 2013 at 1:22
Let me try to explain this a bit better. Assume we have some `CvMat mat` and
`CvPoint pts`, then I think it would be nice to be able to do something like
this:
IntAccessor matAccessor = mat.getIntAccessor(); // or create ...
IntAccessor ptsAccessor = pts.getIntAccessor();
matAccessor.put(1, 1, ptsAccessor.get(1, 1));
Of course we'd have FloatAccessor and what not, and maybe the names could be
different too, but what I want to emphasize is the reuse of a common class,
used by both CvMat and CvPoint in this case. Do you not feel this would be a
good thing?
Original comment by [email protected]
on 16 Jun 2013 at 1:51
The accessors are about structure. CvMat.data does have structure, and CvPoint
does too, but Pointer does not. They let us write
    matAccessor.get(i, j)
instead of
    mat.getByteBuffer().get(i*mat.step() + j*mat.channels());
The PixelAccessor saves the JNI calls and the buffer creation (let's assume
getByteBuffer does not return a cached instance). The PixelAccessor offers no
more functionality than the current get() methods on CvMat; only its lifetime
is separate from that of the CvMat.
Try to write an Accessor for Pointer. I think it would offer no more
functionality than java.nio.XxxBuffer, so I think it is needless.
Original comment by [email protected]
on 17 Jun 2013 at 4:03
"Multidimensional arrays" in C/C++, the ones accessed like `mat[i][j]`, are
actually allocated in one contiguous area of memory: check it out! Java's
Buffer provides no way to access those. I think it would be very useful to
provide an interface for native multidimensional arrays. Why do you feel
differently?
Original comment by [email protected]
on 18 Jun 2013 at 12:30
Mat is not a multidimensional array, and neither is Pointer or Mat.data.
Actually, I have implemented the accessors already, but have not done any
tests. I will send a patch then. In the meantime, try to implement an
Accessor for Pointer yourself; you'll see there's no point in it.
Original comment by [email protected]
on 18 Jun 2013 at 1:06
Sure, please explain how you would access the pixel at (100, 100) of a
dc1394video_frame_t.image:
http://code.google.com/p/javacv/source/browse/src/main/java/com/googlecode/javacv/cpp/dc1394.java#834
Original comment by [email protected]
on 19 Jun 2013 at 2:00
I don't know; I don't know its structure. If it's a simple multidimensional
array, then it's image.capacity(rows*cols).asByteBuffer().get(100*cols + 100).
I would cache the buffer for fast access to the other pixels.
Original comment by [email protected]
on 19 Jun 2013 at 3:57
Here is the patch. Have a look at it and let me know.
Original comment by [email protected]
on 19 Jun 2013 at 4:32
Attachments:
I deprecated the getXxxBuffer() methods in CvMat, but found out that I still
need them to create an Android bitmap:
    bitmap.copyPixelsFromBuffer(cvImage.getByteBuffer());
For backwards compatibility, I would still deprecate the getXxxBuffer()
methods and create new createXxxBuffer() methods that always return a new
buffer. What do you think?
Original comment by [email protected]
on 19 Jun 2013 at 9:49
Let's go back to comment #16 first. Wouldn't it be nice instead to be able to
do something like this:
ByteAccessor a = new ByteAccessor(image, rows, cols);
byte b = a.get(100, 100);
?
Original comment by [email protected]
on 19 Jun 2013 at 12:11
Sorry for not responding; I'm quite busy now...
Now I get it: CvMat's pixel accessor could be generalized to a
multidimensional array accessor. But it not only has to know each dimension's
length, but the step too.
Now, I would put these in the JavaCPP project, not in JavaCV. In JavaCV, there
could be factory methods in CvMat to simplify its creation. What do you think?
Original comment by [email protected]
on 26 Jun 2013 at 8:02
Yes!!
We don't need to name it "step", even though for images that's obviously what
it's going to be used for as the "width". We don't need to know the actual
width of an image to access one pixel in memory. And to generalize more
easily, we could keep the width, height, step, or whatever in one
"int[] dimensions" array, and it would even be possible to have get/put
methods with varargs, but I'm not sure how that would play with specialized
2D and 3D methods...?
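The "int[] dimensions" idea can be sketched with a hypothetical helper (not actual JavaCPP API): given a strides array, an N-dimensional index is just a dot product, and varargs get/put methods would reduce to it:

```java
// Hypothetical sketch: generalized N-dimensional indexing via strides.
// strides[k] is the distance, in elements, between consecutive indices
// along dimension k; for a 2D image this would be { step, channels }.
public class NdIndex {
    public static long index(int[] strides, int... indices) {
        long idx = 0;
        for (int k = 0; k < indices.length; k++) {
            idx += (long) indices[k] * strides[k];
        }
        return idx;
    }
}
```

A specialized 2D get(i, j) would inline the same computation without the varargs array allocation, which is probably where the tension between varargs and dedicated 2D/3D methods comes from.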
Anyway, as for names, I think ByteAccessor, IntAccessor, etc. sound fine, but
what do you think?
When you have something, post it here:
http://code.google.com/p/javacpp/issues/
Thanks!
Original comment by [email protected]
on 27 Jun 2013 at 12:47
Bump! It would be really nice to have those classes in JavaCPP. Please let me
know if you are still interested in working on this, thanks!
Original comment by [email protected]
on 22 Sep 2013 at 12:46
Yes, I definitely plan to do it, but we postponed our CV project due to other
tasks. If nothing changes, I will get back to it in two to three weeks.
Original comment by [email protected]
on 25 Sep 2013 at 11:08
Hello,
I started to write the API. After writing it, I realized that the code without
this API is so simple that I hesitate over whether a new API is really needed.
Example: let's say we write a function to calculate the sum of all pixels of a
1-channel image. Using the new pixel accessor API, it would be:
public long imgSum(CvMat mat) {
    ByteArrayAccessor matacc = mat.createBytePixelAccessor(false);
    // cache to avoid repetitive JNI calls
    int cols = mat.cols();
    int rows = mat.rows();
    long sum = 0;
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < cols; j++)
            sum += matacc.get(i, j) & 0xff; // 0xff converts to unsigned
    return sum;
}
By directly accessing the Buffer, the code would be:
public long imgSum(CvMat mat) {
    ByteBuffer buf = mat.createDataByteBuffer();
    int step = mat.step();
    int rows = mat.rows();
    int cols = mat.cols();
    long sum = 0;
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < cols; j++)
            sum += buf.get(i*step + j) & 0xff;
    return sum;
}
The only saving is that the formula i*step + j is hidden. Is this worth a new
API?
A smarter programmer could even write it a little faster, like this:
public long imgSum(CvMat mat) {
    ByteBuffer buf = mat.createDataByteBuffer();
    int step = mat.step(), rows = mat.rows(), cols = mat.cols();
    long sum = 0;
    for (int i = 0; i < rows; i++) {
        int pos = i*step;
        for (int j = 0; j < cols; j++, pos++)
            sum += buf.get(pos) & 0xff;
    }
    return sum;
}
This avoids the i*step multiplication at every pixel, a trick I learned by
reading OpenCV's code... This trick is not possible with the pixel accessor.
If I have time tomorrow, I'll benchmark all three versions.
Please have a look at the patches before I continue with the other primitive
types, and let me know what you think.
Viliam
Original comment by [email protected]
on 12 Nov 2013 at 10:15
Attachments:
[deleted comment]
I just remembered that step is always in bytes; it should be divided by
elemSize(). It's been a long time since I worked with OpenCV. I will rewrite
it later.
BTW, our computer vision project is still postponed, maybe for a very long
time. I'm sad about that, because it was very interesting to me, and it would
surely have been a very good-selling mobile application. And finally, we plan
to use neither JavaCPP nor JavaCV in the final product, due to performance.
We do a lot of pixel processing, and a comparable function in OpenCV written
in C runs 10 times faster than our very similar one in Java. That's on
Android; on the desktop, the difference is not so big (probably thanks to the
JIT). But they were very good for our initial prototypes, so I feel I owe a
contribution...
Original comment by [email protected]
on 12 Nov 2013 at 10:28
It's fine to use a Buffer directly, but it does get complicated when accessing
many elements of a matrix. It's a lot easier if we can specify the indices as
we would in MATLAB, for example.
Your API looks fine, thanks! Still, I would shorten the names to ByteAccessor,
etc. instead of ByteArray..., BytePixel..., etc., because it's shorter and it
works for direct NIO Buffer objects as well, not just arrays or pixels, so it
removes potential confusion. And com.googlecode.javacpp.accessors (should this
be singular or plural?) is good as a package name, I think. Putting them in a
separate package also indicates that they may be used independently from the
classes in the main package, so good idea, thanks!
BTW, we can still use JavaCPP to call custom C/C++ functions. It's much nicer
than using raw JNI with the NDK, while imposing almost no additional overhead.
Original comment by [email protected]
on 17 Nov 2013 at 2:08
Okay, so I'll finish the rest. I prefer "accessors" (plural). I also prefer
ByteArrayAccessor over ByteAccessor, because it improves readability; IDEs
are there to complete names. Anyway, I'll write "BAA" in Eclipse, and I'll
probably get a better match than with "BA". I even thought of
ByteNdArrayAccessor, but did not like it. But I'll do what you say; let me
know.
I'd like to prepare an article, maybe on the wiki, where I summarize the
performance characteristics of each approach. Then I'd like to refer to it
from the javadocs. There I could summarize the individual ways of accessing
pixels along with their benchmarks.
Original comment by [email protected]
on 18 Nov 2013 at 6:03
Regarding the usage of JavaCPP in our project: we only have two JNI methods.
I had never used JNI before, so it was a way to learn it and to be able to
better understand JavaCPP, because, due to the lack of documentation and my
lack of experience with JNI, it seemed complicated to me...
Original comment by [email protected]
on 18 Nov 2013 at 6:06
I'm not sure it's a good idea to name something "ByteArrayAccessor" when it
doesn't access a byte array (it's technically accessing a buffer, in the
non-copied case at least)... What do you think?
And something I've just noticed from your code: `ByteBuffer.put(byte[])` is
very, very slow on Android, which is why I added `BytePointer.put(byte[])`:
https://code.google.com/p/javacpp/issues/detail?id=11
So let's think a bit more about that, and about your `copyBack()` method,
before going further.
I've added you as a contributor, so please feel free to add a wiki page!
Maybe you'd prefer to add one to the JavaCPP site though?
If you have only two JNI methods, maybe JavaCPP isn't so useful, yes.
Original comment by [email protected]
on 24 Nov 2013 at 2:01
Regarding issue 11, it seems to be some improper buffer implementation in
Android. That's why I even created the otherwise odd Copying version. Maybe
if I used the pointer.get(int) method, that would be faster and make the
Copying version useless. I have to benchmark it.
Regarding the name, what better name do you have? We started with
BytePixelAccessor for CvMat, but then generalized it to all arrays, hence
ByteArrayAccessor. Alternatives that come to my mind are:
ByteNdArrayAccessor
ByteNdElementAccessor
ByteNdArrayElementAccessor
ByteAccessor
ByteUtils
BytePointerUtil
BytePointerAccessor
ByteMdArrayAccessor
ByteMultiDimArrayAccessor
ByteElementAccessor
ElementAccessorByte
ArrayElemAccessorByte
ByteArrayElemAccessor
I think the only self-describing one is the longest. Any other would have to
be explained in the javadoc.
Original comment by [email protected]
on 25 Nov 2013 at 3:17
JNI calls are not as efficient as accessing array elements, so the copying
accessor makes sense. But we still need JNI to copy the whole array, something
like BytePointer.get(byte[])/put(byte[]), and we might as well use that.
So the names could be ByteBufferAccessor and BytePointerAccessor. I think
that's pretty self-describing? Maybe some other name than "Accessor" could
clarify the meaning, although I wouldn't want to lengthen the name further,
so I'm not sure what...
Original comment by [email protected]
on 26 Nov 2013 at 6:25
Maybe we could name them "Indexer" since it's a concept already used in C# for
multidimensional arrays, but not in Java?
http://www.javacamp.org/javavscsharp/indexer.html
http://msdn.microsoft.com/en-US/library/2yd9wwz4%28v=vs.80%29.aspx
That single word carries both the meaning of Array and Accessor, so we could
shorten the names, i.e.: ByteIndexer...
Original comment by [email protected]
on 30 Nov 2013 at 4:57
Okay, so here are the complete patches with the Indexer API. Apply both
patches to javacpp (the first adds "accessors", the second deletes that and
adds "indexers"; I was too lazy to do a clean clone...).
I liked the "Indexer" name most, so I completed it using that. I also wrote a
JUnit-like test, but it is not included in the patches, as you don't have
JUnit set up in your project.
I also did a performance test, with interesting results:
Test configurations:
1. JDK 1.6.0_41 64-bit, Athlon X2 64, 2-core 3 GHz, run with the -server switch
2. JDK 1.7.0_17 64-bit, Athlon X2 64, 2-core 3 GHz, run with the -server switch
3. Android 2.3.5, Qualcomm MSM8255, 1-core 1 GHz
(1) (2) (3)
sumBuffer 3507 3432 2 835 870
sumPointer 93962 93992 1 683 551
sumBufferCopy 5176 5275 132 824
sumPointerCopy 3824 4112 125 451
sumIndexer 3342 17884 3 329 522
sumIndexerCopy 4234 20491 742 822
cvSum 1885 1866 16 809
addBuffer 7688 7375 6 068 920
addPointer 207170 238679 3 636 462
addBufferCopy 9399 9423 137 023
addPointerCopy 6683 6488 135 833
addIndexer 7277 20948 7 160 247
addIndexerCopy 7628 8221 1 317 675
cvAddS 1986 1974 33 465
Test sources are in the attached Test.java. It tests two operations
(intensive in pixel access): "sum" adds up all pixel values (it only reads),
and "add" modifies pixels by incrementing them by one. As you see, there are
huge differences.
The "never do" version is accessing values through the pointer element by
element (sumPointer and addPointer). On Android, the "never do" is directly
accessing native pixels; copy them instead. I did not reproduce the Android
slowness with bulk ByteBuffer.get(byte[]) (compare sumBufferCopy and
sumPointerCopy).
Indexer access is a bit slower than working directly with the buffer or with
a copied array on Android, but its code is the simplest. This might be due to
unnecessary method calls, and because I cannot use a premultiplied value the
way I do when accessing the buffer. Strangely, it is a lot slower on JDK
1.7...
Not surprisingly, the C version is always the fastest (1.7-3.5 times on
HotSpot and 4-7 times on Android). And it does not depend on JIT/GC mood :)
I don't have a device with a newer Android handy; can you do a test with one?
I expect this problem will disappear (or maybe already has) one day.
My final recommendation is to always prefer C. Use JavaCV only if:
- you don't need to access pixels, i.e. you only call OpenCV's bulk operations
(such as a sequence of various image transformations)
- you need portability (but you will need the native libraries anyway)
- you are prototyping (i.e. fewer segfaults in Java)
- your algorithm is fast enough
If you decide to use pixel access, use these methods:
On Android:
1. access copied data manually
2. use a copying indexer for less code and slower speed (6-10 times)
On the JDK:
1. access the buffer manually
2. use a direct indexer for less code and possibly slower speed (it may not
be slower, but may be up to 5 times slower). Run on a good JRE :)
3. the more times you access the same pixel, the more copying the array pays
off
In general:
Always create the copy using Pointer.get(), not Buffer.get().
---
The above could be the basis for the wiki page I recommend writing. We could
link to it from the sources.
Please review the patches, and, if you can, try to review the benchmark too,
in case I have done something wrong.
Viliam
Original comment by [email protected]
on 11 Dec 2013 at 9:51
Attachments:
I happened to get a Nexus 10 with Android 4.4 handy (ARM Cortex-A15, 2-core,
1.7 GHz); here are the results:
sumBuffer 965 584
sumPointer 5 105 642
sumBufferCopy 34 331
sumPointerCopy 34 406
sumIndexer 1 151 433
sumIndexerCopy 165 345
cvSum 3 335
addBuffer 2 128 167
addPointer 11 186 691
addBufferCopy 35 940
addPointerCopy 35 791
addIndexer 2 574 975
addIndexerCopy 410 906
cvAddS 8 068
Still no change: buffer element access is slow, buffer bulk access is as fast
as the pointer's, and C is 10 times faster (in this case OpenCV's cvSum or
cvAddS might even have used multiple cores; I did not check the code). Still,
copying is a lot faster.
Original comment by [email protected]
on 11 Dec 2013 at 10:47
[deleted comment]
Great, thanks! It's going to take me a while to review that. It's actually
important functionality I think, so I want to make sure we get the API right. I
have a couple of comments for now though:
1. Please do provide patches that apply cleanly to the HEAD of master branches
(no need right now though)
2. Are you OK with licensing that under the GPLv2 with Classpath exception? Or
would you prefer something else?
3. Maven supports unit testing if we place files in the right directories, so
feel free to put stuff there
4. You seem to provide a couple of unrelated fixes for opencv_core.java. Are
all the changes that are in there indeed fixes?
5. Have you tried to write data back to native memory? Direct NIO buffers on
Android should be pretty slow in that case.
Thanks for everything!
Original comment by [email protected]
on 15 Dec 2013 at 7:49
1. The patches should apply to the HEAD. There are just two patches because I
did it in two commits.
2. Yes.
3. I'll try, hopefully next week.
4. There is only one unrelated fix, for CvMat.isContinuous(). It always
returned false, because it used type() instead of raw_type(), and type() has
some bits cleared, including the continuous bit.
5. The "sum" benchmark does not write back; it only reads pixels. The "add"
benchmark does (it modifies pixels).
Original comment by [email protected]
on 15 Dec 2013 at 11:16
5. Does this mean both Buffer.put(array) and Buffer.get(array) work
efficiently on Android? I wonder why Adam found otherwise:
https://code.google.com/p/javacpp/issues/detail?id=11
Original comment by [email protected]
on 15 Dec 2013 at 1:04
I reproduced issue 11. I actually wonder why it did not reproduce in my test
case. Maybe because I tested only ByteBuffer, while in issue 11 they had the
problem with floats or ints; I will have to give it a try...
Anyway, I use Pointer's bulk methods for the copying in the Copying indexers.
Ok, so let's take a step back. "NIO" stands for "New I/O", and the intention
was that we should be able to do everything, and do it efficiently, using
only *Buffer objects and nothing else, but on Android this isn't the case
yet. So we're left to hack things up for now. And your original proposal was
optimal in that sense.
What would you think of the following, then? We have only one Indexer per
type (ByteIndexer, ShortIndexer, etc.), and they all accept a Buffer object
of the corresponding type. But we also provide a flag to indicate whether we
want to use it in "copying" mode. (BTW, it's possible to create a Pointer
from a Buffer; that's what I'm doing in FFmpegFrameRecorder.java, for
example.) And eventually, when Android gets better, this flag should become
obsolete, but at least we won't be stuck with a hackish API.
How does that sound?
Original comment by [email protected]
on 16 Dec 2013 at 3:56
- Changed state: Started
Sounds good. But the normal use case for the copying version would be the
following:
    CopyingByteIndexer ix = (CopyingByteIndexer) ByteIndexer.createXxx(..., true);
    // doing work...
    // ix.copyBack();
If we wanted to silently ignore the copying flag, we would no longer return a
CopyingByteIndexer instance. So I recommend making the Copying- and Direct-
subclasses package-visible (i.e. default visibility) and putting copyBack()
in the abstract class, where it would be a no-op for the Direct- version. The
documentation would clearly state when we need to call copyBack().
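A pure-java.nio sketch of that shape (all names assumed; a real patch would presumably use JavaCPP's Pointer bulk methods for the copies, per the issue 11 discussion) could be:

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: one public abstract class, package-private subclasses
// chosen by a "copying" flag, and a copyBack() that is a no-op when direct.
public abstract class ByteIndexer {
    protected final int step;

    protected ByteIndexer(int step) { this.step = step; }

    public static ByteIndexer create(ByteBuffer buffer, int step, boolean copying) {
        return copying ? new CopyingByteIndexer(buffer, step)
                       : new DirectByteIndexer(buffer, step);
    }

    public abstract byte get(int i, int j);
    public abstract void put(int i, int j, byte value);
    public abstract void copyBack(); // no-op for the direct version
}

class DirectByteIndexer extends ByteIndexer {
    private final ByteBuffer buffer;
    DirectByteIndexer(ByteBuffer buffer, int step) { super(step); this.buffer = buffer; }
    public byte get(int i, int j) { return buffer.get(i * step + j); }
    public void put(int i, int j, byte value) { buffer.put(i * step + j, value); }
    public void copyBack() { /* nothing to do: writes went straight through */ }
}

class CopyingByteIndexer extends ByteIndexer {
    private final ByteBuffer buffer;
    private final byte[] array;
    CopyingByteIndexer(ByteBuffer buffer, int step) {
        super(step);
        this.buffer = buffer;
        this.array = new byte[buffer.capacity()];
        buffer.position(0);
        buffer.get(array);  // one bulk copy in
        buffer.position(0);
    }
    public byte get(int i, int j) { return array[i * step + j]; }
    public void put(int i, int j, byte value) { array[i * step + j] = value; }
    public void copyBack() {  // one bulk copy out, only when the user asks
        buffer.position(0);
        buffer.put(array);
        buffer.position(0);
    }
}
```

With this shape, code written against the abstract class can always call copyBack() unconditionally, and the flag becoming obsolete later would not break any callers.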
I did tests with IntPointer/IntBuffers, here are the results:
(1) (2) (3)
sumBuffer 1673 1781 1055676
sumPointer 23049 22984 452716
sumBufferCopy 4172 4168 1081054
sumPointerCopy 3026 3026 40960
sumIndexer 1612 5785 1180523
sumIndexerCopy 4137 6935 302331
cvSum 1399 1364 9222
addBuffer 2325 2289 2119384
addPointer 55452 48051 865838
addBufferCopy 7648 7529 1084564
addPointerCopy 4990 4744 45794
addIndexer 2230 6889 2490893
addIndexerCopy 7215 7943 448577
cvAddS 1778 1870 11267
Now the slowness of the bulk buffer get is reproduced (the issue 11 problem).
And the pointer copy takes 2/3 of the time of the buffer copy even on the
JDK, so it's always faster.
What's interesting: with bytes, addPointerCopy was faster than addBuffer;
with ints, it's the opposite. Now it's very tough to give a general rule
about what's faster. I can think of these:
- on Android, always work on a copy; on the JDK, do your own test
- work manually with a buffer or a copy for better speed
- use an indexer for shorter but slower code
Tomorrow, I'll rewrite the code to always take a Buffer parameter. And what
do you think about hiding the copying and direct subclasses? Anyway, I don't
think we will be able to remove the copying version any sooner than in 15-20
years :)
Viliam
Original comment by [email protected]
on 16 Dec 2013 at 5:54
from javacv.
Well, I don't know about the subclasses either. In the case of a direct NIO
buffer, we can copy its content to a non-direct NIO buffers. We would basically
use the two-class pattern at the Buffer level. No need to duplicate it a level
higher, if it's as efficient that way. I'm pretty sure that would be as
efficient as an array, even on Android, but we should test this out.
BTW, when comparing with optimized functions like cvSum(), be sure to optimize
the code in Java as well, by unrolling loops, etc:
http://stackoverflow.com/questions/10784951/do-any-jvms-jit-compilers-generate-c
ode-that-uses-vectorized-floating-point-ins/10809123#10809123
Original comment by [email protected]
on 16 Dec 2013 at 6:12
from javacv.
Not sure about your 15~20 years estimate either
http://www.extremetech.com/computing/170677-android-art-google-finally-moves-to-
replace-dalvik-to-boost-performance-and-battery-life
:)
Original comment by [email protected]
on 16 Dec 2013 at 6:19
from javacv.
Here are results for Android with unrolled loop, compared to C
(1) android 2.2 1-core 1GHz with unrolled loops
(2) same as above, without unrolling
(1) (2)
sumBuffer 981770 1055676
sumPointerCopy 31024 40960
cvSum 9002 -
As you see, a bit better, but not a lot. Maybe the Android's JIT does automatic
unrolling, but still not comparable to C. So still no reason to use java on
android if you need performance. I checked the source of cvSum, they explicitly
unroll 4 times in case of 1-channel images. In others, they just do simple loop
(and the compiler might do automatic unroll).
I also tried the the unrolled version on HotSpot, the difference is negligible:
(1) with unrolled loops, jdk 1.6.0_41-64bit, athlon x2 64, 2-core 3GHz, run
with -server switch
(2) same as above, without unrolling
(1) (2)
sumBuffer 1696 1673
sumPointerCopy 2950 3026
cvSum 1453 -
Note, that I measure the the times as an average of 5 runs, after 15 "heat up"
runs, with the -server JVM switch (-server causes lower threshold for the JIT
to take place). Try to do this with your example at
http://stackoverflow.com/a/10809123/952135, probably openjdk's JIT does not
unroll automatically.
Copying still helps a lot, even on android 4.4 (see comment 35). I will test
the ART and will let you know, maybe then we can disable copying version in 10
years, when non-art is obsolete...
Tomorrow, I'll rewrite indexers to your previous notes.
Original comment by [email protected]
on 18 Dec 2013 at 7:44
from javacv.
Well in any case, I think we should compare with the same loop in C++ instead
of calling cvSum(), just to be fair.
To be clear, I see the constructor of ByteIndexer doing something like this:
ByteIndexer(Buffer buffer, boolean cached) {
if (cached) {
this.buffer = this.cache = ByteBuffer.allocate(buffer.size());
this.pointer = new Pointer(buffer);
} else {
this.buffer = buffer;
}
}
With a couple of helper methods:
void fillCache() {
pointer.get(cache.array());
}
void flushCache() {
pointer.put(cache.array());
}
Seems like that would be the best things to do...
Original comment by [email protected]
on 18 Dec 2013 at 11:52
from javacv.
Tried to implement suggestions from comment 46, but failed due to slower
performance:
1. Using heap buffer instead of array is slower:
(1) Copying version with buffer
(2) Copying version with array
(1) (2)
sumIndexerCopy 6348 4137
addIndexerCopy 10934 7215
Tested only on HotSpot 1.6, but I have not doubt it will be slower on Android
2. Using buffer to initialize indexers is slower in case of many small
matrices. It does unnecessary conversion from pointer to buffer and back to
pointer. We can do about 1000 such conversions per ms on my 3GHz Athlon x2. On
android, it will be definitely longer.
CvMat (and probably all JavaCPP-mapped classes) return their arrays as
Pointers. I don't think the API from latest patch is "hackish". It is part of
JavaCPP and supposed to simplify iterating multidimensional arrays, and it is
created from pointer, that is commonly used by JavaCPP.
3. I renamed copyBack() to flushCache(), and added fillCache() method, moved
that to the base class and renamed the "copying" flag to "cached". Made
subclasses package-visible.
4. I added JUnit test.
Original comment by [email protected]
on 19 Dec 2013 at 8:17
Attachments:
from javacv.
Could you provide the code that gave you these results? (BTW, no need to update
all your classes until we decide on the final API...)
Running the attached TestIndexer on my gear here (java version "1.7.0_45"
OpenJDK fedora-2.4.3.0.fc19-x86_64 u45-b15) outputs this:
-249231677 171
-249231677 209
-249231677 219
-249231677 223
-249231677 233
-249231677 223
-249231677 167
-249231677 167
Doesn't look like there's a clear winner: Both are fast. What do you get?
Original comment by [email protected]
on 20 Dec 2013 at 2:38
Attachments:
from javacv.
Try to run more iterations, and add -server switch to the jvm (don't know if
openjdk supports it). The aim is to JIT to take place. I dont have my code, I
rewrote the indexer and ran test from comment 34. After seeing that it is
slower, I reverted it.
Original comment by [email protected]
on 20 Dec 2013 at 5:48
from javacv.
[deleted comment]
from javacv.
I ran your test on my machine. Here are the results:
1. HotSpot 1.7.0_17 64-bit
2. HotSpot 1.6.0_41 64-bit
3. Android 4.4 nexus 7 2012
(1) (2) (3)
Array 270 270 412
Buffer 290 240 2262
So not so big difference on HotSpot. Apparently, we are in the area of
microoptimisation, and it seems very dependent on runtime configuration. As we
saw earlier, caching indexer is only useful on Android (or maybe some less
performant VMs too), and the third result decides it (I only used 5MB array,
because of memory constraints).
So for this moment, I think, that the version from the last patch is best so
far...
Original comment by [email protected]
on 22 Dec 2013 at 10:04
from javacv.
I will not have access to my development computer until January 7th, but I will
be able to reply. I wish you Merry Christmas and a happy new year :). I'm glad
for this cooperation, we will be definitely a lot better than the current API...
Original comment by [email protected]
on 22 Dec 2013 at 10:06
from javacv.
Thanks for trying it out! Yes, I understand that you may not always have access
to your machines, that's fine...
So, it looks like Android doesn't like it that way. Let's think about this a
bit more. I'm not a big fan of overengineering... Merry Christmas!
Original comment by [email protected]
on 23 Dec 2013 at 3:02
from javacv.
I think I see the problem with Android. According to "Ben, one of the engineers
working on the JIT @ Google":
http://stackoverflow.com/questions/4912695/what-optimizations-can-i-expect-from-
dalvik-and-the-android-toolchain
Dalvik can only inline methods that do not have branches, but ByteBuffer.get()
does branch in checkIndex():
https://android.googlesource.com/platform/libcore/+/master/luni/src/main/java/ja
va/nio/ByteArrayBuffer.java
So it looks like we can't use NIO buffers at all if we're looking into having
high-ish performance on Android :(
Original comment by [email protected]
on 23 Dec 2013 at 3:19
from javacv.
Happy New Year! So, any news on your part?
I have a couple of new ideas about how the API should look like, so I'm jotting
them down here. For consistency, I think we should mimic how things are done
with Buffer and in C/C++. In the case of ByteIndexer, for example:
ByteIndexer wrap(BytePointer pointer, int ... sizes)
ByteIndexer wrap(ByteBuffer buffer, int ... sizes)
ByteIndexer wrap(byte[] array, int offset, int ... sizes)
byte get(int[] indices)
ByteIndexer put(int[] indices, byte b)
along with optimized get() and put() for 1, 2, and 3 dimensions
I don't think we'd need to provide a factory method for "steps" because that's
not how it works with multidimensional arrays in C/C++ anyway, and AFAIK not
even OpenCV ever aligns rows that badly on purpose, or does it?
What do you think?
Original comment by [email protected]
on 11 Jan 2014 at 10:48
from javacv.
Hello,
regarding the name, I don't mind, but I personally prefer "create" version,
as this is commonly used in Java for factory methods. On Wikipedia they use
"new" prefix. "Wrap" evokes the delegate pattern, which is not exactly this
case.
Secondly, OpenCV uses non-continuous arrays for submats (cvGetSubRect), so
we need the "steps" version.
I will add the array version of get/put probably tomorrow.
D�a 11.1.2014 11:48 pou��vate� <[email protected]> nap�sal:
Original comment by [email protected]
on 13 Jan 2014 at 5:54
from javacv.
Thanks, but no need to update the source right away!
The problem with "create" is that we need to put something after, while "wrap"
is already a name used by ByteBuffer, etc to do exactly the same thing (that
is, add missing functionality from the wrapped object). I've already done that
for capacity(), position(), and limit() for example and it's working just fine,
so IMO we might as well keep doing the same thing :)
AFAIK, cvGetSubRect() is never going to cut in the middle of a pixel, so even
if the start position changes, the effective "width" isn't going to change.
Unless we keep two sets of variables? Could you elaborate on how is this could
work for 4, or even 10 dimensional access?
Original comment by [email protected]
on 14 Jan 2014 at 12:48
from javacv.
Wait a minute, when creating a 24-bit per pixel RGB image with a width of 1001,
OpenCV returns an image with a "widthStep" of 3004 bytes: Ouch.
So, I guess your proposal is to base everything on steps and not on dimensional
sizes?
Original comment by [email protected]
on 17 Jan 2014 at 1:21
from javacv.
Sorry for responding late.
We need to keep the "steps" version, and we can decide on keeping the
"dimensions" one. I think the latter it is simpler to understand for javists,
who do not commonly use array padding or store multidimensional arrays in 1D
ones. But, on the other side, if someone will try to create indexer for a CvMat
without understanding it's internal storage, trying to provide
rows/cols/channels as dimensions (which he understands) instead of steps (which
he doesn't) will lead to more confusion. So I don't know whether to keep it or
remove it for the same reason of simplicity. As of the rule "less code is
better", I would remove it, it will force people to learn the necessary
internals or use the CvMat.createXxxPixelIndexer.
And all our indexers should only be created from Pointers. There are already
methods to convert bytes/pointers/arrays from/to each other. Our utilities are
supposed to be used with JavaCPP classes, which always converts pointers to
buffers and not vice versa and never uses java arrays. The other methods would
only be useful, if indexers are used as a general purpose utility for data
coming from somewhere else, which I don't expect. I think, this would be
overengineering and not really needed functionality.
Original comment by [email protected]
on 20 Jan 2014 at 6:40
from javacv.
Yeah, we'll go with steps. Although we might want to name it "stride" because
that's what they use in Java:
http://docs.oracle.com/javafx/2/api/javafx/scene/image/PixelReader.html
http://docs.oracle.com/javase/7/docs/api/java/awt/image/Raster.html
I wouldn't consider it overengineering, because we have to use a Java array in
the case of Android. And actually I think it would simplifying things. We'd
have ByteBufferIndexer that does its thing with ByteBuffer (created from a
Pointer or not), and ByteArrayIndexer, that would accept an array, or create
one from a Pointer. This way, the user wouldn't have to consider
ByteArrayIndexer as some sort of "cache" or "copy". Conceptually it would just
be doing something with an array, which it got from a Pointer maybe. And we
could have easy-to-understand ByteArrayIndexer.readPointer()/writePointer() or
something that would do exactly what one would expect! Sounds nice to me. What
about you?
Original comment by [email protected]
on 24 Jan 2014 at 8:20
from javacv.
The array thing is only to solve poor performance of direct ByteBuffer on
Android. While working with data on native heap, we should normally access it
via Buffer-s, that should be in principle the fastest. But due to poor
implementation on android, we have to hack it with an array, and not only that,
we have to get that array via Pointer. For the user, the change is only in one
boolean parameter, that could be stored in one global constant and changed
appropriately.
I still disagree with making this a general purpose utility for accessing data
of this structure, for few reasons:
- CV applications are usually performance constrained. For this, usually the
most direct method is the fastest. In our case, even using indexer is bit
slower than directly using buffers.
- It is trivial to write utility such as this one, if one understands the data
structure.
- No one will search for it here, JavaCPP or JavaCV is not a general purpose
utility package. If such package was already present in OpenCV, then this does
not hold.
- It's purpose is mainly to provide convenience method for accessing elements
of CvMat, as a faster replacement of now deprecated methods of CvMat.
Similarly, I don't agree with "stride". We are not translating OpenCV to Java
terminology. OpenCV uses "step", so should we. For this reason, I would even
rename the "createBytePixelIndexer" to "createByteDataIndexer" in the CvMat
class, because it is internally named "data", though in most cases is's
"pixels". We should stick to simple mapping.
Maybe we could add the Indexer factory to another JavaCpp-ported libraries,
which have similar structure of data. If it is not consistently named "steps"
here, then I will agree with using "stride" within java.
But you have the last word.
Original comment by [email protected]
on 27 Jan 2014 at 6:28
from javacv.
Let's take a step back here. My intention with whatever goes in JavaCPP is to
have it available to more than just OpenCV. Of course it's "trivial", but it
happens that many people like to use MATLAB because it offers that kind of
"trivial" functionality. Even Scala provides a feature for that:
http://www.tutorialspoint.com/scala/scala_arrays.htm
If someone were to add that to some other package, that would be nice.
Unfortunately, it's not happening, so I thought I might as well offer it as
part of JavaCPP, instead of only JavaCV, that's all. And if one were to happen
to use some other library in Java that uses arrays (because that's the whole
point of using Java, we're not limited to OpenCV), then that user could use
those facilities to exchange data more easily between whatever other library
one wishes to use with OpenCV, such as JavaFX, for example.
BTW, OpenCV *does* already offer that kind of "trivial" functionality, but JNI
cannot yet inline native functions, so if we want performance, we need to rely
on direct NIO buffers, or on arrays in the case of Android.
There must be some better name for "stride", "step", or "pitch"... Wikipedia
doesn't offer much insight I find:
http://en.wikipedia.org/wiki/Stride_of_an_array
Original comment by [email protected]
on 27 Jan 2014 at 8:34
from javacv.
What do you think of this API?:
package com.googlecode.javacpp.indexers;
import java.nio.ByteBuffer;
import com.googlecode.javacpp.BytePointer;
/**
* Convenient way to access elements of an signed 8-bit multidimensional array.
*
* <p>Indexerss are available for all java primitive types.
*
* @author Viliam Durina
*/
public abstract class ByteIndexer {
protected final int[] strides;
public ByteIndexer(int[] strides) {
this.strides = strides;
}
/** Get element [i, j] of multidimensional array */
public abstract byte get(int i, int j);
/** Get element [i, j, k] of multidimensional array */
public abstract byte get(int i, int j, int k);
/** Get element at specified coordinates of multidimensional array */
public abstract byte get(int ... coords);
/** Set element [i, j] of multidimensional array */
public abstract void put(byte v, int i, int j);
/** Set element [i, j, k] of multidimensional array */
public abstract void put(byte v, int i, int j, int k);
/** Set element at specified coordinates of multidimensional array */
public abstract void put(byte v, int ... coords);
/** Copies modification back to the native memory after processing is done. No-op, if not cached. */
public abstract void flushCache();
/** Discards modifications to the cached copy and populates cache anew. No need to call this after creation. No-op, if not cached */
public abstract void fillCache();
/**
* Factory method.
*
* @param pointer The native pointer. It should have the {@link BytePointer#capacity(int)} set prior to the call.
*
* @param strides <code>strides[0]</code> - byte position of the [1, 0, 0]th element,
* <code>strides[1]</code> - byte position of the [0, 1, 0]th elemnts etc.
* If the array is contiguous, see {@link #calcContiguousStrides(int[])}
* See <a href='http://en.wikipedia.org/wiki/Stride_of_an_array'>Wikipedia/Stride of an array</a>.
*
* @param cached Whether to copy the buffer (this gives faster access on Android devices, tested up to android 4.4,
* but do your own test)
*
* @return Initialized instance
*/
public static ByteIndexer wrap(BytePointer pointer, int[] strides, boolean cached) {
if (cached)
return new CachingByteIndexer(pointer, strides);
else
return new DirectByteIndexer(pointer.asBuffer(), strides);
}
/**
* @see #wrapNonUnit(ByteBuffer, int[], boolean)
*/
public static ByteIndexer wrap(ByteBuffer buffer, int[] strides, boolean cached) {
if (cached)
return new CachingByteIndexer(new BytePointer(buffer), strides);
else
return new DirectByteIndexer(buffer, strides);
}
/**
* @see #wrapNonUnit(ByteBuffer, int[], boolean)
*/
public static ByteIndexer wrap(byte[] data, int[] strides, boolean cached) {
if (cached)
throw new UnsupportedOperationException("We don't support caching of java arrays");
else
return new CachingByteIndexer(data, strides);
}
/**
* Calculates stride values for supplied dimensions for contiguous array.
*
* See <a href='http://en.wikipedia.org/wiki/Stride_of_an_array'>Wikipedia/Stride of an array</a>.
*/
public static int[] calcContiguousStrides(int[] dims) {
int strides[] = new int [dims.length];
strides[dims.length -1] = 1;
for (int i=dims.length - 2; i>=0; i--)
strides[i] = dims[i] * strides[i+1];
return strides;
}
}
It has "wrap" method for strides. I removed the createForDims factory method,
and replaced it with calcContiguousStrides, which calculates strides from
dimensions. I also gave the link to the wikipedia page, which I find useful for
those, who are not familiar with "strides".
I don't like the wrap(byte[]) method. I throw UnsupportedOperationException
there, if caching is requested. For caching to work (if we want it), I would
need to crate a third ByteIndexer subclass that will clone the array and be
able to copy back. Or complicate the code of CachingByteIndexer to accept array
which it will clone. Do we really want that?
Original comment by [email protected]
on 28 Jan 2014 at 7:01
from javacv.
Well, that still looks like a poor hack... As I explained a few times already,
I'd like to put something a bit more generally useful in JavaCPP. You said that
"even using indexer is bit slower than directly using buffers", but this isn't
even true. With a properly optimizing compiler, everything can gets inlined and
moved around appropriately. Here's proof that OpenJDK 7 does that already. This
is the output of the attached "TestIndexer2" on my machine:
indexer: 3151 ms
array: 3152 ms
This shows that it doesn't need to be slower than directly using arrays or
buffers.
Now, about the API, I guess we should try to make it similar to C# since they
already have something like that in there:
http://msdn.microsoft.com/en-us/library/system.array.createinstance
So, naming the factory method "create" might not be such a bad idea, but then
again in our case we don't allocate the data, so "wrap" might still be a
better... Also, we note that the user provides the dimensions of array, not the
strides. The runtime should by default allocate whatever is most efficient for
the machine, but we must also provide an additional optional "stride"
parameter, just like it's possible with OpenCV. Consequently, the factory
methods might look something like
ByteIndexer.wrap(byte[] array, int ... sizes)
ByteIndexer.wrap(ByteBuffer buffer, int ... sizes)
ByteIndexer.wrap(byte[] array, int[] sizes, int[] strides)
ByteIndexer.wrap(ByteBuffer buffer, int[] sizes int[] strides)
And the accessors:
byte get(int ... indices)
ByteIndexer put(int i, byte b)
ByteIndexer put(int i, int j, byte b)
ByteIndexer put(int i, int j, int k, byte b)
ByteIndexer put(int[] indices, byte b)
With parameters ordered to be consistent with both arrays and NIO buffers.
Along with a couple of utility methods like read(Pointer)/write(Pointer) I
guess.
That's starting to sound like something I want to use. Do you feel differently?
Do you still feel like this would not be generally useful functionality?
Original comment by [email protected]
on 3 Feb 2014 at 1:06
from javacv.
Original comment by [email protected]
on 3 Feb 2014 at 1:08
Attachments:
from javacv.
First to the benchmark: I was bit surprised with your results, but now it seems
logical to me. I even tried to rewrite the array version to something I thought
should be faster, but actually, it was a bit slower than before:
static long testArray(byte[] array) {
long sum = 0;
for (int i = 0, row = 0; i < SIZE; i++, row += SIZE) {
for (int j = 0, idx = row; j < SIZE; j++, idx++) {
sum += array[idx];
}
}
return sum;
}
I'm rather confused of this micro-optimisation. In my previous tests, the
indexer version was really slower, look at comment 34, test code is there. But
not always :(. Maybe it was because of base class and subclasses and virtual
methods. Definitely, direct access is simpler for compiler to optimize, even
though it might be able to optimize the indexer version to the same performance.
But leave that for now, we will include indexer in any case. Let's propose the
API. I don't understand the method
ByteIndexer.wrap(byte[] array, int[] sizes, int[] strides)
Why do we need both sizes and strides? Internally, we will only use strides. If
we did range checking, it would only make things slower.
And should we not support automatic copying and leave that to the user? That
would remove the need to have base class and subclasses, but the code would be
more complicated, and does it have performance benefit on android to omit it?
So please finalize the API, and I will implement it. I'd wish to finally finish
this neverending discussion :)
Original comment by [email protected]
on 4 Feb 2014 at 5:38
from javacv.
Sorry for the delay. Been busy with other stuff, check it out BTW:
http://code.google.com/p/javacpp/wiki/Presets
And forgot to reply.
Regarding your comment #34, "sumBuffer" and "sumIndexer" are pretty much the
equal, and the same for "addBuffer" and "addIndexer" so I'm not sure what you
are surprised about. Of course it's a bit more difficult for a compiler to
optimize virtual calls, but Hot Spot does it just fine.
Some code that uses an `Indexer` might need to know the sizes to perform some
operation. So we'd need to provide some `int[] sizes()` method, and also a
`int[] strides()` one too I guess.
What do you mean by "automatic copying"? When would the data get copied? AFAIU,
this requires manual intervention, however we look at it. Since we have to do
it manually anyway, we might as well make the interface more general, that's
all. If we can do it automatically somehow, then that's a different matter, so
please do explain.
Yeah, I agree, it would be nice to be able do this work in person :)
Original comment by [email protected]
on 2 Mar 2014 at 5:05
from javacv.
Unless you have some news concerning the support for the "automatic copying"
you were mentioning, I think I've finally come up with a design that fully
satisfies me! Basically, I thought of having an interface Indexable similar to
Iterable, where the Indexable/Indexer pair has similar interactions to the
Iterable/Iterator one. This way, the Indexer code needs to know nothing about
Pointer, and we can copy the data around in a much more natural way.
public abstract class Indexer {
protected int[] sizes;
protected int[] strides;
public int[] sizes() { return sizes; }
public int[] strides() { return strides; }
public abstract Object array();
public abstract Buffer buffer();
}
public abstract class ByteIndexer extends Indexer {
public ByteIndexer wrap(ByteBuffer buffer, int ... sizes) { ... }
public ByteIndexer wrap(byte[] array, int offset, int ... sizes) { ... }
public ByteIndexer wrap(ByteBuffer buffer, int[] sizes, int[] strides) { ... }
public ByteIndexer wrap(byte[] array, int offset, int[] sizes, int[] strides) { ... }
public abstract byte get(int i);
public abstract byte get(int i, int j);
public abstract byte get(int i, int j, int k);
public abstract byte get(int ... indices);
public abstract ByteIndexer put(int i, byte b);
public abstract ByteIndexer put(int i, int j, byte b);
public abstract ByteIndexer put(int i, int j, int k, byte b);
public abstract ByteIndexer put(int[] indices, byte b);
}
class ByteBufferIndexer extends ByteIndexer { ... }
class ByteArrayIndexer extends ByteIndexer { ... }
public interface Indexable {
<I extends Indexer> I createIndexer(boolean direct);
void readIndexer(Indexer i);
void writeIndexer(Indexer i);
}
With that, the class the implements the interface does what it needs to do to
return the appropriate Indexer. readIndexer()/writeIndexer() would do the
copying around, but could be implemented as a noop when it happens that the
Indexable and the Indexer both point to the same area of memory.
Of course we're going to need to do some benchmarking to make sure we don't
sacrifice any performance with this class hierarchy...
Concretely, I'm thinking of having the AbstractMat here implement this:
http://code.google.com/p/javacpp/source/browse/opencv/src/main/java/com/googleco
de/javacpp/helper/opencv_core.java?repo=presets#2188
And maybe eventually apply that to AbstractCvMat as well, but we could simply
leave the old interface as deprecated since OpenCV itself has deprecated CvMat.
In any case, please let me know what you think! If it's something you'd like to
work on, please let me know. (If not, I will be implementing something like
that myself eventually.) Thanks!
Original comment by [email protected]
on 16 Mar 2014 at 6:36
from javacv.
That said, since the get()/put() methods of an Indexer doesn't change its
state, unlike the next()/remove() methods of an Iterator, we might want to have
our Indexable reuse the Indexer object. So, something like this may make more
sense:
public interface Indexable {
<I extends Indexer> I acquireIndexer(boolean direct);
void releaseIndexer(Indexer i);
}
Where acquireIndexer() would read out data, and releaseIndexer() would write it
back out (both being noops for direct indexers). I feel that this is about as
"automatic" as it could get. What do you think? Do you have a better idea?
Original comment by [email protected]
on 16 Mar 2014 at 2:05
from javacv.
Hello Sam,
actually I'm confused of this micro-optimisation. I don't see this API that
much important as you do. When I started with OpenCV/JavaCV, I did not know, I
used CvMat.get(i,j) method, which is veery slow. Any of methods suggested here
is good for me. For performance sensitive stuff I would go with C anyway. We
still did not rewrite our alg. to C, but I expect double performance and stable
speed from the first run (currently, our algorithm takes 3-4 seconds to
complete, but the first run takes 10-12 sec. due to JIT taking place).
Regarding acquireIndexer/releaseIndexer, in case of copying indexer we would
double the memory usage, even after the work is done.
And how would the Indexable interface know, which of byte/short/... Indexer to
return? And the readIndexer should populate any type of indexer?
I will write any version you say is good for you :)
Viliam
Original comment by [email protected]
on 17 Mar 2014 at 6:08
from javacv.
Maybe we could simplify the interface even more:
public interface Indexable {
<I extends Indexer> I indexer(boolean direct);
}
And have an abstract release() method in Indexer. However, that way, the
Indexable must be able to subclass the Indexer, and call the constructors for
all the concrete classes, so having wrap() factory methods becomes moot... What
do you think?
The Indexable would return the correct subtype of Indexer based on its own
internal information. It doesn't occur to me that a user might want to access
floats as bytes or something. Do you see it differently?
Or course a non-direct indexer would double memory consumption, but that's just
a workaround for crappy JIT compilers.
It's not about micro-optimization, it's about usability. With this kind of
interface, we could easily, for example, scale all the elements of the first
column this way:
Mat Amat = ...
FloatIndexer A = Amat.indexer();
for (int i = 0; i < A.size[0]; i++) {
A.put(i, 0, A.get(i, 0) * scale);
}
A.release();
We don't need to know anything about the number of columns, or the step/stride,
and we don't need to use some awkward "getColumn()" utility method of some
sort: We use nothing but simple Java arithmetic. People like it when they can
do things easily. That's why a lot of people use MATLAB, for example. But
MATLAB is slow when it comes to loops and stuff, so it's hard to generalize.
Java, on the other hand, with a proper compiler, can be as fast as C++, but
unlike C++, we can make it crash proof. I'm still confused myself as to why no
one "gets it", but I guess I'll just have to keep working on it until these
obvious (to me at least) concepts become more widely accepted...
Original comment by [email protected]
on 17 Mar 2014 at 12:27
from javacv.
I think this interface looks good too. Is this a version I should implement?
What I meant about the double memory, that if the Indexable reused the Indexer
instance (comment 69), it will hold double memory even after the processing is
done.
Original comment by [email protected]
on 18 Mar 2014 at 5:01
from javacv.
Sounds good! I'll be refactoring that helper file a bit with an AbstractArray
under AbstractMat and CvArr, but you can start implementing whenever you feel
like it, thanks!
Releasing the memory would hurt performance, but a user that doesn't care about
performance on a poor compiler should be using the direct one anyway, so I
think it's better not to release it. What do you think?
Original comment by [email protected]
on 19 Mar 2014 at 1:47
from javacv.
Ok, I've added such an AbstractArray class at the top of
helper/opencv_core.java in this revision:
https://code.google.com/p/javacpp/source/detail?r=d5bd9f3afe353618a30f0fa55a38d3
e10ce5643d&repo=presets
The plan would to make that class implement Indexable. Comments?
Original comment by [email protected]
on 24 Mar 2014 at 1:48
from javacv.
BTW, I've moved the project site to GitHub:
https://github.com/bytedeco
So let's work over there on this from now on, thanks!
Original comment by [email protected]
on 29 Apr 2014 at 12:51
from javacv.
I started to work on a project where I needed this functionality, so I've
implemented it here:
https://github.com/bytedeco/javacpp/commit/4d8dc9d8b244b79ca3c9d9355e476f028f3f36f9
https://github.com/bytedeco/javacpp-presets/commit/ab8bde373733190fe204a3bb8d1837fbb7c62b4f
Thanks for all the discussion and help with this! It was really appreciated :)
Original comment by [email protected]
on 19 Oct 2014 at 12:29
- Added labels: Type-Enhancement
- Removed labels: Type-Defect
Great!
Sorry for committing myself and then not doing what I promised, but I really
lost track of which version was best and did not have time. Your version looks
good; maybe I'll review it more thoroughly.
But you forgot to fix the original problem of this bug. See comments up to #2.
Maybe we can keep returning the cached xxxBuffer instance (as of comment #3),
but from a deprecated method, and add a createXxxBuffer() method that returns
a fresh instance.
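The cached-value problem and the proposed createXxxBuffer() fix can be sketched in plain Java. All class and method names below are illustrative stand-ins, not the actual JavaCPP API: a cached view goes stale when the underlying allocation changes (e.g. when a native function reallocates the mat's data), while a freshly created view always reflects the current state.

```java
import java.nio.ByteBuffer;

// Illustrative sketch, not the actual JavaCPP API.
class MatLike {
    private ByteBuffer data = ByteBuffer.allocateDirect(16);
    private ByteBuffer cached; // the problematic cached instance

    @Deprecated
    ByteBuffer getByteBuffer() { // old, cached accessor
        if (cached == null) cached = data.duplicate();
        return cached; // may still point at the old allocation
    }

    ByteBuffer createByteBuffer() { // proposed fresh accessor
        return data.duplicate(); // always built from the current data
    }

    void reallocate(int capacity) { // stands in for a native-side resize
        data = ByteBuffer.allocateDirect(capacity);
    }
}
```

After a reallocation, the deprecated accessor still reports the old capacity, while the create-style accessor sees the new one.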
Original comment by [email protected]
on 20 Oct 2014 at 10:01
Thanks for your review!
They've been marked as deprecated in the parent `AbstractMat` class, at least
for the benefit of the C++ `Mat` class, and a new `createBuffer()` method has
been added. But I see that the annotation doesn't get inherited by `CvMat`.
It's not a problem anymore though, because all the C++ functions that used to
modify `CvMat` are now using `Mat` instead, so the original issue with `CvMat`
doesn't happen anymore. Does that look OK to you?
Original comment by [email protected]
on 25 Oct 2014 at 1:21
Hello,
I reviewed the changes; as far as I can tell, the Indexer implementation looks
correct.
I suggest deprecating all the get() and put() methods and the reset() method
in AbstractCvMat, as they are very slow, in favor of the Indexer API.
Secondly, I would always add comments to deprecated methods. If I have a
codebase, upgrade some library, and it deprecates something, I'd like to know
why and what I should replace it with.
I would mark all overridden getXxxBuffer methods as deprecated (with
comments), just for clarity.
And I would add typed createXxxBuffer methods, as the createBuffer method
returns Buffer, which is unusable without casting. It is a question whether it
should fail when changing the type here, as it is rarely useful to see, say, a
mat with CV_16S as ints, I think...
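The casting inconvenience being described can be sketched in plain Java; the class and method names here are illustrative, not the actual JavaCPP API. A generic factory forces the caller to supply the type, while a typed factory does not:

```java
import java.nio.Buffer;
import java.nio.ByteBuffer;
import java.nio.IntBuffer;

// Illustrative sketch, not the actual JavaCPP API.
class BufferSource {
    private final ByteBuffer data = ByteBuffer.allocateDirect(64);

    @SuppressWarnings("unchecked")
    <B extends Buffer> B createBuffer() {
        return (B) data.duplicate(); // caller must know the actual type
    }

    IntBuffer createIntBuffer() {
        return data.duplicate().asIntBuffer(); // no cast needed at call sites
    }
}
```

With the generic version, `Buffer b = source.createBuffer();` compiles but gives the caller nothing to index with; the typed version returns a directly usable view.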
Original comment by [email protected]
on 3 Nov 2014 at 7:05
And I would add a comment to the createIndexer method, at least to its
"direct" parameter. Something like:
@param direct If false, the native buffer is first copied to a Java array and
copied back during the call to #release(). This is an optimization for
platforms where direct Buffer access is slow (specifically Android, but do
your own tests). It changes the behavior of the indexer so that, until the
release() call, changes are not visible to the underlying object. It also
doubles memory usage.
Original comment by [email protected]
on 3 Nov 2014 at 7:48
I found another issue. Indexer has the release() method. The name implies that
the associated resources are released upon this call. It only has associated
resources in the case of a non-direct indexer, but it does not release the
copied array.
I can think of a few solutions:
1. Rename release() to finish() or commit() or copyBack(). The createIndexer
method documentation should say that you should call that method after the
work is done in the case of a non-direct indexer.
2. release() should release the internal reference to the array (or buffer),
and further use of the indexer will fail with a NullPointerException (or,
better, an IllegalStateException).
3. The copy/copyBack logic could be encapsulated in the XxxArrayIndexer: it
could have a Pointer constructor.
I vote for 2: that is the easiest to document and use. And if the user always
calls release() (whether or not it is needed), they can safely change from
direct to non-direct.
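The non-direct copy/copy-back semantics, combined with option 2 above, can be sketched in plain Java. The class and method names are illustrative stand-ins, not the actual JavaCPP API: writes go to a Java-side array and reach the backing buffer only on release(), which also drops the internal reference so further use fails.

```java
import java.nio.ByteBuffer;

// Illustrative sketch, not the actual JavaCPP API.
class ArrayBackedIndexer {
    private final ByteBuffer backing; // stands in for the native buffer
    private byte[] array;             // the Java-side copy

    ArrayBackedIndexer(ByteBuffer backing) {
        this.backing = backing;
        this.array = new byte[backing.capacity()];
        backing.duplicate().get(array); // copy native -> array
    }

    byte get(int i) { return array[i]; }
    void put(int i, byte b) { array[i] = b; } // not yet visible to backing

    void release() {
        backing.duplicate().put(array); // copy array -> native
        array = null; // option 2: further get()/put() throw NullPointerException
    }
}
```

Until release() is called, the backing buffer is unchanged; after release(), the changes are copied back and the indexer refuses further use.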
Another point:
The createIndexer method could be a member of the Pointer class (or its
subclasses). That way javacv needs no changes at all. And it would better
reflect the OpenCV data structures, with the user knowing that Indexer is a
way to access ND arrays referenced by pointers, and not a special feature of
Mat. They could write:
mat.data_i().position(0).limit(mat.size()).createIndexer();
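The fluent style being proposed can be sketched in plain Java; the class name and the use of IntUnaryOperator as a stand-in indexer are illustrative, not the actual JavaCPP API. A pointer-like object carries position and limit, and the indexer it creates only sees that window of the data:

```java
// Illustrative sketch, not the actual JavaCPP API.
class IntPointerLike {
    private final int[] data;
    private int position, limit;

    IntPointerLike(int[] data) { this.data = data; this.limit = data.length; }

    IntPointerLike position(int p) { this.position = p; return this; }
    IntPointerLike limit(int l)    { this.limit = l;    return this; }

    // The returned indexer only sees the [position, limit) window.
    java.util.function.IntUnaryOperator createIndexer() {
        final int base = position, length = limit - position;
        return i -> {
            if (i < 0 || i >= length) throw new IndexOutOfBoundsException();
            return data[base + i];
        };
    }
}
```

The call-site chain then mirrors the `mat.data_i().position(0).limit(mat.size()).createIndexer()` suggestion.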
Original comment by [email protected]
on 3 Nov 2014 at 8:42
Thanks for the review! I'll add a couple of `@see createBuffer()` tags for the
deprecated methods. `createBuffer()` doesn't return `Buffer`, it returns `B`,
a subclass of `Buffer`, so I think that's OK?
As for CvMat, there are some methods that don't map to anything right now,
because I'm not sure how those should be done, so deprecating either all of
them or just some of them doesn't make sense to me. Does it make sense to you
in some way to deprecate functionality that has no replacement? Anyway, the
whole class is deprecated by OpenCV, so new code should not be using it. :)
And the problem with the `direct` parameter and the `release()` method is that
they depend on the implementation. It's up to the implementing class to do
whatever works with its underlying data. After all, if the data is available
originally in an array, `direct` could mean a few different things, and
`release()` might not be implemented, so all these implementation details
can't be part of the interface...
But, yes, I think it would make sense to make `Pointer` and its subclasses
implement all that, and then it would make sense to specify the implementation
details there. (Although they should be clear just by looking at the code :)
Let me think about that.
Original comment by [email protected]
on 3 Nov 2014 at 9:00
`Pointer` doesn't contain any information about the dimensions. Where would
that come from?
Original comment by [email protected]
on 8 Nov 2014 at 4:29
In the end I've added `Indexer.create(Pointer ...)` factory methods:
https://github.com/bytedeco/javacpp/commit/cd632cfb0421639355a9622ce1285469dd5ffadd
It looks good like that, I think. Thanks for pointing this out! This way we
have a consistent hierarchical-like relationship between Pointer < Indexer <
Mat.
Original comment by [email protected]
on 9 Nov 2014 at 1:05
This functionality has finally been included in JavaCPP 0.10!
http://bytedeco.org/news/2014/12/23/third-release/
Thanks for all the help with this, and if you have any improvements, send a
pull request!
Original comment by [email protected]
on 25 Dec 2014 at 11:34
- Changed state: Fixed