Comments (85)
Thanks for reporting this issue! I hadn't really noticed that problem before,
but yes, you are right, this is a problem with C++ functions.
Actually, the same thing can happen with the *Buffer fields. I'm not sure how
to handle this efficiently. Any bright ideas?
Original comment by [email protected]
on 26 May 2013 at 11:34
- Changed title: Cached value of CvMat.fullSize and *Buffer fields might get incorrect
- Changed state: Accepted
from javacv.
I would remove all the fields, because they can't be updated reliably. The
getXxxBuffer() methods should always return a fresh Buffer instance, with a
freshly calculated capacity. We could encourage users to reuse the *Buffer
instance. I would also deprecate the fullSize() and reset() methods.
By the way, the CvMat.size() calculation is also incorrect in the case of a
sub-matrix. It should be:
public int size() {
    int rows = rows();
    return cols()*elemSize()*channels() + (rows > 1 ? step()*(rows-1) : 0);
}
For example consider:
CvMat mat = CvMat.create(10, 10, CV_8U);
CvMat matSub = CvMat.createHeader(5, 5, CV_8U);
cvGetSubRect(mat, matSub, cvRect(5, 5, 5, 5));
The size of mat's data is 100 bytes and its step is 10 (let's ignore
alignment). matSub's data points to mat.data + 55, that is, the same data
offset by the upper-left corner of the rectangle. The current version of the
size() method will return 50 for matSub, which goes beyond the mat.data
length, because 50 + 55 > 100. This problem is minor, as it does not affect
correctly implemented programs; it only allows buffer overflow errors that
are otherwise prevented.
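The arithmetic of the corrected formula can be checked against this example in plain Java (the variable values below are taken from the scenario above):

```java
// Check of the corrected size() arithmetic, using the values from the example
// above: a 10x10 CV_8U parent matrix (step = 10, elemSize = 1, channels = 1)
// and a 5x5 sub-rect at (5, 5).
public class SubMatSizeCheck {
    public static void main(String[] args) {
        int rows = 5, cols = 5, step = 10, elemSize = 1, channels = 1;
        // corrected formula: rows are step bytes apart, but the last row
        // only needs cols*elemSize*channels bytes
        int size = cols * elemSize * channels + (rows > 1 ? step * (rows - 1) : 0);
        int offset = 5 * step + 5; // matSub.data starts at mat.data + 55
        System.out.println(size);          // 45
        System.out.println(offset + size); // 100, exactly the parent's data length
    }
}
```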
Original comment by [email protected]
on 28 May 2013 at 4:52
Creating a Buffer is pretty expensive, and isn't that great for the garbage
collector either, so creating one for each call to CvMat.put() isn't an option.
That's why I'm doing this caching in the first place. It wasn't a problem with
the simpler (IMHO better) C API... Any thoughts?
Thanks for the size() fix! I'll put that in :)
Original comment by [email protected]
on 2 Jun 2013 at 1:04
I would deprecate all get() and put() methods, as they are slow and might
return incorrect results without reset().
When I started to use JavaCV, I began with the CvMat.get() and .put() methods,
but soon stopped, as they are much slower compared to using the Buffer
directly. JNI calls are expensive too: put(int i, double v) makes one call,
and put(int i, int j, double v) makes three more. Next, there is an
unnecessary conversion from double and back, which is not free (maybe it is
optimized out by the JIT; I did not test).
Compare these snippets. They each do one get, one multiply, and one put per
pixel. The first uses the (i, j) version, the second uses the (i) version,
the third uses the Buffer directly, and the fourth copies the buffer to a
Java array and back:
public static void testIJ() {
    long start = System.nanoTime();
    CvMat mat = CvMat.create(1000, 1000, CV_8U);
    for (int i = 0; i < 1000; i++)
        for (int j = 0; j < 1000; j++) {
            byte val = (byte) mat.get(i, j);
            val = (byte) (2*val);
            mat.put(i, j, val);
        }
    System.out.println((System.nanoTime() - start) / 1000000 + " ms");
}
public static void testI() {
    long start = System.nanoTime();
    CvMat mat = CvMat.create(1000, 1000, CV_8U);
    int step = mat.step() / mat.elemSize();
    int channels = mat.nChannels();
    for (int i = 0; i < 1000; i++)
        for (int j = 0; j < 1000; j++) {
            byte val = (byte) mat.get(i*step + j*channels);
            val = (byte) (2*val);
            mat.put(i*step + j*channels, val);
        }
    System.out.println((System.nanoTime() - start) / 1000000 + " ms");
}
public static void testDirect() {
    long start = System.nanoTime();
    CvMat mat = CvMat.create(1000, 1000, CV_8U);
    int step = mat.step() / mat.elemSize();
    int channels = mat.nChannels();
    ByteBuffer buffer = mat.getByteBuffer();
    for (int i = 0; i < 1000; i++)
        for (int j = 0; j < 1000; j++) {
            byte val = buffer.get(i*step + j*channels);
            val = (byte) (2*val);
            buffer.put(i*step + j*channels, val);
        }
    System.out.println((System.nanoTime() - start) / 1000 + " us");
}
public static void testJavaArray() {
    long start = System.nanoTime();
    CvMat mat = CvMat.create(1000, 1000, CV_8U);
    int step = mat.step() / mat.elemSize();
    int channels = mat.nChannels();
    ByteBuffer buffer = mat.getByteBuffer();
    byte[] jbuffer = new byte[mat.size()];
    // I assume the buffer is continuous, as mat.isContinuous() always
    // returns false
    buffer.get(jbuffer);
    for (int i = 0; i < 1000; i++)
        for (int j = 0; j < 1000; j++) {
            byte val = jbuffer[i*step + j*channels];
            val = (byte) (2*val);
            jbuffer[i*step + j*channels] = val;
        }
    buffer.position(0);
    buffer.put(jbuffer);
    System.out.println((System.nanoTime() - start) / 1000 + " us");
}
Resulting times (after 10 iterations):
TEST             HOTSPOT VM 1.6, 3 GHz ATHLON X2    ANDROID 2.3.5, 1 GHz 1-CORE (no JIT)
testIJ           280 ms                             9800 ms
testI            65 ms                              4500 ms
testDirect       3.5 ms                             1450 ms
testJavaArray    5.1 ms                             60 ms
The Android result, which I added for information, is surprising. Probably
due to the lack of a JIT, even Buffer access is very slow. On the desktop VM,
the JavaArray version is a bit slower than going through the Buffer, due to
the unnecessary copying of the buffer data.
If you want to simplify pixel access for users, you could add an interface,
say "PixelAccessor", which would do the expensive operations of creating the
Buffer and calculating the size upon initialization; users would be asked to
reuse the instance. I attach BytePixelAccessor as an example; a similar class
should be created for the other types. You could add a
createBytePixelAccessor() method to CvMat and IplImage (not
getBytePixelAccessor(), so that the name implies something is created upon
every invocation).
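The attached BytePixelAccessor isn't reproduced in this export, but a minimal sketch of the idea (class and method names assumed, using only java.nio) could look like:

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: the expensive steps (obtaining the Buffer, reading
// step and channels over JNI) happen once in the constructor; get/put then
// only compute plain offsets, with no further JNI calls.
public class BytePixelAccessor {
    private final ByteBuffer buffer;
    private final int step;     // row stride in bytes
    private final int channels;

    public BytePixelAccessor(ByteBuffer buffer, int step, int channels) {
        this.buffer = buffer;
        this.step = step;
        this.channels = channels;
    }

    public byte get(int i, int j) {
        return buffer.get(i * step + j * channels);
    }

    public void put(int i, int j, byte value) {
        buffer.put(i * step + j * channels, value);
    }
}
```

In JavaCV itself the constructor arguments would presumably come from mat.getByteBuffer(), mat.step(), and mat.nChannels(), so the instance can be reused across many pixel accesses.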
Viliam
Original comment by [email protected]
on 3 Jun 2013 at 5:30
Attachments:
They are not slow with floating-point matrices (a native function call isn't
slow, BTW), but I agree they are not ideal and shouldn't be used for integer
matrices...
Anyway, thanks for all the recommendations! I'm not sure how this is going to
turn out. The thing is, OpenCV itself switched to "cv::Mat", and I'm currently
working on something to help me wrap this kind of thing more easily (not just
for OpenCV), without having to create all the .java files by hand... When and
if I succeed, I will start looking into better ways of accessing these
structures in Java, i.e. your accessor class.
But it might take a long time before I come up with anything :(
What do you think about all this?
What do you think about all this?
Original comment by [email protected]
on 8 Jun 2013 at 1:15
I don't understand your statement that floating-point matrices aren't slow. I
did a test with 32F matrices, and here are the results:
TEST                 HOTSPOT VM 1.6, 3 GHz ATHLON X2    ANDROID 2.3.5, 1 GHz 1-CORE (no JIT)
testIJ_8u            231 ms                             9715 ms
testI_8u             59 ms                              4467 ms
testDirect_8u        3.4 ms                             1460 ms
testJavaArray_8u     13.0 ms                            120 ms
testIJ_32f           233 ms                             10267 ms
testI_32f            65 ms                              5050 ms
testDirect_32f       2.8 ms                             2199 ms
testJavaArray_32f    13.0 ms                            1143 ms
I also tested 64F matrices, and the times were roughly the same as for 32F.
testJavaArray_32f is much slower than testJavaArray_8u on Android, probably
because some memory limit was approached (I saw a "grow heap" operation after
every test run with 32F, but not with 8U). Each test was run 10 times; the
result was taken by averaging the last five runs.
There *is* JNI call overhead. If you call, say, five operations on a Mat, the
overhead is negligible, but if you call CvMat.get(i, j) 5 million times, each
of which makes 5 JNI calls, it is not. I first noticed this issue when
processing the HoughLines result. I iterated the returned collection of lines
many times, and just by copying the lines to a Java array instead of calling
CvPoint.x() and .y(), it ran noticeably faster.
I can contribute the PixelAccessor enhancement as a patch, if you wish.
Original comment by [email protected]
on 10 Jun 2013 at 5:03
FYI, here is the benchmark: http://pastebin.com/DtGBkzyk
Original comment by [email protected]
on 10 Jun 2013 at 5:06
Well, "slow" is relative; not as slow as creating a Buffer object :) But of
course it's still not optimal. And yes, Android/Dalvik does suck however one
looks at it.
BTW, your "accessor" solution reminds me of this article:
Java Programming for High-Performance Numerical Computing
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.13.1554
Actually, I feel this is something that should be included in JavaCPP. The way
I see it, we'd have a collection of classes, e.g. ByteAccessor, ShortAccessor,
etc., with constructors that accept a Buffer and the dimensions, and a
collection of appropriate get() and put() methods for 2D and 3D access.
Then we could include factory methods in the corresponding Pointer classes, as
well as in others like CvMat, CvPoint, etc.
What do you think? Is this something you'd like to contribute? It sounds
really useful in any case :)
And to cope with Android, instead of a Buffer, the accessors could also accept
a Pointer, slurping and dumping native memory on user request only.
Original comment by [email protected]
on 13 Jun 2013 at 7:24
I'd like to contribute. Let's summarize what has to be done:
- deprecate getXxxBuffer and fullSize methods on CvMat
- add XxxPixelAccessor classes
- add factory methods for PixelAccessors to CvMat and IplImage
- add CopyingPixelAccessor for Android (and others without JIT)
We should consider the PixelAccessor classes convenience classes for
developers. Direct Buffer access is still the fastest, if only slightly. This
is about hiding complexity from the developer and avoiding possible errors
when calculating indexes. There is no point in adding it to the Pointer
class, as it has no internal structure.
Regarding the CvPoint class, there are bulk put and get methods to and from
Java arrays (get and put) which could be rewritten using buffers too:
public final CvPoint put(int[] pts, int offset, int length) {
    asByteBuffer().asIntBuffer().put(pts, offset, length);
    return this;
}
public CvPoint get(int[] pts, int offset, int length) {
    asByteBuffer().asIntBuffer().get(pts, offset, length);
    return this;
}
These are approximately 10 times faster than the current versions (tested on
an array of 10,000 points on desktop Java). Also, a PointAccessor could be
written that uses Buffer access; it is about 30 times faster than using the
x() and y() methods. See the benchmark: http://pastebin.com/bUC6u5R7
Actually, such a class could be written for any C++ class, and, as you see,
it strongly depends on the internal structure of the class. These files
should be generated...
I read a JNI performance article, and one of the recommendations was to keep
the number of JNI calls low, because each has overhead. In my experience, such
algorithms should be written in C/C++ and called through JNI. Java should be
used only to invoke complex operations (for example, operations on whole
Mats), not to do pixel processing. Or for prototyping.
By the way, we are rewriting our entire algorithm in C++, as we do a lot of
pixel and point processing, and on Android it will be a lot faster.
Furthermore, we plan an iOS version too, so the code can be reused.
Original comment by [email protected]
on 14 Jun 2013 at 5:55
I understand there are a lot of little details to take care of, but first
things first: We need those "accessor" classes. Now, we could have a
BytePixelAccessor, a BytePointerAccessor, etc. but I feel that having one
ByteAccessor class usable on *any* structure would be more useful. Do you not
feel this would be better?
Original comment by [email protected]
on 14 Jun 2013 at 1:22
Let me try to explain this a bit better. Assume we have some `CvMat mat` and
`CvPoint pts`, then I think it would be nice to be able to do something like
this:
IntAccessor matAccessor = mat.getIntAccessor(); // or create ...
IntAccessor ptsAccessor = pts.getIntAccessor();
matAccessor.put(1, 1, ptsAccessor.get(1, 1));
Of course we'd have FloatAccessor and what not, and maybe the names could be
different too, but what I want to emphasize is the reuse of a common class,
used by both CvMat and CvPoint in this case. Do you not feel this would be a
good thing?
Original comment by [email protected]
on 16 Jun 2013 at 1:51
The accessors are about structure. CvMat.data does have structure, and CvPoint
does too, but Pointer does not. They let us write
    matAccessor.get(i, j)
instead of
    mat.getByteBuffer().get(i*mat.step() + j*mat.channels());
The PixelAccessor saves the JNI calls and the buffer creation (let's assume
getByteBuffer does not return a cached instance). The PixelAccessor offers no
more functionality than the current get() methods on CvMat; only its lifetime
is separate from that of the CvMat.
Try to write an Accessor for Pointer. I think it would offer no more
functionality than java.nio.XxxBuffer, so I think it is needless.
Original comment by [email protected]
on 17 Jun 2013 at 4:03
"Multidimensional arrays" in C/C++, the ones accessed like `mat[i][j]`, are
actually allocated in one contiguous area of memory: check it out! Java's
Buffer provides no way to access those. I think it would be very useful to
provide an interface for native multidimensional arrays. Why do you feel
differently?
Original comment by [email protected]
on 18 Jun 2013 at 12:30
Mat is not a multidimensional array, and neither is Pointer or Mat.data.
Actually, I have implemented the accessors already, but have not done any
tests. I will send a patch then. In the meantime, try to implement an
Accessor for Pointer yourself; you'll see there's no point in it.
Original comment by [email protected]
on 18 Jun 2013 at 1:06
Sure, please explain how you would access the pixel at (100, 100) of a
dc1394video_frame_t.image:
http://code.google.com/p/javacv/source/browse/src/main/java/com/googlecode/javacv/cpp/dc1394.java#834
Original comment by [email protected]
on 19 Jun 2013 at 2:00
I don't know; I don't know its structure. If it's a simple multidimensional
array, then it's image.capacity(rows*cols).asByteBuffer().get(100*cols + 100).
I would cache the buffer for fast access to the other pixels.
Original comment by [email protected]
on 19 Jun 2013 at 3:57
Here is the patch. Have a look at it and let me know.
Original comment by [email protected]
on 19 Jun 2013 at 4:32
Attachments:
I deprecated the getXxxBuffer() methods in CvMat, but found out that I still
need them to create an Android bitmap:
    bitmap.copyPixelsFromBuffer(cvImage.getByteBuffer());
For backwards compatibility, I would still deprecate the getXxxBuffer()
methods and create new createXxxBuffer() methods that always return a new
buffer. What do you think?
Original comment by [email protected]
on 19 Jun 2013 at 9:49
Let's go back to comment #16 first. Wouldn't it be nice instead to be able to
do something like this:
ByteAccessor a = new ByteAccessor(image, rows, cols);
byte b = a.get(100, 100);
?
Original comment by [email protected]
on 19 Jun 2013 at 12:11
Sorry for not responding; I'm quite busy now...
Now I get it: CvMat's pixel accessor could be generalized to a
multidimensional array accessor. But it not only has to know each dimension's
length, but the step too.
Now, I would put these in the JavaCPP project, not in JavaCV. In JavaCV, there
could be factory methods in CvMat to simplify its creation. What do you think?
Original comment by [email protected]
on 26 Jun 2013 at 8:02
Yes!!
We don't need to name it "step", even though for images that's obviously what
it's going to be used for as the "width". We don't need to know the actual
width of an image to access one pixel in memory. And to generalize more
easily, we could keep the width, height, step, or whatever in one
"int[] dimensions" array, and it would even be possible to have get/put
methods with varargs, but I'm not sure how that would play with specialized
2D and 3D methods...?
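The "int[] dimensions" idea can be sketched with a hypothetical helper (not actual JavaCPP API): given a strides array, an N-dimensional index is just a dot product, and varargs get/put methods would reduce to it:

```java
// Hypothetical sketch: generalized N-dimensional indexing via strides.
// strides[k] is the distance, in elements, between consecutive indices
// along dimension k; for a 2D image this would be { step, channels }.
public class NdIndex {
    public static long index(int[] strides, int... indices) {
        long idx = 0;
        for (int k = 0; k < indices.length; k++) {
            idx += (long) indices[k] * strides[k];
        }
        return idx;
    }
}
```

A specialized 2D get(i, j) would inline the same computation without the varargs array allocation, which is probably where the tension between varargs and dedicated 2D/3D methods comes from.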
Anyway, as for names, I think ByteAccessor, IntAccessor, etc. sound fine, but
what do you think?
When you have something, post it here:
http://code.google.com/p/javacpp/issues/
Thanks!
Original comment by [email protected]
on 27 Jun 2013 at 12:47
Bump! It would be really nice to have those classes in JavaCPP. Please let me
know if you are still interested in working on this, thanks!
Original comment by [email protected]
on 22 Sep 2013 at 12:46
Yes, I definitely plan to do it, but we postponed our CV project due to other
tasks. If nothing changes, I will get back to it in two to three weeks.
Original comment by [email protected]
on 25 Sep 2013 at 11:08
Hello,
I started to write the API. After writing it, I realized that the code without
this API is so simple that I hesitate over whether a new API is really needed.
Example: let's say we write a function to calculate the sum of all pixels of a
1-channel image. Using the new pixel accessor API, it would be:
public long imgSum(CvMat mat) {
    ByteArrayAccessor matacc = mat.createBytePixelAccessor(false);
    // cache to avoid repetitive JNI calls
    int cols = mat.cols();
    int rows = mat.rows();
    long sum = 0;
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < cols; j++)
            sum += matacc.get(i, j) & 0xff; // 0xff converts to unsigned
    return sum;
}
By directly accessing the Buffer, the code would be:
public long imgSum(CvMat mat) {
    ByteBuffer buf = mat.createDataByteBuffer();
    int step = mat.step();
    int rows = mat.rows();
    int cols = mat.cols();
    long sum = 0;
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < cols; j++)
            sum += buf.get(i*step + j) & 0xff;
    return sum;
}
The only saving is that the formula i*step + j is hidden. Is this worth a new
API?
A smarter programmer could even write it a little faster, like this:
public long imgSum(CvMat mat) {
    ByteBuffer buf = mat.createDataByteBuffer();
    int step = mat.step(), rows = mat.rows(), cols = mat.cols();
    long sum = 0;
    for (int i = 0; i < rows; i++) {
        int pos = i*step;
        for (int j = 0; j < cols; j++, pos++)
            sum += buf.get(pos) & 0xff;
    }
    return sum;
}
This avoids the i*step multiplication at every pixel, a trick I learned by
reading OpenCV's code... This trick is not possible with the pixel accessor.
If I have time tomorrow, I'll benchmark all three versions.
Please have a look at the patches before I continue with the other primitive
types, and let me know what you think.
Viliam
Original comment by [email protected]
on 12 Nov 2013 at 10:15
Attachments:
[deleted comment]
I just remembered that step is always in bytes; it should be divided by
elemSize(). It's been a long time since I worked with OpenCV. I will rewrite
it later.
BTW, our computer vision project is still postponed, maybe for a very long
time. I'm sad about that, because it was very interesting to me, and it would
surely have been a very good-selling mobile application. And finally, we plan
to use neither JavaCPP nor JavaCV in the final product, due to performance.
We do a lot of pixel processing, and a comparable function in OpenCV written
in C runs 10 times faster than our very similar one in Java. That's on
Android; on the desktop, the difference is not so big (probably thanks to the
JIT). But they were very good for our initial prototypes, so I feel I owe a
contribution...
Original comment by [email protected]
on 12 Nov 2013 at 10:28
It's fine to use a Buffer directly, but it does get complicated when accessing
many elements of a matrix. It's a lot easier if we can specify the indices as
we would in MATLAB, for example.
Your API looks fine, thanks! Still, I would shorten the names to ByteAccessor,
etc. instead of ByteArray..., BytePixel..., etc., because it's shorter and it
works for direct NIO Buffer objects as well, not just arrays or pixels, so it
removes potential confusion. And com.googlecode.javacpp.accessors (should this
be singular or plural?) is good as a package name, I think. Putting them in a
separate package also indicates that they may be used independently from the
classes in the main package, so good idea, thanks!
BTW, we can still use JavaCPP to call custom C/C++ functions. It's much nicer
than using raw JNI with the NDK, while imposing almost no additional overhead.
Original comment by [email protected]
on 17 Nov 2013 at 2:08
Okay, so I'll finish the rest. I prefer "accessors" (plural). I also prefer
ByteArrayAccessor over ByteAccessor, because it improves readability; IDEs
are there to complete names. Anyway, I'll write "BAA" in Eclipse, and I'll
probably get a better match than with "BA". I even thought of
ByteNdArrayAccessor, but did not like it. But I'll do what you say; let me
know.
I'd like to prepare an article, maybe on the wiki, where I summarize the
performance characteristics of each approach. Then I'd like to refer to it
from the javadocs. There I could summarize the individual ways of accessing
pixels along with their benchmarks.
Original comment by [email protected]
on 18 Nov 2013 at 6:03
Regarding the usage of JavaCPP in our project: we only have two JNI methods.
I had never used JNI before, so it was a way to learn it and to be able to
better understand JavaCPP, because, due to the lack of documentation and my
lack of experience with JNI, it seemed complicated to me...
Original comment by [email protected]
on 18 Nov 2013 at 6:06
I'm not sure it's a good idea to name something "ByteArrayAccessor" when it
doesn't access a byte array (it's technically accessing a buffer, in the
non-copied case at least)... What do you think?
And something I've just noticed from your code: `ByteBuffer.put(byte[])` is
very, very slow on Android, which is why I added `BytePointer.put(byte[])`:
https://code.google.com/p/javacpp/issues/detail?id=11
So let's think a bit more about that, and about your `copyBack()` method,
before going further.
I've added you as a contributor, so please feel free to add a wiki page!
Maybe you'd prefer to add one to the JavaCPP site though?
If you have only two JNI methods, maybe JavaCPP isn't so useful, yes.
Original comment by [email protected]
on 24 Nov 2013 at 2:01
Regarding issue 11, it seems to be some improper buffer implementation in
Android. That's why I even created the otherwise odd Copying version. Maybe
if I used the pointer.get(int) method, that would be faster and make the
Copying version useless. I have to benchmark it.
Regarding the name, what better name do you have? We started with
BytePixelAccessor for CvMat, but then generalized it to all arrays, hence
ByteArrayAccessor. Alternatives that come to my mind are:
ByteNdArrayAccessor
ByteNdElementAccessor
ByteNdArrayElementAccessor
ByteAccessor
ByteUtils
BytePointerUtil
BytePointerAccessor
ByteMdArrayAccessor
ByteMultiDimArrayAccessor
ByteElementAccessor
ElementAccessorByte
ArrayElemAccessorByte
ByteArrayElemAccessor
I think the only self-describing one is the longest. Any other would have to
be explained in the javadoc.
Original comment by [email protected]
on 25 Nov 2013 at 3:17
JNI calls are not as efficient as accessing array elements, so the copying
accessor makes sense. But we still need JNI to copy the whole array, something
like BytePointer.get(byte[])/put(byte[]), and we might as well use that.
So the names could be ByteBufferAccessor and BytePointerAccessor. I think
that's pretty self-describing? Maybe some other name than "Accessor" could
clarify the meaning, although I wouldn't want to lengthen the name further,
so I'm not sure what...
Original comment by [email protected]
on 26 Nov 2013 at 6:25
Maybe we could name them "Indexer" since it's a concept already used in C# for
multidimensional arrays, but not in Java?
http://www.javacamp.org/javavscsharp/indexer.html
http://msdn.microsoft.com/en-US/library/2yd9wwz4%28v=vs.80%29.aspx
That single word carries both the meaning of Array and Accessor, so we could
shorten the names, i.e.: ByteIndexer...
Original comment by [email protected]
on 30 Nov 2013 at 4:57
Okay, so here are the complete patches with the Indexer API. Apply both
patches to javacpp (the first adds "accessors", the second deletes that and
adds "indexers"; I was too lazy to do a clean clone...).
I liked the "Indexer" name most, so I completed it using that. I also wrote a
JUnit-like test, but it is not included in the patches, as you don't have
JUnit set up in your project.
I also did a performance test, with interesting results:
Test configurations:
1. JDK 1.6.0_41 64-bit, Athlon X2 64, 2-core 3 GHz, run with the -server switch
2. JDK 1.7.0_17 64-bit, Athlon X2 64, 2-core 3 GHz, run with the -server switch
3. Android 2.3.5, Qualcomm MSM8255, 1-core 1 GHz
(1) (2) (3)
sumBuffer 3507 3432 2 835 870
sumPointer 93962 93992 1 683 551
sumBufferCopy 5176 5275 132 824
sumPointerCopy 3824 4112 125 451
sumIndexer 3342 17884 3 329 522
sumIndexerCopy 4234 20491 742 822
cvSum 1885 1866 16 809
addBuffer 7688 7375 6 068 920
addPointer 207170 238679 3 636 462
addBufferCopy 9399 9423 137 023
addPointerCopy 6683 6488 135 833
addIndexer 7277 20948 7 160 247
addIndexerCopy 7628 8221 1 317 675
cvAddS 1986 1974 33 465
Test sources are in the attached Test.java. It tests two operations
(intensive in pixel access): "sum" adds up all pixel values (it only reads),
and "add" modifies pixels by incrementing them by one. As you see, there are
huge differences.
The "never do" version is accessing values through the pointer element by
element (sumPointer and addPointer). On Android, the "never do" is directly
accessing native pixels; copy them instead. I did not reproduce the Android
slowness with bulk ByteBuffer.get(byte[]) (compare sumBufferCopy and
sumPointerCopy).
Indexer access is a bit slower than working directly with the buffer or with
a copied array on Android, but its code is the simplest. This might be due to
unnecessary method calls, and because I cannot use a premultiplied value the
way I do when accessing the buffer. Strangely, it is a lot slower on JDK
1.7...
Not surprisingly, the C version is always the fastest (1.7-3.5 times on
HotSpot and 4-7 times on Android). And it does not depend on JIT/GC mood :)
I don't have a device with a newer Android handy; can you do a test with one?
I expect this problem will disappear (or maybe already has) one day.
My final recommendation is to always prefer C. Use JavaCV only if:
- you don't need to access pixels, i.e. you only call OpenCV's bulk operations
(such as a sequence of various image transformations)
- you need portability (but you will need the native libraries anyway)
- you are prototyping (i.e. fewer segfaults in Java)
- your algorithm is fast enough
If you decide to use pixel access, use these methods:
On Android:
1. access copied data manually
2. use a copying indexer for less code and slower speed (6-10 times)
On the JDK:
1. access the buffer manually
2. use a direct indexer for less code and possibly slower speed (it may not
be slower, but may be up to 5 times slower). Run on a good JRE :)
3. the more times you access the same pixel, the more copying the array pays
off
In general:
Always create the copy using Pointer.get(), not Buffer.get().
---
The above could be the basis for the wiki page I recommend writing. We could
link to it from the sources.
Please review the patches, and, if you can, try to review the benchmark too,
in case I have done something wrong.
Viliam
Original comment by [email protected]
on 11 Dec 2013 at 9:51
Attachments:
I happened to get a Nexus 10 with Android 4.4 handy (ARM Cortex-A15, 2-core,
1.7 GHz); here are the results:
sumBuffer 965 584
sumPointer 5 105 642
sumBufferCopy 34 331
sumPointerCopy 34 406
sumIndexer 1 151 433
sumIndexerCopy 165 345
cvSum 3 335
addBuffer 2 128 167
addPointer 11 186 691
addBufferCopy 35 940
addPointerCopy 35 791
addIndexer 2 574 975
addIndexerCopy 410 906
cvAddS 8 068
Still no change: buffer element access is slow, buffer bulk access is as fast
as the pointer's, and C is 10 times faster (in this case OpenCV's cvSum or
cvAddS might even have used multiple cores; I did not check the code). Still,
copying is a lot faster.
Original comment by [email protected]
on 11 Dec 2013 at 10:47
[deleted comment]
Great, thanks! It's going to take me a while to review that. It's actually
important functionality I think, so I want to make sure we get the API right. I
have a couple of comments for now though:
1. Please do provide patches that apply cleanly to the HEAD of master branches
(no need right now though)
2. Are you OK with licensing that under the GPLv2 with Classpath exception? Or
would you prefer something else?
3. Maven supports unit testing if we place files in the right directories, so
feel free to put stuff there
4. You seem to provide a couple of unrelated fixes for opencv_core.java. Are
all the changes that are in there indeed fixes?
5. Have you tried to write data back to native memory? Direct NIO buffers on
Android should be pretty slow in that case.
Thanks for everything!
Original comment by [email protected]
on 15 Dec 2013 at 7:49
1. The patches should apply to the HEAD. There are just two patches because I
did it in two commits.
2. Yes.
3. I'll try, hopefully next week.
4. There is only one unrelated fix, for CvMat.isContinuous(). It always
returned false, because it used type() instead of raw_type(), and type() has
some bits cleared, including the continuous bit.
5. The "sum" benchmark does not write back; it only reads pixels. The "add"
benchmark does (it modifies pixels).
Original comment by [email protected]
on 15 Dec 2013 at 11:16
5. Does this mean both Buffer.put(array) and Buffer.get(array) work
efficiently on Android? I wonder why Adam found otherwise:
https://code.google.com/p/javacpp/issues/detail?id=11
Original comment by [email protected]
on 15 Dec 2013 at 1:04
I reproduced issue 11. I actually wonder why it did not reproduce in my test
case. Maybe because I tested only ByteBuffer, while in issue 11 they had the
problem with floats or ints; I will have to give it a try...
Anyway, I use Pointer's bulk methods for the copying in the Copying indexers.
Ok, so let's take a step back. "NIO" stands for "New I/O", and the intention
was that we should be able to do everything, and do it efficiently, using
only *Buffer objects and nothing else, but on Android this isn't the case
yet. So we're left to hack things up for now. And your original proposal was
optimal in that sense.
What would you think of the following, then? We have only one Indexer per
type (ByteIndexer, ShortIndexer, etc.), and they all accept a Buffer object
of the corresponding type. But we also provide a flag to indicate whether we
want to use it in "copying" mode. (BTW, it's possible to create a Pointer
from a Buffer; that's what I'm doing in FFmpegFrameRecorder.java, for
example.) And eventually, when Android gets better, this flag should become
obsolete, but at least we won't be stuck with a hackish API.
How does that sound?
Original comment by [email protected]
on 16 Dec 2013 at 3:56
- Changed state: Started
Sounds good. But the normal use case for the copying version would be the
following:
    CopyingByteIndexer ix = (CopyingByteIndexer) ByteIndexer.createXxx(..., true);
    // doing work...
    // ix.copyBack();
If we wanted to silently ignore the copying flag, we would no longer return a
CopyingByteIndexer instance. So I recommend making the Copying- and Direct-
subclasses package-visible (i.e. default visibility) and putting copyBack()
in the abstract class, where it would be a no-op for the Direct- version. The
documentation would clearly state when we need to call copyBack().
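A pure-java.nio sketch of that shape (all names assumed; a real patch would presumably use JavaCPP's Pointer bulk methods for the copies, per the issue 11 discussion) could be:

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: one public abstract class, package-private subclasses
// chosen by a "copying" flag, and a copyBack() that is a no-op when direct.
public abstract class ByteIndexer {
    protected final int step;

    protected ByteIndexer(int step) { this.step = step; }

    public static ByteIndexer create(ByteBuffer buffer, int step, boolean copying) {
        return copying ? new CopyingByteIndexer(buffer, step)
                       : new DirectByteIndexer(buffer, step);
    }

    public abstract byte get(int i, int j);
    public abstract void put(int i, int j, byte value);
    public abstract void copyBack(); // no-op for the direct version
}

class DirectByteIndexer extends ByteIndexer {
    private final ByteBuffer buffer;
    DirectByteIndexer(ByteBuffer buffer, int step) { super(step); this.buffer = buffer; }
    public byte get(int i, int j) { return buffer.get(i * step + j); }
    public void put(int i, int j, byte value) { buffer.put(i * step + j, value); }
    public void copyBack() { /* nothing to do: writes went straight through */ }
}

class CopyingByteIndexer extends ByteIndexer {
    private final ByteBuffer buffer;
    private final byte[] array;
    CopyingByteIndexer(ByteBuffer buffer, int step) {
        super(step);
        this.buffer = buffer;
        this.array = new byte[buffer.capacity()];
        buffer.position(0);
        buffer.get(array);  // one bulk copy in
        buffer.position(0);
    }
    public byte get(int i, int j) { return array[i * step + j]; }
    public void put(int i, int j, byte value) { array[i * step + j] = value; }
    public void copyBack() {  // one bulk copy out, only when the user asks
        buffer.position(0);
        buffer.put(array);
        buffer.position(0);
    }
}
```

With this shape, code written against the abstract class can always call copyBack() unconditionally, and the flag becoming obsolete later would not break any callers.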
I did tests with IntPointer/IntBuffers, here are the results:
(1) (2) (3)
sumBuffer 1673 1781 1055676
sumPointer 23049 22984 452716
sumBufferCopy 4172 4168 1081054
sumPointerCopy 3026 3026 40960
sumIndexer 1612 5785 1180523
sumIndexerCopy 4137 6935 302331
cvSum 1399 1364 9222
addBuffer 2325 2289 2119384
addPointer 55452 48051 865838
addBufferCopy 7648 7529 1084564
addPointerCopy 4990 4744 45794
addIndexer 2230 6889 2490893
addIndexerCopy 7215 7943 448577
cvAddS 1778 1870 11267
Now the slowness of the bulk buffer get is reproduced (the issue 11 problem).
And the pointer copy takes 2/3 of the time of the buffer copy even on the
JDK, so it's always faster.
What's interesting: with bytes, addPointerCopy was faster than addBuffer;
with ints, it's the opposite. Now it's very tough to give a general rule
about what's faster. I can think of these:
- on Android, always work on a copy; on the JDK, do your own test
- work manually with a buffer or a copy for better speed
- use an indexer for shorter but slower code
Tomorrow, I'll rewrite the code to always take a Buffer parameter. And what
do you think about hiding the copying and direct subclasses? Anyway, I don't
think we will be able to remove the copying version any sooner than in 15-20
years :)
Viliam
Original comment by [email protected]
on 16 Dec 2013 at 5:54
from javacv.
Well, I don't know about the subclasses either. In the case of a direct NIO
buffer, we can copy its content to a non-direct NIO buffers. We would basically
use the two-class pattern at the Buffer level. No need to duplicate it a level
higher, if it's as efficient that way. I'm pretty sure that would be as
efficient as an array, even on Android, but we should test this out.
BTW, when comparing with optimized functions like cvSum(), be sure to optimize
the code in Java as well, by unrolling loops, etc:
http://stackoverflow.com/questions/10784951/do-any-jvms-jit-compilers-generate-c
ode-that-uses-vectorized-floating-point-ins/10809123#10809123
Original comment by [email protected]
on 16 Dec 2013 at 6:12
from javacv.
Not sure about your 15~20 years estimate either
http://www.extremetech.com/computing/170677-android-art-google-finally-moves-to-
replace-dalvik-to-boost-performance-and-battery-life
:)
Original comment by [email protected]
on 16 Dec 2013 at 6:19
from javacv.
Here are results for Android with unrolled loop, compared to C
(1) android 2.2 1-core 1GHz with unrolled loops
(2) same as above, without unrolling
(1) (2)
sumBuffer 981770 1055676
sumPointerCopy 31024 40960
cvSum 9002 -
As you see, a bit better, but not a lot. Maybe the Android's JIT does automatic
unrolling, but still not comparable to C. So still no reason to use java on
android if you need performance. I checked the source of cvSum, they explicitly
unroll 4 times in case of 1-channel images. In others, they just do simple loop
(and the compiler might do automatic unroll).
I also tried the the unrolled version on HotSpot, the difference is negligible:
(1) with unrolled loops, jdk 1.6.0_41-64bit, athlon x2 64, 2-core 3GHz, run
with -server switch
(2) same as above, without unrolling
(1) (2)
sumBuffer 1696 1673
sumPointerCopy 2950 3026
cvSum 1453 -
Note, that I measure the the times as an average of 5 runs, after 15 "heat up"
runs, with the -server JVM switch (-server causes lower threshold for the JIT
to take place). Try to do this with your example at
http://stackoverflow.com/a/10809123/952135, probably openjdk's JIT does not
unroll automatically.
Copying still helps a lot, even on android 4.4 (see comment 35). I will test
the ART and will let you know, maybe then we can disable copying version in 10
years, when non-art is obsolete...
Tomorrow, I'll rewrite indexers to your previous notes.
Original comment by [email protected]
on 18 Dec 2013 at 7:44
from javacv.
Well in any case, I think we should compare with the same loop in C++ instead
of calling cvSum(), just to be fair.
To be clear, I see the constructor of ByteIndexer doing something like this:
ByteIndexer(Buffer buffer, boolean cached) {
if (cached) {
this.buffer = this.cache = ByteBuffer.allocate(buffer.size());
this.pointer = new Pointer(buffer);
} else {
this.buffer = buffer;
}
}
With a couple of helper methods:
void fillCache() {
pointer.get(cache.array());
}
void flushCache() {
pointer.put(cache.array());
}
Seems like that would be the best things to do...
Original comment by [email protected]
on 18 Dec 2013 at 11:52
from javacv.
Tried to implement suggestions from comment 46, but failed due to slower
performance:
1. Using heap buffer instead of array is slower:
(1) Copying version with buffer
(2) Copying version with array
(1) (2)
sumIndexerCopy 6348 4137
addIndexerCopy 10934 7215
Tested only on HotSpot 1.6, but I have not doubt it will be slower on Android
2. Using buffer to initialize indexers is slower in case of many small
matrices. It does unnecessary conversion from pointer to buffer and back to
pointer. We can do about 1000 such conversions per ms on my 3GHz Athlon x2. On
android, it will be definitely longer.
CvMat (and probably all JavaCPP-mapped classes) return their arrays as
Pointers. I don't think the API from latest patch is "hackish". It is part of
JavaCPP and supposed to simplify iterating multidimensional arrays, and it is
created from pointer, that is commonly used by JavaCPP.
3. I renamed copyBack() to flushCache(), and added fillCache() method, moved
that to the base class and renamed the "copying" flag to "cached". Made
subclasses package-visible.
4. I added JUnit test.
Original comment by [email protected]
on 19 Dec 2013 at 8:17
Attachments:
from javacv.
Could you provide the code that gave you these results? (BTW, no need to update
all your classes until we decide on the final API...)
Running the attached TestIndexer on my gear here (java version "1.7.0_45"
OpenJDK fedora-2.4.3.0.fc19-x86_64 u45-b15) outputs this:
-249231677 171
-249231677 209
-249231677 219
-249231677 223
-249231677 233
-249231677 223
-249231677 167
-249231677 167
Doesn't look like there's a clear winner: Both are fast. What do you get?
Original comment by [email protected]
on 20 Dec 2013 at 2:38
Attachments:
from javacv.
Try to run more iterations, and add -server switch to the jvm (don't know if
openjdk supports it). The aim is to JIT to take place. I dont have my code, I
rewrote the indexer and ran test from comment 34. After seeing that it is
slower, I reverted it.
Original comment by [email protected]
on 20 Dec 2013 at 5:48
from javacv.
[deleted comment]
from javacv.
I ran your test on my machine. Here are the results:
1. HotSpot 1.7.0_17 64-bit
2. HotSpot 1.6.0_41 64-bit
3. Android 4.4 nexus 7 2012
(1) (2) (3)
Array 270 270 412
Buffer 290 240 2262
So not so big difference on HotSpot. Apparently, we are in the area of
microoptimisation, and it seems very dependent on runtime configuration. As we
saw earlier, caching indexer is only useful on Android (or maybe some less
performant VMs too), and the third result decides it (I only used 5MB array,
because of memory constraints).
So for this moment, I think, that the version from the last patch is best so
far...
Original comment by [email protected]
on 22 Dec 2013 at 10:04
from javacv.
I will not have access to my development computer until January 7th, but I will
be able to reply. I wish you Merry Christmas and a happy new year :). I'm glad
for this cooperation, we will be definitely a lot better than the current API...
Original comment by [email protected]
on 22 Dec 2013 at 10:06
from javacv.
Thanks for trying it out! Yes, I understand that you may not always have access
to your machines, that's fine...
So, it looks like Android doesn't like it that way. Let's think about this a
bit more. I'm not a big fan of overengineering... Merry Christmas!
Original comment by [email protected]
on 23 Dec 2013 at 3:02
from javacv.
I think I see the problem with Android. According to "Ben, one of the engineers
working on the JIT @ Google":
http://stackoverflow.com/questions/4912695/what-optimizations-can-i-expect-from-
dalvik-and-the-android-toolchain
Dalvik can only inline methods that do not have branches, but ByteBuffer.get()
does branch in checkIndex():
https://android.googlesource.com/platform/libcore/+/master/luni/src/main/java/ja
va/nio/ByteArrayBuffer.java
So it looks like we can't use NIO buffers at all if we're looking into having
high-ish performance on Android :(
Original comment by [email protected]
on 23 Dec 2013 at 3:19
from javacv.
Happy New Year! So, any news on your part?
I have a couple of new ideas about how the API should look like, so I'm jotting
them down here. For consistency, I think we should mimic how things are done
with Buffer and in C/C++. In the case of ByteIndexer, for example:
ByteIndexer wrap(BytePointer pointer, int ... sizes)
ByteIndexer wrap(ByteBuffer buffer, int ... sizes)
ByteIndexer wrap(byte[] array, int offset, int ... sizes)
byte get(int[] indices)
ByteIndexer put(int[] indices, byte b)
along with optimized get() and put() for 1, 2, and 3 dimensions
I don't think we'd need to provide a factory method for "steps" because that's
not how it works with multidimensional arrays in C/C++ anyway, and AFAIK not
even OpenCV ever aligns rows that badly on purpose, or does it?
What do you think?
Original comment by [email protected]
on 11 Jan 2014 at 10:48
from javacv.
Hello,
regarding the name, I don't mind, but I personally prefer "create" version,
as this is commonly used in Java for factory methods. On Wikipedia they use
"new" prefix. "Wrap" evokes the delegate pattern, which is not exactly this
case.
Secondly, OpenCV uses non-continuous arrays for submats (cvGetSubRect), so
we need the "steps" version.
I will add the array version of get/put probably tomorrow.
D�a 11.1.2014 11:48 pou��vate� <[email protected]> nap�sal:
Original comment by [email protected]
on 13 Jan 2014 at 5:54
from javacv.
Thanks, but no need to update the source right away!
The problem with "create" is that we need to put something after, while "wrap"
is already a name used by ByteBuffer, etc to do exactly the same thing (that
is, add missing functionality from the wrapped object). I've already done that
for capacity(), position(), and limit() for example and it's working just fine,
so IMO we might as well keep doing the same thing :)
AFAIK, cvGetSubRect() is never going to cut in the middle of a pixel, so even
if the start position changes, the effective "width" isn't going to change.
Unless we keep two sets of variables? Could you elaborate on how is this could
work for 4, or even 10 dimensional access?
Original comment by [email protected]
on 14 Jan 2014 at 12:48
from javacv.
Wait a minute, when creating a 24-bit per pixel RGB image with a width of 1001,
OpenCV returns an image with a "widthStep" of 3004 bytes: Ouch.
So, I guess your proposal is to base everything on steps and not on dimensional
sizes?
Original comment by [email protected]
on 17 Jan 2014 at 1:21
from javacv.
Sorry for responding late.
We need to keep the "steps" version, and we can decide on keeping the
"dimensions" one. I think the latter it is simpler to understand for javists,
who do not commonly use array padding or store multidimensional arrays in 1D
ones. But, on the other side, if someone will try to create indexer for a CvMat
without understanding it's internal storage, trying to provide
rows/cols/channels as dimensions (which he understands) instead of steps (which
he doesn't) will lead to more confusion. So I don't know whether to keep it or
remove it for the same reason of simplicity. As of the rule "less code is
better", I would remove it, it will force people to learn the necessary
internals or use the CvMat.createXxxPixelIndexer.
And all our indexers should only be created from Pointers. There are already
methods to convert bytes/pointers/arrays from/to each other. Our utilities are
supposed to be used with JavaCPP classes, which always converts pointers to
buffers and not vice versa and never uses java arrays. The other methods would
only be useful, if indexers are used as a general purpose utility for data
coming from somewhere else, which I don't expect. I think, this would be
overengineering and not really needed functionality.
Original comment by [email protected]
on 20 Jan 2014 at 6:40
from javacv.
Yeah, we'll go with steps. Although we might want to name it "stride" because
that's what they use in Java:
http://docs.oracle.com/javafx/2/api/javafx/scene/image/PixelReader.html
http://docs.oracle.com/javase/7/docs/api/java/awt/image/Raster.html
I wouldn't consider it overengineering, because we have to use a Java array in
the case of Android. And actually I think it would simplifying things. We'd
have ByteBufferIndexer that does its thing with ByteBuffer (created from a
Pointer or not), and ByteArrayIndexer, that would accept an array, or create
one from a Pointer. This way, the user wouldn't have to consider
ByteArrayIndexer as some sort of "cache" or "copy". Conceptually it would just
be doing something with an array, which it got from a Pointer maybe. And we
could have easy-to-understand ByteArrayIndexer.readPointer()/writePointer() or
something that would do exactly what one would expect! Sounds nice to me. What
about you?
Original comment by [email protected]
on 24 Jan 2014 at 8:20
from javacv.
The array thing is only to solve poor performance of direct ByteBuffer on
Android. While working with data on native heap, we should normally access it
via Buffer-s, that should be in principle the fastest. But due to poor
implementation on android, we have to hack it with an array, and not only that,
we have to get that array via Pointer. For the user, the change is only in one
boolean parameter, that could be stored in one global constant and changed
appropriately.
I still disagree with making this a general purpose utility for accessing data
of this structure, for few reasons:
- CV applications are usually performance constrained. For this, usually the
most direct method is the fastest. In our case, even using indexer is bit
slower than directly using buffers.
- It is trivial to write utility such as this one, if one understands the data
structure.
- No one will search for it here, JavaCPP or JavaCV is not a general purpose
utility package. If such package was already present in OpenCV, then this does
not hold.
- It's purpose is mainly to provide convenience method for accessing elements
of CvMat, as a faster replacement of now deprecated methods of CvMat.
Similarly, I don't agree with "stride". We are not translating OpenCV to Java
terminology. OpenCV uses "step", so should we. For this reason, I would even
rename the "createBytePixelIndexer" to "createByteDataIndexer" in the CvMat
class, because it is internally named "data", though in most cases is's
"pixels". We should stick to simple mapping.
Maybe we could add the Indexer factory to another JavaCpp-ported libraries,
which have similar structure of data. If it is not consistently named "steps"
here, then I will agree with using "stride" within java.
But you have the last word.
Original comment by [email protected]
on 27 Jan 2014 at 6:28
from javacv.
Let's take a step back here. My intention with whatever goes in JavaCPP is to
have it available to more than just OpenCV. Of course it's "trivial", but it
happens that many people like to use MATLAB because it offers that kind of
"trivial" functionality. Even Scala provides a feature for that:
http://www.tutorialspoint.com/scala/scala_arrays.htm
If someone were to add that to some other package, that would be nice.
Unfortunately, it's not happening, so I thought I might as well offer it as
part of JavaCPP, instead of only JavaCV, that's all. And if one were to happen
to use some other library in Java that uses arrays (because that's the whole
point of using Java, we're not limited to OpenCV), then that user could use
those facilities to exchange data more easily between whatever other library
one wishes to use with OpenCV, such as JavaFX, for example.
BTW, OpenCV *does* already offer that kind of "trivial" functionality, but JNI
cannot yet inline native functions, so if we want performance, we need to rely
on direct NIO buffers, or on arrays in the case of Android.
There must be some better name for "stride", "step", or "pitch"... Wikipedia
doesn't offer much insight I find:
http://en.wikipedia.org/wiki/Stride_of_an_array
Original comment by [email protected]
on 27 Jan 2014 at 8:34
from javacv.
What do you think of this API?:
package com.googlecode.javacpp.indexers;
import java.nio.ByteBuffer;
import com.googlecode.javacpp.BytePointer;
/**
* Convenient way to access elements of an signed 8-bit multidimensional array.
*
* <p>Indexerss are available for all java primitive types.
*
* @author Viliam Durina
*/
public abstract class ByteIndexer {
protected final int[] strides;
public ByteIndexer(int[] strides) {
this.strides = strides;
}
/** Get element [i, j] of multidimensional array */
public abstract byte get(int i, int j);
/** Get element [i, j, k] of multidimensional array */
public abstract byte get(int i, int j, int k);
/** Get element at specified coordinates of multidimensional array */
public abstract byte get(int ... coords);
/** Set element [i, j] of multidimensional array */
public abstract void put(byte v, int i, int j);
/** Set element [i, j, k] of multidimensional array */
public abstract void put(byte v, int i, int j, int k);
/** Set element at specified coordinates of multidimensional array */
public abstract void put(byte v, int ... coords);
/** Copies modification back to the native memory after processing is done. No-op, if not cached. */
public abstract void flushCache();
/** Discards modifications to the cached copy and populates cache anew. No need to call this after creation. No-op, if not cached */
public abstract void fillCache();
/**
* Factory method.
*
* @param pointer The native pointer. It should have the {@link BytePointer#capacity(int)} set prior to the call.
*
* @param strides <code>strides[0]</code> - byte position of the [1, 0, 0]th element,
* <code>strides[1]</code> - byte position of the [0, 1, 0]th elemnts etc.
* If the array is contiguous, see {@link #calcContiguousStrides(int[])}
* See <a href='http://en.wikipedia.org/wiki/Stride_of_an_array'>Wikipedia/Stride of an array</a>.
*
* @param cached Whether to copy the buffer (this gives faster access on Android devices, tested up to android 4.4,
* but do your own test)
*
* @return Initialized instance
*/
public static ByteIndexer wrap(BytePointer pointer, int[] strides, boolean cached) {
if (cached)
return new CachingByteIndexer(pointer, strides);
else
return new DirectByteIndexer(pointer.asBuffer(), strides);
}
/**
* @see #wrapNonUnit(ByteBuffer, int[], boolean)
*/
public static ByteIndexer wrap(ByteBuffer buffer, int[] strides, boolean cached) {
if (cached)
return new CachingByteIndexer(new BytePointer(buffer), strides);
else
return new DirectByteIndexer(buffer, strides);
}
/**
* @see #wrapNonUnit(ByteBuffer, int[], boolean)
*/
public static ByteIndexer wrap(byte[] data, int[] strides, boolean cached) {
if (cached)
throw new UnsupportedOperationException("We don't support caching of java arrays");
else
return new CachingByteIndexer(data, strides);
}
/**
* Calculates stride values for supplied dimensions for contiguous array.
*
* See <a href='http://en.wikipedia.org/wiki/Stride_of_an_array'>Wikipedia/Stride of an array</a>.
*/
public static int[] calcContiguousStrides(int[] dims) {
int strides[] = new int [dims.length];
strides[dims.length -1] = 1;
for (int i=dims.length - 2; i>=0; i--)
strides[i] = dims[i] * strides[i+1];
return strides;
}
}
It has "wrap" method for strides. I removed the createForDims factory method,
and replaced it with calcContiguousStrides, which calculates strides from
dimensions. I also gave the link to the wikipedia page, which I find useful for
those, who are not familiar with "strides".
I don't like the wrap(byte[]) method. I throw UnsupportedOperationException
there, if caching is requested. For caching to work (if we want it), I would
need to crate a third ByteIndexer subclass that will clone the array and be
able to copy back. Or complicate the code of CachingByteIndexer to accept array
which it will clone. Do we really want that?
Original comment by [email protected]
on 28 Jan 2014 at 7:01
from javacv.
Well, that still looks like a poor hack... As I explained a few times already,
I'd like to put something a bit more generally useful in JavaCPP. You said that
"even using indexer is bit slower than directly using buffers", but this isn't
even true. With a properly optimizing compiler, everything can gets inlined and
moved around appropriately. Here's proof that OpenJDK 7 does that already. This
is the output of the attached "TestIndexer2" on my machine:
indexer: 3151 ms
array: 3152 ms
This shows that it doesn't need to be slower than directly using arrays or
buffers.
Now, about the API, I guess we should try to make it similar to C# since they
already have something like that in there:
http://msdn.microsoft.com/en-us/library/system.array.createinstance
So, naming the factory method "create" might not be such a bad idea, but then
again in our case we don't allocate the data, so "wrap" might still be a
better... Also, we note that the user provides the dimensions of array, not the
strides. The runtime should by default allocate whatever is most efficient for
the machine, but we must also provide an additional optional "stride"
parameter, just like it's possible with OpenCV. Consequently, the factory
methods might look something like
ByteIndexer.wrap(byte[] array, int ... sizes)
ByteIndexer.wrap(ByteBuffer buffer, int ... sizes)
ByteIndexer.wrap(byte[] array, int[] sizes, int[] strides)
ByteIndexer.wrap(ByteBuffer buffer, int[] sizes int[] strides)
And the accessors:
byte get(int ... indices)
ByteIndexer put(int i, byte b)
ByteIndexer put(int i, int j, byte b)
ByteIndexer put(int i, int j, int k, byte b)
ByteIndexer put(int[] indices, byte b)
With parameters ordered to be consistent with both arrays and NIO buffers.
Along with a couple of utility methods like read(Pointer)/write(Pointer) I
guess.
That's starting to sound like something I want to use. Do you feel differently?
Do you still feel like this would not be generally useful functionality?
Original comment by [email protected]
on 3 Feb 2014 at 1:06
from javacv.
Original comment by [email protected]
on 3 Feb 2014 at 1:08
Attachments:
from javacv.
First to the benchmark: I was bit surprised with your results, but now it seems
logical to me. I even tried to rewrite the array version to something I thought
should be faster, but actually, it was a bit slower than before:
static long testArray(byte[] array) {
long sum = 0;
for (int i = 0, row = 0; i < SIZE; i++, row += SIZE) {
for (int j = 0, idx = row; j < SIZE; j++, idx++) {
sum += array[idx];
}
}
return sum;
}
I'm rather confused of this micro-optimisation. In my previous tests, the
indexer version was really slower, look at comment 34, test code is there. But
not always :(. Maybe it was because of base class and subclasses and virtual
methods. Definitely, direct access is simpler for compiler to optimize, even
though it might be able to optimize the indexer version to the same performance.
But leave that for now, we will include indexer in any case. Let's propose the
API. I don't understand the method
ByteIndexer.wrap(byte[] array, int[] sizes, int[] strides)
Why do we need both sizes and strides? Internally, we will only use strides. If
we did range checking, it would only make things slower.
And should we not support automatic copying and leave that to the user? That
would remove the need to have base class and subclasses, but the code would be
more complicated, and does it have performance benefit on android to omit it?
So please finalize the API, and I will implement it. I'd wish to finally finish
this neverending discussion :)
Original comment by [email protected]
on 4 Feb 2014 at 5:38
from javacv.
Sorry for the delay. Been busy with other stuff, check it out BTW:
http://code.google.com/p/javacpp/wiki/Presets
And forgot to reply.
Regarding your comment #34, "sumBuffer" and "sumIndexer" are pretty much the
equal, and the same for "addBuffer" and "addIndexer" so I'm not sure what you
are surprised about. Of course it's a bit more difficult for a compiler to
optimize virtual calls, but Hot Spot does it just fine.
Some code that uses an `Indexer` might need to know the sizes to perform some
operation. So we'd need to provide some `int[] sizes()` method, and also a
`int[] strides()` one too I guess.
What do you mean by "automatic copying"? When would the data get copied? AFAIU,
this requires manual intervention, however we look at it. Since we have to do
it manually anyway, we might as well make the interface more general, that's
all. If we can do it automatically somehow, then that's a different matter, so
please do explain.
Yeah, I agree, it would be nice to be able do this work in person :)
Original comment by [email protected]
on 2 Mar 2014 at 5:05
from javacv.
Unless you have some news concerning the support for the "automatic copying"
you were mentioning, I think I've finally come up with a design that fully
satisfies me! Basically, I thought of having an interface Indexable similar to
Iterable, where the Indexable/Indexer pair has similar interactions to the
Iterable/Iterator one. This way, the Indexer code needs to know nothing about
Pointer, and we can copy the data around in a much more natural way.
public abstract class Indexer {
protected int[] sizes;
protected int[] strides;
public int[] sizes() { return sizes; }
public int[] strides() { return strides; }
public abstract Object array();
public abstract Buffer buffer();
}
public abstract class ByteIndexer extends Indexer {
public ByteIndexer wrap(ByteBuffer buffer, int ... sizes) { ... }
public ByteIndexer wrap(byte[] array, int offset, int ... sizes) { ... }
public ByteIndexer wrap(ByteBuffer buffer, int[] sizes, int[] strides) { ... }
public ByteIndexer wrap(byte[] array, int offset, int[] sizes, int[] strides) { ... }
public abstract byte get(int i);
public abstract byte get(int i, int j);
public abstract byte get(int i, int j, int k);
public abstract byte get(int ... indices);
public abstract ByteIndexer put(int i, byte b);
public abstract ByteIndexer put(int i, int j, byte b);
public abstract ByteIndexer put(int i, int j, int k, byte b);
public abstract ByteIndexer put(int[] indices, byte b);
}
class ByteBufferIndexer extends ByteIndexer { ... }
class ByteArrayIndexer extends ByteIndexer { ... }
public interface Indexable {
<I extends Indexer> I createIndexer(boolean direct);
void readIndexer(Indexer i);
void writeIndexer(Indexer i);
}
With that, the class the implements the interface does what it needs to do to
return the appropriate Indexer. readIndexer()/writeIndexer() would do the
copying around, but could be implemented as a noop when it happens that the
Indexable and the Indexer both point to the same area of memory.
Of course we're going to need to do some benchmarking to make sure we don't
sacrifice any performance with this class hierarchy...
Concretely, I'm thinking of having the AbstractMat here implement this:
http://code.google.com/p/javacpp/source/browse/opencv/src/main/java/com/googleco
de/javacpp/helper/opencv_core.java?repo=presets#2188
And maybe eventually apply that to AbstractCvMat as well, but we could simply
leave the old interface as deprecated since OpenCV itself has deprecated CvMat.
In any case, please let me know what you think! If it's something you'd like to
work on, please let me know. (If not, I will be implementing something like
that myself eventually.) Thanks!
Original comment by [email protected]
on 16 Mar 2014 at 6:36
from javacv.
That said, since the get()/put() methods of an Indexer doesn't change its
state, unlike the next()/remove() methods of an Iterator, we might want to have
our Indexable reuse the Indexer object. So, something like this may make more
sense:
public interface Indexable {
<I extends Indexer> I acquireIndexer(boolean direct);
void releaseIndexer(Indexer i);
}
Where acquireIndexer() would read out data, and releaseIndexer() would write it
back out (both being noops for direct indexers). I feel that this is about as
"automatic" as it could get. What do you think? Do you have a better idea?
Original comment by [email protected]
on 16 Mar 2014 at 2:05
from javacv.
Hello Sam,
actually I'm confused of this micro-optimisation. I don't see this API that
much important as you do. When I started with OpenCV/JavaCV, I did not know, I
used CvMat.get(i,j) method, which is veery slow. Any of methods suggested here
is good for me. For performance sensitive stuff I would go with C anyway. We
still did not rewrite our alg. to C, but I expect double performance and stable
speed from the first run (currently, our algorithm takes 3-4 seconds to
complete, but the first run takes 10-12 sec. due to JIT taking place).
Regarding acquireIndexer/releaseIndexer, in case of copying indexer we would
double the memory usage, even after the work is done.
And how would the Indexable interface know, which of byte/short/... Indexer to
return? And the readIndexer should populate any type of indexer?
I will write any version you say is good for you :)
Viliam
Original comment by [email protected]
on 17 Mar 2014 at 6:08
from javacv.
Maybe we could simplify the interface even more:
public interface Indexable {
<I extends Indexer> I indexer(boolean direct);
}
And have an abstract release() method in Indexer. However, that way, the
Indexable must be able to subclass the Indexer, and call the constructors for
all the concrete classes, so having wrap() factory methods becomes moot... What
do you think?
The Indexable would return the correct subtype of Indexer based on its own
internal information. It doesn't occur to me that a user might want to access
floats as bytes or something. Do you see it differently?
Or course a non-direct indexer would double memory consumption, but that's just
a workaround for crappy JIT compilers.
It's not about micro-optimization, it's about usability. With this kind of
interface, we could easily, for example, scale all the elements of the first
column this way:
Mat Amat = ...
FloatIndexer A = Amat.indexer();
for (int i = 0; i < A.size[0]; i++) {
A.put(i, 0, A.get(i, 0) * scale);
}
A.release();
We don't need to know anything about the number of columns, or the step/stride,
and we don't need to use some awkward "getColumn()" utility method of some
sort: We use nothing but simple Java arithmetic. People like it when they can
do things easily. That's why a lot of people use MATLAB, for example. But
MATLAB is slow when it comes to loops and stuff, so it's hard to generalize.
Java, on the other hand, with a proper compiler, can be as fast as C++, but
unlike C++, we can make it crash proof. I'm still confused myself as to why no
one "gets it", but I guess I'll just have to keep working on it until these
obvious (to me at least) concepts become more widely accepted...
Original comment by [email protected]
on 17 Mar 2014 at 12:27
from javacv.
I think this interface looks good too. Is this a version I should implement?
What I meant about the double memory, that if the Indexable reused the Indexer
instance (comment 69), it will hold double memory even after the processing is
done.
Original comment by [email protected]
on 18 Mar 2014 at 5:01
from javacv.
Sounds good! I'll be refactoring that helper file a bit with an AbstractArray
under AbstractMat and CvArr, but you can start implementing whenever you feel
like it, thanks!
Releasing the memory would hurt performance, but a user that doesn't care about
performance on a poor compiler should be using the direct one anyway, so I
think it's better not to release it. What do you think?
Original comment by [email protected]
on 19 Mar 2014 at 1:47
from javacv.
Ok, I've added such an AbstractArray class at the top of
helper/opencv_core.java in this revision:
https://code.google.com/p/javacpp/source/detail?r=d5bd9f3afe353618a30f0fa55a38d3
e10ce5643d&repo=presets
The plan would to make that class implement Indexable. Comments?
Original comment by [email protected]
on 24 Mar 2014 at 1:48
from javacv.
BTW, I've moved the project site to GitHub:
https://github.com/bytedeco
So let's work over there on this from now on, thanks!
Original comment by [email protected]
on 29 Apr 2014 at 12:51
from javacv.
I started to work on a project where I needed this functionality, so I've
implemented it here:
https://github.com/bytedeco/javacpp/commit/4d8dc9d8b244b79ca3c9d9355e476f028f3f36f9
https://github.com/bytedeco/javacpp-presets/commit/ab8bde373733190fe204a3bb8d1837fbb7c62b4f
Thanks for all the discussion and help with this! It was really appreciated :)
Original comment by [email protected]
on 19 Oct 2014 at 12:29
- Added labels: Type-Enhancement
- Removed labels: Type-Defect
Great!
Sorry for committing myself and then not doing what I promised, but I really
lost track of which version was best and did not have time. Your version looks
good; maybe I'll review it more thoroughly.
But you forgot to fix the original problem of this bug. See comments up to #2.
Maybe we can keep returning the cached xxxBuffer instance (as of comment #3),
but from a deprecated method, and add a createXxxBuffer() method that returns
a fresh instance.
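The cached-value problem and the proposed createXxxBuffer() fix can be sketched in plain Java. All class and method names below are illustrative stand-ins, not the actual JavaCPP API: a cached view goes stale when the underlying allocation changes (e.g. when a native function reallocates the mat's data), while a freshly created view always reflects the current state.

```java
import java.nio.ByteBuffer;

// Illustrative sketch, not the actual JavaCPP API.
class MatLike {
    private ByteBuffer data = ByteBuffer.allocateDirect(16);
    private ByteBuffer cached; // the problematic cached instance

    @Deprecated
    ByteBuffer getByteBuffer() { // old, cached accessor
        if (cached == null) cached = data.duplicate();
        return cached; // may still point at the old allocation
    }

    ByteBuffer createByteBuffer() { // proposed fresh accessor
        return data.duplicate(); // always built from the current data
    }

    void reallocate(int capacity) { // stands in for a native-side resize
        data = ByteBuffer.allocateDirect(capacity);
    }
}
```

After a reallocation, the deprecated accessor still reports the old capacity, while the create-style accessor sees the new one.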
Original comment by [email protected]
on 20 Oct 2014 at 10:01
Thanks for your review!
They've been marked as deprecated in the parent `AbstractMat` class, at least
for the benefit of the C++ `Mat` class, and a new `createBuffer()` method has
been added. But I see that the annotation doesn't get inherited by `CvMat`.
It's not a problem anymore though, because all the C++ functions that used to
modify `CvMat` are now using `Mat` instead, so the original issue with `CvMat`
doesn't happen anymore. Does that look OK to you?
Original comment by [email protected]
on 25 Oct 2014 at 1:21
Hello,
I reviewed the changes; as far as I can tell, the Indexer implementation looks
correct.
I suggest deprecating all the get() and put() methods and the reset() method
in AbstractCvMat, as they are very slow, in favor of the Indexer API.
Secondly, I would always add comments to deprecated methods. If I have a
codebase, upgrade some library, and it deprecates something, I'd like to know
why and what I should replace it with.
I would mark all overridden getXxxBuffer methods as deprecated (with
comments), just for clarity.
And I would add typed createXxxBuffer methods, as the createBuffer method
returns Buffer, which is unusable without casting. It is a question whether it
should fail when changing the type here, as it is rarely useful to see, say, a
mat with CV_16S as ints, I think...
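The casting inconvenience being described can be sketched in plain Java; the class and method names here are illustrative, not the actual JavaCPP API. A generic factory forces the caller to supply the type, while a typed factory does not:

```java
import java.nio.Buffer;
import java.nio.ByteBuffer;
import java.nio.IntBuffer;

// Illustrative sketch, not the actual JavaCPP API.
class BufferSource {
    private final ByteBuffer data = ByteBuffer.allocateDirect(64);

    @SuppressWarnings("unchecked")
    <B extends Buffer> B createBuffer() {
        return (B) data.duplicate(); // caller must know the actual type
    }

    IntBuffer createIntBuffer() {
        return data.duplicate().asIntBuffer(); // no cast needed at call sites
    }
}
```

With the generic version, `Buffer b = source.createBuffer();` compiles but gives the caller nothing to index with; the typed version returns a directly usable view.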
Original comment by [email protected]
on 3 Nov 2014 at 7:05
And I would add a comment to the createIndexer method, at least to its
"direct" parameter. Something like:
@param direct If false, the native buffer is first copied to a Java array and
copied back during the call to #release(). This is an optimization for
platforms where direct Buffer access is slow (specifically Android, but do
your own tests). It changes the behavior of the indexer so that, until the
release() call, changes are not visible to the underlying object. It also
doubles memory usage.
Original comment by [email protected]
on 3 Nov 2014 at 7:48
I found another issue. Indexer has the release() method. The name implies that
the associated resources are released upon this call. It only has associated
resources in the case of a non-direct indexer, but it does not release the
copied array.
I can think of a few solutions:
1. Rename release() to finish() or commit() or copyBack(). The createIndexer
method documentation should say that you should call that method after the
work is done in the case of a non-direct indexer.
2. release() should release the internal reference to the array (or buffer),
and further use of the indexer will fail with a NullPointerException (or,
better, an IllegalStateException).
3. The copy/copyBack logic could be encapsulated in the XxxArrayIndexer: it
could have a Pointer constructor.
I vote for 2: that is the easiest to document and use. And if the user always
calls release() (whether or not it is needed), they can safely change from
direct to non-direct.
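The non-direct copy/copy-back semantics, combined with option 2 above, can be sketched in plain Java. The class and method names are illustrative stand-ins, not the actual JavaCPP API: writes go to a Java-side array and reach the backing buffer only on release(), which also drops the internal reference so further use fails.

```java
import java.nio.ByteBuffer;

// Illustrative sketch, not the actual JavaCPP API.
class ArrayBackedIndexer {
    private final ByteBuffer backing; // stands in for the native buffer
    private byte[] array;             // the Java-side copy

    ArrayBackedIndexer(ByteBuffer backing) {
        this.backing = backing;
        this.array = new byte[backing.capacity()];
        backing.duplicate().get(array); // copy native -> array
    }

    byte get(int i) { return array[i]; }
    void put(int i, byte b) { array[i] = b; } // not yet visible to backing

    void release() {
        backing.duplicate().put(array); // copy array -> native
        array = null; // option 2: further get()/put() throw NullPointerException
    }
}
```

Until release() is called, the backing buffer is unchanged; after release(), the changes are copied back and the indexer refuses further use.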
Another point:
The createIndexer method could be a member of the Pointer class (or its
subclasses). That way javacv needs no changes at all. And it would better
reflect the OpenCV data structures, with the user knowing that Indexer is a
way to access ND arrays referenced by pointers, and not a special feature of
Mat. They could write:
mat.data_i().position(0).limit(mat.size()).createIndexer();
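The fluent style being proposed can be sketched in plain Java; the class name and the use of IntUnaryOperator as a stand-in indexer are illustrative, not the actual JavaCPP API. A pointer-like object carries position and limit, and the indexer it creates only sees that window of the data:

```java
// Illustrative sketch, not the actual JavaCPP API.
class IntPointerLike {
    private final int[] data;
    private int position, limit;

    IntPointerLike(int[] data) { this.data = data; this.limit = data.length; }

    IntPointerLike position(int p) { this.position = p; return this; }
    IntPointerLike limit(int l)    { this.limit = l;    return this; }

    // The returned indexer only sees the [position, limit) window.
    java.util.function.IntUnaryOperator createIndexer() {
        final int base = position, length = limit - position;
        return i -> {
            if (i < 0 || i >= length) throw new IndexOutOfBoundsException();
            return data[base + i];
        };
    }
}
```

The call-site chain then mirrors the `mat.data_i().position(0).limit(mat.size()).createIndexer()` suggestion.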
Original comment by [email protected]
on 3 Nov 2014 at 8:42
Thanks for the review! I'll add a couple of `@see createBuffer()` tags for the
deprecated methods. `createBuffer()` doesn't return `Buffer`, it returns `B`,
a subclass of `Buffer`, so I think that's OK?
As for CvMat, there are some methods that don't map to anything right now,
because I'm not sure how those should be done, so deprecating either all of
them or just some of them doesn't make sense to me. Does it make sense to you
in some way to deprecate functionality that has no replacement? Anyway, the
whole class is deprecated by OpenCV, so new code should not be using it. :)
And the problem with the `direct` parameter and the `release()` method is that
they depend on the implementation. It's up to the implementing class to do
whatever works with its underlying data. After all, if the data is available
originally in an array, `direct` could mean a few different things, and
`release()` might not be implemented, so all these implementation details
can't be part of the interface...
But, yes, I think it would make sense to make `Pointer` and its subclasses
implement all that, and then it would make sense to specify the implementation
details there. (Although they should be clear just by looking at the code :)
Let me think about that.
Original comment by [email protected]
on 3 Nov 2014 at 9:00
`Pointer` doesn't contain any information about the dimensions. Where would
that come from?
Original comment by [email protected]
on 8 Nov 2014 at 4:29
In the end I've added `Indexer.create(Pointer ...)` factory methods:
https://github.com/bytedeco/javacpp/commit/cd632cfb0421639355a9622ce1285469dd5ffadd
It looks good like that, I think. Thanks for pointing this out! This way we
have a consistent hierarchical-like relationship between Pointer < Indexer <
Mat.
Original comment by [email protected]
on 9 Nov 2014 at 1:05
This functionality has finally been included in JavaCPP 0.10!
http://bytedeco.org/news/2014/12/23/third-release/
Thanks for all the help with this, and if you have any improvements, send a
pull request!
Original comment by [email protected]
on 25 Dec 2014 at 11:34
- Changed state: Fixed