masonearles / 3dleafct
Random forest segmentation of 3D leaf microCT images
I just got this error:
Working on scan: 1 of 1
Traceback (most recent call last):
File "MLmicroCT.py", line 1328, in <module>
main()
File "MLmicroCT.py", line 1256, in main
filepath,grid_name,phase_name,label_name,Th_grid,Th_phase,gridphase_train_slices_subset,gridphase_test_slices_subset,label_train_slices_subset,label_test_slices_subset,image_process_bool,train_model_bool,full_stack_bool,post_process_bool,epid_value,bg_value,spongy_value,palisade_value,ias_value,vein_value,folder_name = openAndReadFile("../settings/"+filenames[j])
File "MLmicroCT.py", line 835, in openAndReadFile
ias_value = int(myFile.readline().rstrip('\n'))
ValueError: invalid literal for int() with base 10: 'test1'
I think it's because the input_key.txt file hasn't been updated since you started using multiple tissues. I added what was missing (the spongy and vein entries) and it worked.
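A small guard in the settings reader would make this failure easier to diagnose. This is only a hypothetical sketch (the function name and error text are mine, not the repo's):

```python
def read_int_line(f, name):
    """Read one settings line and convert it to int, with a clearer error."""
    raw = f.readline().rstrip('\n')
    try:
        return int(raw)
    except ValueError:
        raise ValueError("expected an integer for '%s' but read %r; "
                         "is input_key.txt missing a tissue entry?" % (name, raw))
```

With a stale input_key.txt, this would point directly at the field that got the wrong line instead of a bare ValueError.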
I was just looking at the new automated leaf trait measurements. I do not agree with the current definition of Sm: I think it should be only the surface of the airspace. Adding the surfaces of the two mesophyll layers and the veins duplicates some surfaces, and the point of thresholding the veins is to remove them from the surface estimation. However, measuring the airspace is prone to some errors.
Two ways to define the surface area of the mesophyll cells Sm would be (I have no idea how to program these in Python):
Also, I would like to see two Sm computed:
It would also be easy to compute Ames/Vmes. I define Vmes as the volume of the mesophyll (mesophyll + airspace), but for consistency with potential extrapolation from the literature, there should also be Vmes = (mesophyll + vein + airspace).
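As a rough illustration of the Vmes idea, the volume could be computed by counting labeled voxels. This is a hypothetical sketch, not the repo's code; the label-value parameters and function name are assumptions:

```python
import numpy as np

def mesophyll_volume(stack, voxel_size, mesophyll_value, ias_value, vein_value=None):
    """Volume (in voxel_size^3 units) of mesophyll + airspace, optionally + veins."""
    labels = [mesophyll_value, ias_value]
    if vein_value is not None:          # literature-style Vmes also includes veins
        labels.append(vein_value)
    n_vox = np.isin(stack, labels).sum()
    return n_vox * voxel_size ** 3
```

Calling it once without and once with vein_value would give the two Vmes definitions side by side.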
Keep up the great work guys!
I got this error in the "Read from file" mode. I looked up the lines but have no idea what's happening.
***LOAD AND ENCODE LABEL IMAGE VECTORS***
Traceback (most recent call last):
File "MLmicroCT.py", line 1328, in <module>
main()
File "MLmicroCT.py", line 1280, in main
rf_transverse,FL_train,FL_test,Label_train,Label_test = train_model(gridrec_stack,phaserec_stack,label_stack,localthick_stack,gridphase_train_slices_subset,gridphase_test_slices_subset,label_train_slices_subset,label_test_slices_subset)
File "MLmicroCT.py", line 709, in train_model
Label_test = LoadLabelData(ls, label_test, "transverse")
File "MLmicroCT.py", line 665, in LoadLabelData
labelimg_in_rot_sub = labelimg_in_rot[sub_slices,:,:]
IndexError: index 12 is out of bounds for axis 0 with size 12
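A bounds check before the fancy indexing would turn this into a readable message. A minimal sketch, assuming the slice subset comes in as a list of indices (the helper name is mine):

```python
import numpy as np

def take_slices(stack, sub_slices):
    """Index stack[sub_slices, :, :] after checking the indices fit the stack depth."""
    bad = [i for i in sub_slices if i >= stack.shape[0]]
    if bad:
        raise IndexError("slice indices %s exceed stack depth %d"
                         % (bad, stack.shape[0]))
    return stack[sub_slices, :, :]
```

Here an index of 12 against a 12-slice stack would name the offending indices instead of failing deep inside LoadLabelData.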
I'm finally using the command line version from the new repo on a new leaf. I'm also using a new computer, and many packages are missing. It would be nice to have a script that checks whether all the dependencies are installed. I think I had to install 4-5 packages.
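A dependency check could be as small as this sketch; the package list here is an assumption, not the repo's actual requirements:

```python
import importlib.util

def missing_packages(required):
    """Return the subset of `required` whose import spec cannot be found."""
    return [pkg for pkg in required if importlib.util.find_spec(pkg) is None]

# Hypothetical dependency list -- adjust to whatever MLmicroCT.py actually imports.
REQUIRED = ["numpy", "skimage", "sklearn", "tqdm"]
```

Printing `missing_packages(REQUIRED)` at startup would tell a new user exactly what to install before the 4-hour run begins.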
I've run into this issue in the previous version but forgot about it. It is still in the latest notebook.
The original code produced this error:
prediction_transverse_prob_imgs = class_prediction_transverse_prob.reshape((
-1,
label_stack.shape[1],
label_stack.shape[2],
4),
order="F")
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-38-e940f9abb33f> in <module>()
5 label_stack.shape[2],
6 4),
----> 7 order="F")
8 prediction_transverse_imgs = class_prediction_transverse.reshape((
9 -1,
ValueError: cannot reshape array of size 8254710 into shape (387,1422,4)
I've changed the 4 to class_prediction_transverse_prob.shape[1], which in my case is 5, and it worked, but I'm not sure this is the right way to work around the error. I don't know if there are other places where values like this should be changed to a computed value in an object (I haven't run into any other errors like this).
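The workaround can be written as a small helper that derives the class count from the probability array itself, so the reshape works for any number of labels. A sketch under that assumption (the function name is mine):

```python
import numpy as np

def to_prob_images(probs, slice_shape):
    """Reshape an (n_pixels, n_classes) probability array into per-slice images."""
    n_classes = probs.shape[1]          # e.g. 5 labels -> 5 probability columns
    return probs.reshape((-1, slice_shape[0], slice_shape[1], n_classes),
                         order="F")
```

Because n_classes is read from the array, the size-8254710 case with 5 labels reshapes cleanly instead of failing on a hard-coded 4.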
I like the way the updated code dumps files into directories. I added some lines to my version to check if the directory is present and to create it if it is not. In my case, I'll dump everything into an "ML" folder in my species-specific folder.
import os
if not os.path.exists(filepath + "ML"):
    os.mkdir(filepath + "ML")
One could also define a variable for this output directory and just use that name everywhere instead.
Just found a typo in the leaf traits notebook (LeafTraits.ipynb). When converting the µm^2 values, it is written that they are converted to m^2, but that's actually mm^2. For m^2, the factor is 10^12.
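A quick arithmetic check of the two conversions (the example area is made up):

```python
# 1 µm = 1e-3 mm = 1e-6 m, so areas scale by the square of the length factor.
area_um2 = 2.5e6            # example area in µm^2
area_mm2 = area_um2 / 1e6   # µm^2 -> mm^2: divide by 10^6
area_m2  = area_um2 / 1e12  # µm^2 -> m^2:  divide by 10^12
```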
Currently, for several io.imread or io.imsave calls, there is a full path written. This should be changed to filepath + 'name of file' for consistency and ease (as in the third block of the notebook).
This is more of an open question and maybe a future to-do, but my first try with the script was on an image that I feel is hard to label. I ended up training the algorithm on 30 contiguous slices and testing it on 10 contiguous slices (this took almost 4 hours), using cells, air, veins, tape, and the outside of the leaf as labels. The training ended up being quite good, as below:
However, after running the trained algorithm on the whole stack, I still ended up with not-so-good labelling, like these:
Tape leaking into the cells and veins
Cells leaking into the tape and outside of leaf as air
Just looking at the veins, this seems to be an appropriate result given the minimal effort I had to put in to get it. So my question is: is it possible to train the algorithm in 3D (i.e. looking at all surrounding voxels) instead of training it on specific images? This would probably increase the computing time, but it would benefit the labelling of harder-to-measure stacks. If you look at the mid-vein image above, there are a few interruptions in the vein in the middle. I don't know how to do this or how difficult it would be to implement.
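As a minimal sketch of the 3D idea (pure NumPy, and entirely my own naming, not anything in the repo): each voxel could get, for example, the mean of its 3x3x3 neighbourhood as an extra feature column for the random forest, so the classifier sees context from adjacent slices too.

```python
import numpy as np

def neighbourhood_mean(stack):
    """Mean over each voxel's 3x3x3 neighbourhood (edges padded by replication)."""
    s = stack.astype(float)
    padded = np.pad(s, 1, mode="edge")
    acc = np.zeros_like(s)
    for dz in (-1, 0, 1):               # accumulate the 27 shifted copies
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                acc += padded[1 + dz:1 + dz + s.shape[0],
                              1 + dy:1 + dy + s.shape[1],
                              1 + dx:1 + dx + s.shape[2]]
    return acc / 27.0
```

Real 3D feature stacks would want gradients and multiple scales as well, and memory would become the limiting factor on a 2411-slice stack, but this shows the shape of the change.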
I don't know how it is in your final program setup, but I have added this statement after opening the raw images:
label_stack_nb = [position of the labelled stacks]
This is then called either through label_stack_nb[3:7] for a range of values, or through itemgetter(0,2,4)(label_stack_nb) for specific values (you need to import this function: from operator import itemgetter).
I used it in my version, as I find it gives better consistency between the two sets of stacks (label vs. full).
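For reference, a self-contained example of the two selection styles (the positions list here is made up):

```python
from operator import itemgetter

# Hypothetical slice positions of the labelled stacks within the full stack.
label_stack_nb = [0, 40, 80, 120, 160, 200, 240, 280]

range_sel = label_stack_nb[3:7]                     # a contiguous range of positions
specific_sel = itemgetter(0, 2, 4)(label_stack_nb)  # hand-picked positions (a tuple)
```

Note that itemgetter returns a tuple, not a list, when given several indices.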
I got this error when saving the stacks (I removed some #'s from the progress bars to shorten the lines). I don't know why the fourth progress bar is only 623 (623 is the width of the stack, and 2411 is the number of slices). I also ran it in the manual mode and still had the same error.
***SAVING PREDICTED STACK***
Post-processing...
100%|###| 2411/2411 [00:03<00:00, 693.44it/s]
100%|###| 2411/2411 [00:01<00:00, 1978.07it/s]
100%|###| 2411/2411 [00:01<00:00, 2216.60it/s]
100%|###| 623/623 [00:05<00:00, 118.02it/s]
100%|###| 2411/2411 [00:03<00:00, 704.88it/s]
Traceback (most recent call last):
File "MLmicroCT.py", line 1328, in <module>
main()
File "MLmicroCT.py", line 1315, in main
processed = final_smooth(step2,vein_value,spongy_value,palisade_value,epid_value,ias_value,bg_value)
File "MLmicroCT.py", line 228, in final_smooth
d = (tileB*c)
ValueError: operands could not be broadcast together with shapes (2411,163,623) (2411,164,623)
When trying to troubleshoot it manually from the command-line code, I got stuck, so I went to run the post-processing Jupyter notebook. I think I found the error. I have a stack with an odd value for its height (327), and integer division, when dividing that value by 2, rounds down. So, in the original code, you have:
# Define 3D array of distances from lower value of img.shape[1] to median value
rangeA = range(0,img3.shape[1]/2)
tileA = np.tile(rangeA,(img3.shape[2],img3.shape[0],1))
tileA = np.moveaxis(tileA,[0,1,2],[2,0,1])
tileB = np.flip(tileA,1)
Actually, tileB shouldn't have the same range as tileA if the height is uneven, and I think this is accounted for elsewhere in the code. I worked around it like this:
rangeB = range(img3.shape[1]/2, img3.shape[1])
tileB = np.tile(rangeB,(img3.shape[2],img3.shape[0],1))
tileB = np.moveaxis(tileB,[0,1,2],[2,0,1])
tileB = np.flip(tileB, 1)
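The same fix can be written as a small helper using floor division, so it also runs under Python 3, where / on two ints returns a float and range() would fail (the helper name is mine):

```python
def distance_ranges(height):
    """Split [0, height) into a lower and an upper range around the median row.

    For an odd height like 327, the upper range is one element longer
    (163 vs 164), which is exactly what makes tileA and tileB broadcast.
    """
    half = height // 2                  # floor division: works in Python 2 and 3
    return range(0, half), range(half, height)
```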
I think the post-processing now works, but it didn't perform so well on my leaf. There was a lot of background intrusion into the airspace, mainly because the epidermis was thin in some places and I guess it was considered dangling epidermis there. It did a good job on the veins.
Nice work on the post-processing! Maybe there could be an option for the user to choose what to correct, e.g. only correct for veins. Maybe my stack was just crappy! :D
Auto-detecting the resolution beforehand would be really useful, as most often we define it in ImageJ. I have no idea how to do this, but here are potential ways to do it. Hopefully they work well with files saved from ImageJ.
https://stackoverflow.com/questions/21697645/how-to-extract-metadata-from-a-image-using-python
https://stackoverflow.com/questions/765396/exif-manipulation-library-for-python
That would be nice!
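One possible route, sketched with Pillow, assuming the stacks are TIFFs and that the standard TIFF resolution tags were written on save (the helper name is mine; whether ImageJ populates these tags for a given file would need checking):

```python
from PIL import Image

def tiff_resolution(f):
    """Return (XResolution, ResolutionUnit) from a TIFF's metadata.

    TIFF stores pixels-per-unit in tag 282 (XResolution) and the unit in
    tag 296 (1 = none, 2 = inch, 3 = cm); either may be absent.
    """
    with Image.open(f) as im:
        return im.tag_v2.get(282), im.tag_v2.get(296)
```

From pixels-per-unit and the unit code, the µm-per-pixel value the traits notebook needs could then be derived instead of typed in by hand.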
Currently, the predicted image, RFPredictCTStack_out, is saved to 16-bit using img_as_int. This is too much since there are so few labels, so img_as_ubyte (saving to 8-bit) should be used instead.
Further, since the number of labels can change from one user to another, and since the img_as_ functions require values between 0 and 1 (i.e. they must be divided by the total number of labels), there should be a call to get the actual number of labels. I suggest the following, but the np.unique could actually be called within the img_as_ubyte call:
uniq_labs = np.unique(RFPredictCTStack_out[1])
io.imsave(filepath + 'FILENAME.tif', img_as_ubyte(RFPredictCTStack_out / len(uniq_labs)))
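An alternative sketch that scales by the maximum label value rather than the label count, so the labels span the full [0, 1] range exactly before the 8-bit conversion (the helper name is mine):

```python
import numpy as np

def labels_to_unit_range(stack):
    """Scale an integer label stack into [0, 1] by its maximum label value."""
    max_lab = stack.max()
    # Guard against an all-zero stack; use float division to avoid
    # Python 2 integer truncation.
    return stack.astype(float) / max_lab if max_lab else stack.astype(float)
```

Dividing by len(uniq_labs) also stays within [0, 1], but only reaches 1.0 when the labels happen to run from 0 to n-1 without gaps; scaling by the maximum avoids that assumption.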