Comments (20)
@jat255 So by channels you mean a particular energy bin on the CCD? So across the entire image, with (e.g.) 1024 energy/frequencies channels/bins, it is identically zero across the whole image?
Just checking because I'll test this real quick to see what's up. Making it a very small random value would work, but I'm a bit surprised at the problem now that I understand what's going on.
from pymcr.
We should clarify that in our implementation of MCR, we first perform an SVD decomposition using numpy.linalg.svd. The resulting factors are then rotated by varimax before being given as the input for PyMCR.
from pymcr.
In general, all 0's is a no-no (in all sorts of optimization methods) because there is no instability to cause the algorithm to move/evolve. Better is to seed with small random values.
I could foresee adding a check-for-constants/0's and give a warning to the log
from pymcr.
Do you see any negative consequences of adding an infinitesimally-small value to channels that are uniformly-zero before proceeding with the MCR algorithm? I suppose it could preferentially bias the solution positive or negative?
from pymcr.
@jat255 why not just drop those channels? They are virtually information-less for the algorithm, so no need to include them I think.
from pymcr.
@CCampJr we could write an example on this subject!
from pymcr.
@AndrewHerzing @jat255 I may have misinterpreted "spectral channel". Do you mean portions of spectra or do you mean, e.g., you seeded in 3 initial guess spectra and 2 of them were just 0's?
@charlesll I agree!
from pymcr.
@CCampJr I mean portions of the spectra. i.e. if you have an x y spectrum image map of spectra, if one of the channels is zero-valued at every x and y position, then you'll get the error described.
@charlesll I don't think we can "drop" channels (at least in the HyperSpy sense). If for instance, you have a dead pixel on your detector or something, removing the channel entirely would break the axis calibration. Admittedly, having all zero values is probably a pretty rare occurrence, but it could feasibly happen in very sparse maps, or if your detector has some incorrect offset or something (which was the case that caused me to notice the problem).
I don't know MCR intimately, but my hunch is that the best solution will just to make the zero channels actually be infinitesimally small (but definite), so that the algorithm can proceed.
from pymcr.
@jat255 So I did not have this problem. See the images below. I set the 50th and 100th spectral bin to be identically 0 at every image pixel. Note the "max spectrum..." image is the maximum spectral intensity across the entire HSI image (I'm using the Demo dataset in the Jupyter NB)
from pymcr.
I just saw you responded, but I'll send this comment through anyway...
Yes, zero across the whole image. Here's an example of what I mean (plotted within HyperSpy). This is two signals made of three gaussians with added noise. In one of them, I set the 25th energy channel to zero. This is a "pixel-first" view of the data (i.e. a map of spectra):
And this is "energy-first" (i.e. a series of images):
When I run our "decomposition" routine using pyMCR, I get a valid result for the "no-zeros" case, but not for the one with one channel forced to zero:
from pymcr.
I don't believe HyperSpy is doing anything unusual, but I will try to reproduce using only pyMCR
and not HyperSpy to make sure I see the same behavior... Give me a few to work it out.
from pymcr.
Here's a more intense example (I randomly killed half of the channels). Just for info I used:
# Just a copy of my hyperspectral image data
hsi2 = 1*hsi
# Kill half the spectral pixels across the image
locs_to_kill = np.random.randint(0,200,100)
hsi2[...,locs_to_kill] = 0
initial_spectra2 = 1*initial_spectra
initial_spectra2[...,locs_to_kill] = 0
mcrar = McrAR(max_iter=100, st_regr='OLS', c_regr='OLS',
c_constraints=[ConstraintNonneg(), ConstraintNorm()],
st_constraints=[ConstraintNonneg()])
mcrar.fit(hsi2.reshape((-1, wn.size)), ST=initial_spectra2, verbose=True)
print('\nFinal MSE: {:.7e}'.format(mcrar.err[-1]))
from pymcr.
I don't believe HyperSpy is doing anything unusual, but I will try to reproduce using only
pyMCR
and not HyperSpy to make sure I see the same behavior... Give me a few to work it out.
Maybe it's a dumb question, but are your spectra/images integer type? Weird things happen if you put in an image with integer type and then work in float space.
from pymcr.
Unfortunately, no they're floats. It's required by the SVD that we run prior to feeding those in as initial conditions to pyMCR
from pymcr.
- How many SVD components are you keeping?
- Make sure the input spectra (into MCR) have positive values?
- If you all want, you can send me data (numpy binaries would be fine)
- HSI
- SVD-varimax'd initial spectra
from pymcr.
The number of SVD components depends on the data but in the cases we have tested it is around 3 or 4. After Varimax rotation, the code attempts to find the sign of the majority of the signal and then sets that to positive. In other words, if the SVD output is mostly negative after Varimax, the sign would be flipped. I'll let Josh send you the data as it is his that we have been working on the most.
from pymcr.
Ok, here's some dummy data that has been exhibiting the issue. There are four data files in that zip. HSI_data_unfolded.npy
is a 100x100x40 (x,y,energy) data array unfolded into 2D, so it's actually 10000x40. The HSI_data_SVD.npy
is the result of the SVD PCA (as run with s.decomposition(True)
in HyperSpy), and then the HSI_SVD_varimax.npy
is the same data, but after the varimax rotation. McrAL.pkl
is the McrAR
object after fitting, which can be opened by:
import pickle
with open('McrAL.pkl', 'rb') as fl:
mcr_result = pickle.load(fl)
This was the HyperSpy/numpy code used to generate that data and test the MCR implementation:
import hyperspy.api as hs
import numpy as np
import matplotlib.pylab as plt
import scipy.stats as stats
mu1 = -10
variance1 = 900
sigma1 = np.sqrt(variance1)
x1 = np.linspace(mu1 - 3*sigma1, mu1 + 3*sigma1, 200)
x1 = np.linspace(-100,100)
y1 = 10000*stats.norm.pdf(x1, mu1, sigma1)
mu2 = 5
variance2 = 700
sigma2 = np.sqrt(variance2)
x2 = np.linspace(mu2 - 3*sigma2, mu2 + 3*sigma2, 200)
x2 = np.linspace(-100,100)
y2 = 10000*stats.norm.pdf(x2, mu2, sigma2)
mu3 = 25
variance3 = 1000
sigma3 = np.sqrt(variance3)
x3 = np.linspace(-100,100)
y3 = 10000*stats.norm.pdf(x3, mu3, sigma3)
s = hs.signals.Signal1D(np.zeros([100,100,50], np.int32))
s.metadata.General.title = 'example spectrum image'
s.inav[0:50,:] = y1
s.inav[50:75,:] = y2
s.inav[75:,:] = y3
s.add_poissonian_noise()
s = s.isig[:40]
s2 = s.deepcopy()
s2.metadata.General.title = 'example spectrum image w/ zeros'
s2.isig[29] = 0
s2.change_dtype('float32')
s2.decomposition(True)
s2.decomposition(algorithm='MCR', output_dimension=3)
The code that runs pyMCR
in HyperSpy is on an experimental branch here. It looks like the problem arises because the C_opt_
attribute of the McrAR
object has NaN values where the spectral channel is zero. This propagates into our factors_out
array here, and then since we do a normalization here, we get NaNs throughout the whole thing.
from pymcr.
@CCampJr : @jat255 and I have discussed this and we think the simplest way to proceed is for us to check for any NaN's in the C_opt_ output of the MCR and set them to zero prior to the final normalization. We will return a warning to alert the user that this has been done. Do you think this sounds okay? A more fundamental question is whether the presence of the zero's in the input data will affect the output of the MCR and whether we should be warning users about this prior to performing the fit in the first place. Does anything in your experience suggest that this is something we should be worried about?
from pymcr.
@AndrewHerzing @jat255 Sorry for my delay: was off in the woods hiking Fri-Sun.
I'll be curious to see why things are failing as it's not obvious to me.
from pymcr.
I'm just getting back to this now. Looks like we never quite figured out what was going wrong here... For the sake of getting this included in HyperSpy (see hyperspy/hyperspy#2172) I think we'll go with the approach suggested by @AndrewHerzing:
to check for any NaN's in the C_opt_ output of the MCR and set them to zero prior to the final normalization. We will return a warning to alert the user that this has been done.
We can revisit this if necessary to figure it out "for real"
from pymcr.
Related Issues (17)
- On README.rst and Demo.ipynb HOT 1
- Change all address to point to usnistgov HOT 1
- Test Warning for semilearned
- Make documentation
- New Features for Version 0.4.X
- Add "quiet" mode of operation
- Add __version__ feature in a more improved way
- Make more examples
- Add Anaconda cloud badge to documentation
- Is matrix augmentation supported HOT 12
- MNT: Stop using ci-helpers in appveyor.yml
- Error related to missing libifport.so.5 required in scipy.linalg HOT 1
- Unimodality constraint HOT 5
- Error with the pyMCR demo HOT 2
- Use of constraints HOT 3
- Give me little more guide before making PR HOT 5
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from pymcr.