Git Product home page Git Product logo

Comments (20)

CCampJr avatar CCampJr commented on June 8, 2024 1

@jat255 So by channels you mean a particular energy bin on the CCD? So across the entire image, with (e.g.) 1024 energy/frequencies channels/bins, it is identically zero across the whole image?

Just checking because I'll test this real quick to see what's up. Making it a very small random value would work, but I'm a bit surprised at the problem now that I understand what's going on.

from pymcr.

AndrewHerzing avatar AndrewHerzing commented on June 8, 2024 1

We should clarify that in our implementation of MCR, we first perform an SVD decomposition using numpy.linalg.svd. The resulting factors are then rotated by varimax before being given as the input for PyMCR.

from pymcr.

CCampJr avatar CCampJr commented on June 8, 2024

@AndrewHerzing @jat255

In general, all 0's is a no-no (in all sorts of optimization methods) because there is no instability to cause the algorithm to move/evolve. Better is to seed with small random values.

I could foresee adding a check-for-constants/0's and give a warning to the log

from pymcr.

jat255 avatar jat255 commented on June 8, 2024

Do you see any negative consequences of adding an infinitesimally-small value to channels that are uniformly-zero before proceeding with the MCR algorithm? I suppose it could preferentially bias the solution positive or negative?

from pymcr.

charlesll avatar charlesll commented on June 8, 2024

@jat255 why not just drop those channels? They are virtually information-less for the algorithm, so no need to include them I think.

from pymcr.

charlesll avatar charlesll commented on June 8, 2024

@CCampJr we could write an example on this subject!

from pymcr.

CCampJr avatar CCampJr commented on June 8, 2024

@AndrewHerzing @jat255 I may have misinterpreted "spectral channel". Do you mean portions of spectra or do you mean, e.g., you seeded in 3 initial guess spectra and 2 of them were just 0's?

@charlesll I agree!

from pymcr.

jat255 avatar jat255 commented on June 8, 2024

@CCampJr I mean portions of the spectra. i.e. if you have an x y spectrum image map of spectra, if one of the channels is zero-valued at every x and y position, then you'll get the error described.

@charlesll I don't think we can "drop" channels (at least in the HyperSpy sense). If for instance, you have a dead pixel on your detector or something, removing the channel entirely would break the axis calibration. Admittedly, having all zero values is probably a pretty rare occurrence, but it could feasibly happen in very sparse maps, or if your detector has some incorrect offset or something (which was the case that caused me to notice the problem).

I don't know MCR intimately, but my hunch is that the best solution will just to make the zero channels actually be infinitesimally small (but definite), so that the algorithm can proceed.

from pymcr.

CCampJr avatar CCampJr commented on June 8, 2024

@jat255 So I did not have this problem. See the images below. I set the 50th and 100th spectral bin to be identically 0 at every image pixel. Note the "max spectrum..." image is the maximum spectral intensity across the entire HSI image (I'm using the Demo dataset in the Jupyter NB)

download (1)
download

from pymcr.

jat255 avatar jat255 commented on June 8, 2024

I just saw you responded, but I'll send this comment through anyway...

Yes, zero across the whole image. Here's an example of what I mean (plotted within HyperSpy). This is two signals made of three gaussians with added noise. In one of them, I set the 25th energy channel to zero. This is a "pixel-first" view of the data (i.e. a map of spectra):

image

And this is "energy-first" (i.e. a series of images):

image

When I run our "decomposition" routine using pyMCR, I get a valid result for the "no-zeros" case, but not for the one with one channel forced to zero:

image

from pymcr.

jat255 avatar jat255 commented on June 8, 2024

I don't believe HyperSpy is doing anything unusual, but I will try to reproduce using only pyMCR and not HyperSpy to make sure I see the same behavior... Give me a few to work it out.

from pymcr.

CCampJr avatar CCampJr commented on June 8, 2024

Here's a more intense example (I randomly killed half of the channels). Just for info I used:

# Just a copy of my hyperspectral image data
hsi2 = 1*hsi

# Kill half the spectral pixels across the image
locs_to_kill = np.random.randint(0,200,100)
hsi2[...,locs_to_kill] = 0

initial_spectra2 = 1*initial_spectra
initial_spectra2[...,locs_to_kill] = 0

mcrar = McrAR(max_iter=100, st_regr='OLS', c_regr='OLS', 
              c_constraints=[ConstraintNonneg(), ConstraintNorm()],
              st_constraints=[ConstraintNonneg()])

mcrar.fit(hsi2.reshape((-1, wn.size)), ST=initial_spectra2, verbose=True)
print('\nFinal MSE: {:.7e}'.format(mcrar.err[-1]))

download (2)
download (3)

from pymcr.

CCampJr avatar CCampJr commented on June 8, 2024

@jat255

I don't believe HyperSpy is doing anything unusual, but I will try to reproduce using only pyMCR and not HyperSpy to make sure I see the same behavior... Give me a few to work it out.

Maybe it's a dumb question, but are your spectra/images integer type? Weird things happen if you put in an image with integer type and then work in float space.

from pymcr.

jat255 avatar jat255 commented on June 8, 2024

Unfortunately, no they're floats. It's required by the SVD that we run prior to feeding those in as initial conditions to pyMCR

from pymcr.

CCampJr avatar CCampJr commented on June 8, 2024

@AndrewHerzing @jat255

  • How many SVD components are you keeping?
  • Make sure the input spectra (into MCR) have positive values?
  • If you all want, you can send me data (numpy binaries would be fine)
    • HSI
    • SVD-varimax'd initial spectra

from pymcr.

AndrewHerzing avatar AndrewHerzing commented on June 8, 2024

The number of SVD components depends on the data but in the cases we have tested it is around 3 or 4. After Varimax rotation, the code attempts to find the sign of the majority of the signal and then sets that to positive. In other words, if the SVD output is mostly negative after Varimax, the sign would be flipped. I'll let Josh send you the data as it is his that we have been working on the most.

from pymcr.

jat255 avatar jat255 commented on June 8, 2024

HSI_data.zip

Ok, here's some dummy data that has been exhibiting the issue. There are four data files in that zip. HSI_data_unfolded.npy is a 100x100x40 (x,y,energy) data array unfolded into 2D, so it's actually 10000x40. The HSI_data_SVD.npy is the result of the SVD PCA (as run with s.decomposition(True) in HyperSpy), and then the HSI_SVD_varimax.npy is the same data, but after the varimax rotation. McrAL.pkl is the McrAR object after fitting, which can be opened by:

import pickle
with open('McrAL.pkl', 'rb') as fl: 
    mcr_result = pickle.load(fl)                        

This was the HyperSpy/numpy code used to generate that data and test the MCR implementation:

import hyperspy.api as hs
import numpy as np
import matplotlib.pylab as plt
import scipy.stats as stats

mu1 = -10
variance1 = 900
sigma1 = np.sqrt(variance1)
x1 = np.linspace(mu1 - 3*sigma1, mu1 + 3*sigma1, 200)
x1 = np.linspace(-100,100)
y1 = 10000*stats.norm.pdf(x1, mu1, sigma1)

mu2 = 5
variance2 = 700
sigma2 = np.sqrt(variance2)
x2 = np.linspace(mu2 - 3*sigma2, mu2 + 3*sigma2, 200)
x2 = np.linspace(-100,100)
y2 = 10000*stats.norm.pdf(x2, mu2, sigma2)

mu3 = 25
variance3 = 1000
sigma3 = np.sqrt(variance3)
x3 = np.linspace(-100,100)
y3 = 10000*stats.norm.pdf(x3, mu3, sigma3)

s = hs.signals.Signal1D(np.zeros([100,100,50], np.int32))
s.metadata.General.title = 'example spectrum image'
s.inav[0:50,:] = y1
s.inav[50:75,:] = y2
s.inav[75:,:] = y3
s.add_poissonian_noise()
s = s.isig[:40]
s2 = s.deepcopy()
s2.metadata.General.title = 'example spectrum image w/ zeros'
s2.isig[29] = 0
s2.change_dtype('float32')
s2.decomposition(True)
s2.decomposition(algorithm='MCR', output_dimension=3)

The code that runs pyMCR in HyperSpy is on an experimental branch here. It looks like the problem arises because the C_opt_ attribute of the McrAR object has NaN values where the spectral channel is zero. This propagates into our factors_out array here, and then since we do a normalization here, we get NaNs throughout the whole thing.

from pymcr.

AndrewHerzing avatar AndrewHerzing commented on June 8, 2024

@CCampJr : @jat255 and I have discussed this and we think the simplest way to proceed is for us to check for any NaN's in the C_opt_ output of the MCR and set them to zero prior to the final normalization. We will return a warning to alert the user that this has been done. Do you think this sounds okay? A more fundamental question is whether the presence of the zero's in the input data will affect the output of the MCR and whether we should be warning users about this prior to performing the fit in the first place. Does anything in your experience suggest that this is something we should be worried about?

from pymcr.

CCampJr avatar CCampJr commented on June 8, 2024

@AndrewHerzing @jat255 Sorry for my delay: was off in the woods hiking Fri-Sun.

I'll be curious to see why things are failing as it's not obvious to me.

from pymcr.

jat255 avatar jat255 commented on June 8, 2024

I'm just getting back to this now. Looks like we never quite figured out what was going wrong here... For the sake of getting this included in HyperSpy (see hyperspy/hyperspy#2172) I think we'll go with the approach suggested by @AndrewHerzing:

to check for any NaN's in the C_opt_ output of the MCR and set them to zero prior to the final normalization. We will return a warning to alert the user that this has been done.

We can revisit this if necessary to figure it out "for real"

from pymcr.

Related Issues (17)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.