Comments (11)
Update: Tested against R's psych library, same issue. I realized that the problem seems to occur when the correlation matrix is all positive. If even one value in it is negative, the results match. Maybe this helps with debugging...
from factor_analyzer.
Yes, I would be inclined not to change this, since it doesn't really strike me as a bug and we use eigh pretty consistently throughout. Maybe we can mention it in the documentation?
from factor_analyzer.
Thanks for the follow-up! I will look into this very soon.
from factor_analyzer.
Update: Tested against R's psych library, same issue. I realized that the problem seems to occur when the correlation matrix is all positive. If even one value in it is negative, the results match. Maybe this helps with debugging...
I can't seem to reproduce this issue. For example, using the data provided above (data), here are my results:
import pandas as pd
from factor_analyzer import FactorAnalyzer

# `data` is the pipe-delimited correlation matrix posted earlier in the thread
df = pd.DataFrame([[float(val) for val in row.split(' | ')]
                   for row in data.strip().split('\n')])

fa = FactorAnalyzer(method='minres',
                    n_factors=1,
                    rotation=None,
                    bounds=(0.005, 1),
                    is_corr_matrix=True).fit(df)
print(fa.loadings_)
[[0.3879858 ]
[0.66334567]
[0.32897377]
[0.55966426]
[0.66396016]
[0.81430826]
[0.8469053 ]
[0.63367546]
[0.44783303]
[0.69420312]
[0.58345214]
[0.6963522 ]]
This matches R's psych library. Let me know if I'm missing something!
from factor_analyzer.
I'm using:
pandas 1.2.4
numpy 1.20.2
python 3.8.10
If you're getting correct results I would guess it's because of an older numpy version to be honest, and how it is used internally in factor_analyzer.
Thanks!
from factor_analyzer.
I encounter the same issue (negative factor loadings):
I'm using:
pandas 1.4.3
numpy 1.21.5
python 3.9.12
Which package versions do you suggest to avoid this?
Thanks!
from factor_analyzer.
@celip38 please share your data, if possible, so we can try to reproduce the issue.
from factor_analyzer.
@desilinguist You can use the data I presented above to test this problem.
from factor_analyzer.
Thanks @Db-pckr. I can replicate this on my end too with the latest numpy library.
I poked around a bit and found that numpy.linalg.eigh(), which is used for the eigenvalue decomposition, returns an all-negative first eigenvector for this correlation matrix, whereas the more general (but less efficient) numpy.linalg.eig() returns an all-positive first eigenvector, viz.
With eigh():
array([[-0.17816009],
[-0.30460323],
[-0.15106223],
[-0.25699352],
[-0.3048854 ],
[-0.37392409],
[-0.3888924 ],
[-0.2909789 ],
[-0.20564148],
[-0.31877273],
[-0.26791673],
[-0.31975958]])
and, with eig():
array([[0.17816009],
[0.30460323],
[0.15106223],
[0.25699352],
[0.3048854 ],
[0.37392409],
[0.3888924 ],
[0.2909789 ],
[0.20564148],
[0.31877273],
[0.26791673],
[0.31975958]])
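However, neither is incorrect because, as we know, if v is an eigenvector of a matrix, then so is -v (or any other nonzero scalar multiple of v), with the same eigenvalue.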
So, while we could replace eigh() with eig() to force the results to match what SPSS and R do, I am not convinced that we need to do that since this is not really a bug.
@jbiggsets any thoughts?
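To make the sign argument concrete, here is a minimal sketch using only numpy. The 3x3 correlation matrix is made up for illustration (it is not the data from this thread), and the loadings are computed directly from the leading eigenpair rather than through factor_analyzer. It checks that both the eigenvector and its negation satisfy the eigenvalue equation, and that the two sets of loadings reproduce the same correlations.

```python
import numpy as np

# Small all-positive correlation matrix (made-up values for illustration).
corr = np.array([
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
])

# eigh returns eigenvalues in ascending order, so the last column
# is the eigenvector belonging to the largest eigenvalue.
eigenvalues, eigenvectors = np.linalg.eigh(corr)
lam = eigenvalues[-1]
v = eigenvectors[:, -1]

# Both v and -v satisfy corr @ x = lam * x (residuals are ~0).
residual_pos = np.linalg.norm(corr @ v - lam * v)
residual_neg = np.linalg.norm(corr @ (-v) - lam * (-v))

# The implied one-factor loadings differ only in sign, and the
# reproduced correlations (outer product) are identical.
loadings_pos = v * np.sqrt(lam)
loadings_neg = -loadings_pos
same_fit = np.allclose(np.outer(loadings_pos, loadings_pos),
                       np.outer(loadings_neg, loadings_neg))

print(residual_pos, residual_neg, same_fit)
```

Which sign numpy hands back for v depends on the underlying LAPACK routine, so the sketch deliberately does not assert one.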
from factor_analyzer.
Adding to the documentation sounds like a good idea. I'll do that!
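In the meantime, anyone who needs the signs to line up with another package can flip them after fitting. The sketch below is one possible convention, not something factor_analyzer provides: the helper name `align_loading_signs` is hypothetical, and it simply flips each factor so that its loadings sum to a non-negative value.

```python
import numpy as np

def align_loading_signs(loadings):
    """Flip each factor (column) so that its loadings sum to a non-negative value.

    Hypothetical helper for illustration; not part of factor_analyzer.
    """
    loadings = np.asarray(loadings, dtype=float)
    signs = np.sign(loadings.sum(axis=0))
    signs[signs == 0] = 1.0  # leave columns that sum to exactly zero unchanged
    return loadings * signs

# One all-negative factor, as in the report above (values shortened).
flipped = np.array([[-0.39], [-0.66], [-0.33]])
aligned = align_loading_signs(flipped)

# With several factors, only the columns with a negative sum are flipped.
two_factors = np.array([[0.7, -0.2], [0.6, -0.5], [0.5, 0.1]])
aligned_two = align_loading_signs(two_factors)
```

In practice one would pass `fa.loadings_` to the helper. Note that this only touches the loadings; for oblique rotations, the sign of the corresponding factor correlations would need to be flipped as well.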
from factor_analyzer.
Thanks a lot!
from factor_analyzer.