The license detector is identifying Apache-2.0 as a Pixar license.
The reason is that the Pixar license is a modified version of Apache-2.0, so when similarity is calculated, Pixar scores higher than Apache-2.0 by a very small margin.
My plan: if the similarity scores of, say, N licenses are very close (within the range [most_similar - a, most_similar], where a is a threshold to be chosen after some tests), the detector will report all N licenses as possible correct answers.
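A minimal sketch of that selection rule, to make the idea concrete. This is not the detector's actual code: the class name CloseMatchSelector, the method candidates, the score map, and the margin value are all hypothetical, and the scores in main are made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the real detector.
class CloseMatchSelector {

    // Returns every license whose score lies in [best - margin, best],
    // in the iteration order of the input map.
    static List<String> candidates(Map<String, Double> scores, double margin) {
        List<String> result = new ArrayList<>();
        if (scores.isEmpty()) {
            return result;
        }
        // Find the highest similarity score (most_similar).
        double best = Double.NEGATIVE_INFINITY;
        for (double score : scores.values()) {
            best = Math.max(best, score);
        }
        // Keep every license that is within `margin` of the best score.
        for (Map.Entry<String, Double> entry : scores.entrySet()) {
            if (entry.getValue() >= best - margin) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Double> scores = new LinkedHashMap<>();
        scores.put("Pixar", 0.97);      // made-up scores, for illustration only
        scores.put("Apache-2.0", 0.96);
        scores.put("MIT", 0.41);
        // With a margin of 0.02, both Pixar and Apache-2.0 are reported.
        System.out.println(candidates(scores, 0.02));
    }
}
```

With a margin of 0 this reduces to the current behaviour (only the top score wins), so the threshold a can be tuned without changing the rest of the detector.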
Reproduction
Scan a fairly large project. Alternatively, change the commons-compress dependency in this repo to v1.26.0 and scan the repo using java -jar phsyberdome-sca-cli-1.0.3-beta scan -src <path-to-clone>
commons-compress v1.26.0 has a large dependency tree of its own. The scan runs for quite a while and then crashes with a heap overflow error.
Solution
Stop building the entire dependency tree in memory; instead, write it to disk at regular intervals as it is resolved.
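One possible shape for this, as a rough sketch only. It assumes each resolved node can be recorded as a depth plus a coordinate string; the class DependencyTreeSpool, its record method, the tab-separated line format, and the flush interval are all hypothetical and not taken from the tool's code. The coordinates in main are placeholders.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical spool that appends resolved nodes to a file instead of
// holding the whole tree on the heap.
class DependencyTreeSpool implements AutoCloseable {
    private final BufferedWriter out;
    private final int flushEvery;
    private int pending = 0;

    DependencyTreeSpool(Path file, int flushEvery) throws IOException {
        this.out = Files.newBufferedWriter(file, StandardCharsets.UTF_8);
        this.flushEvery = flushEvery;
    }

    // Writes one node as "<depth>\t<coordinate>" and flushes to disk
    // after every `flushEvery` nodes.
    void record(int depth, String coordinate) throws IOException {
        out.write(depth + "\t" + coordinate);
        out.newLine();
        if (++pending >= flushEvery) {
            out.flush();
            pending = 0;
        }
    }

    @Override
    public void close() throws IOException {
        out.close(); // closing also flushes any remaining buffered lines
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("dependency-tree", ".tsv");
        try (DependencyTreeSpool spool = new DependencyTreeSpool(file, 2)) {
            spool.record(0, "example:root:1.0");     // placeholder coordinates
            spool.record(1, "example:child-a:1.0");
            spool.record(1, "example:child-b:1.0");
        }
        int count = Files.readAllLines(file, StandardCharsets.UTF_8).size();
        System.out.println(count + " nodes written to " + file);
    }
}
```

Because the depth is stored with each line, the tree can be rebuilt or printed later by reading the file sequentially, so only the path currently being resolved needs to stay in memory.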