Comments (2)
The numbers of matching and total hashes look extremely small. I wouldn't
trust any match with fewer than 5-10 matching hashes; 1 or 2 hashes are
very likely due to coincidence.
You should raise your density until you're getting at least 20-40 total hashes
in your test segments. Also, it's not going to work well with very short
match patterns; I'd say 2 sec is a minimum. Single words sound too
short, and I don't really understand the use case. Why would such a short
segment be repeated exactly?
It absolutely won't match different instances of the same word, even spoken
by the same speaker. That's a different recognition problem.
Dan.
On Wednesday, June 10, 2015, apiszcz wrote:
I tested the sample rate for ingesting content and for queries as follows.
Note that Case 1 is correct. All source data is original 8 kHz.
--exact-count
--min-count 1
--density 100
--max-matches 10
--match-win 5
--pks-per-frame 2
Case 1: --samplerate 11000. The first result is the query itself in the content,
the second is a match, and the one at 62.1 seconds is another instance. All
detections in Case 1 are correct.
at 46.476 s with 2 of 9 hashes at rank 1
at 47.686 s with 1 of 9 hashes at rank 1
at 62.183 s with 1 of 9 hashes at rank 1
Case 2: --samplerate 8000, with the 8 kHz data. Here the query matches itself,
but the matches at 47.6 and 62 are missing from the top 10 results.
at 46.048 s with 3 of 13 hashes at rank 1
at 332.416 s with 2 of 13 hashes at rank 1
at 12.672 s with 1 of 13 hashes at rank 1
at 13.472 s with 1 of 13 hashes at rank 1
at 97.408 s with 1 of 13 hashes at rank 1
at 121.056 s with 1 of 13 hashes at rank 1
at 146.304 s with 1 of 13 hashes at rank 1
at 257.216 s with 1 of 13 hashes at rank 1
at 323.008 s with 1 of 13 hashes at rank 1
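Dan's rule of thumb (distrust matches with fewer than about 5 matching hashes, and aim for at least 20 total hashes per query) can be applied mechanically to result lines like the ones quoted in this thread. The following is only a minimal sketch: the function names `parse_matches` and `trustworthy`, and the default thresholds, are assumptions made for illustration and are not part of audfprint itself.

```python
import re

# Matches result lines in the form quoted in this thread, e.g.
# "at 46.476 s with 2 of 9 hashes at rank 1".
LINE_RE = re.compile(
    r"at\s+(?P<time>\d+(?:\.\d+)?)\s+s\s+with\s+(?P<matched>\d+)\s+of\s+"
    r"(?P<total>\d+)\s+hashes\s+at\s+rank\s+(?P<rank>\d+)"
)


def parse_matches(text):
    """Return (time_sec, matched_hashes, total_hashes) for each result line."""
    return [
        (float(m["time"]), int(m["matched"]), int(m["total"]))
        for m in LINE_RE.finditer(text)
    ]


def trustworthy(matches, min_matched=5, min_total=20):
    """Keep only matches meeting both thresholds (hypothetical defaults)."""
    return [m for m in matches if m[1] >= min_matched and m[2] >= min_total]


# The three Case 1 result lines from the original post.
case1 = """
at 46.476 s with 2 of 9 hashes at rank 1
at 47.686 s with 1 of 9 hashes at rank 1
at 62.183 s with 1 of 9 hashes at rank 1
"""

matches = parse_matches(case1)
kept = trustworthy(matches)
print(len(matches), "parsed,", len(kept), "kept")  # 3 parsed, 0 kept
```

Under these thresholds none of the reported detections would count as reliable, even the ones known to be correct, which is why raising the density or using longer queries matters more than post-filtering.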
Thank you for the feedback. The use case is fingerprint detection on short
segments; I'm exploring parameter ranges for reasonable hit rates. Agreed,
2-4 seconds seems more reasonable. Understood that matching different
instances of a word is a recognition problem.
(In reply to Dan Ellis, Sat, Jun 13, 2015 at 1:20 AM.)