
Comments (6)

dpwe commented on August 16, 2024

The system is designed for reference items of a few minutes and queries of
tens of seconds. It will work with files an hour long, but beyond that I
don't know. Can you break your files into 1 hour chunks?

DAn.

On Friday, January 29, 2016, eamonnkenny [email protected] wrote:

The software seems to take over my whole Debian Jessie dual quad-core
machine (Intel i7) when performing a precompute on a 24-hour video
obtained from http://oireachtasdebates.oireachtas.ie/ with ncores=1.
Files are about 4-6.4 GB in size, but should they be fully loaded at the
time of processing? Or does the algorithm require fully loading the video
so it can jump around within it?

Also, some files process in 2 minutes whilst others take 150 minutes.

I'm using density=50.



eamonnkenny commented on August 16, 2024

Thanks Dan,

1-hour chunks are probably achievable: I can throw away the video for the
analysis and then glean it back at the end using avconv and mediainfo.
Basically, I'm looking for a parliamentary talk in a 24-hour session. To
find it I split each 24-hour recording into 1-hour sessions, then check
the snippets to find their start time. If one is found in, say,
Dail_20060208-11.wav at time 135.2 seconds, then I just use
3600 * 11 + 135.2 to get the start time, and pymediainfo will give the
duration of any snippet file using:

import pymediainfo

duration = float(pymediainfo.MediaInfo.parse(snippetFile).tracks[0].duration) / 1000.0

I'll see how I get on. Thanks for your help.
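A minimal sketch of this bookkeeping, assuming the hour index is encoded
in the chunk filename as above. The avconv seek flags are its standard
ones, and the helper name and file paths are hypothetical:

import re
import subprocess

# Split a 24-hour recording into 1-hour mono WAV chunks
# (-ss seeks to the chunk start, -t limits it to one hour, -vn drops video).
for hour in range(24):
    subprocess.run(["avconv", "-ss", str(hour * 3600),
                    "-i", "Dail_20060208.mp4",
                    "-t", "3600", "-vn", "-ac", "1",
                    f"Dail_20060208-{hour:02d}.wav"], check=True)

def absolute_start(chunk_filename, offset_seconds):
    """Map a match offset within a 1-hour chunk back to the full session."""
    hour = int(re.search(r"-(\d+)\.wav$", chunk_filename).group(1))
    return 3600 * hour + offset_seconds

print(absolute_start("Dail_20060208-11.wav", 135.2))  # 39735.2 seconds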


dpwe commented on August 16, 2024

To get accurate time indices within 1-hour chunks you'll need to increase
--maxtime to 262144 or so. By default (--maxtime 16384) it aliases on a
roughly 6-minute window.

DAn.
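
To see where the 6-minute figure comes from: --maxtime is counted in
analysis frames, and match offsets are stored modulo that value. Assuming
audfprint's default hop of 256 samples at an 11025 Hz internal sample rate
(an assumption about its internals, not stated in this thread), the
windows work out as follows:

hop_seconds = 256 / 11025.0            # ~0.0232 s per analysis frame
print(16384 * hop_seconds / 60.0)      # ~6.3 min: the default alias window
print(262144 * hop_seconds / 60.0)     # ~101 min: comfortably covers 1 hour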


eamonnkenny commented on August 16, 2024

Good to know, thanks. I tried one previously with the default of 16384
and it worked without any problem. I ran the match query with
--min-count 20, --max-matches 100, and --match-win 2 and found that the
result was very accurate.
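
For reference, that match invocation looks something like the following;
the flags are the ones listed above, while the database and query
filenames are hypothetical:

import subprocess

subprocess.run(["python", "audfprint.py", "match",
                "--dbase", "fpdbase.pklz",
                "--min-count", "20",
                "--max-matches", "100",
                "--match-win", "2",
                "snippets/query.wav"], check=True)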


eamonnkenny commented on August 16, 2024

Hi Dan,

When I go for maxtime = 262144, must this value be built into the
precomputed large files, the smaller-file ingestion, or just the matching
at the end, or all 3?


dpwe commented on August 16, 2024

The maxtime parameter has to be specified when the database file is
first created. After that, it is read from the database file.

DAn.
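
A hedged sketch of the order of operations this implies: --maxtime is
supplied once, when the database is created with "new", and subsequent
steps reuse the value stored in the database file. The command layout
follows audfprint's usual CLI; the filenames are hypothetical:

import subprocess

# Create the database; --maxtime is baked in here, and only here.
subprocess.run(["python", "audfprint.py", "new",
                "--dbase", "fpdbase.pklz",
                "--maxtime", "262144",
                "--density", "50",
                "chunks/Dail_20060208-00.wav"], check=True)

# Later ingestion and matching read maxtime back from the database file,
# so no flag is needed.
subprocess.run(["python", "audfprint.py", "add",
                "--dbase", "fpdbase.pklz",
                "chunks/Dail_20060208-01.wav"], check=True)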

