Comments (21)
Actually, it seems more like a calibration issue: rt-app gives very different calibration results from one run to another, but when a good value is found, rt-app seems to execute workloads reliably.
EDIT: both issues are present, and they seem to share a common cause.
from rt-app.
The known-good version I tried was 482e47a
Hi Douglas,
I have run "rt-app doc/examples/template.json" 20 times on my hikey with the master branch, and the calibration stays the same: 151ns. I have also checked the log file of the thread; the variation of the duration of the 10msec run event is 18usec.
I also tried with a static link:
The calibration returns 500ns 18 times and 501ns 2 times (obviously the `ldexp` is not the same), and the variation of the 10msec run event is 123usec.
I also ran tests on the big core of the hikey960 and the results are quite similar: the calibration always returns 138ns, and the variation of the 10msec run event is only 7usec.
Do you see anything strange in the traces I've attached to the issue?
I don't see anything strange.
Have you tried to run without enabling trace, and only using the log file with a memory buffer?
> Have you tried to run without enabling trace?

Calibration runs as root without tracing. We just run rt-app with idle states disabled, performance cpufreq gov and the rest of the userspace frozen.

> and only using the log file with a memory buffer?

Do you mean logging to a file in tmpfs, or a specific rt-app option?
> > Have you tried to run without enabling trace?
>
> Calibration runs as root without tracing. We just run rt-app with idle states disabled, performance cpufreq gov and the rest of the userspace frozen.
>
> > and only using the log file with a memory buffer?
>
> Do you mean logging to a file in tmpfs, or a specific rt-app option?
`"log_size" : 1024`, as an example, so you will not access the file while running events,
or even
`"log_size" : "disable"`
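For context, here is a minimal sketch of where that option sits, assuming the usual rt-app JSON layout with a top-level "global" object (the task parameters below are illustrative, not taken from this thread):

```json
{
    "global" : {
        "duration" : 10,
        "log_size" : "disable"
    },
    "tasks" : {
        "thread0" : {
            "run" : 10000,
            "sleep" : 10000
        }
    }
}
```

Swapping in `"log_size" : 1024` instead buffers the log in memory, so the file is not accessed while events are running, as described above.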
I'll give it a go. Is that the same logs as enabled by `logstats` (we disable it)? I can't find a reference to `logstats` in the documentation.
Also, I found that the util of the task has some kind of low-frequency component (~120ms period) with the current rt-app. That happened on CPU1 of my Juno R0, although CPU2 was apparently unaffected (they are the same kind of big core).
`logstats`? Do you mean `log_timing()`?
For the low-frequency component, I will try to reproduce it.
@derkling added the "logstats" global option in the JSON produced by LISA, so I assume rt-app knows about it. This is supposed to enable/disable the generation of slack logs, now that they can also be emitted as ftrace events.
Let me know if you are interested in a trace where this strange util variation is observed, and fixed by the change to the busy loop.
There is no "logstats" in the master branch (sha1: 9a50d76).
This raises the question: have you tried the master branch of rt-app?
Tests when changing the busy loop were conducted on master @ 9a50d76, so the comparison between rt-app versions with the same toolchain is somewhat valid.
But the rt-app used within LISA is:
rt-app v1.0-95-g72ab18b (2019-09-05 14:26:07 BST)
This commit SHA1 does not seem to exist in rt-app. @derkling, do you remember which build of rt-app you upstreamed in LISA?
EDIT: it's probably `72ab18b` (and not `g72ab18b`), so it's almost the latest master. However, I also cannot find references to `logstats` in the code ...
I'll try with `"log_size" : "disable"` rather than `logstats`.
I've tried with `"log_size": "disable"` and I get the following results on Juno R2.
JSON: rta_ntaskscpumigration.json.txt
Task utilization with the current version of rt-app: (graph attached)
With rt-app from #90: (graph attached)
These results seem to be reproducible (I ran multiple iterations of each, always getting similar results).
I also ran an integration cycle with the modified version of rt-app, and it removed some failures, most notably in the CPUMigration tests that these graphs are taken from.
I have run your json file on my hikey but still can't reproduce your instability. I would say that it is actually quite stable.
chart.pdf
Also, the theoretical range of util_avg for your migrX-X tasks is [159-211], but with #90 your range is around [110-150].
Could you try to run my rt-app binary, so we can check whether your instability comes from your compilation environment?
rt-app.gz
Douglas,
Any update on this problem? Have you tried the binary that I posted?
Hi Douglas,
Have you made any progress on this topic?
Hi @vingu-linaro, sorry for the delayed response. The current state of things seems to be:
- I think we hit issues more when at least two tasks are scheduled at the same time on the same CPU, like in the trace I posted here. There is plenty of idle time, so I can't really explain it, and this may be a wrong lead.
- PELT can give surprising results at times (although not at that scale AFAIK), so the relation duty cycle <=> utilization might not be so simple. In the meantime, I've added some duty-cycle-oriented plots to LISA, so it might be time to revisit the trace I posted here.
note: The CPUMigration test in LISA will soon not trigger that issue anymore, since it is going to measure the duty cycle from the trace directly to avoid this kind of issue.
> Hi @vingu-linaro, sorry for the response delay. The current state of things seems to be:
> - I think we get issues more when (at least) two tasks are scheduled at the same time on the same CPU like in the trace I posted here. There is plenty of idle time so I can't really explain it, and that may be a wrong lead

I have just rerun your json file on my hikey, and tasks are scheduled simultaneously on the same CPU AFAICT. Yet the util_avg of all tasks stays quite stable: +/-1 at most.
> - PELT can give surprising results at time (although not at that scale AFAIK), so the relation duty cycle <=> utilization might not be so simple. In the meantime, I've added some duty-cycle oriented plots to LISA so it might be time to revisit the trace I posted here.
Yes, there is no linear relation between duty cycle and utilization, because the period also impacts the utilization. With r the running time, p the period (keep in mind that the step is 1024us), and y the PELT decay factor (y^32 = 1/2), the formulas are:
max utilization: 1024 * (1 - y^r) / (1 - y^p)
min utilization: max utilization * y^(p-r)
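As a sanity check, the formulas above can be evaluated with a short helper (hypothetical, not part of this thread), assuming the standard PELT decay factor y = 2^(-1/32) and the 1024 util_avg scale:

```python
# Steady-state PELT utilization bounds for a periodic task, as a sketch.
# r is the running time and p the period, both expressed in 1024us steps
# (following the formulas above); Y is the standard PELT decay factor,
# chosen so the signal halves every 32 steps (Y**32 == 1/2).
Y = 2 ** (-1 / 32)

def pelt_util_bounds(r, p):
    """Return (min, max) steady-state util_avg for r steps of run every p steps."""
    u_max = 1024 * (1 - Y ** r) / (1 - Y ** p)
    u_min = u_max * Y ** (p - r)
    return u_min, u_max

# Example: a 50% duty cycle (16 steps run out of a 32-step period) oscillates
# around 512 rather than sitting on it, which is one reason duty cycle and
# utilization are not linearly related.
```

For a fixed duty cycle, a longer period widens the [min, max] band, which matches the observation that the period, not just the duty cycle, affects utilization.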
> note: The CPUMigration test in LISA will soon not trigger that issue anymore, since it is going to measure the duty cycle from the trace directly to avoid this kind of issue.
> PELT can give surprising results at times

I was referring to this kind of behavior:
The red lines' Y coordinates give the duration of the corresponding rt-app activation; the black lines do the same for the sleep part. The util signal has a weird non-symmetrical oscillation, even in parts where the duty cycle stays stable (between t=3.75 and t=4).
The current PELT simulator we have seems to work well, except in that case, where it gives different results, so I assume there is something tricky going on, but I've not been able to pinpoint what it is exactly. The issue can be reproduced with the Invariance test in LISA [1] (it's not failing every time, but you should get at least 2% of failed runs or so).
I don't think this issue and the one we are discussing here are related, but who knows ...
PS: This kind of plot can be reproduced in a notebook with:
```python
from lisa.trace import Trace

task = 'the_rtapp_task_name'
trace = Trace('trace.dat')

# plot util
axis = trace.analysis.load_tracking.plot_task_signals(task, signals=['util'])
activation_axis = axis.twinx()

# Plot activation/sleep and activation "background bands". You can also
# replace "duration=True" by "duty_cycle=True" with similar results
trace.analysis.tasks.plot_task_activation(task, alpha=0.2, axis=activation_axis, duration=True)
axis.legend()
```