Comments (23)
Is this happening in-print? What is your platform / CPU speed? Could it be related to endstops being triggered when the ISR error starts? How long does this take, maybe ~4 seconds? How often does this happen?
Is it reproducible, or random?
Thanks for the report!
from marlin.
I can confirm this. It happens while printing when a layer is almost finished. Printing stops for some seconds while showing many "Error:0 ISR overtaking itself." errors. Then it resumes with the next layer. After about 10-15 layers the firmware has crashed and doesn't respond anymore. (The print was centered, far away from the endstops.)
Marlin 1.0.0 Beta 1 compiled with Arduino 0023
Orca v0.2 with Gen6
Pronterface Version unknown (mid/end October)
SFACT V42.4.1 11.09.15
from marlin.
Once the ISR sends this message, it's likely that it will fail the next time as well, because serial sending blocks. Maybe it would recover faster if we sent the message only once. That is not a solution, but it would be nice to know what causes this.
Anyone, can you print the same file again, and is the error at the same location? What's the clock frequency of the Sanguinololu?
rantenki: what's your microcontroller frequency, 16 MHz?
from marlin.
I am on a ramps 1.3 with arduino mega128 @16mhz iirc. I won't try again on the same print (yoda, 12 hours into a 13 hour print), but I'll see if it happens with my fine detail test print and report back if it is consistently failing in the same spot.
I also have increased my hardware buffer from 16 to 32 blocks, which seemed to make it more robust, but didn't eliminate the problem (and I may be fooling myself, as I didn't do A/B testing).
from marlin.
I just tried again. It fails reliably, but at random points. One common behavior is that Pronterface becomes very unstable or crashes when printing stops. After restarting Pronterface you can reconnect without restarting the printer. (I restarted the printer before every try to be sure.)
- Try 1: starting from home position. It crashed after the first Z move (lifting the nozzle). Many "Error:28265 ISR overtaking itself." and then "endstop hit", but the opto is physically not triggered.
- Try 2: printed about 10 layers successfully. Then many "Error:28265 ISR overtaking itself." Moved 1-2 cm and finally crashed with many "Error:28265 ISR overtaking itself."
- Try 3: after lifting the nozzle, many ISR errors, then moving about 2 cm, and before this move finished it crashed with this:
" ...
Error:26478 ISR overtaking itself.
Error:26478 ISR overtaking itself.
Error:26478 ISR overtaking itself.
Error:
echo:endstops hit: Z:1.91
"
No opto was physically triggered.
The electronics are the standard Mendel-Parts Gen6.
from marlin.
Pronterface will block waiting for results, so that isn't terribly surprising. I have crashed it by holding the arduino down in reset or unplugging the usb. Since the firmware is crashing, this is kind of expected.
Interesting that the numbers we see ie Error:$integer ISR overtaking itself. are all different... /me off to read source.
from marlin.
A little further examination of the code has me thinking we are lucky the heaters aren't running away.
Also, I added a local patch to only send the serial error once, instead of every overrun. I'll commit back if it helps.
Finally, I had the crash again, it happens at totally different places on every run for me.
from marlin.
Re-ran with the following changes:
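// in configuration.h
#define BUFSIZE 5
#define BLOCK_BUFFER_SIZE 32

// in stepper.cpp
volatile int errorsent = 0;  // set once the current overrun has been reported

// in ISR(TIMER1_COMPA_vect)
if (busy) {
  if (!errorsent) {  // report only the first overrun of a burst
    SERIAL_ERROR_START;
    SERIAL_ERROR(*(unsigned short *)OCR1A);
    SERIAL_ERRORLNPGM(" ISR overtaking itself.");
    errorsent = 1;
  }
  return;
}
// ... etc., the rest of the ISR
errorsent = 0;  // the ISR ran to completion, so re-arm the report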
This prevented duplicate ISR error sends. It doesn't do anything to address the root cause, although I wonder if a larger minimum segment size would prevent the overruns by merging multiple short moves? I currently have 5, but with 16x microstepping, that is < 1/10 of a mm on my machine. Perhaps I could do 10? Just spitballing...
from marlin.
@rantenki: Thank you, that's it.
side effect: On all tests before, the print was not centered as it should be. Now it's correctly centered!
...
my 2 hour print just finished successfully with no ISR Error.
from marlin.
Unfortunately this fix isn't addressing the root cause, whatever it might be, although I suspect short segments resulting in the ISR taking longer than the inter-step time. We can probably figure that out with static analysis, but that doesn't sound like fun. ;) It isn't going to help people with 8Mhz and/or low ram (ie: atmega168) either.
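The suspicion above (the ISR taking longer than the inter-step time) can be put into numbers with a minimal sketch. The 16 MHz and 8 MHz clocks come from this thread; the helper names, the 20 kHz step rate and the 1000-cycle ISR cost used in the note below are purely hypothetical and were not measured on Marlin.

```cpp
#include <cstdint>

// Hypothetical helper: CPU cycles available between two consecutive step
// pulses. step_rate_hz must be non-zero.
constexpr std::uint32_t cycles_between_steps(std::uint32_t f_cpu_hz,
                                             std::uint32_t step_rate_hz) {
    return f_cpu_hz / step_rate_hz;
}

// Hypothetical helper: true if an ISR that needs isr_cycles cannot finish
// before the next step interrupt is due, i.e. it would "overtake itself".
constexpr bool isr_overruns(std::uint32_t f_cpu_hz,
                            std::uint32_t step_rate_hz,
                            std::uint32_t isr_cycles) {
    return isr_cycles >= cycles_between_steps(f_cpu_hz, step_rate_hz);
}
```

Under these assumed figures, `cycles_between_steps(16000000, 20000)` leaves 800 cycles per step, so a 1000-cycle ISR would overrun, and an 8 MHz part has only half that budget.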
from marlin.
But compared with a failed print this is a big step ahead. And with this fix it's for me more stable than the "official" 0.9.xx versions. ( improved resolution/quality would be really great too...)
from marlin.
I have been thinking a bit about what could be a possible reason for this. The only things I can find that are not 100% deterministic when printing identical files, as far as the ISR's operation is concerned, are the following:
- The fill state of the buffer. If the buffer runs empty, the velocities and accelerations can change. Also, with a low buffer the ISR might take a different, longer branch. In this case a larger movement buffer could help a tiny bit, as rantenki might or might not have seen.
- Endstop hits, or an endstop being triggered for a very short time due to electromagnetic noise. The latter would hardly be visible in a normal print, showing up only as lost microsteps, if the firmware merely dropped steps while the endstop is triggered. Against this stands the fact that Marlin would then stop the whole move, which would probably lead to a visible shift. I honestly had one big layer shift, but I think that it was just the normal problem of too much extrusion bumping into the rough surface.
- The stepper ISR could be delayed by the temperature-timer interrupt interrupting it, or maybe also by a serial receive ISR, whose execution time I honestly don't know.
If it is the timer, we could do some manual interrupt prioritisation: an enum/define with the values 0, 1 and 2, stored in a volatile uint8_t. While running, the moveISR or tempISR writes a 1 or 2 respectively. After finishing, both write a 0.
Before starting, the moveISR checks whether the uint8_t is marked active, either by itself (very bad) or by the tempISR (continue). If the tempISR is called, it checks whether the moveISR is active, and if so, returns. If this helps, more clever things can be done too, e.g. the tempISR counting how many times it did not get to run, and telling the moveISR to let it finish for once after a number of failures.
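The manual prioritisation described above could look roughly like the following host-side sketch. All names (`isr_owner`, `move_isr_enter`, `temp_skips`, and so on) are hypothetical and not taken from Marlin. One deliberate deviation from the description: on exit the move ISR restores the previous owner instead of writing 0, so that an interrupted tempISR is still marked as active.

```cpp
#include <cstdint>

// Hypothetical owner codes: 0 = nobody, 1 = moveISR, 2 = tempISR.
enum IsrOwner : std::uint8_t { ISR_NONE = 0, ISR_MOVE = 1, ISR_TEMP = 2 };

volatile std::uint8_t isr_owner = ISR_NONE;  // who is currently running
std::uint8_t move_interrupted = ISR_NONE;    // owner the moveISR preempted
unsigned temp_skips = 0;                     // consecutive tempISR skips

// Call at the top of the move ISR. Returns false if the move ISR is
// already running (the "overtaking itself" case); the caller should return.
bool move_isr_enter() {
    if (isr_owner == ISR_MOVE) return false;
    move_interrupted = isr_owner;  // may be ISR_TEMP: continue anyway
    isr_owner = ISR_MOVE;
    return true;
}

// Call at the end of the move ISR.
void move_isr_exit() { isr_owner = move_interrupted; }

// Call at the top of the temperature ISR. Returns false (and counts the
// skip) if the move ISR is active, so the temperature work is postponed.
bool temp_isr_enter() {
    if (isr_owner == ISR_MOVE) { ++temp_skips; return false; }
    temp_skips = 0;
    isr_owner = ISR_TEMP;
    return true;
}

// Call at the end of the temperature ISR.
void temp_isr_exit() { isr_owner = ISR_NONE; }
```

The `temp_skips` counter is the hook for the "more clever things" mentioned above: a move ISR could consult it and yield once it passes some threshold. The check-then-set in each `*_enter` function is not atomic, so this only holds if nothing can interrupt between the check and the write.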
from marlin.
blender64, you also seem to have a problem with your endstops (false triggers).
I will change Marlin so that very small pulses will not be seen as endstop triggers. But you should also fix the problem in your printer.
Increase the distance between the endstop cables and the motor cables, or even better, use shielded cables.
ISR warnings with a large number are probably caused by the endstops.
The ISR warning with 0 and 32 is a firmware bug that I am investigating. OCR1A should not be 0 or 32. I never write it with a value < 100 (at least not directly).
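One common way to ignore very small pulses is to require several consecutive triggered samples before accepting an endstop hit. The sketch below shows that idea only. It is not the change that went into Marlin, and the `EndstopFilter` type and its threshold are hypothetical.

```cpp
#include <cstdint>

// Hypothetical debounce filter: an endstop counts as hit only after it has
// read as triggered for `threshold` consecutive samples.
struct EndstopFilter {
    std::uint8_t threshold;  // consecutive triggered samples required
    std::uint8_t count;      // current run of triggered samples

    explicit EndstopFilter(std::uint8_t n) : threshold(n), count(0) {}

    // Feed one raw pin reading; returns true while the hit is confirmed.
    bool sample(bool raw_triggered) {
        if (!raw_triggered) {  // any clean reading resets the run
            count = 0;
            return false;
        }
        if (count < threshold) ++count;  // saturate at threshold
        return count >= threshold;
    }
};
```

With `EndstopFilter z_min(3)`, a single noisy reading is discarded, while a real hit is reported on the third consecutive triggered sample. The cost is a detection delay of a few sampling periods.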
from marlin.
rantenki,
Can you reproduce this error? One of the problems for me is that I cannot reproduce it.
from marlin.
Erik: I have seen a problem that manifests as a pause in the print, but with autonomous SD card printing. I can try to have serial-connected host software monitor what's going on. I suspect it could be the same cause, so at least I can test. The annoying problem: it has only happened to me 3 times in 3 h of cumulative print time.
from marlin.
rantenki,
Can you test with the latest version? It now should display the message only one time and not in the ISR.
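Showing the message only once and outside the ISR usually means the ISR just records the event and the main loop does the slow serial output. The following is a minimal host-side sketch of that pattern. The function and variable names are hypothetical and this is not the code that was committed.

```cpp
#include <cstdint>
#include <string>

// Hypothetical state shared between the ISR and the main loop.
volatile bool overrun_pending = false;
volatile std::uint16_t overrun_value = 0;

// Called from the stepper ISR: only records the first overrun, never prints.
void note_overrun_from_isr(std::uint16_t timer_value) {
    if (!overrun_pending) {
        overrun_value = timer_value;
        overrun_pending = true;
    }
}

// Called from the main loop: returns the message to send, or an empty
// string if nothing is pending. On a real AVR the 16-bit read would need
// interrupts disabled around it; that is omitted in this host sketch.
std::string poll_overrun_message() {
    if (!overrun_pending) return std::string();
    const unsigned value = overrun_value;
    overrun_pending = false;
    return "Error:" + std::to_string(value) + " ISR overtaking itself.";
}
```

Because the ISR never touches the serial port, a burst of overruns costs only a few cycles each and produces a single line of output the next time the main loop polls.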
from marlin.
ErikZalm: Yep, I can reproduce anytime I want, although I am out of the lab today so I cannot try again til tomorrow. I'll give the new version a shot tomorrow.
Sorry, my work schedule is a bit hectic so it is tough to carve out time to test things.
from marlin.
rantenki,
Can you give a description on how you can reproduce it? That would be a big help.
from marlin.
Yep, very long runs with lots of very fine detail and short line segments with the default configuration. If I scale yoda down to 1/4 size it happens after a couple of hours of print time. Once I get back later today I can attach the gcode that triggers it (several hours in).
from marlin.
I removed the nesting from the stepper ISR. This was showing this message.
The nesting was allowed to prevent serial errors. This is now done by checking the serial line in the stepper ISR.
from marlin.
bkubicek/ErikZalm: The newest version as of Saturday doesn't seem to have the same issue, and I didn't get any failures during some pretty long prints which previously have failed. That said, this is not entirely deterministic, so it isn't guaranteed that the bug is fixed until we get a lot of repeated non-failures (and I don't need that many yodas). Pulling the serial writes out of the ISR certainly seems to have reduced the cascading nature of the failure, but I think we need a ton of testing before we can call it closed.
As for the setting of a volatile enum, perhaps we can locate a good arduino semaphore library (I understand that freeRTOS has some), because there is always a risk of a race in the check/set conditional that will result in incorrect state being read/set.
Man, stuff like this really reminds me why I don't miss embedded dev ;)
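The race in the check/set conditional mentioned above goes away if the check and the set happen as one indivisible operation. Here is a minimal sketch using the standard `std::atomic_flag`, assuming a toolchain that ships `<atomic>`. The `try_enter_stepper` and `leave_stepper` names are hypothetical. On an AVR toolchain without `<atomic>`, the usual equivalent is to disable interrupts around the check and set, for example with avr-libc's `ATOMIC_BLOCK` from `<util/atomic.h>`.

```cpp
#include <atomic>

// Hypothetical busy flag guarding the stepper routine.
std::atomic_flag stepper_busy = ATOMIC_FLAG_INIT;

// Atomically checks and sets the flag. Returns true if the caller acquired
// it, false if someone else already holds it.
bool try_enter_stepper() {
    return !stepper_busy.test_and_set(std::memory_order_acquire);
}

// Releases the flag so the next caller can enter.
void leave_stepper() {
    stepper_busy.clear(std::memory_order_release);
}
```

Since `test_and_set` returns the previous value and sets the flag in one step, two contexts can never both observe "not busy".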
from marlin.
rantenki,
Thanks for testing. The Saturday version improved it, but I still had the problem when printing at high speed.
I decided to remove the nesting from the stepper routine. This was needed for the serial communication but I put a serial check in the stepper ISR. The nesting is not needed anymore.
from marlin.
This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
from marlin.