Git Product home page Git Product logo

Comments (9)

mp-se avatar mp-se commented on July 16, 2024

I have the same issue when trying to upgrade to 3.1.1

[update] This sample code works on the previous version but return error -9 on the v3.1.1 version of the esp8266 arduino fwk.

WiFiClient _wifi;
MQTTClient mqtt(128);

mqtt.begin("myhost", 1883, _wifi);
mqtt.connect("test", "", "");

if (mqtt.publish("test/topic", "value")) {
Serial.printf("MQTT publish successful" CR);
} else {
Serial.printf("MQTT publish failed with error %d" CR, mqtt.lastError());
}

mqtt.disconnect();

from arduino-mqtt.

mp-se avatar mp-se commented on July 16, 2024

Here are some dumps with debugging on for the two frameworks: It looks like it fails to get the ACK due to timeout when sending the initial data to the server (my first thoughts)

These were the flags I added.
-D DEBUG_ESP_HTTP_CLIENT
-D DEBUG_ESP_HTTP_SERVER
-D DEBUG_ESP_PORT=Serial
-D DEBUG_ESP_WIFI
-D DEBUG_ESP_SSL
-D DEBUG_ESP_CORE

Working one (older fwk)

.---- start testing for mqtt bug
[hostByName] Host: 192.168.1.16 is a IP!
:ref 1
:wr 18 0
:wrc 18 18 0
:ack 18
:rn 4
:rd 1, 4, 0
:rdi 4, 1
:rd 1, 4, 1
:rdi 3, 1
:rd 2, 4, 2
:rdi 2, 2
:c0 2, 4
:wr 19 0
:wrc 19 19 0
MQTT publish successful
:wr 2 0
:wrc 2 2 0
:ack 19
:ack 2
:rcl pb=0 sz=-1
:abort
:ur 1
:dsrcv 0
:del
---- end testing for mqtt bug

Failing (3.1.1)

.---- start testing for mqtt bug
[hostByName] Host: 192.168.1.16 is IP!
:ref 1
:wr 18 0
:wrc 18 18 0
:wustmo
:close
MQTT publish failed with error -9
:ur 1
:dsrcv 0
:del
---- end testing for mqtt bug

from arduino-mqtt.

home-s-dev avatar home-s-dev commented on July 16, 2024

I had the same problem, just with esp8266 core 3.1.1 (and not with 3.1.0 or previous 3.0.x)

The problem seems related to the change in delay() coming with ESP8266 core ver 3.1.1
Core

I'm not 100% sure, but changing https://github.com/256dpi/arduino-mqtt/blob/master/src/MQTTClient.cpp#L72 from delay(0) to delay(1) can fix. No other occurrence of "delay" is present in the code...

I hope this can help.

from arduino-mqtt.

mp-se avatar mp-se commented on July 16, 2024

You were spot on. Changing delay(0) to delay(1) fixes the problem with running under 3.1.1. I will do some more tests and issue a PR for this later on.

from arduino-mqtt.

mp-se avatar mp-se commented on July 16, 2024

I did some more testing. Trying to use delayMicroseconds instead of delay but that didnt work. I'm wondering if the issue is connected to timing and reading the result from the initial connection.

Changing from delay(1) to yield() also works. I'm not sure which option is best in this case.

Changing a behaviors' in the arduino core like this would probably break more libraries, so I'm curious if this was intentional or they introduced a new bug.

from arduino-mqtt.

home-s-dev avatar home-s-dev commented on July 16, 2024

If I have pieced together correctly, old version of delay(0) results in calling __esp_delay(0)

extern "C" void __esp_delay(unsigned long ms) {
    if (ms) {
        os_timer_setfn(&delay_timer, (os_timer_func_t*)&delay_end, 0);
        os_timer_arm(&delay_timer, ms, ONCE);
    }
    else {
        esp_schedule();
    }
    esp_suspend();
    if (ms) {
        os_timer_disarm(&delay_timer);
    }
}

and eventually in esp_schedule() followed by esp_suspend().

On the contrary, new version of delay(0) requires more steps with overload versions of esp_delay and esp_try_delay.
In this last function, the argument passed to esp_delay (alias __esp_delay) can be modified according to recurrent scheduled functions (compute_scheduled_recurrent_grain):

bool esp_try_delay(const uint32_t start_ms, const uint32_t timeout_ms, const uint32_t intvl_ms) {
    uint32_t expired = millis() - start_ms;
    if (expired >= timeout_ms) {
        return true; // expired
    }

    // compute greatest chunked delay with respect to scheduled recurrent functions
    uint32_t grain_ms = std::gcd(intvl_ms, compute_scheduled_recurrent_grain());

    // recurrent scheduled functions will be called from esp_delay()->esp_suspend()
    esp_delay(grain_ms > 0 ?
        std::min((timeout_ms - expired), grain_ms):
        (timeout_ms - expired));

    return false; // expiration must be checked again
}

This manipulation can lead to a final call of __esp_delay with a non-zero argument.
In this scenario, esp_schedule is not triggered.

The delayMicroseconds function does not yield to other tasks, so I assume it can't be a valid option.

On the other hand, yield (alias __yield) seems very similar to former delay(0)

extern "C" void __yield() {
    if (cont_can_suspend(g_pcont)) {
        esp_schedule();
        esp_suspend_within_cont();
    }
    else {
        panic();
    }
}

Sorry for the long post.

from arduino-mqtt.

mp-se avatar mp-se commented on July 16, 2024

If I have pieced together correctly, old version of delay(0) results in calling __esp_delay(0)

extern "C" void __esp_delay(unsigned long ms) {
    if (ms) {
        os_timer_setfn(&delay_timer, (os_timer_func_t*)&delay_end, 0);
        os_timer_arm(&delay_timer, ms, ONCE);
    }
    else {
        esp_schedule();
    }
    esp_suspend();
    if (ms) {
        os_timer_disarm(&delay_timer);
    }
}

and eventually in esp_schedule() followed by esp_suspend().

On the contrary, new version of delay(0) requires more steps with overload versions of esp_delay and esp_try_delay. In this last function, the argument passed to esp_delay (alias __esp_delay) can be modified according to recurrent scheduled functions (compute_scheduled_recurrent_grain):

bool esp_try_delay(const uint32_t start_ms, const uint32_t timeout_ms, const uint32_t intvl_ms) {
    uint32_t expired = millis() - start_ms;
    if (expired >= timeout_ms) {
        return true; // expired
    }

    // compute greatest chunked delay with respect to scheduled recurrent functions
    uint32_t grain_ms = std::gcd(intvl_ms, compute_scheduled_recurrent_grain());

    // recurrent scheduled functions will be called from esp_delay()->esp_suspend()
    esp_delay(grain_ms > 0 ?
        std::min((timeout_ms - expired), grain_ms):
        (timeout_ms - expired));

    return false; // expiration must be checked again
}

This manipulation can lead to a final call of __esp_delay with a non-zero argument. In this scenario, esp_schedule is not triggered.

The delayMicroseconds function does not yield to other tasks, so I assume it can't be a valid option.

On the other hand, yield (alias __yield) seems very similar to former delay(0)

extern "C" void __yield() {
    if (cont_can_suspend(g_pcont)) {
        esp_schedule();
        esp_suspend_within_cont();
    }
    else {
        panic();
    }
}

Sorry for the long post.

Great investigation, you really traced the code. The question is if this change is a bug in the new arduino framework and it will be fixed in future versions or not. I would regard this as a bug in the arduino framework since it changes the behaviour and is not backwards compatible.

from arduino-mqtt.

256dpi avatar 256dpi commented on July 16, 2024

Thank you all for the detailed investigation into this issue!

I just released a new version (2.5.1) of the library that increases the delay to 1ms which hopefully results in a proper scheduler yield on all RTOS platforms.

I refrained from adding compiler conditionals to check the existence of "yield()" to keep the code simple for the time being. But, if this does not fix the issue or causes others, we can look into adding those as necessary.

from arduino-mqtt.

256dpi avatar 256dpi commented on July 16, 2024

I'm closing this. Please reopen if necessary.

from arduino-mqtt.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.