Git Product home page Git Product logo

Comments (8)

ckormanyos avatar ckormanyos commented on September 7, 2024 1

Thanks for the detailed analysis and questions. Yes, the double-read of the 32-bit system tick is intended to ensure consistency of reading the value without stopping interrupts or stopping the timer resource. This method assumes that the act of reading the system_tick takes less time than the underlying 128usec of the timer period. And this could be an issue if the system has lots of other time-consuming interrupts. It might be better to use a 16-bit timer as the underlying resource. But I am not sure if the microcontroller has one.

The volatile qualification on the variables is intended to persuade the compiler to avoid aggressively optimizing the read-order of the system_tick and the hardware registers. But using volatile does not, by standard, guarantee that this is the case. So we have a situation that requires analyzing the underlying assemby or verifying in some other way the proper read-order of the code. This means that the code is not portable and probably needs to be verified when using other compilers or other optimization settings. This is an uncomfortable situation, potentially capable of being improved. But I decided to go this way for this particular design.

The self-written subset of the STL for the AVR-CGG compiler is a very good resource indeed. This is discussed in one of the later chaters of the book---but only briefly. I have used this STL subset for various compilers (not only GCC).

from real-time-cpp.

ckormanyos avatar ckormanyos commented on September 7, 2024 1

I agree that the specification of volatile is vague. In fact the code, even as written, is not guaranteed to be free from re-ordering. So we are operating in unspecified territory here.

Another note: The book uses timer0 compare register a for interrupt service of the system tick. Whereas the companion code at git has switched to timer0 overflow.

Another note about +1 (adding one) to the system tick: The underlying timer0 frequency is 2MHz. So when we synthesize the entire 32-bit consistent microsecond tick, we add the lower byte (8 bits) from the actual timer count register to the upper three bytes of the 32-bit value. Ultimately, this means that the value of the timer0 counter register needs division by two before being synthesized into the 32-bit consistent usec tick. The addition of one provides a rounding correction for the timer0 register prior to division by 2.

Cheers, Chris

from real-time-cpp.

ajneu avatar ajneu commented on September 7, 2024

One question I do have:

can we be certain, that the compiler will not rearrange the order of the following lines ??? :

 // Do the first read of the timer0 counter and the system tick.
const timer_register_type tim0_cnt_1 = mcal::reg::access<timer_address_type, timer_register_type, mcal::reg::tcnt0>::reg_get();
const mcal::gpt::value_type sys_tick_1 = system_tick;

// Do the second read of the timer0 counter.
const timer_register_type tim0_cnt_2 = mcal::reg::access<timer_address_type, timer_register_type, mcal::reg::tcnt0>::reg_get();

This order absolutely may not be rearranged!! (The reading of system_tick, needs to be framed (preceded and then followed) by the reading of tcnt0).

Checking the assembly listing

cd ~/real-time-cpp/ref_app/src/mcal/avr
# create assembler code:
avr-g++ -std=c++11 -I . -I ../../util/STL -I ../.. -S -fverbose-asm -g -O2 mcal_gpt.cpp -o mcal_gpt.s 
# create asm interlaced with source lines:
avr-as -alhnd mcal_gpt.s > mcal_gpt.lst

we get

;;;; tim0_cnt_1 = mcal::reg::access<timer_address_type, timer_register_type, mcal::reg::tcnt0>::reg_get();
;;;; ##########################
.LM11:
    in r19,0x26  ;  D.5116, MEM[(volatile unsigned char *)70B]
.LBE23:
.LBE22:
    .stabs  "mcal_gpt.cpp",132,0,0,.Ltext4
.Ltext4:
    .stabn  68,0,63,.LM12-.LFBB4
.LM12:
;;;; sys_tick_1 = system_tick;
;;;; ##########################
    lds r24,_ZN12_GLOBAL__N_111system_tickE  ;  sys_tick_1, system_tick
    lds r25,_ZN12_GLOBAL__N_111system_tickE+1    ;  sys_tick_1, system_tick
    lds r26,_ZN12_GLOBAL__N_111system_tickE+2    ;  sys_tick_1, system_tick
    lds r27,_ZN12_GLOBAL__N_111system_tickE+3    ;  sys_tick_1, system_tick
.LBB24:
.LBB25:
    .stabs  "../../mcal/mcal_reg_access_template.h",132,0,0,.Ltext5
.Ltext5:
    .stabn  68,0,26,.LM13-.LFBB4
.LM13:
;;;; tim0_cnt_2 = mcal::reg::access<timer_address_type, timer_register_type, mcal::reg::tcnt0>::reg_get();
;;;; ##########################
    in r18,0x26  ;  D.5116, MEM[(volatile unsigned char *)70B]
.LBE25:
.LBE24:
    .stabs  "mcal_gpt.cpp",132,0,0,.Ltext6
.Ltext6:
    .stabn  68,0,71,.LM14-.LFBB4
.LM14:
;;;; ((tim0_cnt_2 >= tim0_cnt_1) ? ... : ...)
;;;; ##########################
    cp r18,r19   ;  D.5116, D.5116
    brlo .L8     ; ,
    .stabn  68,0,70,.LM15-.LFBB4
.LM15:
;;;; // if-branch (i.e. (tim0_cnt_2 >= tim0_cnt_1) is true)
;;;; ##########################
    mov r18,r19  ;  D.5115, D.5116
    ldi r19,0    ;  D.5115
    subi r18,-1  ;  D.5115,
    sbci r19,-1  ;  D.5115,
    lsr r19  ;  D.5115
    ror r18  ;  D.5115
    .stabn  68,0,71,.LM16-.LFBB4
.LM16:
    mov r22,r24  ;  D.5113, sys_tick_1
    mov r23,r25  ;  D.5113, sys_tick_1
    mov r24,r26  ;  D.5113, sys_tick_1
    mov r25,r27  ;  D.5113, sys_tick_1
    or r22,r18   ;  D.5113, D.5115
    ret
.L8:
;;;; // else-branch (i.e. (tim0_cnt_2 < tim0_cnt_1) is true)
;;;; ##########################
    .stabn  68,0,71,.LM17-.LFBB4
.LM17:
    lds r22,_ZN12_GLOBAL__N_111system_tickE  ;  D.5113, system_tick
    lds r23,_ZN12_GLOBAL__N_111system_tickE+1    ;  D.5113, system_tick
    lds r24,_ZN12_GLOBAL__N_111system_tickE+2    ;  D.5113, system_tick
    lds r25,_ZN12_GLOBAL__N_111system_tickE+3    ;  D.5113, system_tick
    ldi r19,0    ;  D.5115
    subi r18,-1  ;  D.5115,
    sbci r19,-1  ;  D.5115,
    lsr r19  ;  D.5115
    ror r18  ;  D.5115
    or r22,r18   ;  D.5113, D.5115
    ret

so at least according to this listing the order is correct.

But is this also guaranteed for every C++-standards-conforming compiler?
Thanks.
A.

(Side-comment question: Is .LM16 above actually wasting assembly cycles by copying the system_tick it read previously (at the beginning - see .LM12); or is this necessary [maby] because if we'd have read into r22 to r25 right from the start it could have been modified by some compare or some other instruction-result ???)
.
.
.
Edit:
... Addendum...
Oh I've just checked the code again... and realized the following:
All 3 statements (whose order needs to be held) have volatile; and I think that this is sufficient to guarantee the order, right?

  • system_tick is volatile: ref
  • the readings via reg_get() (here and here) are volatile: ref

from real-time-cpp.

ajneu avatar ajneu commented on September 7, 2024

Thanks for your reply.

But using volatile does not, by standard, guarantee that this is the case.

I decided to check this and ... man... that standard legalese is so vague! ...
shrouding everything to become relative to "the eye of the beholder" (i.e. relative to individual opinion on what is written).

Anyway, here are some opinions:
http://stackoverflow.com/a/2535246
http://stackoverflow.com/questions/14785639/may-accesses-to-volatiles-be-reordered
http://www.aristeia.com/Papers/DDJ_Jul_Aug_2004_revised.pdf#page=7

Basically this November 2014 C++ Standard ref
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4296.pdf#page=22
says:

1.9.8.5 A conforming implementation executing a well-formed program shall produce the same observable behavior as one of the possible executions of the corresponding instance of the abstract machine with the same program and the same input.

1.9.8.1 Access to volatile objects are evaluated strictly according to the rules of the abstract machine.

(So it's all up to our interpretation of "stricly"!?!)

Browsing what people have written... here is my opinion:
When working with a compiler that had a good compiler-team (one that values correctness more, than "just getting a value"!), volatile will not be re-ordered in a single thread of execution (or in interrupt routines).
Obviously with multiple threads of execution, those threads run when they want, so there is no guarantee of how one thread interacts with another, unless we use semaphores, memory barriers etc.

Ultimately (if it's really important), I'll rather be safe than sorry... and then would do as you have written:

[...] we have a situation that requires analyzing the underlying assemby or verifying in some other way the proper read-order of the code

from real-time-cpp.

ajneu avatar ajneu commented on September 7, 2024

Hi Chris,

I've rechecked your code.
It has a bug! but yes it's oh so sneaky, and hard to spot! ;)

Bug is here and here, i.e. in the bottom 2 lines of the following:

// Perform the consistency check.
const mcal::gpt::value_type consistent_microsecond_tick
  = ((tim0_cnt_2 >= tim0_cnt_1) ? mcal::gpt::value_type(sys_tick_1  | std::uint8_t(std::uint16_t(std::uint16_t(tim0_cnt_1) + 1U) >> 1U))
                                : mcal::gpt::value_type(system_tick | std::uint8_t(std::uint16_t(std::uint16_t(tim0_cnt_2) + 1U) >> 1U)));

The fix is as follows:

// Perform the consistency check.
const mcal::gpt::value_type consistent_microsecond_tick
  = ((tim0_cnt_2 >= tim0_cnt_1) ? mcal::gpt::value_type(sys_tick_1  + std::uint8_t(std::uint16_t(std::uint16_t(tim0_cnt_1) + 1U) >> 1U))
                                : mcal::gpt::value_type(system_tick + std::uint8_t(std::uint16_t(std::uint16_t(tim0_cnt_2) + 1U) >> 1U)));

(can you spot it? -> change bit-or to plus!)

or better still (let's get rid of that +1, which serves no rounding purposes. It is unnecessary, complex and just eats processor cycles in an interrupt). I recommend:

// Perform the consistency check.
const mcal::gpt::value_type consistent_microsecond_tick
  = ((tim0_cnt_2 >= tim0_cnt_1) ? mcal::gpt::value_type(sys_tick_1  | std::uint8_t(tim0_cnt_1 >> 1U))
                                : mcal::gpt::value_type(system_tick | std::uint8_t(tim0_cnt_2 >> 1U)));

I've sent you a pull request for this one.

Here's a nice test-program, that prints some tables, so that we can see what's going on:
(By the way: only after having written a 1st rudimentary version of this test-program, did I notice the bug)

#include <iostream>
#include <iomanip>
#include <bitset>
#include <climits>
#include <cstdint>
#include <array>

using namespace std;

//#define arr_len(ARRAY) (sizeof(ARRAY)/sizeof(ARRAY[0]))

/*
template<typename T, size_t len> 
constexpr size_t arr_len(T(&)[len])
{
  return len;
}
*/

//#define str_len(ARRAY) ((sizeof(ARRAY)/sizeof(ARRAY[0]))-1)

template<size_t len> 
constexpr size_t str_len(const char(&)[len])
{
  return len-1;
}

template<size_t len>
//void print_heading(const size_t (&arr_width)[len], const char * const(&arr_str)[len])
void print_heading(const array<size_t, len>& arr_width, const array<const char * const, len>& arr_str)
{
  // print heading
  cout << left;

  if (len == 0) // one little safety check
    return;

  for (size_t i = 0; ; ) {
    cout << setw(arr_width[i]) << arr_str[i];
    if (++i == len) {
      cout << '\n';
      break;
    }
    cout << " | ";
  }

  cout << setfill('-');
  for (size_t i = 0; ; ) {
    cout << setw(arr_width[i]) << "";
    if (++i == len) {
      cout << '\n';
      break;
    }
    cout << " | ";
  }

  cout << setfill(' ')
       << right;
}

typedef std::uint32_t value_type;
typedef std::uint8_t  timer_register_type;


// BUGGY //////
value_type calc_buggy(value_type sys_tick_1, timer_register_type tim0_cnt_1)
{
  return sys_tick_1 | std::uint8_t(std::uint16_t(std::uint16_t(tim0_cnt_1) + 1U) >> 1U);
}

// FIX //////
value_type calc_fix(value_type sys_tick_1, timer_register_type tim0_cnt_1)
{
  return sys_tick_1 + std::uint8_t(std::uint16_t(std::uint16_t(tim0_cnt_1) + 1U) >> 1U);
}

// BETTER //////
value_type calc_better(value_type sys_tick_1, timer_register_type tim0_cnt_1)
{
  return sys_tick_1 | std::uint8_t(tim0_cnt_1 >> 1U);
}

int main()
{
  value_type          sys_tick_1;
  timer_register_type tim0_cnt_1;
  value_type          buggy;
  value_type          fix;
  value_type          better;

  // table heading string
  const char str_sys_tick_1[] = "sys_tick_1";
  const char str_tim0_cnt_1[] = "tim0_cnt_1";
  const char str_buggy[]      = "buggy (with +1)";
  const char str_fix[]        = "fix (with +1)";
  const char str_better[]     = "better (without +1)";

  // width of bit-pattern (number of bits)
  const size_t wbp0 = CHAR_BIT*sizeof(sys_tick_1);
  const size_t wbp1 = CHAR_BIT*sizeof(tim0_cnt_1);
  const size_t wbp2 = CHAR_BIT*sizeof(buggy);
  const size_t wbp3 = CHAR_BIT*sizeof(fix);
  const size_t wbp4 = CHAR_BIT*sizeof(better);

  // width of table columns
  const size_t w_sys_tick_1 = max(str_len(str_sys_tick_1), wbp0);
  const size_t w_tim0_cnt_1 = max(str_len(str_tim0_cnt_1), wbp1);
  const size_t w_buggy      = max(str_len(str_buggy)     , wbp2);
  const size_t w_fix        = max(str_len(str_fix)       , wbp3);
  const size_t w_better     = max(str_len(str_better)    , wbp4);


  // initial values
  constexpr value_type          SYS_TICK_1_BEGIN = 0xFFFFFF80U;
  constexpr timer_register_type TIM0_CNT_1_BEGIN = 0xFDu;
  constexpr timer_register_type TIM0_CNT_1_END   = 0x04U;



  //////////// BINARY BITPATTERN /////////////////////////////////
  // print heading
  /*
  print_heading((size_t[])            {  w_sys_tick_1,   w_tim0_cnt_1,   w_buggy,   w_fix,   w_better},
                (const char* const []){str_sys_tick_1, str_tim0_cnt_1, str_buggy, str_fix, str_better});
  */
  print_heading(array<size_t, 5>{            {  w_sys_tick_1,   w_tim0_cnt_1,   w_buggy,   w_fix,   w_better}},
                array<const char* const, 5>{ {str_sys_tick_1, str_tim0_cnt_1, str_buggy, str_fix, str_better}});

  sys_tick_1 = SYS_TICK_1_BEGIN;

  for (tim0_cnt_1 = TIM0_CNT_1_BEGIN; tim0_cnt_1 != TIM0_CNT_1_END; ++tim0_cnt_1) {
    if (tim0_cnt_1 == 0x00U)
      sys_tick_1 += 0x80U;

    buggy  = calc_buggy (sys_tick_1, tim0_cnt_1);
    fix    = calc_fix   (sys_tick_1, tim0_cnt_1);
    better = calc_better(sys_tick_1, tim0_cnt_1);

    //      width                 bit-pattern
    //      ------------------    ------------------------
    cout << setw(w_sys_tick_1) << bitset<wbp0>{sys_tick_1} << " | "
         << setw(w_tim0_cnt_1) << bitset<wbp1>{tim0_cnt_1} << " | "
         << setw(w_buggy)      << bitset<wbp2>{buggy}      << " | "
         << setw(w_fix)        << bitset<wbp3>{fix}        << " | "
         << setw(w_better)     << bitset<wbp4>{better}     << '\n';

  }

  cout << "\n\n";



  //////////// BASE 10 NUMBERS /////////////////////////////////
  // print heading
  /*
  print_heading((size_t[])            {str_len(str_sys_tick_1), str_len(str_tim0_cnt_1), str_len(str_buggy), str_len(str_fix), str_len(str_better)},
                (const char* const []){        str_sys_tick_1,          str_tim0_cnt_1,          str_buggy,          str_fix,          str_better});
  */
  print_heading(array<size_t, 5>{            {str_len(str_sys_tick_1), str_len(str_tim0_cnt_1), str_len(str_buggy), str_len(str_fix), str_len(str_better)}},
                array<const char* const, 5>{ {        str_sys_tick_1,          str_tim0_cnt_1,          str_buggy,          str_fix,          str_better}});

  sys_tick_1 = SYS_TICK_1_BEGIN;

  for (tim0_cnt_1 = TIM0_CNT_1_BEGIN; tim0_cnt_1 != TIM0_CNT_1_END; ++tim0_cnt_1) {
    if (tim0_cnt_1 == 0x00U)
      sys_tick_1 += 0x80U;

    buggy  = calc_buggy (sys_tick_1, tim0_cnt_1);
    fix    = calc_fix   (sys_tick_1, tim0_cnt_1);
    better = calc_better(sys_tick_1, tim0_cnt_1);

    //      width                            number
    //      -----------------------------    ---------------
    cout << setw(str_len(str_sys_tick_1)) << sys_tick_1      << " | "
         << setw(str_len(str_tim0_cnt_1)) << int(tim0_cnt_1) << " | "
         << setw(str_len(str_buggy))      << buggy           << " | "
         << setw(str_len(str_fix))        << fix             << " | "
         << setw(str_len(str_better))     << better          << '\n';

  }


  return 0;
}

The program prints the following:
The 3rd line under the heading divider ("---") of 3rd column (buggy (with +1)) ... shows the effect of the bug. Once in binary, and once base 10.

sys_tick_1                       | tim0_cnt_1 | buggy (with +1)                  | fix (with +1)                    | better (without +1)             
-------------------------------- | ---------- | -------------------------------- | -------------------------------- | --------------------------------
11111111111111111111111110000000 |   11111101 | 11111111111111111111111111111111 | 11111111111111111111111111111111 | 11111111111111111111111111111110
11111111111111111111111110000000 |   11111110 | 11111111111111111111111111111111 | 11111111111111111111111111111111 | 11111111111111111111111111111111
11111111111111111111111110000000 |   11111111 | 11111111111111111111111110000000 | 00000000000000000000000000000000 | 11111111111111111111111111111111
00000000000000000000000000000000 |   00000000 | 00000000000000000000000000000000 | 00000000000000000000000000000000 | 00000000000000000000000000000000
00000000000000000000000000000000 |   00000001 | 00000000000000000000000000000001 | 00000000000000000000000000000001 | 00000000000000000000000000000000
00000000000000000000000000000000 |   00000010 | 00000000000000000000000000000001 | 00000000000000000000000000000001 | 00000000000000000000000000000001
00000000000000000000000000000000 |   00000011 | 00000000000000000000000000000010 | 00000000000000000000000000000010 | 00000000000000000000000000000001


sys_tick_1 | tim0_cnt_1 | buggy (with +1) | fix (with +1) | better (without +1)
---------- | ---------- | --------------- | ------------- | -------------------
4294967168 |        253 |      4294967295 |    4294967295 |          4294967294
4294967168 |        254 |      4294967295 |    4294967295 |          4294967295
4294967168 |        255 |      4294967168 |             0 |          4294967295
         0 |          0 |               0 |             0 |                   0
         0 |          1 |               1 |             1 |                   0
         0 |          2 |               1 |             1 |                   1
         0 |          3 |               2 |             2 |                   1

Regards,
ajneu

from real-time-cpp.

ckormanyos avatar ckormanyos commented on September 7, 2024

I agree. That is an actual bug. Good catch. I will merge the change next week. Thanks and regards, Chris.

from real-time-cpp.

ckormanyos avatar ckormanyos commented on September 7, 2024

The suggested correction has been merged. Thanks again and good catch on a hard-to-find bug. Regards, Chris.

from real-time-cpp.

ajneu avatar ajneu commented on September 7, 2024

Your welcome.

Thanks for your book and sharing your code and ideas. I need to make some time, to really read and "work" your book. It looks highly interesting!
A.

from real-time-cpp.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.