
rev's Introduction

rev : RISC-V Native CPU Model for SST


NOTICE: REV HAS MIGRATED THE PRIMARY BRANCH FROM MASTER TO MAIN ON FEBRUARY 05, 2024

Getting Started

The Rev SST component is designed to provide cycle-based simulation capabilities of an arbitrary RISC-V core or cores. Rev utilizes the Sandia Structural Simulation Toolkit as the core parallel discrete event simulation framework. We utilize the standard SST "core" component libraries to build the Rev component. As a result, Rev can be attached to any other existing SST component for full system and network simulations.

The Rev model is unique among SST CPU models in that it provides users the ability to load compiled binaries (ELF binaries). Rather than requiring input in the form of textual assembly or hex dumps, SST contains a RISC-V compatible loader function that generates all the necessary symbol tables and addressing modes for the target RISC-V CPU. Further, Rev permits users to generate simulation configurations that contain heterogeneous RISC-V designs with support for disparate extensions.

The Rev component infrastructure can also be extended to include custom instruction extensions. This provides users the ability to design new instruction templates without requiring modifications to the core crack+decode infrastructure. For more information on creating custom templates, see the developer documentation.

Prerequisites

Given that this is an SST external component, the primary prerequisite is a current installation of the SST Core. The Rev building infrastructure assumes that the sst-config tool is installed and can be found in the current PATH environment.

Rev relies on CMake for building the component from source. The minimum required version for this is 3.19.

Building

Building the Rev SST component from source using CMake (>= 3.19) can be performed as follows:

git clone
cd rev/build
cmake -DRVCC=/path/to/riscv/compiler/exe ..
make -j
make install

Additional build options include:

  • -DBUILD_ASM_TESTING=ON

After a successful build you can test your install with:

make test

You can also run a single test with:

ctest -R <test_name>

where you can substitute test_name with the name of the test, for example:

ctest -R TEST_EX1

will run the test found in test/ex1. See the full list of tests in test/CMakeLists.txt.

Building Compatible Compilers

As mentioned above, the Rev SST model supports standard ELF binary payloads as input to the model. As a result, we need a cross compilation framework to build source code into suitable binaries. We highly recommend building a suitable RISC-V compiler from source. This will permit you to tune the necessary options in order to support a multitude of different standard, optional and custom extensions.

We recommend compiling the riscv-gnu-toolchain using the multilib option. This is analogous to the following:

git clone https://github.com/riscv/riscv-gnu-toolchain
cd riscv-gnu-toolchain
git submodule update --init --recursive
./configure --prefix=/opt/riscv --enable-multilib
make -j

Example Execution

Component Options

The Rev SST component contains the following options:

| Parameter | Required? | Type | Description |
|---|---|---|---|
| verbose | | unsigned integer | Values of 0-8. Increasing values increase the verbosity of output. |
| numCores | X | unsigned integer | Values of 1-N. Sets the number of cores in the simulation. |
| clock | X | Hertz | "xGHz", "xKHz". Sets the clock frequency of the device. |
| memSize | X | unsigned integer | Sets the size of physical memory in bytes. |
| machine | X | "[Core:Arch]" | "[0:RV32I],[1:RV64G]". Sets the RISC-V architecture for the target core. |
| startAddr | X | "[Core:StartAddr]" | "[0:0x00010144],[1:0x123456]". Sets the entry point for each core. |
| memCost | | "[Core:Min:Max]" | "[0:1:10],[1:50:100]". Sets the minimum and maximum latency (in cycles) for each core's memory load. |
| program | X | string | "example.exe". Sets the target ELF executable. |
| table | | string | "/path/to/table.txt". Sets the path to the instruction cost table. |
| splash | | 0/1 | Default=0. Setting to 1 displays the Rev bootsplash. |
| enable_nic | | 0/1 | Default=0. Setting to 1 enables a standard NIC. |
| enable_pan | | 0/1 | Default=0. Setting to 1 enables a PAN NIC. |
| enable_test | | 0/1 | Default=0. Setting to 1 enables the internal PAN test harness. |
| enable_pan_stats | | 0/1 | Default=0. Setting to 1 enables internal statistics for PAN commands. |
| enableRDMAMbox | | 0/1 | Default=1. Setting to 1 enables the internal RDMA Mailbox for applications to initiate messages. |
| msgPerCycle | | unsigned integer | Default=1. Sets the number of messages to inject per cycle. |
| testIters | | unsigned integer | Default=255. Sets the number of iterations for each PAN test loop. |

Deriving the ELF Entry Point

The latest version of Rev no longer requires the user to manually derive the starting address for binaries that contain a main() function. If the user specifies the starting address as 0x00, then the Rev loader will automatically derive the main() symbol address and use it as the starting address. From here, the Rev model will perform an initial setup and reset of the target core or cores in the same manner as prescribed by the RISC-V ABI. Most users will expect to execute their application starting at the main() function. If the user requires a different starting address or the target payload does not contain a main() function, then the user must manually derive the address. Given an executable that has been compiled (example.exe), we may derive the entry point address using the tool chain's objdump tool. An example of doing so is as follows:

riscv64-unknown-elf-objdump -dC example.exe | grep "<main>"

This will give us output similar to the following:

00010144 <main>:

Using this, the address 0x00010144 becomes our entry point address.

The Rev component model can start execution at any valid address in the RISC-V text space. Keep in mind, however, that Rev assumes no prior state when starting execution (it starts from reset). As a result, the user cannot assume that the Rev model will prepopulate any memory or register state outside of what is provided when executing from main().

Multicore Execution

As mentioned above, Rev has the ability to execute multiple, heterogeneous cores in the same simulation. However, if users seek to execute multiple, homogeneous cores, there is an additional configuration option for doing so. For example, if you seek to simulate 8 homogeneous cores, set numCores to 8 and use the following configuration parameter for the machine option:

"machine" : "[CORES:RV64G]"

This CORES option sets all 8 cores to RV64G. Similarly, if you seek to start all the cores at the same startAddr, you can use the same option as follows:

"startAddr" : "[CORES:0x00000000]"

Sample Execution

Executing one of the included sample tests can be performed as follows:

export REV_EXE=ex1.exe
sst rev-test-ex1.py

Adding Tests to the test suite

To add tests to the Rev test suite, edit test/CMakeLists.txt. By default, the tests look for the SST output string "Program Execution Complete" and have a maximum runtime of 30 seconds. Both of these values can be changed in the test/CMakeLists.txt file.

All tests should follow the existing directory structure and be added to test/<your_new_test_name>/

CTest will look in your newly created folder for a shell script. This script builds the RISC-V executable using the RISC-V compiler. See test/ext/run_ex1.sh for an example.

Contributing

We welcome outside contributions from corporate, academic and individual developers. However, there are a number of fundamental ground rules that you must adhere to in order to participate. These rules are outlined as follows:

  • By contributing to this code, one must agree to the licensing described in the top-level LICENSE file.
  • All code must adhere to the existing C++ coding style. While we are somewhat flexible in basic style, you will adhere to what is currently in place. This includes camel case C++ methods and inline comments. Uncommented, complicated algorithmic constructs will be rejected.
  • We support compilation and adherence to C++ standard methods. All new methods and variables contained within public, private and protected class methods must be commented using the existing Doxygen-style formatting. All new classes must also include Doxygen blocks in the new header files. Any pull requests that lack these features will be rejected.
  • All changes to functionality and the API infrastructure must be accompanied by complementary tests.
  • All external pull requests must target the devel branch. No external pull requests will be accepted to the master branch.
  • All external pull requests must contain sufficient documentation in the pull request comments in order to be accepted.

Extension Development

See the developer documentation.

License

See the LICENSE file

Authors

Acknowledgements

  • TBD

rev's People

Contributors

donofrio, jleidel, kpgriesser, leekillough, mgrzywac, rkabrick, sysarchbuild, ukasz, vcave


rev's Issues

tlbSize never initialized

tlbSize in RevMem.h/cc is never initialized or modified. I see the following in RevMem::AddToTLB

if (LRUQueue.size() == tlbSize) {
  uint64_t LRUvAddr = LRUQueue.back();
  LRUQueue.pop_back();
  TLB.erase(LRUvAddr);
}

However, tlbSize is never initialized to anything, so the outcome of this check depends on whatever value happens to be in memory and the results are not reproducible.

From a simple ori test, we see:

free(): invalid pointer
[node002:50392] *** Process received signal ***
[node002:50392] Signal: Aborted (6)
[node002:50392] Signal code:  (-6)
[node002:50392] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x12890)[0x155554a73890]
[node002:50392] [ 1] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0xc7)[0x15555372ae97]
[node002:50392] [ 2] /lib/x86_64-linux-gnu/libc.so.6(abort+0x141)[0x15555372c801]
[node002:50392] [ 3] /lib/x86_64-linux-gnu/libc.so.6(+0x89897)[0x155553775897]
[node002:50392] [ 4] /lib/x86_64-linux-gnu/libc.so.6(+0x9090a)[0x15555377c90a]
[node002:50392] [ 5] /lib/x86_64-linux-gnu/libc.so.6(cfree+0x4cc)[0x155553783e1c]
[node002:50392] [ 6] /scratch/jenkins/workspace/AGILE_General/Rev-devel-GCC11/src/librevcpu.so(_ZN9__gnu_cxx13new_allocatorISt10_List_nodeImEE10deallocateEPS2_m+0x37)[0x155541de3201]
[node002:50392] [ 7] /scratch/jenkins/workspace/AGILE_General/Rev-devel-GCC11/src/librevcpu.so(_ZNSt16allocator_traitsISaISt10_List_nodeImEEE10deallocateERS2_PS1_m+0x2b)[0x155541de21f2]
[node002:50392] [ 8] /scratch/jenkins/workspace/AGILE_General/Rev-devel-GCC11/src/librevcpu.so(_ZNSt7__cxx1110_List_baseImSaImEE11_M_put_nodeEPSt10_List_nodeImE+0x28)[0x155541de0e7a]
[node002:50392] [ 9] /scratch/jenkins/workspace/AGILE_General/Rev-devel-GCC11/src/librevcpu.so(_ZNSt7__cxx114listImSaImEE8_M_eraseESt14_List_iteratorImE+0x6f)[0x155541de0847]
[node002:50392] [10] /scratch/jenkins/workspace/AGILE_General/Rev-devel-GCC11/src/librevcpu.so(_ZNSt7__cxx114listImSaImEE8pop_backEv+0x36)[0x155541ddf9fe]
[node002:50392] [11] /scratch/jenkins/workspace/AGILE_General/Rev-devel-GCC11/src/librevcpu.so(_ZN3SST6RevCPU6RevMem8AddToTLBEmm+0x13e)[0x155541ddc5ba]
[node002:50392] [12] /scratch/jenkins/workspace/AGILE_General/Rev-devel-GCC11/src/librevcpu.so(_ZN3SST6RevCPU6RevMem12CalcPhysAddrEmm+0x1bf)[0x155541ddc80b]
[node002:50392] [13] /scratch/jenkins/workspace/AGILE_General/Rev-devel-GCC11/src/librevcpu.so(_ZN3SST6RevCPU6RevMem8WriteMemEmmPv+0xc9)[0x155541ddcccf]
[node002:50392] [14] /scratch/jenkins/workspace/AGILE_General/Rev-devel-GCC11/src/librevcpu.so(_ZN3SST6RevCPU9RevLoader14WriteCacheLineEmmPv+0xf4)[0x155541dd6d40]
[node002:50392] [15] /scratch/jenkins/workspace/AGILE_General/Rev-devel-GCC11/src/librevcpu.so(_ZN3SST6RevCPU9RevLoader9LoadElf64EPcm+0x172)[0x155541dd7a3e]
[node002:50392] [16] /scratch/jenkins/workspace/AGILE_General/Rev-devel-GCC11/src/librevcpu.so(_ZN3SST6RevCPU9RevLoader7LoadElfEv+0x364)[0x155541dd8aa2]
[node002:50392] [17] /scratch/jenkins/workspace/AGILE_General/Rev-devel-GCC11/src/librevcpu.so(_ZN3SST6RevCPU9RevLoaderC2ENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES7_PNS0_6RevMemEPNS_6OutputE+0xa6)[0x155541dd69ae]
[node002:50392] [18] /scratch/jenkins/workspace/AGILE_General/Rev-devel-GCC11/src/librevcpu.so(_ZN3SST6RevCPU6RevCPUC2EmRNS_6ParamsE+0x202a)[0x155541da7faa]
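
One plausible reading of the trace above is that an indeterminate tlbSize happened to compare equal to the size of an empty queue, so pop_back() ran on an empty std::list. The following is a minimal sketch of LRU bookkeeping with the capacity fixed at construction. SketchTLB and its members are hypothetical names for illustration; this is not the RevMem implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <list>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for TLB bookkeeping; names are illustrative only.
class SketchTLB {
public:
  // The capacity is set once here, so it is never read uninitialized.
  explicit SketchTLB(std::size_t capacity) : tlbSize(capacity) {
    assert(capacity > 0 && "TLB capacity must be non-zero");
  }

  // Insert or refresh a translation, evicting the least recently used
  // entry when the TLB is full.
  void add(std::uint64_t vAddr, std::uint64_t physAddr) {
    auto it = table.find(vAddr);
    if (it != table.end()) {
      // Already present: update the mapping and mark it most recent.
      it->second.first = physAddr;
      lru.splice(lru.begin(), lru, it->second.second);
      return;
    }
    if (lru.size() >= tlbSize) {
      // The queue is non-empty here because tlbSize > 0.
      std::uint64_t victim = lru.back();
      lru.pop_back();
      table.erase(victim);
    }
    lru.push_front(vAddr);
    table.emplace(vAddr, std::make_pair(physAddr, lru.begin()));
  }

  bool contains(std::uint64_t vAddr) const { return table.count(vAddr) != 0; }
  std::size_t size() const { return lru.size(); }

private:
  std::size_t tlbSize;
  std::list<std::uint64_t> lru;  // front = most recently used
  std::unordered_map<std::uint64_t,
      std::pair<std::uint64_t, std::list<std::uint64_t>::iterator>> table;
};
```

Using `>=` instead of `==` in the eviction check is a defensive choice in this sketch: the queue can never grow past the capacity even if an earlier step misbehaves.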

Infinite loop on std::sort

The following code, compiled with clang 16.0, never terminates when N is 24 or larger.

//clang++ -march=rv64imafd -o rand.exe rand.cpp
#include <algorithm>

//23 ok
//24 infinite loop
#define N 24

unsigned long long mem[N];

int main() {
        std::sort(mem, mem + N);
}

ELF executable attached: sort.zip

Memory test fails on 2 byte accesses

The following code fails on rev-devel (tested sha: cd753f7)
The same test works fine with 1,4,8 byte wide accesses.

#include <stdio.h>
#define u16 unsigned short
#define u8 unsigned char
#define assert(x) if (!(x)) { asm(".byte 0x00"); asm(".byte 0x00"); asm(".byte 0x00"); asm(".byte 0x00"); }

/*
 * Write 0 to a memory, hammer down ones around it,
 * check that it is still 0 at the end.
 */

#define hammer(type) int hammer_##type(type *addr, size_t size) { \
        int ret = 0; \
        type ff = (type)0xffffffffffff; \
        for (size_t i = 2; i < size - 2; i++) { \
                addr[i] = 0; \
                for (size_t j = 0; j < 10; j++) { \
                        addr[i - 1] = ff; \
                        addr[i - 2] = ff; \
                        addr[i + 1] = ff; \
                        addr[i + 2] = ff; \
                } \
                ret |= (addr[i] != 0);  \
        } \
        return ret; \
}

hammer(u16)

int test_3(void *addr, size_t size) {
        int ret = 0;

        ret |= hammer_u16(addr, size / 2);
        return ret;
}

/* Memory to test */
#define SIZE 10000
u8 mem[SIZE];

int main(){
        assert(!test_3(mem, SIZE));
}

Exit syscall should return exit code

Currently, calling the exit syscall ends the simulation with an error code of 255.
This is due to the implementation, which calls the fatal log with -1 as an argument:
output->fatal(CALL_INFO, -1, ExitString.c_str());

The proper way is the following:
output->fatal(CALL_INFO, status, ExitString.c_str());

With that in place, the return code of the test program can be checked from the shell environment.
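
The reported value of 255 is consistent with how POSIX shells expose exit statuses: only the low 8 bits are visible, so -1 shows up as 255. The sketch below illustrates that truncation, assuming the simulator process exits with the code passed to fatal, as the issue describes. shellVisibleStatus and checkExamples are hypothetical helpers, not part of Rev or SST.

```cpp
#include <cassert>
#include <cstdint>

// Value a POSIX shell would report in $? for a process exiting with
// `status`: only the low 8 bits survive.
inline int shellVisibleStatus(int status) {
  return static_cast<int>(static_cast<std::uint8_t>(status));
}

// Small self-check mirroring the observation in the issue.
inline void checkExamples() {
  assert(shellVisibleStatus(-1) == 255);  // the hard-coded -1 from the issue
  assert(shellVisibleStatus(3) == 3);     // a program calling exit(3)
}
```

The same truncation means a test program should keep its exit codes in the range 0-255 if a shell script is going to inspect them.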

Infinite loop on strlen

Compile with

riscv64-linux-gnu-gcc tc3.cpp -static -o tc3.exe

and run the following program:

#include <string.h>

int main() {
    const char* txt = "some really nice text";
    int l = strlen(txt);
    return 0;
}

or alternatively:

int main() {
    const char* txt = "some really nice text";

    int i = 0;
    while (*txt++ != '\0')
        i++;
    return 0;
}

Rev enters an infinite loop and never finishes execution.

Infinite instruction execution in syscalls/fork test

  • Code example from: syscalls/fork test
  • Built with RISC-V gcc version: gcc version 11.3.0 (Ubuntu 11.3.0-1ubuntu1~22.04.1)
  • Run command in test dir: sst rev-test.py

Execution is stuck on the mv instruction after the ecall to fork:

10efc >> <main>:        fa010113                addi    sp,sp,-96                                                             
10f00 >> <main>:        04113c23                sd      ra,88(sp)                                                             
10f04 >> <main>:        04813823                sd      s0,80(sp)                                                             
10f08 >> <main>:        06010413                addi    s0,sp,96                                                              
10f0c >> <main>:        00067797                auipc   a5,0x67
10f10 >> <main>:        0c47b783                ld      a5,196(a5) # 77fd0 <_GLOBAL_OFFSET_TABLE_+0x60>                       
10f14 >> <main>:        0007b703                ld      a4,0(a5)                                                              
10f18 >> <main>:        fee43423                sd      a4,-24(s0)                                                            
10f1c >> <main>:        00000713                li      a4,0                                                                  
10f20 >> <main>:        e4dff0ef                jal     ra,10d6c <rev_fork>                                                   
10d6c >>   <rev_fork>:  fe010113                addi    sp,sp,-32                                                             
10d70 >>   <rev_fork>:  00813c23                sd      s0,24(sp)                                                             
10d74 >>   <rev_fork>:  02010413                addi    s0,sp,32                                                              
10d78 >>   <rev_fork>:  0dc00893                li      a7,220                                                                
10d7c >>   <rev_fork>:  00000073                ecall                                                                         
10d80 >>   <rev_fork>:  00050793                mv      a5,a0                                                                 
10d80 >>   <rev_fork>:  00050793                mv      a5,a0                                                                 
RevCPU[cpu:clockTick:17000]: Cycle: 17
RevCPU[cpu:DecodeInst:17000]: Core 0 ; Thread 0; PC:InstPayload = 0x10d7c:0x73
RevCPU[cpu:ClockTick:17000]: Core 0 ; Thread 0; Executing PC= 0x10d7c
RevCPU[cpu:ClockTick:17000]: Core 0; HartID 0; PID 1024 - Exception Raised: ECALL with code = 220
RevCPU[cpu:ClockTick:17000]: Core 0 ; ThreadID 0; Retiring PC= 0x10d7c
RevCPU[cpu:clockTick:18000]: Cycle: 18
RevCPU[cpu:DecodeInst:18000]: Core 0 ; Thread 0; PC:InstPayload = 0x10d80:0x50793
RevCPU[cpu:ClockTick:18000]: Core 0 ; Thread 0; Executing PC= 0x10d80
RevCPU[cpu:ClockTick:18000]: Core 0 ; ThreadID 0; Retiring PC= 0x10d80
RevCPU[cpu:clockTick:19000]: Cycle: 19
RevCPU[cpu:DecodeInst:19000]: Core 0 ; Thread 0; PC:InstPayload = 0x10d80:0x50793
RevCPU[cpu:ClockTick:19000]: Core 0 ; Thread 0; Executing PC= 0x10d80
RevCPU[cpu:ClockTick:19000]: Core 0 ; ThreadID 0; Retiring PC= 0x10d80
RevCPU[cpu:clockTick:20000]: Cycle: 20
RevCPU[cpu:DecodeInst:20000]: Core 0 ; Thread 0; PC:InstPayload = 0x10d80:0x50793
RevCPU[cpu:ClockTick:20000]: Core 0 ; Thread 0; Executing PC= 0x10d80
RevCPU[cpu:ClockTick:20000]: Core 0 ; ThreadID 0; Retiring PC= 0x10d80
RevCPU[cpu:clockTick:21000]: Cycle: 21
RevCPU[cpu:DecodeInst:21000]: Core 0 ; Thread 0; PC:InstPayload = 0x10d80:0x50793
RevCPU[cpu:ClockTick:21000]: Core 0 ; Thread 0; Executing PC= 0x10d80
RevCPU[cpu:ClockTick:21000]: Core 0 ; ThreadID 0; Retiring PC= 0x10d80
RevCPU[cpu:clockTick:22000]: Cycle: 22
RevCPU[cpu:DecodeInst:22000]: Core 0 ; Thread 0; PC:InstPayload = 0x10d80:0x50793
RevCPU[cpu:ClockTick:22000]: Core 0 ; Thread 0; Executing PC= 0x10d80
RevCPU[cpu:ClockTick:22000]: Core 0 ; ThreadID 0; Retiring PC= 0x10d80

Infinite loop on element remove from list

Below is the call from main() and the first iterations in list_pop():

8000308c >> <main>:     00810513                add     a0,sp,8
80003090 >> <main>:     88cff0ef                jal     8000211c <towers_clear>
8000211c >> <towers_clear>:     00852703                lw      a4,8(a0)
80002120 >> <towers_clear>:     04e05063                blez    a4,80002160 <towers_clear+0x44>
80002124 >> <towers_clear>:     9a818813                add     a6,gp,-1624 # 80003270 <g_nodeFreeList>
80002128 >> <towers_clear>:     01053783                ld      a5,16(a0)
8000212c >> <towers_clear>:     00883683                ld      a3,8(a6)
80002130 >> <towers_clear>:     00c0006f                j       8000213c <towers_clear+0x20>
8000213c >> <towers_clear>:     0087b603                ld      a2,8(a5)
80002140 >> <towers_clear>:     fff7071b                addw    a4,a4,-1
80002144 >> <towers_clear>:     00d7b423                sd      a3,8(a5)
80002148 >> <towers_clear>:     00078593                mv      a1,a5
8000214c >> <towers_clear>:     fe0714e3                bnez    a4,80002134 <towers_clear+0x18>
80002134 >> <towers_clear>:     00060793                mv      a5,a2
80002138 >> <towers_clear>:     00058693                mv      a3,a1
8000213c >> <towers_clear>:     0087b603                ld      a2,8(a5)
80002140 >> <towers_clear>:     fff7071b                addw    a4,a4,-1
80002144 >> <towers_clear>:     00d7b423                sd      a3,8(a5)
80002148 >> <towers_clear>:     00078593                mv      a1,a5
8000214c >> <towers_clear>:     fe0714e3                bnez    a4,80002134 <towers_clear+0x18>
80002134 >> <towers_clear>:     00060793                mv      a5,a2
80002138 >> <towers_clear>:     00058693                mv      a3,a1
8000213c >> <towers_clear>:     0087b603                ld      a2,8(a5)
80002140 >> <towers_clear>:     fff7071b                addw    a4,a4,-1
80002144 >> <towers_clear>:     00d7b423                sd      a3,8(a5)
80002148 >> <towers_clear>:     00078593                mv      a1,a5
8000214c >> <towers_clear>:     fe0714e3                bnez    a4,80002134 <towers_clear+0x18>
80002134 >> <towers_clear>:     00060793                mv      a5,a2
80002138 >> <towers_clear>:     00058693                mv      a3,a1
8000213c >> <towers_clear>:     0087b603                ld      a2,8(a5)
80002140 >> <towers_clear>:     fff7071b                addw    a4,a4,-1
80002144 >> <towers_clear>:     00d7b423                sd      a3,8(a5)
80002148 >> <towers_clear>:     00078593                mv      a1,a5
8000214c >> <towers_clear>:     fe0714e3                bnez    a4,80002134 <towers_clear+0x18>
80002134 >> <towers_clear>:     00060793                mv      a5,a2
80002138 >> <towers_clear>:     00058693                mv      a3,a1
8000213c >> <towers_clear>:     0087b603                ld      a2,8(a5)
80002140 >> <towers_clear>:     fff7071b                addw    a4,a4,-1
80002144 >> <towers_clear>:     00d7b423                sd      a3,8(a5)
80002148 >> <towers_clear>:     00078593                mv      a1,a5
8000214c >> <towers_clear>:     fe0714e3                bnez    a4,80002134 <towers_clear+0x18>
80002134 >> <towers_clear>:     00060793                mv      a5,a2
80002138 >> <towers_clear>:     00058693                mv      a3,a1
8000213c >> <towers_clear>:     0087b603                ld      a2,8(a5)
80002140 >> <towers_clear>:     fff7071b                addw    a4,a4,-1
80002144 >> <towers_clear>:     00d7b423                sd      a3,8(a5)
80002148 >> <towers_clear>:     00078593                mv      a1,a5
8000214c >> <towers_clear>:     fe0714e3                bnez    a4,80002134 <towers_clear+0x18>
80002134 >> <towers_clear>:     00060793                mv      a5,a2
80002138 >> <towers_clear>:     00058693                mv      a3,a1
8000213c >> <towers_clear>:     0087b603                ld      a2,8(a5)
80002140 >> <towers_clear>:     fff7071b                addw    a4,a4,-1
80002144 >> <towers_clear>:     00d7b423                sd      a3,8(a5)
80002148 >> <towers_clear>:     00078593                mv      a1,a5
8000214c >> <towers_clear>:     fe0714e3                bnez    a4,80002134 <towers_clear+0x18>

This is the infinite loop:

80002134 >> <towers_clear>:     00060793                mv      a5,a2
80002138 >> <towers_clear>:     00058693                mv      a3,a1
8000213c >> <towers_clear>:     0087b603                ld      a2,8(a5)
80002140 >> <towers_clear>:     fff7071b                addw    a4,a4,-1
80002144 >> <towers_clear>:     00d7b423                sd      a3,8(a5)
80002148 >> <towers_clear>:     00078593                mv      a1,a5
8000214c >> <towers_clear>:     fe0714e3                bnez    a4,80002134 <towers_clear+0x18>

Address not mapped 0xd8 regression

Since 9e46d73, every test fails with an unmapped address error (LLVM):

root@uk:/home/dev/riscv/off/rev/test/ex2# REV_EXE=ex2.exe sst ../wk2/config.py
WARNING: Building component "cpu" with no links assigned.
RevCPU[cpu:RevCPU:0]: Building Rev with 1 cores
[uk:60506] *** Process received signal ***
[uk:60506] Signal: Segmentation fault (11)
[uk:60506] Signal code: Address not mapped (1)
[uk:60506] Failing at address: 0xd8
[uk:60506] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7efe6a11c520]
[uk:60506] [ 1] /home/dev/riscv/off/frameworks.simulators.agile.rev/src/librevcpu.so(_ZN3SST6RevCPU9RevLoader14WriteCacheLineEmmPv+0x3b)[0x7efe638e1e1b]
[uk:60506] [ 2] /home/dev/riscv/off/frameworks.simulators.agile.rev/src/librevcpu.so(_ZN3SST6RevCPU9RevLoader9LoadElf64EPcm+0xff)[0x7efe638e319f]
[uk:60506] [ 3] /home/dev/riscv/off/frameworks.simulators.agile.rev/src/librevcpu.so(_ZN3SST6RevCPU9RevLoader7LoadElfEv+0x1bd)[0x7efe638e39ad]
[uk:60506] [ 4] /home/dev/riscv/off/frameworks.simulators.agile.rev/src/librevcpu.so(_ZN3SST6RevCPU9RevLoaderC2ENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES7_PNS0_6RevMemEPNS_6OutputE+0xc4)[0x7efe638e3c54]
[uk:60506] [ 5] /home/dev/riscv/off/frameworks.simulators.agile.rev/src/librevcpu.so(_ZN3SST6RevCPU6RevCPUC1EmRNS_6ParamsE+0x125c)[0x7efe638c93bc]
[uk:60506] [ 6] /home/dev/riscv/off/frameworks.simulators.agile.rev/src/librevcpu.so(_ZN3SST3ELI14DerivedBuilderINS_6RevCPU6RevCPUENS_9ComponentEJmRNS_6ParamsEEE6createEmS6_+0x2a)[0x7efe638d8fea]
[uk:60506] [ 7] sst(_ZN3SST7Factory15CreateComponentEmRKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERNS_6ParamsE+0x403)[0x5632bbb7bbb3]
[uk:60506] [ 8] sst(_ZN3SST15Simulation_impl13performWireUpERNS_11ConfigGraphERKNS_8RankInfoEm+0x8e)[0x5632bbbbbe0e]
[uk:60506] [ 9] sst(+0xea62e)[0x5632bbb1062e]
[uk:60506] [10] sst(main+0x18c3)[0x5632bbaf7de3]
[uk:60506] [11] /lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7efe6a103d90]
[uk:60506] [12] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7efe6a103e40]
[uk:60506] [13] sst(_start+0x25)[0x5632bbb0dda5]
[uk:60506] *** End of error message ***
Segmentation fault (core dumped)

Errors when setting up multiple cores

When setting up multiple cores, the parameters for machine models, start address, instruction table and memory cost cannot be parsed correctly. The InitXXX functions in RevOpts.cc appear to be at fault: they reuse a single std::vector<std::string> vstr; across loop iterations. Moving that declaration inside the for loop solves the issue.
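
The suggested fix can be sketched as follows. This is an illustrative parser for two-field options such as "[0:RV32I],[1:RV64G]" or "[CORES:RV64G]", not the code in RevOpts.cc; splitOn and parsePerCore are hypothetical names, and three-field options such as memCost would need a different field count.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Split `s` on `delim`; a fresh vector is returned on every call.
inline std::vector<std::string> splitOn(const std::string& s, char delim) {
  std::vector<std::string> out;
  std::stringstream ss(s);
  std::string tok;
  while (std::getline(ss, tok, delim)) out.push_back(tok);
  return out;
}

// Hypothetical parser returning one value per core.
inline std::vector<std::string> parsePerCore(const std::string& opt,
                                             std::size_t numCores) {
  assert(numCores > 0);
  std::vector<std::string> perCore(numCores);
  for (const std::string& entry : splitOn(opt, ',')) {
    if (entry.size() < 2 || entry.front() != '[' || entry.back() != ']')
      throw std::invalid_argument("malformed entry: " + entry);
    // Declared inside the loop so tokens from one entry never leak
    // into the next one.
    std::vector<std::string> vstr =
        splitOn(entry.substr(1, entry.size() - 2), ':');
    if (vstr.size() != 2)
      throw std::invalid_argument("expected [Core:Value]: " + entry);
    if (vstr[0] == "CORES") {
      // Homogeneous form: apply the value to every core.
      for (std::string& v : perCore) v = vstr[1];
    } else {
      std::size_t core = std::stoul(vstr[0]);
      if (core >= numCores)
        throw std::out_of_range("core index out of range: " + entry);
      perCore[core] = vstr[1];
    }
  }
  return perCore;
}
```

With a vector shared across iterations and filled by appending, the second entry would still see the first entry's tokens at indices 0 and 1, which is one way the reported symptom could arise.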

[P2] Main argument passing does not work

Argument passing to the program is not working. Here is an example:


#define assert(x) if (!(x)) { asm(".dword 0x00000000"); }

int main(int argc, char *argv[]) {
        /*
         * SST logs the following:
         * RevCPU[cpu:LoadProgramArgs:0]: Loading program argv[0] = args.exe
         * RevCPU[cpu:LoadProgramArgs:0]: Loading program argv[1] = x
         * RevCPU[cpu:LoadProgramArgs:0]: Loading program argv[2] = 2
         */
        assert(argc == 3);
        assert(argv[0][0] == 'a');
        assert(argv[0][1] == 'r');
        assert(argv[0][2] == 'g');
        assert(argv[0][3] == 's');
        assert(argv[0][4] == '.');
        assert(argv[0][5] == 'e');
        assert(argv[0][6] == 'x');
        assert(argv[0][7] == 'e');
        assert(argv[0][8] == 0);

        assert(argv[1][0] == 'x');
        assert(argv[1][1] == 0);
        assert(argv[2][0] == '2');
        assert(argv[2][1] == 0);
        return 0;
}

argc is actually 0, and argv[0][0] is 0 as well.

FPU ops issues

We see basic problems with floating point operations.
To give an example, the following test fails on devel branch:

int main() {
        float z = 1;
        assert(z > 0);
}

Calling unimplemented syscall should fail

The result of the syscall table's at() function is not checked. unordered_map::at() throws std::out_of_range when the key (syscall number) is not present. Execution of an unimplemented syscall should fail immediately with an error message, so that users do not end up debugging code that misbehaved only because the syscall did nothing.
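
A minimal sketch of a checked lookup, assuming a table keyed by syscall number. SyscallHandler and dispatchSyscall are hypothetical names, and the exception stands in for whatever fatal-error path Rev would actually use.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <stdexcept>
#include <string>
#include <unordered_map>

// Hypothetical handler signature for illustration.
using SyscallHandler = std::function<int()>;

// Look up and run a handler, failing loudly when the syscall number
// has no entry in the table.
inline int dispatchSyscall(
    const std::unordered_map<std::uint32_t, SyscallHandler>& table,
    std::uint32_t number) {
  auto it = table.find(number);
  if (it == table.end()) {
    throw std::runtime_error("unimplemented syscall " +
                             std::to_string(number));
  }
  assert(it->second && "registered handlers must be callable");
  return it->second();
}
```

Using find() instead of at() keeps the failure message under the caller's control and includes the offending syscall number.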

AUIPC simulation bug

The auipc instruction for RV64 generates an incorrect address when it encounters a negative immediate. The cause is that the immediate is not sign-extended after the bit shift.

To reproduce the issue, compile the following C program with riscv64-unknown-elf-gcc -march=rv64imafd. Since we have some issues with decoding the compressed instruction, the test is compiled without the compressed option.

#include <stdio.h>
int main( int argc, char **argv ){
  printf( "Test Print...\n" );
}

Here is a snippet of the simulation output:
[image: simulation output]

In the disassembled object file, we can see that the correct jump address should be 0x10fe0 instead of 0x100010fe0.

   1297c:	00050913          	mv	s2,a0
   12980:	00048513          	mv	a0,s1
   12984:	ffffe097          	auipc	ra,0xffffe
   12988:	65c080e7          	jalr	1628(ra) # 10fe0 <_malloc_r>
   1298c:	02051063          	bnez	a0,129ac <__smakebuf_r+0x98>
   12990:	01041783          	lh	a5,16(s0)

Sign-extending the bit shifted immediate will solve the issue:

static bool auipc(RevFeature *F, RevRegFile *R, RevMem *M, RevXbgas *Xbgas, RevInst Inst) {
  uint64_t tmp;
  if( F->IsRV32() ){
    R->RV32[Inst.rd] = 0x00;
    R->RV32[Inst.rd] = (Inst.imm << 12) + dt_u32(R->RV32_PC,32);
    R->RV32_PC += Inst.instSize;
  }else{
    SEXT(tmp, Inst.imm << 12, 32);
    R->RV64[Inst.rd] = 0x00;
    R->RV64[Inst.rd] = tmp + R->RV64_PC;
    R->RV64_PC += Inst.instSize;
  }
  return true;
}
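
The arithmetic can be checked in isolation. The sketch below is a standalone illustration, not Rev code: auipcRV64 and auipcRV64NoSext are hypothetical helpers that compute the AUIPC result with and without sign extension, using the values from the disassembly above (PC 0x12984, immediate 0xffffe, followed by jalr with offset 1628).

```cpp
#include <cassert>
#include <cstdint>

// AUIPC on RV64: pc + sign-extended (imm20 << 12).
inline std::uint64_t auipcRV64(std::uint64_t pc, std::uint32_t imm20) {
  assert(imm20 <= 0xFFFFFu && "AUIPC carries a 20-bit immediate");
  // Build the 32-bit offset, reinterpret it as signed, then widen.
  // (The unsigned-to-signed cast is modular on mainstream compilers
  // and guaranteed so from C++20.)
  const std::uint32_t offset32 = imm20 << 12;
  const std::int64_t offset = static_cast<std::int32_t>(offset32);
  return pc + static_cast<std::uint64_t>(offset);
}

// Same computation without the sign extension, for comparison.
inline std::uint64_t auipcRV64NoSext(std::uint64_t pc, std::uint32_t imm20) {
  return pc + (static_cast<std::uint64_t>(imm20) << 12);
}
```

With these inputs, auipcRV64(0x12984, 0xffffe) + 1628 gives 0x10fe0, while the variant without sign extension gives 0x100010fe0, matching the two addresses discussed in the issue.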

broken bge/ble instructions

The following C code fails because it reads data out of bounds when a conditional branch is not taken.
The first two iterations are fine, but the loop does not stop after them, and the third iteration triggers the assert by comparing garbage from the stack.

#define assert(x)                                                              \
  if (!(x)) {                                                                  \
    asm(".byte 0x00");                                                         \
    asm(".byte 0x00");                                                         \
    asm(".byte 0x00");                                                         \
    asm(".byte 0x00");                                                         \
  }

struct FE{
        int ec;
};

int x[] = {0, 1};
int Forums_ = 2;

int main() {
        struct FE forums[2];
        forums[0].ec = 0;
        forums[1].ec = 1;

        for (int i = 0; i < Forums_; i++) {
                assert(x[i] == forums[i].ec);
        }
        return 0;
}

Compiled as:

$ clang -march=rv64imafd tmp.c -O2 -o tmp.exe

The execution log shows that the branch at the bge instruction is not taken when the a1 and a2 registers are equal.

00000000000111c0 <main>:
   111c0: 13 01 01 ff   addi    sp, sp, -16
   111c4: 37 35 01 00   lui     a0, 19
   111c8: 83 25 45 f2   lw      a1, -220(a0)
   111cc: 13 06 10 00   li      a2, 1
   111d0: 13 16 06 02   slli    a2, a2, 32
   111d4: 23 34 c1 00   sd      a2, 8(sp)
   111d8: 63 50 b0 04   blez    a1, 0x11218 <main+0x58>
   111dc: 13 06 00 00   li      a2, 0
   111e0: b7 36 01 00   lui     a3, 19
   111e4: 93 86 c6 f1   addi    a3, a3, -228
   111e8: 13 07 81 00   addi    a4, sp, 8
   111ec: 6f 00 40 01   j       0x11200 <main+0x40>
   111f0: 13 06 16 00   addi    a2, a2, 1
   111f4: 13 07 47 00   addi    a4, a4, 4
   111f8: 93 86 46 00   addi    a3, a3, 4
   111fc: 63 5e b6 00   bge     a2, a1, 0x11218 <main+0x58>
   11200: 83 a7 06 00   lw      a5, 0(a3)
   11204: 03 28 07 00   lw      a6, 0(a4)
   11208: e3 84 07 ff   beq     a5, a6, 0x111f0 <main+0x30>
   1120c: 00 00         <unknown>
   1120e: 00 00         <unknown>
   11210: 83 25 45 f2   lw      a1, -220(a0)
   11214: 6f f0 df fd   j       0x111f0 <main+0x30>
   11218: 13 05 00 00   li      a0, 0
   1121c: 13 01 01 01   addi    sp, sp, 16
   11220: 67 80 00 00   ret

And the offending part is:


RevCPU[cpu:DecodeInst:29000]: Core 0 ; Thread 0; PC:InstPayload = 0x111fc:0xb65e63
RevCPU[cpu:ClockTick:29000]: Core 0 ; Thread 0; Executing PC= 0x111fc
RevCPU[cpu:ClockTick:29000]: Core 0 ; ThreadID 0; Retiring PC= 0x111fc
RevCPU[cpu:ClockTick:29000]: Core 0 REG[0] 0x0
RevCPU[cpu:ClockTick:29000]: Core 0 REG[1] 0x0
RevCPU[cpu:ClockTick:29000]: Core 0 REG[2] 0x7e7ffed0
RevCPU[cpu:ClockTick:29000]: Core 0 REG[3] 0x1371c
RevCPU[cpu:ClockTick:29000]: Core 0 REG[4] 0x0
RevCPU[cpu:ClockTick:29000]: Core 0 REG[5] 0x0
RevCPU[cpu:ClockTick:29000]: Core 0 REG[6] 0x0
RevCPU[cpu:ClockTick:29000]: Core 0 REG[7] 0x0
RevCPU[cpu:ClockTick:29000]: Core 0 REG[8] 0x1371c
RevCPU[cpu:ClockTick:29000]: Core 0 REG[9] 0x0
RevCPU[cpu:ClockTick:29000]: Core 0 REG[10] 0x13000
**RevCPU[cpu:ClockTick:29000]: Core 0 REG[11] 0x2
RevCPU[cpu:ClockTick:29000]: Core 0 REG[12] 0x2**
RevCPU[cpu:ClockTick:29000]: Core 0 REG[13] 0x12f24
RevCPU[cpu:ClockTick:29000]: Core 0 REG[14] 0x7e7ffee0
RevCPU[cpu:ClockTick:29000]: Core 0 REG[15] 0x1
RevCPU[cpu:ClockTick:29000]: Core 0 REG[16] 0x1
RevCPU[cpu:ClockTick:29000]: Core 0 REG[17] 0x0
RevCPU[cpu:ClockTick:29000]: Core 0 REG[18] 0x0
RevCPU[cpu:ClockTick:29000]: Core 0 REG[19] 0x0
RevCPU[cpu:ClockTick:29000]: Core 0 REG[20] 0x0
RevCPU[cpu:ClockTick:29000]: Core 0 REG[21] 0x0
RevCPU[cpu:ClockTick:29000]: Core 0 REG[22] 0x0
RevCPU[cpu:ClockTick:29000]: Core 0 REG[23] 0x0
RevCPU[cpu:ClockTick:29000]: Core 0 REG[24] 0x0
RevCPU[cpu:ClockTick:29000]: Core 0 REG[25] 0x0
RevCPU[cpu:ClockTick:29000]: Core 0 REG[26] 0x0
RevCPU[cpu:ClockTick:29000]: Core 0 REG[27] 0x0
RevCPU[cpu:ClockTick:29000]: Core 0 REG[28] 0x0
RevCPU[cpu:ClockTick:29000]: Core 0 REG[29] 0x0
RevCPU[cpu:ClockTick:29000]: Core 0 REG[30] 0x0
RevCPU[cpu:ClockTick:29000]: Core 0 REG[31] 0x0
RevCPU[cpu:clockTick:30000]: Cycle: 30
RevCPU[cpu:DecodeInst:30000]: Core 0 ; Thread 0; PC:InstPayload = **0x11200**:0x6a783
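
For reference, the expected outcome of the highlighted branch can be checked with a small host-side model. The sketch below is not Rev code; the helper names (`decode_b_type`, `next_pc`) are made up for illustration. It decodes the payload 0xb65e63 from the log as a B-type instruction and applies the signed greater-or-equal comparison that the RISC-V specification defines for bge.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def decode_b_type(inst):
    """Split a 32-bit B-type (branch) encoding into its fields."""
    imm = (
        ((inst >> 31) & 0x1) << 12    # imm[12]
        | ((inst >> 7) & 0x1) << 11   # imm[11]
        | ((inst >> 25) & 0x3F) << 5  # imm[10:5]
        | ((inst >> 8) & 0xF) << 1    # imm[4:1]
    )
    return {
        "opcode": inst & 0x7F,
        "funct3": (inst >> 12) & 0x7,
        "rs1": (inst >> 15) & 0x1F,
        "rs2": (inst >> 20) & 0x1F,
        "imm": sign_extend(imm, 13),
    }


def next_pc(pc, inst, regs):
    """Next PC for bge (funct3 = 0b101): taken when rs1 >= rs2, signed."""
    f = decode_b_type(inst)
    assert f["opcode"] == 0x63 and f["funct3"] == 0b101, "not a bge"
    taken = sign_extend(regs[f["rs1"]], 64) >= sign_extend(regs[f["rs2"]], 64)
    return pc + f["imm"] if taken else pc + 4


regs = [0] * 32
regs[11] = 0x2  # a1, value from the register dump
regs[12] = 0x2  # a2, value from the register dump
print(hex(next_pc(0x111FC, 0x00B65E63, regs)))
```

With a1 = a2 = 2 the model selects the branch target 0x11218, whereas the log shows the next decode at the fall-through address 0x11200.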

[P2] mret instruction corrupts x0 register

Follow steps from #64
See also: #72

At PC=0x80000190 REV executes an mret instruction, which leads to corruption of x0.
The problematic instruction is:
73 00 20 30 mret
It gets decoded as lui, and the value 0x30200000 is written into x0 (it seems 2 bytes of the mret encoding were used to overwrite x0).

RevCPU[cpu:ClockTick:77000]: Core 0 ; ThreadID 0; Retiring PC= 0x8000018c
RevCPU[cpu:clockTick:78000]: Cycle: 78
RevCPU[cpu:DecodeInst:78000]: Core 0 ; Thread 0; PC:InstPayload = 0x80000190:0x30200073
RevCPU[cpu:ClockTick:78000]: Core 0 ; Thread 0; Executing PC= 0x80000190
--------------- lui 32
RevCPU[cpu:ClockTick:78000]: Core 0 ; ThreadID 0; Retiring PC= 0x80000190
RevCPU[cpu:ClockTick:78000]: Core 0 REG[0] 0x0 -> 0x30200000
RevCPU[cpu:clockTick:79000]: Cycle: 79
RevCPU[cpu:DecodeInst:79000]: Core 0 ; Thread 0; PC:InstPayload = 0x80000194:0x200193
RevCPU[cpu:ClockTick:79000]: Core 0 ; Thread 0; Executing PC= 0x80000194

Now, while I understand that REV might not support the mret instruction, it should not corrupt any other data.
I think that when REV encounters an unsupported instruction it should:

  • write this to the log file, with some details
  • not corrupt code/data
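
The two points above can be sketched as follows. This is a hypothetical host-side model, not Rev's decoder: `SUPPORTED_OPCODES`, `execute` and the log format are assumptions made for illustration. An instruction whose major opcode is not recognized is reported and leaves the register file untouched.

```python
LUI_OPCODE = 0x37     # major opcode of lui
SYSTEM_OPCODE = 0x73  # major opcode of mret, ecall, csr*

# Hypothetical: this toy model only implements lui.
SUPPORTED_OPCODES = {LUI_OPCODE}


def execute(pc, inst, regs, log):
    """Execute one instruction; unsupported ones are logged, state is kept."""
    opcode = inst & 0x7F
    if opcode not in SUPPORTED_OPCODES:
        log.append(f"PC=0x{pc:x}: unsupported instruction 0x{inst:08x} "
                   f"(opcode 0x{opcode:02x})")
        return
    rd = (inst >> 7) & 0x1F
    if rd != 0:  # x0 is hardwired to zero, writes are discarded
        # lui: upper 20 bits, sign-extended from 32 to 64 bits
        value = inst & 0xFFFFF000
        regs[rd] = value | (0xFFFFFFFF00000000 if value & 0x80000000 else 0)


regs, log = [0] * 32, []
execute(0x80000190, 0x30200073, regs, log)  # mret payload from the log
print(log[0])
```

Reading the same word as lui would give rd = 0 and an immediate of 0x30200000, which matches the value the log shows being written to REG[0].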

I debugged this on the mret instruction, but I sometimes see issues with ret/jal jumping to a wrong address, which might be related.
full_run.txt

See full execution log in attachment

[P2] strstr fail

We see the following strstr test failing on the devel branch (tested sha: cd753f7)

#include <stdio.h>
#include <string.h>
#define assert(x)                                                              \
  if (!(x)) {                                                                  \
    asm(".byte 0x00");                                                         \
    asm(".byte 0x00");                                                         \
    asm(".byte 0x00");                                                         \
    asm(".byte 0x00");                                                         \
  }

const char * const text = "Hello I am a normal text with some pattern hidden inside.";

int main() {
        assert(strstr(text, "pattern") == text + 35);
        return 0;
}
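
As a quick host-side sanity check of the expected offset (not a model of the failure itself), the position of "pattern" in the test string can be computed directly:

```python
text = "Hello I am a normal text with some pattern hidden inside."

# str.find returns the index of the first match, the host-side analogue
# of strstr(text, "pattern") - text in the C test.
offset = text.find("pattern")
print(offset)
```

The offset is 35, so the assertion in the test is valid and the failure points at the simulated strstr rather than at the test.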

Infinite loop on divw instruction

Compile and run program:

int main() {
        for (int i=0;i<2;i++)
        {
                int zz = (i+1)/i;
        }
        return 0;
}

Execution enters infinite loop on divw instruction:

628 >> <main>:  fe010113                addi    sp,sp,-32
62c >> <main>:  00813c23                sd      s0,24(sp)
630 >> <main>:  02010413                addi    s0,sp,32
634 >> <main>:  fe042423                sw      zero,-24(s0)
638 >> <main>:  0280006f                j       660 <main+0x38>
660 >> <main>:  fe842783                lw      a5,-24(s0)
664 >> <main>:  0007871b                sext.w  a4,a5
668 >> <main>:  00100793                li      a5,1
66c >> <main>:  fce7d8e3                bge     a5,a4,63c <main+0x14>
63c >> <main>:  fe842783                lw      a5,-24(s0)
640 >> <main>:  0017879b                addiw   a5,a5,1
644 >> <main>:  0007879b                sext.w  a5,a5
648 >> <main>:  fe842703                lw      a4,-24(s0)
64c >> <main>:  02e7c7bb                divw    a5,a5,a4
64c >> <main>:  02e7c7bb                divw    a5,a5,a4
64c >> <main>:  02e7c7bb                divw    a5,a5,a4
64c >> <main>:  02e7c7bb                divw    a5,a5,a4
64c >> <main>:  02e7c7bb                divw    a5,a5,a4
64c >> <main>:  02e7c7bb                divw    a5,a5,a4
64c >> <main>:  02e7c7bb                divw    a5,a5,a4
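
On the first iteration i is 0, so the program executes divw with a zero divisor. The RISC-V M extension does not trap in that case: the result is defined to be all ones (-1), and the signed-overflow case returns the dividend. The sketch below models that behaviour on the host; the function name `divw` and the way registers are passed as Python integers are assumptions for illustration, not Rev's implementation.

```python
def to_int32(value):
    """Interpret the low 32 bits of value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def divw(rs1, rs2):
    """Signed 32-bit division as specified for RISC-V divw.

    Returns the signed result; hardware sign-extends it to 64 bits.
    """
    a, b = to_int32(rs1), to_int32(rs2)
    if b == 0:
        return -1                      # division by zero: all bits set
    if a == -(1 << 31) and b == -1:
        return a                       # signed overflow: dividend is returned
    q = abs(a) // abs(b)               # RISC-V rounds toward zero
    return -q if (a < 0) != (b < 0) else q


print(divw(1, 0))   # i = 0: (i + 1) / i
print(divw(2, 1))   # i = 1: (i + 1) / i
```

In both cases the instruction completes with a defined result, so it should retire once instead of repeating at PC 0x64c.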

Segfault on ELF loading

REV segfaults when trying to load an ELF with a large (11 MB+) global array in .bss.
Code to reproduce:
char t[1024*1024*12]; int main() { return 0; }

Trace:
[user] *** Process received signal ***
[user] Signal: Segmentation fault (11)
[user] Signal code: Invalid permissions (2)
[user] Failing at address: 0x7f477689f000
[user] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7f477d201520]
[user] [ 1] /home/dev/riscv/off/frameworks.simulators.agile.rev/test/math/frameworks.simulators.agile.rev/src/librevcpu.so(_ZN3SST6RevCPU6RevMem8WriteMemEmmPv+0x2b6)[0x7f47769440f6]
[user] [ 2] /home/dev/riscv/off/frameworks.simulators.agile.rev/test/math/frameworks.simulators.agile.rev/src/librevcpu.so(_ZN3SST6RevCPU9RevLoader9LoadElf64EPcm+0x1d9)[0x7f477693eea9]
[user] [ 3] /home/dev/riscv/off/frameworks.simulators.agile.rev/test/math/frameworks.simulators.agile.rev/src/librevcpu.so(_ZN3SST6RevCPU9RevLoader7LoadElfEv+0x1bd)[0x7f477693f5fd]
[user] [ 4] /home/dev/riscv/off/frameworks.simulators.agile.rev/test/math/frameworks.simulators.agile.rev/src/librevcpu.so(_ZN3SST6RevCPU9RevLoaderC2ENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES7_PNS0_6RevMemEPNS_6OutputE+0xc4)[0x7f477693f8a4]
[user] [ 5] /home/dev/riscv/off/frameworks.simulators.agile.rev/test/math/frameworks.simulators.agile.rev/src/librevcpu.so(_ZN3SST6RevCPU6RevCPUC1EmRNS_6ParamsE+0x11a2)[0x7f4776924f02]
[user] [ 6] /home/dev/riscv/off/frameworks.simulators.agile.rev/test/math/frameworks.simulators.agile.rev/src/librevcpu.so(ZN3SST3ELI14DerivedBuilderINS_6RevCPU6RevCPUENS_9ComponentEJmRNS_6ParamsEEE6createEmS6+0x2a)[0x7f4776934dfa]
[user] [ 7] sst(_ZN3SST7Factory15CreateComponentEmRKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERNS_6ParamsE+0x403)[0x556dd755fbb3]
[user] [ 8] sst(_ZN3SST15Simulation_impl13performWireUpERNS_11ConfigGraphERKNS_8RankInfoEm+0x8e)[0x556dd759fe0e]
[user] [ 9] sst(+0xea62e)[0x556dd74f462e]
[user] [10] sst(main+0x18c3)[0x556dd74dbde3]
[user] [11] /lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7f477d1e8d90]
[user] [12] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7f477d1e8e40]
[user] [13] sst(_start+0x25)[0x556dd74f1da5]
[user] *** End of error message ***
Segmentation fault (core dumped)
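
One plausible cause, not confirmed by the trace, is a write past the end of the simulated memory while the loader copies the image. The sketch below shows the kind of bounds check that would turn such a write into a reported error; `SimMemory`, its `write_mem` method, the load address and the 10 MB size are hypothetical stand-ins, not Rev's RevMem API.

```python
class SimMemory:
    """Toy flat memory with explicit range checks (hypothetical)."""

    def __init__(self, size):
        self.size = size
        self.data = bytearray(size)

    def write_mem(self, addr, payload):
        """Copy payload to addr, refusing writes outside [0, size)."""
        end = addr + len(payload)
        if addr < 0 or end > self.size:
            raise MemoryError(
                f"write of {len(payload)} bytes at 0x{addr:x} exceeds "
                f"memory size of {self.size} bytes"
            )
        self.data[addr:end] = payload


mem = SimMemory(10 * 1024 * 1024)   # assumed 10 MB simulated memory
bss = bytes(12 * 1024 * 1024)       # zero-filled 12 MB, like t[] above
rejected = False
try:
    mem.write_mem(0x10000, bss)     # assumed load address
except MemoryError as err:
    rejected = True
    print("load rejected:", err)
```

Reporting the failure this way keeps the simulator alive long enough to tell the user which segment did not fit.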


Failing dual memset

The following code does not work properly. Tested on the devel branch (84260da).
The first memset produces correct results, but the assert fires after the second memset.

#include <string.h>
#define assert(x) if (!(x)) { asm(".byte 0x00"); asm(".byte 0x00"); asm(".byte 0x00"); asm(".byte 0x00"); }
#define N (1024)
char mem[N];

int main() {
        memset(mem, 42, N);
        for (int i = 0; i < N; i++)
                assert(mem[i] == 42);
        memset(mem, 0, N);
        for (int i = 0; i < N; i++)
                assert(mem[i] == 0);
        return 0;
}

[P0] Atomic increment doesn't work

We observe C++ atomic operations failing on data types wider than char (tested sha: cd753f7, compiled with LLVM).
The simplest example of this behavior is shown below.

#include <atomic>
#define assert(x) if (!(x)) { asm(".byte 0x00"); asm(".byte 0x00"); asm(".byte 0x00"); asm(".byte 0x00"); }

int main() {
        std::atomic<int> a;
        a = 0;
        a++;
        assert(a == 1);
}

[P1] Improper behaviour of auipc instruction

Download and run the rv64ui-p-auipc test from riscv-tests.
Expected: the test passes.
Actual result: the test fails. Full execution log:

800000a0 >> <reset_vector>:  13 0a 00 00        li      s4, 0
800000a4 >> <reset_vector>:  93 0a 00 00        li      s5, 0
800000a8 >> <reset_vector>:  13 0b 00 00        li      s6, 0
800000ac >> <reset_vector>:  93 0b 00 00        li      s7, 0
800000b0 >> <reset_vector>:  13 0c 00 00        li      s8, 0
800000b4 >> <reset_vector>:  93 0c 00 00        li      s9, 0
800000b8 >> <reset_vector>:  13 0d 00 00        li      s10, 0
800000bc >> <reset_vector>:  93 0d 00 00        li      s11, 0
800000c0 >> <reset_vector>:  13 0e 00 00        li      t3, 0
800000c4 >> <reset_vector>:  93 0e 00 00        li      t4, 0
800000c8 >> <reset_vector>:  13 0f 00 00        li      t5, 0
800000cc >> <reset_vector>:  93 0f 00 00        li      t6, 0
800000d0 >> <reset_vector>:  73 25 40 f1        csrr    a0, mhartid
800000d4 >> <reset_vector>:  63 10 05 00        bnez    a0, 0x800000d4 <reset_vector+0x80>
800000d8 >> <reset_vector>:  97 02 00 00        auipc   t0, 0
800000dc >> <reset_vector>:  93 82 02 01        addi    t0, t0, 16
800000e0 >> <reset_vector>:  73 90 52 30        csrw    mtvec, t0
800000e4 >> <reset_vector>:  73 50 44 74        csrwi   1860, 8
800000e8 >> <reset_vector>:  97 02 00 00        auipc   t0, 0
800000ec >> <reset_vector>:  93 82 02 01        addi    t0, t0, 16
800000f0 >> <reset_vector>:  73 90 52 30        csrw    mtvec, t0
800000f4 >> <reset_vector>:  73 50 00 18        csrwi   satp, 0
800000f8 >> <reset_vector>:  97 02 00 00        auipc   t0, 0
800000fc >> <reset_vector>:  93 82 02 02        addi    t0, t0, 32
80000100 >> <reset_vector>:  73 90 52 30        csrw    mtvec, t0
80000104 >> <reset_vector>:  93 02 f0 ff        li      t0, -1
80000108 >> <reset_vector>:  93 d2 b2 00        srli    t0, t0, 11
8000010c >> <reset_vector>:  73 90 02 3b        csrw    pmpaddr0, t0
80000110 >> <reset_vector>:  93 02 f0 01        li      t0, 31
80000114 >> <reset_vector>:  73 90 02 3a        csrw    pmpcfg0, t0
80000118 >> <reset_vector>:  73 50 40 30        csrwi   mie, 0
8000011c >> <reset_vector>:  97 02 00 00        auipc   t0, 0
80000120 >> <reset_vector>:  93 82 42 01        addi    t0, t0, 20
80000124 >> <reset_vector>:  73 90 52 30        csrw    mtvec, t0
80000128 >> <reset_vector>:  73 50 20 30        csrwi   medeleg, 0
8000012c >> <reset_vector>:  73 50 30 30        csrwi   mideleg, 0
80000130 >> <reset_vector>:  93 01 00 00        li      gp, 0
80000134 >> <reset_vector>:  97 02 00 00        auipc   t0, 0
80000138 >> <reset_vector>:  93 82 02 ed        addi    t0, t0, -304
8000013c >> <reset_vector>:  73 90 52 30        csrw    mtvec, t0
80000140 >> <reset_vector>:  13 05 10 00        li      a0, 1
80000144 >> <reset_vector>:  13 15 f5 01        slli    a0, a0, 31
80000148 >> <reset_vector>:  63 5c 05 00        bgez    a0, 0x80000160 <reset_vector+0x10c>
80000160 >> <reset_vector>:  97 02 00 80        auipc   t0, 524288
80000164 >> <reset_vector>:  93 82 02 ea        addi    t0, t0, -352
80000168 >> <reset_vector>:  63 8a 02 00        beqz    t0, 0x8000017c <reset_vector+0x128>
8000016c >> <reset_vector>:  73 90 52 10        csrw    stvec, t0
80000170 >> <reset_vector>:  b7 b2 00 00        lui     t0, 11
80000174 >> <reset_vector>:  9b 82 92 10        addiw   t0, t0, 265
80000178 >> <reset_vector>:  73 90 22 30        csrw    medeleg, t0
8000017c >> <reset_vector>:  73 50 00 30        csrwi   mstatus, 0
80000180 >> <reset_vector>:  97 02 00 00        auipc   t0, 0
80000184 >> <reset_vector>:  93 82 42 01        addi    t0, t0, 20
80000188 >> <reset_vector>:  73 90 12 34        csrw    mepc, t0
8000018c >> <reset_vector>:  73 25 40 f1        csrr    a0, mhartid
80000190 >> <reset_vector>:  73 00 20 30        mret
80000194 >> <test_2>:  93 01 20 00        li      gp, 2
80000198 >> <test_2>:  17 25 00 00        auipc   a0, 2
8000019c >> <test_2>:  13 05 c5 71        addi    a0, a0, 1820
800001a0 >> <test_2>:  ef 05 40 00        jal     a1, 0x800001a4 <test_2+0x10>
800001a4 >> <test_2>:  33 05 b5 40        sub     a0, a0, a1
800001a8 >> <test_2>:  b7 23 00 00        lui     t2, 2
800001ac >> <test_2>:  9b 83 03 71        addiw   t2, t2, 1808
800001b0 >> <test_2>:  63 14 75 02        bne     a0, t2, 0x800001d8 <fail>
800001b4 >> <test_3>:  93 01 30 00        li      gp, 3
800001b8 >> <test_3>:  17 e5 ff ff        auipc   a0, 1048574
800001bc >> <test_3>:  13 05 c5 8f        addi    a0, a0, -1796
800001c0 >> <test_3>:  ef 05 40 00        jal     a1, 0x800001c4 <test_3+0x10>
800001c4 >> <test_3>:  33 05 b5 40        sub     a0, a0, a1
800001c8 >> <test_3>:  b7 e3 ff ff        lui     t2, 1048574
800001cc >> <test_3>:  9b 83 03 8f        addiw   t2, t2, -1808
800001d0 >> <test_3>:  63 14 75 00        bne     a0, t2, 0x800001d8 <fail>
800001d8 >> <fail>:  0f 00 f0 0f        fence
800001dc >> <fail>:  63 80 01 00        beqz    gp, 0x800001dc <fail+0x4>
800001e0 >> <fail>:  93 91 11 00        slli    gp, gp, 1
800001e4 >> <fail>:  93 e1 11 00        ori     gp, gp, 1
800001e8 >> <fail>:  93 08 d0 05        li      a7, 93
800001ec >> <fail>:  13 85 01 00        mv      a0, gp
800001f0 >> <fail>:  73 00 00 00        ecall
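
The trace shows test_2 (positive immediate) passing and test_3 (negative immediate) branching to fail. For reference, the sketch below reproduces what the two test cases compute on a 64-bit hart, assuming the standard definition of auipc (the 20-bit immediate is shifted left by 12, sign-extended and added to the PC). The helper names are illustrative only.

```python
MASK64 = (1 << 64) - 1


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def auipc(pc, imm20):
    """rd = pc + sign_extend(imm20 << 12), wrapped to 64 bits."""
    return (pc + sign_extend((imm20 & 0xFFFFF) << 12, 32)) & MASK64


def run_case(auipc_pc, imm20, addi_imm):
    """auipc a0, imm20; addi a0, a0, addi_imm; jal a1, .+4; sub a0, a0, a1."""
    a0 = (auipc(auipc_pc, imm20) + addi_imm) & MASK64
    a1 = auipc_pc + 12  # jal sits 8 bytes after auipc and links pc + 4
    return sign_extend(a0 - a1, 64)


print(run_case(0x80000198, 2, 1820))         # test_2
print(run_case(0x800001B8, 0xFFFFE, -1796))  # test_3
```

The two results, 10000 and -10000, equal the constants the tests build in t2 with lui/addiw, so both bne instructions should fall through. Only the negative-immediate case disagrees in the trace, which suggests looking at how the auipc immediate is sign-extended.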

REV segfault on workload execution

The workload executes for a couple of seconds and then crashes. This happens with both gcc- and llvm-compiled binaries.

RevCPU[cpu:DecodeInst:319977000]: Core 0 ; Thread 0; PC:InstPayload = 0x8000b10c:0xb50533
RevCPU[cpu:ClockTick:319977000]: Core 0 ; Thread 0; Executing PC= 0x8000b10c
RevCPU[cpu:ClockTick:319977000]: Core 0 ; ThreadID 0; Retiring PC= 0x8000b10c
RevCPU[cpu:DecodeInst:319978000]: Core 0 ; Thread 0; PC:InstPayload = 0x8000b110:0x53087
RevCPU[cpu:ClockTick:319978000]: Core 0 ; Thread 0; Executing PC= 0x8000b110
 *** Process received signal ***
 Signal: Segmentation fault (11)
 Signal code: Invalid permissions (2)
 Failing at address: 0x7fdd64028088
 [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7fdd6c917520]
 [ 1] rev/src/librevcpu.so(_ZN3SST6RevCPU6RevMem7ReadMemEmmPv+0xb0)[0x7fdd68afdec0]
 [ 2] rev/src/librevcpu.so(_ZN3SST6RevCPU6RevMem10ReadDoubleEm+0x24)[0x7fdd68afe1f4]
[ 3] rev/src/librevcpu.so(_ZN3SST6RevCPU5RV32D3fldEPNS0_10RevFeatureEPNS0_10RevRegFileEPNS0_6RevMemENS0_7RevInstE+0x38)[0x7fdd68b12b18]
[ 4] rev/src/librevcpu.so(_ZN3SST6RevCPU7RevProc9ClockTickEm+0x388)[0x7fdd68b0b618]
[ 5] rev/src/librevcpu.so(_ZN3SST6RevCPU6RevCPU9clockTickEm+0xa6)[0x7fdd68ae0e66]
[ 6] sst(_ZN3SST5Clock7executeEv+0x10c)[0x55a840109cec]
[ 7] sst(_ZN3SST15Simulation_impl3runEv+0x3d3)[0x55a84019ef93]
[ 8] sst(+0xf6e54)[0x55a8400ebe54]
[ 9] sst(main+0x18a3)[0x55a8400d1753]
[10] /lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7fdd6c8fed90]
[11] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7fdd6c8fee40]
[12] sst(_start+0x25)[0x55a8400e93e5]
 *** End of error message ***
Segmentation fault
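
The last decoded payload before the crash, 0x53087, can be decoded by hand to see which access faulted. The sketch below is a minimal I-type field extractor written for illustration (the function name is an assumption); it identifies the word as a double-precision load, consistent with the fld and ReadDouble frames in the backtrace.

```python
def decode_i_type(inst):
    """Split a 32-bit I-type encoding into its fields."""
    imm = inst >> 20
    if imm & 0x800:           # sign-extend the 12-bit immediate
        imm -= 0x1000
    return {
        "opcode": inst & 0x7F,
        "rd": (inst >> 7) & 0x1F,
        "funct3": (inst >> 12) & 0x7,
        "rs1": (inst >> 15) & 0x1F,
        "imm": imm,
    }


fields = decode_i_type(0x00053087)
# opcode 0x07 is LOAD-FP; funct3 0b011 selects the 64-bit variant (fld).
is_fld = fields["opcode"] == 0x07 and fields["funct3"] == 0b011
print(fields, "fld" if is_fld else "other")
```

The faulting access is therefore fld f1, 0(a0): an 8-byte read at the address held in a0, which the preceding instruction at 0x8000b10c (payload 0xb50533, add a0, a0, a1) had just written.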

Config file:

import os
import sst
import sys

program = os.getenv("REV_EXE", "xxx")

# Define SST core options
sst.setProgramOption("timebase", "1ps")
sst.setProgramOption("stopAtCycle", "0s")

# Tell SST what statistics handling we want
sst.setStatisticLoadLevel(4)

max_addr_gb = 1

# Define the simulation components
comp_cpu = sst.Component("cpu", "revcpu.RevCPU")
comp_cpu.addParams({
        "verbose" : verbose_level,                    # Verbosity
        "numCores" : 1,                               # Number of cores
        "clock" : "1.0GHz",                           # Clock
        "memSize" : 1024*1024*10,                     # Memory size in bytes
        "machine" : "[0:RV64IMAFD]",                  # Core:Config; RV64IMAFD for core 0
        "memCost" : "[0:1:10]",                       # Memory loads require 1-10 cycles
        "program" : program,                          # Target executable
        "splash" : 1                                  # Display the splash message
})

sst.setStatisticOutput("sst.statOutputCSV")
sst.enableAllStatisticsForAllComponents()

# EOF

Issues of Pipeline when using memHierarchy

Load instructions can retire before their data is committed to the registers.

To reproduce the issue, I use the following test program and use the same configuration file as memh_2, which uses memHierarchy as the memory backend. The program is compiled with riscv64-unknown-elf-gcc (12.2.0).

int main(int argc, char **argv){
  int i = 9;
  int j = 10;
  i = i + j;
  if( i == 19 )
    while(1);
  else
    return i;
}

The code will not terminate if all load instructions work correctly. However, the simulation ends after about 453 ns of simulated time.

If the memory backend is switched to the built-in memory, it works as expected.

Programmer can write value to x0 (zero) register

The x0 (zero) register is hardwired to 0. This means any write to this register should be discarded and every read of this register must return 0.

Run the code below:
int main() {
        asm (" li a0, 5; addi zero, a0, 0; addi a0, zero, 6;") ;
        return 0;
}

I expect that a0=6, but it's 11 instead.
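
The expected behaviour can be modelled with a register file that discards writes to x0. The class below is an illustrative host-side sketch (the names `RegFile` and `addi` are assumptions, not Rev's register file interface); running the three instructions from the reproducer on it gives a0 = 6, while a register file that lets the write through gives 11.

```python
class RegFile:
    """32 integer registers; x0 reads as zero and ignores writes."""

    def __init__(self, protect_x0=True):
        self.regs = [0] * 32
        self.protect_x0 = protect_x0

    def read(self, idx):
        return self.regs[idx]

    def write(self, idx, value):
        if idx == 0 and self.protect_x0:
            return                      # hardwired zero: discard the write
        self.regs[idx] = value & ((1 << 64) - 1)


def addi(rf, rd, rs1, imm):
    rf.write(rd, rf.read(rs1) + imm)


def run(rf):
    A0, ZERO = 10, 0
    addi(rf, A0, ZERO, 5)    # li   a0, 5
    addi(rf, ZERO, A0, 0)    # addi zero, a0, 0
    addi(rf, A0, ZERO, 6)    # addi a0, zero, 6
    return rf.read(A0)


print(run(RegFile()))                  # correct model
print(run(RegFile(protect_x0=False)))  # models the reported bug
```

The second call reproduces the observed value of 11, which is consistent with the write to x0 being stored and then read back by the final addi.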

This is related to #64.

See attached logs for more details with tracked register changes
zero_bug

RevLoader Should Automatically Find `main` symbol

Currently, users are required to specify the PC of the first instruction. While this is useful for debugging and running firmware, libraries, etc, it would be very convenient to modify the loader and the RevCPU infrastructure such that the loader automatically starts execution at the main symbol when no other options are specified.

'li' instruction behaves incorrectly.

Steps to reproduce:

  1. clone https://github.com/riscv/riscv-tests
  2. use the RISC-V gcc compiler to build the ISA test suite
  3. run the rv64ui-p-xori test
  4. Result - program execution ends with:
RevCPU[cpu:ClockTick:247000]: Core 0; HartID 0; PID 1024 - Exception Raised: ECALL with code = 16713805
FATAL: RevCPU[cpu:ExecEcall:247000]: Ecall Code = 16713805 not foundSST Fatal Backtrace Information:
    0 : sst(_ZNK3SST6Output5fatalEjPKcS2_iS2_z+0x4ae) [0x56207fbcac3e]
    1 : /home/dev/agile/17_libc_tests/wf_ci/frameworks.simulators.agile.rev/src/librevcpu.so(_ZN3SST6RevCPU7RevProc9ExecEcallEv+0x134) [0x7efe703f7834]
    2 : /home/dev/agile/17_libc_tests/wf_ci/frameworks.simulators.agile.rev/src/librevcpu.so(_ZN3SST6RevCPU7RevProc9ClockTickEm+0xd05) [0x7efe703fd295]
    3 : /home/dev/agile/17_libc_tests/wf_ci/frameworks.simulators.agile.rev/src/librevcpu.so(_ZN3SST6RevCPU6RevCPU9clockTickEm+0xa6) [0x7efe703b6fe6]
    4 : sst(_ZN3SST5Clock7executeEv+0x10c) [0x56207fb5b9cc]
    5 : sst(_ZN3SST15Simulation_impl3runEv+0x3d3) [0x56207fbf44d3]
    6 : sst(+0xf808e) [0x56207fb3e08e]
    7 : sst(main+0x186b) [0x56207fb23e0b]
    8 : /lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7efe79e25d90]
    9 : /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80) [0x7efe79e25e40]
   10 : sst(_start+0x25) [0x56207fb3b4b5]
------------------------------------------------------------------------

This can be observed with other tests as well: I tested rv64ui-p-and and rv64ui-p-lw, and they fail in a similar way.

See the attached image for reference: before 'ecall', the program executes 'li a7, 93', but the value in a7 seems to be corrupted after that instruction, and ecall then fails with the corrupted value.
li_bug

SLTI instruction fails

After enabling the SLTI ISA test we observe failures related to sign extension.
On the other hand, the SLTIU tests pass, where all numbers are treated as unsigned.

TEST_IMM_OP( 15, slti, 1, 0xffffffffffffffff, 0x001 );

00000000000111e4 <test_15>:
   111e4: 93 01 f0 00   li      gp, 15
   111e8: 93 00 f0 ff   li      ra, -1
   111ec: 13 a7 10 00   slti    a4, ra, 1
   111f0: 93 03 10 00   li      t2, 1
   111f4: 63 16 77 00   bne     a4, t2, 0x11200 <fail>
   111f8: 63 12 30 00   bne     zero, gp, 0x111fc <pass>

-1 is smaller than 1, so a4 should be set to 1 (the expected value loaded into t2), but it is not.

Other failing test cases are:

TEST_IMM_OP( 7,  slti, 1, 0xffffffff80000000, 0x000 );
TEST_IMM_OP( 8,  slti, 1, 0xffffffff80000000, 0x800 );
TEST_IMM_OP( 12, slti, 1, 0xffffffff80000000, 0x7ff );
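
For reference, the expected results of these cases follow from the definition of slti: the 12-bit immediate is sign-extended and compared with rs1 as a signed value, while sltiu also sign-extends the immediate but compares unsigned. The sketch below is a host-side model with illustrative function names; it assumes the TEST_IMM_OP arguments are (test number, instruction, expected result, rs1 value, immediate).

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def slti(rs1, imm12):
    """Signed compare of a 64-bit register with a sign-extended immediate."""
    return int(sign_extend(rs1, 64) < sign_extend(imm12, 12))


def sltiu(rs1, imm12):
    """Unsigned compare; the immediate is still sign-extended first."""
    mask = (1 << 64) - 1
    return int((rs1 & mask) < (sign_extend(imm12, 12) & mask))


cases = {
    7: (0xFFFFFFFF80000000, 0x000),
    8: (0xFFFFFFFF80000000, 0x800),
    12: (0xFFFFFFFF80000000, 0x7FF),
    15: (0xFFFFFFFFFFFFFFFF, 0x001),
}
for num, (rs1, imm) in cases.items():
    print(num, slti(rs1, imm))
```

All four cases have a negative rs1, so each should produce 1. Treating rs1 as unsigned, as sltiu does, would give 0 for test 15, which is the kind of result a missing sign extension would produce.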

Dual memset fail

The code listed below does not work properly.
Tested on devel branch (84260da) and standard config.
First memset produces correct results, but we hit an assert after a second memset call, which is unexpected.

#include <string.h>
#define assert(x) if (!(x)) { asm(".byte 0x00"); asm(".byte 0x00"); asm(".byte 0x00"); asm(".byte 0x00"); }
#define N (1024)
char mem[N];

int main() {
        memset(mem, 42, N);
        for (int i = 0; i < N; i++)
                assert(mem[i] == 42);
        memset(mem, 0, N);
        for (int i = 0; i < N; i++)
                assert(mem[i] == 0);
        return 0;
}

CMake builds should uninstall previous install

CMake builds should uninstall previous installs in order to avoid poisoning the environment with a previous GNU Make installation. CMake builds librevcpu.so in a different directory than the base GNU Make infrastructure.

The install directive for CMake should:

  • check to see if revcpu is installed (sst-info revcpu)
  • if installed, do: sst-register -u revcpu
  • install the new version: sst-register revcpu

Application with std::string exits prematurely

An application with the source code below exits prematurely.

#include <string>
int main() {
std::string fname("/asdf/asdf/asdf/as/dfasd/f");
return 0;
}

To reproduce:

  1. use any RISC-V C++ compiler of your choice (gnu, llvm) and compile:
    riscv64-linux-gnu-g++ tc1.c -static -o tc1.exe
    the logs below were generated with gnu g++
  2. run binary in REV model with configuration provided
    rev_config.txt
    REV_EXE=tc1.exe timeout 10s sst ../rev_config.txt
  3. Check execution trace (see attached file with parsed trace to asm instruction)
    trace.txt

Program execution stops on a 'ret' instruction; however, that is not the end of the program.

[P1] Missing support for ELF init/fini sections

The ELF init section is used for things such as functions defined with the constructor attribute and initialization of static class members. Currently the init and fini sections are not executed before/after main.
The following code snippet represents the problem:

#define assert(x) if (!(x)) { asm(".dword 0x00000000"); }
int x = 0;
__attribute__((constructor)) void foo() {
        x++;
}

int main() {
        assert(x == 1);
        return 0;
}

AMOADD atomic ends up in an infinite loop

The following code loops indefinitely on a basic config (ex2 config, devel branch):

#include <atomic>

int main() {
        std::atomic<int> a;
        a = 0;
        a++;
}

a++ is the offending instruction:

RevCPU[cpu:DecodeInst:11106000]: Core 0 ; Thread 0; PC:InstPayload = 0x11468:0x6b5252f
RevCPU[cpu:ClockTick:11106000]: Core 0 ; Thread 0; Executing PC= 0x11468
RevCPU[cpu:ClockTick:11106000]: Core 0 ; ThreadID 0; Retiring PC= 0x11468

11468: 2f 25 b5 06 amoadd.w.aqrl a0, a1, (a0)
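
Note that in this encoding the destination and the address register are both a0. A correct implementation reads the address from rs1 before overwriting rd with the old memory value. The sketch below models amoadd.w on the host to show the expected end state; the dictionary-based memory, the address 0x1000 and the function name are assumptions for illustration, and the ordering bits (aq/rl) are ignored.

```python
MASK64 = (1 << 64) - 1


def amoadd_w(regs, mem, rd, rs1, rs2):
    """rd = sign_extend(mem[rs1]); mem[rs1] = (mem[rs1] + rs2) mod 2**32."""
    addr = regs[rs1]                  # latch the address before rd is written
    addend = regs[rs2]                # latch the addend as well
    old = mem.get(addr, 0) & 0xFFFFFFFF
    mem[addr] = (old + addend) & 0xFFFFFFFF
    if rd != 0:                       # x0 stays zero
        signed = old - (1 << 32) if old & 0x80000000 else old
        regs[rd] = signed & MASK64


A0, A1 = 10, 11
regs = [0] * 32
regs[A0] = 0x1000                     # hypothetical address of the atomic
regs[A1] = 1                          # increment
mem = {0x1000: 0}                     # a = 0
amoadd_w(regs, mem, A0, A0, A1)       # amoadd.w.aqrl a0, a1, (a0)
print(mem[0x1000], regs[A0])
```

After the instruction the memory word holds 1 and a0 holds the previous value 0; the instruction retires once and execution continues at the next PC.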

[P3] Stack corruption for parent process when fork() syscall is called

The following test which checks if the stack has been properly copied for the child process created by fork() should not fail.
It fails with both gcc and clang.

#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

#include "../../../common/syscalls/syscalls.h"

#define assert(x) if (!(x)) { asm(".byte 0x00"); asm(".byte 0x00"); asm(".byte 0x00"); asm(".byte 0x00"); }
int main() {
  int valOnStack = 1;
  rev_fork();
  assert(valOnStack==1);
  valOnStack = 0;

  return 0;
}

[P2] set/read floating-point exceptions does not work

Compile and run below code:

#define assert(x) if (!(x)) { asm(".dword 0x00000000"); }

#include <fenv.h>
#include <stdint.h>

void test_except(uint64_t e) {
	feclearexcept(FE_ALL_EXCEPT);
	int r = feraiseexcept(e);
	assert (r == 0);
	assert(fetestexcept(e) > 0);
}

int main() {
	test_except(FE_DIVBYZERO);
	test_except(FE_INEXACT);
	test_except(FE_INVALID);
	test_except(FE_OVERFLOW);
	test_except(FE_UNDERFLOW);
	return 0;
}

When calling fetestexcept() we should get the value of the previously raised exception. At runtime it seems that no exception is raised.
See also:
https://en.cppreference.com/w/c/numeric/fenv/FE_exceptions
https://en.cppreference.com/w/cpp/numeric/fenv/fetestexcept
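
On RISC-V these functions ultimately operate on the accrued exception flags in the fflags field of fcsr. The sketch below models that field on the host to show what the test expects; it assumes the FE_* macros map one-to-one onto the fflags bits (NX, UF, OF, DZ, NV), which is an assumption about the C library in use, and the class and helper names are illustrative.

```python
# fflags bit positions from the RISC-V F extension
FE_INEXACT = 0x01    # NX
FE_UNDERFLOW = 0x02  # UF
FE_OVERFLOW = 0x04   # OF
FE_DIVBYZERO = 0x08  # DZ
FE_INVALID = 0x10    # NV
FE_ALL_EXCEPT = 0x1F


class FFlags:
    """Accrued floating-point exception flags (hypothetical model)."""

    def __init__(self):
        self.value = 0

    def feclearexcept(self, excepts):
        self.value &= ~excepts & FE_ALL_EXCEPT
        return 0

    def feraiseexcept(self, excepts):
        self.value |= excepts & FE_ALL_EXCEPT
        return 0

    def fetestexcept(self, excepts):
        return self.value & excepts


def check_except(flags, e):
    """Mirror of the C helper: clear all, raise e, then test e."""
    flags.feclearexcept(FE_ALL_EXCEPT)
    return flags.feraiseexcept(e) == 0 and flags.fetestexcept(e) > 0


flags = FFlags()
results = [check_except(flags, e) for e in
           (FE_DIVBYZERO, FE_INEXACT, FE_INVALID, FE_OVERFLOW, FE_UNDERFLOW)]
print(results)
```

Every entry is expected to be True: a flag raised through the CSR must be visible to the following read of the same CSR.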
