fastflow / fastflow

FastFlow pattern-based parallel programming framework (formerly on sourceforge)

Home Page: http://calvados.di.unipi.it

License: GNU General Public License v2.0

Roff 0.12% CMake 4.31% C++ 91.01% C 2.99% Makefile 1.32% Shell 0.24%
gpu-computing gpu-programming multicore parallel-algorithm parallel-programming parallelization patterns skeleton-framework

fastflow's Introduction

FastFlow: high-performance parallel patterns and building blocks in C++

FastFlow is a programming library implemented in modern C++ targeting multi/many-cores and distributed systems (the distributed run-time is experimental). It offers both a set of high-level ready-to-use parallel patterns and a set of mechanisms and composable components (called building blocks) to support low-latency and high-throughput data-flow streaming networks.

FastFlow simplifies the development of parallel applications modelled as a structured directed graph of processing nodes. The graph of concurrent nodes is constructed by the assembly of sequential and parallel building blocks as well as higher-level parallel patterns modelling typical schemas of parallel computations (e.g., pipeline, task-farm, parallel-for, etc.). FastFlow efficiency stems from the optimized implementation of the base communication and synchronization mechanisms and from its layered software design.

FastFlow's Building Blocks

FastFlow nodes represent sequential computations executed by a dedicated thread. A node can have zero, one or more input channels and zero, one or more output channels. As is typical in streaming applications, communication channels are unidirectional and asynchronous. They are implemented as Single-Producer Single-Consumer (SPSC) FIFO queues carrying memory pointers. Operations on these queues (which can have either bounded or unbounded capacity) are based on a non-blocking, lock-free synchronization protocol. A blocking concurrency control mode is also available, which trades some responsiveness of the nodes for better power efficiency.

The semantics of sending data references over a communication channel is that of transferring the ownership of the data pointed to by the reference from the sender node (producer) to the receiver node (consumer), according to the producer-consumer model. The data reference is de facto a capability, i.e. a logical token that grants access to a given data item or to a portion of a larger data structure. Based on this reference-passing semantics, the receiver is expected to have exclusive access to the data reference received from one of the input channels, while the producer is expected not to use the reference anymore.
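The channel semantics described above can be illustrated with a minimal sketch of a bounded SPSC queue that carries pointers. This is a generic ring buffer written with std::atomic, not FastFlow's actual queue implementation; the names SpscQueue and run_spsc_demo are hypothetical and exist only for this example.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <memory>
#include <thread>

// Bounded single-producer single-consumer queue of raw pointers.
// One slot is always left empty to tell "full" apart from "empty".
template <typename T, std::size_t Capacity>
class SpscQueue {
    static_assert(Capacity >= 2, "need at least two slots");
public:
    // Producer side only. Returns false when the queue is full.
    bool push(T* item) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t next = (tail + 1) % Capacity;
        if (next == head_.load(std::memory_order_acquire)) return false;
        slots_[tail] = item;
        tail_.store(next, std::memory_order_release);
        return true;
    }
    // Consumer side only. Returns nullptr when the queue is empty.
    T* pop() {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        if (head == tail_.load(std::memory_order_acquire)) return nullptr;
        T* item = slots_[head];
        head_.store((head + 1) % Capacity, std::memory_order_release);
        return item;
    }
private:
    T* slots_[Capacity] = {};
    std::atomic<std::size_t> head_{0};
    std::atomic<std::size_t> tail_{0};
};

// Sends the integers 0..count-1 through the queue and returns the sum
// computed by the consumer thread.
long run_spsc_demo(int count) {
    SpscQueue<int, 8> queue;
    long sum = 0;
    std::thread consumer([&] {
        for (int received = 0; received < count;) {
            int* raw = queue.pop();
            if (raw == nullptr) { std::this_thread::yield(); continue; }
            std::unique_ptr<int> item(raw);  // the consumer now owns the item
            sum += *item;
            ++received;
        }
    });
    for (int i = 0; i < count; ++i) {
        int* item = new int(i);
        while (!queue.push(item)) std::this_thread::yield();
        // After a successful push the producer must not touch `item` again.
    }
    consumer.join();
    return sum;
}
```

Neither side ever blocks or takes a lock: a full or empty queue is reported to the caller, which decides whether to retry. Wrapping the received pointer in a std::unique_ptr is one way to make the ownership transfer explicit on the consumer side.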

The set of FastFlow building blocks is:

node. This is the basic abstraction of the building blocks. It defines the unit of sequential execution in the FastFlow library. A node encapsulates either user’s code (i.e. business logic) or RTS code. User’s code can also be wrapped by a FastFlow node executing RTS code to manipulate and filter input and output data before and after the execution of the business logic code. Based on the number of input/output channels it is possible to distinguish three different kinds of sequential nodes: standard node with one input and one output channel, multi-input with many inputs and one output channel, and finally multi-output with one input and many outputs. A generic node performs a loop that: i) gets a data item (through a memory reference to a data structure) from one of its input queues; ii) executes a functional code working on the data item and possibly on a state maintained by the node itself by calling its service method svc(); iii) puts a memory reference to the resulting item(s) into one or multiple output queues selected according to a predefined or user-defined policy.

node combiner. It allows the user to combine two nodes into one single sequential node. Conceptually, the operation of combining sequential nodes is similar to the composition of two functions. In this case, the functions are the service functions of the two nodes (e.g., the svc method). This building block promotes code reuse through fusion of already implemented nodes and it can also be used to reduce the threads used to run the data-flow network by executing the functions of multiple nodes by a single thread.

pipeline. The pipeline allows building blocks to be connected in a linear chain. It is used both as a container of building blocks as well as an application topology builder. At execution time, the pipeline building block models the data-flow execution of its building blocks on data elements flowing in a streamed fashion.

farm. It models functional replication of building blocks coordinated by a master node called Emitter. The simplest form is composed of two computing entities executed in parallel: a multi-output master node (the Emitter), and a pool of pipeline building blocks called Workers. The Emitter node schedules the data elements received in input to the Workers using either a default policy (i.e. round-robin or on-demand) or according to the algorithm implemented by the user code defined in its service method. In this second scenario, the stream elements scheduling is controlled by the user through a custom policy.

All-to-All. The All-to-All (A2A for short) building block defines two distinct sets of Workers connected according to the shuffle communication pattern. This means that each Worker in the first set (called L-Workers) is connected to all the Workers in the second set (called R-Workers). The user may implement any custom distribution policy in the L-Workers (e.g., sending each data item to a specific worker of the R-Worker set, broadcasting data elements, executing a by-key routing, etc). The default distribution policy is round-robin.
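The ideas behind the building blocks listed above (a service function per node, fusion of two nodes, and round-robin or by-key distribution) can be sketched in plain C++. The sketch is deliberately sequential and does not use the FastFlow API; in FastFlow each node runs in its own thread. All names here (Svc, combine, round_robin, by_key, farm) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a node's service function: item in, item out.
using Svc = std::function<int(int)>;

// Node combiner: fuse two service functions into one, like function composition.
Svc combine(Svc first, Svc second) {
    return [first, second](int x) { return second(first(x)); };
}

// Round-robin policy: item number i goes to worker i mod n_workers.
std::size_t round_robin(std::size_t item_index, std::size_t n_workers) {
    return item_index % n_workers;
}

// By-key policy: items with the same key always reach the same worker.
std::size_t by_key(const std::string& key, std::size_t n_workers) {
    return std::hash<std::string>{}(key) % n_workers;
}

// Distribute a stream over n_workers replicas of `worker` (round-robin)
// and collect what each replica produced.
std::vector<std::vector<int>> farm(const std::vector<int>& stream,
                                   std::size_t n_workers, const Svc& worker) {
    std::vector<std::vector<int>> outputs(n_workers);
    for (std::size_t i = 0; i < stream.size(); ++i)
        outputs[round_robin(i, n_workers)].push_back(worker(stream[i]));
    return outputs;
}
```

For example, combining an "add one" function with a "double" function yields a single function computing (x + 1) * 2, which the farm sketch then applies to alternating items of the stream.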

A brief description of the FastFlow building block software layer can be found here.

Available Parallel Patterns

In FastFlow, all parallel patterns available are implemented on top of building blocks. Parallel Patterns are parametric implementations of well-known structures suitable for parallelism exploitation. The high-level patterns currently available in FastFlow library are: ff_Pipe, ff_Farm/ff_OFarm, ParallelFor/ParallelForReduce/ParallelForPipeReduce, poolEvolution, ff_Map, ff_mdf, ff_DC, ff_stencilReduce.
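To give a feel for what a parallel-for pattern does, here is a minimal sketch built on std::thread with static, contiguous chunking. It is not FastFlow's ParallelFor and does not reflect its scheduling or its interface; parallel_for and squares are hypothetical names used only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical helper: run body(i) for every i in [first, last),
// splitting the range into contiguous chunks, one per thread.
void parallel_for(std::size_t first, std::size_t last, std::size_t n_threads,
                  const std::function<void(std::size_t)>& body) {
    if (first >= last || n_threads == 0) return;
    const std::size_t total = last - first;
    n_threads = std::min(n_threads, total);
    const std::size_t chunk = (total + n_threads - 1) / n_threads;
    std::vector<std::thread> threads;
    for (std::size_t t = 0; t < n_threads; ++t) {
        const std::size_t begin = first + t * chunk;
        const std::size_t end = std::min(begin + chunk, last);
        if (begin >= end) break;
        threads.emplace_back([begin, end, &body] {
            for (std::size_t i = begin; i < end; ++i) body(i);
        });
    }
    for (auto& th : threads) th.join();
}

// Squares every element in place; each index is written by exactly one thread.
std::vector<long> squares(std::size_t n, std::size_t n_threads) {
    std::vector<long> v(n);
    std::iota(v.begin(), v.end(), 0L);
    parallel_for(0, n, n_threads, [&v](std::size_t i) { v[i] *= v[i]; });
    return v;
}
```

Because every iteration touches a distinct element, no synchronization is needed beyond the final join. A reduce variant would additionally combine per-thread partial results after the loop.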

Unlike the building block layer, the parallel patterns layer is in continuous evolution. As soon as new patterns are recognized or new smart implementations are available for the existing patterns, they are added to the high-level layer and provided to the user.

Building the library

FastFlow is a header-only library. For the shared-memory run-time there are basically no dependencies (but remember to run the script mapping_string.sh in the ff directory!). For the distributed-memory run-time, you need to install Cereal and OpenMPI.

While Cereal is mandatory, OpenMPI installation is optional and can be disabled at compile-time by compiling the code with '-DDFF_EXCLUDE_MPI' (or make EXCLUDE_MPI=1). To compile the tests with the distributed run-time you need a recent compiler supporting -std=c++20 (e.g., gcc 10 or above). In addition, by default the shared-memory version uses the non-blocking concurrency control mode, whereas the distributed version uses the blocking mode for its run-time system. You can select the concurrency control mode either at compile time (see the config.hpp file) or at run-time by calling the proper methods before running the application.

See the BUILD.ME file for instructions about building unit tests and examples. Note: the cmake-based compilation of the distributed tests is currently disabled.

Supported Platforms

FastFlow is currently actively supported for Linux with gcc >4.8, x86_64 and ARM. Since version 2.0.4, FastFlow is expected to work on any platform with a C++11 compiler.

FastFlow Maintainer

Massimo Torquati (University of Pisa)

FastFlow History

The FastFlow project was started at the beginning of 2010 by Massimo Torquati (University of Pisa) and Marco Aldinucci (University of Turin). Over the years several other people (mainly from the Parallel Computing Groups of the Universities of Pisa and Turin) have contributed ideas and code to the development of the project. FastFlow has been used as the run-time system in three EU-funded research projects: ParaPhrase, REPARA and RePhrase. It is currently one of the tools used in the Euro-HPC project TEXTAROSSA.

More info about FastFlow and its parallel building blocks can be found here: Massimo Torquati (Pisa, PhD Thesis) "Harnessing Parallelism in Multi/Many-Cores with Streams and Parallel Patterns"

About the License

From version 3.0.1, FastFlow is released with a dual license: LGPL-3 and MIT.

How to cite FastFlow

Aldinucci, M., Danelutto, M., Kilpatrick, P. and Torquati, M. (2017). Fastflow: High‐Level and Efficient Streaming on Multicore. In Programming multi‐core and many‐core computing systems (eds S. Pllana and F. Xhafa).

fastflow's People

Contributors

aldinuc, gerzin, jdgarciauc3m, keith-dev, lucarin91, massimotorquati, mdrocco, nicolotonci

fastflow's Issues

failed svc_init seems stuck

I'm trying out the library on Ubuntu with Clang 6.0.1, targeting C++17. I'm using the fully-c++11 branch.

I'm creating an ff_node with the following svc_init:

int svc_init() {
    console->info("Init Readile: {}", file_path);
    try {
        file_reader = make_unique<io::LineReader>(file_path);
    } catch (const io::error::can_not_open_file& msg) {  // catch by const reference to avoid a copy
        msg.what();
        console->error(fmt::format("Error: {}", msg.error_message_buffer));
        return -1;  // signal initialization failure
    }
    return 0;
}

I add the stage to a ff_Pipe:
ff_Pipe<Event> pipe(make_unique<ReadFile>("/tmp/access.log2"));
and run the ff_Pipe:
if (pipe.run_and_wait_end() < 0) error("running pipe");

If svc_init fails and returns -1, I would expect the pipeline to end and print the error. Instead, it hangs forever after printing:
ERROR: ff_thread, svc_init failed, thread exit!!!

Is this behavior expected?

Compiler:

clang version 6.0.1 (tags/RELEASE_601/final)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/local/clang_6.0.1/bin

cmake build failed

I followed the build instructions:

~$ cd build
~$ cmake ../ 
~$ make

...
[ 13%] Building CXX object tests/CMakeFiles/test_all-to-all7_NONBLOCKING.dir/test_all-to-all7.cpp.o
/home/tsung-wei/Code/fastflow/tests/test_multi_output.cpp: In function ‘int main()’:
/home/tsung-wei/Code/fastflow/tests/test_multi_output.cpp:113:13: error: missing template arguments before ‘pipe1’
     ff_Pipe pipe1(a, b);
             ^~~~~
/home/tsung-wei/Code/fastflow/tests/test_multi_output.cpp:114:5: error: ‘pipe1’ was not declared in this scope
     pipe1.wrap_around();
     ^~~~~
/home/tsung-wei/Code/fastflow/tests/test_multi_output.cpp:114:5: note: suggested alternative: ‘pipe2’
     pipe1.wrap_around();
     ^~~~~
     pipe2
/home/tsung-wei/Code/fastflow/tests/test_multi_output.cpp:115:13: error: missing template arguments before ‘pipe’
     ff_Pipe pipe(pipe1, c);
             ^~~~
/home/tsung-wei/Code/fastflow/tests/test_multi_output.cpp:116:14: error: request for member ‘run_and_wait_en’ in ‘pipe’, which is of non-class type ‘int(int*) throw ()’ {aka ‘int(int*)’}
     if (pipe.run_and_wait_end()<0) {
              ^~~~~~~~~~~~~~~~
make[2]: *** [tests/CMakeFiles/test_multi_output_BLOCKING.dir/build.make:63: tests/CMakeFiles/test_multi_output_BLOCKING.dir/test_multi_output.cpp.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:2166: tests/CMakeFiles/test_multi_output_BLOCKING.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....

Task dependency graph?

Is there any example or tutorial on how to use FastFlow to build a task dependency graph? The ff_taskf tutorial page is blank, with only a TBD placeholder, and I couldn't find any example either.

OCL inclusions in pipeline.hpp

pipeline.hpp:38
inclusion #include <ff/ocl/clEnvironment.hpp> should be within

#if defined(FF_OPENCL)

#endif

pipeline.hpp:414
the code

 if (fftree_ptr->hasOpenCLNode()) {
     // setup openCL environment
     clEnvironment::instance();
  }

should be within

#if defined(FF_OPENCL)

#endif

clang++ support

Compilation fails with clang++ under Linux.

I checked this with clang++ 4.0 and clang++ 5.0.

Error message:

fastflow/ff/spin-lock.hpp:61:27: warning: braces around scalar initializer [-Wbraced-scalar-init]
    AtomicFlagWrapper():F(ATOMIC_FLAG_INIT) {}
                          ^~~~~~~~~~~~~~~~
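
The diagnostic suggests that, on this toolchain, ATOMIC_FLAG_INIT expands to a braced initializer, so writing F(ATOMIC_FLAG_INIT) wraps braces inside parentheses. One possible workaround, sketched below under the assumption that the wrapper only needs to start in the cleared state, is to clear the flag explicitly in the constructor body. This is an illustration, not necessarily the fix adopted in FastFlow; the helper functions spin_lock and spin_unlock are hypothetical.

```cpp
#include <atomic>
#include <cassert>

// Sketch of a wrapper whose constructor avoids the parenthesized macro initializer.
struct AtomicFlagWrapper {
    AtomicFlagWrapper() { F.clear(); }  // explicit clear instead of F(ATOMIC_FLAG_INIT)
    bool test_and_set(std::memory_order mo) { return F.test_and_set(mo); }
    void clear(std::memory_order mo) { F.clear(mo); }
    std::atomic_flag F;
};

// Minimal test-and-set spin lock built on the wrapper.
inline void spin_lock(AtomicFlagWrapper& f) {
    while (f.test_and_set(std::memory_order_acquire)) { /* spin until released */ }
}

inline void spin_unlock(AtomicFlagWrapper& f) {
    f.clear(std::memory_order_release);
}
```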

Change the number of threads

Hello,

How can I change the number of threads when running the pipeline examples? According to the README, the number of threads defaults to the number of cores. However, I would like to vary the number of threads and observe how the run time changes. How can I do that? Thank you so much.

Want more documents!!!

Is there any documentation about the dnode? I would like to use FastFlow in a distributed system, but all the docs online only show that this framework works well on SMP. I would be very grateful if you could provide more detailed documentation about the dnode or any other pattern or building block.

test_gw does not compile

Running make test_gw (as well as just make) leads to compile errors.

Some errors are easy (patch below), but I got stuck on this one:

$ make test_gw
Scanning dependencies of target test_gw
Building CXX object tests/d/CMakeFiles/test_gw.dir/test_gw.cpp.o
/home/paolo/Programmazione/fastflow/tests/d/test_gw.cpp: In function ‘int main(int, char**)’:
/home/paolo/Programmazione/fastflow/tests/d/test_gw.cpp:128:2: error: ‘InOut0’ was not declared in this scope
  InOut0 n0(name1,address1,name2,address2,&transport);
  ^~~~~~
/home/paolo/Programmazione/fastflow/tests/d/test_gw.cpp:128:2: note: suggested alternative: ‘InOut1’
  InOut0 n0(name1,address1,name2,address2,&transport);
  ^~~~~~
  InOut1
/home/paolo/Programmazione/fastflow/tests/d/test_gw.cpp:129:2: error: ‘n0’ was not declared in this scope
  n0.skipfirstpop(true);
  ^~
In file included from /home/paolo/Programmazione/fastflow/tests/d/test_gw.cpp:16:
/home/paolo/Programmazione/fastflow/ff/dnode.hpp: In instantiation of ‘int ff::ff_dinout<CommImplIn, CommImplOut>::initOut(const string&, const string&, int, typename CommImplOut::TransportImpl*, int, ff::dnode_cbk_t) [with CommImplIn = ff::zmqOnDemand; CommImplOut = ff::zmqFromAny; std::__cxx11::string = std::__cxx11::basic_string<char>; typename CommImplOut::TransportImpl = ff::zmqTransport; ff::dnode_cbk_t = void (*)(void*, void*)]’:
/home/paolo/Programmazione/fastflow/tests/d/test_gw.cpp:52:26:   required from here
/home/paolo/Programmazione/fastflow/ff/dnode.hpp:535:41: error: ‘bool ff::ff_dnode<ff::zmqFromAny>::skipdnode’ is protected within this context
         ff_dnode<CommImplOut>::skipdnode=false;
         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~
/home/paolo/Programmazione/fastflow/ff/dnode.hpp:433:14: note: declared protected here
     bool     skipdnode;
              ^~~~~~~~~
/home/paolo/Programmazione/fastflow/ff/dnode.hpp:535:41: error: invalid use of non-static data member ‘ff::ff_dnode<ff::zmqFromAny>::skipdnode’
         ff_dnode<CommImplOut>::skipdnode=false;
         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~
/home/paolo/Programmazione/fastflow/ff/dnode.hpp:433:14: note: declared here
     bool     skipdnode;
              ^~~~~~~~~
/home/paolo/Programmazione/fastflow/ff/dnode.hpp:536:34: error: ‘void (* ff::ff_dnode<ff::zmqFromAny>::cb)(void*, void*)’ is protected within this context
         ff_dnode<CommImplOut>::cb=cbk;
         ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~
/home/paolo/Programmazione/fastflow/ff/dnode.hpp:440:13: note: declared protected here
 dnode_cbk_t ff_dnode<CommImpl>::cb=0;
             ^~~~~~~~~~~~~~~~~~
make[3]: *** [tests/d/CMakeFiles/test_gw.dir/build.make:63: tests/d/CMakeFiles/test_gw.dir/test_gw.cpp.o] Error 1
make[2]: *** [CMakeFiles/Makefile2:8024: tests/d/CMakeFiles/test_gw.dir/all] Error 2
make[1]: *** [CMakeFiles/Makefile2:8036: tests/d/CMakeFiles/test_gw.dir/rule] Error 2
make: *** [Makefile:2801: test_gw] Error 2

Patch for trivial issues:

diff --git a/tests/d/test_gw.cpp b/tests/d/test_gw.cpp
index 7d1b6ba..256a2de 100644
--- a/tests/d/test_gw.cpp
+++ b/tests/d/test_gw.cpp
@@ -11,6 +11,7 @@
  *
  */
 
+#include <iostream>
 #include <ff/node.hpp>
 #include <ff/dnode.hpp>
 #include <ff/d/inter.hpp>
@@ -53,7 +54,7 @@ public:
        return 0;
     }
     // TODO: increase/decrease granularity at run-time
-    void * svc(void *task) { return task);
+    void * svc(void *task) { return task; };
     
 protected:
     const std::string name1;
@@ -96,7 +97,7 @@ public:
        printf("InOut1 ending\n");
     }
 
-    virtual FFBUFFER * const get_out_buffer() const { return (FFBUFFER*)1;}
+    virtual FFBUFFER * get_out_buffer() const { return (FFBUFFER*)1;}
 
 protected:
     const std::string name1;

Compiler version:

g++ (Debian 8.2.0-4) 8.2.0
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Assessment of the difficulty in porting CPU architecture for the fastflow

Hello everyone! I am working on a tool to assess the complexity of porting software to a new CPU architecture. It primarily focuses on RISC-V porting, although the tool may also give an average estimate of the effort for other architectures. My focus is on the overall workload and difficulty of porting, both past and future, even if a project has already been ported. As part of my dataset, I have collected the fastflow project, and I would like to gather community opinions to support my assessment. Based on scanning tools, the porting complexity is determined to be moderate leaning towards simple, with a moderate amount of code related to the CPU architecture in the project. Is this assessment accurate? Do you have any opinions on the personnel and time required? I look forward to your help and response.

Sobel OCL test build fails

Whenever I try to build the Sobel test with make, I get this error:

In file included from ../../../ff/stencilReduceOCL.hpp:47:0,
                 from ffsobel_pipe+mapOCL.cpp:56:
../../../ff/oclnode.hpp: At global scope:
../../../ff/oclnode.hpp:79:13: error: ‘fftype’ does not name a type; did you mean ‘wctype’?
     virtual fftype getFFType() const   { return OCL_WORKER; }
             ^~~~~~
             wctype

I couldn't find the definition of fftype, where should it be defined?

It also shows this error:

ffsobel_pipe+mapOCL.cpp: In lambda function:
ffsobel_pipe+mapOCL.cpp:270:35: error: ‘EOS’ was not declared in this scope
         return static_cast<Task*>(EOS);
                                   ^~~
ffsobel_pipe+mapOCL.cpp:270:35: note: suggested alternative: ‘EOF’
         return static_cast<Task*>(EOS);
                                   ^~~
                                   EOF
ffsobel_pipe+mapOCL.cpp: In lambda function:
ffsobel_pipe+mapOCL.cpp:283:35: error: ‘GO_ON’ was not declared in this scope
         return static_cast<Task*>(GO_ON);
                                   ^~~~~

Environment:
Ubuntu 18.04.4
g++ (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
FastFlow's git commit: 9ba56e7, error also happens in v3.0 (ad4dccf).

I just tested v2.2 (b396caa) and the error is gone.

Build Fails

[ 28%] Building CXX object tests/CMakeFiles/test_blk3_BLOCKING.dir/test_blk3.cpp.o
fastflow/tests/test_blk3.cpp: In member function ‘virtual fftask_t* First::svc(fftask_t*)’:
/home/kbw/Code/git/fastflow/tests/test_blk3.cpp:75:36: error: ‘BLK’ was not declared in this scope
   75 |                 while(!ff_send_out(BLK));
      |                                    ^~~
fastflow/tests/test_blk3.cpp:78:36: error: ‘NBLK’ was not declared in this scope
   78 |                 while(!ff_send_out(NBLK));
      |                                    ^~~~
fastflow/tests/test_blk3.cpp:81:36: error: ‘BLK’ was not declared in this scope
   81 |                 while(!ff_send_out(BLK));
      |                                    ^~~
make[2]: *** [tests/CMakeFiles/test_blk3_BLOCKING.dir/build.make:63: tests/CMakeFiles/test_blk3_BLOCKING.dir/test_blk3.cpp.o] Error 1

BLK and NBLK are undefined.

Feature request: ability to set CPU affinity manually for a node

Currently, FastFlow allows setting CPU affinity for nodes using either default_mapping (which assigns cores in round-robin fashion over the core IDs in FF_MAPPING_STRING) or no_mapping (which allows a node to run on any CPU core).

Could you, please, make a small (but practically very important) addition to the code to allow a user to set CPU affinity manually for each node?
