Comments (20)

 avatar commented on August 19, 2024

One immediate solution (about to be used at MPI, Luis?) is to allow network routing from the HPC to one or more dedicated cylc servers, without opening up the routing more generally. This clearly requires some significant action to be taken by the HPCF sysadmins, however.

from cylc-flow.

 avatar commented on August 19, 2024

One-way communication (polling)? Perhaps we could have cylc spawn light-weight local (suite host) processes alongside each remote task, that poll the remote task host(s) to determine remote task progress (remote tasks might have to write progress updates to a standard file or something). Then the local processes, on detecting a change in the status of their associated remote tasks, could invoke the required cylc messaging locally (on the suite host) to update the running suite.

Not a very elegant solution, but maybe something like this would be necessary at sites where it is not possible (or rather not allowed) to route out of the compute nodes, even just to specific dedicated suite servers as above.
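
A minimal sketch of such a local polling process, under some loud assumptions: the remote task appends one status word per line to a file, ssh to the task host works without a password, and `relay` stands in for whatever local cylc messaging call would actually be used. The function names, the status file path and the "succeeded"/"failed" words are all hypothetical.

```python
import subprocess
import time


def read_remote_status(host, status_path):
    """Fetch the lines of the remote status file with a single ssh call."""
    result = subprocess.run(
        ["ssh", "-oBatchMode=yes", host, "cat", status_path],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return []  # file not written yet, or host unreachable
    return result.stdout.splitlines()


def new_messages(seen_count, lines):
    """Return (lines not yet relayed, updated count of seen lines)."""
    return lines[seen_count:], len(lines)


def poll_task(host, status_path, relay, interval=60.0,
              final=("succeeded", "failed"),
              read=read_remote_status, sleep=time.sleep):
    """Poll until a final status appears, relaying each new line locally."""
    seen = 0
    while True:
        fresh, seen = new_messages(seen, read(host, status_path))
        for message in fresh:
            relay(message)  # placeholder for the local cylc messaging call
            if message in final:
                return message
        sleep(interval)
```

The `read` and `sleep` parameters exist only so the loop can be exercised without a real remote host.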

 avatar commented on August 19, 2024

NIWA wizard Chris has suggested that some kind of remote port forwarding solution might be possible, using (or emulating what can be done with?) ssh. I'm no networking expert, but I think the gist of it was something like this: the ssh process that submits the remote task configures (how?) a local port on the task host (localhost:N?) that forwards traffic across the ssh tunnel to the right port on the suite host. The initial ssh connection would have to be kept open while the task runs.

This (if feasible) would have the advantage of not requiring any action to be taken by your friendly but change-resistant HPC sysadmins (I think?).
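
For what it's worth, OpenSSH's `-R` option does roughly this. A small sketch of how the submitting command could be assembled, assuming the suite listens on a known port on the suite host; the function name, host name and port number are made up for illustration.

```python
def reverse_tunnel_command(task_host, suite_port, job_command, remote_port=None):
    """Build an ssh argv that runs job_command with a reverse tunnel open."""
    if remote_port is None:
        remote_port = suite_port
    # -R remote_port:localhost:suite_port asks sshd on the task host to listen
    # on remote_port and forward connections back over the ssh connection to
    # localhost:suite_port as seen from the suite host.
    forward = f"{remote_port}:localhost:{suite_port}"
    return [
        "ssh",
        "-o", "ExitOnForwardFailure=yes",  # give up if the port cannot be bound
        "-R", forward,
        task_host,
        job_command,
    ]
```

The tunnel only exists while this ssh process is alive, and the listening port is on the host ssh connects to, so whether compute nodes can reach it depends on the site.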

m214089 commented on August 19, 2024

We have been using port-forwarding for connecting from the HPC compute nodes to an external machine. The problem with this is that pyro, the high level message transfer component of cylc, is using a rather wide range of ports for communicating. And while the server port is fixed, the port used by the client is random. The tool we use for port-forwarding is http://www.accordata.de/downloads/port-proxy/index.html.

m214089 commented on August 19, 2024

It came to my mind that we might get rid of pyro and write a server ourselves in the long term. The capabilities of pyro are not really exploited by cylc. It is rather convenient to use, but as the protocol is very simple, we might even get along with something simpler, where we have more control over the ports used. I will try to discuss that with the network guys in our computing centre. I think we may even get a CS student to implement something, but first it makes sense to get a bit more understanding. We have by now cracked all protocol levels used on the client side, as we are reimplementing those in C for various reasons not to be outlined here.
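
To give an idea of how small such a server could be, here is a rough sketch using only the Python standard library. It is not cylc's or pyro's protocol: the one-line-message, "OK"-reply format and all class and function names are assumptions made up for this example.

```python
import socket
import socketserver
import threading


class MessageHandler(socketserver.StreamRequestHandler):
    """Read one newline-terminated task message and acknowledge it."""

    def handle(self):
        line = self.rfile.readline().decode("utf-8").strip()
        self.server.messages.append(line)
        self.wfile.write(b"OK\n")


class MessageServer(socketserver.ThreadingTCPServer):
    """TCP server bound to a single, explicitly chosen port."""

    allow_reuse_address = True
    daemon_threads = True

    def __init__(self, address):
        super().__init__(address, MessageHandler)
        self.messages = []  # received messages, in arrival order


def send_message(host, port, text, timeout=5.0):
    """Client side: one connection, one message, one reply."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(text.encode("utf-8") + b"\n")
        return conn.makefile("rb").readline().decode("utf-8").strip()
```

With something like this, only the one server port would need to be routed or forwarded. Authentication is left out entirely here and would have to be added.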

matthewrmshin commented on August 19, 2024

We also have this problem - communications to the outside world can only be done via the login node. It is likely that we'll go down the dedicated server route in our new environment.

(In our old environment, we used to have a poor man's way of doing "port forwarding".)

m214089 commented on August 19, 2024

We will do the same for the time being. Maybe in the end it is not really a big problem for operational or dedicated computing centres? But it will be for more 'lightweight' users ...

 avatar commented on August 19, 2024

Luis (m214089) and Matt, thanks for the info. Yes, I guess this is a bigger problem for individuals trying to use cylc than for institutions in complete charge of their own computing facility. But even so, it would be nice to be able to run cylc "out of the box" on most existing systems. If either of you understands how the port forwarding solutions work, could you write a quick "how to" that we could include in the User Guide?

 avatar commented on August 19, 2024

(issue closure above was accidental!)

Luis, you are right that Pyro is used very minimally by cylc, and we probably could quite easily replace it with a simple custom solution. In the early days cylc also used the Pyro Nameserver so that a dedicated server port was not required for each suite, but it was generally thought that this was a bad idea because it meant suites were not entirely independent (e.g. in principle a research suite could bring down an operational one by messing with the Pyro nameserver).

Update: see also Issue #72; for the moment we plan to go with Pyro4 when it gets connection authentication (probably in the next-plus-one release).

dpmatthews commented on August 19, 2024

For tasks running on hosts with no communication back, it should be possible to configure cylc to poll hosts for updates to the task status and to prevent the cylc task commands from attempting to communicate back to the server. See #86.

 avatar commented on August 19, 2024

Note my original polling suggestion above #67 (comment)

A local co-process (for want of a better word) could be launched alongside every remote task that requires polling, to do the polling and then run the right cylc messaging commands locally, in effect masquerading as the remote task. In this way we could avoid complicating cylc itself with polling code (it might be time consuming to poll for hundreds of tasks...). The local co-processes would be easy for cylc to monitor by the normal pyro-based method.

m214089 commented on August 19, 2024

I have been thinking a bit about polling, and it turns out to be much worse than enabling communication back to some kind of external server or port-forwarding, because it increases the network traffic inside the HPC machine. The problem is not the amount of data transferred, but the latency of the messages. So I think for the time being an introduction to port-forwarding is the best solution. A very stable tool, which we use for database connections out of our HPC machine, can be found here:

http://www.accordata.net/downloads/port-proxy/index.html

It can be used on a per-user basis.

hjoliver commented on August 19, 2024

@m214089 the kind of polling I was thinking of, at least in the first instance, was just using ssh to check remote files that report task progress. Does this result in the network traffic problem you're talking about?

m214089 commented on August 19, 2024

Yes, already this is too much ;-)

On 2012-05-10 4:46, Hilary James Oliver wrote:

> @m214089 the kind of polling I was thinking of, at least in the first
> instance, was just using ssh to check remote files that report task
> progress. Does this result in the network traffic problem you're
> talking about?

--
Luis Kornblueh
Max-Planck-Institute for Meteorology
Bundesstr. 53, D-20146 Hamburg, Federal Republic of Germany
Tel.: +49-40-41173289, Fax: +49-40-41173298

hjoliver commented on August 19, 2024

Luis, do you think it would be wrong to give cylc a task polling capability in spite of the above problem - it would allow users to test cylc on facilities that currently have no easy means of routing back out of the HPC. We can give warnings that it is not a good long term solution, and sysadmins can complain if it does cause a problem!

m214089 commented on August 19, 2024

For testing it is very useful, so it does make sense. Another thing it could be nice for in the long term would be to store in the background the runtime of a certain component of the suite and, if a new run takes significantly longer, to make a life check ...

And giving the tip to the users is good too, because it does contain some teaching aspects. Great solution you proposed!
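
A tiny sketch of that runtime comparison, just to make the idea concrete. The function name, the factor of two and the minimum of three earlier runs are arbitrary assumptions, not anything cylc does.

```python
from statistics import mean


def needs_life_check(past_runtimes, elapsed, factor=2.0, min_samples=3):
    """Return True when elapsed exceeds factor times the mean past runtime."""
    if len(past_runtimes) < min_samples:
        return False  # not enough history to judge
    return elapsed > factor * mean(past_runtimes)
```

A task that normally takes about 100 seconds would be polled once it has been running for more than 200 seconds.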

On 2012-05-10 7:16, Hilary James Oliver wrote:

> Luis, do you think it would be wrong to give cylc a task polling
> capability in spite of the above problem - it would allow users to test
> cylc on facilities that currently have no easy means of routing back out
> of the HPC. We can give warnings that it is not a good long term
> solution, and sysadmins can complain if it does cause a problem!

matthewrmshin commented on August 19, 2024

I wonder if it is possible to keep open a background process for a pseudo interactive ssh (bash) session to each remote host. Every now and then, the suite can send a polling command to the host via the same ssh session. It should return an output just like an interactive session. This should keep traffic to the minimum. It would be no different from a user keeping a terminal open to a remote host.

hjoliver commented on August 19, 2024

Does opening a new connection result in significantly more network chatter than maintaining and using an already-open connection? And are interactive ssh sessions better in this respect than non-interactive ssh?

matthewrmshin commented on August 19, 2024

The main advantage of a single ssh session per host is that the host is less likely to block the next ssh session because too many sessions are already open. There are probably other smaller advantages, e.g. it probably generates slightly less network traffic, as it does not have to re-authenticate and re-run all the start-up steps.
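
Roughly what I have in mind, as a sketch rather than a tested design: one long-lived shell process per host, with each polling command followed by an `echo` of a unique marker so the caller knows where its output ends. The class name and the marker scheme are invented for this example, and stderr is not captured.

```python
import subprocess
import uuid


class PersistentShell:
    """Keep one shell process open and run several commands through it."""

    def __init__(self, argv):
        # argv could be ["ssh", "-T", host, "bash"]; ["sh"] works for a local trial.
        self.proc = subprocess.Popen(
            argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE,
            text=True, bufsize=1,
        )

    def run(self, command):
        """Send one command and return the lines it wrote to stdout."""
        marker = f"__END_{uuid.uuid4().hex}__"
        self.proc.stdin.write(f"{command}\necho {marker}\n")
        self.proc.stdin.flush()
        lines = []
        while True:
            line = self.proc.stdout.readline()
            if not line:
                raise RuntimeError("shell exited before the command finished")
            line = line.rstrip("\n")
            if line.endswith(marker):
                # Output without a trailing newline shares a line with the marker.
                prefix = line[:-len(marker)]
                if prefix:
                    lines.append(prefix)
                return lines
            lines.append(line)

    def close(self):
        """Close stdin so the shell exits, then wait for it."""
        self.proc.stdin.close()
        self.proc.wait()
```

Every call to `run` reuses the same process, so over ssh there would be a single authentication per host.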

hjoliver commented on August 19, 2024

Task polling is now complete and merged to master. I've copied a few remaining issues from above to #517.
