neicnordic / dcache-endit-provider
dCache nearline storage provider for integrating with Endit
License: GNU Affero General Public License v3.0
Just a heads-up: we tested an adapted version of endit-provider at TRIUMF with dCache v7.2. With 7.2, WatchingEnditNearlineStorage raised exceptions caused by the following one:
java.lang.AbstractMethodError: Receiver class org.ndgf.endit.WatchingEnditNearlineStorage does not define or inherit an implementation of the resolved method 'void start()' of interface org.dcache.pool.nearline.spi.NearlineStorage.
It turned out that NearlineStorage in v7.2 introduced a new start() method:
default void start() throws IOException {}
which the private start() method in WatchingEnditNearlineStorage (WENS) cannot implement. We changed the access modifier of start() in WENS to public as a quick fix. I'd like to let you know that this provider may need the same change, or a cleaner one.
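A minimal sketch of the quick fix described above. The types here are hypothetical stand-ins (NearlineStorageSketch, WatchingStorageSketch), not the real dCache or provider classes; only the default start() signature is taken from the report.

```java
import java.io.IOException;

class StartDemo {
    public static void main(String[] args) throws IOException {
        WatchingStorageSketch storage = new WatchingStorageSketch();
        NearlineStorageSketch spi = storage;
        spi.start(); // dispatches to the public override below
        System.out.println("started = " + storage.started);
    }
}

// Hypothetical stand-in for org.dcache.pool.nearline.spi.NearlineStorage in
// dCache 7.2; only the default start() method quoted in the report is shown.
interface NearlineStorageSketch {
    default void start() throws IOException {}
}

// Hypothetical stand-in for WatchingEnditNearlineStorage.
class WatchingStorageSketch implements NearlineStorageSketch {
    boolean started = false;

    // Declared public: a method implementing an interface method cannot
    // have weaker access than the interface declares.
    @Override
    public void start() {
        started = true; // the real provider would start its watcher here
    }
}
```

A private start() would be rejected by the compiler when built against the 7.2 interface, so the runtime AbstractMethodError suggests the plugin was compiled against an older interface that had no start() method.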
The polling provider currently defaults to -threads=1, which is too low for anything except functional/developer tests.
Consequently, this should be changed to a more reasonable default. Earlier tests have shown that the big performance improvement happens when the thread count is increased to 20, so let's use that as the default instead.
As a reference, NDGF uses 200 threads on their production tape pools for absolute maximum performance when processing huge numbers of requests.
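As an illustration of the proposed default, here is a small sketch that reads a thread count from a property map and falls back to 20. The helper threadCount and the map-based lookup are assumptions for this example; the provider's actual option parsing is not shown in this issue.

```java
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ThreadsDefaultDemo {
    // Default proposed in this issue; the previous default was 1.
    static final int DEFAULT_THREADS = 20;

    // Hypothetical helper: returns the configured "threads" value,
    // or DEFAULT_THREADS when the option is absent.
    static int threadCount(Map<String, String> properties) {
        String value = properties.get("threads");
        if (value == null) {
            return DEFAULT_THREADS;
        }
        int n = Integer.parseInt(value.trim());
        if (n < 1) {
            throw new IllegalArgumentException("threads must be >= 1, got " + n);
        }
        return n;
    }

    public static void main(String[] args) {
        int n = threadCount(Map.of());
        ExecutorService pool = Executors.newFixedThreadPool(n);
        System.out.println("polling threads: " + n);
        pool.shutdown();
    }
}
```

A site that wants the NDGF-style setup would pass threads=200 explicitly instead of relying on the default.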
Considering the abort() function: I think it should probably also delete the final destination file if it exists. Since the stage was aborted, the pool will have no knowledge of the file if the provider was too quick in placing it there. This of course assumes that abort() isn't used for general cleanup after a successful stage.
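The cleanup suggested above could look roughly like this. It is a sketch only: cleanupAfterAbort and its two path parameters are hypothetical names, and it assumes abort() is never called after a successful stage.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

class AbortCleanupDemo {
    // Hypothetical abort cleanup: removes the request file and, if the
    // staged file was already placed, the final destination file as well.
    // Returns true when a destination file was found and deleted.
    static boolean cleanupAfterAbort(Path requestFile, Path destinationFile) {
        try {
            Files.deleteIfExists(requestFile);
            return Files.deleteIfExists(destinationFile);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("endit-abort-demo");
        Path request = Files.createFile(dir.resolve("request"));
        Path staged = Files.createFile(dir.resolve("staged"));
        boolean removed = cleanupAfterAbort(request, staged);
        System.out.println("destination removed: " + removed);
        Files.delete(dir);
    }
}
```

Files.deleteIfExists does nothing when the file is absent, so the same call covers both the case where the file was placed too early and the case where it never arrived.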
It looks to me that there is no check in poll()
Hi,
I am trying to install the plugin for using it in a new dCache instance.
The compilation fails with the following error:
[ERROR] /root/dcache-endit-provider/src/main/java/org/ndgf/endit/AbstractEnditNearlineStorage.java:[137,24] method transformAsync in class com.google.common.util.concurrent.Futures cannot be applied to given types;
required: com.google.common.util.concurrent.ListenableFuture<I>,com.google.common.util.concurrent.AsyncFunction<? super I,? extends O>,java.util.concurrent.Executor
found: com.google.common.util.concurrent.ListenableFuture<java.lang.Void>,<anonymous com.google.common.util.concurrent.AsyncFunction<java.lang.Void,java.lang.Void>>
reason: cannot infer type-variable(s) I,O
(actual and formal argument lists differ in length)
As far as I can see, Maven is using the right Java version:
# mvn -version
Apache Maven 3.0.5 (Red Hat 3.0.5-17)
Maven home: /usr/share/maven
Java version: 1.8.0_342, vendor: Red Hat, Inc.
Java home: /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.342.b07-1.el7_9.x86_64/jre
Default locale: en_US, platform encoding: ANSI_X3.4-1968
OS name: "linux", version: "3.10.0-1160.76.1.el7.x86_64", arch: "amd64", family: "unix"
Maybe I am missing something obvious, could you please help me with this?
Thank you.
Cristina
To allow dsmc to finish setting attributes etc., there is a sleep() in the StageTask poll().
I suspect that this sleep() might be the reason for the somewhat unexpected behavior that the WatchingProvider is so slow, and for the observation that the PollingProvider performs much better provided that we allocate a LOT of threads to it.
My reasoning is that although it's a Thread.sleep(), it still suspends execution of the thread. This will wreak havoc with the watching provider's performance and also increases the likelihood of event overflows. For the polling provider, the GRACE_PERIOD of 1000 ms correlates directly with the observed staging performance of roughly 1 Hz per thread.
What we really ought to do is something along the lines of:
GRACE_PERIOD
I believe this would allow threads to do actual work instead of sleeping all the time.
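One possible shape for a non-blocking grace period, offered as a sketch and not as the design proposed in this issue: remember when a staged file was first seen and report "not ready" until GRACE_PERIOD has elapsed, so the calling thread can move on to other requests. The class GraceTracker and its isReady method are hypothetical names, and the clock is passed in to keep the example deterministic.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class GracePeriodDemo {
    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("endit-stage-demo", ".dat");
        GraceTracker tracker = new GraceTracker();
        System.out.println("at 0 ms:    " + tracker.isReady(file, 0));
        System.out.println("at 1000 ms: " + tracker.isReady(file, 1000));
        Files.delete(file);
    }
}

// Hypothetical helper: tracks when each file was first observed.
class GraceTracker {
    static final long GRACE_PERIOD_MS = 1000;
    private final Map<Path, Long> firstSeen = new ConcurrentHashMap<>();

    // Returns true once the file has existed for at least GRACE_PERIOD_MS
    // since it was first observed. Never sleeps; callers retry later.
    boolean isReady(Path file, long nowMillis) {
        if (!Files.exists(file)) {
            firstSeen.remove(file);
            return false;
        }
        long seenAt = firstSeen.computeIfAbsent(file, f -> nowMillis);
        if (nowMillis - seenAt < GRACE_PERIOD_MS) {
            return false; // still inside the grace period
        }
        firstSeen.remove(file);
        return true;
    }
}
```

The first-seen time is used here instead of the file's modification time because a restore tool may set the modification time back to the original value, which would make an age check on it unreliable.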
Hello,
TRIUMF tested an adapted version of 1.0.9 with dCache 7.2.25 and got flush requests stuck. It turned out that the following statement in FlushTask.java was the source of the issue, and commenting it out bypassed the issue:
xattrs = request.getFileAttributes().getXattrs();
By stuck I mean that no flush requests were processed. To track it down, I added a few debug messages, including ones before and after the call. The debug message before the call appeared only once, for the first flush request, and no other debug messages from FlushTask showed up after it. There were no similar debug messages from the remaining flush requests, so it seemed that flush requests were stuck.
There were still other debug messages from WatchingEnditNearlineStorage, so the plugin did not seem to have crashed. We tested flush requests only, so we're not sure whether stage requests would have worked.
After more investigation, I found that XATTR was not defined, so guard() raised IllegalStateException. I wrapped the statement in an if/else as a temporary fix.
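The if/else workaround can be sketched as below. FileAttributesSketch is a hypothetical stand-in that only mimics the reported behavior (getXattrs() throws IllegalStateException when XATTR is undefined); it is not the dCache class, and the method names on it are assumptions.

```java
import java.util.Collections;
import java.util.Map;

class XattrGuardDemo {
    // Returns the xattrs when they are defined, otherwise an empty map,
    // so the caller never triggers the IllegalStateException.
    static Map<String, String> xattrsOrEmpty(FileAttributesSketch attrs) {
        if (attrs.isXattrDefined()) {
            return attrs.getXattrs();
        } else {
            return Collections.emptyMap();
        }
    }

    public static void main(String[] args) {
        FileAttributesSketch attrs = new FileAttributesSketch();
        System.out.println("xattrs: " + xattrsOrEmpty(attrs));
    }
}

// Hypothetical stand-in for the file attributes object in the report.
class FileAttributesSketch {
    private Map<String, String> xattrs; // null means "XATTR not defined"

    boolean isXattrDefined() {
        return xattrs != null;
    }

    void setXattrs(Map<String, String> values) {
        xattrs = values;
    }

    Map<String, String> getXattrs() {
        if (xattrs == null) {
            throw new IllegalStateException("Attribute is not defined: XATTR");
        }
        return xattrs;
    }
}
```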
The first question that came to me was: are we the only site that has this issue? If so, why was XATTR not defined in our case? What is needed to fill the map?
The second question is what happened to the exception. I'm not a Java person, so I can only guess that something higher up in the call stack handled it, or that only that line crashed(?).
Could you please enlighten me about it?
Thank you and kind regards,
Yun-Ha
P.S.
When I first adapted Endit, when Gerd introduced this wonderful plugin, I changed activate() to activateWithPath() to guarantee that we're able to get path info all the time. If I remember correctly, at that time it was not guaranteed that path info was always available in StorageInfo. I don't think that results in undefined XATTR, but that's the only technical difference.
It seems that now you're extracting path info from StorageInfo. Is path always in StorageInfo now? If so, I'd like to change it back to the simple activate() method.
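On the second question above (what happened to the exception), one general Java behavior that would produce these symptoms: when a task submitted to an ExecutorService throws an unchecked exception, the exception is stored in the returned Future and nothing is logged unless some code calls get() on it. Whether the flush path in this plugin runs that way is an assumption, not something confirmed here. The sketch below only demonstrates the mechanism with the standard library.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SwallowedExceptionDemo {
    // Surfaces a failure captured inside a Future, if there was one.
    static String describeFailure(Future<?> future) {
        try {
            future.get();
            return "no failure";
        } catch (ExecutionException e) {
            return e.getCause().toString();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        }
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Callable<Void> task = () -> {
            throw new IllegalStateException("Attribute is not defined: XATTR");
        };
        // Nothing is printed at this point, although the task has failed.
        Future<Void> future = executor.submit(task);
        // Only an explicit get() reveals the exception.
        System.out.println(describeFailure(future));
        executor.shutdown();
    }
}
```

If a request's completion callback is only invoked at the end of such a task, an exception thrown earlier would leave the request pending forever, which from the outside looks like a stuck flush.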