
Comments (15)

dotnwat avatar dotnwat commented on August 18, 2024

Hi @wwang-pivotal, I'll take a look at this this week. If the changes aren't major, it shouldn't take more than a day or two. Patches welcome too :)

from cephfs-hadoop.

wormwang avatar wormwang commented on August 18, 2024

Have you had a chance to look at the issue?

m0zes avatar m0zes commented on August 18, 2024

This is certainly one of the changes needed, though it only gets things partially working with Hadoop 2.6.0. I still can't get it to run YARN jobs.

diff --git a/src/main/java/org/apache/hadoop/fs/ceph/CephFileSystem.java b/src/main/java/org/apache/hadoop/fs/ceph/CephFileSystem.java
index a27384f..6f0df53 100644
--- a/src/main/java/org/apache/hadoop/fs/ceph/CephFileSystem.java
+++ b/src/main/java/org/apache/hadoop/fs/ceph/CephFileSystem.java
@@ -78,6 +78,10 @@ public class CephFileSystem extends FileSystem {
   public CephFileSystem() {
   }

+  protected int getDefaultPort() {
+    return 6789;
+  }
+
   /**
    * Create an absolute path using the working directory.
    */
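For context on why this override matters (my reading of Hadoop's behavior, not something stated in the patch): `FileSystem.checkPath` compares the filesystem URI's port against each path's port, treating a missing port as `getDefaultPort()`, which the base class leaves at 0. A filesystem registered as `ceph://host/` then rejects paths written as `ceph://host:6789/...`. A minimal illustration of that comparison, with a hypothetical `portsMatch` helper standing in for the real check:

```java
import java.net.URI;

public class CheckPathSketch {
    // Ceph monitor default port; the patch makes getDefaultPort() return this.
    static final int CEPH_DEFAULT_PORT = 6789;

    // Rough sketch of the port comparison FileSystem.checkPath performs:
    // a URI with no explicit port (getPort() == -1) is treated as the default.
    static boolean portsMatch(URI fsUri, URI pathUri, int defaultPort) {
        int fsPort = fsUri.getPort() == -1 ? defaultPort : fsUri.getPort();
        int pathPort = pathUri.getPort() == -1 ? defaultPort : pathUri.getPort();
        return fsPort == pathPort;
    }

    public static void main(String[] args) {
        URI fs = URI.create("ceph://hobbit01/");           // no explicit port
        URI path = URI.create("ceph://hobbit01:6789/tmp"); // explicit port

        // With the base-class default of 0, the ports disagree:
        System.out.println(portsMatch(fs, path, 0));                 // prints "false"
        // With the patched default of 6789, they match:
        System.out.println(portsMatch(fs, path, CEPH_DEFAULT_PORT)); // prints "true"
    }
}
```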

dotnwat avatar dotnwat commented on August 18, 2024

Thanks @m0zes. I've dropped the ball on 2.7, but I have some updates pending for that. I've only heard of a few problems with 2.6, and in those cases some things were not reproducible. It would be helpful to know what other problems you were seeing with 2.6.

m0zes avatar m0zes commented on August 18, 2024

Just trying one of the examples here, although even "debug" logging doesn't seem to give me any idea of what is actually wrong. I believe this is at the filesystem level, though.

# hadoop  jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples-2.6.0-cdh5.7.0.jar pi 10 100
16/04/26 12:40:40 DEBUG util.Shell: setsid exited with exit code 0
Number of Maps  = 10
Samples per Map = 100
16/04/26 12:40:40 DEBUG lib.MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess with annotation @org.apache.hadoop.metrics2.annotation.Metric(valueName=Time, value=[Rate
of successful kerberos logins and latency (milliseconds)], about=, always=false, type=DEFAULT, sampleName=Ops)
16/04/26 12:40:40 DEBUG lib.MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure with annotation @org.apache.hadoop.metrics2.annotation.Metric(valueName=Time, value=[Rate
of failed kerberos logins and latency (milliseconds)], about=, always=false, type=DEFAULT, sampleName=Ops)
16/04/26 12:40:40 DEBUG lib.MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.getGroups with annotation @org.apache.hadoop.metrics2.annotation.Metric(valueName=Time, value=[GetGroup
s], about=, always=false, type=DEFAULT, sampleName=Ops)
16/04/26 12:40:40 DEBUG impl.MetricsSystemImpl: UgiMetrics, User and group related metrics
16/04/26 12:40:40 DEBUG util.KerberosName: Kerberos krb5 configuration not found, setting default realm to empty
16/04/26 12:40:40 DEBUG security.Groups:  Creating new Groups object
16/04/26 12:40:40 DEBUG util.NativeCodeLoader: Trying to load the custom-built native-hadoop library...
16/04/26 12:40:40 DEBUG util.NativeCodeLoader: Loaded the native-hadoop library
16/04/26 12:40:40 DEBUG security.JniBasedUnixGroupsMapping: Using JniBasedUnixGroupsMapping for Group resolution
16/04/26 12:40:40 DEBUG security.JniBasedUnixGroupsMappingWithFallback: Group mapping impl=org.apache.hadoop.security.JniBasedUnixGroupsMapping
16/04/26 12:40:40 DEBUG security.Groups: Group mapping impl=org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback; cacheTimeout=300000; warningDeltaMs=5000
16/04/26 12:40:40 DEBUG security.UserGroupInformation: hadoop login
16/04/26 12:40:40 DEBUG security.UserGroupInformation: hadoop login commit
16/04/26 12:40:40 DEBUG security.UserGroupInformation: using local user:UnixPrincipal: mozes
16/04/26 12:40:40 DEBUG security.UserGroupInformation: Using user: "UnixPrincipal: mozes" with name mozes
16/04/26 12:40:40 DEBUG security.UserGroupInformation: User entry: "mozes"
16/04/26 12:40:40 DEBUG security.UserGroupInformation: UGI loginUser:mozes (auth:SIMPLE)
16/04/26 12:40:40 DEBUG core.Tracer: sampler.classes = ; loaded no samplers
16/04/26 12:40:40 TRACE core.TracerId: ProcessID(fmt=%{tname}/%{ip}): computed process ID of "FSClient/10.5.3.30"
16/04/26 12:40:40 TRACE core.TracerPool: TracerPool(Global): adding tracer Tracer(FSClient/10.5.3.30)
16/04/26 12:40:40 DEBUG core.Tracer: span.receiver.classes = ; loaded no span receivers
16/04/26 12:40:40 TRACE core.Tracer: Created Tracer(FSClient/10.5.3.30) for FSClient
Loading libcephfs-jni from default path: /usr/lib/hadoop/lib/native
Loading libcephfs-jni: /usr/lib64/libcephfs_jni.so
Loading libcephfs-jni: /usr/lib/jni/libcephfs_jni.so
Loading libcephfs-jni: Success!
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Wrote input for Map #5
Wrote input for Map #6
Wrote input for Map #7
Wrote input for Map #8
Wrote input for Map #9
Starting Job
16/04/26 12:40:42 DEBUG security.UserGroupInformation: PrivilegedAction as:mozes (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.connect(Job.java:1272)
16/04/26 12:40:42 DEBUG mapreduce.Cluster: Trying ClientProtocolProvider : org.apache.hadoop.mapred.LocalClientProtocolProvider
16/04/26 12:40:42 DEBUG mapreduce.Cluster: Cannot pick org.apache.hadoop.mapred.LocalClientProtocolProvider as the ClientProtocolProvider - returned null protocol
16/04/26 12:40:42 DEBUG mapreduce.Cluster: Trying ClientProtocolProvider : org.apache.hadoop.mapred.YarnClientProtocolProvider
16/04/26 12:40:42 DEBUG service.AbstractService: Service: org.apache.hadoop.mapred.ResourceMgrDelegate entered state INITED
16/04/26 12:40:42 DEBUG service.AbstractService: Service: org.apache.hadoop.yarn.client.api.impl.YarnClientImpl entered state INITED
16/04/26 12:40:42 DEBUG azure.NativeAzureFileSystem: finalize() called.
16/04/26 12:40:42 DEBUG azure.NativeAzureFileSystem: finalize() called.
16/04/26 12:40:42 INFO client.RMProxy: Connecting to ResourceManager at gremlin00.beocat.ksu.edu/10.5.3.30:8032
16/04/26 12:40:42 DEBUG security.UserGroupInformation: PrivilegedAction as:mozes (auth:SIMPLE) from:org.apache.hadoop.yarn.client.RMProxy.getProxy(RMProxy.java:136)
16/04/26 12:40:42 DEBUG ipc.YarnRPC: Creating YarnRPC for org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC
16/04/26 12:40:42 DEBUG ipc.HadoopYarnProtoRPC: Creating a HadoopYarnProtoRpc proxy for protocol interface org.apache.hadoop.yarn.api.ApplicationClientProtocol
16/04/26 12:40:42 DEBUG ipc.Server: rpcKind=RPC_PROTOCOL_BUFFER, rpcRequestWrapperClass=class org.apache.hadoop.ipc.ProtobufRpcEngine$RpcRequestWrapper, rpcInvoker=org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker@3c86c285
16/04/26 12:40:42 DEBUG ipc.Client: getting client out of cache: org.apache.hadoop.ipc.Client@74107a99
16/04/26 12:40:42 DEBUG service.AbstractService: Service org.apache.hadoop.yarn.client.api.impl.YarnClientImpl is started
16/04/26 12:40:42 DEBUG service.AbstractService: Service org.apache.hadoop.mapred.ResourceMgrDelegate is started
16/04/26 12:40:42 DEBUG security.UserGroupInformation: PrivilegedAction as:mozes (auth:SIMPLE) from:org.apache.hadoop.fs.FileContext.getAbstractFileSystem(FileContext.java:334)
16/04/26 12:40:42 DEBUG mapreduce.Cluster: Picked org.apache.hadoop.mapred.YarnClientProtocolProvider as the ClientProtocolProvider
16/04/26 12:40:42 DEBUG security.UserGroupInformation: PrivilegedAction as:mozes (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Cluster.getFileSystem(Cluster.java:161)
16/04/26 12:40:42 DEBUG security.UserGroupInformation: PrivilegedAction as:mozes (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.submit(Job.java:1304)
16/04/26 12:40:42 DEBUG mapred.ResourceMgrDelegate: getStagingAreaDir: dir=/staging/mozes/.staging
16/04/26 12:40:42 TRACE ipc.ProtobufRpcEngine: 1: Call -> gremlin00.beocat.ksu.edu/10.5.3.30:8032: getNewApplication {}
16/04/26 12:40:42 DEBUG ipc.Client: The ping interval is 60000 ms.
16/04/26 12:40:42 DEBUG ipc.Client: Connecting to gremlin00.beocat.ksu.edu/10.5.3.30:8032
16/04/26 12:40:42 DEBUG ipc.Client: IPC Client (1597504843) connection to gremlin00.beocat.ksu.edu/10.5.3.30:8032 from mozes: starting, having connections 1
16/04/26 12:40:42 DEBUG ipc.Client: IPC Client (1597504843) connection to gremlin00.beocat.ksu.edu/10.5.3.30:8032 from mozes sending #0
16/04/26 12:40:42 DEBUG ipc.Client: IPC Client (1597504843) connection to gremlin00.beocat.ksu.edu/10.5.3.30:8032 from mozes got value #0
16/04/26 12:40:42 DEBUG ipc.ProtobufRpcEngine: Call: getNewApplication took 161ms
16/04/26 12:40:42 TRACE ipc.ProtobufRpcEngine: 1: Response <- gremlin00.beocat.ksu.edu/10.5.3.30:8032: getNewApplication {application_id { id: 12 cluster_timestamp: 1461615899163 } maximumCapability { memory: 8192 virtual_cores: 4 }}
16/04/26 12:40:42 DEBUG mapreduce.JobSubmitter: Configuring job job_1461615899163_0012 with /staging/mozes/.staging/job_1461615899163_0012 as the submit dir
16/04/26 12:40:42 DEBUG mapreduce.JobSubmitter: adding the following namenodes' delegation tokens:[ceph://hobbit01:6789/]
16/04/26 12:40:42 DEBUG mapreduce.JobResourceUploader: default FileSystem: ceph://hobbit01:6789
16/04/26 12:40:42 DEBUG mapreduce.JobSubmitter: Creating splits at ceph://hobbit01:6789/staging/mozes/.staging/job_1461615899163_0012
16/04/26 12:40:42 DEBUG input.FileInputFormat: Time taken to get FileStatuses: 32
16/04/26 12:40:42 INFO input.FileInputFormat: Total input paths to process : 10
16/04/26 12:40:42 DEBUG input.FileInputFormat: Total # of splits generated by getSplits: 10, TimeTaken: 35
16/04/26 12:40:43 INFO mapreduce.JobSubmitter: Cleaning up the staging area /staging/mozes/.staging/job_1461615899163_0012
java.lang.NullPointerException
        at org.apache.hadoop.io.Text.encode(Text.java:450)
        at org.apache.hadoop.io.Text.encode(Text.java:431)
        at org.apache.hadoop.io.Text.writeString(Text.java:480)
        at org.apache.hadoop.mapreduce.split.JobSplit$SplitMetaInfo.write(JobSplit.java:125)
        at org.apache.hadoop.mapreduce.split.JobSplitWriter.writeJobSplitMetaInfo(JobSplitWriter.java:193)
        at org.apache.hadoop.mapreduce.split.JobSplitWriter.createSplitFiles(JobSplitWriter.java:81)
        at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:311)
        at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:322)
        at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:200)
        at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1307)
        at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1304)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:1304)
        at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1325)
        at org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:306)
        at org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:354)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:363)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
        at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
        at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
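For what it's worth, the trace bottoms out in `Text.writeString`, which throws an NPE when handed a null string; `JobSplit$SplitMetaInfo.write` passes the split's location hostnames through it, so a null entry in a split's location array (i.e. the filesystem returning block locations with null hosts) would produce exactly this failure. A minimal stand-in, with hypothetical helper names (this is not Hadoop code):

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class SplitMetaSketch {
    // Stand-in for Text.writeString: encoding a null string throws NPE,
    // matching the top frame of the trace above.
    static byte[] writeString(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Defensive normalization a FileSystem implementation could apply
    // before hostnames reach split writing: never emit null hosts.
    static String[] sanitizeHosts(String[] hosts) {
        if (hosts == null || hosts.length == 0) return new String[] { "localhost" };
        return Arrays.stream(hosts)
                     .map(h -> h == null ? "localhost" : h)
                     .toArray(String[]::new);
    }

    public static void main(String[] args) {
        String[] hosts = { "node1", null };  // a null host, as hypothesized
        try {
            for (String h : hosts) writeString(h);
        } catch (NullPointerException e) {
            System.out.println("NPE on null hostname, as in the trace");
        }
        System.out.println(Arrays.toString(sanitizeHosts(hosts)));
        // prints "[node1, localhost]"
    }
}
```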

dotnwat avatar dotnwat commented on August 18, 2024

Wow, nothing there looks suspicious at first glance. The usual suspect is a mismatch between our bindings and what Hadoop expects, which seems to diverge occasionally. What version of Ceph are you running?

m0zes avatar m0zes commented on August 18, 2024

I built cephfs-hadoop with the 9.2.1 libcephfs jar, the 9.2.1 libcephfs_jni, and Hadoop 2.6.0-cdh5.7.0, on Ubuntu Trusty.

The cluster I'm connecting to is also 9.2.1.

m0zes avatar m0zes commented on August 18, 2024

For the life of me I can't see anything wrong with my configuration, but perhaps there is something else wrong. I know I can list, add, delete, and move files with the hdfs dfs suite of tools. Here is my configuration for reference. https://gist.github.com/m0zes/e6eb5ca39153989f7a37947a469e0b98

dbseraf avatar dbseraf commented on August 18, 2024

Has there been any progress on this lately? Anyone know whether ceph 10.2 works any better?

wormwang avatar wormwang commented on August 18, 2024

Has there been any progress on this lately in 2017? Anyone know whether ceph 10.2 or 11.2 works any better?

dotnwat avatar dotnwat commented on August 18, 2024

There hasn't been much work on this. I don't have a lot of time to work on this in the short term, but would be happy to offer basic support. Have you tried deploying the bindings?

zphj1987 avatar zphj1987 commented on August 18, 2024

@m0zes
I'm seeing the same error as the one you pasted. Did you ever resolve it?

data:2 wanted=3
17/02/28 14:26:17 DEBUG mapreduce.JobSubmitter: Creating splits at ceph://10.168.10.1:6789/tmp/hadoop-yarn/staging/root/.staging/job_1488254605886_0020
17/02/28 14:26:17 DEBUG input.FileInputFormat: Time taken to get FileStatuses: 5
17/02/28 14:26:17 INFO input.FileInputFormat: Total input paths to process : 1
17/02/28 14:26:17 INFO mapreduce.JobSubmitter: Cleaning up the staging area /tmp/hadoop-yarn/staging/root/.staging/job_1488254605886_0020
java.lang.NullPointerException
	at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getBlockIndex(FileInputFormat.java:444)
	at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:405)
	at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:301)
	at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:318)
	at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:196)
	at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
	at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
	at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
	at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1308)
	at org.apache.hadoop.examples.Grep.run(Grep.java:78)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
	at org.apache.hadoop.examples.Grep.main(Grep.java:103)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
	at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
	at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
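This trace points at `FileInputFormat.getBlockIndex`, which iterates the block-location array the filesystem returns for each input file; if `getFileBlockLocations` returns null instead of an empty or synthetic array, the loop dereferences null and throws exactly this NPE. A rough sketch of that iteration, using a hypothetical `Block` type in place of Hadoop's `BlockLocation`:

```java
public class BlockIndexSketch {
    // Hypothetical stand-in for Hadoop's BlockLocation.
    static final class Block {
        final long offset, length;
        Block(long offset, long length) { this.offset = offset; this.length = length; }
    }

    // Sketch of FileInputFormat.getBlockIndex: returns the index of the
    // block containing `offset`; NPEs if `blocks` is null, which is what a
    // filesystem returning null locations would cause.
    static int getBlockIndex(Block[] blocks, long offset) {
        for (int i = 0; i < blocks.length; i++)
            if (blocks[i].offset <= offset && offset < blocks[i].offset + blocks[i].length)
                return i;
        throw new IllegalArgumentException("offset " + offset + " outside file");
    }

    public static void main(String[] args) {
        Block[] blocks = { new Block(0, 64), new Block(64, 64) };
        System.out.println(getBlockIndex(blocks, 70)); // prints "1"
        try {
            getBlockIndex(null, 0);
        } catch (NullPointerException e) {
            System.out.println("NPE with null block locations, as in the trace");
        }
    }
}
```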

m0zes avatar m0zes commented on August 18, 2024

No. I ended up creating an individual, unreplicated RBD pool for each Hadoop node, then created six RBDs per node for parallelism and put HDFS on top of those RBDs with forced 3x replication. Not an ideal setup, but I couldn't waste any more time going down the cephfs-hadoop route.
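The layout described above could be provisioned roughly like this (a hedged sketch; pool and image names, PG counts, and sizes are illustrative, and the commands assume a working Ceph admin host):

```shell
# One unreplicated pool per Hadoop node:
ceph osd pool create hadoop-node01 128 128
ceph osd pool set hadoop-node01 size 1

# Six RBD images in that pool for parallelism (--size is in MB here):
for i in 1 2 3 4 5 6; do
  rbd create hadoop-node01/hdfs-disk$i --size 102400
done

# On the node itself: map each image, make a filesystem, mount it, and
# list the mount points in dfs.datanode.data.dir. HDFS replication
# (dfs.replication=3) then provides the redundancy the pool no longer does.
rbd map hadoop-node01/hdfs-disk1     # e.g. appears as /dev/rbd0
mkfs.xfs /dev/rbd0
mount /dev/rbd0 /hadoop/disk1
```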

zphj1987 avatar zphj1987 commented on August 18, 2024

@m0zes
I also downgraded my Hadoop version to 2.7.1.
