Comments (4)
@shengnoah Yes, that works. When you package your project, you just need to bundle the SDK jars into your own jar.
For example:
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <shadedArtifactAttached>false</shadedArtifactAttached>
    <outputFile>${project.build.directory}/shaded/emr-examples_2.10-${project.version}.jar</outputFile>
    <artifactSet>
      <includes>
        <include>com.aliyun:emr-sdk_2.10</include>
        <include>com.aliyun:emr-core</include>
      </includes>
    </artifactSet>
  </configuration>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
    </execution>
  </executions>
</plugin>
from aliyun-emapreduce-datasources.
One suggestion about the examples: it would help if they were all written the same way.
Either like this:
val conf = new SparkConf().setAppName("Test OSS")
conf.set("spark.hadoop.fs.oss.accessKeyId", accessKeyId)
conf.set("spark.hadoop.fs.oss.accessKeySecret", accessKeySecret)
conf.set("spark.hadoop.fs.oss.endpoint", endpoint)
conf.set("spark.hadoop.fs.oss.impl", "com.aliyun.fs.oss.nat.NativeOssFileSystem")
Or consistently like this:
val ossData = OssOps(sc, endpoint, accessKeyId, accessKeySecret).readOssFile(inputPath, numPartitions)
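The README.md instructions use conf.set, but some of the projects set these values and some do not, or they set one key more or one key fewer.
The TestOss code in the examples, for instance, does not call conf.set at all.
The intent is understandable either way, but it is easy to get confused.
Thanks for the answer!
If it cannot be made standalone, bundling the SDK into the jar is fine too.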
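@shengnoah With the first approach you can use Spark's own sc.textFile(...) directly, which may feel more familiar. OssOps is provided mainly to keep a consistent style with the other interfaces, such as ODPS. For OSS both approaches work, so the choice is up to you.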
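To make the first approach consistent across examples, the four `conf.set` calls quoted earlier in this thread could be collected in one place. The following is only a minimal sketch in plain Scala: `OssSettings` is a hypothetical helper name, not part of the SDK, and the keys and the filesystem class name are copied from the snippet above rather than checked against a particular SDK version.

```scala
// Hypothetical helper (not part of the SDK): keeps the OSS settings from this
// thread in one place so that every example sets the same four keys.
object OssSettings {
  // Filesystem class name as quoted in the snippet earlier in this thread.
  val NativeOssImpl: String = "com.aliyun.fs.oss.nat.NativeOssFileSystem"

  // Returns the Spark configuration entries for the given credentials and endpoint.
  def apply(accessKeyId: String,
            accessKeySecret: String,
            endpoint: String): Map[String, String] =
    Map(
      "spark.hadoop.fs.oss.accessKeyId" -> accessKeyId,
      "spark.hadoop.fs.oss.accessKeySecret" -> accessKeySecret,
      "spark.hadoop.fs.oss.endpoint" -> endpoint,
      "spark.hadoop.fs.oss.impl" -> NativeOssImpl
    )
}
```

With Spark on the classpath, the map could be applied with `new SparkConf().setAppName("Test OSS").setAll(OssSettings(accessKeyId, accessKeySecret, endpoint))`, after which `sc.textFile(inputPath)` would read the data as described above. Whether that matches the exact setup of each example project is an assumption.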
@uncleGen Got it, thanks!