Comments (8)

dongjoon-hyun commented on July 25, 2024

Yep. That's the test case which I wrote in the Spark community. :)
Thanks, @guiyanakuang. I'll close this issue because it seems that @liujinhui1994 found the answer here.

from orc.

dongjoon-hyun commented on July 25, 2024

Did you check the Apache ORC webpage about Spark configuration, @liujinhui1994? The encryption options should be provided in the same way for both writing and reading, so we prefer to put them in the table properties.

CREATE TABLE encrypted (
  ssn STRING,
  email STRING,
  name STRING
)
USING ORC
OPTIONS (
  hadoop.security.key.provider.path "kms://http@localhost:9600/kms",
  orc.key.provider "hadoop",
  orc.encrypt "pii:ssn,email",
  orc.mask "nullify:ssn;sha256:email"
)

liujinhui1994 commented on July 25, 2024

I checked there, but it only shows how to configure a table. It does not show how to do the same through the DataSource API, and my job does not use tables.
Does the above configuration also work when reading and writing through the DataSource API?

@dongjoon-hyun

guiyanakuang commented on July 25, 2024

@liujinhui1994 Are you using a custom DataSource to read and write ORC files?
Maybe this test case can help you:
https://github.com/apache/orc/blob/main/java/core/src/test/org/apache/orc/impl/TestEncryption.java

liujinhui1994 commented on July 25, 2024

Is it possible to do something like Parquet encryption, passing the settings through the Hadoop configuration in the following way?
JavaSparkContext jsc = new JavaSparkContext(spark.sparkContext());
jsc.hadoopConfiguration().set("", "");

@guiyanakuang

liujinhui1994 commented on July 25, 2024

https://spark.apache.org/docs/latest/sql-data-sources-parquet.html

guiyanakuang commented on July 25, 2024

I think it's similar, but I haven't tried using it this way myself:
https://github.com/apache/spark/blob/c55b9fd1e014fac979b1e42f5a880e7b63286a54/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcEncryptionSuite.scala#L27-L60
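
A minimal Java sketch of that idea, for anyone trying it without a table: it collects the four keys from the CREATE TABLE example above and pushes them into any (key, value) setter. The class and method names (OrcEncryptionSettings, settings, applyTo) are hypothetical and exist only for illustration. Whether Spark's ORC writer and reader pick these keys up from the Hadoop configuration is not confirmed in this thread; the linked Spark test appears to set them as session-level SQL configuration instead, so treat both call sites in the comments as things to try rather than a verified recipe.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiConsumer;

public class OrcEncryptionSettings {

    // Keys and values are copied from the CREATE TABLE example earlier in this thread.
    // The key provider URI is passed in because it depends on the environment.
    public static Map<String, String> settings(String keyProviderUri) {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("hadoop.security.key.provider.path", keyProviderUri);
        m.put("orc.key.provider", "hadoop");
        m.put("orc.encrypt", "pii:ssn,email");
        m.put("orc.mask", "nullify:ssn;sha256:email");
        return m;
    }

    // Pushes every entry into the given setter, in insertion order.
    // Untested assumption: with Spark on the classpath the setter could be
    //   jsc.hadoopConfiguration()::set   (the approach asked about above), or
    //   spark.conf()::set                (session-level configuration).
    public static void applyTo(BiConsumer<String, String> setter,
                               Map<String, String> settings) {
        settings.forEach(setter);
    }

    public static void main(String[] args) {
        // Stand-in target so the sketch runs without Spark or Hadoop.
        Map<String, String> target = new LinkedHashMap<>();
        applyTo(target::put, settings("kms://http@localhost:9600/kms"));
        target.forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

As noted earlier in the thread, the same settings need to be in place for both the write and the read, which is the reason for keeping them in one shared map.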

liujinhui1994 commented on July 25, 2024

OK, I'll try it. Thanks, @guiyanakuang.
