Comments (10)

PhilippSalvisberg commented on July 3, 2024

This commit is related: 7cb32ba

from plsql-formatter-settings.

PhilippSalvisberg commented on July 3, 2024

When a file is read, the default encoding of the platform (OS) is used. On Windows 10 this is not UTF-8; in Germany it is probably Windows-1252. Nowadays on Windows we have a mix of files, some in UTF-8 and some in the default Windows character set, and finding out which is which is quite difficult. The best approach I know is the CharsetDetector of Apache Tika. It basically picks the character set with the highest probability, which is good enough.

It would probably make sense to extend format.js accordingly. However, this JavaScript is designed to work with SQLcl and the Java libraries it provides, and Apache Tika is not among them. So we need another way to detect the character set, maybe via java.nio.charset.CharsetDecoder. This should be doable with a list of character sets to try, falling back to the platform character set if none of them can be identified.
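As a rough illustration of the CharsetDecoder idea, here is a minimal sketch in plain Java. It is not the format.js implementation. The class name `CharsetGuess`, the method names and the candidate list are assumptions made up for this example. It tries each candidate with a strict decoder and falls back to a given charset if none decodes cleanly. Single-byte character sets such as ISO-8859-1 accept practically any byte sequence, so a strict UTF-8 check is the useful first candidate.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of SQLcl or format.js.
final class CharsetGuess {

    /** True if the bytes decode in the given charset without malformed or unmappable input. */
    static boolean decodesCleanly(byte[] bytes, Charset charset) {
        try {
            charset.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    /** Returns the first candidate that decodes cleanly, otherwise the fallback. */
    static Charset detect(byte[] bytes, List<Charset> candidates, Charset fallback) {
        for (Charset candidate : candidates) {
            if (decodesCleanly(bytes, candidate)) {
                return candidate;
            }
        }
        return fallback;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java CharsetGuess.java <file>");
            return;
        }
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        // Assumed candidate list: strict UTF-8 first, platform charset as fallback.
        List<Charset> candidates = Arrays.asList(StandardCharsets.UTF_8);
        System.out.println(detect(bytes, candidates, Charset.defaultCharset()));
    }
}
```

Unlike Tika's probability-based detection, this only answers "is this valid in charset X?", so it cannot tell two single-byte character sets apart.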

In the meantime using -Dfile.encoding is probably the best option.

BTW: How do you deploy the code into the database? How is the character set identified there?

PhilippSalvisberg commented on July 3, 2024

The following lines should be changed in format.js:

Both operations should use the same character set. The character set should be one of the following:

  • default (platform/OS specific, as today)
  • auto (detect character set based on content)
  • as specified by a parameter

This would be the most flexible solution. I'd go with "auto" as the new default.
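To make the three options concrete, here is a hedged sketch in plain Java of how such an option could be resolved and then applied to both the decoding and the encoding step. The class `EncodingOption`, the option values `"auto"` and `"default"` as strings, and the `reformat` helper are hypothetical names for this example and do not describe the actual format.js change. The "auto" branch is simplified to a strict UTF-8 check with the platform charset as fallback.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.function.UnaryOperator;

// Hypothetical helper, not part of SQLcl or format.js.
final class EncodingOption {

    /** Maps the option to a charset: "auto" (or null), "default", or an explicit charset name. */
    static Charset resolve(String option, byte[] content) {
        if (option == null || option.equalsIgnoreCase("auto")) {
            // Simplified detection: valid UTF-8 wins, otherwise use the platform charset.
            return isStrictUtf8(content) ? StandardCharsets.UTF_8 : Charset.defaultCharset();
        }
        if (option.equalsIgnoreCase("default")) {
            return Charset.defaultCharset();
        }
        // Explicit name; Charset.forName throws for illegal or unsupported names.
        return Charset.forName(option);
    }

    /** A new decoder reports malformed input by default, so decode() throws on invalid UTF-8. */
    static boolean isStrictUtf8(byte[] content) {
        try {
            StandardCharsets.UTF_8.newDecoder().decode(ByteBuffer.wrap(content));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    /** Decodes, applies the formatter, and encodes again with the same charset. */
    static byte[] reformat(byte[] content, String option, UnaryOperator<String> formatter) {
        Charset charset = resolve(option, content);
        String source = new String(content, charset);
        return formatter.apply(source).getBytes(charset);
    }
}
```

A caller could pair this with `Files.readAllBytes` and `Files.write`, so the file is read and written as bytes and the character set is decided in exactly one place.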

PhilippSalvisberg commented on July 3, 2024

On a side note: the only thing I do not like about the pre-commit hook is that, when you use it, your changes are blended with formatter changes in one commit. Ideally I'd see two separate commits: one with manual changes and one with automated (formatter) changes.

Yes, when the formatter is applied to a file for the first time, the complete file is formatted. This way you cannot distinguish the original change from the re-formatting of other, unrelated code.

When you decide to use the Git hook in a project, I recommend formatting the complete code base with the formatter. You can do that with the configured settings by running the following in the root folder of your project (in Git Bash on Windows):

.git/hooks/pre-commit .

Then you can commit the changes with a message such as "formatted with the new formatter settings".

From that point on, the scope of the committed changes should not be affected by the formatter.

jgebal commented on July 3, 2024

We use a combination of SQLcl and Flyway to orchestrate deployments.
Now that I have checked, I see that we always explicitly set the following before calling various operations from the command line:

$env:java_tool_options="-Dfile.encoding=UTF-8 -Duser.language=en -Duser.country=EN"
chcp 65001

jgebal commented on July 3, 2024

> When you decide to use the Git hook in a project, I recommend formatting the complete code base with the formatter.

That approach also has two sides: your entire DB code is then different from what is in version control.

Well, whatever you do, there will be some inconsistency somewhere unless you re-deploy the entire DB sources.

I need to think and discuss which is worse :)

PhilippSalvisberg commented on July 3, 2024

> We use a combination of SQLcl and Flyway to orchestrate deployments. Now that I have checked, I see that we always explicitly set the following before calling various operations from the command line:
>
> $env:java_tool_options="-Dfile.encoding=UTF-8 -Duser.language=en -Duser.country=EN"
> chcp 65001

Thanks for the update, Jacek. I expected something like that. This just confirms that format.js behaves like SQLcl. Hence I do not consider this a bug.

Nevertheless, I think it would be a good idea to try to detect the character set and add an option to override the default behavior (besides using -Dfile.encoding).

PhilippSalvisberg commented on July 3, 2024

> > When you decide to use the Git hook in a project, I recommend formatting the complete code base with the formatter.
>
> That approach also has two sides: your entire DB code is then different from what is in version control.
>
> Well, whatever you do, there will be some inconsistency somewhere unless you re-deploy the entire DB sources.
>
> I need to think and discuss which is worse :)

I understand. However, the code in the database and the deployment process are definitely out of scope. If you want to ensure that there are two commits, one that only applies the formatter to the code without your change and one that contains the formatted change, then you need to implement that with your own shell script. The pre-commit Git hook is not suited for something like that. However, having to run your own script will complicate things if you use Git from an IDE.

Maybe not using the formatter is the best option in your case.

jgebal commented on July 3, 2024

I think the best option is to do as you suggested: format the entire repository in one commit. I only need to make sure there are no significant changes in the code (such as those I saw with UTF-8 files and a non-UTF-8 default console encoding).

Thank you, Philipp.

PhilippSalvisberg commented on July 3, 2024

closed via #231
