
Comments (3)

rvantonder commented on August 11, 2024

Hi there! I did the following (I am not on a Windows machine, but I think this will help us get to the issue):

echo -n '\r\n' > file.txt 
# now the file.txt contains only `\r\n`
comby ':[1]' '' -match-only -json-lines file.txt 2> /dev/null | python -m json.tool

The output of the above is:

{
    "matches": [
        {
            "environment": [
                {
                    "range": {
                        "end": {
                            "column": 1,
                            "line": 2,
                            "offset": 2
                        },
                        "start": {
                            "column": 1,
                            "line": 1,
                            "offset": 0
                        }
                    },
                    "value": "\n",
                    "variable": "1"
                }
            ],
            "matched": "\r\n",
            "range": {
                "end": {
                    "column": 1,
                    "line": 2,
                    "offset": 2
                },
                "start": {
                    "column": 1,
                    "line": 1,
                    "offset": 0
                }
            }
        }
    ],
    "uri": "/private/tmp/file.txt"
}
  • The "value": "\n" part looks like a bug: it should likely be "\r\n". In that case, the reported range for "variable": "1" does look wrong, yes.

  • The range for matched looks OK to me (offsets are 0-based; line and column are 1-based). Do you expect different values here?

from comby.

vn-ki-cn commented on August 11, 2024
/tmp/comby-fixer-15612932977450046446.tmp
$ echo -n "this is line 1\r\nthis is line 2 \r\n" > file.txt                                           

/tmp/comby-fixer-15612932977450046446.tmp
$ xxd file.txt 
00000000: 7468 6973 2069 7320 6c69 6e65 2031 0d0a  this is line 1..
00000010: 7468 6973 2069 7320 6c69 6e65 2032 200d  this is line 2 .
00000020: 0a                                       .

/tmp/comby-fixer-15612932977450046446.tmp
$ comby ':[1]' '' -match-only -json-lines file.txt 2> /dev/null | python -m json.tool
{
    "uri": "/tmp/comby-fixer-15612932977450046446.tmp/file.txt",
    "matches": [
        {
            "range": {
                "start": {
                    "offset": 0,
                    "line": 1,
                    "column": 1
                },
                "end": {
                    "offset": 33,
                    "line": 3,
                    "column": 1
                }
            },
            "environment": [
                {
                    "variable": "1",
                    "value": "this is line 1\nthis is line 2 \n",
                    "range": {
                        "start": {
                            "offset": 0,
                            "line": 1,
                            "column": 1
                        },
                        "end": {
                            "offset": 33,
                            "line": 3,
                            "column": 1
                        }
                    }
                }
            ],
            "matched": "this is line 1\r\nthis is line 2 \r\n"
        }
    ]
}

As you can see from the output above, there are 2 lines in the file, but comby thinks it's just 1 huge line and reports the range as such.

EDIT: the value appears to be wrong as well.


rvantonder commented on August 11, 2024

> As you can see from the output above, there are 2 lines in the file, but comby thinks it's just 1 huge line

The hole :[1] matches across newlines (matching is not line-based), so the matched output "this is line 1\r\nthis is line 2 \r\n" here seems correct to me. In JSON, a carriage return is encoded as the escape sequence \r and a newline as \n. Once the \r\n escape sequences are interpreted, the text corresponds to two lines. The range reported for this fragment is:

"start": {
    "offset": 0,
    "line": 1,
    "column": 1
},
"end": {
    "offset": 33,
    "line": 3,
    "column": 1
}

The current convention is to put the end of the range at the position just after the match. Thus, if the last character of the range is a newline, the position after it is the first column of the next line (line 3, column 1). It's possible to change the convention of what a range means, but the current behavior is consistent and doesn't seem incorrect to me.
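The convention can be sketched in a few lines of Python. This is only an illustration, not comby's implementation: the name position_at is hypothetical, and the sketch assumes that only \n advances the line counter while \r counts as an ordinary character.

```python
def position_at(text: str, offset: int) -> tuple[int, int]:
    """Map a 0-based offset to a 1-based (line, column) pair.

    Assumption: only "\n" starts a new line; "\r" is an ordinary character.
    """
    if not 0 <= offset <= len(text):
        raise ValueError("offset out of bounds")
    before = text[:offset]
    # Every "\n" seen before the offset moves the position one line down.
    line = before.count("\n") + 1
    # rfind returns -1 when there is no "\n", so the current line starts at 0.
    line_start = before.rfind("\n") + 1
    column = offset - line_start + 1
    return line, column


source = "this is line 1\r\nthis is line 2 \r\n"
start = position_at(source, 0)           # first character of the match
end = position_at(source, len(source))   # position just after the match
print(start, end)
```

Under these assumptions the end of the 33-character match falls on line 3, column 1, which agrees with the reported range.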


That said, the value for

                    "variable": "1",
                    "value": "this is line 1\nthis is line 2 \n",

indeed looks wrong to me, since the \r\n sequences should be in there. I'll investigate that :-)
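One way to confirm the discrepancy is to compare the reported value with the slice of the source covered by the reported offsets. The following is a sketch only: value_matches_source is a hypothetical helper, the JSON is abbreviated from the output earlier in the thread, and it assumes offsets count characters (true for this ASCII-only file).

```python
import json


def value_matches_source(source: str, hole: dict) -> bool:
    """Check that a hole's value equals the source text in its offset range."""
    start = hole["range"]["start"]["offset"]
    end = hole["range"]["end"]["offset"]
    return source[start:end] == hole["value"]


# Contents of file.txt as shown by xxd (CRLF line endings).
source = "this is line 1\r\nthis is line 2 \r\n"

# Environment entry for :[1], abbreviated from the reported output.
# The raw string keeps the backslashes so json.loads decodes the escapes.
reported = json.loads(r'''
{"variable": "1",
 "value": "this is line 1\nthis is line 2 \n",
 "range": {"start": {"offset": 0, "line": 1, "column": 1},
           "end": {"offset": 33, "line": 3, "column": 1}}}
''')

print(value_matches_source(source, reported))  # False: the \r characters are missing
```

The range spans 33 characters while the decoded value holds only 31, so the two carriage returns are lost in value even though the range still accounts for them.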
