Git Product home page Git Product logo

Comments (19)

ronaldoussoren avatar ronaldoussoren commented on June 12, 2024 2

Why do you think there's a bug here? Posixpath is documented to work on POSIX style paths and that will cause problems when working with Windows style paths. Likewise when working with POSIX style paths on using ntpath.

In particular APIs that end up interacting with the filesystem, like abspath and realpath cannot do the right thing when running on a "foreign" system. It is possible to define semantics, but I'm not convinced that's worth the extra code.

from cpython.

eryksun avatar eryksun commented on June 12, 2024 1

For comparison, note that pathlib.PurePath instances don't provide absolute() because it's not meaningful for cross-platform use. For posixpath and ntpath, I think assuming the root directory is fine, and drive "C:" for ntpath.

Here's an implementation of ntpath.abspath() with correct behavior for drive-relative paths:

try:
    from nt import _getfullpathname
except ImportError:
    # Non-Windows platforms
    def abspath(path):
        """Return the absolute version of a path."""
        path = os.fspath(path)
        if not isabs(path):
            if isinstance(path, bytes):
                drive = splitroot(path)[0] or b'C:'
                path = join(drive, b'\\', path)
            else:
                drive = splitroot(path)[0] or 'C:'
                path = join(drive, '\\', path)
        return normpath(path)
else:
    def abspath(path):
        """Return the absolute version of a path."""
        try:
            return _getfullpathname(normpath(path))
        except (OSError, ValueError):
            path = os.fspath(path)
            if not isabs(path):
                if isinstance(path, bytes):
                    cwd = os.getcwdb()
                else:
                    cwd = os.getcwd()
                cwd_drive = splitroot(cwd)[0]
                path_drive = splitroot(path)[0]
                if path_drive and path_drive != cwd_drive:
                    try:
                        cwd = _getfullpathname(path_drive)
                    except (OSError, ValueError):
                        pass
                path = join(cwd, path)
            return normpath(path)

from cpython.

nineteendo avatar nineteendo commented on June 12, 2024

If you agree I'll make a pull request. At least for posixpath, the change doesn't lead to more code.

from cpython.

nineteendo avatar nineteendo commented on June 12, 2024

I improved your snippet: now it's a bit faster and works correctly when it fails to determine the current working directory.

from cpython.

nineteendo avatar nineteendo commented on June 12, 2024

posixpath.realpath() also doesn't return an absolute path for relative paths on Windows:

>>> import posixpath
>>> posixpath.realpath('foo')
'C:\\Users\\wanne/foo'
>>> posixpath.isabs('C:\\Users\\wanne/foo')
False

from cpython.

eryksun avatar eryksun commented on June 12, 2024

I improved your snippet: now it's a bit faster and works correctly when it fails to determine the current working directory.

Great. If _getfullpathname(drive) fails, barring some catastrophic API failure, it's a ValueError because the drive string has an embedded null (e.g. '\x00:'), or because it's a bytes path that can't be decoded with the filesystem encoding (e.g. b'\xff:' given the filesystem encoding is UTF-8).

Actually, the latter UnicodeDecodeError is raised earlier in 3.13, by splitroot(). Maybe that's a bug. It's certainly new behavior compared to 3.12. The same applies to normpath() in 3.10 vs 3.11+.

from cpython.

nineteendo avatar nineteendo commented on June 12, 2024

Why do you think there's a bug here?

We're literally testing that ntpath.realpath(), ntpath.relpath() & posixpath.relpath() work partially for relative paths on the wrong platform. These tests broke when I tried to optimise os.path.abspath() & os.path.relpath().

And you would expect that os.path.abspath() & os.path.realpath() can't return relative paths under any circumstances. (especially since we're passing a valid path to them)

It is possible to define semantics, but I'm not convinced that's worth the extra code.

The only extra code is this function (a wrapper that adds the strict argument to posixpath.abspath()), matching ntpath.realpath():

def realpath(path, *, strict=False):
    """Return an absolute path."""
    if strict:
        raise NotImplementedError('realpath: strict unavailable on this platform')
    return abspath(path)

When we have a C implementation for posixpath.abspath(), all other functions will need a fallback anyway.
The one for ntpath.abspath() must be implemented separately to handle drive relative paths correctly, in case of an error.
It would be nice if these fallback functions (which are only used on the other platform) also followed the correct semantics.

from cpython.

ronaldoussoren avatar ronaldoussoren commented on June 12, 2024

Why do you think there's a bug here?

We're literally testing that ntpath.realpath(), ntpath.relpath() & posixpath.relpath() work partially for relative paths on the wrong platform. These tests broke when I tried to optimise os.path.abspath() & os.path.relpath().

And you would expect that os.path.abspath() & os.path.realpath() can't return relative paths under any circumstances. (especially since we're passing a valid path to them)

The implementation IMHO rightfully assumes that os.getcwd() returns an absolute path. Using ntpath on Unix or posixpath on Windows historically is only valid for functions that can work without interacting with the filesystem, e.g. APIs like dirname that only use string manipulation.

Other's don't work, and often cannot work in an expected way due to the difference in path separators on Windows and POSIX.

It is possible to define semantics, but I'm not convinced that's worth the extra code.

The only extra code is this function (a wrapper that adds the strict argument to posixpath.abspath()), matching ntpath.realpath():

def realpath(path, *, strict=False):
    """Return an absolute path."""
    if strict:
        raise NotImplementedError('realpath: strict unavailable on this platform')
    return abspath(path)

When we have a C implementation for posixpath.abspath(), all other functions will need a fallback anyway. The one for ntpath.abspath() must be implemented separately to handle drive relative paths correctly, in case of an error. It would be nice if these fallback functions (which are only used on the other platform) also followed the correct semantics.

IMHO There are no "correct" semantics when calling realpath on a foreign path. I'd say "don't do that than" is correct answer here, on Windows posixpath.abspath('d:foo') will give a nonsensical answer, as will posixpath.abpath(r'c:\foo').

Note that teaching posixpath about drive letters when used on Windows breaks the API contract because POSIX file system paths don't have drive letters.

from cpython.

nineteendo avatar nineteendo commented on June 12, 2024

Other's don't work, and often cannot work in an expected way due to the difference in path separators on Windows and POSIX.

That still doesn't explain why we're calling these functions with relative paths in the tests. (absolute paths are fine, though)

IMHO There are no "correct" semantics when calling realpath on a foreign path.

I agree, but that's NOT what I want to change here. I want posixpath to work on posix style paths and ntpath to work on Windows style paths:

posixpath.abspath('foo/bar')  -> '/foo/bar'
posixpath.realpath('foo/bar') -> '/foo/bar'
ntpath.abspath(r'foo\bar')    -> 'C:\\foo\\bar'
ntpath.abspath(r'\foo\bar')   -> 'C:\\foo\\bar'
ntpath.abspath(r'D:foo\bar')  -> 'D:\\foo\\bar'

from cpython.

nineteendo avatar nineteendo commented on June 12, 2024

I copied the fix for drive relative paths to the other pull request. If this gets rejected, we don't lose it.

from cpython.

barneygale avatar barneygale commented on June 12, 2024

In particular APIs that end up interacting with the filesystem, like abspath and realpath cannot do the right thing when running on a "foreign" system. It is possible to define semantics, but I'm not convinced that's worth the extra code.

This is exactly right.

Perhaps we should work towards making abspath, realpath etc unavailable from "foreign" path modules, e.g. raise a warning when a user calls posixpath.realpath() from Windows.

from cpython.

nineteendo avatar nineteendo commented on June 12, 2024

That's too strict. ntpath.relpath() & posixpath.relpath() work correctly for absolute paths. We could only raise a warning for relative paths, though:

def abspath(path):
    """Return the absolute version of a path."""
    path = os.fspath(path)
    if not isabs(path):
        warn("passing relative paths is deprecated", DeprecationWarning, stacklevel=2)
        drive, _, path = splitroot(path)
        if isinstance(path, bytes):
            path = join(drive or b'C:', b'\\' + path)
        else:
            path = join(drive or 'C:', '\\' + path)
    return normpath(path)

from cpython.

ronaldoussoren avatar ronaldoussoren commented on June 12, 2024

In particular APIs that end up interacting with the filesystem, like abspath and realpath cannot do the right thing when running on a "foreign" system. It is possible to define semantics, but I'm not convinced that's worth the extra code.

This is exactly right.

Perhaps we should work towards making abspath, realpath etc unavailable from "foreign" path modules, e.g. raise a warning when a user calls posixpath.realpath() from Windows.

Maybe, although I'm not sure if adding warnings is worth the effort because the majority of uses of these APIs is through os.path. That said, I did a quick GitHub search on posixpath.abspath and that has more hits than I'd expect, and at least one of them is likely broken.

That's too strict. ntpath.relpath() & posixpath.relpath() work correctly for absolute paths. We could only raise a warning for relative paths, though:

relpath calls abspath and hence won't correctly when used on "foreign" paths. More importantly, I don't think we should suggest that these function do work correctly or try to define semantics for them (when using posixpath on Windows or ntpath on Unix)

from cpython.

nineteendo avatar nineteendo commented on June 12, 2024

relpath calls abspath and hence won't correctly when used on "foreign" paths.

It DOES work correctly for absolute paths ('/foo/bar' for posixpath.relpath() & C:\\foo\\bar for ntpath.relpath()).
When a path is absolute, it's returned unchanged:

cpython/Lib/posixpath.py

Lines 409 to 418 in f5b7e39

def abspath(path):
"""Return an absolute path."""
path = os.fspath(path)
if isinstance(path, bytes):
if not path.startswith(b'/'):
path = join(os.getcwdb(), path)
else:
if not path.startswith('/'):
path = join(os.getcwd(), path)
return normpath(path)

cpython/Lib/ntpath.py

Lines 604 to 618 in f5b7e39

def _abspath_fallback(path):
"""Return the absolute version of a path as a fallback function in case
`nt._getfullpathname` is not available or raises OSError. See bpo-31047 for
more.
"""
path = os.fspath(path)
if not isabs(path):
if isinstance(path, bytes):
cwd = os.getcwdb()
else:
cwd = os.getcwd()
path = join(cwd, path)
return normpath(path)

ntpath.relpath() isn't that useful, but posixpath.relpath() can be used for urls.
People that are passing absolute paths are already doing what they need to prevent garbage output.

from cpython.

nineteendo avatar nineteendo commented on June 12, 2024

More importantly, I don't think we should suggest that these function do work correctly or try to define semantics for them (when using posixpath on Windows or ntpath on Unix)

Again, the fallback functions I want to change are ONLY used when on a foreign platform. It's a small effort to assume the cwd is / for posixpath & C:/ for ntpath, which doesn't lead to extra code.

That being said, if you don't want ANY extra code, we could not fix posixpath.realpath() and not add a fallback function.

from cpython.

barneygale avatar barneygale commented on June 12, 2024

ntpath.relpath() isn't that useful, but posixpath.relpath() can be used for urls. People that are passing absolute paths are already doing what they need to prevent garbage output.

Relative URLs work differently to relative paths, so users doing this are asking for trouble.

It's a small effort to assume the cwd is / for posixpath & C:/ for ntpath, which doesn't lead to extra code.

This is not going to happen.

from cpython.

nineteendo avatar nineteendo commented on June 12, 2024

Sighs, that one time I waited for some sign of approval before making a pull request.

from cpython.

nineteendo avatar nineteendo commented on June 12, 2024

Hmm, this might not even have been possible when not using the c implementation of Python...

from cpython.

novaTopFlex avatar novaTopFlex commented on June 12, 2024

On my system (Ubuntu 22.04 LTS, Python 3.10.12 [GCC 11.4.0]) on linux), I am a POSIX user, not an NT user, but I still have a similar situation. In my case, I have imported both posixpath and ntpath, and when I have attempted to utilize the function ntpath.realpath() drive letters are not automatically printed. I believe that my situation is the proper and expected situation as the drive letter is unknown (would the file be on the C: drive? D: drive?). But I do agree that the POSIX path conversion on Microsoft Windows is unacceptable, as the POSIX formula seemed to remain on your interpretation. Also, the os.path.abspath() function is not intended for path conversion between POSIX and NT systems.

from cpython.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.