Comments (28)
> the fact that we haven't been able to reproduce it locally (by running just this test) may be indicative of test pollution, which could be checked by running only this test in CI too.

I just confirmed this over in #4786:
- running just `pytest yt/visualization/volume_rendering/tests/test_scene.py::test_annotations` passed in CI
- expanding to run all the visualization tests, `pytest yt/visualization/`, passed in CI
---
I gotta run now, thank you for your help so far!!
---
Ok, we have an answer! And it's not pyarrow related after all.
It turns out that when you import firefly, it modifies the default matplotlib `figure.dpi` value:

```python
import matplotlib as mpl

rcparams0 = mpl.rcParams.copy()
import firefly.data_reader

rcparams1 = mpl.rcParams.copy()
print(rcparams0['figure.dpi'])
print(rcparams1['figure.dpi'])
```

prints

```
100
120
```

In yt, `Scene._show_mpl()` has a `dpi=100` argument that is used to set the dimensions of the figure, but it is not explicitly passed to the matplotlib `Figure` on instantiation. So after firefly gets imported, the default dpi changes and the yt figure changes size when saved. The fix on the yt side is to just pass the `dpi` argument along to `Figure` as well (which maybe we should have been doing all along). I'm not sure why this was only exposed now...
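The fix described above can be sketched roughly like this (a simplified stand-in for `Scene._show_mpl`, not yt's actual code): pass `dpi` to `Figure` explicitly, so a polluted `rcParams` default cannot change the rendered size.

```python
import matplotlib
from matplotlib.figure import Figure

def make_fixed_size_figure(shape=(512, 512), dpi=100):
    # Passing dpi explicitly to Figure overrides rcParams["figure.dpi"],
    # so an import that mutates the global default can't resize our output.
    return Figure(figsize=(shape[0] / dpi, shape[1] / dpi), dpi=dpi)

matplotlib.rcParams["figure.dpi"] = 120  # simulate firefly's import side effect
fig = make_fixed_size_figure()
width_px = fig.get_size_inches()[0] * fig.dpi
print(width_px)  # -> 512.0, regardless of the polluted default
```

Without the explicit `dpi=dpi`, the figure would pick up the mutated default of 120 and render at 614 pixels, which is exactly the size mismatch seen in the failing assertion.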
---
Notes:
- the test still passes in minimal env tests
- the fact that we haven't been able to reproduce it locally (by running just this test) may be indicative of test pollution, which could be checked by running only this test in CI too.
---
was trying to re-run the failed test with `pytest --lf` as well to see if it'd fail on the second go, but I couldn't get it working in the github action...
might also be worth doing a run that randomizes test order?
---
> might also be worth doing a run that randomizes test order?

Ideally yes, though I suspect that would be very unstable if we did it now: pollution in our tests is possibly high, and it's likely that some of them are only passing because of other tests' side effects (I've put a lid on this for a while...).
---
ok, well I've narrowed down the subset of tests you need to run to get the failure: just the tests in `yt/data_objects/tests` plus the failing test:

```
pytest --color=yes yt/data_objects/tests yt/visualization/volume_rendering/tests/test_scene.py::test_annotations
```

no indication of why yet. Also, when re-running the failed test by itself, it does pass on the second run.
---
very close to an answer! I used the https://github.com/asottile/detect-test-pollution pytest plugin to find that `yt/data_objects/tests/test_firefly.py::test_firefly_JSON_string` is the source of the issue. Running just these two tests reproduces the failure:

```
pytest --color=yes \
    yt/data_objects/tests/test_firefly.py::test_firefly_JSON_string \
    yt/visualization/volume_rendering/tests/test_scene.py
```

when reversing the order, the tests pass no problem. Looking at `test_firefly_JSON_string`, there's nothing obvious... but those firefly reader objects seem to do a lot with writing to disk, symlinks, and default data directories, so maybe there's something there?
---
Great job! I may be able to give a second read to those tests as early as tomorrow. Meanwhile: friendly ping @agurvich
---
OK, I think I'm done for the day, but one extra note: I still get the failure when I change `test_firefly_JSON_string` to not actually call `writeToDisk`, so it's something in the initial creation of the firefly object in `ad.create_firefly_object` (could be a yt thing, a firefly thing, a shared library between the two, or something else very strange? really still no good idea...)
---
> but those firefly reader objects seem to do a lot with writing to disk and symlinks and default data directories so maybe there's something there?

don't talk about my trash like that! hahaha. I don't have a working `yt` development environment set up, but it sounds like I might need to sniff something out?
The situation with the symlinks is that the firefly data needs to be written out to disk in order to be read in by the browser (unless it's passed through a websocket using flask). However, because of browser security features, websites (or webapps) can't read files just anywhere on disk; they have to be in subdirectories of the app itself. So when you're hosting this app locally on your computer, the data has to be visible from a path that's a subdirectory of the app files (which, in the case of being installed via pip, is hidden in your `site-packages` directory). My solution at the time was to let users specify where they wanted to write their data and then symlink from there to Firefly's data directory. So what `yt` should be doing here is creating a Firefly object, passing some array data to it, using the Firefly object's built-in `writeToDisk` to write the data to a named temporary file, and then making a symlink from there to Firefly's data directory. I don't totally remember how I set up the unit test, and I'm not sure what is actually failing here (is there a failing test or is there an incomprehensible error?). Where can I see the CI output (without building and running locally myself)? Or will I have to build `yt`?
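To illustrate the scheme described above (hypothetical file names and paths, not Firefly's actual layout or API): the data is written wherever the user likes, then symlinked into the app's own data directory so the browser is allowed to fetch it.

```python
import json
import os
import tempfile

# user-chosen output location: can be anywhere on disk
user_dir = tempfile.mkdtemp(prefix="user_data_")
data_file = os.path.join(user_dir, "particles.json")
with open(data_file, "w") as f:
    json.dump({"x": [0.0, 1.0, 2.0]}, f)

# the app's data directory (in reality a subdirectory of the installed
# package, e.g. somewhere under site-packages)
app_data_dir = tempfile.mkdtemp(prefix="app_data_")
link = os.path.join(app_data_dir, "particles.json")
os.symlink(data_file, link)

# the webapp reads the data through the symlink
with open(link) as f:
    loaded = json.load(f)
```

(POSIX only; `os.symlink` needs extra privileges on Windows.)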
---
> don't talk about my trash like that! hahaha.

haha, many apologies :D I don't actually think that the symlinks are the culprit, I was just grasping at straws because I'm rather baffled by this failure.
Here's a quick summary of the issue:
- a test unrelated to firefly started failing recently (`yt/visualization/volume_rendering/tests/test_scene.py::test_annotations`). Neither @neutrinoceros nor I have been able to reproduce this issue locally; it only fails in our CI pipeline on github with python 3.12.
- we figured out it's a test pollution issue: re-running the failed test by itself passes just fine
- I bisected down to the point where I can re-create the failure with just two tests: a yt test that creates a firefly object (`yt/data_objects/tests/test_firefly.py::test_firefly_JSON_string`) and our `test_annotations` test. It only fails when the firefly-related test is run before the `test_annotations` test; reversing the order passes. So some global state somewhere is somehow being changed? maybe??

here's an example of the failing test that reads back in an image that was written to disk.
---
```
E   DeprecationWarning:
E   Pyarrow will become a required dependency of pandas in the next major release of pandas (pandas 3.0),
E   (to allow more performant data types, such as the Arrow string type, and better interoperability with other libraries)
E   but was not found to be installed on your system.
E   If this would cause problems for you,
E   please provide us feedback at https://github.com/pandas-dev/pandas/issues/54466

../../../miniconda3/envs/oss/lib/python3.12/site-packages/pandas/__init__.py:221: DeprecationWarning
==================================== short test summary info ====================================
FAILED yt/data_objects/tests/test_firefly.py::test_firefly_JSON_string - DeprecationWarning:
```
So here is what I get when I build locally, which seems consistent with ageller/Firefly#177.
The solution there was to remove the dependency on `pandas`. This was the objectively correct thing to do; using pandas to write a JSON string was silly, but it happened because we were already using pandas elsewhere and knew it would work. It looks like what's going on is that `pandas` is being imported in a different file (in this case, `firefly/data_reader/reader.py`, where the "Firefly Object" aka `Reader` class instance is defined). There, we use it to read a csv. We could swap to `numpy` and use `genfromtxt` or something, but I like how `pandas` is able to load tabular data in a pretty generic/bulletproof way (at least, it doesn't die, it just returns nonsense, which depending on your philosophy is better/worse).
My hypothesis is that, depending on the order/whether the test has been run, `Pyarrow` is being loaded into the environment after this test is run, and perhaps it's not happening locally for either of you because you have `Pyarrow` installed(?). I'll be honest, I don't totally understand the point of this deprecation warning. The current version of `pandas` doesn't require `Pyarrow`. If I ever want to upgrade to the next version, whatever software distribution system I'm using (conda or pip) will install `Pyarrow` for me.
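For what it's worth, the `numpy` swap mentioned above could look something like this for a simple numeric table (toy data, not the firefly reader's actual format):

```python
import io

import numpy as np

csv = io.StringIO("x,y,z\n1.0,2.0,3.0\n4.0,5.0,6.0\n")

# genfromtxt reads the header into field names and tolerates missing
# values, and it avoids importing pandas (and the pyarrow warning) entirely
table = np.genfromtxt(csv, delimiter=",", names=True)
print(table["x"])  # -> [1. 4.]
```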
---
@chrishavlin I was so close to finishing my message before you posted, haha! What do you think of my hypothesis? What is the best way to proceed?
---
Ah... looking at the error you posted above, I would be very surprised if the `Pyarrow` thing were the actual culprit:

```
E       assert (614, 614, 4) == (512, 512, 4)
E         At index 0 diff: 614 != 512
E         Full diff:
E         - (512, 512, 4)
E         + (614, 614, 4)

/home/runner/work/yt/yt/yt/visualization/volume_rendering/tests/test_scene.py:104: AssertionError
=========================== short test summary info ============================
FAILED yt/visualization/volume_rendering/tests/test_scene.py::test_annotations - assert (614, 614, 4) == (512, 512, 4)
  At index 0 diff: 614 != 512
  Full diff:
  - (512, 512, 4)
  + (614, 614, 4)
========================= 1 failed, 2 passed in 6.63s ==========================
Error: Process completed with exit code 1.
```

How could something like this fail because of whether or not something is imported into the environment...? Maybe the `Pyarrow` thing is just a red herring and a totally separate failure in my local environment.
---
and it doesn't look like Pyarrow is being installed in the test run either (but I liked the idea :) )
---
Not sure if it's progress, but I'm getting the same error locally now that I've installed `Pyarrow`...
however, running a second time does not make the test pass
---
ooOOoo, I'd say it is progress. maybe pyarrow is being installed and I didn't search the logs properly
---
and I just reproduced it as well after installing PyArrow!
---
3rd time was not the charm, still getting the above assertion error in `test_annotations` when I do

```
pytest --color=yes \
    yt/data_objects/tests/test_firefly.py::test_firefly_JSON_string \
    yt/visualization/volume_rendering/tests/test_scene.py
```

seemingly no matter how many times I do it (which seems unlike the situation in the CI environment?).
---
Hey!!! I'll call that progress!!!!
---
Well, it passes if you change the order so that `test_scene` runs first:

```
pytest --color=yes \
    yt/visualization/volume_rendering/tests/test_scene.py \
    yt/data_objects/tests/test_firefly.py::test_firefly_JSON_string
```
---
Ahhh I see, yes, can confirm, that passed for me.
---
NP. At this point I'm gonna leave it to you and @neutrinoceros now that you can get your hands dirty locally, but let me know if you want me in the trenches as well.
---
Thank you both for working on this. There seems to be some confusion around `pyarrow`, so let me try to clarify the situation:
- In yt's CI, we don't install pyarrow yet, but we promote warnings to errors
- `pandas` 2.2.0 started emitting a `DeprecationWarning` to warn that `pyarrow` was to become a dependency of `pandas` 3.0
- there are three ways to get around this warning:
  - tell pytest to ignore it (this is what I did in #4784)
  - install pyarrow (pandas assumes that you're fine with having to install pyarrow in the future if you already have it)
  - downgrade pandas
- this actually has nothing to do with ageller/Firefly#177: what happened there was that Firefly was using private pandas API which was moved in pandas 2.2. I fixed it by using public API instead. It so happens that said public API lives in the standard lib, but it wouldn't have been a problem to keep using pandas if it provided such public API.

So, I don't think pyarrow is necessary to reproduce locally; it just helps dodge pandas' deprecation warning, but I think downgrading pandas would have the same effect. To help reduce confusion I'd suggest merging #4784 😇
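For reference, the "tell pytest to ignore it" option is usually a `filterwarnings` entry in the pytest config; something along these lines (the message pattern here is illustrative, not necessarily what #4784 uses):

```ini
# pytest.ini (or the equivalent [tool.pytest.ini_options] in pyproject.toml)
[pytest]
filterwarnings =
    error
    ignore:(?s).*Pyarrow will become a required dependency of pandas.*:DeprecationWarning
```

The `error` line is what promotes warnings to failures in CI; the `ignore` line after it carves out an exception for this specific pandas warning.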
---
and @agurvich thanks for your help yesterday!
---
I think I found where this change happens:
https://github.com/agurvich/abg_python/blob/408f0daffb2f84130b4df096b72b77a5affdd733/src/abg_python/plot_utils.py#L26
abg_python is a dependency of Firefly, and it leaks Matplotlib configuration on import.
The reason we're only finding out now is that the culprit commit is from Nov 2022, but there was no release of `abg_python` between April 2022 and 3 days ago, when I came knocking on Alex's door (agurvich/abg_python#4).
so, at the risk of being called names: ping @agurvich 🙈
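One general way to avoid this class of leak upstream (a sketch of the pattern, not necessarily how abg_python will fix it) is to scope styling with `matplotlib.rc_context` instead of assigning to `rcParams` at module import time:

```python
import matplotlib as mpl

default_dpi = mpl.rcParams["figure.dpi"]

# rc settings applied inside the context manager revert when the block
# exits, so importing a plotting helper can't change defaults for everyone
with mpl.rc_context({"figure.dpi": 120}):
    assert mpl.rcParams["figure.dpi"] == 120
    # ... build the styled figures here ...

print(mpl.rcParams["figure.dpi"] == default_dpi)  # -> True
```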
---
moving the discussion upstream as there's nothing left to be done on our side: agurvich/abg_python#5
---