
Comments (3)

phofl commented on July 23, 2024

Thanks for the report and for digging in. I'll put up a fix.


LucaMarconato commented on July 23, 2024

Computing the dataframe works:

>>> table.compute()
           x         y genes
0   0.254246  0.977669     a
1   0.776803  0.776138     a
2   0.945980  0.821877     a
3   0.344795  0.886264     a
4   0.825084  0.915373     a
5   0.254027  0.482018     a
6   0.418374  0.077625     b
7   0.279991  0.692238     b
8   0.960852  0.192070     b
9   0.750886  0.973036     b
10  0.558514  0.742854     b
11  0.898733  0.855712     b
12  0.078307  0.143652     b
13  0.859781  0.869062     b
14  0.110986  0.262581     b
15  0.445537  0.669543     b
16  0.933542  0.471514     b
17  0.320354  0.295965     b
18  0.307094  0.974755     b
19  0.824354  0.553312     b

Computing only the column 'genes', however, fails:

>>> table['genes'].compute()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/miniconda3/envs/ome/lib/python3.12/site-packages/dask_expr/_collection.py", line 475, in compute
    out = out.optimize(fuse=fuse)
          ^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/ome/lib/python3.12/site-packages/dask_expr/_collection.py", line 590, in optimize
    return new_collection(self.expr.optimize(fuse=fuse))
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/ome/lib/python3.12/site-packages/dask_expr/_expr.py", line 94, in optimize
    return optimize(self, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/ome/lib/python3.12/site-packages/dask_expr/_expr.py", line 3032, in optimize
    return optimize_until(expr, stage)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/ome/lib/python3.12/site-packages/dask_expr/_expr.py", line 2993, in optimize_until
    expr = expr.lower_completely()
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/ome/lib/python3.12/site-packages/dask_expr/_core.py", line 444, in lower_completely
    new = expr.lower_once(lowered)
          ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/ome/lib/python3.12/site-packages/dask_expr/_core.py", line 399, in lower_once
    out = expr._lower()
          ^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/ome/lib/python3.12/site-packages/dask_expr/_repartition.py", line 81, in _lower
    if self.new_partitions < self.frame.npartitions:
                             ^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/ome/lib/python3.12/site-packages/dask_expr/_expr.py", line 398, in npartitions
    return len(self.divisions) - 1
               ^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/ome/lib/python3.12/functools.py", line 995, in __get__
    val = self.func(instance)
          ^^^^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/ome/lib/python3.12/site-packages/dask_expr/_expr.py", line 382, in divisions
    return tuple(self._divisions())
                 ^^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/ome/lib/python3.12/site-packages/dask_expr/_expr.py", line 2071, in _divisions
    return super()._divisions()
           ^^^^^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/ome/lib/python3.12/site-packages/dask_expr/_expr.py", line 529, in _divisions
    if not self._broadcast_dep(arg):
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/ome/lib/python3.12/site-packages/dask_expr/_expr.py", line 520, in _broadcast_dep
    return dep.npartitions == 1 and dep.ndim < self.ndim
           ^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/ome/lib/python3.12/site-packages/dask_expr/_expr.py", line 398, in npartitions
    return len(self.divisions) - 1
               ^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/ome/lib/python3.12/functools.py", line 995, in __get__
    val = self.func(instance)
          ^^^^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/ome/lib/python3.12/site-packages/dask_expr/_expr.py", line 382, in divisions
    return tuple(self._divisions())
                 ^^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/ome/lib/python3.12/site-packages/dask_expr/_expr.py", line 3374, in _divisions
    if {df.npartitions for df in self.args} == {1}:
                                 ^^^^^^^^^
  File "/opt/miniconda3/envs/ome/lib/python3.12/functools.py", line 995, in __get__
    val = self.func(instance)
          ^^^^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/ome/lib/python3.12/site-packages/dask_expr/_expr.py", line 3392, in args
    return [op for op in dfs if not is_broadcastable(dfs, op)]
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/ome/lib/python3.12/site-packages/dask_expr/_expr.py", line 3050, in is_broadcastable
    and any(compare(s, df) for df in dfs if df.ndim == 2)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/ome/lib/python3.12/site-packages/dask_expr/_expr.py", line 3050, in <genexpr>
    and any(compare(s, df) for df in dfs if df.ndim == 2)
            ^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/ome/lib/python3.12/site-packages/dask_expr/_expr.py", line 3042, in compare
    return s.divisions == (min(df.columns), max(df.columns))
                           ^^^^^^^^^^^^^^^
ValueError: min() iterable argument is empty
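
The last frame of the traceback evaluates `min(df.columns)`, and the error message shows that `df.columns` was empty at that point. The mechanism can be reproduced in plain Python without Dask. This is only a minimal sketch: `column_bounds` and `safe_column_bounds` are hypothetical helpers written for illustration, and the guard shown is an assumption, not the fix that went into dask.

```python
def column_bounds(columns):
    """Build a (min, max) pair the same way the failing frame does."""
    return (min(columns), max(columns))


def safe_column_bounds(columns):
    """Hypothetical guard: return None instead of raising when there are no columns."""
    if len(columns) == 0:
        return None
    return (min(columns), max(columns))


# With at least one column, min()/max() work as expected.
print(column_bounds(["x", "y", "genes"]))

# With no columns, min() raises ValueError, as in the traceback above.
try:
    column_bounds([])
except ValueError as exc:
    print(f"ValueError: {exc}")

# The guarded variant returns None for the empty case.
print(safe_column_bounds([]))
```

The exact wording of the `ValueError` message depends on the Python version, so the sketch prints it rather than hard-coding it.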


LucaMarconato commented on July 23, 2024

Note that the following works. It is adding a column to an existing Dask DataFrame that seems to trigger the issue.

import dask.dataframe as dd
import numpy as np
import pandas as pd

# Generate random values for 'x' and a categorical series for 'genes'
x = np.random.rand(20)
genes = pd.Series(['a'] * 6 + ['b'] * 14, dtype='category')

# Create a Dask DataFrame with 'x' and 'genes' columns in a single call
table = dd.from_pandas(pd.DataFrame({'x': x, 'genes': genes}), npartitions=1)

# Both should work now
table.compute()
table['genes'].compute()
