plotly-dash-rapids-census-demo's People

Contributors

ajaythorve, ajschmidt8, aleksficek, exactlyallan, jjacobelli, mike-wendt, nishantjadhav369

plotly-dash-rapids-census-demo's Issues

Bug in live data update logic

Looks like there is a problem with the live data update for a running dashboard.

TypeError: load_covid() missing 1 required positional argument: 'acs_path'
Exception on /_dash-update-component [POST]
Traceback (most recent call last):
  File "/home/jon_mease_dev/miniconda3/envs/plotly-dash-covid/lib/python3.7/site-packages/flask/app.py", line 2447, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/jon_mease_dev/miniconda3/envs/plotly-dash-covid/lib/python3.7/site-packages/flask/app.py", line 1952, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/jon_mease_dev/miniconda3/envs/plotly-dash-covid/lib/python3.7/site-packages/flask/app.py", line 1821, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/home/jon_mease_dev/miniconda3/envs/plotly-dash-covid/lib/python3.7/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/home/jon_mease_dev/miniconda3/envs/plotly-dash-covid/lib/python3.7/site-packages/flask/app.py", line 1950, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/jon_mease_dev/miniconda3/envs/plotly-dash-covid/lib/python3.7/site-packages/flask/app.py", line 1936, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/jon_mease_dev/miniconda3/envs/plotly-dash-covid/lib/python3.7/site-packages/dash/dash.py", line 1459, in dispatch
    response.set_data(self.callback_map[output]["callback"](*args))
  File "/home/jon_mease_dev/miniconda3/envs/plotly-dash-covid/lib/python3.7/site-packages/dash/dash.py", line 1339, in add_context
    output_value = func(*args, **kwargs)  # %% callback invoked %%
  File "/home/jon_mease_dev/repos/plotly-dash-rapids-census-demo/plotly_demo/app-covid.py", line 1136, in update_plots
    figures = figures_d.compute()
  File "/home/jon_mease_dev/miniconda3/envs/plotly-dash-covid/lib/python3.7/site-packages/dask/base.py", line 166, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/home/jon_mease_dev/miniconda3/envs/plotly-dash-covid/lib/python3.7/site-packages/dask/base.py", line 437, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/home/jon_mease_dev/miniconda3/envs/plotly-dash-covid/lib/python3.7/site-packages/distributed/client.py", line 2595, in get
    results = self.gather(packed, asynchronous=asynchronous, direct=direct)
  File "/home/jon_mease_dev/miniconda3/envs/plotly-dash-covid/lib/python3.7/site-packages/distributed/client.py", line 1894, in gather
    asynchronous=asynchronous,
  File "/home/jon_mease_dev/miniconda3/envs/plotly-dash-covid/lib/python3.7/site-packages/distributed/client.py", line 778, in sync
    self.loop, func, *args, callback_timeout=callback_timeout, **kwargs
  File "/home/jon_mease_dev/miniconda3/envs/plotly-dash-covid/lib/python3.7/site-packages/distributed/utils.py", line 348, in sync
    raise exc.with_traceback(tb)
  File "/home/jon_mease_dev/miniconda3/envs/plotly-dash-covid/lib/python3.7/site-packages/distributed/utils.py", line 332, in f
    result[0] = yield future
  File "/home/jon_mease_dev/miniconda3/envs/plotly-dash-covid/lib/python3.7/site-packages/tornado/gen.py", line 735, in run
    value = future.result()
  File "/home/jon_mease_dev/miniconda3/envs/plotly-dash-covid/lib/python3.7/site-packages/distributed/client.py", line 1753, in _gather
    raise exception.with_traceback(traceback)
TypeError: load_covid() missing 1 required positional argument: 'acs_path'

I think this line is missing an argument:

pd_covid = delayed(load_covid)(covid_data_path).persist()
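
To make the failure concrete, here is a minimal sketch in plain Python, without Dask. The body of `load_covid`, the `covid_path` parameter name, the `acs_data_path` variable, and both path values are placeholders assumed for illustration; only the function name, `covid_data_path`, and the `acs_path` parameter come from the traceback and the line above.

```python
def load_covid(covid_path, acs_path):
    # Hypothetical stand-in for the app's loader: it only echoes its inputs.
    return {"covid_path": covid_path, "acs_path": acs_path}


covid_data_path = "daily_reports/%s.csv"               # placeholder value
acs_data_path = "acs2018_county_population.parquet"    # hypothetical variable name

# Calling with one argument reproduces the reported TypeError.
try:
    load_covid(covid_data_path)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Passing the second path as well satisfies the signature.
result = load_covid(covid_data_path, acs_data_path)

# In the app, the equivalent change would presumably be:
#   pd_covid = delayed(load_covid)(covid_data_path, acs_data_path).persist()
```

Because the call is wrapped in `delayed`, the missing argument only surfaces when the task runs on a worker, which would explain why the error appears inside `update_plots` rather than at startup.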

CPU version

It would be good to have a CPU version (it may already be in here) that a user could try on a subsample of the data before moving to the GPU one. Ideally it would demonstrate the acceleration and also give a good idea of the code changes needed between the CPU and GPU versions.

Census Viz UI Updates

List of tasks to clean up original census visualization:

Needs:

  • Remove Reset GPU button
  • Enable map box select; zoom no longer filters
  • Use linear scale for age histogram
  • Make Education / Income chart taller
  • Remove Plasma and Inferno colors
  • Move Clear all selections to "options card" upper right corner
  • Remove Gender filter ( see below )
  • Combine options into a single drop-down, e.g. Color By: Total Count with Magma Color Scale, Gender Count with Blues Color Scale, etc.
  • Remove Class of Workers ( see proposed )
  • Update README and install instructions

Proposed (Comment for Feasibility):

  • Expand Class of Worker to full list, add as a full width vertical bar chart with names overlaid on bars, make height similar to small age histogram ( place below education / income chart )
  • Convert Education / Income chart into two full width bar charts ( similar to above )
  • Show a warning before allowing the CPU toggle with more than 50 million points
  • Fail safe / toggle for Open Street Map Dark Matter theme if no mapbox token found
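
For the map theme fail-safe, a minimal sketch of the selection logic might look like the following. It assumes Plotly's built-in `carto-darkmatter` style (which does not need a token) as the Dark Matter fallback and the token-based `dark` style otherwise. The helper name `choose_map_style` and the `MAPBOX_TOKEN` environment variable are hypothetical, not taken from the app.

```python
import os


def choose_map_style(token):
    """Return mapbox layout settings depending on whether a token is available."""
    if token and token.strip():
        # A token is present: use the Mapbox-hosted dark style.
        return {"style": "dark", "accesstoken": token.strip()}
    # No usable token: fall back to a style that needs none.
    return {"style": "carto-darkmatter"}


# Hypothetical environment variable name; the app may read the token elsewhere.
mapbox_layout = choose_map_style(os.environ.get("MAPBOX_TOKEN"))
```

The returned dictionary could then be merged into the figure's mapbox layout settings, so the rest of the plotting code does not need to know which case applies.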

Updated Titles:

  • Census Data = Census 2010 Visualization
  • Selected Population = Population Count
  • Configuration = Options
  • US Population = Population Distribution of Individuals
  • Education - Income Distribution = Education Levels by Income Distribution
  • Age = Age Distribution
  • Acknowledgements = Acknowledgements and Data Sources

Updated Acknowledgements:

  • 2010 Population Census and 2018 ACS data used with permission from IPUMS NHGIS, University of Minnesota, www.nhgis.org ( not for redistribution )
  • Base map layer provided by mapbox
  • Dashboard developed with Plot.ly Dash
  • Geospatial point rendering developed with Datashader
  • GPU accelerated with RAPIDS cudf and cupy libraries. CPU using pandas libraries.
  • For source code visit our GitHub

A typo in conda install path

# setup directory
cd plotly_demo

# setup conda environment 
conda env create --name plotly_env --file environment.yml

Should be

# setup directory
cd plotly_demo

# setup conda environment 
conda env create --name plotly_env --file ../environment.yml

AttributeError: 'DatetimeColumn' object has no attribute 'str' running the covid-19 app

Having just followed the directions in the README on the covid-19 branch, after starting the app it's possible to browse to the server and see a page, but there's no map, and in the console I see various exceptions, of the form:


[2020-06-12 15:21:13,791] ERROR in app: Exception on /_dash-update-component [POST]
Traceback (most recent call last):
  File "/home/gmarkall/miniconda3/envs/plotly_env/lib/python3.7/site-packages/flask/app.py", line 2447, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/gmarkall/miniconda3/envs/plotly_env/lib/python3.7/site-packages/flask/app.py", line 1952, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/gmarkall/miniconda3/envs/plotly_env/lib/python3.7/site-packages/flask/app.py", line 1821, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/home/gmarkall/miniconda3/envs/plotly_env/lib/python3.7/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/home/gmarkall/miniconda3/envs/plotly_env/lib/python3.7/site-packages/flask/app.py", line 1950, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/gmarkall/miniconda3/envs/plotly_env/lib/python3.7/site-packages/flask/app.py", line 1936, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/gmarkall/miniconda3/envs/plotly_env/lib/python3.7/site-packages/dash/dash.py", line 1459, in dispatch
    response.set_data(self.callback_map[output]["callback"](*args))
  File "/home/gmarkall/miniconda3/envs/plotly_env/lib/python3.7/site-packages/dash/dash.py", line 1339, in add_context
    output_value = func(*args, **kwargs)  # %% callback invoked %%
  File "app-covid.py", line 1188, in update_plots
    covid_data_loaded_latest_date = delayed(update_latest_update_covid)(client.get_dataset('pd_covid')).compute()
  File "/home/gmarkall/miniconda3/envs/plotly_env/lib/python3.7/site-packages/dask/base.py", line 166, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/home/gmarkall/miniconda3/envs/plotly_env/lib/python3.7/site-packages/dask/base.py", line 444, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/home/gmarkall/miniconda3/envs/plotly_env/lib/python3.7/site-packages/distributed/client.py", line 2674, in get
    results = self.gather(packed, asynchronous=asynchronous, direct=direct)
  File "/home/gmarkall/miniconda3/envs/plotly_env/lib/python3.7/site-packages/distributed/client.py", line 1974, in gather
    asynchronous=asynchronous,
  File "/home/gmarkall/miniconda3/envs/plotly_env/lib/python3.7/site-packages/distributed/client.py", line 824, in sync
    self.loop, func, *args, callback_timeout=callback_timeout, **kwargs
  File "/home/gmarkall/miniconda3/envs/plotly_env/lib/python3.7/site-packages/distributed/utils.py", line 339, in sync
    raise exc.with_traceback(tb)
  File "/home/gmarkall/miniconda3/envs/plotly_env/lib/python3.7/site-packages/distributed/utils.py", line 323, in f
    result[0] = yield future
  File "/home/gmarkall/miniconda3/envs/plotly_env/lib/python3.7/site-packages/tornado/gen.py", line 735, in run
    value = future.result()
  File "/home/gmarkall/miniconda3/envs/plotly_env/lib/python3.7/site-packages/distributed/client.py", line 1833, in _gather
    raise exception.with_traceback(traceback)
  File "app-covid.py", line 212, in load_covid
    df_county.Last_Update = pd.to_datetime(df_county.Last_Update.str.split(' ')[
  File "/home/gmarkall/miniconda3/envs/plotly_env/lib/python3.7/site-packages/cudf/core/series.py", line 1504, in str
    return self._column.str(parent=self)
AttributeError: 'DatetimeColumn' object has no attribute 'str'

I stopped the server with Ctrl+C and started it again, to get a slightly different message:

$ python app-covid.py 
Found dataset at ../data/census_data_minimized.parquet
Dask status: http://127.0.0.1:8787/status
loading latest covid dataset...
Running on http://0.0.0.0:8050/
Debugger PIN: 955-443-814
 * Serving Flask app "app-covid" (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
loading cached latest covid dataset...
/home/gmarkall/miniconda3/envs/plotly_env/lib/python3.7/site-packages/cudf/core/column/string.py:1738: UserWarning:

`expand` parameter defatults to True.

distributed.worker - WARNING -  Compute Failed
Function:  load_covid
args:      ('https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_daily_reports/%s.csv', '../data/acs2018_county_population.parquet')
kwargs:    {}
Exception: AttributeError("'DatetimeColumn' object has no attribute 'str'")

I'm not sure if there's something wrong with my setup or I've mis-followed some of the directions - is there anything I can do to try and narrow down what the problem is?
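
The traceback does show that `Last_Update` is already a `DatetimeColumn` when `load_covid` reaches `.str.split`, so the string accessor is not available on it. Why the column arrives already parsed is not clear from the log. As a rough illustration in plain Python (not the app's cuDF code), a loader could accept both forms. `normalize_last_update` is a hypothetical helper, and the timestamp layout is an assumption.

```python
from datetime import date, datetime


def normalize_last_update(value):
    """Return the calendar date for a value that may be text or already a datetime."""
    if isinstance(value, datetime):
        # Already parsed: no string handling needed.
        return value.date()
    if isinstance(value, date):
        return value
    # Assumed text layout "YYYY-MM-DD HH:MM:SS": keep the part before the first space.
    return date.fromisoformat(str(value).split(" ")[0])


from_text = normalize_last_update("2020-06-12 04:33:10")
from_datetime = normalize_last_update(datetime(2020, 6, 12, 4, 33, 10))
```

In the dataframe version, the same idea would be to check the column's dtype before applying string operations.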

Dask version doesn't work on a DGX system with 8*Tesla V100s

When running dask_app.py on a DGX system, I get the error below. The Dask app seems to work fine on systems with 2 GPUs (tested on 3 different machines, with Titan RTX and V100 GPUs).

The non-Dask version, app.py, seems to run fine on all systems.

$ python dask_app.py --cuda_visible_devices=0,1

Found dataset at ../data/total_population_dataset.parquet
2022-12-09 12:18:52,985 - distributed.preloading - INFO - Creating preload: dask_cuda.initialize
2022-12-09 12:18:52,985 - distributed.preloading - INFO - Import preload module: dask_cuda.initialize
2022-12-09 12:18:53,037 - distributed.preloading - INFO - Creating preload: dask_cuda.initialize
2022-12-09 12:18:53,037 - distributed.preloading - INFO - Import preload module: dask_cuda.initialize
Dask status: http://127.0.0.1:8787/status
Dash is running on http://0.0.0.0:8050/

 * Serving Flask app 'dask_app'
 * Debug mode: off

2022-12-09 12:19:14,429 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:42189 -> tcp://127.0.0.1:44145
Traceback (most recent call last):
  File "/home/nfs/***/yes/envs/plotly_env/lib/python3.9/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
    bytes_read = self.read_from_fd(buf)
  File "/home/nfs/***/yes/envs/plotly_env/lib/python3.9/site-packages/tornado/iostream.py", line 1140, in read_from_fd
    return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/nfs/***/yes/envs/plotly_env/lib/python3.9/site-packages/distributed/worker.py", line 1738, in get_data
    response = await comm.read(deserializers=serializers)
  File "/home/nfs/***/yes/envs/plotly_env/lib/python3.9/site-packages/distributed/comm/tcp.py", line 241, in read
    convert_stream_closed_error(self, e)
  File "/home/nfs/***/yes/envs/plotly_env/lib/python3.9/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed)  local=tcp://127.0.0.1:42189 remote=tcp://127.0.0.1:54032>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-12-09 12:19:14,435 - distributed.nanny - WARNING - Restarting worker

cc @exactlyallan
