PAM

Population Activity Modeller

PAM is a Python library for population activity sequence modelling. Example use cases:

  • Read an existing population then write to a new format.
  • Modify an existing population, for example to model activity locations.
  • Create your own activity-based model.

PAM supports common travel and activity formats, including MATSim.

Activity Sequences?

Population activity sequences (sometimes called activity plans) are used to model the activities (where and when people are at home, work, education and so on) and associated travel of a population.

Activity sequences are used by transport planners to model travel demand, but can also be used in other domains, such as for virus transmission or energy use modelling.

Brief History

PAM was originally built and shared to rapidly modify existing activity models to respond to pandemic lock-down scenarios.

This functionality uses a read-modify-write pattern, where modifications are made by applying policies. Example policies might be: (a) infected persons quarantine at home, (b) only critical workers travel to work, and (c) everyone shops locally.

Features

Activity Modelling

In addition to the original read-modify-write pattern and functionality, PAM has modules for:

  • location modelling
  • discretionary activity modelling
  • mode choice modelling
  • facility sampling
  • vehicle ownership

More generally, the core PAM data structure and modules can be used as a library to support your own use cases, including building your own activity-based model.

MATSim

PAM fully supports the MATSim population/plans format. This includes vehicles, unselected plans, leg routes and leg attributes. A core use case of PAM is to read-modify-write experienced plans from MATSim. This can allow new MATSim scenarios to be "warm started" from existing scenarios, significantly reducing MATSim compute time.

Documentation

For more detailed instructions, see our documentation.

Installation

To install PAM, we recommend using the mamba package manager:

As a user

mamba create -n pam -c conda-forge -c city-modelling-lab cml-pam
mamba activate pam

As a developer

git clone [email protected]:arup-group/pam.git
cd pam
mamba create -n pam -c conda-forge -c city-modelling-lab --file requirements/base.txt --file requirements/dev.txt
mamba activate pam
pip install --no-deps -e .

Installing with pip

Installing directly with pip as a user (pip install cml-pam) or as a developer (pip install -e '.[dev]') is also possible, but you will need the libgdal and libspatialindex non-Python geospatial libraries pre-installed.

For more detailed instructions, see our documentation.

Contributing

There are many ways to make both technical and non-technical contributions to PAM. Before making contributions to the PAM source code, see our contribution guidelines and follow the development install instructions.

If you are using pip to install PAM instead of the recommended mamba, you can install the optional test and documentation libraries using the dev option, i.e., pip install -e '.[dev]'

If you plan to make changes to the code then please make regular use of the following tools to verify the codebase while you work:

  • pre-commit: run pre-commit install in your command line to load inbuilt checks that will run every time you commit your changes. The checks are: 1. check that no large files have been staged, 2. lint Python files for major errors, 3. format Python files to conform with the PEP 8 standard. You can also run these checks yourself at any time to ensure staged changes are clean by simply calling pre-commit.
  • pytest - run the unit test suite, check test coverage, and test that the example notebooks successfully run.
  • pytest -p memray -m "high_mem" --no-cov (not available on Windows) - after installing memray (mamba install memray pytest-memray), test that memory and time performance does not exceed benchmarks.

For more information, see our documentation.

Building the documentation

If you are unable to access the online documentation, you can build the documentation locally. First, install a development environment of PAM, then deploy the documentation using mike:

mike deploy 0.2
mike serve

Then you can view the documentation in a browser at http://localhost:8000/.

Credits

This package was created with Cookiecutter and the arup-group/cookiecutter-pypackage project template.

pam's People

Contributors

alex-kaye, ana-kop, andkay, arup-sb, brynpickering, chicken-teriyaki-cup-rice, dependabot[bot], divyasharma-arup, elladahan, fredshone, iseulsong, josepaznoguera, kasiakoz, markruddy, markusstraub, mfitz, pre-commit-ci[bot], rorysedgwick, sarah-e-hayes, syhwawa, theodore-chatziioannou, val-ismaili, yannisza


pam's Issues

Stream plans to matsim xml

It would be useful to stream MATSim plans to XML (we already have a read stream).

This would be great for reducing the memory requirement of simpler operations, especially via the CLI.

pytest fails when using python 3.11 (and updated dependencies)

I tried to use pam with python 3.11 - generating a population (mostly with the facility sampler) worked fine and was roughly 20% faster than using python 3.7 (wohoo!)

-> See https://github.com/markusstraub/pam/tree/py311, especially the requirements.txt with all current dependencies

However I noticed that a few unit tests fail. This is (I guess) due to the updated (geo)pandas-dependencies:

FAILED tests/test_00_utils.py::test_build_geodataframe_for_pt_person - AssertionError: Attributes of DataFrame.iloc[:, 11] (column name="seq") are different
FAILED tests/test_00_utils.py::test_build_geodataframe_for_cyclist - AssertionError: Attributes of DataFrame.iloc[:, 11] (column name="seq") are different
FAILED tests/test_00_utils.py::test_build_hhld_geodataframe - AssertionError: Attributes of DataFrame.iloc[:, 11] (column name="seq") are different
FAILED tests/test_00_utils.py::test_build_pop_geodataframe - AssertionError: Attributes of DataFrame.iloc[:, 11] (column name="seq") are different
FAILED tests/test_09_samplers.py::test_random_sample_point_from_multilinestring_random_seed - ValueError: Total of weights must be greater than zero
FAILED tests/test_09_samplers.py::test_random_sample_point_from_multipoint_random_seed - ValueError: Total of weights must be greater than zero

The failures in test_00_utils.py don't worry me at all: apparently pandas 1.3.x converted the seq parsed from the MATSim XML to float, while with pandas 1.5.x it is an int.
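If the intent is to tolerate this kind of dtype drift, one option (an assumption on my part, not a maintainer decision) is to relax the dtype check in pandas' own test helper:

```python
import pandas as pd

# Two frames that differ only in the dtype of "seq": float64 under
# pandas 1.3.x parsing vs int64 under 1.5.x, as described above.
expected = pd.DataFrame({"seq": [0.0, 1.0, 2.0]})
actual = pd.DataFrame({"seq": [0, 1, 2]})

# A strict comparison fails on the dtype; check_dtype=False compares
# values only, making the test robust to this pandas version change.
pd.testing.assert_frame_equal(expected, actual, check_dtype=False)
```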

The failures in test_09_samplers.py are strange, but maybe these two tests need to be fixed anyway:

  • test_random_sample_point_from_multilinestring_random_seed calls sampler.sample_point_from_multipolygon
  • test_random_sample_point_from_multipoint_random_seed calls sampler.sample_point_from_multipolygon

@YannisZa git tells me you wrote (or last edited) these two tests, do you have an idea what could be wrong here?

Add write support for `facilities.xml`

Facility information is currently present within the plans.xml output.

There are some applications which make use of a simplified summary facilities.xml output, which is defined as:

A Facility is a (Basic)Location ("getCoord") with an Id ("getId") that is connected to a Link ("getLinkId").

As per this

Redesign notebooks directory

There are a growing number of notebooks in the "notebooks" directory.

The use of these notebooks should be better designed and streamlined, specifically:

  • rename as "examples" directory
  • maybe provide a README.md document in the notebooks dir, providing contents and summaries
  • give notebooks a sensible ordering and prefix names so they are shown in the correct order. Additionally, change names so that the style is consistent
  • generally improve consistency across notebooks
  • ensure that major use cases are covered with a focus on new users

save a facility sampler

Facility samplers take a while to build, so it would be good to have a way to save a pre-baked sampler and then import it into a synthesis pipeline. Currently the FacilitySampler class is a generator, which can't be pickled. Consider adding FacilitySampler.dump() and .load() methods?
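The pickling limitation is easy to reproduce in isolation; the sampler below is an illustrative stand-in, not the real FacilitySampler:

```python
import pickle

def make_sampler():
    """Stand-in for a generator-based sampler (illustrative only)."""
    while True:
        yield "point"

gen = make_sampler()

# Generator objects carry live interpreter frame state, which pickle
# cannot serialize, so any dump() method would need to save the
# sampler's inputs/config and rebuild the generator on load() instead.
try:
    pickle.dumps(gen)
except TypeError:
    print("generators cannot be pickled")
```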

facility sampler behavior problems

Using a FacilitySampler via Population.sample_locs fails to return point locations with repeated warnings that look like

Using random sample for zone:252:home
Cannot find idx:252, returning None
Using random sample for zone:262:shop_food
Using random sample for zone:262:shop_food
Using random sample for zone:258:shop_food
Using random sample for zone:258:home
Cannot find idx:258, returning None
....

However, tests using the same sampler yield valid shapely points:

  • facility_sampler.sample('258', 'shop_food') yields a valid shapely point, with Using random sample ... warning.
  • facility_sampler.sample('258', 'home') yields a valid shapely point, without warning.

FacilitySampler params are set to fail=False, random_default=True with the input zones having the zone id as index.

The returning None error traces back to the spatial.RandomPointSampler class (see below) -- my best guess is there is a problem with the way the idxs are being passed around the various facility sampling methods in core, facility, and spatial modules.

if not idx in self.index:
    if self.fail:
        raise IndexError(f'Cannot find idx: {idx} in geoms index')
    self.logger.warning(f'Cannot find idx:{idx}, returning None')
    return None

Update README

README requires review by a fresh set of eyes. Some specific requirements are:

  • improved install instructions for Windows users
  • links to example data and notebooks

Docker container entry point should be `python` not `ipython`

What happened?

I tried using the generated PAM Docker image to run a job on AWS and realised that the entry point for the container should not be ipython, but rather python. I initially set the entry point to ipython as I assumed most people would interact with the container interactively, but that was probably only true when installing PAM on Windows devices was a massive pain. Now, most people are going to run python scripts as commands on the Docker container from, e.g., an AWS job, so python makes more sense.

In the meantime, it is possible to get the same result as python -m my_script .... by calling ipython -- my_script ....

Which operating systems have you used?

  • macOS
  • Windows
  • Linux

Version

0.2.5-dev

Relevant log output

No response

Add automated tests for PAM's Jupyter notebooks

PAM includes a notebooks directory; at the time of writing, there are 12 Jupyter notebooks inside it. We should create a script to automatically smoke test all of these notebooks, and integrate that script into the CI build.

too much logging

Some methods leave a string of logs in the terminal, e.g.:

Cropping plan components
Cropping plan ending in Leg
Cropping plan components
Cropping plan components
Cropping plan components
Cropping plan components
Cropping plan components
Cropping plan components
Cropping plan components
Cropping plan components
Cropping plan components
Cropping plan components
...

For large populations this is unhelpful, both because the logs are not very specific and because there are a lot of them.

Suggest:

  1. improve log information
  2. offer and maybe default to less logging
  3. offer method summaries (eg "N plans required cropping")
  4. catch repeated logs and stop after ~10 such repeats, similar to MATSim

Replace the last activity 'None' with 'home' when using write_benchmark

When I used the write_benchmark function to write the trips and legs from population.xml, I found that the last activity of every agent’s trip was defined as None when it should be home. I have checked: all of the None values come from the last trips.
An easy fix could be replacing the 'None' with 'home' in the function.

trips_input

    seq  purp
0     0  work
1     1  work
2     2  work
3     3  work
4     4  work
5     5  work
6     6  work
7     7  None
8     0  work
9     1  work
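The suggested workaround can be sketched with pandas, assuming the purposes arrive as the literal string "None" as in the table above (toy data, not PAM output):

```python
import pandas as pd

# Illustrative trips table reproducing the reported issue: the final
# activity purpose of each agent's tour comes through as "None".
trips = pd.DataFrame({
    "seq": [0, 1, 2, 0, 1],
    "purp": ["work", "work", "None", "work", "None"],
})

# The easy fix proposed above: substitute "home" for "None".
trips["purp"] = trips["purp"].replace("None", "home")
```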

CI builds are broken

What happened?

The open-source GitHub action we are using to send Slack notifications from CI workflows no longer builds, leading not just to missing notifications, but completely failed builds that look like this:

(screenshot of the failed build, 2023-07-18)

The quick fix is to use the latest version of the action.

Which operating systems have you used?

  • macOS
  • Windows
  • Linux

Version

None

Relevant log output

No response

test requires internet access

found out (on the train) that the following test requires internet access:

tests/test_17_vehicles.py::test_writing_all_vehicles_results_in_valid_xml_file

fix

Pip install optional dependencies not working

What happened?

pip install ./pam[dev] is not working.

This is a zsh issue - fix incoming.

Which operating systems have you used?

  • macOS
  • Windows
  • Linux

Version

None

Relevant log output

No response

Consolidate pam.write tabular methods

There are a number of tabular write methods in pam.write:

  • write_travel_diary
  • dump (wrap of to_csv)
  • to_csv (this is the dominant method, also used by pam.core.Population.to_csv)
  • write_population_csv (this function is similar to above but can write multiple populations - originally used to create a "database" of tables for PowerBI)

Requirements:

  • consolidate into single function/method
  • remove code duplication
  • simplify user interface/update docs
  • fix breaking changes to example notebooks

Additionally, it would be great to provide as much consistency as possible with the column names in the elara log outputs (https://github.com/arup-group/elara/blob/6755d4d27866463466e15a9abac316f3951d7e6f/elara/plan_handlers.py#L662). The idea is to make these two outputs (pre- and post-simulation) more easily comparable.

pid-hid mapping for plans without trips

Need to add a warning/error message for an edge case: PAM creates stay-at-home plans when loading persons without any trip data. However, we often assign persons to households using the pid-hid columns of the trips dataset; in the case of missing trips the persons are not added.

UserWarnings are not Errors

What happened?

The most common error type raised in the code is a UserWarning. This class should not be used as an exception, but as a type of warning, e.g. warnings.warn("my message", UserWarning). Instead, the appropriate exception class should be used, e.g. ValueError, KeyError, TypeError... Or a PamError could be created which subclasses Exception.
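The distinction can be sketched as follows; PamError matches the proposal above, while deprecated_name and parse_mode are made-up illustrative functions:

```python
import warnings

class PamError(Exception):
    """Project-specific base exception, as proposed above."""

def deprecated_name(value):
    # A UserWarning belongs with warnings.warn: execution continues.
    warnings.warn("use 'new_name' instead", UserWarning)
    return value

def parse_mode(mode):
    # Invalid input should raise a real exception, not a UserWarning.
    if not isinstance(mode, str):
        raise PamError(f"mode must be a string, got {type(mode).__name__}")
    return mode
```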

Which operating systems have you used?

  • macOS
  • Windows
  • Linux

Version

0.2.5-dev

Relevant log output

No response

Improve simple yield speed

What can be improved?

As I've commented in #215, there are many instances of iterators in PAM that are not performant.

If there is a case where you want a class method/property to yield elements from a simple iterator (list, tuple, set, flat dict, ...), then wrapping the object in iter will yield about twice as quickly as for i in foo: yield i.
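A minimal sketch of the two patterns (class names are illustrative):

```python
class SlowLegs:
    """Yields via a Python-level loop: one generator frame resume per
    element (the pattern flagged as slow above)."""

    def __init__(self, legs):
        self.legs = legs

    def __iter__(self):
        for leg in self.legs:
            yield leg


class FastLegs:
    """Delegates to the list's own C-level iterator via iter()."""

    def __init__(self, legs):
        self.legs = legs

    def __iter__(self):
        return iter(self.legs)


# Both produce identical results; the second avoids the per-element
# Python bytecode overhead, hence the speed-up quoted above.
assert list(SlowLegs(["walk", "bus"])) == list(FastLegs(["walk", "bus"]))
```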

Version

0.2.5-dev

NL parsing issues

record of issues for parsing NL data:

  • hardcoding of "home"
  • read/write part of docs not obvious/too short
  • tst/tet cannot accept float

plan equality bug

the plan equality method doesn't work as intended when comparing plans of unequal size.

We use the zip method to iteratively check equality of individual plan components. However, when we iterate across plans of unequal size, zip will only yield the first n elements, where n is the number of elements in the "sparser" plan.
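The truncation is easy to reproduce with plain lists; itertools.zip_longest (or an up-front length check) is one possible fix:

```python
from itertools import zip_longest

plan_a = ["home", "work", "home"]
plan_b = ["home", "work"]

# zip stops at the shorter plan, so the trailing "home" is never
# compared and the plans wrongly look equal component-by-component.
assert all(a == b for a, b in zip(plan_a, plan_b))

# zip_longest pads the shorter plan (with None), exposing the mismatch.
assert not all(a == b for a, b in zip_longest(plan_a, plan_b))
```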

Use PAM as a schedule optimiser

There is a demonstration of a schedule optimiser in the charypar-nagel branch. Demo notebook here: https://github.com/arup-group/pam/blob/charypar-nagel/notebooks/reschedule.ipynb

There are two (very bad) optimising algos implemented already: https://github.com/arup-group/pam/tree/charypar-nagel/pam/optimise

The ultimate application of this work is to optimise activity times and durations for a given population of 24 hour (MATSim) plans. This process will assume trip durations, locations (and modes) are fixed, so is just concerned with finding the optimum activity times and durations. Scoring will be based on some arbitrary (utility) function - the charypar-nagel util function is already implemented for this work and included in the existing demo.

One of the known use cases for this work is to search for better initial plans before progressing to simulation.

Tasks:

  • implement a test case that can be used to a) test quality of optimisation and b) check performance of optimisation (ie how long it takes)
  • develop and test new algos
  • get the code into a state (inc. tests) where it can be used on a real project

pytest failure at example notebooks

fresh pull, env and install (using latest mamba instructions):

running pytest fails at smoke tests, due to not finding kernels, eg:

____________________________________________________________________________________________________ /Users/fred.shone/Projects/pam/examples/01_PAM_Getting_Started.ipynb ____________________________________________________________________________________________________
[gw1] darwin -- Python 3.9.16 /Users/fred.shone/mambaforge/envs/pam/bin/python3.9
Error - No such kernel: 'pam'
----------------------------------------------------------------------------------------------------------------------------- Captured log call ------------------------------------------------------------------------------------------------------------------------------
WARNING  traitlets:kernelspec.py:286 Kernelspec name pam cannot be found!
ERROR    traitlets:manager.py:92 No such kernel named pam
Traceback (most recent call last):
  File "/Users/fred.shone/mambaforge/envs/pam/lib/python3.9/site-packages/jupyter_client/manager.py", line 85, in wrapper
    out = await method(self, *args, **kwargs)
  File "/Users/fred.shone/mambaforge/envs/pam/lib/python3.9/site-packages/jupyter_client/manager.py", line 397, in _async_start_kernel
    kernel_cmd, kw = await self._async_pre_start_kernel(**kw)
  File "/Users/fred.shone/mambaforge/envs/pam/lib/python3.9/site-packages/jupyter_client/manager.py", line 359, in _async_pre_start_kernel
    self.kernel_spec,
  File "/Users/fred.shone/mambaforge/envs/pam/lib/python3.9/site-packages/jupyter_client/manager.py", line 182, in kernel_spec
    self._kernel_spec = self.kernel_spec_manager.get_kernel_spec(self.kernel_name)
  File "/Users/fred.shone/mambaforge/envs/pam/lib/python3.9/site-packages/jupyter_client/kernelspec.py", line 287, in get_kernel_spec
    raise NoSuchKernel(kernel_name)
jupyter_client.kernelspec.NoSuchKernel: No such kernel named pam

I can run notebooks fine after identifying the correct kernel. But pytest error persists.

I can fix this by explicitly installing the kernel within the env: python3 -m ipykernel install --user --name pam. This command is also used in the github build pipeline. But I'm not sure how to include it in the local install? Maybe this.


Separate complaint. I would rather not run the smoke tests on pytest - they are relatively slow, which is annoying during dev.

Running a single module, e.g. pytest tests/test_17_vehicles.py, addresses this, but then coverage reports as failing. 🤷🏻 No biggy - just a niggle. Also, I just discovered the no-coverage option: pytest tests/test_17_vehicles.py --no-cov.

Include full linting checks in the code QA script

Currently we are running a code linting check in the CI build that restricts the linting rules in play via the argument --select=E9,F63,F7,F82. We should be using the full default set of rules so that we catch and fix a wider range of linting problems. Using this full set of rules will require us to fix a few hundred linting errors:

flake8 . --max-line-length 120 --count --show-source --statistics --exclude=scripts,tests
...
2     E101 indentation contains mixed spaces and tabs
12    E122 continuation line missing indentation or outdented
3     E124 closing bracket does not match visual indentation
33    E125 continuation line with same indent as next logical line
4     E128 continuation line under-indented for visual indent
61    E203 whitespace before ':'
4     E221 multiple spaces before operator
20    E225 missing whitespace around operator
1     E227 missing whitespace around bitwise or shift operator
1     E228 missing whitespace around modulo operator
95    E231 missing whitespace after ':'
271   E251 unexpected spaces around keyword / parameter equals
21    E261 at least two spaces before inline comment
6     E262 inline comment should start with '# '
9     E265 block comment should start with '# '
13    E266 too many leading '#' for block comment
1     E271 multiple spaces after keyword
22    E302 expected 2 blank lines, found 1
10    E303 too many blank lines (2)
52    E501 line too long (150 > 120 characters)
1     E502 the backslash is redundant between brackets
12    E711 comparison to None should be 'if cond is not None:'
1     E712 comparison to True should be 'if cond is True:' or 'if cond:'
3     E713 test for membership should be 'not in'
22    E999 SyntaxError: invalid syntax
5     F401 'pam.plot.plans.*' imported but unused
5     F403 'from pam.plot.plans import *' used; unable to detect undefined names
2     W191 indentation contains tabs
36    W291 trailing whitespace
4     W292 no newline at end of file
34    W293 blank line contains whitespace
1     W391 blank line at end of file
767

We should extract the linting into its own script, and invoke this script from our new QA script/git commit hook.

Leg vs Trip vs seq

The Leg object is sometimes a Leg (when parsing MATSim plans) and sometimes a Trip, but in all cases we use an object called Leg.

Suggest clarifying this by refactoring Leg to Trip and then adding optional Leg objects as required.

Profiling

Need to profile some common pam ops:

  • read methods
  • write methods
  • sampling new populations
  • facility sampling

pam.activity.simplify_pt_trips is broken

What happened?

pam.activity.simplify_pt_trips assumes transit has mode "pt". This is no longer the case, so the method is broken.

Which operating systems have you used?

  • macOS
  • Windows
  • Linux

Version

0.2.5-dev

Relevant log output

No response

Install on M1

notes so far on installing on M1 mac:

brew install proj (perhaps also brew install some other spatial libs too)

If you get "AttributeError: dlsym(RTLD_DEFAULT, Error_GetLastErrorNum): symbol not found"

try:

export SPATIALINDEX_C_LIBRARY='/opt/homebrew/Cellar/spatialindex/1.9.3/lib'

from here

Link to PAM Slack channel (or move user discussions to github)

Description

In the current contributing guidelines, a button for asking a question links to https://github.com/arup-group/pam/discussions

Since adding this button I realised we have a Slack channel which I believe is public, so anyone can join to ask questions. The only problem is that I can't seem to generate a public permalink for the channel. The Slack sharing links only last 14 days...

Related links

Version

0.2

Proposed change

Either we switch this button to point at the Slack channel or we move discussions to the github page. There are advantages to both. Probably easier for now to point to the Slack channel, if a public permalink can be generated.

attributes path causing confusion in write_matsim function

write.write_matsim no longer supports v12, which means attributes_path is no longer required.

We currently raise an error/warning: "WARNING:root:parameter "attributes_path" is no longer supported by write_matsim()"

But this is still causing confusion -> suggest either beefing up the message or removing it entirely now.

no way of reading matsim experienced plans

MATSim currently outputs "output_experienced_plans.xml.gz" files that do not include person attributes. As the pam.read.read_matsim function is currently set up, we will not be able to get it to read these plans with person attributes.

We need to be able to pass the experienced plans path and also the regular plans from which person attributes can be read.

Read methods - make naming and method clearer

The load_activity_plan method is for home-based tours.

I mistakenly used it with plans where the first trip is not assumed to start from home. Can we label/name this to be more explicit?

Further content in this proposed PR

Reading MATSim plans results in inflated leg and activity numbers

Created input plans with the following stats:

{'num_households': 76530, 
'num_people': 76530, 
'num_activities': 313395, 
'num_legs': 236865}

Reading the output population after iteration 0 with MATSim gives

{'num_households': 76530,
 'num_people': 76530,
 'num_activities': 571585,
 'num_legs': 495055}

The following message shows up when reading both of them:

Negative duration activity found at pid=1000121102110
Cropping plan components

There are a few things here:

  1. I'm surprised that reading in plans that were created with pam produces negative durations.
  2. What happens with activities with negative duration? Does it warrant doubling the number of activities/legs?
  3. I think this should be refined. The goal of pam, as I understand it, is to read plans, modify them and run the new plans as a scenario. Having so many extra activities will result in unintended outcomes: every extra activity is extra demand, so you're making a scenario, but not the kind that you expect; your population may look very different once you load it in.

All `Person` attributes default to `java.lang.String` when saving to MATSim xml

All attributes for a Person are currently saved as java.lang.String in the MATSim xml files:
https://github.com/arup-group/pam/blob/main/pam/write.py#L177

Sometimes MATSim requires different java types. Below is an error caused by running the multimodal contrib, which is expecting Person's age to be saved as java.lang.Integer for example:

  | java.lang.ClassCastException: class java.lang.String cannot be cast to class java.lang.Integer (java.lang.String and java.lang.Integer are in module java.base of loader 'bootstrap')
  | at org.matsim.contrib.multimodal.router.util.WalkTravelTime.setPerson(WalkTravelTime.java:276) ~[columbus-2.1.0-jar-with-dependencies.jar:2.1.0]
  | at org.matsim.contrib.multimodal.router.util.WalkTravelTime.getLinkTravelTime(WalkTravelTime.java:139) ~[columbus-2.1.0-jar-with-dependencies.jar:2.1.0]
  | at org.matsim.core.router.AStarEuclidean.addToPendingNodes(AStarEuclidean.java:152) ~[columbus-2.1.0-jar-with-dependencies.jar:2.1.0]
  | at org.matsim.core.router.Dijkstra.relaxNodeLogic(Dijkstra.java:438) ~[columbus-2.1.0-jar-with-dependencies.jar:2.1.0]
  | at org.matsim.core.router.Dijkstra.relaxNode(Dijkstra.java:409) ~[columbus-2.1.0-jar-with-dependencies.jar:2.1.0]
  | at org.matsim.core.router.AStarLandmarks.relaxNode(AStarLandmarks.java:137) ~[columbus-2.1.0-jar-with-dependencies.jar:2.1.0]
  | at org.matsim.core.router.Dijkstra.searchLogic(Dijkstra.java:317) ~[columbus-2.1.0-jar-with-dependencies.jar:2.1.0]
  | at org.matsim.core.router.Dijkstra.calcLeastCostPath(Dijkstra.java:234) ~[columbus-2.1.0-jar-with-dependencies.jar:2.1.0]
  | at org.matsim.core.router.AStarLandmarks.calcLeastCostPath(AStarLandmarks.java:124) ~[columbus-2.1.0-jar-with-dependencies.jar:2.1.0]
  | at org.matsim.core.router.NetworkRoutingModule.calcRoute(NetworkRoutingModule.java:108) ~[columbus-2.1.0-jar-with-dependencies.jar:2.1.0]
  | at org.matsim.core.router.TripRouter.calcRoute(TripRouter.java:182) ~[columbus-2.1.0-jar-with-dependencies.jar:2.1.0]
  | at org.matsim.core.router.PlanRouter.run(PlanRouter.java:101) ~[columbus-2.1.0-jar-with-dependencies.jar:2.1.0]
  | at org.matsim.core.population.algorithms.PersonPrepareForSim.run(PersonPrepareForSim.java:219) ~[columbus-2.1.0-jar-with-dependencies.jar:2.1.0]
  | at org.matsim.core.population.algorithms.ParallelPersonAlgorithmUtils$PersonAlgoThread.run(ParallelPersonAlgorithmUtils.java:145) ~[columbus-2.1.0-jar-with-dependencies.jar:2.1.0]
  | at java.lang.Thread.run(Thread.java:834) ~[?:?]

We had a go at 'guessing' java types from python types in genet: arup-group/genet#124, master/genet/utils/java_dtypes.py, which may be of use here.
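A guessing function in that spirit might look like this (the mapping and names are illustrative, not genet's actual API):

```python
# Illustrative mapping from Python types to MATSim attribute classes.
JAVA_DTYPES = {
    bool: "java.lang.Boolean",
    int: "java.lang.Integer",
    float: "java.lang.Double",
    str: "java.lang.String",
}

def guess_java_class(value) -> str:
    # Exact type lookup: bool is a subclass of int, so using
    # type(value) (not isinstance) keeps True from mapping to
    # java.lang.Integer. Unknown types fall back to String.
    return JAVA_DTYPES.get(type(value), "java.lang.String")
```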

Duplicate household attributes - results in nested outputs

In the event of duplicate household attribute rows, you will see successful ingestion and processing, but an issue with the outputs when writing to file (in this case MATSim XML):

<object id="10002061023">
    <attribute class="java.lang.String" name="psex">{10002061023: 'Male'}</attribute>
    <attribute class="java.lang.String" name="page">{10002061023: 40}</attribute>
    <attribute class="java.lang.String" name="pdlicense">{10002061023: 1}</attribute>
    <attribute class="java.lang.String" name="hincome">{10002061023: 4}</attribute>
    <attribute class="java.lang.String" name="hsize">{10002061023: 4}</attribute>
    <attribute class="java.lang.String" name="hid">{10002061023: 640586395095}</attribute>
    <attribute class="java.lang.String" name="subpopulation">{10002061023: 'License'}</attribute>
    <attribute class="java.lang.String" name="source">{10002061023: 'phase-2-tfl'}</attribute>
  </object>

This nested attribute cannot be read by MATSim. It happens when this function returns more than one row (as there are duplicate households).

Two possible approaches to handle this:

  1. The most upstream check is to warn or hard error if there are duplicate person_attributes rows in the inputted pandas dataframe.
  2. Silently remove duplicate rows, e.g. via the pandas function person_attributes_df.drop_duplicates(keep="first", inplace=True)
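Option 2 from above can be sketched directly (toy data, mirroring the duplicated household in the output snippet):

```python
import pandas as pd

# Household attributes with a duplicated row, as described above.
attrs = pd.DataFrame({
    "hid": [10002061023, 10002061023, 640586395095],
    "hsize": [4, 4, 2],
})

# Keep the first of each duplicated row so downstream lookups return
# a single row per household instead of a nested mapping.
attrs = attrs.drop_duplicates(keep="first")
```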

Make dev install instructions more prominent

Description

Most PAM users are also developers, so the developer/contributor installation instructions should be clearer in the README and in the documentation. The README currently only gives the user installation instructions, leading to an environment being installed without the expected testing/docs dependencies.

Related links

Version

0.2

Proposed change

Add this code snippet (which includes the dev requirements) to the installation instructions and the README, under a subheading such as "as a developer":

git clone [email protected]:arup-group/pam.git
cd pam
mamba create -n pam -c city-modelling-lab --file requirements/base.txt --file requirements/dev.txt
mamba activate pam
pip install --no-deps -e .

too many too big examples

The examples directory is out of control and some of the notebooks take ages to run, which is making CI and builds slow.
