Comments (3)
To help me understand `freq` some more (I have already read the relevant section in the docs), I have a couple of questions:
- If we have a population of 100 people distributed across 40 households, would we generally set up a model such that `sum(hh.freq for hh in population.households.values()) == 40` and `sum(person.freq for hh in population.households.values() for person in hh.people.values()) == 100`?
- If we have two `Person` objects in one `household`, one with a `freq` of 2 and another with a `freq` of 5, does that imply 7 people live in this household type, represented by two agents?
- If we then have that household having a `freq` of 10, does that imply 70 people of our population are accounted for in that household?

Finally, why is it `freq`? Why not `weight` or `size`? (the latter I can understand, since `size` has an existing meaning for objects in Python).
from pam.
> If we have a population of 100 people distributed across 40 households, would we generally set up a model such that `sum(hh.freq for hh in population.households.values()) == 40` and `sum(person.freq for hh in population.households.values() for person in hh.people.values()) == 100`?
Nice questions. Freq or weight is intended to represent the quantity of an entity that would be found in a (representative) population. It is used to sample individual agents in a representative way.
Back to basics: we represent a real population of N with a subset of agents in pam. The `freq` of all hhs should add up to N, the `freq` of all persons should add up to N, and the `freq` of all trips should add up to N.
BUT we historically found trip, person and household "weights" to be inconsistent in input data sets: within households, between a hh and the persons within it, within a person's trips, and so on. This is because we were using data intended for use in (for example) a trip-based model, where there was no requirement for consistency at higher levels.
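A quick way to see what the weights in an input actually sum to at each level, and where they disagree, is sketched below. The `Person` and `Household` dataclasses are hypothetical stand-ins rather than pam's own classes, and treating a household as "consistent" only when every person carries the household's weight is one possible definition, assumed here for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    """Hypothetical stand-in for a weighted person (not pam's class)."""
    freq: float


@dataclass
class Household:
    """Hypothetical stand-in for a weighted household (not pam's class)."""
    freq: float
    people: dict = field(default_factory=dict)


def weighted_totals(households):
    """Return (sum of household freq, sum of person freq)."""
    hh_total = sum(hh.freq for hh in households.values())
    person_total = sum(
        p.freq for hh in households.values() for p in hh.people.values()
    )
    return hh_total, person_total


def inconsistent_households(households):
    """Return ids of households where some person's freq differs from the
    household's freq (an assumed definition of inconsistency)."""
    return [
        hid
        for hid, hh in households.items()
        if any(p.freq != hh.freq for p in hh.people.values())
    ]


households = {
    "a": Household(freq=10, people={"a1": Person(freq=10), "a2": Person(freq=10)}),
    "b": Household(freq=30, people={"b1": Person(freq=30), "b2": Person(freq=50)}),
}
hh_total, person_total = weighted_totals(households)
flagged = inconsistent_households(households)
```

Here household `"b"` is flagged because one of its persons carries a different weight from the household itself.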
A user might fix this themselves before sampling a MATSim population (all agents in MATSim have the same weight*). More commonly now, weights are calculated at household level by our own population synthesis process, so the weighting should be consistent. Or, more simply, weights are only specified at hh level.
The assumption is that any weighted sampler will use the weight from whatever it samples. So a "hh sampler" would use the household weighting. There are mechanics in pam that will use a weighted average of person weights for a hh if the hh has no weight. This was convenient in the past but is perhaps now redundant. I would be open to removing it.
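As a rough illustration of that idea, the sketch below samples households in proportion to their own weight and falls back to the person weights when a household has none. Everything here is hypothetical: the classes are stand-ins rather than pam's, `household_weight` and `sample_households` are names made up for this example, and the fallback uses a plain mean of the person weights as a simplification, since the exact averaging pam applies is not described here.

```python
import random
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Person:
    """Hypothetical stand-in for a weighted person (not pam's class)."""
    freq: float


@dataclass
class Household:
    """Hypothetical stand-in; freq=None means 'no household weight given'."""
    hid: str
    freq: Optional[float] = None
    people: list = field(default_factory=list)


def household_weight(hh):
    """Use the household's own weight, else the mean of its persons' weights.

    The plain mean is an assumption made for this sketch.
    """
    if hh.freq is not None:
        return hh.freq
    if not hh.people:
        raise ValueError(f"household {hh.hid} has no weight and no persons")
    return sum(p.freq for p in hh.people) / len(hh.people)


def sample_households(households, k, seed=None):
    """Draw k households with replacement, proportional to their weight."""
    rng = random.Random(seed)
    weights = [household_weight(hh) for hh in households]
    return rng.choices(households, weights=weights, k=k)


households = [
    Household("a", freq=10, people=[Person(2), Person(5)]),
    Household("b", freq=None, people=[Person(2), Person(5)]),
]
sample = sample_households(households, k=5, seed=0)
```

Household `"a"` is sampled with weight 10, while `"b"` has no weight of its own and so falls back to the mean of its person weights.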
I am happy to call it either `frequency` or `weight`. We made heavy use of "observed frequency based sampling" in the past, hence `freq`. I think `size` would be a bit misleading (for a hh I would expect it to be the number of occupants).
\* Earlier I said "all agents in MATSim have the same weight". For example, in a 10% sample of the population, all agents would have a weight of 10. This could be relaxed in future for a good reason, but I am not sure what that would be.
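The arithmetic behind that example is just the reciprocal of the sample fraction. A minimal sketch, with `uniform_agent_weight` being a hypothetical helper name rather than anything in pam or MATSim:

```python
def uniform_agent_weight(sample_fraction: float) -> float:
    """Weight each agent carries when every agent in a sample is weighted equally."""
    if not 0 < sample_fraction <= 1:
        raise ValueError("sample_fraction must be in the interval (0, 1]")
    return 1 / sample_fraction


weight_10pct = uniform_agent_weight(0.10)   # each agent stands for about 10 people
weight_full = uniform_agent_weight(1.0)     # a 100% sample: each agent stands for 1
```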
from pam.
Seems `.size` should be reserved for "unweighted" counts and either `.freq` or `.weight` for "weighted"?
So I propose that:
- `population.size` be changed to be the count of hhs (rather than weighted)
- keep `.freq` as the "weighted" sum

I would also be open to being more explicit about what is being counted, e.g.:
- add `Population.persons_freq()`
- add `Population.hhs_freq()`
- add `Household.persons_freq()`
- remove `Population.freq`
- remove `Household.freq()`
- remove `Leg.freq`
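To make the proposal concrete, here is a minimal sketch of how the explicit methods might look. The classes are simplified stand-ins written for this comment, not pam's current implementation, and the method bodies are only one plausible reading of the names above.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    """Simplified stand-in for illustration only."""
    freq: float = 1.0


@dataclass
class Household:
    """Simplified stand-in for illustration only."""
    freq: float = 1.0
    people: dict = field(default_factory=dict)

    def persons_freq(self):
        """Weighted count of persons in this household."""
        return sum(p.freq for p in self.people.values())


@dataclass
class Population:
    """Simplified stand-in for illustration only."""
    households: dict = field(default_factory=dict)

    @property
    def size(self):
        """Unweighted count of households."""
        return len(self.households)

    def hhs_freq(self):
        """Weighted count of households."""
        return sum(hh.freq for hh in self.households.values())

    def persons_freq(self):
        """Weighted count of persons across all households."""
        return sum(hh.persons_freq() for hh in self.households.values())


population = Population(
    households={
        "a": Household(freq=10, people={"a1": Person(freq=2), "a2": Person(freq=5)}),
        "b": Household(freq=30, people={"b1": Person(freq=30)}),
    }
)
```

With this split, `population.size` answers "how many household objects do I hold?", while `hhs_freq()` and `persons_freq()` answer "how many households or persons do they represent?".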