california's Introduction

An open synthetic population of California

This repository contains the code to create an open data synthetic travel demand for any region in California.

Main reference

The main research reference for the general pipeline methodology is:

Hörl, S. and M. Balac (2020) Reproducible scenarios for agent-based transport simulation: A case study for Paris and Île-de-France, Arbeitsberichte Verkehrs- und Raumplanung, 1499, IVT, ETH Zurich, Zurich.

The main research reference for the California synthetic travel demand is:

M. Balac and S. Hörl (2021) Synthetic population for the state of California based on open-data: examples of San Francisco Bay area and San Diego County, presented at the 100th Annual Meeting of the Transportation Research Board.

What is this?

This repository contains the code to create an open data synthetic travel demand for any region in California. It takes as input several publicly available data sources to create a data set that closely represents the socio-demographic attributes of persons and households in the region, as well as their daily mobility patterns. Those mobility patterns consist of activities which are performed at certain locations (like work, education, shopping, ...) and which are connected by trips with a certain mode of transport. It is known when and where these activities happen.

Such a synthetic population is useful for many research and planning applications. Most notably, it serves as input to agent-based transport simulations, which simulate the daily mobility behaviour of people on a spatially and temporally detailed scale. Such data has also been used to study the spread of diseases and the placement of services and facilities.

The synthetic travel demand for California can be generated from scratch by anyone with basic knowledge of Python. Detailed instructions on how to generate a synthetic population with this repository are available.

Although the travel demand is independent of the downstream application or simulation tool, we provide the means to create an input population for the agent- and activity-based transport simulation framework MATSim.

This pipeline has been adapted to many other regions and cities around the world and is under constant development. It is released under the GPL license, so feel free to make adaptations, contributions or forks as long as you keep your code open as well!

Built scenarios

In case you want to use the synthetic travel demand directly without building it yourself, already built scenarios exist for Los Angeles, San Francisco and San Diego. Those can be found here:

Ready-to-use 1pct agent-based models can also be found here:

  • San Francisco, which can be run with the RunSimulation class available here
  • Los Angeles (will arrive soon), which can be run with the RunSimulation class available here

Publications

california's Issues

Python error during synthesis process

Dear eqasim developers,

First of all, thank you for the great project and the work.
I am currently trying to recreate the San Francisco scenario, following the documentation meticulously.
Unfortunately, I ran into a Python error, and since I am not that familiar with Python, I would like to ask whether you can help me.

It always happens in stage 22, where "synthesis.population.spatial.by_person.primary_locations_education" is executed:

INFO:synpp:Pipeline progress: 22/28 (78.57%)                                                                                                                                                    
INFO:synpp:Executing stage synthesis.population.spatial.by_person.primary_locations_education__9d115bd22684d7fa5b85e5ab5e91a0df ...                                                             
INFO:synpp:Loading cache for synthesis.population.sociodemographics__1fb23cc29a732c02767b3d80ad838e71 ...                                                                                       
INFO:synpp:Loading cache for synthesis.population.spatial.by_person.primary_zones__9d115bd22684d7fa5b85e5ab5e91a0df ...                                                                         
INFO:synpp:Loading cache for synthesis.population.spatial.by_person.primary_locations__9d115bd22684d7fa5b85e5ab5e91a0df ...                                                                     
INFO:synpp:Loading cache for synthesis.destinations__276e498090d2b7b804c663738c6038d1 ...                                                                                                       
INFO:synpp:Loading cache for data.hts.cleaned__4a0d7dfd1ef8a95f682308468477cd08 ...                                                                                                             
INFO:synpp:Loading cache for synthesis.population.trips__1fb23cc29a732c02767b3d80ad838e71 ...                                                                                                   
multiprocessing.pool.RemoteTraceback:                                                                                                                                                           
"""                                                                                                                                                                                             
Traceback (most recent call last):                                                                                                                                                              
  File "/home/bunsenuser/.conda/envs/california/lib/python3.7/multiprocessing/pool.py", line 121, in worker                                                                                     
    result = (True, func(*args, **kwds))        
  File "/home/bunsenuser/.conda/envs/california/lib/python3.7/multiprocessing/pool.py", line 44, in mapstar                                                                                     
    return list(map(*args))                                                                                                                                                                     
  File "/home/bunsenuser/workspace/andyb/project_22cities/matsim-california/california/synthesis/population/spatial/by_person/primary_locations_education.py", line 53, in impute_education_locations_same_zone                                 
    indices_cp, distances_cp = tree.query_radius(home_coordinates_cp, r=random_from_cdf_cp, return_distance = True, sort_results=True)                                                          
  File "sklearn/neighbors/_binary_tree.pxi", line 1452, in sklearn.neighbors._kd_tree.BinaryTree.query_radius                                                                                   
  File "/home/bunsenuser/.conda/envs/california/lib/python3.7/site-packages/sklearn/utils/validation.py", line 556, in check_array                                                              
    "if it contains a single sample.".format(array))                                            
ValueError: Expected 2D array, got 1D array instead:                                                                                                                                            
array=[].                                       
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.                                                   
"""        
The above exception was the direct cause of the following exception:                            
                                                                                                
Traceback (most recent call last):                                                              
  File "/home/bunsenuser/.conda/envs/california/lib/python3.7/runpy.py", line 193, in _run_module_as_main                  
    "__main__", mod_spec)                                                                       
  File "/home/bunsenuser/.conda/envs/california/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/bunsenuser/.conda/envs/california/lib/python3.7/site-packages/synpp/__main__.py", line 14, in <module>
    synpp.run_from_yaml(config_path)
  File "/home/bunsenuser/.conda/envs/california/lib/python3.7/site-packages/synpp/pipeline.py", line 818, in run_from_yaml
    Synpp.build_from_yml(path).run_pipeline()
  File "/home/bunsenuser/.conda/envs/california/lib/python3.7/site-packages/synpp/pipeline.py", line 842, in run_pipeline
    ensure_working_directory=True)
  File "/home/bunsenuser/.conda/envs/california/lib/python3.7/site-packages/synpp/pipeline.py", line 742, in run
    result = stage["wrapper"].execute(context)
  File "/home/bunsenuser/.conda/envs/california/lib/python3.7/site-packages/synpp/pipeline.py", line 63, in execute
    return self.instance.execute(context)
  File "/home/bunsenuser/workspace/andyb/project_22cities/matsim-california/california/synthesis/population/spatial/by_person/primary_locations_education.py", line 156, in execute
    educ_19_100 = parallelize_dataframe(hts_trips_educ, df_agents, df_candidates, df_trips, 18,  100, 2, "/nas/balacm/educ16100.png",impute_education_locations_same_zone, 6)
  File "/home/bunsenuser/workspace/andyb/project_22cities/matsim-california/california/synthesis/population/spatial/by_person/primary_locations_education.py", line 106, in parallelize_dataframe
    df_locations = pd.concat(pool.map(prod_x, df_split))
  File "/home/bunsenuser/.conda/envs/california/lib/python3.7/multiprocessing/pool.py", line 268, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/home/bunsenuser/.conda/envs/california/lib/python3.7/multiprocessing/pool.py", line 657, in get
    raise self._value
ValueError: Expected 2D array, got 1D array instead:
array=[].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.             

I suspect that it could be due to changed data sources, so I have already checked the input files several times and tried to modify the Python script, but without success. I would appreciate it very much if you could give me a hint here. :)

Thank you and many greetings
Andy
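
One possible reading of the traceback: `array=[]` suggests that `tree.query_radius` received an empty set of home coordinates, for example because one of the chunks handed to a worker process contained no matching persons. This is an assumption and is not confirmed by the log. A minimal sketch of a guard for that case follows. The names `query_radius_or_empty` and `FakeTree` are hypothetical and not part of the repository; `FakeTree` only stands in for a tree object exposing a scikit-learn style `query_radius` method so that the sketch runs without extra packages.

```python
def query_radius_or_empty(tree, home_coordinates, radii):
    """Query `tree` only when there is at least one home coordinate.

    `tree` is assumed to expose a scikit-learn style method
    query_radius(X, r=..., return_distance=True, sort_results=True).
    """
    if len(home_coordinates) == 0:
        # Nothing to look up in this chunk: return empty results instead of
        # passing an empty array on to the tree.
        return [], []
    return tree.query_radius(
        home_coordinates, r=radii, return_distance=True, sort_results=True
    )


class FakeTree:
    """Hypothetical stand-in for a KD-tree, used only to make the sketch runnable."""

    def __init__(self):
        self.calls = 0

    def query_radius(self, X, r, return_distance=True, sort_results=True):
        self.calls += 1
        # One empty neighbour list per query point; a real tree would
        # return the indices and distances of the neighbours found.
        return [[] for _ in X], [[] for _ in X]


tree = FakeTree()

# Empty chunk: the tree is never called.
empty_result = query_radius_or_empty(tree, [], [])

# Non-empty chunk: the call is forwarded to the tree unchanged.
filled_result = query_radius_or_empty(tree, [[0.0, 0.0], [1.0, 1.0]], [5.0, 5.0])
```

If the empty-chunk assumption holds, the same check could be applied where the script calls `tree.query_radius`, or the number of parallel chunks could be reduced so that no chunk ends up empty. Whether empty chunks are the actual cause would still need to be confirmed against the input data.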
