Comments (5)

giheungkim commented on August 10, 2024

The "NameError: name 'np' is not defined" problem occurred because numpy was not shared across the processes that pathos created, even though it was imported in the main code. The fix was to import numpy inside the custom functions I wrote for computing weights and values.
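As a minimal sketch of that fix: the (t, products, agents) signature follows the compute_weights call shown in the traceback in this thread, while the weights themselves and the plain lists passed in are hypothetical placeholders, not real pyblp data.

```python
def compute_weights(t, products, agents):
    # Import inside the function so that the name `np` exists in whichever
    # worker process ends up running this code.
    import numpy as np

    # Hypothetical weights: one row per agent, one column per product.
    return np.ones((len(agents), len(products)))


# Stand-in inputs for illustration only.
weights = compute_weights("9.0", ["product_a", "product_b", "product_c"], ["agent_1", "agent_2"])
print(weights.shape)  # (2, 3)
```

Repeated imports are cheap after the first one in each process, since Python caches loaded modules.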

jeffgortmaker commented on August 10, 2024

I think the following error message should have shown up in your full traceback?

pathos_message = (
"The built-in multiprocessing module does not support lambda functions. Consider setting "
"the use_pathos of parallel to True."
)

Let me know if it didn't show up for some reason. The solution is to set use_pathos=True (and to install pathos), which in turn uses dill.
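The restriction can be reproduced with the standard library alone: pickle, which the built-in multiprocessing module relies on, stores a function as a reference to its importable name, and a lambda has no such name. A small sketch (the helper name is_picklable is made up for illustration):

```python
import pickle


def is_picklable(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


# A lambda has no importable name, so pickle cannot store a reference to it.
print(is_picklable(lambda t, p, a: 1.0))  # False

# A function that pickle can look up by name (here a built-in) is fine.
print(is_picklable(len))  # True
```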

giheungkim commented on August 10, 2024

Hi Jeff,

Thank you, that made the code go through. However, I still see two issues. They may just be the nature of pathos, but I wanted to bring them to your attention.

First, it seems that pathos is doing a really bad job at parallel processing.

For example, in my problem without micro moments, the default method uses all N cores that were started, and each objective evaluation takes about 30 seconds.

If I use pathos instead of the default, all N processes are started (I can see them in Task Manager), but only one core gets used. As a result, each objective evaluation takes about 7 minutes 30 seconds.

The second, more minor issue is that in the problem with micro moments, it raises the following exception:

---------------------------------------------------------------------------
RemoteTraceback                           Traceback (most recent call last)
RemoteTraceback:
"""
Traceback (most recent call last):
File "C:\Users\robin\anaconda3\lib\site-packages\pyblp\markets\market.py", line 1368, in compute_micro_weights
weights = np.asarray(dataset.compute_weights(self.t, self.products, agents), options.dtype)
File "C:\Users\robin\AppData\Local\Temp\ipykernel_20168\2028236089.py", line 8, in
NameError: name 'np' is not defined

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "C:\Users\robin\anaconda3\lib\site-packages\multiprocess\pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "C:\Users\robin\anaconda3\lib\site-packages\pathos\helpers\mp_helper.py", line 15, in
func = lambda args: f(*args)
File "C:\Users\robin\anaconda3\lib\site-packages\pyblp\utilities\basics.py", line 160, in generate_items_worker
return key, method(instance, *method_args)
File "C:\Users\robin\anaconda3\lib\site-packages\pyblp\utilities\basics.py", line 652, in wrapper
returned = decorated(*args, **kwargs)
File "C:\Users\robin\anaconda3\lib\site-packages\pyblp\markets\problem_market.py", line 93, in solve
self.safely_compute_micro_contributions(
File "C:\Users\robin\anaconda3\lib\site-packages\pyblp\utilities\basics.py", line 652, in wrapper
returned = decorated(*args, **kwargs)
File "C:\Users\robin\anaconda3\lib\site-packages\pyblp\markets\problem_market.py", line 171, in safely_compute_micro_contributions
self.compute_micro_contributions(
File "C:\Users\robin\anaconda3\lib\site-packages\pyblp\markets\market.py", line 1794, in compute_micro_contributions
self.compute_micro_dataset_contributions(
File "C:\Users\robin\anaconda3\lib\site-packages\pyblp\markets\market.py", line 1453, in compute_micro_dataset_contributions
weights = self.compute_micro_weights(dataset, agent_indices)
File "C:\Users\robin\anaconda3\lib\site-packages\pyblp\markets\market.py", line 1371, in compute_micro_weights
raise RuntimeError(message) from exception
RuntimeError: Failed to compute weights for micro dataset 'Mkt 9.0 ACS: 2000 Observations in Market '9.0'' because of the above exception.
"""

The above exception was the direct cause of the following exception:

RuntimeError Traceback (most recent call last)
Cell In[17], line 3
      1 # With Micro Moments but without RC - by Income Bin
      2 with pyblp.parallel(20,use_pathos=True):
----> 3     demo_frac_rent_results= demographic_problem_frac_rent.solve(
      4         sigma=no_sigma,
      5         pi=initial_pi_frac_rent,
      6         iteration = iter_routine,
      7         optimization = opt_routine,
      8         micro_moments=micro_moments_wbin,
      9     )

File ~\anaconda3\lib\site-packages\pyblp\economies\problem.py:696, in ProblemEconomy.solve(self, sigma, pi, rho, beta, gamma, sigma_bounds, pi_bounds, rho_bounds, beta_bounds, gamma_bounds, delta, method, initial_update, optimization, scale_objective, check_optimality, finite_differences, error_behavior, error_punishment, delta_behavior, iteration, fp_type, shares_bounds, costs_bounds, W, center_moments, W_type, se_type, micro_moments, micro_sample_covariances, resample_agent_data)
    694 else:
    695     output("Estimating standard errors ...")
--> 696 final_progress = compute_step_progress(
    697     theta, progress, compute_gradient, compute_hessian, compute_micro_covariances,
    698     detect_micro_collinearity, compute_simulation_covariances
    699 )
    700 iteration_stats.append(final_progress.iteration_stats)
    701 optimization_stats.evaluations += 1

File ~\anaconda3\lib\site-packages\pyblp\economies\problem.py:820, in ProblemEconomy._compute_progress(self, parameters, moments, iv, W, scale_objective, error_behavior, error_punishment, delta_behavior, iteration, fp_type, shares_bounds, costs_bounds, finite_differences, resample_agent_data, theta, progress, compute_gradient, compute_hessian, compute_micro_covariances, detect_micro_collinearity, compute_simulation_covariances, agents_override)
    818 parts_collinearity_candidate_values: Dict[Hashable, Dict[MicroDataset, Array]] = {}
    819 generator = generate_items(self.unique_market_ids, market_factory, ProblemMarket.solve)
--> 820 for t, generated_t in generator:
    821     (
    822         delta_t, xi_jacobian_t, parts_numerator_t, parts_denominator_t, parts_numerator_jacobian_t,
    823         parts_denominator_jacobian_t, parts_covariances_numerator_t, weights_mapping_t, values_mapping_t,
    824         clipped_shares_t, iteration_stats_t, tilde_costs_t, omega_jacobian_t, clipped_costs_t, errors_t
    825     ) = generated_t
    827     delta[self._product_market_indices[t]] = delta_t

File ~\anaconda3\lib\site-packages\multiprocess\pool.py:873, in IMapIterator.next(self, timeout)
    871 if success:
    872     return value
--> 873 raise value

RuntimeError: Failed to compute weights for micro dataset 'Mkt 9.0 ACS: 2000 Observations in Market '9.0''

Hopefully this is easy to remedy, but even if this issue gets fixed, I'm not sure why pathos performs so much worse than the default multiprocessing.

Thanks!

giheungkim commented on August 10, 2024

Hi Jeff,

For now, I've found a workaround: defining my own functions with the "def function_name(t,p,a): ..." syntax for the compute_weights, compute_values, compute_value, and compute_gradient arguments of the relevant micro moment objects.

Without any lambda expressions, the default multiprocessing goes through and seems to use most of the started cores.
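When a lambda was only there to fix an extra parameter (for example, an income bin), functools.partial applied to a named function is one way to keep that pattern without a lambda, since the standard pickle module can serialize a partial of an importable function. A rough sketch, assuming the library only needs a callable taking (t, products, agents); the function name, the income_bin parameter, and the dict-based agents are hypothetical stand-ins, not pyblp's actual data structures:

```python
from functools import partial


def weights_for_bin(t, products, agents, income_bin):
    """Weight 1.0 for agents in the given income bin, 0.0 otherwise.

    `agents` is assumed here to be a sequence of dicts with an "income_bin"
    key, which is a hypothetical stand-in for real agent data.
    """
    return [
        [1.0 if agent["income_bin"] == income_bin else 0.0 for _ in products]
        for agent in agents
    ]


# Instead of: lambda t, p, a: weights_for_bin(t, p, a, income_bin=3)
compute_weights_bin3 = partial(weights_for_bin, income_bin=3)

agents = [{"income_bin": 3}, {"income_bin": 1}]
print(compute_weights_bin3("9.0", ["product_a", "product_b"], agents))
# [[1.0, 1.0], [0.0, 0.0]]
```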

jeffgortmaker commented on August 10, 2024

Your minor issue seems like a missing import in your code? The top of the traceback is NameError: name 'np' is not defined in a local file.

I'm not sure why you're having trouble with pathos. My parallel implementation uses either multiprocessing.pool.Pool.imap_unordered or pathos.multiprocessing.ProcessPool.uimap, depending on whether or not pathos is being used. My guess is any differences come down to different behavior (e.g., in iterated calls?) of these two methods, but I'd have to take a look at a minimum working example that tries to replicate the behavior you're seeing.

Since you've found a workaround I'll close this issue for now, but feel free to keep commenting if you end up putting together a minimum working example (i.e., just a few lines of code, ideally just comparing these above two methods) of the weird pathos behavior!
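A starting point for such a minimum working example might look like the sketch below. It times the two methods named above on the same CPU-bound tasks. The workload busy_work, the task sizes, and the process count are arbitrary assumptions, and nothing here reproduces pyblp's own code path; pathos is treated as optional and skipped if it is not installed.

```python
import time
from multiprocessing.pool import Pool


def busy_work(n):
    """CPU-bound stand-in for one market's computation (hypothetical workload)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def time_builtin(processes, tasks):
    """Time multiprocessing.pool.Pool.imap_unordered over the tasks."""
    start = time.perf_counter()
    with Pool(processes) as pool:
        results = sorted(pool.imap_unordered(busy_work, tasks))
    return time.perf_counter() - start, results


def time_pathos(processes, tasks):
    """Time pathos.multiprocessing.ProcessPool.uimap over the same tasks."""
    from pathos.multiprocessing import ProcessPool  # optional dependency

    start = time.perf_counter()
    pool = ProcessPool(nodes=processes)
    results = sorted(pool.uimap(busy_work, tasks))
    pool.close()
    pool.join()
    return time.perf_counter() - start, results


if __name__ == "__main__":
    tasks = [200_000] * 8
    elapsed, _ = time_builtin(2, tasks)
    print(f"imap_unordered: {elapsed:.2f}s")
    try:
        elapsed, _ = time_pathos(2, tasks)
        print(f"pathos uimap:   {elapsed:.2f}s")
    except ImportError:
        print("pathos is not installed; skipping the uimap timing")
```

Watching per-core usage while each timing runs would show whether the "only one core gets used" behavior appears outside of pyblp as well.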
