Comments (4)

943fansi commented on August 31, 2024

Thanks for your answers and sharing. Your patient and detailed reply is very helpful to me. Thank you again sincerely!

from enas-pdm.

mohyunho commented on August 31, 2024

Hi, 943fansi. Thanks for reaching out.
To find the cause, I tried running the code, but I could not reproduce the error, so please understand that I cannot give you an exact fix.

Based on the discussion in the forum you shared, my guess is that the error comes from the multiprocessing code that uses pathos. I used pathos multiprocessing in my code, and the variable 'jobs' is currently set to '2' for that.
So I suggest setting it to '1' to disable multiprocessing, or directly removing the pathos-related code in 'evolutionary_algorithm.py'.

If this does not help and the problem persists, please let me know.
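
For illustration, here is a minimal sketch of how a 'jobs' setting can switch between serial and pathos-based evaluation. The names `evaluate_population` and `fitness_fn` are hypothetical and are not taken from 'evolutionary_algorithm.py'; the repository's actual code may be organized differently.

```python
def evaluate_population(individuals, fitness_fn, jobs=1):
    """Evaluate every individual; use a pathos pool only when jobs > 1.

    `evaluate_population` and `fitness_fn` are hypothetical names used
    for illustration, not the repository's API.
    """
    if jobs <= 1:
        # Serial path: no worker processes are created
        return [fitness_fn(ind) for ind in individuals]

    # Imported lazily so the serial path does not depend on pathos
    from pathos.multiprocessing import ProcessingPool

    pool = ProcessingPool(nodes=jobs)
    try:
        return pool.map(fitness_fn, individuals)
    finally:
        # Release the worker processes once evaluation is done
        pool.close()
        pool.join()
        pool.clear()
```

With `jobs=1` the pool is never created, so any error that originates in the multiprocessing layer should not appear on that path.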

943fansi commented on August 31, 2024

Thank you for your advice. Your suggestion was very useful: as you guessed, it works when 'jobs' is set to '1'.

Another problem concerns the GPU during training:
[Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR]
Could you tell me what hardware you used for this code?
I noticed that 5014 is set in the code, but my GPU is an RTX 2060 with only 6 GB. I guess the error occurs because of insufficient memory, so I changed 5014 to 3072. Will this change affect the experimental results?

import tensorflow as tf

gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    try:
        # Cap the memory TensorFlow may allocate on each GPU (value in MB)
        for gpu in gpus:
            # tf.config.experimental.set_memory_growth(gpu, True)
            tf.config.experimental.set_virtual_device_configuration(gpu, [
                tf.config.experimental.VirtualDeviceConfiguration(memory_limit=5014)])
        logical_gpus = tf.config.experimental.list_logical_devices('GPU')
        # print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
    except RuntimeError as e:
        # Virtual devices must be configured before GPUs have been initialized
        print(e)

mohyunho commented on August 31, 2024

I'm happy to hear that this resolved the issue.
The GPU I used is an NVIDIA TITAN Xp, which has 12 GB. I didn't manually set memory_limit to '5014', and I do not know how it affects the computation.
However, if you can run the code with a limit of 3072 and without changing any tunable parameters, you will still obtain an optimized network as output (the discovered and selected architecture may differ slightly depending on the computational resources used, because the fitness of each individual can differ too).

You can mitigate the memory issue somewhat by changing the batch size, but this may affect the training time and the final results.

One last thing I want to share is that I used TF determinism, which makes the results of GPU computation reproducible. In certain cases, however, it can cause inefficient memory use and slow down training. So I recommend removing "os.environ['TF_DETERMINISTIC_OPS'] = '1'" from the files in the repository unless you need strictly reproducible results.
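
As a rough sketch, the determinism setting could be toggled from one place instead of being edited out of each file. The helper name `set_tf_determinism` is hypothetical and does not exist in the repository. Calling it before TensorFlow is imported is assumed to be the safe ordering.

```python
import os


def set_tf_determinism(enabled):
    """Turn the TF_DETERMINISTIC_OPS environment variable on or off.

    Hypothetical helper for illustration; not part of the repository.
    """
    if enabled:
        # Reproducible GPU results, possibly slower and more memory-hungry
        os.environ['TF_DETERMINISTIC_OPS'] = '1'
    else:
        # Drop the variable so TensorFlow falls back to its default behavior
        os.environ.pop('TF_DETERMINISTIC_OPS', None)


# Call this before importing TensorFlow
set_tf_determinism(False)
```

Passing `True` restores the original behavior when strictly reproducible results are needed.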
