Apply optimization methods such as (Stochastic) Gradient Descent, Momentum, RMSProp, and Adam, and use random minibatches to accelerate convergence and improve optimization.
Advanced optimization methods can speed up learning and may even reach a better final value of the cost function. A good optimization algorithm can be the difference between waiting days and waiting just a few hours for a good result.
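
To make the methods named above concrete, here is a minimal sketch of the four update rules applied to random minibatches. It is an illustration under stated assumptions, not a reference implementation: the helper names (random_minibatches, grad, train), the one-parameter toy model y = w * x, the toy data, and the learning rates are all hypothetical choices made for this example. Momentum is written in the exponentially weighted average form, v = beta1 * v + (1 - beta1) * g, which is one common convention among several.

```python
import math
import random


def random_minibatches(data, batch_size, seed=0):
    """Shuffle a copy of the data and split it into consecutive minibatches."""
    rng = random.Random(seed)
    shuffled = list(data)
    rng.shuffle(shuffled)
    return [shuffled[i:i + batch_size]
            for i in range(0, len(shuffled), batch_size)]


def grad(w, batch):
    """Gradient w.r.t. w of the mean of 0.5 * (w * x - y)**2 over the batch."""
    return sum((w * x - y) * x for x, y in batch) / len(batch)


def train(data, method="adam", lr=0.01, epochs=100, batch_size=8,
          beta1=0.9, beta2=0.999, eps=1e-8):
    """Fit the single weight w of the toy model y = w * x."""
    w = 0.0   # parameter being learned
    v = 0.0   # moving average of gradients (Momentum, Adam)
    s = 0.0   # moving average of squared gradients (RMSProp, Adam)
    t = 0     # update counter, used for Adam's bias correction
    for epoch in range(epochs):
        # Reshuffle every epoch so the minibatches differ between epochs.
        for batch in random_minibatches(data, batch_size, seed=epoch):
            g = grad(w, batch)
            t += 1
            if method == "sgd":
                w -= lr * g
            elif method == "momentum":
                v = beta1 * v + (1 - beta1) * g
                w -= lr * v
            elif method == "rmsprop":
                s = beta2 * s + (1 - beta2) * g * g
                w -= lr * g / (math.sqrt(s) + eps)
            elif method == "adam":
                v = beta1 * v + (1 - beta1) * g
                s = beta2 * s + (1 - beta2) * g * g
                v_hat = v / (1 - beta1 ** t)   # bias-corrected first moment
                s_hat = s / (1 - beta2 ** t)   # bias-corrected second moment
                w -= lr * v_hat / (math.sqrt(s_hat) + eps)
            else:
                raise ValueError(f"unknown method: {method}")
    return w


if __name__ == "__main__":
    # Noise-free toy data generated from y = 3 * x (an assumption for this demo).
    data = [(i / 10, 3.0 * i / 10) for i in range(1, 33)]
    # Learning rates are hand-picked for this toy problem, not recommendations.
    for method, lr in [("sgd", 0.05), ("momentum", 0.05),
                       ("rmsprop", 0.01), ("adam", 0.1)]:
        print(method, train(data, method=method, lr=lr))
```

Each method should drive w toward 3 on this toy problem. How quickly it does so depends heavily on the learning rate and the other hyperparameters, which is why they usually need tuning on a real model.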