We study the performance of stochastically trained deep neural networks (DNNs) whose synaptic weights are implemented using emerging memristive devices that exhibit limited dynamic range, resolution, and variability in their programming characteristics. We show that a key device parameter to optimize the learning efficiency of DNNs is the variability in its programming characteristics. DNNs with such memristive synapses, even with dynamic range as low as 15 and only 32 discrete levels, when trained based on stochastic updates suffer less than 3% loss in accuracy compared to floating point software baseline. We also study the performance of stochastic memristive DNNs when used as inference engines with noise corrupted data and find that if the device variability can be minimized, the relative degradation in performance for the Stochastic DNN is better than that of the software baseline. Hence, our study presents a new optimization corner for memristive devices for building large noise-immune deep learning systems.