Base station sleeping is an effective way to reduce the energy consumption of mobile networks. Previous efforts to design sleeping control algorithms rely mainly on stochastic traffic models and analytical derivation. However, the tractability of these models often conflicts with the complexity of real-world traffic, making the resulting algorithms difficult to apply in practice. In this paper, we propose DeepNap, a data-driven algorithm for dynamic sleeping control. DeepNap uses a deep Q-network (DQN) to learn effective sleeping policies from high-dimensional raw observations or unquantized system state vectors. We enhance the original DQN algorithm with action-wise experience replay and adaptive reward scaling to address the challenges posed by non-stationary traffic. We also provide a model-assisted variant of DeepNap that uses the Dyna framework to infer and simulate system dynamics. Periodic traffic modeling captures the non-stationarity of real-world traffic, and combining it with the DQN allows for feature learning and generalization from model outputs. Experiments show that both the end-to-end and the model-assisted versions of DeepNap outperform a table-based Q-learning algorithm, and that the non-stationarity enhancements improve the stability of vanilla DQN.
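
To illustrate the action-wise experience replay named above, the following is a minimal sketch. It rests on an assumption the abstract does not confirm: that "action-wise" means transitions are stored in a separate buffer per action and sampled in equal shares, so that a rarely chosen action (for example, waking the base station) is not crowded out of a minibatch. The class name `ActionWiseReplay`, its parameters, and the sampling rule are all hypothetical and are not taken from the DeepNap implementation.

```python
import random
from collections import deque


class ActionWiseReplay:
    """Hypothetical per-action replay memory (illustrative assumption only)."""

    def __init__(self, num_actions, capacity_per_action):
        # One bounded FIFO buffer per action; the oldest transitions are evicted first.
        self.buffers = [deque(maxlen=capacity_per_action) for _ in range(num_actions)]

    def add(self, state, action, reward, next_state):
        # File the transition under the action that produced it.
        self.buffers[action].append((state, action, reward, next_state))

    def sample(self, batch_size, rng=random):
        # Draw an equal share from every non-empty buffer (assumed balancing rule).
        non_empty = [buf for buf in self.buffers if buf]
        if not non_empty:
            return []
        per_action = max(1, batch_size // len(non_empty))
        batch = []
        for buf in non_empty:
            # Sampling with replacement lets a small buffer still fill its share.
            batch.extend(rng.choices(list(buf), k=per_action))
        return batch


# Usage: action 0 ("sleep") dominates the history, action 1 ("wake") is rare.
replay = ActionWiseReplay(num_actions=2, capacity_per_action=1000)
for t in range(100):
    replay.add(state=t, action=0, reward=0.0, next_state=t + 1)
for t in range(3):
    replay.add(state=t, action=1, reward=1.0, next_state=t + 1)
batch = replay.sample(batch_size=8)
```

With a single shared buffer, a uniform minibatch drawn from this history would contain the rare action only about 3% of the time; under the balancing rule assumed here, each action contributes half of the batch.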