A wireless mesh network (WMN) is one of the most promising technologies for Internet of Things (IoT) applications because of its self-adaptive and self-organizing nature. To meet the diverse performance requirements of communication in WMNs, traditional approaches must program flow control strategies explicitly; as a result, the performance of WMNs is significantly affected by the dynamic properties of the underlying networks in real applications. To provide a more flexible solution, in this article we present, for the first time, how emerging Deep Reinforcement Learning (DRL) can be applied to communication flow control in WMNs. Moreover, unlike general DRL-based networking solutions, in which the network properties are pre-defined, we leverage the adaptive nature of WMNs and propose a self-adaptive DRL approach. Specifically, our method reconstructs the WMN during the training of the DRL model; in this way, the trained DRL model captures more properties of WMNs and achieves better performance. As a proof of concept, we have implemented our method with a self-adaptive Deep Q-learning Network (DQN) model. The evaluation results show that the presented solution significantly improves the communication performance of data flows in WMNs compared to a static benchmark solution.
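The core idea of the self-adaptive approach, reconstructing the mesh topology between training episodes so the learned policy generalizes across network configurations, can be illustrated with a minimal sketch. The paper uses a DQN; the sketch below substitutes a tabular Q-learning agent for simplicity, and every function name, parameter, and reward choice here is a hypothetical illustration, not the paper's actual implementation.

```python
import random

def build_mesh(n_nodes, p_link=0.5, seed=None):
    # Hypothetical mesh generator: random adjacency with per-link costs.
    # A chain link (i, i+1) is always kept so a path to the sink exists.
    rng = random.Random(seed)
    links = {}
    for i in range(n_nodes):
        for j in range(i + 1, n_nodes):
            if j == i + 1 or rng.random() < p_link:
                cost = rng.uniform(1.0, 5.0)
                links.setdefault(i, {})[j] = cost
                links.setdefault(j, {})[i] = cost
    return links

def train(n_nodes=6, episodes=500, alpha=0.3, gamma=0.9, eps=0.2):
    # Tabular stand-in for the DQN: Q is keyed by (node, next-hop).
    # The agent routes a flow from node 0 to node n_nodes-1; the reward
    # is the negative link cost, so maximizing Q minimizes path cost.
    q = {}
    rng = random.Random(0)
    for ep in range(episodes):
        # Self-adaptive step: reconstruct the WMN for every episode,
        # rather than training against one fixed, pre-defined topology.
        mesh = build_mesh(n_nodes, seed=ep)
        node, goal = 0, n_nodes - 1
        for _ in range(4 * n_nodes):  # bounded walk per episode
            nbrs = list(mesh[node])
            if rng.random() < eps:
                nxt = rng.choice(nbrs)  # explore
            else:
                nxt = max(nbrs, key=lambda a: q.get((node, a), 0.0))  # exploit
            reward = -mesh[node][nxt]
            future = 0.0 if nxt == goal else max(
                q.get((nxt, a), 0.0) for a in mesh[nxt])
            old = q.get((node, nxt), 0.0)
            q[(node, nxt)] = old + alpha * (reward + gamma * future - old)
            if nxt == goal:
                break
            node = nxt
    return q
```

Because each episode samples a fresh topology, the learned values reflect link-cost statistics across many network configurations rather than one static layout, which is the property the abstract attributes to the self-adaptive DRL model; a neural Q-network would replace the table when the state space grows.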