The freshness, or timeliness, of data at the server is a key performance indicator of sensor networks, especially in tolerance-critical applications such as factory automation. As an effective and intuitive measure of data timeliness, the Age of Information (AoI) metric has recently attracted intensive research interest. This paper initiates a study of the AoI of wireless sensor networks operating in the finite blocklength (FBL) regime as a resource allocation problem, and formulates the minimization of the long-term discounted system AoI as a Markov decision process (MDP). The proposed method, whose optimal policy is obtained via reinforcement learning, is shown by simulation to outperform benchmark policies, including the conventional error-rate-minimizing policy.