Abstract: The application of model-free reinforcement learning (RL) to optimal control of large-scale dynamical systems is largely bottlenecked by the system dimension, which results in long learning times and dense feedback gain matrices. This talk will present reduced-dimensional RL designs that incorporate ideas from model reduction to alleviate the curse of dimensionality. We will use singular perturbation (SP) theory for linear time-invariant (LTI) systems and exploit the time-scale separation present in the plant to perform the learning in a reduced-dimensional space using only the slow variables. Theoretical analysis characterizes the sub-optimality of the proposed approach, and sufficient stability conditions are derived by applying results from the SP literature to RL algorithms. The design developed for generic SP systems can be extended to multi-agent networked dynamic systems with clustering. The design requires less learning time and yields a reduced-dimensional feedback gain matrix, making it scalable. Numerical examples drawn from a generic SP system and a clustered multi-agent network illustrate the analysis and the effectiveness of the proposed approach.
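
For readers unfamiliar with the setting, the time-scale separation mentioned above is usually written in the standard singularly perturbed LTI form. The notation below (slow state $y$, fast state $z$, small parameter $\epsilon$, block matrices $A_{ij}$, $B_i$, gain $K$) is the textbook convention and is assumed here for illustration; the talk may use different symbols.

\[
\begin{aligned}
\dot{y} &= A_{11}\, y + A_{12}\, z + B_1 u, \qquad y \in \mathbb{R}^{n_s},\\
\epsilon\, \dot{z} &= A_{21}\, y + A_{22}\, z + B_2 u, \qquad z \in \mathbb{R}^{n_f},\quad 0 < \epsilon \ll 1 .
\end{aligned}
\]

If $A_{22}$ is nonsingular, setting $\epsilon = 0$ and eliminating $z$ gives the reduced slow model

\[
\dot{y}_s = \left(A_{11} - A_{12} A_{22}^{-1} A_{21}\right) y_s + \left(B_1 - A_{12} A_{22}^{-1} B_2\right) u_s .
\]

Under this assumed notation, a controller that feeds back only the slow variables takes the form $u = K y$ with $K \in \mathbb{R}^{m \times n_s}$, rather than a gain of size $m \times (n_s + n_f)$ acting on the full state, which is one way to read the "reduced-dimensional feedback gain matrix" in the abstract.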