Learning to stabilize nonequilibrium phases of matter with active feedback using partial information

G. Cemin, M. Schmitt, and M. Bukov

We investigate the role of information in active feedback control of quantum many-body systems using reinforcement learning. Active feedback breaks detailed balance, enabling the engineering of steady states and dynamical phases of matter otherwise inaccessible in equilibrium. We train reinforcement learning agents using partial state information to prevent entanglement spreading in (1+1)-dimensional stabilizer circuits with up to 128 qubits. We find that, above a critical information threshold, the learned near-optimal strategies are non-greedy and stochastic, and reduce the steady-state entanglement from volume-law to area-law scaling. The agents achieve this by placing a series of bottlenecks that induce pyramidal structures in the long-time spatial entanglement distribution, effectively splitting the system and reducing the maximum accessible entanglement. Crucially, the learned strategies are inherently out of equilibrium and require real-time active feedback; we find that the learned behavior cannot be replaced by simple human-designed control rules. This work establishes the foundations for classically implemented, information-driven control of many individual, interacting quantum degrees of freedom, demonstrating the capability of reinforcement learning to stabilize many-body nonequilibrium steady states and uncover their novel critical properties.
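The volume-law versus area-law distinction above refers to how the bipartite entanglement entropy scales with subsystem size. For stabilizer states this entropy can be computed efficiently with a standard GF(2) rank formula: S_A = rank_GF2(generators restricted to region A) − |A|. A minimal sketch follows; the function names and the GHZ example are illustrative and not taken from the paper.

```python
import numpy as np

def gf2_rank(m):
    """Rank of a binary matrix over GF(2) via Gaussian elimination."""
    m = (m.copy() % 2).astype(np.uint8)
    rank = 0
    rows, cols = m.shape
    for col in range(cols):
        pivot = next((r for r in range(rank, rows) if m[r, col]), None)
        if pivot is None:
            continue
        m[[rank, pivot]] = m[[pivot, rank]]          # swap pivot row up
        for r in range(rows):
            if r != rank and m[r, col]:
                m[r] ^= m[rank]                      # eliminate column entry
        rank += 1
    return rank

def stabilizer_entropy(stabilizers, region):
    """Entanglement entropy (in bits) of region A for a stabilizer state.

    Each generator is a Pauli string like "XZIZ"; it is mapped to its
    binary symplectic (x|z) vector restricted to the qubits in `region`.
    S_A = rank_GF2(restricted generators) - |A|.
    """
    k = len(region)
    mat = np.zeros((len(stabilizers), 2 * k), dtype=np.uint8)
    for r, pauli in enumerate(stabilizers):
        for c, q in enumerate(region):
            p = pauli[q]
            if p in ("X", "Y"):
                mat[r, c] = 1          # x-part
            if p in ("Z", "Y"):
                mat[r, k + c] = 1      # z-part
    return gf2_rank(mat) - k

# Example: 4-qubit GHZ state, cut into halves -> 1 bit of entanglement.
ghz = ["XXXX", "ZZII", "IZZI", "IIZZ"]
print(stabilizer_entropy(ghz, [0, 1]))  # → 1
```

A product state such as |0000> (stabilizers ZIII, IZII, IIZI, IIIZ) gives zero entropy for any cut; tracking this quantity across cuts as a circuit evolves is what distinguishes area-law from volume-law scaling.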