Model-Based Reinforcement Learning For Cooperative Multi-Agent Planning: Exploiting Hierarchies, Bias, And Temporal Sampling