Combining ground environments with reward machines to create learning tasks
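Cross-product environments are built on the CrossProduct base class:

- _get_obs
- to_ground_obs

Cross-product observations carry the machine state (u) and counter value (c[0]).

The CrossProduct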
class uses generic type parameters for flexibility:

- GroundObsType: Type of ground environment observations
- ObsType: Type of cross-product environment observations
- ActType: Type of actions
- RenderFrame: Type returned by the render method

Use _get_obs to add machine state information in a logical way, and set a max_steps value suited to your task.
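The pieces above can be sketched together in a small, self-contained example. It does not import the real library: `CrossProductSketch` is a hypothetical stand-in for the CrossProduct base class, and the `_get_obs(ground_obs, u, c)` signature, the `tick` helper, and `GridCrossProduct` are assumptions made for illustration only. The sketch shows one way the four type parameters might be bound, how `_get_obs` could append the machine state and first counter value to a ground observation, and how `to_ground_obs` could undo that.

```python
from abc import ABC, abstractmethod
from typing import Generic, Tuple, TypeVar

# The four generic parameters named in the text.
GroundObsType = TypeVar("GroundObsType")
ObsType = TypeVar("ObsType")
ActType = TypeVar("ActType")
RenderFrame = TypeVar("RenderFrame")


class CrossProductSketch(ABC, Generic[GroundObsType, ObsType, ActType, RenderFrame]):
    """Hypothetical stand-in for the CrossProduct base class (not library code)."""

    def __init__(self, max_steps: int) -> None:
        self.max_steps = max_steps  # assumed: cap on episode length
        self.steps = 0

    @abstractmethod
    def _get_obs(self, ground_obs: GroundObsType, u: int, c: Tuple[int, ...]) -> ObsType:
        """Build a cross-product observation from a ground observation."""

    @abstractmethod
    def to_ground_obs(self, obs: ObsType) -> GroundObsType:
        """Recover the ground observation from a cross-product observation."""

    def tick(self) -> bool:
        """Hypothetical helper: count a step, return True once max_steps is hit."""
        self.steps += 1
        return self.steps >= self.max_steps


Position = Tuple[float, float]                # ground observation: (x, y)
AugmentedObs = Tuple[float, float, int, int]  # (x, y, u, c[0])


class GridCrossProduct(CrossProductSketch[Position, AugmentedObs, int, None]):
    def _get_obs(self, ground_obs, u, c):
        # Append machine state u and the first counter value c[0].
        x, y = ground_obs
        return (x, y, u, c[0])

    def to_ground_obs(self, obs):
        # Drop the machine-state and counter entries again.
        return obs[:2]


env = GridCrossProduct(max_steps=3)
obs = env._get_obs((0.5, 1.0), u=2, c=(4,))
print(obs)                     # (0.5, 1.0, 2, 4)
print(env.to_ground_obs(obs))  # (0.5, 1.0)
```

Keeping the machine state and counter at the end of the observation makes `to_ground_obs` a simple slice; other layouts are equally valid as long as the two methods stay consistent with each other.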