In this problem, you will work with the CartPole environment, a classic reinforcement learning environment. Your goal is to balance a pole on a cart by applying forces to the cart. The environment returns the state of the cart and pole, and you must decide which action to take (a left or right force) to keep the pole upright.

**Example:** Suppose the current state is [0.1, 0.2, 0.3, 0.4], representing the cart position, cart velocity, pole angle, and pole velocity. You need to decide whether to apply a left force (0) or a right force (1).

**Constraints:** The cart position and cart velocity should be within [-2.4, 2.4] and [-1, 1], respectively. The pole angle and pole velocity should be within [-12, 12] and [-4, 4], respectively.
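
To make the state layout and the action encoding concrete, here is a minimal sketch in Python. It is not the expected solution and is not meant to reproduce the expected outputs in the test cases. The names `STATE_BOUNDS`, `within_bounds`, and `choose_action` are hypothetical, and the heuristic (push toward the side the pole is falling) assumes that a positive pole angle or pole velocity means the pole is leaning or rotating to the right.

```python
from typing import Sequence

# Bounds copied from the Constraints paragraph, in state order:
# cart position, cart velocity, pole angle, pole velocity.
STATE_BOUNDS = [(-2.4, 2.4), (-1.0, 1.0), (-12.0, 12.0), (-4.0, 4.0)]

# Action encoding from the problem statement.
LEFT, RIGHT = 0, 1


def within_bounds(state: Sequence[float]) -> bool:
    """Return True if every state component lies inside its stated range."""
    return all(lo <= x <= hi for x, (lo, hi) in zip(state, STATE_BOUNDS))


def choose_action(state: Sequence[float]) -> int:
    """Hypothetical heuristic: push the cart toward the side the pole is falling."""
    _, _, pole_angle, pole_velocity = state
    # Assumption: positive values mean the pole leans or rotates to the right.
    return RIGHT if pole_angle + pole_velocity > 0 else LEFT
```

For the example state, `within_bounds([0.1, 0.2, 0.3, 0.4])` returns `True` and `choose_action([0.1, 0.2, 0.3, 0.4])` returns `1` (right force), since both the pole angle and the pole velocity are positive. The functions only index and compare values, so they should also accept a four-element `np.array`.
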
Test Cases
Test Case 1
Input:
np.array([0.1, 0.2, 0.3, 0.4])
Expected:
np.array([0.1, -0.02, 0.3, 0.4])
Test Case 2
Input:
np.array([-2.3, 0.2, 0.3, 0.4])
Expected:
np.array([0, 0, 0, 0])
+ 3 hidden test cases