Deep reinforcement learning on the Raspberry Pi

Problem description

I am trying to run a deep reinforcement learning problem on a Raspberry Pi 4. The code runs successfully on Colab, but on the Pi it fails with the error shown below. Can anyone help me get this code running on the Raspberry Pi 4? Thanks in advance. The error I get:

/home/pi/.local/lib/python3.9/site-packages/flatbuffers/compat.py:19: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
  import imp
/home/pi/.local/lib/python3.9/site-packages/gym/spaces/box.py:128: UserWarning: WARN: Box bound precision lowered by casting to float32
  logger.warn(f"Box bound precision lowered by casting to {self.dtype}")
/home/pi/.local/lib/python3.9/site-packages/stable_baselines3/common/vec_env/patch_gym.py:49: UserWarning: You provided an OpenAI Gym environment. We strongly recommend transitioning to Gymnasium environments. Stable-Baselines3 is automatically wrapping your environments in a compatibility layer, which could potentially cause issues.
  warnings.warn(
Using cpu device
Traceback (most recent call last):
  File "/home/pi/reinf_nomotor.py", line 75, in <module>
    action, _ = model.predict(obs)
  File "/home/pi/.local/lib/python3.9/site-packages/stable_baselines3/common/base_class.py", line 555, in predict
    return self.policy.predict(observation, state, episode_start, deterministic)
  File "/home/pi/.local/lib/python3.9/site-packages/stable_baselines3/common/policies.py", line 349, in predict
    actions = self._predict(observation, deterministic=deterministic)
  File "/home/pi/.local/lib/python3.9/site-packages/stable_baselines3/common/policies.py", line 679, in _predict
    return self.get_distribution(observation).get_actions(deterministic=deterministic)
  File "/home/pi/.local/lib/python3.9/site-packages/stable_baselines3/common/policies.py", line 714, in get_distribution
    return self._get_action_dist_from_latent(latent_pi)
  File "/home/pi/.local/lib/python3.9/site-packages/stable_baselines3/common/policies.py", line 653, in _get_action_dist_from_latent
    mean_actions = self.action_net(latent_pi)
  File "/home/pi/.local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/pi/.local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/pi/.local/lib/python3.9/site-packages/torch/nn/modules/linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: could not create a primitive descriptor for a matmul primitive

I have tried the code below. It is a simplified version of the deep reinforcement learning setup I am trying to implement. My main goal is to rotate a stepper motor through the Raspberry Pi so that a target droplet length is reached. I have already written separate code for droplet size measurement and for the motor movement.

import gym
from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import DummyVecEnv
import matplotlib.pyplot as plt
import numpy as np

# Define constants
TARGET_DROP_SIZE = 25.0
ERROR_TOLERANCE = 0.01  # Absolute error tolerance on the drop size
MAX_STEPS = 10  # Maximum number of episodes

# Create a custom gym environment for drop size control
class DropSizeControlEnv(gym.Env):
    def __init__(self):
        super(DropSizeControlEnv, self).__init__()

        # Environment parameters
        self.dt = 0.1  # Time step
        self.motor_angle = 90.0  # Initial motor angle

        # Action space: [-180, 180]
        self.action_space = gym.spaces.Box(low=np.array([-180.0]), high=np.array([180.0]), dtype=np.float32)
        # Observation space: [0, 180]
        self.observation_space = gym.spaces.Box(low=np.array([0.0]), high=np.array([180.0]), dtype=np.float32)

    def step(self, action):
        # Execute the control action (adjust motor angle)
        self.motor_angle += action[0]

        # Ensure the motor angle stays within bounds
        self.motor_angle = max(1.0, min(180.0, self.motor_angle))

        # Simulate the achieved drop size (simplified for demonstration)
        achieved_drop_size = 24.5 + np.random.uniform(-0.5, 0.5)

        # Calculate the error
        error = TARGET_DROP_SIZE - achieved_drop_size

        # Calculate the reward (negative absolute error)
        reward = -abs(error)

        # Check if the error is within tolerance
        done = abs(error) <= ERROR_TOLERANCE

        # Return observation, reward, done, info
        return np.array([self.motor_angle]), reward, done, {}

    def reset(self):
        # Reset the environment to the initial state
        self.motor_angle = 90.0
        return np.array([self.motor_angle])

# Create and wrap the custom environment
env = DummyVecEnv([lambda: DropSizeControlEnv()])

# Create the PPO agent
model = PPO("MlpPolicy", env, verbose=1)

# Variables for tracking results
time_steps = []
achieved_drop_sizes = []
target_drop_sizes = []
errors = []

# Rollout loop over episodes (model.learn() is never called, so the policy stays untrained)
for episode in range(MAX_STEPS):
    obs = env.reset()
    while True:
        action, _ = model.predict(obs)
        obs, _, done, _ = env.step(action)

        # Simulate the achieved drop size
        achieved_drop_size = 24.5 + np.random.uniform(-0.5, 0.5)

        # Calculate the error
        error = TARGET_DROP_SIZE - achieved_drop_size

        # Store data for plotting
        time_steps.append(len(time_steps) * env.envs[0].dt)
        achieved_drop_sizes.append(achieved_drop_size)
        target_drop_sizes.append(TARGET_DROP_SIZE)
        errors.append(error)

        if done:
            break

    # Calculate the final achieved error
    final_error = abs(target_drop_sizes[-1] - achieved_drop_sizes[-1])

    # Check if the achieved error is smaller than 1%
    if final_error < 0.01 * TARGET_DROP_SIZE:
        break

# Plot the results
plt.figure(figsize=(12, 6))

# Plot Achieved and Target Drop Size
plt.subplot(1, 2, 1)
plt.plot(time_steps, achieved_drop_sizes, label="Achieved Drop Size")
plt.plot(time_steps, target_drop_sizes, label="Target Drop Size")
plt.xlabel("Time (s)")
plt.ylabel("Drop Size")
plt.legend()
plt.title("Drop Size Control")
plt.grid(True)

# Plot Error
plt.subplot(1, 2, 2)
plt.plot(time_steps, errors, label="Error")
plt.xlabel("Time (s)")
plt.ylabel("Error")
plt.legend()
plt.title("Error Plot")
plt.grid(True)

plt.tight_layout()
plt.show()


python tensorflow reinforcement-learning raspberry-pi4 openai-gym
1 answer

https://github.com/pytorch/pytorch/issues/110149

This is a bug in PyTorch; you could try switching to a different PyTorch version.
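
If changing the PyTorch version is not immediately possible, one workaround reported for this oneDNN/mkldnn matmul error on ARM boards (an assumption here, not verified on your exact setup) is to disable the mkldnn backend before running the model:

import torch

# Assumed workaround: the matmul primitive comes from the oneDNN (mkldnn)
# CPU backend; disabling it makes PyTorch fall back to its default CPU kernels.
torch.backends.mkldnn.enabled = False

# ... then build/load the PPO model and call model.predict(obs) as before.

If that does not help, pinning an older or newer torch wheel for aarch64, as suggested in the linked issue, is the other option.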
