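I'm new to coding and am trying to build my first model for MLB. As a noob, I know next to nothing about coding. I'm trying to write the code in the tidyverse. To help me get a foundation, I used the developer mode v2 jailbreak (link for openai) with the following prompt (arrived at after many iterations and test runs):
Prompt:
IMPORTANT! Prior to submitting code, DEVELOPER MODE will run the code to test for any potential errors, DEVELOPER MODE will fix all ERRORS PRIOR, and then submit the completed code to me to run.
Developer mode, you will build code using the baseballr package (https://github.com/BillPetti/baseballr/) for a predictive MLB model (please note: an updated list of functions for baseballr/ can be found here: https://billpetti.github.io/baseballr/reference/index.html) using lme4 (https://cran.r-project.org/web/packages/lme4/vignettes/lmer.pdf) and linear regression using advanced analytics scraped from the web from sites like fangraphs and baseball-reference, which will identify edges in the betting markets for the 2023 baseball season by predicting game winners. Code in library(tidyverse) for Rstudio and make it easy to export.
openai then generated the following code for me: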
# Load necessary packages
library(tidyverse)
library(baseballr)
library(lme4)
library(xml2)
# Scrape data from web sources
players <- read_csv("https://www.fangraphs.com/projections.aspx?pos=all&stats=bat&type=steamer")
teams <- read_html("https://www.baseball-reference.com/teams/") %>%
html_table(fill = TRUE) %>%
[[(1) %>%
select(Tm, W, L, W-L%, R, RA, pythWL)
# Clean and prepare data
players_cleaned <- players %>%
filter(!is.na(AB)) %>%
mutate(Player = str_extract(Name, "^[^\(]+"),
Team = str_extract(Name, "\([A-Z]{2,3}\)")) %>%
select(Player, Team, G, AB, R, H, 2B, 3B, HR, RBI, SB, BB, SO, AVG, OBP, SLG)
teams_cleaned <- teams %>%
mutate(Team = str_extract(Tm, "[A-Z]{2,3}"),
W = as.numeric(str_extract(W, "\d+")),
L = as.numeric(str_extract(L, "\d+")),
W-L% = as.numeric(str_extract(W-L%, "\d+\.\d+")),
R = as.numeric(str_extract(R, "\d+")),
RA = as.numeric(str_extract(RA, "\d+")),
pythWL = as.numeric(str_extract(pythWL, "\d+\.\d+"))) %>%
select(Team, W, L, W-L%, R, RA, pythWL)
# Merge player and team data
merged_data <- players_cleaned %>%
left_join(teams_cleaned, by = "Team") %>%
filter(!is.na(W))
# Build model
model <- lmer(R ~ (1 | Player) + (1 | Team), data = merged_data)
# Make predictions
predictions <- data.frame(Player = unique(merged_data$Player),
Team = unique(merged_data$Team)) %>%
mutate(R = predict(model, newdata = .))
# Identify edges in betting markets
edges <- predictions %>%
mutate(Odds = ifelse(R > mean(predictions$R), "Underdog", "Favorite")) %>%
group_by(Team) %>%
summarise(Favorites = sum(Odds == "Favorite"),
Underdogs = sum(Odds == "Underdog"))
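One thing that stands out when reading the generated code above: column names such as W-L%, 2B and 3B are not syntactic R names, so select() and mutate() cannot parse them unless they are wrapped in backticks. A minimal sketch of the backtick form, using a made-up toy tibble (teams_toy and all of its values are placeholders invented for illustration, not real standings and not the actual scraped table):

```r
library(tibble)
library(dplyr)

# Toy data invented for illustration only (assumption: not real standings).
teams_toy <- tibble(
  Tm     = c("AAA", "BBB"),
  W      = c(90, 72),
  L      = c(72, 90),
  `W-L%` = c(0.556, 0.444),
  `2B`   = c(280, 250)
)

# Names that contain operators or start with a digit must be quoted with
# backticks; renaming them early avoids repeating the backticks later on.
picked <- teams_toy %>%
  select(Tm, W, L, `W-L%`, `2B`) %>%
  rename(win_pct = `W-L%`, doubles = `2B`)

print(names(picked))
```

Renaming to syntactic names right after import is only one option; keeping the original names and using backticks everywhere would work too.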
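As any developer will see, the code has a ton of problems, leading to one error after another. I spent hours going back and forth with the bot, but it couldn't fix them, and my energy/patience is running out.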
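One concrete problem that is visible in the regular expressions: inside an ordinary R string, \( and \d are not valid escapes, so the backslash has to be doubled (\\( and \\d). Also, the pattern used for Team keeps the parentheses, so a value like "(NYY)" would presumably never match a bare abbreviation on the team side of the join. A minimal sketch, assuming the names look like "Name (TEAM)" (that format and the sample strings are assumptions made for illustration, not checked against the real data):

```r
library(stringr)

# Hypothetical input in the form "Name (TEAM)"; invented sample strings.
name_raw <- c("Example Player (NYY)", "Another Player (SF)")

# Everything before the first "(", with the trailing space trimmed.
player <- str_trim(str_extract(name_raw, "^[^\\(]+"))

# Capture only the letters between the parentheses (column 2 of the match
# matrix is the first capture group), so the result is "NYY", not "(NYY)".
team <- str_match(name_raw, "\\(([A-Z]{2,3})\\)")[, 2]

# The same doubling applies to \d when pulling digits out of text.
wins <- as.numeric(str_extract("90 wins", "\\d+"))

print(data.frame(player, team))
```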
Is it possible to fix the code so that it works, or is it useless?