How do I tabulate text output containing newlines into a data frame?

Question (votes: -1, answers: 1)

This is the structure of the text blob I am working with:

reprEx <- "] WITHDRAWALS\nDATE DESCRIPTION AMOUNT\n04/01 Quickpay With Zelle Payment To Mike T 819018100 $1,450.00\n04/01 Quickpay With Zelle Payment To Mandy Doid 809012906 2,665.00"

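I want to take the text on each new line and split the elements of that line into the corresponding data frame columns. For example, the date on each line should go in the DATE column, the transaction description in the DESCRIPTION column, and the number at the end of the line in the AMOUNT column. Here is an example of the output I want in the data frame: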

desiredResult <- data.frame(DATE = c("04/01", "04/01"),
                            DESCRIPTION = c("Quickpay With Zelle Payment To Mike T 819018100", "Quickpay With Zelle Payment To Mandy Doid 809012906"),
                            AMOUNT = c("$1,450.00", "2,665.00"))

Tags: r, regex
1 Answer (0 votes)

How about this as a start? This solution uses str_extract_all from the stringr package:

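library(stringr)
desiredResult <- data.frame(
  # Every MM/DD date in the blob
  DATE = unlist(str_extract_all(reprEx, "[0-9]{2}/[0-9]{2}")),
  # Text between a date (lookbehind) and an amount (lookahead);
  # a "$" before the amount ends up here, not in AMOUNT
  DESCRIPTION = unlist(str_extract_all(reprEx, "(?<=[0-9]{2}/[0-9]{2}\\s)[\\s\\w$]+(?=\\d{1,3},\\d{3}\\.\\d{2})")),
  # Amounts such as 1,450.00 (without any leading "$")
  AMOUNT = unlist(str_extract_all(reprEx, "\\d{1,3},\\d{3}\\.\\d{2}"))
)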

Output:

desiredResult
   DATE                                          DESCRIPTION   AMOUNT
1 04/01    Quickpay With Zelle Payment To Mike T 819018100 $ 1,450.00
2 04/01 Quickpay With Zelle Payment To Mandy Doid 809012906  2,665.00
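
If the "$" should stay with the amount, as in the desired result from the question, one possible alternative is to split the blob into lines first and match each line against a single pattern with three capture groups. The following is only a sketch in base R, not part of the answer above. It assumes every transaction sits on its own line, starts with an MM/DD date, and ends with an amount that may or may not carry a "$". The names `pattern`, `rows`, and `parsed` are made up for illustration.

```r
reprEx <- "] WITHDRAWALS\nDATE DESCRIPTION AMOUNT\n04/01 Quickpay With Zelle Payment To Mike T 819018100 $1,450.00\n04/01 Quickpay With Zelle Payment To Mandy Doid 809012906 2,665.00"

# Split the blob into individual lines.
lines <- strsplit(reprEx, "\n", fixed = TRUE)[[1]]

# Assumed line layout: date, description, amount (with an optional "$").
pattern <- "^(\\d{2}/\\d{2})\\s+(.*\\S)\\s+(\\$?\\d{1,3}(?:,\\d{3})*\\.\\d{2})$"

# regexec() returns the full match plus one entry per capture group;
# lines that do not match (titles, column headers) yield character(0).
rows <- regmatches(lines, regexec(pattern, lines, perl = TRUE))
rows <- rows[lengths(rows) > 0]

# Element 1 is the whole match, so the groups are elements 2 to 4.
parsed <- data.frame(
  DATE        = vapply(rows, `[`, character(1), 2),
  DESCRIPTION = vapply(rows, `[`, character(1), 3),
  AMOUNT      = vapply(rows, `[`, character(1), 4),
  stringsAsFactors = FALSE
)
```

Because each row comes from a single match, the three columns cannot drift out of alignment if one line happens to lack an amount; such a line is simply skipped. Amounts that contain no thousands separator (for example 45.00) are also accepted by this pattern, which the three separate extractions above would miss.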