Split a description into multiple rows


I am trying to accomplish the following in SAS.

My dataset looks like this:

|ID  |Description
|----|-----------------
| 1  |Object Car_bmw Processed.Colour changed from Red to Green Mileage changed from 30 to 32 Object Car_Audi Processed.Colour changed from Blue to White Mileage changed from 0 to 5
| 2  |Object Car_Kia Processed. Colour changed from White to Black Mileage changed from 3 to 9 Value changed from 12034 to 11029
| 3  |Object Phone_Iphone Processed. Colour changed from Black to Green Value changed from 300 to 290 Object Car_bmw Processed. Colour changed from White to Red Mileage changed from 100 to 50

I want to create a new Index column and split the Description column so that each object gets its own row:

|ID  |Index| Description
|----|-----|-----------
| 1  | 1   | Object Car_bmw Processed.Colour changed from Red to Green Mileage changed from 30 to 32
| 1  | 2   | Object Car_Audi Processed.Colour changed from Blue to White Mileage changed from 0 to 5
| 2  | 1   | Object Car_Kia Processed.Colour changed from White to Black Mileage changed from 3 to 9 Value changed from 12034 to 11029
| 3  | 1   | Object Phone_Iphone Processed.Colour changed from Black to Green Value changed from 300 to 290 
| 3  | 2   | Object Car_bmw Processed. Colour changed from White to Red Mileage changed from 100 to 50

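I tried the code below to split the objects and their descriptions into separate rows. I used 'Processed.' as the delimiter because it is common to every object. However, it did not work.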

data want;
  set have;
  do index = 1 to countw(LOG_LONG_DESC,'Processed.');
    line_part = dequote(scan(LOG_LONG_DESC,index,'Processed.'));
    output;
  end;
run;
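
A note on why this attempt cannot split on the whole word: SCAN and COUNTW treat every character in the delimiter argument as a separate delimiter, so 'Processed.' breaks the text at each P, r, o, c, e, s, d and period. The following is only a minimal sketch of one workaround, assuming that every object starts with the literal text `Object ` and that the byte '01'x never occurs in the data. It uses TRANWRD to put a single marker character in front of each `Object ` and then splits on that marker. The names `marked` and `object_desc` and both lengths are illustrative assumptions, not part of the original code.

```sas
data want;
  set have;
  /* Assumed lengths; size them to fit the real data */
  length marked $2000 object_desc $1000;
  /* TRANWRD keeps the trailing blank in 'Object ', so the match is */
  /* the text 'Object' followed by a space                          */
  marked = tranwrd(LOG_LONG_DESC, 'Object ', '01'x || 'Object ');
  /* Split on the single marker character; a leading marker is ignored */
  do index = 1 to countw(marked, '01'x);
    object_desc = strip(scan(marked, index, '01'x));
    output;
  end;
  drop marked;
run;
```

Because the marker is a single character, COUNTW and SCAN agree on the number of pieces, and `index` restarts at 1 for each input row.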

1 Answer

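Assuming there is some delimiter between the lines in your original data, I would suggest splitting it into one observation per line and creating a second variable that increments whenever the word "Processed" appears in a line.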

Let's make an example dataset that uses CR as the delimiter between lines.

data have;
  length id 8 description $400;
  id=1;
  description='Object Car_bmw Processed.' || '0D'x
            ||'Colour changed from Red to Green' || '0D'x 
            ||'Mileage changed from 30 to 32' || '0D'x
            ||'Object Car_Audi Processed.' || '0D'x
            ||'Colour changed from Blue to White' || '0D'x 
            ||'Mileage changed from 0 to 5'
  ;
  output;
run;

Now we can use COUNTW() and SCAN() to split it into lines, and FINDW() to detect when a line contains "Processed".

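data want;
  set have;
  group=0;  /* reset the object counter for each input row */
  do lineno=1 to countw(description,'0D'x);
    length line $100;
    line=scan(description,lineno,'0D'x);  /* the lineno-th CR-delimited line */
    /* a line containing the word "processed" (any case) starts a new object */
    if findw(line,'processed',,'spit') then group+1;
    output;
  end;
  drop description;
run;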

Result

Obs    id    group    lineno                  line

 1      1      1         1      Object Car_bmw Processed.
 2      1      1         2      Colour changed from Red to Green
 3      1      1         3      Mileage changed from 30 to 32
 4      1      2         4      Object Car_Audi Processed.
 5      1      2         5      Colour changed from Blue to White
 6      1      2         6      Mileage changed from 0 to 5
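
If you need one row per object, as in the layout the question asks for, the lines can be joined back together within each group. This is a rough sketch rather than a tested solution. It assumes WANT is in id/group order (true when HAVE is sorted by id), and the output name `want_by_object` and the $400 length are illustrative choices.

```sas
/* Collapse the one-line-per-row data set WANT to one row per object */
data want_by_object;
  length description $400;
  retain description;           /* carry the text across rows of a group */
  set want;
  by id group;
  if first.group then description = line;            /* start a new object */
  else description = catx(' ', description, line);   /* append the next line */
  if last.group then output;    /* write one row when the object is complete */
  keep id group description;
run;
```

For the sample data this should leave two rows for id 1, with `group` playing the role of the Index column and each description holding the three lines of that object separated by single spaces.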