Remove spaces from the strings in a NumPy string array


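I am working on an exercise where I have to create a NumPy array from the given text and remove the spaces inside the array's strings. Please help me with how to do this. I have been trying but have not succeeded so far. Here is the data: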

array([['ALIOR', 'PLALIOR00045', '88 860 000', '1 386 216 000', '0,891',
        '2,16', '14'],
       ['CCC', 'PLCCC0000016', '27 918 000', '1 292 603 400', '0,831',
        '5,28', '42'],
       ['CDPROJEKT', 'PLOPTTC00011', '67 348 000', '22 864 646 000',
        '14,702', '7,39', '7'],
       ['CYFRPLSAT', 'PLCFRPT00013', '275 301 000', '6 854 994 900',
        '4,408', '1,17', '14'],
       ['DINOPL', 'PLDINPL00011', '47 937 000', '8 916 282 000', '5,733',
        '9,13', '12'],
       ['JSW', 'PLJSW0000015', '52 636 000', '716 902 320', '0,461',
        '1,51', '24'],
       ['KGHM', 'PLKGHM000017', '136 410 000', '9 881 540 400', '6,354',
        '4,78', '8'],
       ['LOTOS', 'PLLOTOS00025', '86 543 000', '5 609 717 260', '3,607',
        '2,91', '16'],
       ['LPP', 'PLLPP0000011', '1 306 000', '7 444 200 000', '4,787',
        '1,43', '19'],
       ['MBANK', 'PLBRE0000012', '12 997 000', '2 830 746 600', '1,820',
        '0,42', '24'],
       ['ORANGEPL', 'PLTLKPL00017', '647 357 000', '4 285 503 340',
        '2,756', '1,16', '13'],
       ['PEKAO', 'PLPEKAO00016', '176 379 000', '9 619 710 660', '6,185',
        '5,27', '9'],
       ['PGE', 'PLPGER000010', '796 776 000', '3 561 588 720', '2,290',
        '2,88', '18'],
       ['PGNIG', 'PLPGNIG00014', '1 624 608 000', '6 072 784 704',
        '3,905', '1,56', '12'],
       ['PKNORLEN', 'PLPKN0000018', '289 049 000', '17 701 360 760',
        '11,382', '12,44', '8'],
       ['PKOBP', 'PLPKO0000016', '857 593 000', '18 807 014 490',
        '12,093', '10,49', '9'],
       ['PLAY', 'LU1642887738', '114 151 000', '3 696 209 380', '2,377',
        '1,47', '16'],
       ['PZU', 'PLPZU0000011', '568 305 000', '17 515 160 100', '11,262',
        '6,64', '6'],
       ['SANPL', 'PLBZ00000044', '33 207 000', '5 213 499 000', '3,352',
        '1,91', '18'],
       ['TAURONPE', 'PLTAURN00011', '1 043 590 000', '1 252 308 000',
        '0,805', '1,21', '33']], dtype='<U14')

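The task: in the given array, if any string contains spaces, remove them. If any string contains a comma, replace it with a period (.). For example, the first row contains the string '88 860 000', which should become '88860000', and '0,891' should become '0.891'.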

I have tried several approaches but could not get it to work.

python pandas numpy data-science
1 Answer

Use this function:

def process_text(text):
    # Remove spaces, then replace the decimal comma with a period.
    return text.replace(' ', '').replace(',', '.')
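As a quick check, the function can be tried on the two example strings from the question before applying it to the whole array. This is only a minimal sketch; the variable names `thousands` and `decimal` are made up for illustration.

```python
def process_text(text):
    # Remove spaces, then replace the decimal comma with a period.
    return text.replace(' ', '').replace(',', '.')

# The two examples given in the question.
thousands = process_text('88 860 000')
decimal = process_text('0,891')

print(thousands)  # 88860000
print(decimal)    # 0.891
```

Strings with neither a space nor a comma, such as 'ALIOR', pass through unchanged.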

Here is how to apply it to every element of the array:

processed_data = np.vectorize(process_text)(data) # apply the function to each element of data

Result:

[['ALIOR' 'PLALIOR00045' '88860000' '1386216000' '0.891' '2.16' '14']
 ['CCC' 'PLCCC0000016' '27918000' '1292603400' '0.831' '5.28' '42']
 ['CDPROJEKT' 'PLOPTTC00011' '67348000' '22864646000' '14.702' '7.39' '7']
 ['CYFRPLSAT' 'PLCFRPT00013' '275301000' '6854994900' '4.408' '1.17' '14']
 ['DINOPL' 'PLDINPL00011' '47937000' '8916282000' '5.733' '9.13' '12']
 ['JSW' 'PLJSW0000015' '52636000' '716902320' '0.461' '1.51' '24']
 ['KGHM' 'PLKGHM000017' '136410000' '9881540400' '6.354' '4.78' '8']
 ['LOTOS' 'PLLOTOS00025' '86543000' '5609717260' '3.607' '2.91' '16']
 ['LPP' 'PLLPP0000011' '1306000' '7444200000' '4.787' '1.43' '19']
 ['MBANK' 'PLBRE0000012' '12997000' '2830746600' '1.820' '0.42' '24']
 ['ORANGEPL' 'PLTLKPL00017' '647357000' '4285503340' '2.756' '1.16' '13']
 ['PEKAO' 'PLPEKAO00016' '176379000' '9619710660' '6.185' '5.27' '9']
 ['PGE' 'PLPGER000010' '796776000' '3561588720' '2.290' '2.88' '18']
 ['PGNIG' 'PLPGNIG00014' '1624608000' '6072784704' '3.905' '1.56' '12']
 ['PKNORLEN' 'PLPKN0000018' '289049000' '17701360760' '11.382' '12.44'
  '8']
 ['PKOBP' 'PLPKO0000016' '857593000' '18807014490' '12.093' '10.49' '9']
 ['PLAY' 'LU1642887738' '114151000' '3696209380' '2.377' '1.47' '16']
 ['PZU' 'PLPZU0000011' '568305000' '17515160100' '11.262' '6.64' '6']
 ['SANPL' 'PLBZ00000044' '33207000' '5213499000' '3.352' '1.91' '18']
 ['TAURONPE' 'PLTAURN00011' '1043590000' '1252308000' '0.805' '1.21' '33']]
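As a possible alternative, NumPy's own string routines can do the same replacement without a Python-level helper function. The NumPy documentation notes that `np.vectorize` is provided mainly for convenience and is essentially a for loop, so `np.char.replace` may be worth trying instead. The sketch below uses only the first two rows of the question's data; the names `sample`, `no_spaces`, and `cleaned` are illustrative.

```python
import numpy as np

# First two rows of the question's data, as a small sample.
sample = np.array([
    ['ALIOR', 'PLALIOR00045', '88 860 000', '1 386 216 000', '0,891', '2,16', '14'],
    ['CCC', 'PLCCC0000016', '27 918 000', '1 292 603 400', '0,831', '5,28', '42'],
], dtype='<U14')

# np.char.replace applies str.replace to every element of a string array.
no_spaces = np.char.replace(sample, ' ', '')
cleaned = np.char.replace(no_spaces, ',', '.')

print(cleaned[0])
```

The two calls mirror the chained `replace` calls in `process_text`, so the cleaned strings should match the result shown above.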

I defined the data like this:

import numpy as np

data = np.array([
    ['ALIOR', 'PLALIOR00045', '88 860 000', '1 386 216 000', '0,891', '2,16', '14'],
    ['CCC', 'PLCCC0000016', '27 918 000', '1 292 603 400', '0,831', '5,28', '42'],
    ['CDPROJEKT', 'PLOPTTC00011', '67 348 000', '22 864 646 000', '14,702', '7,39', '7'],
    ['CYFRPLSAT', 'PLCFRPT00013', '275 301 000', '6 854 994 900', '4,408', '1,17', '14'],
    ['DINOPL', 'PLDINPL00011', '47 937 000', '8 916 282 000', '5,733', '9,13', '12'],
    ['JSW', 'PLJSW0000015', '52 636 000', '716 902 320', '0,461', '1,51', '24'],
    ['KGHM', 'PLKGHM000017', '136 410 000', '9 881 540 400', '6,354', '4,78', '8'],
    ['LOTOS', 'PLLOTOS00025', '86 543 000', '5 609 717 260', '3,607', '2,91', '16'],
    ['LPP', 'PLLPP0000011', '1 306 000', '7 444 200 000', '4,787', '1,43', '19'],
    ['MBANK', 'PLBRE0000012', '12 997 000', '2 830 746 600', '1,820', '0,42', '24'],
    ['ORANGEPL', 'PLTLKPL00017', '647 357 000', '4 285 503 340', '2,756', '1,16', '13'],
    ['PEKAO', 'PLPEKAO00016', '176 379 000', '9 619 710 660', '6,185', '5,27', '9'],
    ['PGE', 'PLPGER000010', '796 776 000', '3 561 588 720', '2,290', '2,88', '18'],
    ['PGNIG', 'PLPGNIG00014', '1 624 608 000', '6 072 784 704', '3,905', '1,56', '12'],
    ['PKNORLEN', 'PLPKN0000018', '289 049 000', '17 701 360 760', '11,382', '12,44', '8'],
    ['PKOBP', 'PLPKO0000016', '857 593 000', '18 807 014 490', '12,093', '10,49', '9'],
    ['PLAY', 'LU1642887738', '114 151 000', '3 696 209 380', '2,377', '1,47', '16'],
    ['PZU', 'PLPZU0000011', '568 305 000', '17 515 160 100', '11,262', '6,64', '6'],
    ['SANPL', 'PLBZ00000044', '33 207 000', '5 213 499 000', '3,352', '1,91', '18'],
    ['TAURONPE', 'PLTAURN00011', '1 043 590 000', '1 252 308 000', '0,805', '1,21', '33']
], dtype='<U14')
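The question does not ask for it, but once the spaces and commas are cleaned up, the numeric columns can be converted to real numbers if that is the eventual goal. The sketch below assumes that columns 2 onward hold numbers after cleaning, which is true for the data shown here but is still an assumption. The names `rows` and `numeric` are illustrative.

```python
import numpy as np

def process_text(text):
    # Remove spaces, then replace the decimal comma with a period.
    return text.replace(' ', '').replace(',', '.')

# First two rows of the question's data, as a small sample.
rows = np.array([
    ['ALIOR', 'PLALIOR00045', '88 860 000', '1 386 216 000', '0,891', '2,16', '14'],
    ['CCC', 'PLCCC0000016', '27 918 000', '1 292 603 400', '0,831', '5,28', '42'],
], dtype='<U14')

cleaned = np.vectorize(process_text)(rows)

# Assumption: every column from index 2 onward is numeric after cleaning.
numeric = cleaned[:, 2:].astype(float)

print(numeric.shape)
print(numeric[0])
```

The first two columns (ticker and ISIN) stay as strings in `cleaned`; only the sliced columns are converted.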