Python to return the lines in a file that have too many or too few tabs

Question

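I need some Python that does the following:

Find all lines in a tab-delimited text file that have more or fewer than X tabs.
Print those lines (each on its own line, of course).

For example, "the_file.txt" has the following contents: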

Field1[TAB]Field2[TAB]Field3[TAB]Field4[TAB]Field5
Field1[TAB]Field2[TAB]Field3
Field1[TAB]Field2[TAB]Field3[TAB]Field4
Field1[TAB]Field2[TAB]Field3[TAB]Field4[TAB]Field5

Pseudo-Python:

Read the_file.txt
Find all rows that do not have 4 tabs
Print the entire content of those rows

Returns:

Field1[TAB]Field2[TAB]Field3
Field1[TAB]Field2[TAB]Field3[TAB]Field4

One thing to consider: the files I will run this against are usually very large, always 1,000+ lines, often 10,000+ lines, and sometimes 100,000+ lines.

Thanks!

Answer 1

Just do this:

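import pandas as pd

# header=None: the sample file has no header row, so its first line is data too
df = pd.read_csv('the_file.txt', sep='\t', header=None,
                 names=['Col1', 'Col2', 'Col3', 'Col4', 'Col5'])
# Rows with too few tabs are padded with NaN, so keep the rows containing any NaN
# (rows with genuinely empty fields are selected as well)
short_rows = df[df.isnull().any(axis=1)]
print(short_rows)

Output:

     Col1    Col2    Col3    Col4  Col5
1  Field1  Field2  Field3     NaN   NaN
2  Field1  Field2  Field3  Field4   NaN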
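The pandas approach above depends on short rows being padded with NaN, so it only notices rows with too few tabs. As a rough alternative sketch, the standard-library csv module can count the fields on each row and report rows that are too long as well as too short. The function name rows_with_wrong_field_count and its parameters are made up for illustration, and the sketch assumes the file is UTF-8 and that default csv quoting rules are acceptable (a tab inside a double-quoted field would not count as a separator).

```python
import csv
from typing import List


def rows_with_wrong_field_count(path: str, expected_fields: int) -> List[List[str]]:
    """Return every parsed row whose number of fields differs from expected_fields."""
    bad_rows = []
    # newline="" is what the csv module documentation recommends for file objects
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.reader(f, delimiter="\t"):
            # A row with N tabs parses into N + 1 fields
            if len(row) != expected_fields:
                bad_rows.append(row)
    return bad_rows


# Hypothetical usage: 4 tabs per good row means 5 fields per good row
# for row in rows_with_wrong_field_count("the_file.txt", 5):
#     print("\t".join(row))
```

Note that the count here is in fields, not tabs, so the example file from the question would use 5 rather than 4.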

Answer 2

Here you go:

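expected_tabs = 4  # a well-formed row contains exactly this many tabs

# Iterate over the file object directly so a large file is never loaded all at once
with open('the_file.txt') as f:
    for line in f:
        # Print every row whose tab count differs from the expected value
        if line.count("\t") != expected_tabs:
            # Strip the trailing newline so print() does not add a blank line
            print(line.rstrip("\n"))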
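Since the files in question can run to 100,000+ lines, it may also help to know where each bad row sits. The following is a minimal sketch of the same tab-counting idea wrapped in a generator that reads one line at a time and yields the 1-based line number with the line. The name rows_with_wrong_tab_count is hypothetical, and UTF-8 encoding is an assumption.

```python
from typing import Iterator, Tuple


def rows_with_wrong_tab_count(path: str, expected_tabs: int) -> Iterator[Tuple[int, str]]:
    """Yield (line_number, line) for each line whose tab count differs from expected_tabs."""
    with open(path, encoding="utf-8") as f:
        # enumerate over the file object reads lazily, one line at a time
        for line_number, line in enumerate(f, start=1):
            line = line.rstrip("\n")
            if line.count("\t") != expected_tabs:
                yield line_number, line


# Hypothetical usage with the file from the question:
# for line_number, line in rows_with_wrong_tab_count("the_file.txt", 4):
#     print(f"{line_number}: {line}")
```

Because it is a generator, memory use stays flat no matter how large the file is, and the caller decides whether to print, count, or collect the offending rows.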
