Python - glob.glob with grep?

Question

I'm fairly new to Python and slowly working my way forward.

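We have about 10,000 files in one folder. They contain similar information, with one major difference: some files contain the string 'string1', while the rest contain 'string2'. To be clear, the string is not in the file name but in the file contents. The file contents are character-delimited.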

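I tried to build two separate lists, one for the files with string1 and one for the files with string2, and tried various lines of code, but got nowhere. Both lists should contain only file names.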

Answer 1

I often use grep for this kind of thing. In this case I would use the following (edited to add the file extension):

grep -l string1 *.txt > string1_files.txt && grep -l string2 *.txt > string2_files.txt

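This one-liner searches the .txt files for string1 and writes the names of the matching files to string1_files.txt; the same goes for string2 and string2_files.txt.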

Quoting from man grep:

 -l, --files-with-matches
         Only the names of files containing selected lines are written to
         standard output.  grep will only search a file until a match has
         been found, making searches potentially less expensive.  Path-
         names are listed once per file searched.  If the standard input
         is searched, the string ``(standard input)'' is written.

Hope this helps, though you may want to grep only files with certain extensions.

Edited for files without an extension (in case the files are not matched otherwise, as raised in the comments on the question):

grep -l string1 * > string1_files.txt && grep -l string2 * > string2_files.txt
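For anyone who would rather stay in Python, here is a rough sketch of what grep -l does, not a drop-in replacement. The function name files_with_match and its parameters are made up for illustration, and errors="ignore" is an assumption so that stray bytes in a file do not abort the scan.

```python
import glob
import os


def files_with_match(folder, needle, pattern="*"):
    """Return the names of files in `folder` whose contents contain `needle`."""
    matches = []
    for path in sorted(glob.glob(os.path.join(folder, pattern))):
        if not os.path.isfile(path):
            continue  # skip subdirectories
        # errors="ignore" is an assumption: undecodable bytes are dropped
        with open(path, "r", errors="ignore") as f:
            # Iterate line by line and stop at the first hit, like grep -l
            if any(needle in line for line in f):
                matches.append(os.path.basename(path))
    return matches
```

Calling files_with_match("foo", "string1", "*.txt") would give the names that the first grep command writes to string1_files.txt, assuming the files live in a folder called foo.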

Answer 2

Assuming your files contain nothing but the string you want to compare against, you just need to do

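import glob
import os

folder = 'foo'
files = glob.glob(os.path.join(folder, "*"))

list1 = []
list2 = []
for file in files:
  with open(file, 'r') as f:
    # read() returns the whole file as one string; readlines() returns a list, which has no strip()
    if f.read().strip() == 'string1':
      list1.append(file)
    else:
      list2.append(file)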

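If your files contain more data, you just need to process the file contents (for example the lines returned by f.readlines()) and compare accordingly.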
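When the files hold more than just the marker string, one possible approach is a substring test with the in operator instead of an equality check. The sketch below is only an illustration: split_by_marker and its parameter names are hypothetical, and it assumes each file is small enough to read into memory in one go. Unlike an if/else, a file containing neither string ends up in neither list, and a file containing both ends up in both.

```python
import glob
import os


def split_by_marker(folder, marker1="string1", marker2="string2"):
    """Sort file names into two lists by which marker their contents include."""
    list1, list2 = [], []
    for path in sorted(glob.glob(os.path.join(folder, "*"))):
        if not os.path.isfile(path):
            continue  # skip subdirectories
        # errors="ignore" is an assumption: undecodable bytes are dropped
        with open(path, "r", errors="ignore") as f:
            content = f.read()  # assumes the file fits comfortably in memory
        name = os.path.basename(path)  # keep only the file name, not the folder
        if marker1 in content:
            list1.append(name)
        if marker2 in content:
            list2.append(name)
    return list1, list2
```

With the folder from the answer, list1, list2 = split_by_marker('foo') would give the two lists of bare file names the question asks for.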
