Regular expression (regex) for a fixed format: [party_name] v. [party_name], citation, year

Problem description  Votes: 0  Answers: 1

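I have a list of contents (taken from a kind of legal document). I need to extract from this list the items that follow this expression type: [party_name] v. [party_name], citation, year, with the year given in parentheses. The content and the expression type are linked in the post; for convenience, I also show the list here.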

list1= ['TABLE OF AUTHORITIES',
 'Cases',
 'Agostino v. Quest Diognostics Inc., 256 F.R.D. 437 (D.N.J. 2009)',
 'Amchem Products, Inc. v. Windsor, 521 U.S. 591 (1997)',
 'Arthur Jaffe Associates v. Bilsco Auto Service, Inc., 453 N.Y.S. 2d 501 (App. Div. 1982)',
 'Avritt v. Reliastar Life Ins. Co., 615 F.2d 1023 (8th Cir. 2010)',
 'Beck v. Maximus, Inc., 457 F.3d 291 (3d Cir. 2006)',
 'Carroll v. Cellco Partnership, 713 A.2d 509 (N.J. Super. Ct. App. Div. 1998)',
 'Cox v. Sears Roebuck & Co., 647 A.2d 454 (N.J. 1994)',
 "Cuming v. S.C. Lottery Comm'n, Civil Action No. 05-3608, 2008 WL 906705 (D.S.C. Mar. 28, 2008)",
 'Demmick v. Cellco Parterhship, Civil Action No. 06-2163, 2010 WL 3636216 (D.N.J. Sept. 8, 2010)',
 'Denney v. Deutsche Bank AG, 443 F.3d 253 (2d Cir. 2006)',
 "Elias v. Ungar's Food Products, Inc., 252 F.R.D. 233 (D.N.J. 2008)",
 '*vi Erit v. Judge, Inc., 961 F.Supp. 774 (D.N.J. 1997)',
 'Fink v. Ricoh Corp., 839 A.2d 942 (N.J. Super. Ct. Law Div. 2003)',
 'Folbaum v. Rexall Sundown Inc., Appeal No. A-244-02TI, 2004 WL 3574116 (N.J. Super. Ct. App. Div. May 4, 2004)',
 'Weiss v. York Hosp., 745 F.2d 786 (3d Cir. 1984)',
 'Statutes',
 '28 U.S.C. § 1292(e)',
 '28 U.S.C. § 1332(d)',
 'N.J.S.A. 56:8-2',
 'N.J.S.A. 56:8-19',
 'Rules',]

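So I want to build a list that contains only items of the expression type above. For example, the first element extracted from the list should be "Agostino v. Quest Diognostics Inc., 256 F.R.D. 437 (D.N.J. 2009)". I would like to do this with a regex, but other approaches would also help. What I show here is only part of the content, and other approaches might not work when checking the complete document, which is why I am asking for a regex. Help would be greatly appreciated.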

python regex parsing regex-group
1 answer
0
votes

I'm not sure whether this is what you want, since you didn't specify which parts to extract. But this is what I came up with.

list1= ['TABLE OF AUTHORITIES',
 'Cases',
 'Agostino v. Quest Diognostics Inc., 256 F.R.D. 437 (D.N.J. 2009)',
 'Amchem Products, Inc. v. Windsor, 521 U.S. 591 (1997)',
 'Arthur Jaffe Associates v. Bilsco Auto Service, Inc., 453 N.Y.S. 2d 501 (App. Div. 1982)',
 'Avritt v. Reliastar Life Ins. Co., 615 F.2d 1023 (8th Cir. 2010)',
 'Beck v. Maximus, Inc., 457 F.3d 291 (3d Cir. 2006)',
 'Carroll v. Cellco Partnership, 713 A.2d 509 (N.J. Super. Ct. App. Div. 1998)',
 'Cox v. Sears Roebuck & Co., 647 A.2d 454 (N.J. 1994)',
 "Cuming v. S.C. Lottery Comm'n, Civil Action No. 05-3608, 2008 WL 906705 (D.S.C. Mar. 28, 2008)",
 'Demmick v. Cellco Parterhship, Civil Action No. 06-2163, 2010 WL 3636216 (D.N.J. Sept. 8, 2010)',
 'Denney v. Deutsche Bank AG, 443 F.3d 253 (2d Cir. 2006)',
 "Elias v. Ungar's Food Products, Inc., 252 F.R.D. 233 (D.N.J. 2008)",
 '*vi Erit v. Judge, Inc., 961 F.Supp. 774 (D.N.J. 1997)',
 'Fink v. Ricoh Corp., 839 A.2d 942 (N.J. Super. Ct. Law Div. 2003)',
 'Folbaum v. Rexall Sundown Inc., Appeal No. A-244-02TI, 2004 WL 3574116 (N.J. Super. Ct. App. Div.  May 4, 2004)',
 'Weiss v. York Hosp., 745 F.2d 786 (3d Cir. 1984)',
 'Statutes',
 '28 U.S.C. § 1292(e)',
 '28 U.S.C. § 1332(d)',
 'N.J.S.A. 56:8-2',
 'N.J.S.A. 56:8-19',
 'Rules',]

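import re

# Three capture groups: party A, party B, and the rest of the line (the citation).
pattern = re.compile(r'([a-zA-Z \.]+) v\. ([a-zA-Z \.]+),(.*)')

# Keep every line in which the pattern is found.
new_list = [line for line in list1 if pattern.search(line)]

for line in new_list:
    print(line)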

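The regex captures three groups (party A, party B, citation), and the loop prints every line that matches. Because the character class [a-zA-Z \.] allows only letters, spaces, and periods, party names containing & or ' (the Cox, Cuming, and Elias entries) are not matched: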

Agostino v. Quest Diognostics Inc., 256 F.R.D. 437 (D.N.J. 2009)
Amchem Products, Inc. v. Windsor, 521 U.S. 591 (1997)
Arthur Jaffe Associates v. Bilsco Auto Service, Inc., 453 N.Y.S. 2d 501 (App. Div. 1982)
Avritt v. Reliastar Life Ins. Co., 615 F.2d 1023 (8th Cir. 2010)
Beck v. Maximus, Inc., 457 F.3d 291 (3d Cir. 2006)
Carroll v. Cellco Partnership, 713 A.2d 509 (N.J. Super. Ct. App. Div. 1998)
Demmick v. Cellco Parterhship, Civil Action No. 06-2163, 2010 WL 3636216 (D.N.J. Sept. 8, 2010)
Denney v. Deutsche Bank AG, 443 F.3d 253 (2d Cir. 2006)
*vi Erit v. Judge, Inc., 961 F.Supp. 774 (D.N.J. 1997)
Fink v. Ricoh Corp., 839 A.2d 942 (N.J. Super. Ct. Law Div. 2003)
Folbaum v. Rexall Sundown Inc., Appeal No. A-244-02TI, 2004 WL 3574116 (N.J. Super. Ct. App. Div.  May 4, 2004)
Weiss v. York Hosp., 745 F.2d 786 (3d Cir. 1984)
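
To also pick up the entries whose party names contain characters such as & or ', and to return the parts separately, a more permissive pattern could be tried. The following is only a sketch under assumptions drawn from the sample list: CASE_RE, parse_case, and the group names are hypothetical names introduced here. It assumes the citation starts with a digit or with "Civil Action No." / "Appeal No.", that every case ends with a parenthesized court and date whose last token is a four-digit year, and that a leading marker like "*vi" is noise. A full document may need further alternatives.

```python
import re

# Hypothetical pattern, not the one from the answer above.
CASE_RE = re.compile(
    r"""
    ^(?:\*\w+\s+)?            # optional page marker such as "*vi" (assumed to be noise)
    (?P<party_a>.+?)          # first party, up to the first " v. "
    \s+v\.\s+
    (?P<party_b>.+?)          # second party; may itself contain commas
    ,\s+
    (?P<citation>
        (?=\d|Civil\ Action\ No\.|Appeal\ No\.)  # assumed ways a citation can start
        .*
        \((?:[^()]*\s)?(?P<year>\d{4})\)         # trailing "(court year)" or "(year)"
    )
    \s*$
    """,
    re.VERBOSE,
)


def parse_case(line):
    """Return the citation parts as a dict, or None if the line is not a case."""
    match = CASE_RE.match(line.strip())
    return match.groupdict() if match else None


# A few lines from the list above, including ones the simpler regex misses.
sample = [
    'Cases',
    'Amchem Products, Inc. v. Windsor, 521 U.S. 591 (1997)',
    'Cox v. Sears Roebuck & Co., 647 A.2d 454 (N.J. 1994)',
    "Cuming v. S.C. Lottery Comm'n, Civil Action No. 05-3608, 2008 WL 906705 (D.S.C. Mar. 28, 2008)",
    '*vi Erit v. Judge, Inc., 961 F.Supp. 774 (D.N.J. 1997)',
    '28 U.S.C. § 1332(d)',
]

cases = [parsed for parsed in map(parse_case, sample) if parsed]
for case in cases:
    print(case['party_a'], '|', case['party_b'], '|', case['citation'], '|', case['year'])
```

The lookahead at the start of the citation group is what lets party B contain commas ("Judge, Inc."): the lazy party_b group keeps growing until the comma that is followed by something that looks like the start of a citation.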