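I have a large, complex GIS data file of road accidents in cities within counties. Each row represents a road. The columns give 'City', 'County' and 'Sum of accidents in city'. A city therefore contains several roads (each repeating the city's accident total), and a county contains several cities. For each 'County', I now want to rank the cities by number of accidents, so that within each county the city with the most accidents is ranked 1 and cities with fewer accidents are ranked 2 and above. The rank value should be written back to the original data file.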
My initial approach was:
1. Sort the data by 'County_ID' and 'Accidents' (descending).
2. Then, for each row, compute:
if ('County' in row 'n+1' = 'County' in row 'n') AND ('Accidents' in row 'n+1' = 'Accidents' in row 'n'):
    return value: 'n'    ## maintain same rank for cities within 'County'
else if ('County' in row 'n+1' = 'County' in row 'n') AND ('Accidents' in row 'n+1' < 'Accidents' in row 'n'):
    return value: 'n+1'  ## increasing rank value within 'County'
else if ('County' in row 'n+1' < 'County' in row 'n') AND ('Accidents' in row 'n+1' < 'Accidents' in row 'n'):
    return value: '1'    ## new 'County', i.e. start ranking from 1
else:
    return '0'           ## error
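However, I don't know how to code this properly, and perhaps the approach isn't suitable anyway. Maybe a loop would do the trick?
Any suggestions?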
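For illustration only, the row-by-row idea above can be sketched as a single loop in plain Python. The function name `rank_within_county` and the sample rows are hypothetical. The sketch assumes the rows are already sorted by county and then by accidents in descending order. It restarts the rank whenever the county changes, regardless of how the accident totals compare.

```python
def rank_within_county(rows):
    """Dense-rank (county, accidents) pairs.

    Assumes rows are sorted by county, then by accidents descending.
    """
    ranks = []
    prev_county, prev_accidents, rank = None, None, 0
    for county, accidents in rows:
        if county != prev_county:
            rank = 1              # new county: restart ranking at 1
        elif accidents < prev_accidents:
            rank += 1             # fewer accidents than the row above: next rank
        # otherwise: same county, same total, so the rank is unchanged
        ranks.append(rank)
        prev_county, prev_accidents = county, accidents
    return ranks


# Hypothetical sample: (county, accidents) pairs, one per road
rows = [("a", 5), ("a", 2), ("a", 5), ("b", 6), ("b", 8), ("b", 6)]
rows.sort(key=lambda r: (r[0], -r[1]))  # county ascending, accidents descending
ranks = rank_within_county(rows)
```

A loop like this works, but it depends on the sort order being right, which is one reason a grouped ranking function is usually easier to get correct.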
I suggest using the Python pandas module.
Dummy data
Create data with county, accidents and city columns.
The actual data would be loaded with pandas read_csv.
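As a rough sketch of that loading step, the snippet below reads a small in-memory CSV with `pd.read_csv`. The CSV text and the name `df_csv` are made up for illustration; with real data you would pass the path of your exported file instead of the `io.StringIO` object, and the column names would be whatever your export uses.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for the real export
csv_text = """county,accidents,city
a,1,A
a,2,B
b,5,D
"""

# read_csv accepts a path or any file-like object
df_csv = pd.read_csv(io.StringIO(csv_text))
```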
import pandas as pd
df = pd.DataFrame([
['a', 1, 'A'],
['a', 2, 'B'],
['a', 5, 'C'],
['b', 5, 'D'],
['b', 5, 'E'],
['b', 6, 'F'],
['b', 8, 'G'],
['c', 2, 'H'],
['c', 2, 'I'],
['c', 7, 'J'],
['c', 7, 'K']
], columns = ['county', 'accidents', 'city'])
Resulting DataFrame
df:
county accidents city
0 a 1 A
1 a 2 B
2 a 5 C
3 b 5 D
4 b 5 E
5 b 6 F
6 b 8 G
7 c 2 H
8 c 2 I
9 c 7 J
10 c 7 K
Group the rows by county, then rank the accident counts within each group.
Ranking code
# ascending=False causes the cities with the most accidents to be ranked 1
df["rank"] = df.groupby("county")["accidents"].rank(method="dense", ascending=False)
Result
df:
county accidents city rank
0 a 1 A 3.0
1 a 2 B 2.0
2 a 5 C 1.0
3 b 5 D 3.0
4 b 5 E 3.0
5 b 6 F 2.0
6 b 8 G 1.0
7 c 2 H 2.0
8 c 2 I 2.0
9 c 7 J 1.0
10 c 7 K 1.0
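The dummy data has one row per city, whereas the file in the question has one row per road, with the city total repeated on each road. A minimal sketch, using made-up road rows and the hypothetical name `roads`, suggests that a dense rank copes with those repeats: equal totals share a rank and no rank numbers are skipped. Casting to `int` is optional and assumes there are no missing accident values.

```python
import pandas as pd

# Hypothetical road-level data: each city's total repeats once per road
roads = pd.DataFrame(
    [
        ["a", "A", 1], ["a", "A", 1],
        ["a", "C", 5], ["a", "C", 5], ["a", "C", 5],
        ["b", "D", 5],
        ["b", "G", 8], ["b", "G", 8],
    ],
    columns=["county", "city", "accidents"],
)

# Dense rank within each county; the highest accident total gets rank 1
roads["rank"] = (
    roads.groupby("county")["accidents"]
    .rank(method="dense", ascending=False)
    .astype(int)  # rank() returns floats; safe to cast when there are no NaNs
)
```

Other ranking methods such as `"min"` or `"first"` would count the repeated road rows and give different numbers, so `"dense"` is the relevant choice when the rows are not deduplicated first.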
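I think @DarrylG's approach is correct, but it doesn't take into account that the environment is ArcGIS.
Since you tagged the question with Python, I came up with a workflow that uses pandas. There are other ways to do this with ArcGIS tools and/or the Field Calculator.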
import arcpy  # if you are using this script outside ArcGIS
import pandas as pd

# change this to your actual shapefile; you might have to include a path
filename = "road_accidents"
sFields = ['County', 'City', 'SumOfAccidents']  # consider these to be your columns

# read everything in your file into a pandas DataFrame with a SearchCursor
with arcpy.da.SearchCursor(filename, sFields) as sCursor:
    df = pd.DataFrame(data=[row for row in sCursor], columns=sFields)

df = df.drop_duplicates()  # each row represents a street, so we can remove duplicates

# we use this code from DarrylG to calculate a rank (most accidents = rank 1)
df['Rank'] = df.groupby('County')['SumOfAccidents'].rank(method='dense', ascending=False)

# set a MultiIndex, since there might be duplicate city names
df = df.set_index(['County', 'City'])
dct = df['Rank'].to_dict()  # convert the Rank column into a {(County, City): Rank} dictionary

# add a field to your shapefile
arcpy.AddField_management(filename, 'Rank', 'SHORT')

# now we can update the shapefile
uFields = ['County', 'City', 'Rank']
with arcpy.da.UpdateCursor(filename, uFields) as uCursor:  # open an UpdateCursor on the file
    for row in uCursor:  # for each row (street)
        # get the county/city combo
        County_City = (row[uFields.index('County')], row[uFields.index('City')])
        if County_City in dct:  # see if it is in your dictionary (it should be)
            # give it the value from the dictionary; rank() returns floats
            row[uFields.index('Rank')] = int(dct[County_City])
        else:
            # otherwise flag the row with a sentinel value
            row[uFields.index('Rank')] = 999
        uCursor.updateRow(row)  # update the row
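To check the (County, City) lookup without an ArcGIS session, here is a small pandas-only sketch. The county and city names, the accident totals and the variable `lookup` are invented for illustration. It takes the dictionary from the 'Rank' column alone, so the keys are (County, City) tuples and a membership test behaves as the cursor loop expects.

```python
import pandas as pd

# Hypothetical deduplicated city totals; "Springfield" appears in two counties
df = pd.DataFrame(
    [("X", "Springfield", 12), ("X", "Shelbyville", 30), ("Y", "Springfield", 7)],
    columns=["County", "City", "SumOfAccidents"],
)

df["Rank"] = (
    df.groupby("County")["SumOfAccidents"]
    .rank(method="dense", ascending=False)
    .astype(int)
)

# Selecting the Rank column first yields {(County, City): rank}
lookup = df.set_index(["County", "City"])["Rank"].to_dict()

# Same fallback idea as the cursor loop: 999 for unknown combinations
rank_known = lookup.get(("X", "Shelbyville"), 999)
rank_unknown = lookup.get(("Z", "Nowhere"), 999)
```

By contrast, calling `to_dict()` on the whole DataFrame returns a dictionary keyed by column name, so a `(County, City)` tuple would never be found at its top level.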
You can run this code in the ArcGIS Pro Python console, or use a Jupyter notebook. Hope this helps!