我正在研究从GEOTiff文件中检测对象并返回对象坐标的项目,这些输出将用于无人机飞到那些坐标
我使用tensorflow和YOLO v2(图像检测器框架)和OpenCV来检测我在GEOTiff中需要的对象
import cv2
from darkflow.net.build import TFNet
import math
import gdal
# initial stage for YOLO v2
options = {
'model': 'cfg/yolo.cfg',
'load': 'bin/yolov2.weights',
'threshold': 0.4,
}
tfnet = TFNet(options)
# OpenCV read Image
img = cv2.imread('final.tif', cv2.IMREAD_COLOR)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
#Predict the image
result = tfnet.return_predict(img)
#Calculate the center and radius of each objects
i = 0
while i < len(result):
tl = (result[i]['topleft']['x'], result[i]['topleft']['y'])
br = (result[i]['bottomright']['x'], result[i]['bottomright']['y'])
point = (int((result[i]['topleft']['x']+result[i]['bottomright']['x'])/2), int((result[i]['topleft']['y']+result[i]['bottomright']['y'])/2))
radius = int(math.hypot(result[i]['topleft']['x'] - point[0], result[i]['topleft']['y'] - point[1]))
label = result[i]['label']
result[i]['pointx'] = point[0]
result[i]['pointy'] = point[1]
result[i]['radius'] = radius
i += 1
print(result)
所以结果就像JSON一样
[{'label': 'person', 'confidence': 0.6090355, 'topleft': {'x': 3711, 'y': 1310}, 'bottomright': {'x': 3981, 'y': 1719}, 'pointx': 3846, 'pointy': 1514, 'radius': 244}]
你可以看到对象的位置以像素(x,y)返回,我想用这些x,y转换为lat,lng中的坐标,所以我尝试使用GDAL(用于读取GEO信息的库)包含在图像内部)
所以这里是终端中使用gdalinfo的图像的GEO信息
Driver: GTiff/GeoTIFF
Files: final.tif
Size is 8916, 6888
Coordinate System is:
PROJCS["WGS 84 / UTM zone 47N",
GEOGCS["WGS 84",
DATUM["WGS_1984",
SPHEROID["WGS 84",6378137,298.257223563,
AUTHORITY["EPSG","7030"]],
AUTHORITY["EPSG","6326"]],
PRIMEM["Greenwich",0,
AUTHORITY["EPSG","8901"]],
UNIT["degree",0.0174532925199433,
AUTHORITY["EPSG","9122"]],
AUTHORITY["EPSG","4326"]],
PROJECTION["Transverse_Mercator"],
PARAMETER["latitude_of_origin",0],
PARAMETER["central_meridian",99],
PARAMETER["scale_factor",0.9996],
PARAMETER["false_easting",500000],
PARAMETER["false_northing",0],
UNIT["metre",1,
AUTHORITY["EPSG","9001"]],
AXIS["Easting",EAST],
AXIS["Northing",NORTH],
AUTHORITY["EPSG","32647"]]
Origin = (667759.259870000067167,1546341.352920000208542)
Pixel Size = (0.032920000000000,-0.032920000000000)
Metadata:
AREA_OR_POINT=Area
TIFFTAG_SOFTWARE=pix4dmapper
Image Structure Metadata:
COMPRESSION=LZW
INTERLEAVE=PIXEL
Corner Coordinates:
Upper Left ( 667759.260, 1546341.353) (100d33'11.42"E, 13d58'57.03"N)
Lower Left ( 667759.260, 1546114.600) (100d33'11.37"E, 13d58'49.65"N)
Upper Right ( 668052.775, 1546341.353) (100d33'21.20"E, 13d58'56.97"N)
Lower Right ( 668052.775, 1546114.600) (100d33'21.15"E, 13d58'49.59"N)
Center ( 667906.017, 1546227.976) (100d33'16.29"E, 13d58'53.31"N)
Band 1 Block=8916x1 Type=Byte, ColorInterp=Red
NoData Value=-10000
Band 2 Block=8916x1 Type=Byte, ColorInterp=Green
NoData Value=-10000
Band 3 Block=8916x1 Type=Byte, ColorInterp=Blue
NoData Value=-10000
Band 4 Block=8916x1 Type=Byte, ColorInterp=Alpha
NoData Value=-10000
任何人?
您需要使用与光栅文件关联的GeoTransform矩阵将像素坐标转换为地理空间。使用GDAL,您可以执行以下操作:
# open the dataset and get the geo transform matrix
ds = gdal.Open('final.tif')
xoffset, px_w, rot1, yoffset, px_h, rot2 = ds.GetGeoTransform()
# supposing x and y are your pixel coordinate this
# is how to get the coordinate in space.
posX = px_w * x + rot1 * y + xoffset
posY = rot2 * x + px_h * y + yoffset
# shift to the center of the pixel
posX += px_w / 2.0
posY += px_h / 2.0
当然,您获得的位置将相对于用于栅格数据集的相同坐标参考系统。因此,如果您需要将其转换为纬度/经度,则必须进一步详细说明:
# get CRS from dataset
crs = osr.SpatialReference()
crs.ImportFromWkt(ds.GetProjectionRef())
# create lat/long crs with WGS84 datum
crsGeo = osr.SpatialReference()
crsGeo.ImportFromEPSG(4326) # 4326 is the EPSG id of lat/long crs
t = osr.CoordinateTransformation(crs, crsGeo)
(lat, long, z) = t.TransformPoint(posX, posY)
对不起,我不是很精通python,所以你可能需要调整这段代码。查看GeoTransform here for the C++ API的文档以了解有关矩阵元素的更多信息。