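I am asking two questions in one help request, so I hope that does not clutter things here.
I have spent quite some time on this problem but have had no success so far. I am trying to plot only those points in a data series that lie close to each other, and not the transitions (see the figure below). Probably I need an if condition that says: if x(2)-x(1) < 0.005, plot the point, and if not, do not plot it. Later I want to do a linear regression on these points (that is why I want to exclude the transitions). Can you please help me with how to do the plotting with this condition and how to do the linear regression?
Here is my code: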
import pandas as pd
import matplotlib.pyplot as plt

# which value you want to use or plot reading from log data
desired_field1 = "x y Box"
desired_value1 = "x1 [um]"
desired_field2 = "x y Box"
desired_value2 = "x3 [um]"
desired_field3 = "LM Position"
desired_value3 = "Z [um]"
# extracting desired data from logging
data = pd.read_excel(r"test2.xlsx", sheet_name='Sheet1')
# selecting desired time interval
data = data[(data['_time'] > '2024-05-21T09:43:31.7141954Z') & (data['_time'] < '2024-05-21T09:49:37.6089875Z')]
# filter by measurement, then by field; each mask is built from the frame it is applied to,
# and .copy() makes the later '_time' assignments act on independent frames
data_measurement1 = data.loc[data['_measurement'] == desired_field1]
data_field1 = data_measurement1.loc[data_measurement1['_field'] == desired_value1].copy()
data_measurement2 = data.loc[data['_measurement'] == desired_field2]
data_field2 = data_measurement2.loc[data_measurement2['_field'] == desired_value2].copy()
data_measurement3 = data.loc[data['_measurement'] == desired_field3]
data_field3 = data_measurement3.loc[data_measurement3['_field'] == desired_value3].copy()
values1 = list(data_field1['_value']) # values we are interested in
values2 = list(data_field2['_value'])
values3 = list(data_field3['_value'])
#....
mean_xs = [(g + h) / 2 for g, h in zip(values1, values2)]
LM_mean = [50-x for x in mean_xs]
#start plotting
data_field1['_time'] = pd.to_datetime(data_field1['_time'].str.split().str[-1])
data_field2['_time'] = pd.to_datetime(data_field2['_time'].str.split().str[-1])
data_field3['_time'] = pd.to_datetime(data_field3['_time'].str.split().str[-1])
plt.plot(data_field1['_time'], values1, '-', label = desired_value1)
plt.plot(data_field2['_time'], values2, '-', label = desired_value2 )
plt.plot(data_field3['_time'], values3, '-', label = desired_value3)
plt.xlabel('time [D hh:mm]')
plt.ylabel(' x [um] MCS')
plt.legend(loc='best')
plt.gca().yaxis.grid(True)
plt.figure()
plt.plot(LM_mean, values3, 'o')
Sample data:
9988 2024-05-21T09:46:00.1164445Z 1294.005333
9989 2024-05-21T09:46:01.1115275Z 1294.005333
9990 2024-05-21T09:46:02.1254956Z 1294.005667
9991 2024-05-21T09:46:03.1191685Z 1294.005667
9992 2024-05-21T09:46:04.1325494Z 1294.005333
9993 2024-05-21T09:46:05.1268794Z 1294.005333
9994 2024-05-21T09:46:06.1409297Z 1294.005333
9995 2024-05-21T09:46:07.1346292Z 1294.005000
9996 2024-05-21T09:46:08.1488069Z 1294.005333
9997 2024-05-21T09:46:09.1417524Z 1294.005333
9998 2024-05-21T09:46:10.1563002Z 1294.005333
9999 2024-05-21T09:46:11.1692835Z 1294.005333
10000 2024-05-21T09:46:12.1642492Z 1332.747333
10001 2024-05-21T09:46:13.1977216Z 1344.011333
10002 2024-05-21T09:46:14.1926256Z 1344.012000
10003 2024-05-21T09:46:15.2062685Z 1344.011667
10004 2024-05-21T09:46:16.2200463Z 1344.011667
10005 2024-05-21T09:46:17.2339343Z 1344.012000
10006 2024-05-21T09:46:18.2479639Z 1344.012000
10007 2024-05-21T09:46:19.2405515Z 1344.012000
10008 2024-05-21T09:46:20.2556817Z 1344.012000
I tried searching for this but without success.
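You have a very clean signal, so the steps are easy to detect if you round the numbers to integers. I did end up generating a toy dataset, because your example contains only one step and is therefore not a very robust test. For now I am only including the method: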
import numpy as np

diffY = np.diff(y) # get the gradient of y
idxSteps = np.array(np.where(np.abs(diffY) > 1)) # search for steps as the abs of the difference
previousStep = 0 # init the previous step
xMid = list() # init list for x
yMid = list() # same for y
for currentStep in idxSteps[0]: # for every step detected
    if currentStep != previousStep: # and if it is not the same as previousStep
        dummyX = x[previousStep:currentStep] # get the data between these two steps
        dummyY = y[previousStep:currentStep] # same for y
        xMid.append(np.median(dummyX)) # append the median to the x list
        yMid.append(np.min(dummyY)) # append the minimum to the y list
    previousStep = currentStep # assign previousStep as currentStep for the next loop
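In the code above, every detected step closes a segment, and the median of x and the minimum of y over that segment are stored as one representative point for it.
The result after applying this code is plotted as follows: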
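import matplotlib.pyplot as plt

plt.figure()
plt.plot(x, y, label="Raw") # plot original
plt.scatter(xMid, yMid, 10, color="r", marker="v", label="Middle") # scatter generated points
fit = np.polyfit(xMid, yMid, 1) # linear fit through the generated points
plt.plot(x, np.polyval(fit, x), "k--", label="Fitted") # plot fit
plt.grid() # apply grid
plt.legend() # labels come from the label= arguments, so they cannot get mismatched
plt.show()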
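If you want to try the idea without your Excel file, here is a minimal end-to-end sketch on a made-up staircase signal. It is a variant of the loop, not the exact same code: the function name `plateau_points`, the `jump` threshold, and all toy values are my own assumptions, it also keeps the segment after the last step, it skips lone samples caught inside a transition, and it takes the median of y instead of the minimum.

```python
import numpy as np

def plateau_points(x, y, jump=1.0):
    """Return one (median x, median y) point per plateau of a staircase signal."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # index i in step_idx means y jumps between sample i and sample i + 1
    step_idx = np.flatnonzero(np.abs(np.diff(y)) > jump)
    # segment boundaries: start of data, one past every jump, end of data
    bounds = np.concatenate(([0], step_idx + 1, [len(y)]))
    x_mid, y_mid = [], []
    for start, stop in zip(bounds[:-1], bounds[1:]):
        if stop - start < 2:  # a lone sample sits inside a transition, skip it
            continue
        x_mid.append(np.median(x[start:stop]))
        y_mid.append(np.median(y[start:stop]))
    return np.array(x_mid), np.array(y_mid)

# Toy staircase (made-up values): 5 plateaus of 20 samples, 50 units apart,
# with one intermediate sample inside every transition.
rng = np.random.default_rng(0)
levels = 1294.0 + 50.0 * np.arange(5)
y = []
for k, level in enumerate(levels):
    y.extend(level + rng.normal(0.0, 0.0003, 20))
    if k < len(levels) - 1:
        y.append(level + 38.0)  # sample caught mid-transition
y = np.array(y)
x = np.arange(len(y), dtype=float)

x_mid, y_mid = plateau_points(x, y)
slope, intercept = np.polyfit(x_mid, y_mid, 1)  # linear fit through the plateau points
print(len(x_mid), slope)
```

The returned `x_mid` and `y_mid` can be passed to the plotting code in place of `xMid` and `yMid`. Whether the median or the minimum is the better representative depends on your signal, so treat that choice as something to check against your own data.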
Try applying this approach and come back if you have more questions. I hope this helps.
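If you would rather stay close to the condition from the question (keep a point only when it differs from a neighbour by less than 0.005), a boolean mask is one way to sketch it. The helper name `stable_mask` and the rule "close to at least one neighbour" are my assumptions about what you want; the values are copied from rows 9997 to 10003 of your sample data.

```python
import numpy as np

def stable_mask(values, tol=0.005):
    """True where a sample is within tol of at least one of its neighbours."""
    values = np.asarray(values, dtype=float)
    small = np.abs(np.diff(values)) < tol          # small[i]: change from i to i + 1 is small
    close_prev = np.concatenate(([False], small))  # first sample has no predecessor
    close_next = np.concatenate((small, [False]))  # last sample has no successor
    return close_prev | close_next

# Values copied from the sample data around the transition (rows 9997 to 10003)
z = np.array([1294.005333, 1294.005333, 1294.005333, 1332.747333,
              1344.011333, 1344.012000, 1344.011667])
mask = stable_mask(z)

idx = np.arange(len(z))                             # stand-in for the time axis
slope, intercept = np.polyfit(idx[mask], z[mask], 1)  # regression on the kept points only
print(mask)
```

Plotting only the kept points is then a matter of indexing both axes with `mask`, for example `plt.plot(idx[mask], z[mask], 'o')`. Note that a fit across several plateaus mostly measures the step height, so you may want to fit each plateau separately.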
The imports so far are matplotlib and numpy.
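One more thing you will probably need for the regression on your real data: `np.polyfit` wants numbers on the x axis, not timestamps. A minimal sketch, assuming your `_time` column holds ISO strings like the ones in the sample data, is to convert them to elapsed seconds with pandas first. The variable names here are only illustrative.

```python
import numpy as np
import pandas as pd

# First four rows of the sample data
stamps = ["2024-05-21T09:46:00.1164445Z",
          "2024-05-21T09:46:01.1115275Z",
          "2024-05-21T09:46:02.1254956Z",
          "2024-05-21T09:46:03.1191685Z"]
values = np.array([1294.005333, 1294.005333, 1294.005667, 1294.005667])

t = pd.to_datetime(pd.Series(stamps))                    # timezone-aware timestamps (UTC)
elapsed = (t - t.iloc[0]).dt.total_seconds().to_numpy()  # seconds since the first sample
slope, intercept = np.polyfit(elapsed, values, 1)        # slope is in um per second
print(elapsed)
```

The slope then has units of um per second, which is usually easier to interpret than a slope per sample index.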