Finding where the slope changes in data, as a parameter that can be easily indexed and extracted

Question  Votes: 0  Answers: 2

I have the following data:

0.8340502011561366 0.8423491600218922 0.8513456021654467 0.8458192388553084 0.8440111276014195 0.8489589671423143 0.8738088120491972 0.8845129900705279 0.8988298998926688 0.924633964692693 0.9544790734065157 0.9908034431246875 1.0236430466543138 1.061619773027915 1.1050038249835414 1.1371449802490126 1.1921182610371368 1.2752207659022576 1.344047620255176 1.4198117350668353 1.507943067143741 1.622137968203745 1.6814098429502085 1.7646810054280595 1.8485457435775694 1.919591124757554 1.9843144220593145 2.030158014640226 2.018184122476175 2.0323466012624207 2.0179200409023874 2.0316932950853723 2.013683870089898 2.03010703506514 2.0216151623726977 2.038855467786505 2.0453923522466093 2.03759031642753 2.019424996752278 2.0441806106428606 2.0607521369415136 2.059310067318373 2.0661157975162485 2.053216429539864 2.0715123971225564 2.0580473413362075 2.055814512721712 2.0808278560688964 2.0601637029377113 2.0539429365156003 2.0609648613513754 2.0585135712612646 2.087674625814453 2.062482961966647 2.066476100210777 2.0568444178944967 2.0587903943282266 2.0506399365756396

Plotted, the data looks like this:

Plotted data

I want to find the point where the slope changes sign (I circled it in black; it should be around index 26):

Point I am trying to extract

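I need to find this change point for hundreds of files. So far I have tried the recommendation from this post:

Finding the point of a slope change as a free parameter - Python

I think that because my data is somewhat noisy, I am not getting a smooth transition in the slope.

Here is the code I have tried so far: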

import sys
import numpy as np

#load 1-D data file
file = str(sys.argv[1])
y = np.loadtxt(file)

#create X based on file length
x = np.linspace(1,len(y), num=len(y))

#Find first derivative

m = np.diff(y)/np.diff(x)
print(m)

#Find second derivative
b = np.diff(m)
print(b)
#find Index

index = 0
for difference in b:
    index += 1
    if difference < 0: 
        print(index, difference)

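Because my data is noisy, I get some negative values before the index I want. In this case, the index I want it to return is about 26 (where my data becomes constant). Does anyone have suggestions for how to solve this? Thanks!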

python numpy data-fitting
2 Answers

2 votes

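A gradient approach is not useful here, because you do not care about velocities or vector fields. Knowing the gradient adds no extra information for locating the maximum: the run (the x-step) is always positive, so it never affects the sign of the gradient. I suggest a method based entirely on the rise, that is, the difference between consecutive values.

Detect the indices where the data decreases, then take the differences between those indices and find where the largest one is. The largest gap marks the long rising stretch, and with a little index manipulation you can then find the index where the data reaches its maximum.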

data = '0.8340502011561366 0.8423491600218922 0.8513456021654467 0.8458192388553084 0.8440111276014195 0.8489589671423143 0.8738088120491972 0.8845129900705279 0.8988298998926688 0.924633964692693 0.9544790734065157 0.9908034431246875 1.0236430466543138 1.061619773027915 1.1050038249835414 1.1371449802490126 1.1921182610371368 1.2752207659022576 1.344047620255176 1.4198117350668353 1.507943067143741 1.622137968203745 1.6814098429502085 1.7646810054280595 1.8485457435775694 1.919591124757554 1.9843144220593145 2.030158014640226 2.018184122476175 2.0323466012624207 2.0179200409023874 2.0316932950853723 2.013683870089898 2.03010703506514 2.0216151623726977 2.038855467786505 2.0453923522466093 2.03759031642753 2.019424996752278 2.0441806106428606 2.0607521369415136 2.059310067318373 2.0661157975162485 2.053216429539864 2.0715123971225564 2.0580473413362075 2.055814512721712 2.0808278560688964 2.0601637029377113 2.0539429365156003 2.0609648613513754 2.0585135712612646 2.087674625814453 2.062482961966647 2.066476100210777 2.0568444178944967 2.0587903943282266 2.0506399365756396'

data = data.split()
import numpy as np

a = np.array(data, dtype=float)

diff = np.diff(a)

neg_indeces = np.where(diff<0)[0]
neg_diff = np.diff(neg_indeces)

i_max_dif = np.where(neg_diff == neg_diff.max())[0][0] + 1

i_max = neg_indeces[i_max_dif] - 1 # because the rise is a difference of two consecutive values

print(i_max, a[i_max])

Output

26 1.9843144220593145

Some details

print(neg_indeces) # all indices where the difference is negative (the data decreases)
# [ 2  3 27 29 31 33 36 37 40 42 44 45 47 48 50 52 54 56]
print(neg_diff) # gaps between consecutive such indices
# [ 1 24  2  2  2  3  1  3  2  2  1  2  1  2  2  2  2]
print(neg_diff.max()) # the largest gap
# 24
print(i_max_dif) # position in neg_indeces of the first decrease after the largest gap -> 27
# 2
print(i_max) # index of the max in the original data
# 26
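
Since the same lookup has to run over hundreds of files, the steps above could be wrapped in a small function. This is only a sketch: the name `plateau_start` is made up for illustration, and it assumes the curve has a single long rising stretch that is longer than any stretch without a decrease on the noisy plateau.

```python
import numpy as np

def plateau_start(values):
    """Hypothetical helper with the same logic as the snippet above.

    Returns one less than the index of the first decrease that follows
    the longest stretch without decreases, or None if there are fewer
    than two decreases in the data.
    """
    a = np.asarray(values, dtype=float)
    drops = np.where(np.diff(a) < 0)[0]   # indices i where a[i+1] < a[i]
    if drops.size < 2:
        return None                        # no gap between decreases to measure
    gaps = np.diff(drops)                  # spacing between consecutive decreases
    first_drop_after_run = drops[gaps.argmax() + 1]
    return int(first_drop_after_run) - 1
```

For each file you would call something like `plateau_start(np.loadtxt(path))`. On the data from the question this follows the same steps as above and gives 26.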

2 votes

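The slope changes sign exactly when the first derivative changes sign. I don't think you need the second derivative unless you want to determine how fast the slope is changing. Strictly speaking, np.diff(m) is also just the difference between consecutive first-derivative values; it only matches a second-derivative estimate because your x-spacing happens to be 1.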
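
As a minimal sketch of the sign-change idea, the check can be written without a loop. The toy array below is made up for illustration and unit x-spacing is assumed. Note that a slope of exactly zero has sign 0 and would also be reported as a change.

```python
import numpy as np

# Toy data (an assumption for illustration): rises, falls, then rises again.
y = np.array([0.0, 1.0, 2.0, 1.5, 1.0, 2.0, 3.0])

m = np.diff(y)          # first differences; unit x-spacing assumed
signs = np.sign(m)      # +1 rising, -1 falling, 0 flat

# np.diff(signs) is nonzero between m[i-1] and m[i] when their signs differ.
# The +1 makes i point at the first slope with the new sign, which is also
# the index of the turning point in y, because m[i] = y[i+1] - y[i].
change_idx = np.where(np.diff(signs) != 0)[0] + 1

print(change_idx)  # [2 4]
```

Here index 2 is the local peak (2.0) and index 4 is the local trough (1.0). On noisy data this reports every small wiggle, which is the problem described in the question.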

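Also, you appear to be assigning arbitrary x values. If the y values represent equally spaced points, that is fine; otherwise the derivative will be wrong.

Here is an example of how to get the first and second derivatives...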
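
If the samples are not equally spaced, one option is to pass the real x coordinates to `np.gradient`, which accepts them as a second argument. A small sketch, with made-up x values chosen only to show uneven spacing:

```python
import numpy as np

# Deliberately uneven sample positions (an assumption for illustration).
x = np.array([0.0, 0.5, 1.5, 2.0, 3.5, 4.0])
y = x ** 2

# The second argument supplies the coordinates of the samples, so the
# spacing between neighbours is taken into account at every point.
dy_dx = np.gradient(y, x)

print(dy_dx)
```

Unlike `np.diff`, the result has the same length as `y`. For this quadratic the interior values equal 2*x; the two end points use one-sided differences and are less accurate.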



import numpy as np

x = np.linspace(1, 100, 1000)

y = np.cos(x)

# Find first derivative:
m = np.diff(y)/np.diff(x)

#Find second derivative
m2 = np.diff(m)/np.diff(x[:-1])

print(m)
print(m2)

# Get x-values where slope sign changes

c = len(m)

changes_index = []
for i in range(1, c):
    prev_val = m[i-1]
    val = m[i]
    if prev_val < 0 and val > 0:
        changes_index.append(i)
    elif prev_val > 0 and val < 0:
        changes_index.append(i)

for i in changes_index:
    print(x[i])


Note that I had to trim the x values for the second derivative. This is because np.diff() returns one fewer point than its input.
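
For noisy data like the data in the question, a raw sign-change test fires on every small wiggle. One possible workaround, sketched here as an assumption rather than as part of the answer above, is to smooth with a moving average and look for the first place after the steepest rise where the smoothed slope falls below a fraction of its maximum. The function name `first_flat_index` and the parameters `window` and `frac` are hypothetical and would need tuning per data set.

```python
import numpy as np

def first_flat_index(y, window=5, frac=0.1):
    """Hypothetical helper: estimate where a rising curve levels off.

    Smooths y with a moving average of length `window`, then returns the
    approximate index in y where the smoothed slope first drops below
    `frac` times its maximum, searching from the steepest point onward.
    Returns None if the slope never drops that low.
    """
    y = np.asarray(y, dtype=float)
    kernel = np.ones(window) / window
    smooth = np.convolve(y, kernel, mode="valid")  # smooth[k] = mean(y[k:k+window])
    slope = np.diff(smooth)
    peak = slope.argmax()                          # steepest part of the rise
    below = np.where(slope[peak:] < frac * slope.max())[0]
    if below.size == 0:
        return None
    # smooth[k] is centred on y[k + window // 2], so shift back to y's indexing.
    return int(peak + below[0] + window // 2)
```

The result is only an estimate: a larger `window` suppresses more noise but blurs the corner, so the returned index can sit a couple of samples away from the true change point.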
