Why do I get different autocorrelation results from different libraries?

Question

Why do I get different autocorrelation results from different libraries?
Which one is correct?

import numpy as np
from scipy import signal

# Given data
data = np.array([1.0, 1.25, 1.5, 1.75, 2.0, 2.25, 3.33])

# Compute the autocorrelation using scipy's correlate function
autocorrelations = signal.correlate(data, data, mode='full')

# The middle of the autocorrelations array is at index len(data)-1
mid_index = len(data) - 1

# Show autocorrelation values for lag=1,2,3,4,...
print(autocorrelations[mid_index + 1:])

Output:

[21.2425 17.285  13.4525  9.8075  6.4125  3.33  ]

import pandas as pd

# Given data
data = [1.0, 1.25, 1.5, 1.75, 2.0, 2.25, 3.33]

# Convert data to pandas Series
series = pd.Series(data)

# Compute and print autocorrelation for lags 0 to len(series) - 1
for lag in range(0, len(data)):
    print(series.autocorr(lag=lag))

Output:

1.0
0.9374115462038415
0.9287843240596312
0.9260849979667674
0.9407970411588671
0.9999999999999999

from statsmodels.tsa.stattools import acf

# Your data
data = [1.0, 1.25, 1.5, 1.75, 2.0, 2.25, 3.33]

# Calculate the autocorrelation using the acf function
autocorrelation = acf(data, nlags=len(data)-1, fft=True)

# Display the autocorrelation coefficients for lags 0,1,2,3,...
print(autocorrelation)

Output:

[ 1.       0.39072553  0.13718689 -0.08148897 -0.24787067 -0.3445268 -0.35402598]

Answer 1

Your question made me curious, so I did some research and want to share what I found:

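The results differ because the three functions compute different quantities, not because one of them is wrong. scipy.signal.correlate returns the raw sum of lagged products, with no mean subtraction and no normalization. pandas Series.autocorr returns the Pearson correlation between the series and its shifted copy, so the mean and standard deviation are recomputed from the overlapping points at every lag. statsmodels acf subtracts the overall mean once and divides each lag by the lag-0 value.
There is no single universal formula for autocorrelation, so each library follows its own convention. I suggest looking at the following to understand this better.

From: https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.correlate.html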
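
To make the difference concrete, here is a minimal NumPy sketch that recomputes two of the conventions by hand on the data from the question. The variable names (raw, centered, acov, sample_acf) are illustrative only, and the sketch assumes statsmodels is called with its default settings; it is not the libraries' actual implementation.

```python
import numpy as np

data = np.array([1.0, 1.25, 1.5, 1.75, 2.0, 2.25, 3.33])
n = len(data)

# Raw sum of lagged products: no mean removal, no normalization.
# This is the kind of value scipy.signal.correlate(data, data, mode='full')
# reports for each non-negative lag.
raw = np.array([np.sum(data[: n - k] * data[k:]) for k in range(n)])

# Sample autocorrelation function: subtract the overall mean once,
# then divide every lag by the lag-0 value. This matches the convention
# statsmodels' acf follows with default arguments (assumption: defaults).
centered = data - data.mean()
acov = np.array([np.sum(centered[: n - k] * centered[k:]) for k in range(n)])
sample_acf = acov / acov[0]

print(raw[1:])      # compare with the scipy output in the question
print(sample_acf)   # compare with the statsmodels output in the question
```

Both printed arrays should agree with the corresponding outputs in the question, which suggests the libraries differ in definition rather than in numerical accuracy.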

From: https://github.com/pandas-dev/pandas/blob/v2.2.0/pandas/core/series.py#L3115-L3158

    def autocorr(self, lag: int = 1) -> float:
        """
        Compute the lag-N autocorrelation.

        This method computes the Pearson correlation between
        the Series and its shifted self.

        Parameters
        ----------
        lag : int, default 1
            Number of lags to apply before performing autocorrelation.

        Returns
        -------
        float
            The Pearson correlation between self and self.shift(lag).

        See Also
        --------
        Series.corr : Compute the correlation between two Series.
        Series.shift : Shift index by desired number of periods.
        DataFrame.corr : Compute pairwise correlation of columns.
        DataFrame.corrwith : Compute pairwise correlation between rows or
            columns of two DataFrame objects.

        Notes
        -----
        If the Pearson correlation is not well defined return 'NaN'.

        Examples
        --------
        >>> s = pd.Series([0.25, 0.5, 0.2, -0.05])
        >>> s.autocorr()  # doctest: +ELLIPSIS
        0.10355...
        >>> s.autocorr(lag=2)  # doctest: +ELLIPSIS
        -0.99999...

        If the Pearson correlation is not well defined, then 'NaN' is returned.

        >>> s = pd.Series([1, 0, 0, 0])
        >>> s.autocorr()
        nan
        """
        return self.corr(cast(Series, self.shift(lag)))
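
The docstring above says the method is a Pearson correlation between the series and its shifted copy. A rough NumPy equivalent, using only the overlapping points at each lag, might look like the sketch below. The helper name pearson_autocorr is hypothetical, and returning NaN when fewer than two pairs overlap is an assumption made for this illustration.

```python
import numpy as np

data = np.array([1.0, 1.25, 1.5, 1.75, 2.0, 2.25, 3.33])

def pearson_autocorr(x, lag):
    """Pearson correlation between x and a copy of x shifted by `lag`.

    Hypothetical helper for illustration, not pandas' implementation.
    """
    if lag == 0:
        head, tail = x, x
    else:
        # Keep only the points where both the original and the shifted
        # series have a value.
        head, tail = x[:-lag], x[lag:]
    if len(head) < 2:
        # Assumption: with fewer than two pairs the correlation is undefined.
        return float("nan")
    return float(np.corrcoef(head, tail)[0, 1])

for lag in range(len(data)):
    print(lag, pearson_autocorr(data, lag))
```

Because the mean and standard deviation are recomputed from fewer and fewer points as the lag grows, the values stay close to 1 for this steadily increasing series, and at lag 5 only two pairs remain, so the correlation is trivially about 1.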
