Why do I get different autocorrelation results from different libraries?

  1. Why do I get different autocorrelation results from different libraries?
  2. Which one is correct?
import numpy as np
from scipy import signal

# Given data
data = np.array([1.0, 1.25, 1.5, 1.75, 2.0, 2.25, 3.33])

# Compute the autocorrelation using scipy's correlate function
autocorrelations = signal.correlate(data, data, mode='full')

# The middle of the autocorrelations array is at index len(data)-1
mid_index = len(data) - 1

# Show autocorrelation values for lag=1,2,3,4,...
print(autocorrelations[mid_index + 1:])

Output:

[21.2425 17.285  13.4525  9.8075  6.4125  3.33  ]

import pandas as pd

# Given data
data = [1.0, 1.25, 1.5, 1.75, 2.0, 2.25, 3.33]

# Convert data to pandas Series
series = pd.Series(data)

# Compute and print the autocorrelation for lags 0 to len(series) - 1
for lag in range(0, len(data)):
    print(series.autocorr(lag=lag))

Output:

1.0
0.9374115462038415
0.9287843240596312
0.9260849979667674
0.9407970411588671
0.9999999999999999

from statsmodels.tsa.stattools import acf

# Your data
data = [1.0, 1.25, 1.5, 1.75, 2.0, 2.25, 3.33]

# Calculate the autocorrelation using the acf function
autocorrelation = acf(data, nlags=len(data)-1, fft=True)

# Display the autocorrelation coefficients for lags 0,1,2,3,...
print(autocorrelation)

Output:

[ 1.       0.39072553  0.13718689 -0.08148897 -0.24787067 -0.3445268 -0.35402598]
python pandas statsmodels autocorrelation
1 Answer

Your question made me curious, so I did some research, and I'd like to share what I found:

From: https://github.com/pandas-dev/pandas/blob/v2.2.0/pandas/core/series.py#L3115-L3158

    def autocorr(self, lag: int = 1) -> float:
        """
        Compute the lag-N autocorrelation.

        This method computes the Pearson correlation between
        the Series and its shifted self.

        Parameters
        ----------
        lag : int, default 1
            Number of lags to apply before performing autocorrelation.

        Returns
        -------
        float
            The Pearson correlation between self and self.shift(lag).

        See Also
        --------
        Series.corr : Compute the correlation between two Series.
        Series.shift : Shift index by desired number of periods.
        DataFrame.corr : Compute pairwise correlation of columns.
        DataFrame.corrwith : Compute pairwise correlation between rows or
            columns of two DataFrame objects.

        Notes
        -----
        If the Pearson correlation is not well defined return 'NaN'.

        Examples
        --------
        >>> s = pd.Series([0.25, 0.5, 0.2, -0.05])
        >>> s.autocorr()  # doctest: +ELLIPSIS
        0.10355...
        >>> s.autocorr(lag=2)  # doctest: +ELLIPSIS
        -0.99999...

        If the Pearson correlation is not well defined, then 'NaN' is returned.

        >>> s = pd.Series([1, 0, 0, 0])
        >>> s.autocorr()
        nan
        """
        return self.corr(cast(Series, self.shift(lag)))
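
So pandas computes a plain Pearson correlation on the overlapping part of the series and its shifted copy, which means each side is centered on its own mean and scaled by its own standard deviation. As far as I can tell, the other two numbers come from different definitions rather than from a bug: scipy's correlate returns the raw sum of lagged products (no mean subtraction, no normalization), and statsmodels' acf, with its default settings, subtracts the overall mean and divides by the lag-0 value. The sketch below reproduces all three with NumPy only. The helper names are mine and purely illustrative, and the formulas are my reading of each library's defaults, not code taken from the libraries.

```python
import numpy as np

data = np.array([1.0, 1.25, 1.5, 1.75, 2.0, 2.25, 3.33])


def raw_lagged_sum(x, lag):
    """Unnormalized sum of lagged products (assumed to match scipy.signal.correlate)."""
    return float(np.dot(x[:len(x) - lag], x[lag:]))


def pearson_on_overlap(x, lag):
    """Pearson correlation of the overlapping parts (what Series.autocorr does)."""
    a, b = x[lag:], x[:len(x) - lag]
    return float(np.corrcoef(a, b)[0, 1])


def global_mean_acf(x, lag):
    """Deviations from the overall mean, normalized by the lag-0 sum
    (assumed to match statsmodels acf with default arguments)."""
    d = x - x.mean()
    return float(np.dot(d[:len(x) - lag], d[lag:]) / np.dot(d, d))


# Compare the three definitions for a few lags
for lag in range(1, 4):
    print(lag,
          raw_lagged_sum(data, lag),
          pearson_on_overlap(data, lag),
          global_mean_acf(data, lag))
```

For lag 1 this should give roughly 21.2425, 0.9374 and 0.3907, i.e. the three values from the question. So none of the libraries is "wrong"; they answer slightly different questions. If you want the textbook sample autocorrelation function of a time series, the statsmodels-style definition is probably the one you are after.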