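While reading about and experimenting with numpy.random, I can't seem to find or create what I need: a 10-parameter Python pseudo-random value generator that takes count, min, max, mean, sd, 25th percentile, 50th percentile (median), 75th percentile, skew, and kurtosis.
From https://docs.python.org/3/library/random.html I see that the available distributions are uniform, normal (Gaussian), lognormal, negative exponential, gamma, and beta. However, I need to generate values directly for a distribution defined only by my 10 parameters, with no reference to a distribution family.
Is there documentation for, or an author of, something like numpy.random.xxxxxx(n, min, max, mean, sd, 25%, 50%, 75%, skew, kurtosis)? Failing that, what is the closest existing source code that I could modify to achieve this?
This would be, in a sense, the reverse of describe(), extended to include skew and kurtosis. I could loop or optimize until randomly generated numbers satisfy the criteria, though with 10 parameters that could take an unbounded amount of time.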
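I have found that optim in R can generate a data set. So far, the only routes I see are to increase the number of parameters in the R optim code or to replicate it with Python's scipy.optimize or similar, though these still rely on an optimization method rather than directly creating a data set pseudo-randomly from my 10 parameters, as I need: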
m0 <- 20
sd0 <- 5
min <- 1
max <- 45
n <- 15
set.seed(1)
mm <- min:max
x0 <- sample(mm, size=n, replace=TRUE)
objfun <- function(x) {(mean(x)-m0)^2+(sd(x)-sd0)^2}
candfun <- function(x) {
  x[sample(n, size=1)] <- sample(mm, size=1)
  return(x)
}
objfun(x0) ##INITIAL RESULT:83.93495
o1 <- optim(par=x0, fn=objfun, gr=candfun, method="SANN", control=list(maxit=1e6))
mean(o1$par) ##INITIAL RESULT:20
sd(o1$par) ##INITIAL RESULT:5
plot(table(o1$par))
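For comparison, here is a rough Python counterpart of the R snippet above. It is only a sketch: it uses plain hill climbing (accept a proposal when the objective does not get worse) rather than R's SANN simulated annealing, and the names objective and hill_climb are made up for illustration. It is not guaranteed to reach the targets exactly.

```python
import numpy as np

def objective(x, target_mean=20.0, target_sd=5.0):
    # Squared error on the mean and on the sample standard deviation
    # (ddof=1 matches R's sd()).
    return (np.mean(x) - target_mean) ** 2 + (np.std(x, ddof=1) - target_sd) ** 2

def hill_climb(n=15, lo=1, hi=45, iterations=20000, seed=1):
    rng = np.random.default_rng(seed)
    x = rng.integers(lo, hi + 1, size=n)   # initial integer sample in [lo, hi]
    start = best = objective(x)
    for _ in range(iterations):
        candidate = x.copy()
        # Same proposal as candfun: redraw one randomly chosen element.
        candidate[rng.integers(n)] = rng.integers(lo, hi + 1)
        score = objective(candidate)
        if score <= best:                  # keep proposals that are not worse
            x, best = candidate, score
    return x, start, best

x, start, best = hill_climb()
print(x, np.mean(x), np.std(x, ddof=1))
```

As with the R version, this only targets the mean and standard deviation; adding percentiles, skew, and kurtosis would mean adding more squared-error terms to the objective.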
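The general approach is to generate a uniform random number between 0 and 1 (for example with numpy.random.random()) and then transform it with the inverse CDF of the target distribution. In your case, the inverse CDF (ICDF(x)) is already fixed at five points by five of your parameters, namely the minimum, the maximum, and the three percentiles, as follows:
ICDF(0) = minimum
ICDF(0.25) = 25th percentile
ICDF(0.5) = 50th percentile (median)
ICDF(0.75) = 75th percentile
ICDF(1) = maximum
The remaining points of the inverse CDF are found by passing the loss function (_lossfunc), an initial guess, the bounds, and the other parameters to SciPy's scipy.optimize.minimize method for optimization.
import scipy.stats.mstats as mst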
from scipy.optimize import minimize
from scipy.interpolate import interp1d
import numpy

# Define the loss function, which is simply the
# sum of squared distances between the calculated
# and ideal parameters
def _lossfunc(x, *args):
    mean, vari, skew, kurt = args
    return (numpy.mean(x) - mean)**2 + \
        (numpy.var(x) - vari)**2 + \
        (mst.skew(x) - skew)**2 + \
        (mst.kurtosis(x) - kurt)**2

# Calculates an inverse CDF for the given nine parameters.
def _get_inverse_cdf(mn, p25, p50, p75, mx, mean, stdev, skew, kurt):
    # Calculate initial guess for the inverse CDF; an
    # interpolation of the inverse CDF through the known
    # percentiles
    interp = interp1d([0, 0.25, 0.5, 0.75, 1.0], [mn, p25, p50, p75, mx])
    x = interp(numpy.linspace(0, 1, 101))
    # Bounds: every point of the inverse CDF lies in [mn, mx]
    bounds = [(mn, mx) for i in range(101)]
    # Percentiles must have fixed values
    bounds[0] = (mn, mn)
    bounds[25] = (p25, p25)
    bounds[50] = (p50, p50)
    bounds[75] = (p75, p75)
    bounds[100] = (mx, mx)
    # Other parameters (the loss function works with the variance)
    otherParams = (mean, stdev**2, skew, kurt)
    # Optimize the result for the given parameters
    # using the initial guess and the bounds
    result = minimize(
        _lossfunc,    # Loss function
        x,            # Initial guess
        otherParams,  # Arguments
        bounds=bounds)
    # Check for success
    if not result.success:
        raise ValueError("optimization failed: " + str(result.message))
    # Calculate interpolating function of result
    ls = numpy.linspace(0, 1, 101)
    return interp1d(ls, result.x, kind='cubic')

def random_10params(n, mn, p25, p50, p75, mx, mean, stdev, skew, kurt):
    """ Note: Kurtosis as used here is Fisher's kurtosis,
    or kurtosis excess. Stdev is square root of numpy.var(). """
    # Calculate inverse CDF
    icdf = _get_inverse_cdf(mn, p25, p50, p75, mx, mean, stdev, skew, kurt)
    # Generate uniform random variables
    npr = numpy.random.random(size=n)
    # Transform them with the inverse CDF
    return icdf(npr)

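To see the inverse-CDF step in isolation, here is a minimal sketch that skips the optimization entirely and uses only the five quantile points, joined by straight lines. The function name sample_from_quantiles and the rounded quantile values are assumptions made for this illustration; the mean, standard deviation, skew, and kurtosis are not matched here.

```python
import numpy as np

def sample_from_quantiles(n, mn, p25, p50, p75, mx, seed=None):
    rng = np.random.default_rng(seed)
    u = rng.random(n)                      # uniform values in [0, 1)
    probs = [0.0, 0.25, 0.5, 0.75, 1.0]    # probabilities where the ICDF is known
    values = [mn, p25, p50, p75, mx]       # ICDF values at those probabilities
    # Piecewise-linear inverse CDF evaluated at the uniform draws.
    return np.interp(u, probs, values)

sample = sample_from_quantiles(100_000, -2.33, -0.67, 0.0, 0.67, 2.05, seed=0)
print(np.percentile(sample, [25, 50, 75]))
```

With a large sample, the printed percentiles should land close to the requested -0.67, 0.0, and 0.67, which is the property the fixed bounds in the optimized version are meant to preserve.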
Example:
print(random_10params(100,
-2.3263478740408408, -0.6744897501960817, 0.0, 0.6744897501960817, 2.0537489106318225,-0.023,0.875,-0.0806,-0.448))
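To check how close a generated sample comes to the requested values, something like the following could be used. This is a sketch with a hypothetical helper name, summarize; it follows the same conventions as the docstring above (standard deviation as the square root of numpy.var(), Fisher's excess kurtosis) and is demonstrated on a normal sample rather than on the output of random_10params.

```python
import numpy as np
from scipy import stats

def summarize(sample):
    sample = np.asarray(sample, dtype=float)
    p25, p50, p75 = np.percentile(sample, [25, 50, 75])
    return {
        "count": sample.size,
        "min": sample.min(),
        "p25": p25,
        "p50": p50,
        "p75": p75,
        "max": sample.max(),
        "mean": sample.mean(),
        "sd": sample.std(),                  # square root of numpy.var()
        "skew": stats.skew(sample),
        "kurtosis": stats.kurtosis(sample),  # Fisher (excess) kurtosis by default
    }

rng = np.random.default_rng(0)
for name, value in summarize(rng.normal(size=10_000)).items():
    print(f"{name}: {value:.4f}")
```

Because the sample is finite and the inverse CDF is only an approximation, the reported statistics will generally be near the targets rather than exactly equal to them.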