I am working on a formulation of a resource-optimization problem and writing GEKKO code to solve it. The problem statement is as follows: suppose there are 2 workers and 4 tasks. Each worker earns a reward for the task it selects. Each task can be assigned to at most one worker, and each worker can select only one task. The objective is to maximize the total reward earned by the two workers.
The requirement on the decision variables is as follows:
The reward given to a worker is computed in proportion to the ratio of the selected task's weight to the sum of the weights of all tasks assigned to the worker.
Here is the code for the same:
import numpy as np
from gekko import GEKKO
rewards = np.array([[2,5,8,10],[1,5,7,11]])
m = GEKKO(remote=False)
allocation = m.Array(m.Var,(2,4),lb=0,ub=1, integer=True)
weights = m.Array(m.Var,4,lb=0,ub=1)
def reward(allocation, weights, rewards):
    temp = np.copy(allocation)
    #sum_ = np.sum(weights)
    for i in range(allocation.shape[1]):
        temp[:, i] *= weights[i]  #/sum_
    total_rewards = np.sum(temp.flatten() * rewards.flatten())
    return total_rewards
for j in range(4):
    m.Equation(m.sum(allocation[:,j])<=1)
m.Maximize(reward(allocation, weights, rewards))
m.options.SOLVER = 1 # change solver (1=APOPT,3=IPOPT)
#m.open_folder()
m.solve()
print('allocation', allocation)
print('weights', weights)
print('Objective: ' + str(m.options.objfcnval))
In the reward function, I commented out the "sum_" variable. In that case, the output is:
----------------------------------------------------------------
APMonitor, Version 1.0.1
APMonitor Optimization Suite
----------------------------------------------------------------
--------- APM Model Size ------------
Each time step contains
Objects : 0
Constants : 0
Variables : 16
Intermediates: 0
Connections : 0
Equations : 5
Residuals : 5
Number of state variables: 16
Number of total equations: - 4
Number of slack variables: - 4
---------------------------------------
Degrees of freedom : 8
----------------------------------------------
Steady State Optimization with APOPT Solver
----------------------------------------------
Iter: 1 I: 0 Tm: 0.00 NLPi: 4 Dpth: 0 Lvs: 3 Obj: -1.63E+01 Gap: NaN
--Integer Solution: 0.00E+00 Lowest Leaf: -1.63E+01 Gap: 2.00E+00
Iter: 2 I: 0 Tm: 0.00 NLPi: 2 Dpth: 1 Lvs: 2 Obj: 0.00E+00 Gap: 2.00E+00
Iter: 3 I: 0 Tm: 0.00 NLPi: 3 Dpth: 1 Lvs: 4 Obj: -2.60E+01 Gap: 2.00E+00
--Integer Solution: -2.60E+01 Lowest Leaf: -2.60E+01 Gap: 0.00E+00
Iter: 4 I: 0 Tm: 0.00 NLPi: 1 Dpth: 2 Lvs: 4 Obj: -2.60E+01 Gap: 0.00E+00
Successful solution
---------------------------------------------------
Solver : APOPT (v1.0)
Solution time : 1.729999999224674E-002 sec
Objective : -26.0000000000000
Successful solution
---------------------------------------------------
allocation [[[1.0] [1.0] [1.0] [0.0]]
[[0.0] [0.0] [0.0] [1.0]]]
weights [[1.0] [1.0] [1.0] [1.0]]
Objective: -26.0
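For reference, this objective can be reproduced from the reported solution with plain NumPy (GEKKO reports -26 because it minimizes the negative of the maximized reward). This is just a sanity check outside the model, not part of the GEKKO script:

```python
import numpy as np

# Recompute the unnormalized objective from the solver's reported solution.
rewards = np.array([[2, 5, 8, 10], [1, 5, 7, 11]])
allocation = np.array([[1., 1., 1., 0.],
                       [0., 0., 0., 1.]])   # reported allocation
weights = np.array([1., 1., 1., 1.])        # reported weights

temp = allocation * weights                 # weight each task column
total_reward = np.sum(temp * rewards)
print(total_reward)                         # 2 + 5 + 8 + 11 = 26.0
```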
But if I uncomment the "sum_" variable, i.e., use the ratio of a task's weight to the total weight of the tasks assigned to the worker, I get an objective value of NaN:
----------------------------------------------------------------
APMonitor, Version 1.0.1
APMonitor Optimization Suite
----------------------------------------------------------------
--------- APM Model Size ------------
Each time step contains
Objects : 0
Constants : 0
Variables : 16
Intermediates: 0
Connections : 0
Equations : 5
Residuals : 5
Number of state variables: 16
Number of total equations: - 4
Number of slack variables: - 4
---------------------------------------
Degrees of freedom : 8
----------------------------------------------
Steady State Optimization with APOPT Solver
----------------------------------------------
Iter: 1 I: 0 Tm: 0.01 NLPi: 2 Dpth: 0 Lvs: 0 Obj: NaN Gap: 0.00E+00
Successful solution
---------------------------------------------------
Solver : APOPT (v1.0)
Solution time : 2.390000000013970E-002 sec
Objective : NaN
Successful solution
---------------------------------------------------
allocation [[[0.0] [0.0] [0.0] [0.0]]
[[0.0] [0.0] [0.0] [0.0]]]
weights [[0.0] [0.0] [0.0] [0.0]]
Objective: nan
Please let me know how to resolve this, or any pointers for debugging it.
Thanks
Set a lower bound on sum_ to avoid division by zero in the denominator:
sum_ = m.Var(lb=1e-3)
m.Equation(sum_ == np.sum(weights))
The NaN occurs because the default guess value is 0, so the sum is also initially zero. Here is a complete script that solves successfully:
import numpy as np
from gekko import GEKKO
rewards = np.array([[2,5,8,10],[1,5,7,11]])
m = GEKKO(remote=False)
allocation = m.Array(m.Var,(2,4),lb=0,ub=1, integer=True)
weights = m.Array(m.Var,4,lb=0,ub=1)
def reward(allocation, weights, rewards):
    temp = np.copy(allocation)
    sum_ = m.Var(lb=1e-3)
    m.Equation(sum_ == np.sum(weights))
    for i in range(allocation.shape[1]):
        temp[:, i] *= weights[i]/sum_
    total_rewards = np.sum(temp.flatten() * rewards.flatten())
    return total_rewards
for j in range(4):
    m.Equation(m.sum(allocation[:,j])<=1)
m.Maximize(reward(allocation, weights, rewards))
m.options.SOLVER = 1 # change solver (1=APOPT,3=IPOPT)
m.solve()
print('allocation', allocation)
print('weights', weights)
print('Objective: ' + str(m.options.objfcnval))
Here is the output:
----------------------------------------------------------------
APMonitor, Version 1.0.0
APMonitor Optimization Suite
----------------------------------------------------------------
--------- APM Model Size ------------
Each time step contains
Objects : 0
Constants : 0
Variables : 17
Intermediates: 0
Connections : 0
Equations : 6
Residuals : 6
Number of state variables: 17
Number of total equations: - 5
Number of slack variables: - 4
---------------------------------------
Degrees of freedom : 8
----------------------------------------------
Steady State Optimization with APOPT Solver
----------------------------------------------
Iter: 1 I: 0 Tm: 0.00 NLPi: 7 Dpth: 0 Lvs: 3 Obj: -1.10E+01 Gap: NaN
--Integer Solution: -1.10E+01 Lowest Leaf: -1.10E+01 Gap: 0.00E+00
Iter: 2 I: 0 Tm: 0.00 NLPi: 2 Dpth: 1 Lvs: 3 Obj: -1.10E+01 Gap: 0.00E+00
Successful solution
---------------------------------------------------
Solver : APOPT (v1.0)
Solution time : 0.02 sec
Objective : -11.
Successful solution
---------------------------------------------------
allocation [[[0.0] [0.0] [1.0] [0.0]]
[[0.0] [0.0] [0.0] [1.0]]]
weights [[0.0] [0.0] [0.0] [0.0020037011152]]
Objective: -11.0
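As a sanity check, the objective of -11 (a maximized reward of 11) can be reproduced from the reported allocation and weights in plain NumPy. This snippet is illustrative only and not part of the GEKKO model:

```python
import numpy as np

# Recompute the normalized objective from the reported solution above.
rewards = np.array([[2, 5, 8, 10], [1, 5, 7, 11]])
allocation = np.array([[0., 0., 1., 0.],
                       [0., 0., 0., 1.]])         # reported allocation
weights = np.array([0., 0., 0., 0.0020037011152]) # reported weights

sum_ = np.sum(weights)                # ~0.002, above the 1e-3 lower bound
temp = allocation * (weights / sum_)  # ratio of each task weight to the total
total_reward = np.sum(temp * rewards)
print(total_reward)                   # task 4 carries the full ratio: 11 * 1 = 11.0
```

Note that only the ratio weights/sum_ matters to the objective, which is why the solver is free to leave the weights near zero as long as their sum stays above the bound.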