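I'm new to Scala and am trying to create a UDF that returns a string of concatenated tuples identified by my UDF.
It should look something like this, but there are some problems: it doesn't like the variable "fines" being defined outside the block, and it won't let me change "amount", which is passed in as a parameter.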
val calculateFines: UserDefinedFunction = udf((ids: Array[Long], values: Array[Double], amount: Double, complain_id: Long) => {
  var fines
  ids.indices foreach { i => {
    val (id, value) = (ids(i), values(i))
    val penalty = if (value > amount) amount else value
    amount = amount - penalty
    fines = fines + (amount, id, complain_id, penalty).toString()
    if (amount <= 0)
      break
  }
  }
  return fines
})
You can get your code working with a few small fixes:
import org.apache.spark.sql.functions.udf
import scala.util.control.Breaks._ // Scala has no built-in break; Breaks provides breakable and break
val dysfunctional = udf((ids: Array[Long], values: Array[Double], amount: Double, complain_id: Long) => {
  var fines: String = "" // a var must be initialized
  var amountSum = amount // parameters are immutable, so copy amount into a var to allow reassignment
  breakable {
    ids.indices foreach { i =>
      val (id, value) = (ids(i), values(i))
      val penalty = if (value > amountSum) amountSum else value // cap the penalty at what is left
      amountSum = amountSum - penalty
      fines = fines + (amountSum, id, complain_id, penalty).toString()
      if (amountSum <= 0)
        break
    }
  }
  fines // the last expression is the result; no return needed
})
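To sanity-check the loop logic without a Spark session, the same body can be pulled out into a plain function. This is only a minimal sketch: the name `finesImperative`, the `Seq` parameters, and the sample numbers are made up for illustration and are not part of the original code.

```scala
import scala.util.control.Breaks.{break, breakable}

// Hypothetical plain-Scala version of the UDF body (no Spark required).
def finesImperative(ids: Seq[Long], values: Seq[Double], amount: Double, complainId: Long): String = {
  val fines = new StringBuilder
  var remaining = amount // parameters are immutable, so track the balance in a local var
  breakable {
    for (i <- ids.indices) {
      val penalty = math.min(values(i), remaining) // never charge more than what is left
      remaining -= penalty
      fines.append((remaining, ids(i), complainId, penalty).toString)
      if (remaining <= 0) break() // stop once the amount is used up
    }
  }
  fines.toString
}

// Made-up sample data: 10.0 to distribute over three ids.
val sample = finesImperative(Seq(1L, 2L, 3L), Seq(4.0, 8.0, 5.0), 10.0, 99L)
// sample == "(6.0,1,99,4.0)(0.0,2,99,6.0)"
```

The third id never appears in the result because the amount is exhausted after the second one, which is the early exit the break is there for.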
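This will work, but many people would frown on it because it is an imperative, non-functional approach, and Scala encourages writing functional code. You could try changing it to something like this: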
val moreFunctional = (ids: Array[Long], values: Array[Double], amount: Double, complain_id: Long) => {
  val states = (ids, values)
    .zipped // pair each id with its value
    .toStream // turn it into a stream to allow lazy computation
    .scanLeft((amount, "")) { // the initial state is the full amount and an empty string
      case ((remaining, fines), (id, value)) => // receives the previous state and the current element
        val penalty = if (value > remaining) remaining else value
        val left = remaining - penalty
        (left, fines + (left, id, complain_id, penalty).toString()) // next state for the following iteration
    }
  // stop at the first state whose amount is used up; if it never is, take the final state
  states.find { case (remaining, _) => remaining <= 0 }.getOrElse(states.last)._2
}
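If the lazy stream feels heavy, the same early exit can also be written as a tail-recursive helper. This is only a sketch under assumptions: `finesRecursive` and `loop` are hypothetical names, the sample data is made up, and the function takes `Seq` rather than `Array` on the assumption that Spark hands array columns to a Scala UDF as `Seq`.

```scala
import scala.annotation.tailrec

// Hypothetical helper, not part of the original answer.
def finesRecursive(ids: Seq[Long], values: Seq[Double], amount: Double, complainId: Long): String = {
  @tailrec
  def loop(pairs: List[(Long, Double)], remaining: Double, acc: String): String = pairs match {
    case (id, value) :: rest if remaining > 0 =>
      val penalty = math.min(value, remaining) // never charge more than what is left
      val left = remaining - penalty
      val entry = (left, id, complainId, penalty).toString
      loop(rest, left, acc + entry)
    case _ => acc // no more pairs, or nothing left to distribute
  }
  loop(ids.zip(values).toList, amount, "")
}

// Made-up sample data: 10.0 to distribute over three ids.
val recursiveSample = finesRecursive(Seq(1L, 2L, 3L), Seq(4.0, 8.0, 5.0), 10.0, 99L)
// recursiveSample == "(6.0,1,99,4.0)(0.0,2,99,6.0)"
```

Wrapping it as `udf(finesRecursive _)` should then make it usable from DataFrame code in the same way as the other versions, though that step is not exercised here.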