How to wait for all goroutines to finish without using time.Sleep?

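Question (votes: 0, answers: 6)
This code selects all xml files in the same folder as the invoked executable and asynchronously applies processing to each result in the callback method (in the example below, only the file name is printed).

How do I avoid using the sleep method to keep the main method from exiting? I have trouble wrapping my head around channels (I assume that is what is needed to synchronize the results), so any help is appreciated!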

package main

import (
    "fmt"
    "io/ioutil"
    "os"
    "path"
    "path/filepath"
    "runtime"
    "time"
)

func eachFile(extension string, callback func(file string)) {
    exeDir := filepath.Dir(os.Args[0])
    files, _ := ioutil.ReadDir(exeDir)
    for _, f := range files {
        fileName := f.Name()
        if extension == path.Ext(fileName) {
            go callback(fileName)
        }
    }
}

func main() {
    maxProcs := runtime.NumCPU()
    runtime.GOMAXPROCS(maxProcs)

    eachFile(".xml", func(fileName string) {
        // Custom logic goes in here
        fmt.Println(fileName)
    })

    // This is what i want to get rid of
    time.Sleep(100 * time.Millisecond)
}
    
go synchronization goroutine
6 Answers
261 votes
You can use sync.WaitGroup. Quoting the linked example:

package main

import (
    "net/http"
    "sync"
)

func main() {
    var wg sync.WaitGroup
    var urls = []string{
        "http://www.golang.org/",
        "http://www.google.com/",
        "http://www.somestupidname.com/",
    }
    for _, url := range urls {
        // Increment the WaitGroup counter.
        wg.Add(1)
        // Launch a goroutine to fetch the URL.
        go func(url string) {
            // Decrement the counter when the goroutine completes.
            defer wg.Done()
            // Fetch the URL.
            http.Get(url)
        }(url)
    }
    // Wait for all HTTP fetches to complete.
    wg.Wait()
}
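Applied to the question's code, the same pattern might look like the sketch below. It is only an illustration: the function name eachMatching and the sample file list are assumptions, and a plain slice of names stands in for the directory listing so that the example runs anywhere.

```go
package main

import (
	"fmt"
	"path"
	"sync"
)

// eachMatching runs callback in its own goroutine for every name with the
// given extension and returns only after all callbacks have finished.
// The names slice is a stand-in for the directory listing in the question.
func eachMatching(names []string, extension string, callback func(name string)) {
	var wg sync.WaitGroup
	for _, name := range names {
		if path.Ext(name) != extension {
			continue
		}
		wg.Add(1) // count the goroutine before it is started
		go func(n string) {
			defer wg.Done() // runs even if callback panics
			callback(n)
		}(name)
	}
	wg.Wait() // blocks until every started callback has returned
}

func main() {
	// Hypothetical sample listing; output order is not deterministic.
	names := []string{"a.xml", "b.txt", "c.xml"}
	eachMatching(names, ".xml", func(name string) {
		fmt.Println(name)
	})
	// No time.Sleep is needed: eachMatching has already waited.
}
```

Keeping the WaitGroup inside the function means callers do not have to know about it; the call simply returns once all callbacks are done.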
    

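86 votes
WaitGroups are definitely the canonical way to do this. For the sake of completeness, though, here is the solution that was commonly used before WaitGroups were introduced. The basic idea is to use a channel to say "I'm done" and have the main goroutine wait until each spawned goroutine has reported its completion.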

func main() {
    c := make(chan struct{}) // We don't need any data to be passed, so use an empty struct
    for i := 0; i < 100; i++ {
        go func() {
            doSomething()
            c <- struct{}{} // signal that the routine has completed
        }()
    }

    // Since we spawned 100 routines, receive 100 messages.
    for i := 0; i < 100; i++ {
        <-c
    }
}
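The fragment above leaves doSomething undefined. A runnable version of the same idea could be sketched as follows; runAll and the work function passed to it are hypothetical names introduced only for this illustration.

```go
package main

import "fmt"

// runAll starts n goroutines, each calling work(i), and waits for all of
// them by receiving one completion signal per goroutine.
// It returns the number of signals received.
func runAll(n int, work func(i int)) int {
	done := make(chan struct{}) // carries no data, only "I'm done"
	for i := 0; i < n; i++ {
		go func(i int) {
			work(i)
			done <- struct{}{} // signal that this goroutine has completed
		}(i)
	}

	// We started n goroutines, so receive exactly n signals.
	received := 0
	for i := 0; i < n; i++ {
		<-done
		received++
	}
	return received
}

func main() {
	squares := make([]int, 100)
	// Each goroutine writes to its own index, so no extra locking is needed.
	n := runAll(len(squares), func(i int) { squares[i] = i * i })
	fmt.Println("goroutines finished:", n)
}
```

One thing to keep in mind with this approach: if a goroutine exits without sending its signal (for example after a panic is recovered elsewhere), the receiving loop blocks forever, which is one reason a WaitGroup with defer wg.Done() is usually preferred.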
    

16 votes
sync.WaitGroup can help you here.

package main

import (
    "fmt"
    "sync"
    "time"
)

func wait(seconds int, wg *sync.WaitGroup) {
    defer wg.Done()

    time.Sleep(time.Duration(seconds) * time.Second)
    fmt.Println("Slept ", seconds, " seconds ..")
}

func main() {
    var wg sync.WaitGroup

    for i := 0; i <= 5; i++ {
        wg.Add(1)
        go wait(i, &wg)
    }
    wg.Wait()
}
    

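3 votes
Although sync.WaitGroup (wg) is the canonical way forward, it does require you to make at least some of your wg.Add calls before you call wg.Wait for everything to complete. That may not be feasible for something as simple as a web crawler, where you do not know the number of recursive calls beforehand and it takes a while to retrieve the data that drives the wg.Add calls. After all, you need to load and parse the first page before you know the size of the first batch of child pages.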

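I wrote a solution using channels, avoiding WaitGroup, in my solution to the "Tour of Go - web crawler" exercise. Each time one or more goroutines are started, you send their number to the children channel. Each time a goroutine is about to complete, you send a 1 to the done channel. When the sum of children equals the sum of done, we are finished.

My only remaining concern is the hard-coded size of the results channel, but that is a (current) Go limitation.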


// recursionController is a data structure with three channels to control our Crawl recursion.
// Tried to use sync.WaitGroup in a previous version, but I was unhappy with the mandatory sleep.
// The idea is to have three channels, counting the outstanding calls (children), completed calls 
// (done) and results (results).  Once outstanding calls == completed calls we are done (if you are
// sufficiently careful to signal any new children before closing your current one, as you may be the last one).
//
type recursionController struct {
    results  chan string
    children chan int
    done     chan int
}

// instead of instantiating one instance, as we did above, use a more idiomatic Go solution
func NewRecursionController() recursionController {
    // we buffer results to 1000, so we cannot crawl more pages than that.  
    return recursionController{make(chan string, 1000), make(chan int), make(chan int)}
}

// recursionController.Add: convenience function to add children to controller (similar to sync.WaitGroup)
func (rc recursionController) Add(children int) {
    rc.children <- children
}

// recursionController.Done: convenience function to remove a child from controller (similar to sync.WaitGroup)
func (rc recursionController) Done() {
    rc.done <- 1
}

// recursionController.Wait will wait until all children are done
func (rc recursionController) Wait() {
    fmt.Println("Controller waiting...")
    var children, done int
    for {
        select {
        case childrenDelta := <-rc.children:
            children += childrenDelta
            // fmt.Printf("children found %v total %v\n", childrenDelta, children)
        case <-rc.done:
            done += 1
            // fmt.Println("done found", done)
        default:
            if done > 0 && children == done {
                fmt.Printf("Controller exiting, done = %v, children =  %v\n", done, children)
                close(rc.results)
                return
            }
        }
    }
}
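To show how such a controller might drive a recursive traversal, here is a small self-contained sketch. It is not the author's crawler: the names controller, visit, crawl and the in-memory tree are assumptions made for illustration, and the tree is assumed to be acyclic and smaller than the results buffer. Unlike the Wait method above, this variant checks the counters after each receive instead of polling in a default branch, which avoids spinning while idle.

```go
package main

import "fmt"

// controller mirrors the three-channel idea: results, announced children,
// and completed calls.
type controller struct {
	results  chan string
	children chan int
	done     chan int
}

func newController() controller {
	// results is buffered; more than 1000 nodes would block the senders.
	return controller{make(chan string, 1000), make(chan int), make(chan int)}
}

// wait returns once every announced call has reported completion.
func (c controller) wait() {
	var children, done int
	for {
		select {
		case n := <-c.children:
			children += n
		case <-c.done:
			done++
		}
		if done > 0 && children == done {
			close(c.results)
			return
		}
	}
}

// visit records the node, announces its children BEFORE starting them,
// and only then signals its own completion.
func visit(c controller, tree map[string][]string, node string) {
	c.results <- node
	if kids := tree[node]; len(kids) > 0 {
		c.children <- len(kids)
		for _, k := range kids {
			go visit(c, tree, k)
		}
	}
	c.done <- 1
}

// crawl walks the tree concurrently and returns every visited node.
func crawl(tree map[string][]string, root string) []string {
	c := newController()
	go func() {
		c.children <- 1 // announce the root call itself
		visit(c, tree, root)
	}()
	c.wait()
	var out []string
	for r := range c.results {
		out = append(out, r)
	}
	return out
}

func main() {
	tree := map[string][]string{"root": {"a", "b"}, "a": {"c"}}
	nodes := crawl(tree, "root")
	fmt.Println("visited", len(nodes), "nodes")
}
```

The ordering inside visit matters: because a call announces its children before it reports done, the two counters can only become equal once no call is still outstanding.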

Full source code of the solution

2 votes
Here is a solution that uses WaitGroup.

First, define two utility functions:

package util

import (
    "sync"
)

var allNodesWaitGroup sync.WaitGroup

func GoNode(f func()) {
    allNodesWaitGroup.Add(1)
    go func() {
        defer allNodesWaitGroup.Done()
        f()
    }()
}

func WaitForAllNodes() {
    allNodesWaitGroup.Wait()
}

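Then replace the invocation of callback:

go callback(fileName)

with a call to your utility function:

util.GoNode(func() { callback(fileName) })

As the last step, add this line at the end of main, in place of the sleep. It ensures that the main goroutine waits for all goroutines to finish before the program exits.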
func main() {
  // ...
  util.WaitForAllNodes()
}
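Put together in one file, the pieces of this answer could look like the sketch below. As an assumption made purely to keep the example self-contained, the helpers live in package main rather than in a separate util package, and a hard-coded list of file names replaces the directory scan.

```go
package main

import (
	"fmt"
	"sync"
)

// allNodesWaitGroup tracks every goroutine started through GoNode.
var allNodesWaitGroup sync.WaitGroup

// GoNode runs f in a new goroutine and registers it with the WaitGroup.
func GoNode(f func()) {
	allNodesWaitGroup.Add(1)
	go func() {
		defer allNodesWaitGroup.Done()
		f()
	}()
}

// WaitForAllNodes blocks until all goroutines started via GoNode return.
func WaitForAllNodes() {
	allNodesWaitGroup.Wait()
}

func main() {
	// Hypothetical file names standing in for the directory listing.
	for _, name := range []string{"a.xml", "b.xml"} {
		fileName := name // per-iteration copy captured by the closure
		GoNode(func() { fmt.Println(fileName) })
	}
	WaitForAllNodes()
}
```

A package-level WaitGroup is convenient for small programs, but it is shared global state; passing a WaitGroup explicitly is usually easier to reason about in larger code.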

