Why Do You Need to Be Careful With Loop Variable in Go

computers, concurrency, golang, pitfalls, programming

TL;DR

This post describes two different issues:

Race conditions when taking reference of loop variable and passing it to another goroutine:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
// WRONG: pass message by reference
for message := range inbox {
        outbox <- EnhancedMessage{
                // .. more fields here ..
                Original: &message,
        }
}

// CORRECT: pass message by value
for message := range inbox {
        outbox <- EnhancedMessage{
                // .. more fields here ..
                // Pass message by value here
                Original: message,
        }
}

See explanation here.

Race conditions when using loop variable inside of goroutine inside of loop:

1
2
3
4
5
6
7
8
9
10
11
12
13
// WRONG: use loop variable directly from goroutine
for message := range inbox {
        go func() {
                // .. do something important with message ..
        }()
}

// CORRECT: pass loop variable by value as an argument for goroutine's function
for message := range inbox {
        go func(message Message) {
                // .. do something important with message ..
        }(message)
}

See explanation here.

Taking reference of loop variable

Lets start off with simple code example:

1
2
3
4
5
6
for message := range inbox {
        outbox <- EnhancedMessage{
                // .. more fields here ..
                Original: &message,
        }
}

Looks quite legit. In practice it will often cause race conditions. Because message variable is defined once and then mutated in each loop iteration. The variable is passed pointer to some concurrent collaborators, which causes race conditions and very confusing bugs.

Above code can be re-written as:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
// local scope begins here
        var (
                message Message
                ok bool
        )
        for {
                message, ok = <-inbox
                if !ok {
                        break
                }

                outbox <- EnhancedMessage{
                        // .. more fields here ..
                        Original: &message,
                }
        }
// local scope ends here

Looking at this code, it is quite obvious why it would have race conditions.

Correct way of doing that would be to either define a new variable manually each iteration and copy message’s value into it:

1
2
3
4
5
6
7
for message := range inbox {
        m := message
        outbox <- EnhancedMessage{
                // ...
                Original: &m,
        }
}

Another way of doing that would be to take control of how loop variable works yourself:

1
2
3
4
5
6
7
8
9
10
11
for {
        message, ok := <-inbox
        if !ok {
                break
        }

        outbox <- EnhancedMessage{
                // ...
                Original: &message,
        }
}

Taking into account, that until the EnhancedMessage is processed by concurrent collaborator and garbage collected, variables, created during each iteration, i.e.: m and message for both examples, will stay in memory. Therefore it is possible to just use pass-by-value instead of pass-by-reference to achieve the same result. It is simpler too:

1
2
3
4
5
6
7
8
9
for message := range inbox {
        outbox <- EnhancedMessage{
                // ...

                // Given the fact that `EnhancedMessage.Original` definition
                // changed to be of value type `Message`
                Original: message,
        }
}

Personally, I prefer latter. If you know of any drawbacks of this approach comparing to other 2, or if you know of entirely better way of doing that, please let me know.

Running goroutine, that uses loop variable

Example code:

1
2
3
4
5
for message := range inbox {
        go func() {
                // .. do something important with message ..
        }()
}

This code might look legit too. You might think it will process whole inbox concurrently, but most probably it will process only a couple of last elements multiple times.

If you rewrite the loop in a similar fashion as in previous section, you would notice that message would be mutated while these goroutines are still processing it. This will cause confusing race conditions.

Correct way of doing that is:

1
2
3
4
5
for message := range inbox {
        go func(message Message) {
                // .. do something important with message ..
        }(message)
}

In this case, it is basically the same as copying the value to the new defined variable at each iteration of the loop. It just looks nicer.

Thanks!

If you have any questions, suggestions or just want to chat about the topic, you can ping me on twitter @waterlink000 or drop a comment on hackernews.

Especially, if you think I am wrong somewhere in this article, please tell me, I will only be happy to learn and iterate over this article to improve it.

Happy coding!

Credits