Concurrency and Thread Synchronization in Go

Today I will talk about concurrency and thread synchronization in Go, which I recently started learning.

What is concurrency?

Concurrency is when two or more tasks make progress in overlapping time periods. That does not mean they literally run at the same instant. Say a process with two threads runs on a single core: you get the illusion that both threads are working at once, but what really happens is that one thread runs for a bit, then pauses while the other runs, and so on. At most one thread is executing at any given moment, but the switching is so fast that we humans never notice it. When two threads really do run at the exact same time, that is parallelism, for example on a multi-core processor where each core executes a different thread.

Take this task as an example. In the image below, the gopher (the small dude) is burning manuals one at a time.

image.png

Here's what doing the work concurrently looks like: both gophers work together, without necessarily working at the exact same moment.

image.png

Parallelism, by contrast, is like having two copies of the same picture: two gophers working at the exact same time.
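To make the distinction a bit more concrete, here is a minimal sketch of concurrency without parallelism. It assumes that limiting GOMAXPROCS to 1 is an acceptable stand-in for "a single core", and the names interleave and worker are made up for this illustration. It also uses sync.WaitGroup and sync.Mutex, both of which are covered later in this article.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// interleave (hypothetical helper) runs two goroutines while only one
// thread at a time may execute Go code, and records who took each step.
func interleave(steps int) []string {
	// Allow one thread to run Go code at a time, then restore the old value.
	prev := runtime.GOMAXPROCS(1)
	defer runtime.GOMAXPROCS(prev)

	var (
		mu    sync.Mutex // guards order
		order []string
		wg    sync.WaitGroup
	)
	worker := func(name string) {
		defer wg.Done()
		for i := 0; i < steps; i++ {
			mu.Lock()
			order = append(order, name)
			mu.Unlock()
			runtime.Gosched() // yield so the other goroutine can take a turn
		}
	}

	wg.Add(2)
	go worker("gopher A")
	go worker("gopher B")
	wg.Wait()
	return order
}

func main() {
	fmt.Println("logical CPUs:", runtime.NumCPU())
	fmt.Println(interleave(3))
}
```

Both gophers finish all of their steps even though they never run at the same instant. The exact order of the recorded steps depends on the scheduler, so it is not shown here.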

Concurrency in Go

Now that we have a brief understanding of what concurrency is, let's take a look at an example in Go.

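package main

import (
    "fmt"
    "time"
)

func main() {
    Print("Duck")
    Print("Bat")
}

// Print writes name five times, pausing one second after each line.
func Print(name string) {
    for i := 0; i < 5; i++ {
        fmt.Println(name)
        time.Sleep(time.Second)
    }
}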

Take the snippet above as a starting point. All it does so far is print the words Duck and Bat five times each, waiting one second after each print. If we run it, the output is the following:

Duck
Duck
Duck
Duck
Duck
Bat
Bat
Bat
Bat
Bat

The first function call ran to completion before the second one started. This is a synchronous flow.

Goroutines

Simply put, a goroutine is a lightweight thread of execution managed by the Go runtime. Goroutines are so lightweight that you can run hundreds of thousands of them at once, and Go schedules them onto operating system threads for you. They are also easy to use: just add the keyword go before a function call. Let's update the example above so that both Print calls run in their own goroutines.

package main

import (
    "fmt"
    "time"
)

func main() {
    // Each call now runs in its own goroutine.
    go Print("Duck")
    go Print("Bat")
}

func Print(name string) {
    for i := 0; i < 5; i++ {
        fmt.Println(name)
        time.Sleep(time.Second)
    }
}

Running the code above produces no output. Why is that? Because we started two separate goroutines, and the main goroutine, the one running the main function, returned before they had a chance to print anything. In Go, when the main function returns, the program exits and every other goroutine is terminated with it. To work around this for now, we can call fmt.Scanln at the end of main so that it blocks until we press Enter.

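package main

import (
    "fmt"
    "time"
)

func main() {
    go Print("Duck")
    go Print("Bat")
    // Block until the user presses Enter so the goroutines get time to run.
    fmt.Scanln()
}

func Print(name string) {
    for i := 0; i < 5; i++ {
        fmt.Println(name)
        time.Sleep(time.Second)
    }
}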

Running this yields output like the following (the exact order can vary between runs):

Duck
Bat
Duck
Bat
Duck
Bat
Bat
Duck
Duck
Bat

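As we can see, both goroutines execute concurrently. Even in this small example there is a difference: the synchronous version takes about 10 seconds to print everything, while this one takes about 5, because the two goroutines sleep at the same time. In more complex programs, concurrency can improve performance and reduce total running time when used correctly.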
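To check the effect on running time, here is a rough sketch that measures both approaches. The helper names work, sequential and concurrent are made up for this illustration, and the pauses are shortened to 100 ms so it finishes quickly. It waits for the goroutines with sync.WaitGroup, which is introduced right after this.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// work stands in for Print: it performs n steps and pauses after each one.
func work(n int, pause time.Duration) {
	for i := 0; i < n; i++ {
		time.Sleep(pause)
	}
}

// sequential runs the two tasks one after the other and returns the elapsed time.
func sequential(n int, pause time.Duration) time.Duration {
	start := time.Now()
	work(n, pause)
	work(n, pause)
	return time.Since(start)
}

// concurrent runs the two tasks in separate goroutines and returns the elapsed time.
func concurrent(n int, pause time.Duration) time.Duration {
	start := time.Now()
	var wg sync.WaitGroup
	wg.Add(2)
	for i := 0; i < 2; i++ {
		go func() {
			defer wg.Done()
			work(n, pause)
		}()
	}
	wg.Wait()
	return time.Since(start)
}

func main() {
	fmt.Println("sequential:", sequential(5, 100*time.Millisecond))
	fmt.Println("concurrent:", concurrent(5, 100*time.Millisecond))
}
```

The sequential run should take roughly twice as long as the concurrent one. The exact durations depend on the machine, so no numbers are given here.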

This is all fine so far, but using Scanln() to block main isn't really the best way to keep it from returning. A better option is sync.WaitGroup from Go's standard library, which lets us wait for a set of goroutines to finish before we proceed. Updating the code above to use a WaitGroup looks like this:

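package main

import (
    "fmt"
    "sync"
    "time"
)

func main() {
    var wg sync.WaitGroup
    wg.Add(2) // we will wait for two goroutines
    go Print("Duck", &wg)
    go Print("Bat", &wg)
    wg.Wait() // blocks until Done has been called twice
}

func Print(name string, wg *sync.WaitGroup) {
    for i := 0; i < 5; i++ {
        fmt.Println(name)
        time.Sleep(time.Second)
    }
    wg.Done() // tell the WaitGroup this goroutine has finished
}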

All that happened here is that we declared a variable wg of type sync.WaitGroup and called its Add method, which sets the number of goroutines we should wait for before proceeding. Each goroutine calls Done when it finishes, which tells the WaitGroup that this particular goroutine is done. Finally, wg.Wait() in main blocks until Done has been called once for every goroutine it was waiting on.

Channels

So far we have talked about concurrency and how goroutines work, but we haven't yet talked about synchronization. What if I want two goroutines to communicate with each other, or one goroutine to communicate with the main goroutine? In Go we use channels for communication between goroutines. A channel is basically a meeting point: when both goroutines reach it, they exchange data. Let's look at this more closely, starting from the snippet below.

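package main

import (
    "fmt"
    "time"
)

func main() {
    // main returns right away, so this version usually prints nothing.
    go Print("Duck")
}

func Print(name string) {
    for i := 0; i < 5; i++ {
        fmt.Println(name)
        time.Sleep(time.Second)
    }
}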

Instead of printing name inside the function, what if I want to send it to the main goroutine and print it there? How do I pass data from Print to main when Print runs in a different goroutine?

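package main

import "fmt"

func main() {
    c := make(chan string)
    go Print("Duck", c)
    name := <-c // receive: blocks until Print sends a value
    fmt.Println(name)
}

func Print(name string, c chan string) {
    c <- name // send: blocks until main is ready to receive
    close(c)  // no more values will be sent
}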

In the code above we created a channel c and passed it to the Print function. In Print, on the line c <- name, the arrow indicates the direction of the data flow, either into or out of the channel. In this case the data flows into the channel, so Print sends name on the channel. In main, the line name := <-c means data flows out of the channel into the variable name, so we can print it there.

Remember the meeting point we talked about? It's these two lines. Whichever goroutine reaches its line first blocks until the other one arrives, then they exchange the value and both carry on. A real-life example would be meeting a friend at a certain place: you arrive early and wait for them to show up before you go on with whatever you had planned. This is how channels work in Go, and how goroutines synchronize at a certain point in time.

The close function closes the channel once we have finished sending all the data we need. This prevents a deadlock where a receiver keeps waiting for values that will never be sent: a receive from a closed, empty channel returns the zero value immediately instead of blocking.

This is just a brief introduction to channels. There are also buffered channels, where a send does not block until the buffer is full. Buffered channels take a second argument to the make function, which is the size of the buffer. The snippet below shows this.

package main

import "fmt"

func main() {
    c := make(chan string, 1) // buffer holds one value
    c <- "duck"               // does not block: the buffer has room
    name := <-c
    fmt.Println(name)
}

This snippet creates a buffered channel of size 1, sends a string on it, and receives it in the same goroutine. It prints duck as expected. If I had left out the buffer size in the make call, I would get a deadlock, because the line c <- "duck" would block waiting for a receiver that never arrives. This is just a brief explanation of channels; for more information, check out the Concurrency chapter of the official Go tour.
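The earlier channel example sends a single value, so the role of close is easy to miss. Here is a small sketch with several values, where the receiver loops with range until the channel is closed. The function names produce and collect are made up for this illustration.

```go
package main

import "fmt"

// produce sends each name on the channel and then closes it,
// which tells the receiver that no more values are coming.
func produce(names []string, c chan<- string) {
	for _, name := range names {
		c <- name
	}
	close(c)
}

// collect starts a producer goroutine and gathers everything it sends.
func collect(names []string) []string {
	c := make(chan string) // unbuffered: each send waits for a receive
	go produce(names, c)

	var got []string
	for name := range c { // the loop ends once the channel is closed
		got = append(got, name)
	}
	return got
}

func main() {
	for _, name := range collect([]string{"Duck", "Bat", "Gopher"}) {
		fmt.Println(name)
	}
}
```

If produce did not call close, the range loop in collect would wait forever after the last value, and the Go runtime would report a deadlock because every goroutine is blocked.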

Mutex and locking

In the last section of this article I would also like to quickly explain mutex locking in Go and why we need it. Sometimes multiple goroutines need to access the same memory at the same time. Take the famous bank account example: the balance is 100$, one goroutine adds 10$ and another subtracts 20$. If this is not handled correctly, the user might be left with 80$ instead of 90$. How is that? Since both happen at the same time, the goroutine that subtracts is unaware that another goroutine is adding to the balance. It reads 100$ as the current balance, and while it is processing, the balance changes to 110$. Unaware of this, it overwrites the balance with 80$, so the user just lost 10$ in the process.

What is a mutex? Mutex is short for mutual exclusion. When a goroutine enters a certain section of code, called the critical section, the mutex makes sure that no other goroutine can enter that section until the first one has left it. Let's try to explain this with some code.

package main

import (
    "fmt"
    "sync"
    "time"
)

var (
    mutex   sync.Mutex // guards balance
    balance int
)

func deposit(value int, wg *sync.WaitGroup) {
    mutex.Lock()
    oldBalance := balance
    fmt.Printf("Depositing %d$ ...\n", value)
    balance = value + oldBalance
    mutex.Unlock()
    wg.Done()
}

func withdraw(value int, wg *sync.WaitGroup) {
    mutex.Lock()
    oldBalance := balance
    time.Sleep(5 * time.Millisecond) // simulate slow work inside the critical section
    fmt.Printf("Withdrawing %d$ ...\n", value)
    balance = oldBalance - value
    mutex.Unlock()
    wg.Done()
}

func main() {
    balance = 1000
    var wg sync.WaitGroup
    wg.Add(2)
    go deposit(100, &wg)
    go withdraw(200, &wg)
    wg.Wait()
    fmt.Println("The balance now is", balance)
}

This snippet has two functions, deposit and withdraw, and both start by reading the current balance, which is initially 1000$. The withdraw function sleeps for 5 ms in the middle. Without the mutex, deposit would most likely have finished by the time the sleep ends, but withdraw would not see the change: it would carry on thinking the balance is still 1000$ and overwrite the updated value, leaving 800$ instead of 900$ and costing the user 100$.

With the mutex, whichever goroutine acquires the lock first goes ahead, and the other one blocks until the lock is released. When the first goroutine finishes, it releases the lock, and the second goroutine acquires it and reads the balance, which has already been updated. Either way the final balance is 900$. This prevents many of the race conditions that occur when different goroutines access the same memory.
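A common variation, sketched below, is to keep the mutex inside a struct next to the data it protects and to release it with defer. The Account type and the runTransfers helper are made up for this illustration and are not part of the original example.

```go
package main

import (
	"fmt"
	"sync"
)

// Account (hypothetical type) keeps the mutex next to the data it protects.
type Account struct {
	mu      sync.Mutex
	balance int
}

// Deposit adds value to the balance; a negative value acts as a withdrawal.
func (a *Account) Deposit(value int) {
	a.mu.Lock()
	defer a.mu.Unlock() // released when the function returns
	a.balance += value
}

// Balance returns the current balance while holding the lock.
func (a *Account) Balance() int {
	a.mu.Lock()
	defer a.mu.Unlock()
	return a.balance
}

// runTransfers applies n deposits of 1 and n withdrawals of 1 concurrently
// and returns the final balance.
func runTransfers(start, n int) int {
	acct := &Account{balance: start}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(2)
		go func() { defer wg.Done(); acct.Deposit(1) }()
		go func() { defer wg.Done(); acct.Deposit(-1) }()
	}
	wg.Wait()
	return acct.Balance()
}

func main() {
	// Every deposit is matched by a withdrawal, so the balance is unchanged.
	fmt.Println("The balance now is", runTransfers(1000, 500)) // The balance now is 1000
}
```

Without the lock in Deposit, some updates could be lost and the final balance could differ from 1000. Go's race detector, enabled with the -race flag of go run or go test, is a handy way to catch that kind of mistake.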

If you made it this far, congrats! You now have a basic understanding of concurrency and synchronization in Go. I hope I explained this well, and if you have any comments please let me know! Thank you.
