
From Wikipedia, the free encyclopedia

Rob Pike
Rob Pike at OSCON 2010
Born: 1956 (age 67–68)
Nationality: Canadian
Alma mater:
Occupation: Software engineer
Known for: Plan 9, UTF-8, Go
Spouse: Renée French
Website: herpolhode.com/rob/

Robert Pike (born 1956) is a Canadian programmer and author. He is best known for his work on the Go programming language while working at Google[1][2] and the Plan 9 operating system while working at Bell Labs, where he was a member of the Unix team.[1]

Pike wrote the first window system for Unix in 1981.[3][non-primary source needed] He is the sole inventor named in the US patent for overlapping windows on a computer display.[4]

With Brian Kernighan, he is the co-author of The Practice of Programming and The Unix Programming Environment.[1] With Ken Thompson, he is the co-creator of UTF-8.[1]

YouTube Encyclopedic

  • Google I/O 2012 - Go Concurrency Patterns (772,992 views)
  • Google I/O 2013 - Fireside Chat with the Go Team (45,424 views)
  • Google I/O 2012 - Meet the Go Team (85,041 views)

Transcription

ROB PIKE: Hi, everyone. Wow. Voice is always so big. Thanks for coming. I'm going to be talking today about Go concurrency patterns. My name is Rob Pike, and I work on the Go team at Google. First of all, let me give us a little background. When Go came out in November of 2009, a lot of people were immediately fascinated by the features of Go that provided concurrency, and wanted to play with them and goof around. But also, a lot of people had questions about them. And there's a question about why concurrency's even in the language. What exactly do I mean by concurrency? What is the origin of the ideas? And what is it all useful for? And today I'm going to talk about all those things, but I'm mostly going to talk about how you use the features. But I need to give you a little background. So if you look around in the world at large, what you see is a lot of independently executing things. You see people in the audience doing their own thing, tweeting while I'm talking and stuff like that. There's people outside, there's cars going by. All those things are independent agents, if you will, inside the world. And if you think about writing a computer program, if you want to simulate or interact with that environment, a single sequential execution is not a very good approach. And so concurrency is really a way of writing or structuring your program to deal with the real world. And what I mean by concurrency, we define as the composition of independently executing computations. And I want to stress that concurrency is a way of structuring software. It's a way of thinking about how to write clean code that can interact with the real world, and maybe simulate the real world or behave as an agent inside the real world, and be a good actor in that environment. Concurrency is really not parallelism. Although a lot of people got confused by that, and I think when Go first launched, a lot of people thought Go was a parallel language. It's not really a parallel language. 
It's a concurrent one. And one way to realize the difference is to imagine-- if you have a concurrent piece of code that you've written that uses the concurrency features of Go or any other concurrent language, but you only run it on a single processor, then it's certainly not a parallel program, because it's not executing anything in parallel. But it can still have concurrent structure. And even on a single processor, that could be a useful way to model the way this stuff works. Now I actually gave a talk at the beginning of the year at a Heroku conference, which you can find online, where I talk in great detail about what the difference is between concurrency and parallelism. I don't want to spend too much time on it today. But I want to stress that I'm talking about concurrency here, not parallelism. And just to stress this, concurrency is a model for software construction. And the reason it's valuable is that to interact with the real world, you have to figure out how to write your software to do that. And the concurrency features provided by Go, but also in other languages, are easy to understand. They're easy to use. They're easy to reason about-- which is really important, because you don't need to be an expert to use them. You can use the concurrency features of Go without understanding all of the minutiae of memory barriers and threads and condition variables and all the other stuff that people often think of as the standard way to program parallel or concurrent applications. It's just not true. You can work at a much higher level and make it a lot easier for yourself. And in fact, a lot of people have used Go to write concurrent programs who've never done concurrent programming before, and they find it's actually quite easy to do. Now to many people, as I said, when Go came out, the concurrency stuff seemed kind of new and intriguing. It's not actually totally original.
In fact, there's a very long history of concurrent ideas and programming languages that leads towards Go, and I've listed some of them here. The most important one is a seminal paper, which I actually recommend every computer programmer in the world should read, is by Tony Hoare, in 1978, called "Communicating Sequential Processes." And that is the fundamental paper. All the real ideas are in there. And all these other languages build on the ideas from that paper to construct real programming environments. Probably the most original language that came out of this was this thing called Occam, which was the programming language for the transputer. It came out in the early '80s. You've mostly probably heard of Erlang. And there's another branch that I had a role in, with languages like Newsqueak and Limbo. There's also Concurrent ML, done by John Reppy, which is actually a lovely language. It's a functional language, but it's concurrent. It's kind of amazing, but it's another level of intellectual depth that some programmers aren't as comfortable with. But it is a beautiful language. Go is on the branch of the Newsqueak-Limbo-Alef sequence. And the thing that distinguishes them from most of these others is that it has the idea of a channel as a first-class value. In the original Hoare CSP, you communicated directly with a process by name, and that's essentially the way Erlang still works today. But in Go, instead you don't talk to a process. You talk to a channel, and the other end of that channel is some other thing that could be reading the values that you're sending to it. And one way to understand that distinction is that the Erlang original CSP idea is a little bit like writing to a file by name, whereas in Go, the channel idea is more like writing to a file descriptor. And the level of indirection that it provides is actually kind of important to a lot of things that get done. But I want to stress that one model isn't better than the other. 
They're actually formally equivalent. You can write one form in terms of the other. But Go definitely is about channels. So enough of this sort of theoretical nonsense. Let's actually show some code. Now one of the problems with illustrating concurrent programming is that the ideas tend to be subtle if you've never seen them before. And people tend to write complicated examples to demonstrate them. And I don't want to do that. So instead, I'm going to make all of my code today be around really boring things. So these are going to be really boring programs, but they're going to use concurrency to make the boring a little less interesting. And this gives it a focus on the concurrency rather than the boring elements that are actually being made concurrent. So here's a particularly boring program. All it does is print very slowly, once per second, la la la. Now I'd like to stress, actually, what's going on here is that every time I hit the run button here-- this is an HTML5 presentation-- it compiles the binary out of the HTML, runs it, and pops up this window talking with a web socket back to the browser. And it's all done in Go. By doing it this way, you get to see the actual code running, and you also trust that what's going on here is really what I'm showing you. It's really important for the concurrency because timing things come in. So let's make this program a little less boring by putting in a random sleeping interval so that they run a little less deterministically, so there's a little randomness in there. OK. Pretty boring program, right? You still with me? All right. Now let's run this thing. So it's pretty easy. Here's the whole program except for some boilerplate at the top, and there's the whole program running out of the thing, and la la la. OK. Extremely boring. Good. Now boring things are things you want to ignore. So let's ignore that. To do that, we run the boring function as a goroutine. I'll talk a little bit more about goroutines in a minute.
But for now, just think of a goroutine as being analogous to launching a shell command with an ampersand on the end of the line. It says, run this function in the background. I don't care about it. I'm just going to keep executing directly. So let's run this. Something went wrong. That's not actually a bug. This is actually what Go does. What happened down here is that we launched this goroutine, and then main immediately returned because we didn't have to wait. But the way Go is defined, when main returns, the program actually exits. And so this is sort of a feature of Go that's deliberate. When main returns, the program goes away. But it also confuses people, at first. So I put it in the very first example. So when you run the program, it launches the goroutine, but the program returns and we're done. So let's make that a little more interesting by doing something after we launch the goroutine. So here we launch it, and now we print a couple of things out, wait for a while, and then exit. So now you can see that, in fact, we are running both the boring function and the main function concurrently. Right? Very, very simple. So what are these goroutines? Well, it's an independently executing function. You can think of, when you run a function, normally you wait for the function to complete, and you get the answer back. The goroutine says, run the function, but I don't want to wait for the answer to come back. Just go and execute independently of me. So this is the sort of fundamental idea of composing. So in Go, concurrency is the composition of independently executing goroutines. It's got its own stack. And the stack actually grows and shrinks as required. So unlike some threading libraries, where you have to say how big the stack is, it's never an issue in Go. You don't have to say how big the stack is. It will be made as big as it needs to be, and if the stack grows, the system will take care of the stack growth for you. And they start out very small.
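The shell-ampersand analogy above can be sketched as a small, self-contained program. The `boring` name follows the talk; the specific sleep durations here are placeholders, not the talk's exact values:

```go
package main

import (
	"fmt"
	"time"
)

// boring prints its message a few times, pausing briefly between lines
// so the interleaving with main is visible.
func boring(msg string) {
	for i := 0; i < 3; i++ {
		fmt.Println(msg, i)
		time.Sleep(10 * time.Millisecond)
	}
}

func main() {
	// "go" launches boring in the background, like "cmd &" in a shell.
	go boring("boring!")
	fmt.Println("I'm listening.")
	// If main returned immediately, the program would exit before the
	// goroutine printed anything; sleeping keeps main alive long enough.
	time.Sleep(100 * time.Millisecond)
	fmt.Println("You're boring; I'm leaving.")
}
```

Dropping the `time.Sleep` in `main` reproduces the "something went wrong" demo from the talk: main returns and the goroutine never gets to print.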
So it's very cheap and practical to have thousands or even tens of thousands of goroutines running. And I'll show you an example, later, where we have way more than that. And we've even seen large jobs running in production with millions of them. So they're very, very cheap things. They're not threads. However, from the point of view of understanding them, it's not misleading to think of a goroutine as just an extremely cheap thread. Because that's really, in effect, what they are. What happens in the runtime is goroutines are multiplexed onto threads that are created as needed in order to make sure no goroutine ever blocks. So you just launch the goroutines, and you don't have to think about it. And that's kind of the point. Now our example that we ran before, where we started the goroutine, printed a bit, and then exited, that was actually a cheat. Because the goroutines were independently executing, but they were not communicating or synchronizing their behavior in any way. I just started this guy, this program talked, and the other one kept running, which isn't a very helpful example. And so to do a proper concurrent program, you need to be able to communicate among the goroutines inside it. And to do that, there is a concept of a channel in Go. And as I mentioned earlier, channels are sort of a fundamental concept in Go. And they're first-class values, and they're really interesting. And we'll use them to do some interesting stuff. But the basics are very straightforward. The first block there shows you how to declare and initialize a channel. So var c chan int says declare a variable called c that has type channel of integer. And channels are types, so the values that go on the channels have a static type, in this case int, the default integer type. And then to create a channel, you actually have to initialize it by saying make of chan int. And in Go syntax, you typically use the := notation.
So those two lines above are equivalent to this: c := make(chan int). And you tend to see that kind of code all the time. When you want to send a value on a channel, you use the left-pointing arrow operator. And so this sends the value 1 on the channel. And then to receive it, you put the arrow on the other side of the c, and now the value comes out of the channel, so that's a receive operation. So if one goroutine says c gets 1, then the other goroutine says, receive from c and store the value. That sends this 1 into the channel and then out again into the variable. And the way to think about the mnemonic here is that the arrow points in the direction in which the channel is sending data. So here, the 1 is pointing into the channel. Here the arrow's pointing out of the channel and delivering it to the value. So let's use the channel to do something. Let's make this program we have a little more honest. So here's our main function. And now, this time in a loop, we actually say printf, you say some value, and here is that receive on the channel, which we made up here. And then in the boring function, we just loop forever, sending a printed message to the channel. So if we run this, you can see the program running, and then for five times, it says that. So this is honest, right? This main and the boring function are independently executing, but they're also communicating in the strong sense. So there's a point about what's going on here, which is, obviously when you read from a channel, you have to wait for there to be a value there. It's a blocking operation. But also when you send a value, it's a blocking operation. When you send a value on a channel, the channel blocks until somebody's ready to receive it. And so as a result, if the two goroutines are executing, and this one's sending, and this one's receiving, whatever they're doing, when they finally reach the point where the send and receive are happening, we know that's like a lockstep position.
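The declare/make/send/receive sequence just described, in runnable form. Because an unbuffered send blocks until someone is ready to receive, the send has to happen in a separate goroutine:

```go
package main

import "fmt"

func main() {
	// Declare and initialize: a channel carrying ints.
	// Shorthand for: var c chan int; c = make(chan int)
	c := make(chan int)

	// Send in another goroutine; a send on an unbuffered channel
	// blocks until a receiver is ready.
	go func() {
		c <- 1 // arrow points into the channel: send
	}()

	v := <-c // arrow points out of the channel: receive
	fmt.Println(v)
}
```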
Those two goroutines are at that communication point-- the send on this side and the receive on this side. So it's also a synchronization operation as well as a send and receive operation. And channels thus communicate and synchronize in a single operation. And that's a pretty fundamental idea. Now for those experts in the room who know about buffered channels in Go-- which exist-- you can create a channel with a buffer. And buffered channels have the property that they don't synchronize when you send, because you can just drop a value in the buffer and keep going. So they have different properties. And they're kind of subtle. They're very useful for certain problems, but you don't need them. And we're not going to use them at all in our examples today, because I don't want to complicate life by explaining them. But the thing about buffered channels is they're much more like mailboxes in Erlang, so I thought it was worth mentioning. OK so given this idea of communication coupled with synchronization that Go's channels provide, the Go approach to concurrent software can be characterized as, don't communicate by sharing memory. Share memory by communicating. In other words, you don't have some blob of memory and then put locks and mutexes and condition variables around it to protect it from parallel access. Instead, you actually use the channel to pass the data back and forth between the goroutines and make your concurrent program operate that way. So based on those principles, we can now start to do some, what I call, concurrency "patterns." And I put it in quotes because I don't want you to think of these as being like object-oriented patterns. They're just very simple, little, tiny examples that do interesting things. So the first and probably most important concurrency pattern is what I call a generator, which is a function that returns a channel. 
So all of these examples for the next few slides are going to be variants of the things I showed you before, but I'm going to do them in different ways. So in this case, what we do is we take our boring function, and we give it a return value. We say that this boring function is going to return a channel of string, and this less-than means that the return value is a channel that you can only receive from because the main function is only going to use it as a received value. So in this case, in the main function, we call the boring function, and it returns the channel. And then this is the same as before. We just receive and print the values that come out. And in the boring function, instead of sort of looping forever, with the goroutine being launched in main, we actually launch the goroutine inside the boring function itself. You can see here, it says go func. This is a function literal. So this is the loop that we had before, but it's wrapped inside an anonymous function literal and then launched with a Go keyword at the top. And so this starts the computation and then returns back to the caller the channel with which to communicate to that process that's running, that goroutine. And so from the outside, this just looks like a function whose invocation returns a channel, and internally, it actually starts a computation running concurrently. So let's run that. This will behave exactly the same way, but now we've got a much nicer pattern for constructing this service. And in fact, this is very much like having a service. We could use multiple instances of this boring function, give them different names, and we'd have different services, each of which are talking to us. So here's a simple example where we actually launch the boring function twice, and then in the loop, just print a value from Joe and a value from Ann. And it's exactly the same boring function as previously. We're just using it in a different way. 
So you can see, you get a 0 from Joe and Ann, a 1 from Joe and Ann, a 2 from Joe and Ann, and so on. Now inside here, we're reading a value from Joe and a value from Ann. And because of the synchronization nature of the channels, the two guys are taking turns, not only in printing the values out, but also in executing them. Because if Ann is ready to send a value but Joe hasn't done that yet, Ann will still be blocked, waiting to deliver the value to main. Well, that's a little annoying because maybe Ann is more talkative than Joe and doesn't want to wait around. So we can get around that by writing a fan-in function or a multiplexer. And to do that-- here's this fan-in function-- we actually stitch the two guys together with the fan-in function and construct a single channel, from which we can receive from both of them. And the way to think of it is a little bit like this. Here's Joe, and here's Ann, and they're both talking. And then the fan-in function here is taking values from Joe and values from Ann and just forwarding them on to the output of the fan-in function. Right? And to do that, again, it's using the generator pattern. The fan-in function is itself a function that returns a channel. It takes two channels as input, returns another channel as its return value. And what we do is, again, make the channel and return it. But internally, we launch two independent goroutines, one copying the output from input1 to the channel, and the other one copying the data from input2 to the channel. So when we call fanIn of boring Joe and boring Ann, remember, they return channels. They become the arguments to the fan-in function, and then we launch two more goroutines to copy the data. And so we run this guy. Ann and Joe are now completely independent. And you can see, when it runs, that they run in not necessarily sequential order. Because there, Joe gives three communications in a row before Ann had anything else to say. OK?
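The fan-in described above, as a sketch: one forwarding goroutine per input, copying onto a single output channel. The random sleep inside `boring` is an assumption standing in for the talk's "random sleeping interval":

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

func boring(msg string) <-chan string {
	c := make(chan string)
	go func() {
		for i := 0; ; i++ {
			c <- fmt.Sprintf("%s %d", msg, i)
			time.Sleep(time.Duration(rand.Intn(20)) * time.Millisecond)
		}
	}()
	return c
}

// fanIn stitches two input channels onto one output channel, with one
// copying goroutine per input, so neither speaker waits for the other.
func fanIn(input1, input2 <-chan string) <-chan string {
	c := make(chan string)
	go func() {
		for {
			c <- <-input1
		}
	}()
	go func() {
		for {
			c <- <-input2
		}
	}()
	return c
}

func main() {
	c := fanIn(boring("Joe"), boring("Ann"))
	for i := 0; i < 6; i++ {
		fmt.Println(<-c) // Joe's and Ann's messages, interleaved arbitrarily
	}
	fmt.Println("You're both boring; I'm leaving.")
}
```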
So that decouples the execution of those guys. So even though it's all synchronous, they can independently execute. So what if, for some reason, we actually don't want that? We wanted to have them be totally lockstep and synchronous. Well to do that, remember I said these channels are first-class values in Go. That means that we can pass a channel on a channel. And so we can send inside a channel another channel to be used for the answer to come back. To do this, what we do is we construct a message structure that includes the message that we want to print, and we include inside there another channel that is what we call a wait channel. And that's like a signaler. And the guy will block on the wait channel until the person says, OK, I want you to go ahead. So let's see how that works. So here's our main loop-- again, just five times around the loop. We receive a message from the first person and send it-- this is, again, the fan-in functions on the other side of this. Receive another one and print that, and then wait them out. Now notice that this is message1 and message2. They have channels inside them that are the guys that are going to be used to do the sequencing. So inside the boring functions, now, we have this wait-for-it channel, and then everybody blocks, waiting for a signal to advance. So when we run it now, you can see they're back in lockstep, because even though the timing is random, the sequencing here, with message1 wait and message2 wait, means that the independently executing guys are waiting on different channels for the signal to advance. Now that's all fairly simple and kind of silly. We can make it a little more interesting by using the next part of concurrency in Go, which is the select statement. And the select statement is a control structure, somewhat like a switch, that lets you control the behavior of your program based on what communications are able to proceed at any moment. 
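A sketch of the wait-channel idea above. The `Message` struct with `str` and `wait` fields follows the talk's slides, but the surrounding details are reconstructed, not copied:

```go
package main

import "fmt"

// Message carries the text plus a channel the sender blocks on until
// the receiver signals it may proceed. Channels are first-class values,
// so one can travel inside a struct sent on another channel.
type Message struct {
	str  string
	wait chan bool
}

func boring(msg string) <-chan Message {
	c := make(chan Message)
	waitForIt := make(chan bool) // shared by all messages from this sender
	go func() {
		for i := 0; ; i++ {
			c <- Message{fmt.Sprintf("%s %d", msg, i), waitForIt}
			<-waitForIt // block until the receiver says "go ahead"
		}
	}()
	return c
}

func main() {
	joe, ann := boring("Joe"), boring("Ann")
	for i := 0; i < 3; i++ {
		msg1 := <-joe
		fmt.Println(msg1.str)
		msg2 := <-ann
		fmt.Println(msg2.str)
		// Both have spoken; now let both advance, restoring lockstep.
		msg1.wait <- true
		msg2.wait <- true
	}
}
```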
And in fact, the select statement is really sort of a key part of why concurrency is built into Go as features of the language, rather than just a library. It's hard to build control structures that depend on libraries. It's much easier this way. So here's how a select statement works. It looks kind of complicated. There's a lot of text on this slide. But it's really pretty simple. A select statement looks like a switch. It's got a bunch of cases. And each case, instead of being an expression, is actually a communication. So you can see, there's a receive, a receive, and a send. And what happens in this select statement is when you come to the top of the select, it evaluates all of the channels that could be used for communication inside the cases, and then it blocks until one of them is ready to run. And once one is available to run, that case executes, and the whole select goes on. So it's much like a regular switch statement, except it blocks until a communication can proceed. And there's a default that you can put inside a select. If there's no default, then the select will block forever until a channel can proceed. If there is a default, what it means is, if nobody can proceed right away, then just execute the default. And this gives you a way to do non-blocking communication. We're not going to depend on that, but I just wanted to make sure you knew it was there. So it's really pretty simple. The only other sort of wrinkle is that sometimes multiple channels are available at the same time. When that happens, the select statement chooses one pseudo-randomly. So you can't depend on the order in which the communications will proceed. Those of you who know Dijkstra's guarded commands will recognize the pattern there. It's very much like the guarded if statement inside Dijkstra's commands. So here's our fan-in function. Same functionality, but now written using select. So here's our old guy. This is exactly the function we had before, you can see.
And here's the new one using the select statement. If I go back and forth, you can see the difference is, the original guy started two goroutines, one for each input channel to be copied to the output. The new guy starts a single goroutine because it's still got that generator pattern. But instead of two copy loops, it runs one copy loop, and it selects which of the two is ready and passes the data on appropriately. So this has exactly the same behavior as the other one, except that we're only launching one goroutine inside the fan-in function. Same idea, different implementation. Now we can use selects to do all kinds of interesting things. One of the most important is we can use it to time out a communication. So if you're talking to somebody who's very boring, chances are you don't want to wait very long for them to get around to saying something. So in this case, we can simulate that with a call to a function in the library called time.After. And here, this select statement says either we can get a message from Joe, or a second's gone by, and he hasn't said anything, in which case, we'll just get out of here. So time.After is a function inside the standard library that returns a channel that will deliver a value after the specified interval. So in this case, either we get a message, or a second goes by, and we don't get a message, and then we finish. So if we run it, you can see, Joe goes for a while. And then he takes too long to say something, and so we exit. OK? I think it always does this. I think I forgot to seed it. It's always saying the same thing. All right. But you see the idea. We just keep going around this loop until the timeout fires. Now you can do that another way. We might decide, instead of having a conversation where each message is at most one second, we might just want a total time elapsed. And to do that, we can use the time.After channel more directly by just saving it inside a timeout channel and using it inside the select statement. 
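A sketch combining the select-based fan-in just described with a time.After timeout in the consuming loop. The names follow the talk; the durations here are arbitrary placeholders:

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

func boring(msg string) <-chan string {
	c := make(chan string)
	go func() {
		for i := 0; ; i++ {
			c <- fmt.Sprintf("%s %d", msg, i)
			time.Sleep(time.Duration(rand.Intn(50)) * time.Millisecond)
		}
	}()
	return c
}

// fanIn multiplexes two channels onto one using a single goroutine and
// a select, instead of one copying goroutine per input.
func fanIn(input1, input2 <-chan string) <-chan string {
	c := make(chan string)
	go func() {
		for {
			select {
			case s := <-input1:
				c <- s
			case s := <-input2:
				c <- s
			}
		}
	}()
	return c
}

func main() {
	c := fanIn(boring("Joe"), boring("Ann"))
	for {
		select {
		case s := <-c:
			fmt.Println(s)
		case <-time.After(200 * time.Millisecond):
			// Nobody said anything for 200ms: time out this message.
			fmt.Println("You're both boring; I'm leaving.")
			return
		}
	}
}
```

To time out the whole conversation instead of each message, hoist the call out of the loop: `timeout := time.After(5 * time.Second)` before the loop, then `case <-timeout:` inside the select.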
So in this case, this loop, this entire loop will time out after five seconds. So it doesn't matter how many times Joe says anything. After five seconds, we're out of there. Boom. OK? So that times out the whole loop. The difference is that there, we're timing out each message, and here, we're timing out the whole conversation. Now another thing you can do with a select is instead of using a timeout, you could actually deterministically say, OK, I'm done. Stop now. So here's our inner loop again, the select inside the inner loop. There's the send on the message. But we actually have a second case which is a quit channel. And what we do there is in the main function, we create some quit channel. In this case, it doesn't matter what type it is. I just picked bool. And then after we've printed as many times as we want what he has to say, we signal him and say, OK. I'm done. And so at that point, this case can proceed, because this guy's not communicating and will eventually stop. So again, this is the same thing as before. But now after only two executions, we decide that's enough. We tell him to quit. And so the quit case executes in the select, the function returns, and the boring conversation is over. Now there's a problem with that model, though. Because in here, in this case, what if after this function is done, he needs to do something inside here? So he gets the message to stop, but he might have some cleanup functions to do. Remember that when main returns from a Go program, the whole thing shuts down. Maybe he's got to remove some temporary files or something like that. We want to make sure that he's finished before we really exit, and so we need to do a slightly more sophisticated communication. And it's very easy to do that. We just turn around and say, send me a message back when you're done. And so in this case, we say "Bye!" But then Joe gets the message on the quit statement, does a cleanup, and then tells you, OK, I'm done.
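A sketch of the quit channel with the round-trip acknowledgement: main says "Bye!", the goroutine does its cleanup, then answers on the same channel before main exits. The "Bye!"/cleanup framing follows the talk; the exact strings and structure are reconstructed:

```go
package main

import "fmt"

// boring talks until told to quit, then acknowledges on the same
// channel so the caller knows cleanup has finished.
func boring(msg string, quit chan string) <-chan string {
	c := make(chan string)
	go func() {
		for i := 0; ; i++ {
			select {
			case c <- fmt.Sprintf("%s %d", msg, i):
				// value delivered; keep talking
			case <-quit:
				// pretend to remove temporary files here, then ack
				quit <- "See you!"
				return
			}
		}
	}()
	return c
}

func main() {
	quit := make(chan string)
	c := boring("Joe", quit)
	for i := 0; i < 3; i++ {
		fmt.Println(<-c)
	}
	quit <- "Bye!"
	// Round trip: wait for the cleanup acknowledgement before exiting,
	// because the whole program shuts down when main returns.
	fmt.Println("Joe says:", <-quit)
}
```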
And this gives synchronization for the two programs, to make sure that they're both where they want to be. So in this case, you see we tell him "Bye!" And then the quit fires. We do whatever cleanup is required. Then we respond. But now we're telling him that we're done for sure, and so it's safe for him to exit. And this is a round-trip communication. Now speaking of round-trips, we can also make this crazy by having a ridiculously long sequence of these things, one talking to another one. So think of it like this. You've got a whole bunch of gophers who want to do a Chinese Whispers game, although I think Chinese Whispers with megaphones might sort of make it a little weird. But you see the idea, here. This guy sends a message to him, sends a message-- forwards it, forwards it, forwards it. And the last guy receives the message and prints it out. Now I want to stress that this is not a loop. This is just going all the way around the chain, back to the answer here. And to make it interesting-- this is the gopher here-- what he does is receive from the left-- sorry, receive from the right, because this is where it's going. Coming in from the right. Sending to the left. So you receive on the right. And then we make it a channel of integers, so I'm going to add 1. And the reason for that is, that lets us count the number of steps, so the distortion in the Chinese Whispers game is we add 1 to the value. And I'm not going to go through all the details of this code. This is actually sort of subtle, and I don't want to explain it all. But all it does is basically construct this diagram using channels to send the answers along. And then everybody's waiting for the first thing to be sent. And so we launch the value into the first channel and then wait for it to come out of the leftmost guy. So it goes, vvvvt, around the channel. And for the sake of fun, I'm going to run 100,000 gophers. Now remember that I'm going to compile the program, link it, and then run it.
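The daisy chain described above can be sketched like this: each gopher receives from the channel on its right, adds one, and sends to the channel on its left, so the value injected at the rightmost end pops out the leftmost end incremented once per gopher:

```go
package main

import "fmt"

// gopher receives a value from the right, adds one (the "distortion"
// in the whispers game), and passes it to the left.
func gopher(left, right chan int) {
	left <- 1 + <-right
}

func main() {
	const n = 100000
	leftmost := make(chan int)
	left, right := leftmost, leftmost
	for i := 0; i < n; i++ {
		right = make(chan int)
		go gopher(left, right)
		left = right
	}
	// Inject a value at the rightmost end of the chain...
	go func(c chan int) { c <- 1 }(right)
	// ...and wait for it to emerge from the leftmost end.
	fmt.Println(<-leftmost) // 100001: the 1, incremented by each of the n gophers
}
```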
And it takes that long to do 100,000 goroutines and all the communication. [APPLAUSE] ROB PIKE: Those are some fast gophers. OK. Now this is obviously a silly example, but it's an honest one. Because I really am creating all those elements and doing the full communication and the whole shebang. So you can think of goroutines as being very, very lightweight things. They're even smaller than gophers. OK. But so far, everything we've been doing is very toy-like. And I want to stress that Go was designed for building system software. And I want to talk now about how we use these ideas to construct the kind of software that we care about. Now we did this at Google, and so what we're going to do is build a Google search engine. Sort of. It's still going to be a toy. I can't develop a Google search engine in less than about a half an hour, and I only have 15 minutes. But we can start. So think about what a Google search does. If you're going to the Google web page and you run a search, you get a bunch of answers back. And some might be web pages, there could be videos, there could be song clips, or weather reports, or whatever. There's a bunch of independently executing back ends that are looking at that search for you and finding results that are interesting. And there might also be some ads, so there's an ad system running, too. And so in parallel, you want to send all of these things out to the back ends and then gather all the answers back and deliver them. How do we actually structure that? Well, let's fake it, completely. Let's construct a thing called a fakeSearch, and all our fakeSearch is going to do is sleep for a while and then return whatever the fake answer is that it wants, which is very uninteresting. It says, here's your result. But the point is that there's multiple of them. And it's actually a function. So notice here, this search is a function that takes a query and returns a result. So that's sort of a type definition for what a search actually does. 
And we construct these functions for a web, an image, and a video service. OK? Very simple. All they do is pause for a while, and then print. And they're set to wait for up to 100 milliseconds. So let's test it out. So here-- this time I'm going to seed the random number generator, so the values are always different. And we start the timer, get the results from the search, and then print out how long it took. OK? So if I run this guy, you can see that was 168 milliseconds. Now remember, each of these guys could take up to about 100 milliseconds. So we could see up to, like, 300 milliseconds, maybe. There's 160. There's 94, that was a quick one. 213, that was a slow one. So these are actually, you know, running these guys. OK. Now let's make it an actual real function that returns all of the values back, right? So here, we actually have this Google function that takes a query and queries all of the back ends and gathers all of the results together and returns back a slice of the results, which is-- think of it as just an array of results. So here we run this guy, and you can see, there are the three things, taking a little longer to gather all of the data. OK. Trivial, right? The problem is that if you think about it, this is running one guy, waiting for his answer to come back, running another one, waiting for his answer to come back, running a third one, waiting for his answer to come back. Well, you know where this is going. Why don't we launch those in goroutines? So now for each of the back ends, we independently launch a goroutine to do the search, and then-- this is the fan-in pattern-- get the data back on the same channel. And then we can just print them out as they arrive. So they're going to come out of order now, but we're going to get all three of them back. But they're running concurrently, and actually in parallel, in this case. And so we don't have to wait around nearly as long. So there, we see, 76 milliseconds. That's pretty quick. 88 milliseconds. 
Now we're really only waiting for the slowest, the single slowest web search. 15 milliseconds. There you go. So that's pretty cool. And notice that this is a parallel program, now, with multiple back ends running. But we don't have any mutexes or locks or condition variables. The model of Go's concurrency is taking care of the intricacy of setting up and running this safely. Now sometimes, servers take a long time. They can be really, really slow. So remember, we set these up for 100 milliseconds. Once in a while, an individual search might take more than 80 milliseconds. And let's say we don't want to wait more than a total of 80 milliseconds for the whole thing to run. We want to use the timeout pattern now. So here's the fan-in pattern, and here's the timeout for the whole conversation. We run this guy now, and 81 milliseconds. That was pretty quick. 44 milliseconds. You see, these are fairly quick now. They're typically 80 milliseconds or less, which is what they should be, because we never wait. But if I run this enough times-- there. We timed out because, in this case, two of the queries took too long. And so all we got back was the web result. We didn't get the other two. And that's a kind of nice idea. We know that we're going to be able to get you an answer within 80 milliseconds. However, timing out a communication is kind of annoying. What if the server really is going to take a long time? It's kind of a shame to throw it on the floor. So now we add replication. So if we run three instances of the service, say, or five, one of them is likely to come back before the timeout expires. If only one of them is having a problem, the other ones can all be efficient. So how do we structure that? Well, here's our familiar pattern by now. We actually write a function called First that takes a query and a set of replicas-- this is the Go notation for a variadic function-- so we have a bunch of replicas of a search that we're going to do, for a single search. 
Like, replicas of the web search or replicas of the image search. And we make the channel of results. And then we launch the same search multiple times and then return the first one that comes back. See, all these guys are going to talk on the channel, but we're only going to return the first one that we get. And so this will give us the first result from all those back end guys. So here's a simple use of it, where we run two replicas. And that time we got replica 2 in 30 milliseconds. Replica 2, Replica 1 in 24. So you can see, it's whichever one comes back first. There's five milliseconds. And with that little tool, now, we can build the next piece, which is to stitch all of this magic together. So this is Google Search 3.0, full on. It's got everything in it. It has the fan-in function, it's got the replicated back end stuff, it's got a timeout on everybody. And so we should, with very, very high probability now, get all three of our web search results back in less than 80 milliseconds. So there's 40 milliseconds, 56 milliseconds. You notice, they're all three there. There's no timeouts. 18, 39, 51. And this is obviously a toy example. But you can see how we're using the concurrency ideas in Go to build, really, a fairly sophisticated, parallel, replicated, robust thing. And that's kind of the message. Because it's very simple to do that. There's still none of that sort of minutiae of memory barriers and nonsense that people who use threading approaches are aware of. And there's no callback, so this is very different from using, like, node.js, or something like that. That program is fairly easy to understand. More important, the individual elements of the program are all just straightforward sequential code. And we're composing their independent execution to give us the behavior of the total server. So to summarize what we just did there, we started with a very simple program that was slow and sequential and failure-sensitive, at least we pretended it was. 
And by using the concurrency ideas in Go, we made that same stuff run quickly, concurrently, in a replicated way, and with much more robustness. And this is sort of the lesson. This was actually why the concurrency features went into Go. It's because it makes it easy to go from this to this without worrying about safety issues and things like that. It's just a very much more straightforward approach to constructing what I loosely describe as server software. There's all kinds of party tricks. This is barely, barely touching the surface, and there'll probably have to be a talk again soon with a lot more rich examples using other features that I haven't shown you today. But there's lots of talks on the web already, with independent things about using Go concurrency stuff to solve some interesting problems. There's a Chatroulette toy, which Andrew talked about, that has a fairly amazing little inner loop in it that you should probably check out. A couple of years ago at I/O, I talked about a dynamic load balancer which uses channels as first-class values to construct some pretty interesting stuff. There's a legendary example called the Concurrent Prime Sieve, which is kind of an amazing thing. It was the first truly beautiful concurrent program I think I ever saw. But it's completely dwarfed by this work by Doug McIlroy, who's my old boss at Bell Labs, who wrote a concurrent power series library using a predecessor language to Go. And it treats the coefficients of a power series as values in a channel, sequential values in a channel. And using some very simple tricks, very much like what I just showed you, it manages to do some incredibly high-level mathematics that is very, very difficult to do any other way. It's really rather beautiful. So there's the links if you want to check them out. Now having just said how wonderful all this is, I want to throw out a word of caution. This stuff is fun. 
It's really fun to write your first concurrent programs and play around with this stuff. And you should definitely try them out and do things to see how they behave. But don't overdo it. These ideas in Go and, for that matter, the other languages I mentioned, are not replacements for things like memory barriers to protect the innards of software. They're really sort of high-level building block-like things that you use to take simple pieces and connect them together into larger concurrent things. They're big ideas. They're really tools for program construction. Remember, I said concurrency is a model for software construction? These concurrent tools are models for software construction. But sometimes, you don't need that much power. If you need a reference counter, you could write one with a channel and a goroutine. It's fun to do that as an exercise. But I didn't show you that, because I think that's a silly example. It's using much too heavyweight a tool for a very simple thing. So Go has these packages in the sync directory, called atomic and-- what was the other one called? Sync and sync/atomic. And they contain these low-level guys for things like reference counters when that's all you need. And sometimes it is all you need. So you have to balance the sort of program structure you're doing. Glue together the large things with these tools I've shown you, but sometimes if all you want to do is count the number of times somebody hits your page, all you really need is a reference counter. So as always, you should use the right tool for the job. So in conclusion, goroutines and channels are the concurrent features of Go. And they make it very, very easy to construct interesting concurrent software that solves real-world problems that include things like having multiple inputs, having multiple outputs, independent execution, timeouts, failures, replication, robustness. 
All those things that are sort of features of the modern programming landscape, Go actually gives you the tools to manage very, very well. And it's actually-- even if you're dealing with boring problems, it's actually kind of fun to work with. So here's some links for you. I think that's the last slide. I'll just leave that one up so you can look at these during the question period. The Go home page, golang.org, has tons and tons of resources, videos, the language spec, package documentation, of which there's a phenomenal amount, lots of second-order documents like tutorials and blog posts and stuff like that. And then, barely readable down at the bottom here, there's this tinyurl.com link to my talk at the Heroku conference earlier this year, about concurrency is not parallelism, which also has some other examples that are a little richer than some of the other ones I showed you today, because today I was really concentrating on the basics. So with that, I'll stop and maybe take some questions. [APPLAUSE] ROB PIKE: Go to the microphones, please, so that the people at home can play along. AUDIENCE: So one question I had-- so the only other tool, I think, that competes with Go for concurrency support is Haskell, which I use regularly. And Haskell does feature some speculative parallel operators. I was wondering if those might ever appear in Go. ROB PIKE: Haskell is a beautiful language. It's a really lovely language. But it has a very, very different model. And I think, to my mind, the functional stuff sort of comes for free with Haskell, but the whole idea behind the language is the lazy evaluation kind of stuff. And I think it's an excellent fit for Haskell. I don't think it's a good fit for Go. And in fact, Doug McIlroy took the concurrent power series example and rewrote it in Haskell. And it's a very beautiful program that I can't understand at all. But it's very beautiful. So I'm not trying to make fun of Haskell. 
I'm really not, because it is an amazing language. But I think that the key point about the way the concurrency features work in Go is that they work with the way Go, as a language, also works. And you can't just borrow a feature from another language and stick it in yours and expect it to work well. That art of choosing what to put in your language is part of it. So this is not to say at all those ideas aren't really powerful. I just don't think they're a good match for Go. And if it turns out that I'm wrong, maybe they should go in. But offhand, I don't think they're a good fit. Yes. AUDIENCE: Can you comment on best practices for testing goroutines? Because it seems like you're quasi-integration testing. Or are there mocks for goroutines? Or how would you go about that? ROB PIKE: So best-- it's hard to hear you. So best practices for testing concurrent code? AUDIENCE: Exactly. And for testing goroutines. ROB PIKE: Best practices. Write good tests. [AUDIENCE LAUGHS] ROB PIKE: Are you worried about the non-determinism? AUDIENCE: Well, it seems like, if you think about goroutines as external services, you're thinking about basically integration testing. ROB PIKE: Well, one of the amazing things about-- let me back up here. It's actually kind of a non-issue because of the way the language works. So let me find the example I'm looking for. Where is it? Actually, I went back way too far. I could have used a much earlier one. Here we go. Look. Look at this guy here, this example. The total interface to this service is a channel. Nowhere does this function here know what that channel has behind it. It's just this function in the background that's doing something. It could be an arbitrarily complex computation. And once you realize that that channel is, in effect, the capability to a service, you can mock this service with a simple thing. In fact, this might be a mock for a service that returns values at arbitrary intervals. 
So I mean, it's a perfectly good question, but all the tools I've shown you to do that are right here. You don't need-- the whole idea of a channel is that it hides what's behind it. And so that's mocking, right there. You've got it. AUDIENCE: Right. Because it's a first-class-- ROB PIKE: It's a first-class citizen in the language. AUDIENCE: --that's your mock. ROB PIKE: Right. AUDIENCE: Thanks. ROB PIKE: Cool. Sorry. Let me get to the slide you really want to look at. There we go. OK. Yes. AUDIENCE: Hi. My question is somehow related because I really like the way you take concurrency as a first-class thing in the language. For instance, in a typed language, you do the static analysis of the types, runtime analysis of the types, and everything works right. Do you plan on doing something similar in Go? Signs in a way that-- OK. If you do this, you're going to write to that lock or if you keep doing this, you're going to have a lock. Or you are going to have a bottleneck the way you have arranged all these patterns. Do the Go compilers go through any kind of verification of the concurrency properties of your program? ROB PIKE: I'm sorry. I'm having a really hard time understanding you because the speakers are pointing backwards, and I'm behind them. You're asking about the static channel network and how we typecheck that kind of thing? AUDIENCE: Go is great for writing concurrent programs, but do you plan or do you do any kind of verification of that concurrency, either statically or at runtime? ROB PIKE: OK. So you want to know about static verification of the thing? For that kind of thing, I think independent tools are the way to go. There's a thread sanitizer project that is coming out. I don't know if it's actually out yet. Is it out yet? Yeah, it is. It's from Google. And they have support for the Go environment. And so you can embed the thread-sanitized version of the world inside your program, and it will actually look for data races and stuff like that, if that matters. 
From a communication point of view, tools like the SPIN thing that Gerard Holzmann did, stuff like that I think is a really good way to model these kinds of things for static checking. And a long time ago, Gerard did some stuff for us to statically verify pieces of the Plan 9 kernel and its communication and synchronization stuff using SPIN. And I've been thinking of actually approaching him and asking him to support Go's primitives, so he can actually read a Go program and generate a SPIN model for it and verify it statically. And I think that's actually fairly straightforward to do, but it hasn't been done. AUDIENCE: Thank you. ROB PIKE: Please speak carefully and slowly so I can hear you. AUDIENCE: So I have two questions. I hope it's OK. You showed us a read from a time.After channel inside a select and inside a loop. Say I wanted to write time.After. How would I know from the other side that the channel has fallen out of scope so as to not be locked forever, waiting for the read? ROB PIKE: You mean, how do you know-- so this guy, if the timeout doesn't fire, what happens to him? AUDIENCE: Yeah, and what happens to the code on the other side. ROB PIKE: It'll get garbage-collected. AUDIENCE: So-- ROB PIKE: So if you want to know that-- if this is a richer example, and this is not time.After but some more complicated thing, you want to know that he got your message? That's this example right here. AUDIENCE: No, I actually want to know, on the other side. If I'm writing the time.After function, and I wait my time, and then write to the channel-- ROB PIKE: Right. AUDIENCE: But the write blocks me until it is read-- ROB PIKE: Right. AUDIENCE: So if, instead of being read, it never gets read and the channel falls out of scope, will I just be locked there, and then I will be garbage-collected completely? ROB PIKE: Yep. It's actually-- there's more subtlety than that, but yeah. I mean, you shouldn't be worrying about it at this level. 
That's the short version of it. The time.After function uses some of these ideas, but it's slightly more sophisticated than the basic stuff I've shown you. If you really care about when your resource is released, then you have to write more sophisticated code than some of the stuff I've shown you. There is some stuff in the Google stuff where there's blocking going on behind the scenes that's also hidden from you. In the time available, I can't go through all of the subtleties of that. That's probably a subject for another talk. AUDIENCE: OK. But essentially, if I am locked on that channel that gets garbage-collected, so no one is going to unlock me, I will be garbage-collected, no problem, right? ROB PIKE: In many cases, yes. It depends on the specific example. AUDIENCE: Second question is, how did you do the any number of stacks that grow independently? ROB PIKE: I'm sorry, I can't hear you. AUDIENCE: The any number of stacks that grow independently. ROB PIKE: Yep. AUDIENCE: Are you allocating the stack frames on the heap? How do you do that? ROB PIKE: They're allocated and managed, yes, by the heap. They have a special allocator. They're not garbage-collected in the standard collector, because we know more about them. But they're allocated and freed fairly cheaply. AUDIENCE: Thanks. ROB PIKE: Sorry. I'm having trouble hearing you. AUDIENCE: I was wondering about the select control structure that you had. You were saying that if multiple cases are ready at the same time, then you pseudo-randomly choose one, rather than evaluating them in order. Is there a way to write it so that you could have a higher-priority case if you wrote it first? Or would you just have to nest them or something? ROB PIKE: You would nest them. AUDIENCE: You'd put the default as another select. ROB PIKE: Yeah. One of the many properties of Go is we tried to take away things like priorities and fine-tuning adjustments and stuff like that. 
There's enough power here to do that kind of stuff, but you have to write more code, as opposed to making the API really complicated. So you can do prioritized stuff in select, but yeah. You basically nest them to do it. You put the guys you care about most first, and if they're not ready, then you try the other ones. AUDIENCE: Gotcha. OK. This is somewhat related to the earlier question but a little bit more specific. So is there a way, if you create a channel, to determine how many readers or writers there are for a given channel? ROB PIKE: Yeah. There's a built-in function you can use to see how many values are in a buffered channel, if it's a buffered channel. For a non-buffered channel like these, there is no way at the moment to find out how many readers and writers there are. One of the reasons for that is any such question is inherently unsafe. Because if you care how many readers and writers there are, then that's because you're going to do some computation based on that. But that computation might be wrong by the time you end up doing the computation, because the number of values could change. When you have this kind of thing it's much more rigorous, because you know this communication actually proceeded, and there's no doubt about that. If you say, can I proceed? OK, I'll get a value. You don't know that someone else might have come in and stolen that value, and you didn't know about it. AUDIENCE: Right. ROB PIKE: So it's actually sort of an important point there. AUDIENCE: The example I was thinking of is, let's say in your main method, you create a channel. And then you attempt to read from that channel, but there are no writers to that channel. Or the inverse, you attempt to write to that channel, but there are no readers. ROB PIKE: Right. AUDIENCE: Do you just block indefinitely? Or-- ROB PIKE: Yeah. AUDIENCE: Does it determine that-- OK. 
So it doesn't determine-- ROB PIKE: You'll have to-- you know, if there's multiple writers, then the first guy to get there gets the first value, is how it works. And it's FIFO semantics. AUDIENCE: OK. Thank you. ROB PIKE: Yep. Any other questions? All right, at the back there's the usual collection of goodies to be handed out carefully. Please don't mob the helpful staff. And thanks for coming. [APPLAUSE]

Additional works

While at Bell Labs, Pike was also involved in the creation of the Blit graphical terminal for Unix,[5] the Inferno operating system, and the Limbo programming language.[1] Pike also developed lesser systems such as the vismon program for displaying faces of email authors.[citation needed]

Over the years, Pike has written many text editors; sam[6] and acme are the most well known and are still in active use and development.[citation needed]

Pike started working at Google in 2002.[1] While there, he was also involved in the creation of the programming language Sawzall.[7]

Pike appeared on Late Night with David Letterman, as a technical assistant to the comedy duo Penn & Teller.[3][non-primary source needed]

Personal life

Pike is married to author and illustrator Renée French; the couple live both in the US and Australia.[8]

See also

  • The plumber – the interprocess communications mechanism used in Plan 9 and Inferno
  • Mark V. Shaney – an artificial Usenet poster designed by Pike

References

  1. ^ a b c d e f ""The Best Programming Advice I Ever Got" with Rob Pike". InformIT. Pearson Education. Aug 15, 2012. Archived from the original on 22 Sep 2023. Retrieved 15 January 2024.
  2. ^ Cox, Russ; Griesemer, Robert; Pike, Rob; Taylor, Ian Lance; Thompson, Ken (2022-04-01). "The Go programming language and environment". Communications of the ACM. 65 (5): 70–78. doi:10.1145/3488716. ISSN 0001-0782. S2CID 248385361.
  3. ^ a b Pike, Rob. "Rob Pike". 9p.io. Alcatel-Lucent. Archived from the original on 29 Jan 2023. Retrieved 15 January 2024.
  4. ^ "Dynamic generation and overlaying of graphic windows for multiple active program storage areas". Retrieved April 1, 2020.
  5. ^ Pike, Rob (October 1984). "The UNIX System : The Blit: A Multiplexed Graphics Terminal" (PDF). AT&T Bell Laboratories Technical Journal. 63 (8). 15. doi:10.1002/j.1538-7305.1984.tb00056.x. Retrieved 7 January 2024.
  6. ^ McIlroy, M. D. (1987). A Research Unix reader: annotated excerpts from the Programmer's Manual, 1971–1986 (PDF) (Technical report). CSTR. Bell Labs. 139.
  7. ^ Pike, Rob; Dorward, Sean; Griesemer, Robert; Quinlan, Sean (2005-01-01). "Interpreting the Data: Parallel Analysis with Sawzall". Scientific Programming. 13 (4): 277–298. doi:10.1155/2005/962135.
  8. ^ "Renee French – A River Runs Through It – Artist Interview". WOW x WOW. 27 July 2015.

External links

This page was last edited on 16 February 2024, at 03:21
Basis of this page is in Wikipedia. Text is available under the CC BY-SA 3.0 Unported License. Non-text media are available under their specified licenses. Wikipedia® is a registered trademark of the Wikimedia Foundation, Inc. WIKI 2 is an independent company and has no affiliation with Wikimedia Foundation.