
Concurrency (road)

From Wikipedia, the free encyclopedia

An extreme example: I-40, Business I-85, US 29, US 70, US 220, and US 421 ran concurrently in Greensboro, North Carolina. US 220 and US 421 were rerouted from this concurrency in 2008.

A concurrency in a road network is an instance of one physical roadway bearing two or more different route numbers.[1] When two roadways share the same right-of-way, it is sometimes called a common section or commons.[2] Other terminology for a concurrency includes overlap,[3] coincidence,[4] duplex (two concurrent routes), triplex (three concurrent routes), multiplex (any number of concurrent routes),[5] dual routing or triple routing.[6][7]

Concurrent numbering can become very common in jurisdictions that allow it. Where multiple routes must pass through a single mountain crossing, over a bridge, or through a major city, it is often economically and practically advantageous for them all to be accommodated on a single physical roadway. In some jurisdictions, however, concurrent numbering is avoided by posting only one route number on highway signs; the other routes disappear at the start of the concurrency and reappear when it ends. However, any route that becomes unsigned in the middle of the concurrency will still be shown on most maps and road atlases.

YouTube Encyclopedic

  • ✪ PetaBricks: A Language and Compiler Based on Autotuning
  • ✪ Extending F* in F*: Proof automation and Metaprogramming for Typeclasses
  • ✪ [Live Demo] Inside CockroachDB 2.0
  • ✪ DOE CSGF 2011: Scale-Bridging Computational Materials Science: Heterogeneous Algorithms for Het...
  • ✪ Mod-03 Lec-14 Concurrent statements and Sequential statements


NCSU Creative Services
PetaBricks: A Language and Compiler Based on Autotuning
Saman Amarasinghe, Computer Science, MIT — October 17, 2011

Male Speaker: All right. So, let's get started over here. It's my pleasure to introduce Saman Amarasinghe. Was that close enough? Oh, sorry. He is leading the compiler group at MIT, and you may know him for L.L. Briggs [ph], for DynamoRIO, or for StreamIt, or some other projects that he's led and that are well known in the community. He got his master's from Cornell, right? Sorry. It was your undergraduate, the bachelor's, and both master's and Ph.D. from Stanford. So, all yours.

Amarasinghe: Okay. So, thank you. I am doing a small bait and switch in the title. The title that was advertised was probably a lot more interesting, but it's almost the same material, exactly the same material. What I'm going to do is start by telling three side stories. These are kind of interesting experiences that have led to the language and compiler I'm going to talk about, and also a more general, global view of, I think, where the world stands with respect to high performance. And then three observations that directly led to the language. I have a bunch of material; let's see how much I can cover within an hour. So, the first side story is basically about how the current programmer must deal with performance. If you look at today's programmers out there, for the last 10-20 years they have been very happy in the world they got. Most of them were completely oblivious to the processor that their code was running on, because Moore's law [ph] gave all the performance needed for your average programmer. So I call him the average programmer, the Joe programmer. And for them, every year about 50%-60% improvement in performance is good enough, and that's more than ______ good enough, and they're pretty happy.
And what that means is they were able to build a solid boundary between hardware and software. If you look at something like Java, it basically abstracts out everything about the hardware from the program. You don't even know where your program is going to run; it's going to run on some machine out there. And that was really nice for these people. This abstraction provided a lot of freedom for the average programmer. They were able to build very complex programs and worry about the features of the program, not where it's running and how great a performance they're getting. So this is where the average programmer was. And then there are places like research labs. There were a few people who really cared about performance; these were even less than a percent, probably, and these guys were the ones who were really going all the way down, dealing with hardware issues and trying to get all this performance. So this boundary was really there for 99% of the people; they could just live outside it and not think about it. So this was made possible basically by Moore's law. In a talk like this I'm obligated to show this slide, so what this shows is that every 18 months the number of transistors keeps doubling; that's basically what Moore's law says. I probably need to update this slide, but ______ [ph] had the processors in there. And this line has continued. Unfortunately, what happened was, till about 2005, this line basically tracked the performance. This is the SPEC performance for all these machines. So as you doubled the transistors, we actually got this performance also doubling for a while, and what has happened is that now this is slowing down. But before I get into why it slowed down, I want to look at this phenomenon of performance. For most of my lifetime, I have lived in this world where every 18 months performance ______. We have gotten a huge amount of performance.
So the first question is, when you have an excess of something, no matter what it is, people normally squander most of it. So how have we squandered Moore's dividend? The key thing is we have gotten about a 10,000x performance gain in these 30 years. And where did it go? Are we using it? Of course, if you look at the modern software era, a lot of it is useful, but a lot of it is actually wasted. If you look at the last decade, we concentrated heavily on everything except performance. We looked at things like correctness and programmer productivity, and that is where all the research was done, that's where a lot of the teaching was done, and if you look at languages and tools, all those things reflect that. Basically, everything except performance. In fact, software engineering is probably the only engineering discipline out there where performance and efficiency are not the central theme. If you go talk about the way we do software engineering to somebody like a mechanical engineer or a civil engineer, they will think you're crazy: those fields are all about doing more with less resources, and we are basically mainly about doing less with more resources. And the interesting thing is, this is really reflected through our entire community. So I teach a software engineering course, and at some point I was looking at all the things I am teaching our students, and I said, "Wait a minute, let me see how some of these things we talk about actually perform." So I just took matrix multiply and said, "Look, I had a bunch of lectures about immutable types, dynamic dispatch, object-oriented programming, all done very nicely, and this is what our students are learning these days." And probably a lot of you, this is what you're learning. I take matrix multiply and I will use all these great concepts and apply them to matrix multiply.
So I did, and I ran it with a basic 2K by 2K matrix and I got a number; this in fact shows it's pretty darn slow. So I said, "Okay, this doesn't look good." Immutable types are not something you want in matrix multiply, because it makes it n [ph] to the power of four — you have to keep copying the matrix. That's not nice. So I said, "Okay, I'll get rid of immutable types," and suddenly I get 220x performance. Okay, this is probably where most people would start; they wouldn't do immutable types, and they would say, "Look, I was doing this nice dynamic dispatch because I want to have different types of matrices, with integers and floats [ph]." I said, "Wait a minute. For a matrix, you don't need dynamic dispatch; most matrices are basically doubles, so just do one type in there." And if I do that I get another 22x [ph]. And then I said, "Oh, ______ a matrix is a matrix. It's just a nice two-dimensional array; you no longer need these fancy objects here, so get rid of the objects." Now I get another 2x improvement. Now, basically, what I have done is gotten rid of all the things I was teaching in that class — back to a very simple language here. And then I said, "Okay, we might as well just change the language itself [ph]. I'm doing Java; let's go to C." And to my surprise, just copying that almost exact piece of code to C gave me just a 2.1x performance increase. So it's like, wow, this is interesting. Huh? I mean, Java has a good JIT compiler. On the other hand, if I had done this in, like, Python and PHP and JavaScript [ph], I guess it would be like a couple thousand x [INDISCERNIBLE]. So at this point I said, "Wait a minute."
This is interesting. When I was a graduate student a long time ago, there was a huge amount of emphasis on getting good performance; we actually went and hacked the code to get good memory system performance and things like that. So, okay, let me try to remember how we did all those things in the good old days. If you're doing matrix multiply, you transpose the matrix so you get unit stride in one matrix; okay, you transpose one, you get another 3.4x performance. Then you tile for cache locality, you get another 1.7x performance. You vectorize [ph], you get another 2.8x performance. You prefetch, you get another 2.7x performance, and at that point I started using the BLAS library [ph] because I didn't want to write prefetching code by hand. And then finally, parallelization. At that point I got 3.5x. The interesting thing is, from where we started, I have a 296,000x performance gain for this ________. So, compare the typical software engineering approach — very high level, object oriented, all these nice, cool things we are teaching everybody and what they're practicing these days — to what a good performance engineer would do: looking for every ounce of performance, looking at all the architectural details, and getting really good performance. A huge difference. A lot of times when I show this, people are surprised how much it is. In fact, matrix multiply is probably not typical — not every program can do this — but this is huge. So I wanted to figure out, okay, how do I convey this? A lot of times when we look at performance we look at power. Power is an interesting issue. Or energy. So if we look at energy: if you look at the miles-per-gallon difference between this supertanker — which is really gallons per mile — versus this one, it's only 14,000 [ph] ______.
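The transpose step the speaker mentions can be sketched concretely. This is a minimal Python illustration (the talk used Java and C; function names here are my own), showing the textbook triple loop against a version that transposes B once so both operands are read with unit stride. In list-of-lists Python the locality benefit is muted; the trick matters most with flat arrays in C/Fortran or NumPy.

```python
def matmul_naive(A, B):
    """Textbook triple loop: B is walked column-wise, which has poor cache locality."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matmul_transposed(A, B):
    """Transpose B once up front; every inner product then scans two rows sequentially."""
    n = len(A)
    Bt = [list(col) for col in zip(*B)]  # row j of Bt is column j of B
    return [[sum(a * b for a, b in zip(A[i], Bt[j])) for j in range(n)]
            for i in range(n)]
```

The O(n^2) transpose is amortized over the O(n^3) multiply, which is why it pays off for large matrices.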
So in fact, to compare, you'd need 20 supertankers. If you think about it, if somebody came along with a technology saying, "Look, I can run 20 supertankers with the same amount of fuel required to run this small thing," it would be a revolution. I mean, we'd solve the world's energy crisis. And this is the kind of thing that, at least in this matrix multiply example, we are just wasting. So, the first point is, there's a lot of performance out there that we had. And look at the kind of stack we have built in today's machines: at the bottom we have the hardware, then we put a virtualization layer on top of that, on top of that a running operating system, on top of that we're running a browser, on top of that we're running an interpreter, on top of that running something like JavaScript. If you look at that list, we are just wasting a huge amount of resources. So the point is, even though we feel like we don't have any performance left, we have a lot that we left behind. If you think about the next system, if we design it well, we will actually be able to gain a lot. The reason we have to do that is, as I was alluding to before, performance starts flattening out, and now this has basically flattened; in fact it's almost going down for single-______ performance [ph]. So we are not getting single-threaded performance anymore. You can't just wait a year and say, "Look, I wrote the program, it's running slow; wait until the next generation." [INDISCERNIBLE] That's what it used to be. Not going to work anymore. So there are no free performance gains, and performance has to come from somewhere else.
So if you want performance, you'd better use something like better languages, you need to learn disciplined programming — you can't just use the XSU [ph] things here — and/or you want to do performance engineering, where you actually, really engineer a program for performance. And of course the other way of getting performance is parallelism. What has happened is Moore's law has morphed into providing performance by parallelism, because now, every generation, you're not getting additional single-core performance but you are getting additional cores. So now, every 18 months, you double the number of cores you're getting. If you want performance, you'd better use those cores. And so now the key thing is: performance is parallelism. But the problem is this is really hard. Take these programmers who were completely oblivious to performance, and now we tell them they not only have to deal with performance, they have to deal with parallelism. [INDISCERNIBLE] They just have to go all the way down and deal with all these complexities. Putting it more ______: right now we have this performance-oblivious programmer on a nice flat road, no issues there. And we said, "Okay, you had a nice 20 years on a flat road; now go climb this mountain. Goodbye." And that's basically what we are doing, because we gave all the problems to the programmers. We said, "Okay, you worry about all those issues." The interesting question is, is there a better trajectory? We'll never get the flat trajectory here; those good old days are gone, you can't just ignore that. But can you ask them to do something less than dealing with every problem? So one thing we have been looking at is: can we ask programmers to deal with concurrency?
Can we ask them to say where the parallelism is — you can run this in parallel — but not have to do all the performance optimization; in the compiler, we will try to get ______. So we are trying to find the right middle ground, because I don't think this is going to work anymore, this is not sustainable, and we need a better picture [ph]. One way to do this is to separate the work. First is parallelism extraction. You want to get things in parallel. The interesting thing is, most programs are written to simulate something in the real world. That real world is actually parallel, and I think there is a lot of natural parallelism in there, and by the time we write a program in a language like C or Java, we just basically obscure it. Can we get it back without obscuring it? There is also this need for parallel thinking: things like theory, algorithms, and data structures basically have to be taught in a parallel way. I'll get back to some of these points a little bit later. And then there is parallelism management. Once we know the parallelism that is available, we need to map it and get good performance. That, I think, is something the compiler handles [ph]. The key thing here is: when there's parallelism, let users provide it. If there is no parallelism available — if that program is just not going to give you parallelism — that's a non-starter. But at least for the easy things, let the programmers express it, and then let's build a system that can manage it and actually get to good performance. So that's basically the big motivation for a bunch of my work. The next side story that I want to talk about is future-proofing software. So this slide is out of order; let me go in there. If you look at the Joe programmer: if they wrote a program in, say, 1970, the program will still work, and it will not only work, it will work faster.
And what happens is these programs are still running faster because they were written to the same basic abstract machine model, a simple von Neumann [ph] model, and they still work faster, because machines are now much faster. Except what's happening is the new programming models out there are not the simple von Neumann model; they have these multicores, and these things keep changing. The new reality is that these machine models are going to keep changing a lot, and if you write a program for today's multicore, it's not going to work on tomorrow's multicore, because they're very different. So future-proofing programs, for Joe, suddenly becomes really hard. And I'm going to go back in my slides for some reason. What happened is, if you look at the programmers in one of those research labs I talked about, they have had this issue for the longest time, because they were trying to get the last ounce of performance. And what that means is, every time there was a new machine, they had to target that. At some point it was a vector machine, so they did ______; the next time they got a ______ machine, so they did Open[INDISCERNIBLE]. Then they got a distributed-memory machine, so they wrote MPI. So they have been having these issues of basically porting [ph] programs year after year, and that's a huge problem. Basically, most software lasts 30+ years, most of the time, while in a research lab the lifetime of a computer system is less than six years, and every three years you get a new system. And every three years you port, and very soon the program atrophies, because the ______ program was written beautifully by a physicist, and somebody hacked it, and somebody else hacked it, and after about three or four levels of hacks, the original meaning is very hard to figure out. And this is reflected here.
Whereas if you look at something like production code on the consumer side — if you look at even the latest Microsoft software, there's a vulnerability here, and you see that same vulnerability exists in Windows XP. That means the same piece of code [ph], set up 20 years ago, is still ______. That's very difficult to do now; we can't write a piece of code and expect it to survive [ph] that long ______. So what happens is there's no single machine model anymore. Processor types keep changing, generations keep changing, because this is the first time an exponential is exposed to software — the number of ______. Before, the number of processors was not exposed [ph] and we didn't care about it. But now the number of cores [ph] is exposed, so the software has to deal with something doubling, probably every two years. And that's a really big change. So, how do we do what Java did, but for performance? With Java, we said write once, run ______, so we got portability — functional portability. But we need to get performance portability, and this is really, really hard. So why is it hard? Any time anyone designs a language, that's a big occasion [ph]. One thing you want, of course, is really good performance right now. The way you get good performance is to expose a lot of architectural details all the way down. You say, "Okay, I will expose everything you need to get good performance." And you can do that; you can write the program and it will run really, really fast [ph]. But what that means is you kind of get into a local minimum, and if the next machine is somewhere over here, there's no way you can just get out of it and get there; you need a heroic effort. There's no way you can do it automatically, because you might not even have the information.
For example, look at something like MPI, the message passing interface: it basically exposes all the granular stuff about the machine, all the hierarchy and things like that, and you program to that. It works beautifully today, but tomorrow it will not work at all. That's the hard part. On the other hand, if you try to do something at a much higher level, you can try to hide those architectural details, but if you go too high, you're above the clouds and you don't see the landscape. So you might land on top of a mountain rather than at the bottom, because you don't see what's going on. A lot of languages — things like Lisp languages — that raised the level of abstraction too high were never able to get good enough ______. Those programs lasted forever because they were not tied to anything, but on the other hand, they never gave any good performance either. So things like High Performance C [ph], High Performance Fortran — a beautiful language; you have a lot of abstract constructs and you will uniformly run slow, no matter what. And so that's how that ______. So to be effective at future-proofing, you have to strike a very delicate balance. You have to restrict choices when a property is hard to automate or constant across architectures, because if something is constant — if every architecture is going to do the same thing again and again — you don't want to abstract it out; abstracting it out doesn't serve any purpose. Expose it to the user. If you look at a language like C, it did a beautiful job, across all the ______ architectures, of hiding the differences in things like the types of registers out there, ______ ranges [ph], and such, but it exposes common properties, because one instruction at a time is cheaper [ph], so all those things are exposed [INDISCERNIBLE] in there [ph].
On the other hand, if something is automatable and variable — meaning it varies between machines — you hide it from the users. In a language like C, you hide it from the user, hoping you can automate it. Things like register allocation: the early ______ compilers were not that great at it, but as people worked on them they became better and better, and at some point they became better than what a person can do by hand; compilers became better. So the key thing is finding that balance, and that's really critical, and I think, going forward, we need to do that well. So the third interesting story I will tell, before I get into what we were doing, is the evolution of programming languages. How did we get here? The reason I am going to talk about it is that I want to motivate the fact that we need some new languages, and why. If you look at the first computers out there, they were very limited in power. In fact, one of the most complicated things — the most daunting task those computers did — was compiling programs. So you had to make the language easy for the compiler; you can't make the compiler's job too difficult. And the way you make the life of a compiler easy is to create languages that don't give the compiler that many choices [ph]. You said: okay, tell the compiler exactly what the machine has to do, and the compiler can do it, and we'll be good. So what that means is, when you have an algorithm, instead of giving anything high level, you told the compiler, "Go do this, do this, do this, in this very specific way," and you get the result. So you gave it exactly one algorithm, exactly one ______ through the program, one data layout, one parallelism ______. It was beautiful.
So you take — this is, I assume, your brain — you know a bunch of ways of solving the problem, you found what you thought was the best way of doing it, and you directly mapped it into the search space of the program [ph]; assume this green means "runs really fast" and red means "runs slow." You found a good enough point in the space where you can run that program, and then mapped it ______. So: do it this way and you'll get good performance [ph]. And you got the good performance that you cared about [ph]. So what happened as time progressed? Computers became powerful, which means compilers can do more things. So we said, "Yeah, that's how you use us, but let me try to figure out if there's something better I can do for this program. Can it run faster?" So what you did was, from the point that was provided, you start looking around it: okay, can you help with the register allocation strategy, can you help with the memory layout strategy? You look at different things, trying to find a better ______. So within that space you can find something better. So this is where we started. But in the last decade, when you look at optimizing compilers, we had a lot more compute power. So then we decided, okay, now we need to find a lot more things to do, and we started doing heroic analyses. The idea there is: the language gives you this one way of doing it, and what we were trying to do is find other possibilities — we were trying to expand, expand, expand. But the problem is, just by the choice of writing the program, you have eliminated a lot of knowledge. The programmer had a lot of ideas in their head, but by choosing this one, a lot of those ideas are now not available to the system. So we had heroic analyses, but even doing that, there are limited things you can do. So this is where we decided we needed to redo the language. The way we approached this was twofold.
One: if you know multiple ways of solving a problem, we can ask you to give us all these different ways. What that means is we ask you to express the intent of the program, not a method of solving it; just tell me what you are trying to solve, don't tell me how to solve it. From that — for example, from the different ways you give me — I can figure it out. Another important thing we did was realize that muscle basically outpaces brain. Computers — something there keeps doubling every 18 months — while the human brain is still what it was years ago; we haven't doubled our capacity [ph]. So don't solve the whole problem in your head; do something that actually lets the machine give you more and more performance. So how do you do those things? That basically led to a better ________. So, in the theme of telling stories, I'm going to tell you three direct observations that led to the language. The first is algorithmic choice [ph]. I'm a compiler guy. I take somebody's program and make it run really fast if I can. And after I do all these creative transformations and get it to run faster on my machine, I'll go talk to the person who wrote the algorithm, or some other numerical guy, and say, "Look what I can do with this algorithm. I can run it this fast." And a lot of times they would look at it and say, "Wait a minute, if you are trying to run it on that machine, I wouldn't use that algorithm. There's another one that would do much better on that ______." And that's really frustrating, because I do all this work and they say it's useless; they would do something different. So what that means is the realization that, as we go about solving a problem, there are many solutions, and some solutions are better under some circumstances. For a large input size you might want to do something [INDISCERNIBLE]; for example, if you look at parallelism, there's no such thing [ph] as an absolutely best algorithm, no matter what, in most problems.
And so, because of that, how do you take advantage of it? This becomes an even bigger issue when you go to multicores, because multicores expose this exponential parameter [ph] — the number of cores. The best algorithm for two cores or one core is not the best one for four cores; it's really not for 16, and not for 1,000. So algorithms have to keep changing, and if you pick one, you're stuck at one data point. And we had ______ like that. So basically there's no single algorithm that can solve all cases [ph], so we have to deal with it at the language level. That's the first observation. The second one is what I alluded to right before: the world is parallel, and many people have figured this out. If you look at mathematicians, they describe things in parallel; they have set notation, sigma, simultaneous equations, and this is a very nice, parallel way of describing things. For some reason, computer scientists decided parallelism is hard; from early on they said, "Okay, we will come up with a sequential abstraction, and that will simplify our life." Everything we do in theory basically sequentializes [ph] the world. Statements execute sequentially; if you have a set, you say, "Oh, it's a loop, not a set; it goes from 1 to n" — you impose an order ______. Even a recursive definition: you say, okay, given f(n), we can give you the f(n+1) solution [ph]. We took what is a parallel world and imposed sequence [ph] ______. And this worked wonderfully for a while. It let us limit complexity in many problems. But now it's really coming back to bite us, and we have to really redo that. And some of it is not that difficult. When I look at a lot of programs, I find that the problems really were parallel. You write it, sequentialize it in C, and then you're asking the compiler to re-parallelize it. That doesn't make sense. So how do you make expressing parallelism easy?
The idea is to make parallelism the easiest way to describe something — make sequential ______ harder [ph] than parallelism. Sequential should be the special case that you have to work a little bit harder to get, rather than the other way around. The third observation was: when I started writing compilers, you had to model [ph] ______. We knew what the hardware was going to be. There was a processor, and we knew exactly how the processor worked; you had a nice manual — Intel or IBM or whatever — that ______ told us what it does, and we knew what the compiler did [ph]. These days, the systems are way too complicated. Sometimes Intel doesn't even know what their processors are doing; some people there say they get surprised by what happens in the processor. And a compiler like GCC has 30+ passes [ph]; it's really hard to know, if you do something early, what's going to happen later, or why it's doing something. There are a lot of subtle interactions going on all over the place, and there are thousands of knobs [ph] in there. But computers are cheap, and in fact we have all this parallel resource here. So instead of trying to come up with a complex model in our heads, why don't we do end-to-end execution, observe it, and use machine learning to figure out what's best? How can we basically use machine learning to simplify our lives? Those are the three driving forces behind PetaBricks. So, for the next part, I'm going to describe the PetaBricks language a little, give an example, talk about the compiler, some ______, and two extensions we have done to the language, trying to give you a feel [ph] for where we are going. This is still fairly early work, and you will see that in the ______. So let's look at PetaBricks here. Here's a simple matrix multiply. What it says is: you get a matrix A ______, and you get a matrix B, and you produce matrix AB.
And the way you do it is with this piece of code that implements the rule. The rule says: I'm going to calculate a cell of AB, and I don't tell you which cell it is. I don't give you any order. All the cells have to be calculated this way; you figure out how to do it. And of course you do it by taking a row and a column and doing a dot product. So what I have done is give you an implicitly parallel description of matrix multiply. I didn't tell you what order; the system has freedom to do it in any order it wants, and of course if there is a sequential constraint, I have to do some additional work to say, okay, by the way, some of the orders are not allowed. So I made parallelism really easy. The next thing I said was: if you know other ways to do matrix multiply, tell them to me too. You can do matrix multiply by taking two regions, doing two smaller multiplies, and adding them together. Or you can do it by subdividing the matrix in a different way and calculating submatrices. Or you can do it in other ways; there are multiple decompositions. And in fact, if you give the first one, our compiler can derive these automatically, so you don't really have to give all three. However, that's not all the ways you can do matrix multiply. There's something called Strassen's algorithm that runs asymptotically better, and it's very different. I don't know a single compiler that, given the simple matrix multiply, can automatically get to Strassen's. And the worst thing is, Strassen's is not always the best way of doing it: it has bad cache behavior, and sometimes blocking might be better. There's no way in a modern language to get that point across. In PetaBricks, Strassen's is just another way of doing it. You give the choices, and the compiler figures out what works best.
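The idea of giving several interchangeable rules for the same computation can be sketched in plain Python (this is not PetaBricks syntax; the dispatcher and its threshold are made-up placeholders standing in for what the autotuner would learn):

```python
# Two interchangeable "rules" for matrix multiply, plus a dispatcher.
# In PetaBricks the compiler/autotuner picks among rules; here the
# choice is a hard-coded, arbitrary threshold for illustration only.

def matmul_naive(a, b):
    """Rule 1: direct dot products of rows of a with columns of b."""
    n, m, p = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

def matmul_transposed(a, b):
    """Rule 2: same arithmetic over a transposed copy of b, which
    walks memory more sequentially for large matrices."""
    bt = list(map(list, zip(*b)))
    return [[sum(x * y for x, y in zip(row, col)) for col in bt]
            for row in a]

def matmul(a, b, threshold=64):
    """Toy dispatcher: a real autotuner would learn this cutoff."""
    if len(a) < threshold:
        return matmul_naive(a, b)
    return matmul_transposed(a, b)
```

Both rules compute the same result; only their performance profiles differ, which is exactly the freedom the talk describes.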
So basically what we have done is make algorithmic choice a key aspect of the language. We make it easy for programmers to give us multiple ways of doing something, and the compiler can use those multiple ways to figure out the right hybrid, basically a poly-algorithm that combines multiple algorithms together to get the best performance. And that can change: given different data, given different architectures, it can do something different. We also made a language where we synthesize the outer control flow. We can iterate by rows, columns, diagonals, in reverse order, in parallel; any way we want, and it's up to the compiler to figure that out. We don't try to undo one ordering and impose a different one; we just made it completely free. That makes our life much easier. What we ask programmers to do is: if there's some order we cannot use, tell us. For sequentiality, give us that additional information and we will drop those orderings. This basically allows the compiler to explore this nice choice space. So let me give you a bit of background on what the compiler does, and walk through it with a small example. This example is a rolling sum. I wrote down two ways of doing a rolling sum. The first way says: I calculate the sum at element i of B by taking the previous element of B and the element of A and adding them. So I'm calculating cell i of B using the previous cell of B and the corresponding value of A. The other way of doing a rolling sum is: for each element, I figure out everything that needs to be added and add it all together directly.
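The two rolling-sum rules described above can be written out as plain Python functions (a sketch, not PetaBricks syntax) to make the trade-off concrete:

```python
# Two rules for the same RollingSum computation, as described in the talk.

def rolling_sum_sequential(a):
    """Rule 1: b[i] = b[i-1] + a[i]. Few operations (O(n) total), but
    each cell depends on the previous one, forcing sequential order."""
    b, total = [], 0
    for x in a:
        total += x
        b.append(total)
    return b

def rolling_sum_parallel(a):
    """Rule 2: b[i] = a[0] + ... + a[i], each cell computed
    independently. Every cell could run in parallel, at the cost of
    O(n^2) total work."""
    return [sum(a[:i + 1]) for i in range(len(a))]
```

Both produce identical results; which one wins depends on the architecture and data size, which is exactly the choice the autotuner is meant to make.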
So you can see the difference. The first one has a lot fewer operations, but it has a sequential dependence: I have to compute this before this, and this before that; I have to do it sequentially. The second one is very parallel; it can calculate all the sums in parallel, except I do a lot more operations, because for each element I'm adding everything up from scratch. So, given different architectures and different situations, one might be better than the other, and that's what the system, the compiler, is able to decide. But this is how we describe it. After you write this program, the compiler goes through some analyses. To give you a rough feel for what happens: we do three new kinds of analysis in the compiler, what we call applicable regions, choice grids, and choice dependency graphs. The first comes from these rules. The interesting thing is that in most programs, when you look at the complexity, a lot of it comes from the boundary conditions. You have this nice matrix where you can do something simple, but at the boundaries you have to do something very complicated. A lot of times the boundary condition really complicates the code, but the boundary is something whose performance you don't care that much about; performance matters in the middle. So with these rules, we let you write different rules: some rules only work for the boundary, other rules work for other areas, the middle, or vice versa. So here, we wrote this rule that only works for these first three elements, because to calculate an element you need the previous element, and there there's no previous element. So this rule only applies to these three, whereas this rule applies to all the elements.
So at this point we have different ways of calculating each value, and what you do is take the rules and figure out which rule can be used to calculate which part of the data. By doing that we can nicely separate out these weird boundary conditions; you don't have to write one complicated mess. The next thing we do is partition the entire data set by rules. We can say: wait a minute, this data item can only be calculated with rule 1, while these data items can use both rules, zero and one. By doing that I can say, okay, I do a different set of analyses for this item because I have a different set of choices than over here. So I divide the data accordingly. And finally I come up with a choice dependency graph, which says: given a region, what are the rules you can use and what are the dependencies? Here I can only use rule zero, with this dependence; for the others I have four different rules I can use. This gives you an indication of all the freedom the execution has to get the results. Putting it all together: you take your source program, and the PetaBricks compiler creates an autotuning binary. That encodes all the rules you can use and all the dependencies you have. Some rules can apply to all the data; some rules have to wait until certain conditions are satisfied. It gives you all the choices. Then you can do offline autotuning: we generate data and keep running through all the possibilities. And once you figure out a really good set of choices that gives you the best algorithm, you can compile that in and get a final binary that doesn't carry all these choices.
So you don't have all the choices at runtime. The autotuner basically uses a genetic tuner to find all the hard binary decisions, and an n-ary search algorithm for things like block size, where you have to search a linear space. One interesting thing we do in our autotuner is start with a very small dataset. If you have a 2-by-2 matrix, you can try millions of different choices. Then we use the best set to seed the next, larger size. As you go to larger and larger problems, you don't have to go through the entire space; you don't look at all the possibilities. We're not eliminating anything; we're just giving it less probability. For example, take blocking: a 2-by-2 matrix gets nothing from blocking, it's just slow overhead. So at that size the evaluation will say it's really bad, and it gets a low priority. But as you go larger and larger, once you start looking at a matrix that doesn't fit into anyone's cache, when you try blocking you suddenly realize it's a good algorithm; it improves and moves into the better set. The reason for doing this is that if you run a really bad algorithm on a really large dataset, you can be there forever. So we want to make sure we don't try really bad things that often, especially at large sizes. This bottom-up autotuning actually helps a lot. So let me show you some of our early results. Here is sort. Sort is a classic; there are many different ways of doing it, and something like insertion sort is really good when the data is small, but with a lot of data, something like radix sort becomes much better. So what we can do, and there are a lot more crossover points than you can see here, is learn a hybrid, the poly-algorithm, that's better than everybody.
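The bottom-up tuning strategy described above, where winners on small inputs seed the search at larger sizes, can be sketched in a few lines of Python (a toy illustration, not the actual PetaBricks genetic tuner; the candidate pool, keep-half policy, and benchmark callback are all invented for the example):

```python
def tune_cutoff(candidates, sizes, benchmark):
    """Toy bottom-up tuner: evaluate candidate configurations on small
    input sizes first, and keep only the best performers as seeds for
    the next, larger size. Bad configurations are deprioritized early,
    so they are never run on expensive large inputs."""
    pool = list(candidates)
    for n in sizes:                      # sizes in increasing order
        scored = sorted(pool, key=lambda c: benchmark(c, n))
        pool = scored[:max(2, len(scored) // 2)]   # keep the best half
    return pool[0]                       # best surviving configuration
```

A real tuner would mutate and recombine survivors rather than only filter them, but the key idea, never paying full price for bad choices at large sizes, is the same.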
What it does is simply switch to a different algorithm at different switch points as it goes down. Let me show you a little pictorial of the type of thing we do. Let's look at four sorting algorithms: merge sort, insertion sort, radix sort, and quicksort. Poly-algorithms are not new. If you go look at the Standard Template Library, its sort is in fact a poly-algorithm: a merge sort that, at 15 elements, switches to insertion sort. We were pretty impressed, like, wow, these people have done that, and then we said, wait a minute, why is it 15? We tried to figure out where that came from, and we realized it was introduced around 1999 by SGI, in SGI's STL, and they picked 15. And it's still 15, because 15 is a constant in the program; there's no way the compiler can change it. So we asked: does 15 make sense? In fact, something like 2,500 makes sense today, because caches are big and so on. But it's still 15; if you go look at your STL there will be a constant saying to switch at 15. So we said, okay, instead of 15, if you actually do it right, what happens? The interesting thing is, on a one-core Xeon, you start with radix sort, at about 98 elements switch to 4-way merge sort, and at about 75 switch to insertion sort. Great. But on an 8-way Xeon, it's a very different story: it starts with an n-way merge sort, and at about 1,200, because now it's parallel, switches to quicksort, and at about 600 to insertion sort. Very different cut-off points. And something like the Sun Niagara is a different story again; it has a very different memory system, so it never does anything other than merge sort.
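The STL-style cutoff poly-algorithm discussed above can be sketched directly (a minimal Python illustration; the constant 15 is the historical STL value the talk mentions, and an autotuner would search it per machine):

```python
CUTOFF = 15  # the SGI STL constant; an autotuner would retune this

def insertion_sort(xs):
    """Fast on tiny inputs: low constant factors, cache friendly."""
    for i in range(1, len(xs)):
        key, j = xs[i], i - 1
        while j >= 0 and xs[j] > key:
            xs[j + 1] = xs[j]
            j -= 1
        xs[j + 1] = key
    return xs

def hybrid_sort(xs, cutoff=CUTOFF):
    """Merge sort that switches to insertion sort below `cutoff`,
    the simplest possible poly-algorithm."""
    if len(xs) <= cutoff:
        return insertion_sort(xs)
    mid = len(xs) // 2
    left = hybrid_sort(xs[:mid], cutoff)
    right = hybrid_sort(xs[mid:], cutoff)
    out, i, j = [], 0, 0          # standard two-way merge
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out
```

The talk's point is that the right `cutoff` (and even the right pair of algorithms) differs per machine, so it should be a tuned parameter, not a source-code constant.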
It does different-width merge sorts: 2-way, 4-way, 8-way, 16-way merge sorts at different levels. So if you put them all together, here's the kind of summary chart you get. What I show here are different architectures: a Mobile Core 2 Duo, a couple of different Xeon systems, and a Sun Niagara, and the algorithm each one learns. For instance, this one starts with quicksort, then switches to merge sorts: 1-way merge sort, 8-way merge sort, and insertion sort. So there are all these combination algorithms that are very different. The first question is: how important is this selection? Is there a big performance difference? To test that, we took the algorithm tuned for each machine, so the "Mobile" algorithm means this one here, ran it on the other systems, and looked at the slowdown. If you take the algorithm tuned here and run it on a 1-way Xeon, it runs about 60% slower. If you run it on an 8-way Xeon, it also runs about 60% slower. And if you take the algorithm tuned for the Niagara and run it on a 2-way Xeon, it's about 2.3x slower. What that means is that even across architectures of almost the same generation, which don't look that different, if you don't tune for the machine you're losing a lot. This is a simple architecture and a simple algorithm, but you get a huge difference. If you look at something like matrix multiply, there are all these different ways: Strassen's, transpose, recursive blocking, the simple basic one, each with its own performance, and what we can do is build a hybrid that's better than all of them. The nice thing about the hybrid is that it should always be at least as good as the best single algorithm; we shouldn't end up in a worse place, because we can always just switch to that algorithm.
And if you look at something like an eigenvector solve, because it is composite, you can keep switching algorithms as you go down, and you can actually get performance that is much better than everything else. What happens is that at the low end, if you [INDISCERNIBLE], and then basically it switches [INDISCERNIBLE], so you get results. This is an area where we are writing a lot more benchmarks these days, trying to get a better understanding of the language. Now I want to switch gears a little and talk about two different extensions, depending on how much time I have. The first one is variable accuracy. We are used to a world where things look black and white: we say, okay, your algorithm will give you one result, always. But the world is not that simple; it's a place with a lot of shades of gray. Think of iterative algorithms: you iterate a certain number of times, and at some point somebody says, okay, this is good enough, I'm going to stop. Or things like signal processing, images and sound. What's the perfect sound? If it sounds good to you, you're happy, and different people probably have different perceptions. So there's a lot of give and take you can do, things like approximation algorithms. In many of these you can trade accuracy for speed, or, given a certain amount of time, ask what's the best accuracy you can get. Most of the time, this is what users want: solve this to a certain accuracy as fast as possible, using whatever you can; or vice versa, given a certain time, give me the best accuracy you can get. One interesting algorithm for this is called multigrid. For people who don't know multigrid, I'll do a quick intro.
Multigrid iteratively solves partial differential equations on a gridded domain, so you have a bunch of data points. One way to think about it: assume you have a big sheet of metal; in one place you put a heat source, and you want to figure out how the heat propagates. The iterative way to do it is to divide the sheet into a bunch of grid points, and at every point, on every iteration, you update the value from its neighbors. Each iteration the heat propagates a little bit, and if you do it enough times, at some point you get the full propagation of the heat. But you can do this many different ways. One is called relaxation: update points using neighboring values. It's an expensive computation; with a five-point stencil you look at your point and its four nearest neighbors and update. But it takes a long time, because on a large sheet each step propagates information only one pixel, slowly moving across, so covering a large grid takes a very long time. The other thing people do is called restriction and interpolation. Instead of working only on the very fine grid that you want the final result on, you first go to a coarser grid. Then you have many fewer points, and the values propagate very fast. After the values propagate there, you interpolate back up to the finer grid. So you take the fine resolution here, restrict down to a coarser grid, relax a lot there, and then, once you're happy with that, interpolate back up to the finer grid you want.
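The relaxation step described above can be shown concretely with a toy 1-D version (an illustrative sketch, not the talk's code: plain Jacobi relaxation for the 1-D Laplace equation with fixed endpoints):

```python
def jacobi_relax(u, sweeps):
    """One tool in the multigrid toolbox: Jacobi relaxation on a 1-D
    grid with fixed boundary values u[0] and u[-1]. Each interior point
    is replaced by the average of its two neighbors. Information moves
    only one grid point per sweep, which is exactly why coarser grids
    (restriction/interpolation) speed up long-range propagation."""
    for _ in range(sweeps):
        new = u[:]                      # Jacobi: update from old values
        for i in range(1, len(u) - 1):
            new[i] = 0.5 * (u[i - 1] + u[i + 1])
        u = new
    return u
```

For the steady-state heat problem, this converges to the linear profile between the two boundary temperatures, but only after many sweeps on a fine grid, motivating the restriction step.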
So when you coarsen, a value propagates from here to here in, say, two steps, where on the fine grid it would take many more steps. In the field, there are papers published on how to do this; there are things called V-cycles, W-cycles, and full multigrid V-cycles. You start at this resolution and go down, down, and come up; or go down, down, come up a little bit, and go down again. Each of these shapes is basically a paper. So there are a lot of questions: which relaxation operations do you do? How many iterations do you run? How coarse do you go? There are three main things you can do. At some point, when you have a small number of data points, you can do a direct solve: write out the system of equations and solve it exactly; with too many data points that's infeasible. Or you can solve iteratively: run iterations again and again. Or you can recursively go to a smaller, coarser grid, do something there, and come back out. So you have the standard V-shape, where at some point you do a direct solve at the bottom; or you can direct-solve a little, go up, and then do all kinds of combinations of iterations. And when you iterate, you have to figure out how many times to iterate. I hope that gives a very fast overview of what this algorithm is about. What you end up with is a trade-off between accuracy and time: there are a bunch of algorithms that will give you a result with a certain accuracy in a certain time, and you get this two-dimensional scatter. The interesting thing is that one side is better, higher accuracy in less time; you don't want something with low accuracy that runs a long time, that's no good.
So what you actually need is the optimal frontier; that's what's most important. These are the algorithms with the best accuracy-versus-time trade-off, and depending on the amount of accuracy you want, you pick along that frontier. There are a lot of these algorithms, too many to remember, so what we do is quantize this space, and each quantum keeps one algorithm: remember, for each of these quanta, roughly the best algorithm. What we have done is use dynamic programming to manage this autotuning: we search through these cycle shapes, figure out the optimum for each size, and build it up. If you look at what's going on, our program looks like this: it has the best direct solver in there, and an iterative solver we can use, and it can call itself recursively, going down and coming back up. And all of those have iteration counts that are left unbound: run as many times as you think is good. We don't tell you how much; the system figures out what's good for the target machine and gives the answer. So we added a bunch of language extensions. For example, we have variables where I don't assign a value; the system figures out what's best for them. We have an accuracy metric: you tell me how to measure the accuracy of this program, and sometimes that's something very simple like an average or a mean error, or you can write your own. Then you can say how many quantization levels you want, and of course you need a data generator so we can generate enough training data. That's what we ask of the programmer, and then we do the rest.
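The quantization idea above, keep the fastest algorithm per accuracy bin so the frontier stays small, can be sketched as follows (a toy illustration with invented data shapes, not the PetaBricks implementation):

```python
def best_per_quantum(results, n_bins, max_acc):
    """Quantize the accuracy axis into n_bins and keep, for each bin,
    the fastest (time, algorithm) pair seen. `results` is a list of
    (accuracy, time, algorithm_name) measurements. The survivors
    approximate the accuracy/time frontier with one entry per bin."""
    best = {}
    for acc, t, algo in results:
        q = min(int(acc / max_acc * n_bins), n_bins - 1)  # bin index
        if q not in best or t < best[q][0]:
            best[q] = (t, algo)
    return best
```

A dynamic-programming tuner would then build larger-size solutions out of these per-bin winners, which is the optimal-substructure point made in the talk.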
So what happens in training is that we find the best algorithm for each accuracy bin at one resolution. For the next resolution, we construct algorithms using those as our building blocks. We get all these data points for algorithms built out of the previous level, some of them turn out to be the best in their region, and we keep those. Each resolution uses the previous resolution's results to build its algorithm. So an algorithm at one level might only use, say, the coarser-level algorithms: for example, at the finer level we might coarsen by 2x, run the coarser algorithm two times, and then use that result. To get to the higher level, you have all these choices: run this one once and coarsen, run this one twice, then maybe direct-solve at the bottom and come back out. [Audience:] By resolution, you mean the grid is twice as big? Amarasinghe: Yes, exactly. So basically, assume I have a grid of 4,096, and say I want accuracy 10^7. Then for a smaller grid I want a less accurate solution: run it some number of times at this accuracy, coarsen, run at even less accuracy, coarsen again. At some point, when it's small enough, I can do a direct solve and go back up. The best algorithm is some combination like that. What happens is we come up with these crazy shapes, different shapes for different problems. We can figure out, given the accuracy and the size of the data, the best way to go through it. And we can learn that; that's the really cool part. Also, because we are using this optimal substructure, these things can be composed into even larger programs. So here's one result, and these are two-dimensional results.
It's hard to show everything for one accuracy value here, so you have all these individual programs, but we find something that basically has the best of all of them. To give you another view, here's another interesting algorithm: bin packing. You get a bunch of different-sized items and bins to pack them into; finding the optimal packing is intractable. But there are a lot of heuristics. Some heuristics are much faster than others, but they might not be as good. What we're showing here is: if you want a result within 50% of the best you can get, there are, depending on the size, some algorithms that are really fast that will get you within 50%. But if you want to get within 10%, there are other algorithms, something like first fit, that get you close to 10% at this size, but they take longer. So given the accuracy you want and the data size, you can say: okay, this is the algorithm I want to run. We can learn that space. It's a complex space, but that's what our learning algorithm handles. Okay, I think we're about out of time, so I'm going to skip the next section; I don't think I have time for that. Let's go directly to conclusions. I think it's time we come up with languages that are based on autotuning. Autotuning exists today; a lot of people have done autotuning as libraries, but I think we need to design it into the language, because there's this convergence of multiple forces. First of all, the multicore menace: we have multicores, we have no idea how to use them, these things keep changing faster, and we have to find a way to deal with that. Second, we want to future-proof our programs: we want to write a program once and have it still work ten years down the line.
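The bin-packing heuristics mentioned above as fast-but-approximate choices can be illustrated with the classic first-fit decreasing heuristic (a standard textbook sketch, not code from the talk; it is known to use at most roughly 11/9 of the optimal number of bins):

```python
def first_fit_decreasing(items, capacity):
    """First-fit decreasing: sort items largest-first, then place each
    item into the first bin with room, opening a new bin if none fits.
    Fast and usually close to optimal, one point on the accuracy/time
    trade-off curve the talk describes."""
    bins = []
    for item in sorted(items, reverse=True):
        for b in bins:
            if sum(b) + item <= capacity:
                b.append(item)
                break
        else:                      # no existing bin had room
            bins.append([item])
    return bins
```

Cheaper heuristics (e.g. next fit) run faster with worse packing quality, and exact search is far slower, which is exactly the accuracy-versus-time space the autotuner learns over.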
We've had that until now, and the way we're going with multicores, we're going to lose it unless we figure out some way of doing this. The other interesting thing is that we have a lot of computing muscle available, but not the human brainpower to match. In fact, compilers and programming languages are what everybody uses to take advantage of computing power, yet we compiler writers are among the least demanding users of computers ourselves. If you look at things like CAD tools, they're all embracing high-performance computing; their tools run on large systems, and they do beautiful visualization and all of those things. In compilers, we are still running on one core. There's no parallel compiler: we have parallelizing compilers, but none of the compilers themselves run in parallel. So the key question is: how can we, the compiler community, use the computing power we are trying to manage to actually make things better? I think there's a lot more we can do if we embrace that and say, okay, we are going to use a lot of computing power. With PetaBricks we have shown it can be done and that it has benefits. However, the project is far from done; there are a lot of questions to be answered. For example, we're asking people to do more work. Normally you get the first algorithm written and working, and only then try the second one and the third one. These days, time to market is critical; it's very hard to convince somebody, once they've got it working, to do extra work. So will people be willing to say, "Okay, I'll give you other options, even though my first option is perfectly sufficient"? We're asking for extra.
So convincing people to do that is still an open question: is the average Joe programmer, once they've got something released, really going to go back and give me other choices, or will they move on to the next thing? And also, now you take a couple of programs and build a very complex algorithm out of them. How do you know it's correct? There's a lot of complexity in verification and validation, and in fact, if you do it dynamically, every time you run you might run a different algorithm. That's a nightmare for a lot of people. One saving grace we found is that, especially when you're not doing variable accuracy, we can compare the algorithms against each other, and in fact we found bugs. If you write a new algorithm, it had better produce the same results as the other ones we have. So when you're doing the learning, you're not just looking at the time; you're actually comparing the results. And we actually found a lot of bugs: oh, wait a minute, this algorithm produced a different result; something might be wrong. So we can actually use it for testing. But verification and validation are still open: how do you validate a poly-algorithm? So that's all I have. I'm open for questions. [APPLAUSE] Amarasinghe: Should we open up for some questions, because they might help [INDISCERNIBLE]. Questions? I know we have [INDISCERNIBLE]. They're sitting there, they're paying attention; otherwise they wouldn't be sitting there for hours. Okay, there's a question. Go ahead. [Audience:] Thanks for the talk. Can you hear me? Amarasinghe: Yes. [Audience:] Thanks for the talk. Toward the end, when you were talking about variable-accuracy computation, how do you anticipate users reasoning about the various dials they can specify for their particular code?
Because, in some sense, one of your examples had a field where you had to specify the range of accuracy, or accuracy quantiles. Do you have some tool that helps users figure out exactly what those quantiles should be? Amarasinghe: Right now, accuracy quantiles are optional. What we will do by default is look at the range of values you can get and basically chop it up into pieces. The accuracy quantile becomes important if you have a non-linear kind of range. At this point we don't have any tools like that, because a lot of it is, I think, the semantics of the program. If it's a simple linear range, you don't need to specify it. But we have found programs where accuracy becomes very important in certain ranges, where you have to quantize much more finely than in other ranges. So if you know more information about your program, we provide a way to express it, but it's optional. Thinking about some of these accuracies can be hard, but on the other hand, it might be liberating too, because what we're asking you to do is tell us less. So instead of saying, "i equals, hmm, how many times should I iterate?" and picking a number, you leave it blank and we do whatever works. In some sense, I'm asking you to do less programming, not more. So hopefully, and this of course has to be proven, this might actually be easier for programmers, because for iterative algorithms, people often spend a huge amount of time trying to figure out the right number of iterations. And here what we're saying is: don't do it.
[END RECORDING] NCSU Creative Services, TCSDLS-20111017-Amarasinghe, October 17, 2011. Transcript prepared by Rogers Word Service, 919-834-0000 / 1-800-582-8749.



Most concurrencies are simply a combination of at least two route numbers on the same physical roadway. This is often both practically and economically advantageous; it may be better for two route numbers to share one roadway along rivers or through mountain valleys than to build two separate highways. Some countries allow concurrencies, while others specifically prohibit them. In nations that do permit concurrencies, they can become very common, and a variety of forms can occur.

An example of this is the concurrency of Interstate 70 (I-70) and I-76 on the Pennsylvania Turnpike in western Pennsylvania. I-70 merges with the Pennsylvania Turnpike so that the route can ultimately continue east into Maryland; instead of building a second physical highway to carry the route, it shares the Pennsylvania Turnpike roadway with the I-76 designation.[8] A triple Interstate concurrency is found along the five-mile (8.0 km) section of I-41, I-43, and I-894 in Milwaukee, Wisconsin.[9] The concurrency of I-41 and I-43 on this roadway is an example of a wrong-way concurrency.

The longest Interstate highway concurrency is I-80 and I-90 for 265 miles (426 km) across Indiana and Ohio.[10]

There are examples of eight-way concurrencies: I-465 around Indianapolis and Georgia State Route 10 Loop around downtown Athens, Georgia. Portions of the 53-mile (85 km) I-465 overlap with I-74, US Highway 31 (US 31), US 36, US 40, US 52, US 421, State Road 37 (SR 37) and SR 67—a total of eight other routes. Seven of the eight other designations overlap between exits 46 and 47 to create an eight-way concurrency.[11]

Regional examples

North America

The QEW concurrent with Highway 403 in Ontario

In the United States, concurrencies are simply marked by placing signs for both routes on the same or adjacent posts. The federal Manual on Uniform Traffic Control Devices prescribes that, when these signs are mounted together, the numbers be arranged vertically or horizontally in order of precedence. The order is Interstate Highways, U.S. Highways, state highways, and finally county roads, and within each class by increasing numerical value.[12]
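This precedence rule amounts to a simple two-level sort key: first by route class, then by number. A minimal sketch in Python (the class-rank table and tuple representation are illustrative, not an official data format):

```python
# Order concurrent route shields per the MUTCD convention:
# by route class (Interstate, then U.S., then state, then county),
# and within each class by increasing route number.
CLASS_RANK = {"Interstate": 0, "US": 1, "State": 2, "County": 3}

def shield_order(routes):
    """Sort (route_class, number) pairs into posting order."""
    return sorted(routes, key=lambda r: (CLASS_RANK[r[0]], r[1]))

# Example: some routes from the Greensboro concurrency pictured above
routes = [("US", 70), ("Interstate", 40), ("US", 29), ("US", 220)]
print(shield_order(routes))
# [('Interstate', 40), ('US', 29), ('US', 70), ('US', 220)]
```

The Interstate shield is posted first even though 40 is larger than 29, because route class outranks route number.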

Several states do not officially have any concurrencies, instead officially ending routes on each side of one.[a] There are several circumstances where unusual concurrencies exist along state borders. One example occurs along the Oklahoma–Arkansas state line. At the northern end of this border, Oklahoma State Highway 20 runs concurrently with Arkansas Highway 43, and the two highways run north–south along the boundary.[14]

Concurrencies are also found in Canada. British Columbia Highway 5 continues east for 12 kilometres (7.5 mi) concurrently with Highway 1 and Highway 97, through Kamloops. This stretch of road, which carries Highway 97 south and Highway 5 north on the same lanes (and vice versa), is the only wrong-way concurrency in British Columbia.

In Ontario, the Queen Elizabeth Way and Highway 403 run concurrently between Burlington and Oakville, forming the province's only concurrency between two 400-series highways.[15] The concurrency was not in the original plan, which called for the QEW and Highway 403 to run parallel to each other; the Hamilton–Brantford and Mississauga sections of Highway 403 were initially to be linked along the corridor now occupied by Highway 407. A later plan to renumber the Mississauga section of Highway 403 as Highway 410 also never came to pass.[16] Consequently, Highway 403 was signed concurrently along the Queen Elizabeth Way in 2002, remedying the discontinuity for drivers who wanted to travel between the two segments without using the tolled Highway 407. Nonetheless, many surface street signs referring to that section of freeway still use only the highway's original designation of QEW, although the MTO has updated route markers on the QEW to reflect the concurrency.[17]

Europe


Concurrency of the city beltway, a European road, and three first-class roads in Hradec Králové, Czech Republic
Concurrency of several cycling routes in Písek, Czech Republic

In the United Kingdom, routes do not run concurrently with others. Where this would normally occur, the roadway takes the number of only one of the routes (usually, but not always, the most important route), while the other routes are considered to have a gap and are signed in brackets (the equivalent of "to" signs in North America). An example is the meeting of the M60 and the M62 northwest of Manchester: the motorways coincide for the seven miles (11 km) between junctions 12 and 18 but the motorway between those points is only designated as the M60. European route numbers as designated by UNECE may have concurrencies (for instance E15 and E30 around Greater London), but since the E-route numbers are unsigned and unused in the UK, the existence of these concurrencies is purely theoretical.

In Sweden and Denmark, the most important highways use only European route numbers, which carry cardinal directions. In Sweden, the E6 and E20 run concurrently for 280 kilometres (170 mi); in Denmark, the E47 and E55 run concurrently for 157 kilometres (98 mi). There are several shorter concurrencies as well. Two stretches, one in each country, carry three European routes concurrently: E6, E20 and E22 in Sweden, and E20, E47 and E55 in Denmark. Along all these concurrencies, all route numbers are posted on signs.[18]

In the Czech Republic, European route numbers are only supplementary and are always concurrent with the state route numbering, usually on motorways or first-class roads. In the state numbering system, concurrencies exist only on first-class and second-class roads; third-class roads do not have them. The local term for such a concurrency is peáž (derived from the French word péage). In the road register, one of the roads is considered the main ("source") road and the others the péaging (guest) roads. The official road map allows a maximum of five concurrent routes of the intrastate numbering system.[19] Cycling routes and hiking routes are also often concurrent.

The Middle East

In Israel, two freeways, the Trans-Israel Highway (Highway 6), and Highway 1 run concurrently just east of Ben Shemen Interchange. The concurrency is officially designated "Daniel Interchange", providing half of the possible interchange directions. It is a one-mile (1.6 km) segment consisting of eight lanes providing high-speed access between the two highways. Access from Highway 1 west to Highway 6 south and Highway 6 north to Highway 1 east is provided via Route 431, while access between Highway 1 east to Highway 6 north and Highway 6 south to Highway 1 west are provided at Ben Shemen Interchange. The other movements are provided through the concurrency.[20]

Wrong-way concurrencies

This westbound highway in southwestern Virginia simultaneously carries I-77 and I-81 in opposite directions. The wrong-way concurrency is also reflected in US 52 and US 11, which are concurrent with I-77 and I-81, respectively.
An example of a wrong-way concurrency in Oklahoma City, Oklahoma; the wrong-way concurrency is highlighted in red.

Since highways in the United States and Canada are usually signed with assigned cardinal directions based on their primary orientation, it is possible for a stretch of roadway shared between two highways to be signed with conflicting, even opposite, cardinal directions in a wrong-way concurrency. For example, near Wytheville, Virginia, there is a concurrency between I-77 (which runs primarily north–south, as it is signed) and Interstate 81 (which runs primarily northeast–southwest, but is also signed north–south). Because of the way they intersect, the section of Interstate where they overlap has the two roads signed for opposite directions, leading to the town's nickname of "Which-Way-Ville".[21] One might simultaneously be on I-77 northbound and I-81 southbound, while actually traveling due westbound.[22] An unusual example of a three-directional concurrency occurs southeast of Rhinelander, Wisconsin, where US 8 westbound (the actual compass direction) converges with southbound Wisconsin Highway 17 and northbound Wisconsin Highway 47.

Effect on exit numbers

When two routes with exit numbers overlap, one route's exit numbers typically take precedence over the other's. This can result in a highway having two exits with the same number, usually far apart (though in one case two exits numbered 96 are only 6 miles [9.7 km] apart). An example comes from the concurrency of I-94 and US 127 near Jackson, Michigan. The concurrent section of freeway has an exit with M-106, numbered exit 139 using I-94's mileage-based numbers. US 127 also has another exit 139, at the southern end of the US 127 business loop in Mount Pleasant, Michigan. (US 127's mile markers in Michigan reflect the cumulative distance north of the Ohio state line; the numbers resume north of the I-94 overlap and include the distance accumulated on that concurrency.)[23]
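The mileage bookkeeping behind duplicate exit numbers can be sketched briefly: a route's mileage-based exit numbers come from its own cumulative mileage, which keeps accruing even while the route rides along a concurrency that posts the dominant route's numbers. The figures below are purely illustrative, not actual mileposts:

```python
# Illustrative sketch: mileage-based exit numbering on a concurrency.
# A route's exit number is its own cumulative mileage, rounded down,
# even where another route's numbers are the ones actually posted.
def exit_number(miles_before_segment, mile_into_segment):
    """Exit number = total miles the route has accumulated so far."""
    return int(miles_before_segment + mile_into_segment)

# Hypothetical: a secondary route has logged 130 miles when it joins a
# dominant route; an interchange sits 9 miles into the overlap.
print(exit_number(130, 9))  # 139 by the secondary route's own count

# North of the overlap, its numbering resumes including those miles,
# so an exit 9 miles past a 12-mile overlap would be:
print(exit_number(130 + 12, 9))  # 151
```

The posted number on the overlap itself would come from the dominant route's mileage, which is why the same number can later reappear on the secondary route's independent section.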

There are also instances where the dominant route's exit numbers run far beyond the secondary route's highest independent exit number. On the concurrency of I-75 and I-85 in Atlanta, Georgia, where I-75's numbers are used, the exit numbers range from 242 to 251, while I-85's highest independent mile marker in Georgia is 179.[24]

Consolidation plans

US 1/9 concurrency signed on one shield

Some brief concurrencies in the past have been eliminated by reassigning the designations along the roadways. This can involve truncating one designation at the end of a concurrent section and extending another designation to replace it along the former overlap.

US 27 in Michigan previously ran concurrently with I-69 from the Michigan–Indiana state line to the Lansing, Michigan, area, from where it turned northwards to its terminus at Grayling. In 1999, the Michigan and Indiana departments of transportation petitioned the American Association of State Highway and Transportation Officials for permission to truncate US 27 at Fort Wayne, Indiana.[25] In 2002, Michigan removed the US 27 designation from I-69 and renumbered the highway between Lansing and Grayling as an extension of US 127.[26] MDOT's stated reason for the change was to "reduce confusion along the US 27/US 127 corridor".[27]

Some consolidation schemes involve the use of incorporating two single-digit numbers onto one marker, as along the US 1/9 concurrency in northern New Jersey.[28] In the mid-20th century, California had numerous concurrencies, but the California Legislature removed most of them in a comprehensive reform of highway numbering in 1964.[29]

See also


  1. ^ Arkansas's highways exist in many officially designated "sections" rather than forming concurrencies; Arkansas Highway 131, for example, exists in five sections.[13]


  1. ^ Esri (March 4, 2014). "Realigning Concurrent Routes". ArcGIS Help 10.2 & 10.2.1. Esri. Retrieved April 8, 2014.
  2. ^ "Freeway Flaws: Fixing Them May Take Decades". Star Tribune. Minneapolis. June 3, 2005. common sections ... 2 freeways share a single right-of-way
  3. ^ Esri (December 19, 2012). "Realigning Overlapping Routes". ArcGIS Resource Center. Esri. Retrieved April 8, 2014.
  4. ^ Office of Highway System Engineering (August 1995). "State Highway Routes Selected Information, 1994 with 1995 Revisions" (PDF). California Department of Transportation. Route 3. Archived from the original (PDF) on March 16, 2007. Retrieved March 7, 2012. Coincident with Rte 299
  5. ^ Reichard, Timothy. "Guide to Highway Multiplexes". Central PA/MD Roads. Retrieved April 8, 2014.
  6. ^ Kanillopoolos, John J. (October 19, 1982). "Dual and Triple Routing on State Trunklines". Letter to Trunkline Numbering Committee. Lansing: Michigan Department of Transportation. Retrieved June 3, 2019 – via Wikisource.
  7. ^ Kanillopoolos, John J. (March 17, 1983). "Dual and Triple Routing on State Trunklines". Letter to Trunkline Numbering Committee. Lansing: Michigan Department of Transportation. Retrieved June 3, 2019 – via Wikisource.
  8. ^ Pennsylvania Department of Transportation Geographic Information Section (2010). Tourism & Transportation Map (Map). Scale not given. Harrisburg: Pennsylvania Department of Transportation. §§ E10–L11.
  9. ^ Srubas, Paul (April 9, 2015). "It's Officially Interstate 41 Now in Wisconsin". Green Bay Press-Gazette. Retrieved May 27, 2015.
  10. ^ Federal Highway Administration (December 31, 2013). "Table 1: Main Routes of the Dwight D. Eisenhower National System of Interstate and Defense Highways as of December 31, 2013". Route Log and Finder List. Federal Highway Administration. Retrieved April 8, 2014.
  11. ^ Indiana Department of Transportation (2007). Indiana Transportation Map (Map) (2007–08 ed.). Scale not given. Indianapolis: Indiana Department of Transportation. Indianapolis inset.
  12. ^ Federal Highway Administration (2009). "Chapter 2D. Guide Signs: Conventional Roads, §2D.29: Route Sign Assemblies" (PDF). Manual on Uniform Traffic Control Devices (Revisions 1&2, 2009 ed.). Washington, DC: Federal Highway Administration. p. 148. ISBN 9781615835171. Retrieved December 20, 2014.
  13. ^ Planning and Research Division (April 2010). State Highways 2009 (Database). Arkansas State Highway and Transportation Department. Archived from the original (ZIP) on July 7, 2011. Retrieved April 11, 2011.
  14. ^ Arkansas State Highway and Transportation Department Planning and Research Division (2010). State Highway Map (Map). 1:950,400. Little Rock: Arkansas State Highway and Transportation Department. § A1.
  15. ^ Ministry of Transportation of Ontario Geomatics Office (2010). Official Road Map / Carte Routière (Map) (2010–11 ed.). 1:250,000 (in English and French). Toronto: Ministry of Transportation of Ontario. § R25.
  16. ^ Mitchell, Bob (April 6, 1995). "Rae Announces 407 Extension". Toronto Star. p. BR03.
  17. ^ "Signs of the Times". Milestones. Ontario Good Roads Association. 2 (1): 26, 31. February 2002. Archived from the original on April 26, 2012. Retrieved January 2, 2012.
  18. ^ Check Google Streetview at 55°33′00″N 13°03′06″E / 55.5500993°N 13.0517037°E / 55.5500993; 13.0517037, 55°34′48″N 12°15′42″E / 55.5800398°N 12.2615534°E / 55.5800398; 12.2615534 and neighboring locations[full citation needed]
  19. ^ "Číslo peažující silnice" [Number of the péaging road]. Explanatory notes to the road map (in Czech). Ředitelství silnic a dálnic [Directorate of Roads and Highways]. Archived from the original on 2015-05-18. Retrieved 2015-05-17.[full citation needed]
  20. ^ עורכת אחראית אלנה בלינקי; Elena Belinki (2014). 2014 אטלס הזהב [Atlas HaZahav 2014] (in Hebrew) (9th ed.). מפה הוצאה לאור [Mapa Publishing]. ISBN 978-965-521-136-8. Archived from the original on 2014-07-14. Retrieved 2014-06-08.
  21. ^ "Which-way-Ville Making Sense of Wytheville". Appalachia Magazine. December 2, 2017.
  22. ^ Virginia Department of Transportation (2012). Official State Transportation Map (Map) (2012–14 ed.). c. 1:832,680. Richmond: Virginia Department of Transportation. §§ F6–G6.
  23. ^ Michigan Department of Transportation (2013). Pure Michigan: State Transportation Map (Map). c. 1:975,000. Lansing: Michigan Department of Transportation. §§ J10, M11. OCLC 42778335, 861227559.
  24. ^ Georgia Department of Transportation (2011). Official Highway and Transportation Map (PDF) (Map) (2011–2012 ed.). Scale not given. Atlanta: Georgia Department of Transportation. Main map, §§ B1, I2; Atlanta inset, § E5. OCLC 770217845.
  25. ^ Special Committee on U.S. Route Numbering (April 17, 1999). "Report of the Special Committee on U.S. Route Numbering to the Standing Committee on Highways" (PDF) (Report). Washington, DC: American Association of State Highway and Transportation Officials. Archived (PDF) from the original on October 16, 2017. Retrieved May 24, 2008.
  26. ^ Ranzenberger, Mark (April 27, 2008). "US 127 Signs Getting Updated". The Morning Sun. Mount Pleasant, MI. pp. 1A, 6A. OCLC 22378715. Retrieved August 23, 2012 – via NewsBank.
  27. ^ a b Debnar, Kari & Bott, Mark (January 14, 2002). "US 27 Designation Soon To Be Deleted from Michigan Highways" (PDF) (Press release). Michigan Department of Transportation.
  28. ^ New Jersey Department of Transportation. Signage for US 1/9, NJ 21, US 22, and I-78 (Highway guide sign). Newark, NJ: New Jersey Department of Transportation. Retrieved December 5, 2009.
  29. ^ "Route Renumbering: New Green Markers Will Replace Old Shields" (PDF). California Highways and Public Works. 43 (1–2): 11–14. March–April 1964. ISSN 0008-1159. Retrieved March 8, 2012.

External links

This page was last edited on 1 January 2020, at 14:29