Anton (computer)


Anton is a massively parallel supercomputer designed and built by D. E. Shaw Research in New York, first running in 2008. It is a special-purpose system for molecular dynamics (MD) simulations of proteins and other biological macromolecules. An Anton machine consists of a substantial number of application-specific integrated circuits (ASICs), interconnected by a specialized high-speed, three-dimensional torus network.[1]

Unlike earlier special-purpose systems for MD simulations, such as MDGRAPE-3 developed by RIKEN in Japan, Anton runs its computations entirely on specialized ASICs, instead of dividing the computation between specialized ASICs and general-purpose host processors.

Each Anton ASIC contains two computational subsystems. Most of the calculation of electrostatic and van der Waals forces is performed by the high-throughput interaction subsystem (HTIS).[2] This subsystem contains 32 deeply pipelined modules running at 800 MHz arranged much like a systolic array. The remaining calculations, including the bond forces and the fast Fourier transforms (used for long-range electrostatics), are performed by the flexible subsystem. This subsystem contains four general-purpose Tensilica cores (each with cache and scratchpad memory) and eight specialized but programmable SIMD cores called geometry cores. The flexible subsystem runs at 400 MHz.[3]

Anton's network is a 3D torus and thus each chip has 6 inter-node links with a total in+out bandwidth of 607.2 Gbit/s. An inter-node link is composed of two equal one-way links (one traveling in each direction), with each one-way link having 50.6 Gbit/s of bandwidth. Each one-way link is composed of 11 lanes, where a lane is a differential pair of wires signaling at 4.6 Gbit/s. The per-hop latency in Anton's network is 50 ns. Each ASIC is also attached to its own DRAM bank, enabling large simulations.[4]
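The link figures quoted above are internally consistent; a quick sanity check (a minimal sketch in Python, using only numbers from this article):

```python
# Back-of-the-envelope check of the Anton 1 link figures cited above.
lane_rate_gbps = 4.6                             # one differential pair
lanes_per_link = 11
one_way_gbps = lane_rate_gbps * lanes_per_link   # 50.6 Gbit/s per one-way link
links_per_node = 6                               # +/-x, +/-y, +/-z in a 3D torus
directions = 2                                   # each inter-node link is two one-way links
total_gbps = one_way_gbps * directions * links_per_node
print(one_way_gbps, total_gbps)                  # 50.6, 607.2 Gbit/s in+out per node
```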

The performance of a 512-node Anton machine is over 17,000 nanoseconds of simulated time per day for a protein-water system consisting of 23,558 atoms.[5] In comparison, MD codes running on general-purpose parallel computers with hundreds or thousands of processor cores achieve simulation rates of up to a few hundred nanoseconds per day on the same chemical system. The first 512-node Anton machine became operational in October 2008.[6] The multi-petaFLOP[7] distributed-computing project Folding@home has achieved aggregate ensemble simulation timescales comparable to the length of a single continuous Anton simulation, reaching the 1.5-millisecond range in January 2010.[8]
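To put those rates in perspective, a rough calculation (a sketch; the conventional-hardware rate below is an assumed value inside the "few hundred nanoseconds per day" range quoted above):

```python
# Rough scale of the gap described above. 17,000 ns/day is from this article;
# 300 ns/day is an assumed figure in the quoted "few hundred" range.
anton_ns_per_day = 17_000
conventional_ns_per_day = 300
target_ns = 1_000_000                        # one millisecond of simulated time

print(target_ns / anton_ns_per_day)          # ~59 days on a 512-node Anton
print(target_ns / conventional_ns_per_day)   # ~3,333 days (~9 years) conventionally
```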

The Anton supercomputer is named after Anton van Leeuwenhoek,[9] who is often referred to as "the father of microscopy" because he built high-precision optical instruments and used them to visualize a wide variety of organisms and cell types for the first time.

The Anton 2 machine, with 4 × 512 nodes and substantially increased speed and problem size, has been described.[10]

The National Institutes of Health has supported an Anton system for the biomedical research community at the Pittsburgh Supercomputing Center of Carnegie Mellon University, and as of August 2020 this support continues with an Anton 2 system.

YouTube Encyclopedic

  • Anton: A Special-Purpose Machine That Achieves a Hundred-Fold Speedup in Biomolecular Simulations
  • The Anton Supercomputer at D.E. Shaw
  • High Speed Protein Simulations with Anton

Transcription

Folks, hopefully you can hear me. My name is (unintelligible). I'm a professor here in Computer Science and also the director of (unintelligible). It's a great pleasure to introduce David Shaw for the Triangle Computer Science Distinguished Lecture Series. David is one of the best-known names in Computer Science, and that's for a very good reason. He pioneered the use of high-performance computing in the financial markets and is now having a similar transformative impact on HPC and its uses in molecular dynamics and (unintelligible) simulations. David earned his Ph.D. in Computer Science in 1980, I believe, from Stanford (unintelligible) and served on the faculty of the Columbia University Computer Science Department until 1986. He founded the D.E. Shaw Group, a big name on Wall Street, in 1988, and since 2001 he's also had time to run a hands-on computational biochemistry research program at D.E. Shaw Research. He was appointed to the President's Council of Advisors on Science and Technology, that's PCAST, by President Clinton in 1994 and was reappointed by President Obama in 2009. He was elected to the National Academy of Engineering in 2012. He's a fellow of the American Academy of Arts and Sciences and a winner of the prestigious (unintelligible). His recent work focuses on one of the great challenges of our time, one that spans computer science and biochemistry: how to accurately simulate the motion of protein molecules for more than just a few microseconds. Dr. Shaw and his research team developed Anton, which is a special-purpose HPC machine for (unintelligible) problems. The D.E. Shaw Research team's work led to Science magazine naming the research the machine enabled one of the ten most important scientific breakthroughs of 2010. On a personal note, I want to tell you that David is one of the most gentle (unintelligible) and pleasant people to work with you'll ever meet. And it is truly a great joy for me to welcome him to this Distinguished Lecture Series. Thank you (unintelligible).

Thank you. The last part means a lot to me. Let me just start by telling you a little bit about what we're doing at D.E. Shaw Research. As Stan said, I abandoned Wall Street about eleven years ago to spend full time on a problem, or a set of problems, that was intriguing to me as a computer scientist, even though I really had no background at that point in biology or chemistry or any of the areas that turned out to be necessary. And that was the question of how we can simulate the motion of biologically important macromolecules: big molecules in the body that are important in various ways, and in particular protein molecules, which are key to almost everything in the body. Those and nucleic acids account for a lot of what runs our bodies, both in terms of proteins that serve as structural units and proteins that serve as signaling units: things where you'll have one protein that attaches to another and causes it to change shape, which in turn means a signal can be passed to another one. And you have this big, complex wiring diagram through which signals are transmitted. And then there are things with holes that go into the membranes around cells and let some molecules through while others don't get through. Those are proteins also. There's a lot of interesting biology (unintelligible).
And the way I sort of fell into this was that a good friend of mine, Rich Friesner, who's a chemistry professor up at Columbia, started telling me about the research he was doing. Our daughters have always been and are still best friends, and neither of us is very good at vacations. So, when we got together with the families we'd take long walks, and he'd tell me: this is the hard part. We can run these simulations, but not for long enough to really see what's going on with the behavior of these proteins. So, you know how when you have a hammer everything looks like a nail; I had remarkably few hammers at that point. What I had done at Columbia, and in my doctoral work also, was to design and build massively parallel machines. So, the first thing that occurred to me was: can you build a special-purpose architecture, designed in a very different way from an ordinary computer, that could vastly speed up this problem and get to the time scales that Rich said were really important? And that's what D.E. Shaw Research focuses on. There are a couple of little related problems, but basically we're interested in the use of what are called molecular dynamics simulations, which are basically just following how a protein moves over a period of time, and then using that to discover interesting things about how proteins move and how they interact with other molecules. And then the ultimate goal is to try to understand biological mechanisms, translating the structures and those motions into a better understanding of the underpinnings both of normal physiology and also of disease. When something goes wrong, where there's a mutation in a protein, or where it encounters something else that gums up the works, what's going on? And more importantly, and this is the long-term focus of D.E. Shaw Research, we'd really like to cure some diseases: to have some way of understanding mechanisms and then designing molecules, in particular the small organic molecules of the sort that most drugs are, that intervene in diseases and cure people or extend lives or improve the quality of life. So, we use our hammer, which is molecular dynamics simulations, to study these problems and try to come up with something that could actually be useful at some point.

So, let me just take you through first some of the underpinnings. I just gave you sort of a quick sketch of proteins, but I want to explain where they come from, and then how they are structured and how they move. So, let me just give you a little bit of a refresher course in basic biochemistry. For a start, DNA codes for proteins. We now know it does an awful lot of other stuff also, but it's a fairly simple code where every three base pairs together code for one of twenty amino acids. It could code for more, but some of it is redundant. And there are twenty of these types of amino acids. They're linked together just like beads on a chain. So, it's literally that simple: there are twenty colors, the beads go in a certain order, and once that happens, that determines the three-dimensional structure that they have in the body. And there are sort of different levels here, and I'm mentioning this largely so that you can interpret some of the pictures and movies. You start with this structure. Typically most of it folds up into some intermediate structures, which can be helices, or strands that lie next to each other and form sort of a flat, sheet-of-paper-like sheet.
And then in the final stage, those things typically fold up into a sort of globular shape. They take on some characteristic shape. And we know experimentally that for most proteins, even if you do this in a test tube, if you put in this string of amino acids, it'll fold up into that correct shape. It's not true for quite everything, but to a first approximation that's what you can think of. So, it's DNA, to beads on a chain, which folds up into a three-dimensional structure. And that structure moves and does interesting things. So, where are we? Well, we've decoded the genome. And even that's sort of an oversimplification, but that was a huge breakthrough scientifically, and it's driven a lot of the advances that we have now. But what that basically gives us is something like a parts list. We have these identifying numbers; you can sort of look them up. But for a large fraction of all of those genes, we don't know what the proteins are that come out of them. And even when we do, we often don't know the structure. For many proteins we do, and we're getting more all the time. But for some of the most important ones, some of the ones that live in membranes and act as switches and control a lot of different things, we really don't know what they look like, much less how they all work together. And that's what you'd really like to know. So, there's the problem, the question, of determining structure, but also the question of how the machine actually works. So, how can you do that? Now, this is sort of a computer scientist's interpretation of the world, which is that there is Computer Science, and then there is everything else, which is all mysterious. So, you know, from my viewpoint, laboratory experiments, which were the traditional way of getting information about most things biological, the basic principle by which they work is magic. What you do is you come up with fancy instruments, many of the things are wet, you shoot high-energy particles at things, you watch them vibrate, and a lot of things like that, and magically you get data out of there. This is obviously impossible, so that was ruled out as a possible paradigm for research in our lab. And in addition to that, it's very hard to get certain sorts of information. Experiments are really, really powerful and can get a lot of information that we can't get in our lab. But at the same time it's very hard to get detailed moving pictures. It's hard enough to get still pictures of some of these structures, and the amount of information that we have about exactly what the motions are that proteins go through on a biological time scale is very, very primitive. Whereas with simulation, on the other hand, if we start with a structure, if we try to fill in the gaps and see, well, it moves from this known thing to this known thing, we try to see, ah, when this happens this other thing moves. That's the kind of thing where, if you have a simulation and all the data is recorded there, you can take a look at it, rerun the movie, go backwards, study the part that you want. That's a very appealing way to get complementary information. And in some cases we think of it as a hypothesis to be validated by experimental tests, or refuted as the case may be. So, how do you do those kinds of simulations? You can very accurately simulate tiny molecules. But for molecules the size of a whole protein moving in a solvent environment, and that's mostly water with some ions floating around, you have to use some approximations. In fact you have to.
You can't solve the Schrödinger equation and get exact solutions for pretty much anything, and in this case you have to use some pretty severe approximations. The state of the art in that area is what's called molecular dynamics simulation with what's called a molecular mechanics force field. It's a classical model of the underlying physics. And I'm going to show you what one of those looks like now, not because I hope that anybody's going to go home (unintelligible) any formulas, but I'm trying to do this mostly from a computational viewpoint. Actually, incidentally, and I should ask this quickly: are most of you computer scientists? How many are computer scientists and engineers and so forth? Okay. And then biologists and chemists and other (unintelligible)? Actually more evenly balanced than I'd realized. Okay. Good. I'll try to shift a little bit the balance of who I bore for how much of the time. But let me just run through a quick overview, from a computer scientist's viewpoint, of what the algorithm looks like. And it's actually very straightforward; it's just very computationally demanding. So, when you do a molecular dynamics simulation, this is what you're doing. You start by dividing time up into a lot of tiny time steps. Then you're going to integrate Newton's laws of motion over this time in, you know, an iterative way. The problem is these time steps have to be very small in order to keep the integration stable and not come up with crazy numbers where the energies blow up and all the atoms go flying off in random directions. And the time steps that you would have to simulate at with straightforward techniques are on the order of a femtosecond, which is ten to the minus fifteenth seconds. So, whatever you have to do, you have to do a lot of it to simulate any reasonable length of time. And what you have to do in each one of those time steps, conceptually, and it's not quite as bad as this, but conceptually, is for every pair of atoms in the system you're going to calculate the forces exerted by, you know, all of the atoms on, let's say, this atom. So, it's a quadratic problem: you're looking at all possible pairwise interactions. And the forces that you look at are ones that are modeled by this molecular mechanics force field I mentioned. This is what one looks like, and I'm not going to go through most of it. But in every step we're going to evaluate something that looks very close to these equations. There are several different force fields, but they usually have almost exactly this kind of structure, with minor variations. So, the first three are what are called bonded terms, and those approximate the forces between covalently bonded atoms. The first one is for two atoms that are bonded directly to each other. The second one is, if you have three atoms, the angle between them. And the third one is a torsion angle: if I have four atoms, the way they twist around. And there are some preferred positions, preferred distances, preferred angles. There's sort of a harmonic penalty here and a trigonometric one there for being too far away from them. So this sort of pulls things that are bonded together back to preferred states and penalizes them if they're too far apart. That's not the major problem in doing MD simulations.
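The time-stepping loop Shaw sketches here can be written down compactly. A minimal illustration in Python, assuming a caller-supplied compute_forces function; real MD engines use far more elaborate integrators, constraints, and force kernels:

```python
import numpy as np

def run_md(positions, velocities, masses, compute_forces, dt=1.0, n_steps=1000):
    """Velocity-Verlet integration of Newton's laws over many tiny time
    steps (dt would be on the order of a femtosecond in real units)."""
    forces = compute_forces(positions)             # (N, 3) array of forces
    inv_m = 1.0 / masses[:, None]                  # broadcast mass over x, y, z
    for _ in range(n_steps):
        velocities += 0.5 * dt * forces * inv_m    # half-kick
        positions += dt * velocities               # drift
        forces = compute_forces(positions)         # the expensive, naively O(N^2) part
        velocities += 0.5 * dt * forces * inv_m    # second half-kick
    return positions, velocities

# Toy usage: two particles coupled by a spring (not a real force field).
pos = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]])
vel = np.zeros_like(pos)
m = np.array([1.0, 1.0])
spring = lambda p: -(p - p[::-1])                  # pulls the pair together
print(run_md(pos, vel, m, spring, dt=0.01, n_steps=10)[0])
```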
The major problem is with the non-bonded interactions, which are interactions between pairs of atoms that are more than, I guess, three or so bonds apart, but which can be near each other in space and exert strong effects on each other. So, the simplest one, which I'll bet most of you will remember, is Coulomb's law. What I'm showing here is actually not forces, it's the potential, and so it's the gradient of this that determines the forces. But the potential energy between them is proportional to the product of the two charges; if it's two positives they want to repel, if it's a positive and a negative they want to come together; multiply those together and divide by the distance between them. And this isn't a conceptually complicated part of the formula, but this is what actually takes most of the time in an MD simulation. It's just the electrostatic interaction, Coulomb's law. And the problem is that the one-over-r falls off pretty slowly. It gets smaller, but you can't ignore the attraction between a positively and a negatively charged atom within the length of, say, a protein, or even part of the way out into the water. So, this takes a lot. And again, you're doing this quadratic number of, you know, it's quadratic in the number of atoms, and you have a lot of these time steps. And then, to a lesser extent, there's this term, what's called the van der Waals interaction. This part is basic: the one-over-r-to-the-sixth term basically just means that, independent of charge, atoms sort of like to be close together, and this part here says but not too close; that's the one-over-r-to-the-twelfth part. That is not the correct functional form; it's a rule of thumb that people have sort of fitted to. In fact, this one right here is a quantum mechanical weirdness, which is based on the idea that electrons are fermions, and so their positions in this 3N-dimensional space are such that, basically, you have an antisymmetry property where if you interchange two electrons the wave function has to be negated. And so if they're right on top of each other it has to go to zero, and there's sort of a smoothness thing that's imposed. It's weird quantum mechanical stuff. But the way to think about it is you really can't let any two electrons get too close, which means really you can't let two atoms get too close to each other. And it doesn't look anything like this, but that's true of any of these things: they're gross approximations that happen to be fairly close, close enough to get surprisingly good results in at least many cases. But not in all cases, and my very last slide is going to be about the fact that we don't know just how bad it really is. Anyway, just to refresh our memory: we divide time into tiny steps, calculate all these pairwise forces, for each one of those pairs evaluate something that looks like this, and that gives us forces, so we move each atom just a tiny bit. And then we repeat the process over and over again. And the problem is that "over and over again" is a lot of times. So the target that I was interested in when I first organized the group was to get to time scales like a millisecond or so. At the time we started, that was about three orders of magnitude longer than anybody had been able to simulate. Now it's about two orders of magnitude longer than, you know, what had been done previously.
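The force field slide itself is not reproduced in this transcript, but the standard functional form Shaw is walking through, with the bond, angle, torsion, Coulomb, and van der Waals terms he names, looks roughly like this (exact constants and conventions vary between force fields):

$$
E = \sum_{\text{bonds}} k_b (r - r_0)^2
  + \sum_{\text{angles}} k_\theta (\theta - \theta_0)^2
  + \sum_{\text{torsions}} \tfrac{V_n}{2}\left[1 + \cos(n\phi - \gamma)\right]
  + \sum_{i<j} \left[ \frac{q_i q_j}{4\pi\varepsilon_0 r_{ij}}
  + 4\varepsilon_{ij}\left( \frac{\sigma_{ij}^{12}}{r_{ij}^{12}} - \frac{\sigma_{ij}^{6}}{r_{ij}^{6}} \right) \right]
$$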
And the reason for that is that's a time frame at which a lot of interesting things start to happen biologically, and they just hadn't been accessible, in some cases even to experimental work, and never before to simulation. So why do this at all? Well, let's just, you know, posit some things that are quite extreme and not true. What if it turned out that the force field that I just showed you was perfectly accurate, you know, as if it just solved the Schrödinger equation? And then what if MD simulations were infinitely fast, so you could simulate for whatever period of time you wanted? Why would that be interesting? Well, you know, one thing is, if you wanted to know the structure of proteins whose structures aren't known, you would just sort of throw them in there, just like you did in an experiment, watch them fold into the structures that they have, watch them interact with each other, watch how they move, and you could stay out of a laboratory with all of this mythical equipment and simply do data mining on all this perfect data that showed how all of your molecules are going to move. Now, the problem with that is that this isn't true. MD is definitely not infinitely fast, and it's also not the case that these force fields are perfectly accurate. But we've now seen that, in fact, when you make the length of time you can simulate significantly longer than it was before, we can actually model some sorts of things that had been inaccessible before, and those primitive force fields actually are good enough to tell you some interesting things. So, what happens in that time frame? Well, one thing is that big structural changes often occur at times far longer than what was possible before this machine that we built. The record-length simulation before was ten microseconds, which was done by Klaus Schulten on one of the world's most powerful supercomputers. And there's a huge amount that was learned by simulations that length and shorter. But a lot of the kinds of things that we were interested in were between that level and two orders of magnitude longer. And over that time frame we know, through a lot of experimental data, that major structural changes in a protein often happen: you know, it'll go from this shape to something entirely different. And you just don't see that if you're simulating a very small amount of time. Also, there are a lot of interactions between proteins and other proteins, between proteins and nucleic acids, DNA and RNA, and, importantly to us, interactions between proteins and drug molecules. These things, in the changes in shape and in dynamics that they induce, very often occur over time frames that are significantly longer than the previous ten-microsecond (unintelligible) time scale. And the final one is probably the most famous of the problems throughout this whole area, which is called the protein folding problem. And protein folding actually means different things to different people. The part that I think most people hear about, and associate with it most, is just predicting the three-dimensional structure of a protein from its amino acid sequence. But the part that scientifically is probably of more interest to people in the field is how the process of folding works. Now, folding is the process by which this long chain of amino acids folds up into a three-dimensional structure, and understanding that process: What drives it? Does it, you know, always go the same way?
Does this happen first and then this and this all the time, or does it sometimes go here? Does it stop and get stuck and then back up? Very little was known about that. And so that was one of the things that we wanted to simulate. Folding also sometimes does go wrong in various situations, and misfolding is implicated in various diseases. But it's probably best known both as a scientific problem and because it's a very useful hard task that can help us with a lot of the other changes that are important scientifically, but also in terms of potential therapeutics. So, what to do about this two-order-of-magnitude shortfall? Well, our hammer was something called Anton, Anton One. And this is one of the machines. This is a five-hundred-twelve-node machine. These are based on custom ASICs that we designed, and so this is five hundred twelve of those ASICs and the surrounding stuff around them. And what Anton is, is a special-purpose supercomputer, if you will. It is just for molecular dynamics simulation; it's not a general-purpose supercomputer. You know, if we knew how to design a machine that could do any kind of scientific simulation orders of magnitude faster, it would be a great service to the world and we would certainly try to tell everybody about it. But I have no idea how to do that. We're able to have the luxury of wanting to do just one thing, but much, much faster. And Anton is in fact about a hundred times faster, as I mentioned, than was previously possible with a general-purpose machine. But it's very brittle. You know, in theory you could program each of the nodes, and they've got some general-purpose processors on them, but that wouldn't be a smart thing to do; it wouldn't be very efficient, and it would be really difficult to program, because that's not what it's designed for. Anton was designed for just this one application, together with a very different kind of algorithmic approach on several levels. And I'll say something about that a little later, but I want to first give you a shot of just a few movies showing you what an MD simulation really is and what you might use it for. So, Anton. It's been in operation for, gee, I don't know, two and a half years or something, I think; I might be off a little bit. We built thirteen machines, of which, let's see, all of them are the size you just saw, except one is twice that size and one of them is four times that size. And each one of them is capable of millisecond-scale simulations, which we run on a regular basis. Most of the runs we do are somewhat shorter, but still long by conventional standards. But sometimes, when we need to, we'll take a look at how a molecule moves for a very long period of time. So, for those of you who are involved in, you know, this area, an obvious question is, well, a millisecond of what? A millisecond simulating what system? And there are a couple of standard benchmarks, but one of them is (unintelligible), which is a system that has about twenty-five thousand atoms, including water molecules. Sometimes you can use an approximation of the water environment without representing every water molecule. We do represent the water, because we happen to be interested in some interactions of water molecules with the surface of a protein. But you can learn a lot of interesting things without that as well. So, those are the ground rules.
And then the other part, which is very important from a Computer Science viewpoint, is that what we're interested in is running very long single simulations. That's a much harder problem, unfortunately, than running a very large number of short simulations. And that's not to say that that isn't an interesting thing to do. For example, Vijay Pande at Stanford has done some very interesting work where, and this is a great model, what he has done is gotten people interested in the problem of protein folding, and they've not only dedicated their computers and now their video game machines to running with their spare cycles over the internet, but there's a reasonable number of people at this point who are actually buying new Xboxes, or whatever they are now, my kids could tell me, connecting them to the internet, and they buy them just so that they can contribute to this project. And as a result he's managed to mobilize a huge amount of computer power for something that's extremely worthwhile. But what he's got is a bunch of short simulations, you know, two hundred thousand of them, I think, on average are connected. The question is how you can combine all of that and get something meaningful. And each time I've said, well, you know, it's very interesting, you can get this, but of course you couldn't (unintelligible), Vijay has found a way to do that. And so I've stopped; I'm now out of the business of saying things that Vijay can't do with his distributed-processing approach. But I think he would be the first to say that that's a different kind of set of things that you can do. There are some purposes for which you really have to be able to see one long trajectory, rather than combine information from a lot of short ones, to answer many important biological questions. Now why is it hard? Oh yeah, I should mention the other way that you could make the problem easier: if you were simulating very large molecules and you wanted to get the same speedup on a per-atom basis, that's a much easier problem also. And in both cases the reason is the thing that becomes a bottleneck pretty quickly in most parallel computation, with the exception of what are called embarrassingly parallel problems. The thing that hits you as a bottleneck is interprocessor communication. How do you communicate between all these different processors so that you can break the problem up into a lot of pieces and cooperate to solve one big problem together? And very often what happens is you have more computation, more computational power, available on the chips, especially as things move forward. The way Moore's law is turning out right now is that we get more and more silicon real estate, we can have more stuff going on simultaneously on a chip, but the speed with which we can communicate between chips is not growing anywhere near as fast. So this is what, you know, in various forms, keeps rearing its head: when you try to accelerate a process by having, you know, a million cooks in the kitchen and trying to get your eggs cooked in four seconds, it just doesn't work at a certain point. So, interprocessor communication, not surprisingly, is a lot of what we focused on. And ultimately, if I don't run out of time, and I think I'm okay, I'll say something about some of the things that we've done to minimize that, and the way you have to alter the computation to allow you to do that. So, oh, let me mention one other thing that I always forget to mention.
One of the things that we want to be able to do is observe long molecular dynamics trajectories, see proteins move for long periods of time. But there's another thing that I didn't really anticipate the impact of until, you know, we saw how group members were actually using it, and that is rapid turnaround. It turns out that sometimes you just say: I want to run this simulation, I have to run it for, you know, five hundred microseconds of simulated time. But a lot of the time what you do is you kind of play. And you say, let's simulate this, just to see qualitatively what's going on here. And then you see some interesting motion and you're saying, I wonder if that's real. Is that repeated? Okay, run one long simulation. And typically it's even worse than that: you try one thing, this fails, that fails, you look at that, you play it back, you run this, you try other runs. Even if you have, like, one desktop computer and it's very cost-effective, if that casual experiment takes three months every time you do it, then in real-world time you're never going to get to the important experiments. So part of what turned out to be important was simply being able to run what was previously a very long simulation, you know, over a matter of an hour or so, get the feedback, and decide what you want to do next based on that. So, this is just a quick summary of the absolute lengths. This is Schulten's result on what's called the WW domain, and when Anton One was up and running, one of the first things that we did was try this again and do a simulation of the WW domain. And actually what Klaus was looking at is the protein folding problem: to see if he could get the amino acid sequence for this particular protein domain, a chunk of a protein, to fold up into its native structure, the one it has in the body. And it didn't fold. And he was able to show that the reason was not only that he hadn't simulated long enough; he had good evidence that there were force field defects involved also. So, for various reasons we wanted to test that, and so we ran a simulation that was over a millisecond long and were able to get the WW domain to fold and unfold repeatedly. We actually had to improve the force fields, using some quantum mechanical data that we'd analyzed over a period of a couple of years in our lab. But when we did, we were able to see reversible folding and unfolding. I'll just explain quickly, because there will be a couple of references to this later. A protein left to its own devices at biological temperature will fold up into its three-dimensional shape. But if you heat it up, at a certain point it'll sort of fall apart. It will denature. And there is something called the melting temperature, which is the temperature, roughly speaking, where it'll be folded about half of the time. So, what we did is simulate at that melting temperature, which is a temperature at which we would expect to see it fold and unfold, and fold and unfold, enough times that we could get statistically significant results about the pathway of folding. What happens first? How does it work? What moves at a microscopic level? Those kinds of questions. And with these changes in the force field we were able to observe that. And the current longest simulation, I think, at this point is still about 2.2 milliseconds, although we have one running now that may be longer at this point; I'm not sure.
So, oh yeah, just one other quick thing for those of you who are involved in, actually it could be computer scientists who are interested in biology or biologists who do computational work. One of the Anton machines we built we made available free of cost to non-profit institutions for research use. And we gave that to the National Resource for Biomedical Supercomputing, which is at the University of Pittsburgh, and there's (unintelligible), which is of course near there. So far, and this was just changed, there have been seventy research groups that have made use of it. And the National Academy of Sciences allocates the time among the people who want to use it. And one of the things that we're interested in is to get people involved in doing simulation who haven't been doing that before. Among other things, they can think of a lot of stuff that just wouldn't have occurred to us, and many of them have expertise with molecular systems of a sort that we don't. And, you know, from our viewpoint, because we'd like to design drugs at some point, anything that helps the community do molecular dynamics simulations is great for us. And hopefully it'll turn into a tool where we'll all understand, better than we would have by ourselves, what it can do. So, if anybody's interested in doing this kind of simulation, the second year's competition just happened, and they announced the winners a few months ago, I guess. But we hope to be able to continue this program. Okay. Let me show you what MD simulations actually look like, and then I'll talk a little bit about what makes Anton fast, what the strategies were. So first, I'm just going to show you a simulation of a molecule. It happens to be something called GPW, which was not particularly of interest to us biologically. In fact, this simulation I think was run primarily to test out the machine. It allowed statistics about stability and other things, to ensure that we weren't seeing accumulating errors. So, it's one of the first that we ran, and what I'm going to show you is, oh, first of all it's an idealized model. Each one of these little balls is an atom; they're color-coded for the atom type. What you're going to see is a sort of smoothed version, because this is such a long simulation that if we were actually showing you all the motion of the individual atoms you'd just see sort of a blur, because there's a lot of high-frequency vibration and so forth. But I'm going to try to give you a sense for how long a period of time we're simulating compared to what was going on before. So, this is what an MD simulation looks like. And you'll see little things happening, little rotations. But then every once in a while it makes a big move. And those kinds of moves were never seen in the previous simulations, because they just weren't long enough. Sometimes it'll just extrude out a lot. And then sometimes it'll pop back into its original folded native state. And it's funny, because, you know, we sort of knew what to expect by this point. But the first time I saw one, it was not actually this animation but a related one, it was kind of like: oh, that's what proteins are doing. Because no matter how much you read about it, just watching a movie you get a slightly different feeling for what it's all about. Now, let me just, by way of comparison, show you what Schulten's, you know, previous longest simulation, the ten-microsecond simulation, would look like on this time scale.
I'm doing this by hand, but roughly this. It would be about that. So, there was a lot of information. Again, if we weren't sort of low-pass filtering it you would see a lot of vibration, but you would completely miss the big structural changes that we're particularly interested in. Let me just show you one… I talked about the WW domain folding. This is one of the folding events. You know, it goes back and forth; here's just one of them so you can see it. In terms of the representation: before, I was showing you the individual balls, and here this is what's called a ribbon representation, where you're just seeing the polypeptide chain, the string of beads, moving around. Where it coils like that, that's a helical structure, and then sometimes it's random. And what you're seeing here in gray is the native three-dimensional structure. That's what it actually looks like in your body. And what we do is start it here in just a random extended configuration, and you'll see it fold. And this one, I think I should really smooth this more, because it's a little jerky. But you can see it try to find the right position, again based on the laws of physics that are expressed in that force field. And sometimes it gets close and pops out; it won't be quite right. When it gets to something that's very close to the folded state, it's mutually stabilizing; there are a lot of interactions that are sort of mutually reinforcing. And you'll still see some wiggling around of the tails, and that's something that apparently really does exist; it correlates with a lot of experimental data. But basically it's going to settle into its three… excuse me, three-dimensional structure. And I think I eliminated the slides, yeah, that would show that. But the kinds of things that we were able to discover were the order of formation of some of these helical structures and whether it followed a single pathway, and the bottom line is that yes, it appears to for the most part. And other sorts of technical things that have been debated for many years. We don't have a comprehensive answer here, and this is a good time to switch to the next slide. The one before was from a year before; this is another Science paper that was just last year, where we were able to do the same thing for a systematically chosen set of twelve proteins. People had categorized a lot of the fastest-folding proteins, and we took one representative from each of a set of different structural categories, and we actually added two, because we wanted to make sure our force field didn't prefer some things over others. So, we added some mixtures of different kinds of structures. And what you're seeing here is… I could probably figure out which is real and which isn't if I spent time; fortunately it's not easy to do that. But one of these is the native structure. That's embarrassing, I can't remember which. And the other one, probably the red, is the structure that it folded into when we did a folding simulation. So, this is for a very diverse set of proteins, though not covering all proteins. And it's a systematic but admittedly biased set, which is simple, small proteins that fold very fast, and maybe there's something systematically different about the way that class of proteins folds. But that caveat notwithstanding, we saw all of them fold correctly. One of them is known to be not thermostable, and, you know, with certain mutations it's more stable.
The one that wasn't, in fact, we did find had problems finding the right structure, but the variant that's known experimentally to fold reliably does here too. So, this told us some important things, most importantly about the quality of the force fields, the quality of the physical models, and not just the ones we use, because we've only done minor tinkering on physical models that have evolved out of the community over a period of decades now. Those force fields that the whole field tends to use actually are sufficient to reliably fold proteins with different kinds of structures. These are the helical structures, and this is actually a mixed one, and I guess a simple one with beta sheets is this one, and we successfully get them to fold. Now, the big question is how this is going to apply to more complex proteins. So far, what I've said is we can't infer from this that it's going to work for complex proteins also. Personally, I'm a little more pessimistic than that. Although we haven't published all this yet, we're gathering some data now in our most recent study that suggests, although it doesn't show conclusively, that for more complex proteins, probably even if we could simulate long enough to match their experimentally observed folding times, the machine wouldn't get the correct answer, and that would be a force-field-related problem. And, you know, it takes a little time, but afterward, if anyone wants to talk about it, I can tell you why I very strongly suspect that'll be the case. And what that means is that more work is going to have to be done on force fields, within our own group or other groups. And it also means future versions of Anton have to be able to handle some of the kinds of changes that we think are likely to be necessary. Probably Anton One would do fine; we would just have to figure out what has to be changed about these physical models. But there's no guarantee. And it could be that we all find, hey, you know, these parts of that force field that I showed you before are just fine, but here's this part that we knew was an approximation, and sure enough, nature really doesn't like it; this is not modeling it. Nature has some nightmare cases, some funny quantum mechanical thing that it uses in making natural proteins. And if you don't get it much more right than it is now, you'll never model the free energy landscape of the protein, you know, how it finds the energetically preferred structure in the end, and hence the changes that it's going to make. We don't know. I don't even have a hunch about how far we can push it. But we know that there are limitations now, and maybe there are going to continue to be limitations, maybe there'll always be limitations, ones that are biologically significant, where we'd get a distorted idea, or none at all, about certain biologically relevant phenomena. So, that's where we are at this point. And let me just show you a couple more examples and then talk a little more about the computer-sciencey stuff. I'm not sure exactly why I threw it in; this is just kind of a cool picture. But this is a potassium channel. It's a protein that selectively passes potassium ions through a membrane from the outside, through a membrane that surrounds the cell, or it can happen in other kinds of internal membranes as well. And so what we were able to do for the first time is actually see the process of conduction of ions through there, and see how it works and where the things sit.
And I'm actually showing you just the pore, what's called the pore domain, the part that's the tube. But there's a little thing here that detects the voltage in the surrounding environment and causes this to open or close. We're going to have a paper coming out very shortly that shows for the first time what the closed state looks like for this ion channel. We think it's a pretty general observation: what the sequence is, how the thing closes, and so forth. But this one is just the ions going through. And we also, you know, look at things like the rate at which these things pass through. You'll see there are some little water molecules in between the ions, and we wanted to know what the preferred pattern is. Usually they're (unintelligible) but not always. And a variety of other questions. These little (unintelligible) groups move around and sort of hold it in various positions. And other stuff. But it's just fun to watch that kind of movie. And then here are just a couple of quick things more on the applied side. One of the classes of compounds that we studied are proteins called kinases, which are responsible for phosphorylating other molecules, which can then act as a signal to drive all sorts of other things, you know, changes in other proteins and things going together and coming apart. And in particular, it turns out that this class of proteins is very central in many forms of malignancy. So, this particular one, Abl kinase, is known to be involved in chronic myelogenous leukemia. And in fact there's a classic mutation called the Philadelphia chromosome mutation. What you have in this kinase is a part that switches into an active state, which causes the uncontrolled proliferation of cells, and, you know, patients die. And there's a control region that takes signals in and keeps it mostly turned off, but turns it on when it wants that replication to occur. In this case it gets lopped off and fused in a way that the regulatory part is missing, and it's just always turned on. So, there's a drug, we didn't come up with this, but something called (unintelligible), that's (unintelligible), that is, like, you know, one of the exciting success stories in the field of cancer, of which there are precious few, unfortunately, where patients who used to die of this disease, many of them are living for years now. And in fact we still don't know how long they'll survive. But this drug does something. And it was known that this switch involved, you know, when the protein, when the kinase, turned on or off, there was a characteristic three-amino-acid group here called the DFG group that moved from the "in" to the "out" position. I'll just show you quickly. You're seeing a flip occur, although it's a little hard without seeing this in binocular vision to see what's going on. But that flip causes it to move between a cancer-causing and a non-cancer-causing state. And so what we were interested in doing is figuring out how that works. What is it doing? It turns out it blocks this transition, holds it in an inactive state, in a very particular way. And there are some other residues whose protonation state controls that flip.
And based on building this structural model and this mechanistic model for this malignancy, we were able to make a bunch of predictions for experiments to do. We couldn't do them ourselves, but we collaborated with John Kuriyan, who's an experimental expert on kinases, and we predicted that, if our model was right, if you changed the surrounding pH then the binding of (unintelligible) would be enhanced, and if it's the other way it would be inhibited. We also designed some mutations and said: if you make this mutation, we predict that it'll change in the following way. John then ran those experiments in his lab, and it matched very well. This is kind of a model for our collaborations. They're almost always with experimentalists. I wish we could do some of those things in house, but we don't have any experimental capabilities of our own. And the nice thing is it's not just that we come up with a hypothesis and ask somebody to validate it; the experimentalists we collaborate with almost invariably know more than we do about the particular biological system, which they've studied for, you know, often their whole academic careers, whereas we, you know, do simulations on various types of molecules. So, a lot of the time they'll look at something where we'll say, what do you think about this? We observed this. And they'll say, that doesn't mean anything. But wait a second, can you rerun that? John did this once and said: replay that. Replay it again. Can you just slow this down? Huh. That looks like it's about three and a half (unintelligible). Is there a salt bridge forming there? And he knew right away what to look for. To me it's a lot of balls and, you know, things, and graphs of how far apart things were. He said, what would happen if you do this? So, we tried experiments, and it was a wonderful interactive process. And that's the kind of collaboration we really enjoy. There are a few labs where they do both, including some people here, and that's a really powerful combination. Anyway, so that's kinases. The last thing I wanted to show you in terms of some real simulations is also leukemia-oriented. It's actually the same type of leukemia, no, actually a slightly different target. And there's a drug called (unintelligible) that's known for binding to this one, which is called Src kinase. It's an approved drug. And the thing Chan did, had he asked me first, I would have probably said, well, you know, if you want to try it, of course that's okay, but it's, you know, elaborate work; I mean, it's probably a waste of (unintelligible) time. I might have just said, look, why don't you just do this instead. But he just did it, because it was a simple experiment. And what he did was he took this (unintelligible) and put it in a simulated box of water. You can just see it peeking around here. It's actually not attached anywhere. And he just released it, with no, you know, fake non-physical forces. Nothing. He just let it swim around to see whether it could find the correct place where that drug actually binds on the Src kinase. And the interesting thing is the place that it bound. Let me show it to you first and then I'll say this. But it's going to move really quickly. It's going to be searching around, trying to find a home somewhere on this kinase. So, yeah, it's a different kind of representation of a protein, showing the surface. And it spends a fair amount of time initially there, but it can't quite figure it out.
And eventually it's locking into what turns out to be an extremely good fit. And the interesting part, well, there are several interesting things. The fact that it can find the correct binding spot and bind as well, compared to an experimentally known bound structure, is interesting. But the thing that's really cool here is that the binding pocket, the thing it's fitting into on the protein, doesn't actually have the correct form until this drug induces that structural change. It's not a matter of the drug sort of moving by and finding the exact lock and key; it's also helping to create the correct fit. And they've co-evolved. This drug didn't exist out there, but this is an area where signaling is going on, and some endogenous molecule, some molecule in the body itself, has presumably found a way to bind. But it causes that to happen, and that kind of induced fit is very interesting to me in particular. So, let me show you one more of those, again because it's fun. I think right now this other drug, which is involved in breast cancer, another approved drug, is hidden behind here. But you'll see the same thing. There we go. And this is a different representation than you've seen in almost all of them: all these different sort of little tubes to show it. You don't see the space-filling part here, but this allows us to see what chemical groups are making contact. This is sort of a greasy area right here, just like, you know, if you pour oil on top of a pot of water when you're making spaghetti, and the oil droplets kind of come together. That's a hydrophobic effect, and that works with molecules too. Whereas some of the things over here are forming these special kinds of bonds called hydrogen bonds. Polar molecules line up so that they interact in a way that's actually fairly tricky. So, with this kind of representation we can sort of see what's going on. And we're doing more studies of this sort now. I don't think it's the straightforward thing of, well, let's release a lot of candidate drug molecules into a box and see if they stick to a molecule of interest. That's not really practical; it takes long enough that you wouldn't do this with a million compounds. But in terms of studying the process of drug binding, the recognition of binding sites, it's quite interesting from that viewpoint. Maybe it'll lead to something good. And I'm just very grateful that (unintelligible) doesn't do what he's supposed to be doing. Anyway, let me say a little bit at least… Eww. I should have done this earlier. I'm supposed to stop talking in five minutes, I guess, and then we have questions? Okay. Okay. That's a long time. So, the longer I talk, the less likely I am to be horribly grilled, so I can fill up a lot of space by filibustering. Now let me try to get through it and take some questions. So, what makes Anton fast for this particular application? The first thing, although it's kind of a buzzword, is co-design. This really is an example of it: when I came up with the basic idea of Anton, it was literally the same day that I finally worked out the core of the algorithm and the core of the architecture. They were mapped to each other. And at this point, now that we're designing Anton Three, actually Anton Two should be here in a year or something like that, I guess a little more, at every stage when we're thinking about the architecture, we're thinking about the algorithms also.
And Anton was designed to use a very different algorithmic approach, especially on the communication side, that was well suited to mapping things onto a silicon surface. If we had taken standard hardware and tried to come up with new algorithms, I mean, maybe we would have come up with some; there are a few things that we've done that are applicable to conventional clusters also. But we would never have gotten to this kind of speed. And conversely, if we had designed hardware to run current algorithms, I forget which direction I said first, but it's both directions: we couldn't take standard hardware and design algorithms, or take standard algorithms and design hardware, and get anything like this kind of speedup. So, a lot of it is mapping one to the other, with a sense for how silicon can be used in a very different way if you've got a specialized problem. And I realize the way I just phrased it made it sound like this is some general principle, and I want to make sure nobody leaves with the impression that, you know, we've discovered some magic way to design for any special-purpose application. That's not the case. And sometimes I'm asked, well, what can you learn about designing special-purpose machines for various problems? And I don't honestly know what you can learn from the Anton experience. In fact, one thing that's easy to say is there are many problems that are very important in scientific computation for which this kind of approach would definitely not work. And some examples are things where you can rigorously prove, in fact, that there's a bound on how much speedup you can get from parallelism, typically enforced by a communication problem. And so in those cases it doesn't make sense to pursue the special-purpose approach and try to get a major speedup: you'll speed up one part all the way, and then the thing that'll hold you back will be something you can't do anything about. There are other cases where it's not so obvious this wouldn't work, but I honestly don't know what the range of applicability is. And even in areas that seem like they're the same thing, you know, I've visited several of the national labs and several other groups where people are actually doing MD simulation, and we'll talk afterward and I'll say, oh gee, you're doing MD simulation, maybe something like this can work. And what's almost invariably happened is that when we went into the details it actually wouldn't. You know, you'll find out, well, their force field is a little different and takes a lot longer, and they're really not interested in particles moving around because they have crystal structures, or they're studying metals, or they want this interface. Or they study enormously large systems to see the kinds of things they're interested in, like cracks propagating through big bulk materials, and they're willing to simulate for very short periods of time. Or some other things involving parameters, things that look like details, that when you look at them mean that you really can't apply this kind of technology there. So, the bottom line is I'm not sure how much really applies. But the algorithm and the architecture are focused very heavily on executing particle-particle or particle-grid interactions very quickly. That can either be two particles interacting together along the lines of the force field that I described, or particles interacting with a grid, a regular sort of lattice.
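The classic bound of the kind Shaw alludes to here, though he doesn't name it in the talk, is Amdahl's law: if a fraction $p$ of the work parallelizes perfectly and the remainder is serial or communication-bound, the speedup on $N$ processors is capped at

$$ S(N) = \frac{1}{(1 - p) + p/N} \;\le\; \frac{1}{1 - p}. $$

For example, with $p = 0.99$, no number of processors can ever deliver more than a 100-fold speedup.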
I won't describe exactly why, but there are a couple of applications for that grid interaction, for handling a particle's interactions with distant atoms. I said earlier, lying a little bit, that we interact all pairs of atoms. If something is far enough away, you can actually cheat a little and represent the very distant contributions with some sort of continuum approximation: with Fourier transforms, or with some other kind of series expansion, such as a fast multipole method. You could use an iterative solution to the (unintelligible) equation for some things. There are a lot of different approaches you could use, and in any of them, mapping particles onto a grid and interacting on that grid is an important thing to be able to do. So Anton is very fast at that.

I'm going to skip over this, except to say — actually, I'll get to it. Polarizability is something I only alluded to briefly, but it's a known defect in current force fields. Most of the ones out there make an assumption that is manifestly not true, which is that every atom bears a fixed, typically partial, charge. The model is that the charge depends on the immediate covalent neighbors, the immediate bonded neighbors, but not on the electrostatic environment. And that's just wrong. If you have an atom here and a big positive charge over there, it's going to smoosh the electron cloud — drag it toward itself, in this case — and redistribute the charge. You might not predict that you could get any decent, meaningful results without modeling that, but in fact most people who do these simulations, including us, are using non-polarizable models. So Anton is designed to accommodate variations on most existing force fields, one particular model of polarizability, and several other things. But it's still very special-purpose — really a laboratory instrument more than what you would ordinarily call a supercomputer.

So, okay, where does the underlying speedup come from? It's really a couple of different things. One is that for the specialized calculations Anton has to do for MD simulation, you have massive, specialized parallelization: a bunch of really simple units that can only do one thing. Basically an N-body engine — something that can interact different particles really, really fast, but that's all it can do. That's just part of the machine. Then there are places where you actually do need some flexibility and programmability, but you only want to handle things that way when you absolutely have to. And fortunately — this isn't due to any ingenuity on our part, although many members of the group are extremely ingenious and come up with all sorts of things I would never have thought of — we're lucky. We're very lucky that MD simulations have the property that the parts where you spend most of your time are very regular and map neatly onto silicon, while the parts that are less regular, the things that would use special-purpose hardware only a low fraction of the total time, don't happen very much — like the bonded interactions. There is math to do there, but a given atom isn't bonded closely to very many other atoms. The part that takes a long time is the interaction with all the other atoms, which is very regular.
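To make the polarizability point concrete, here is a toy Python sketch of the idea the talk contrasts with fixed partial charges: each site carries an induced dipole mu_i = alpha * E_i, where the field E_i includes the other induced dipoles, so the dipoles are iterated to self-consistency. This is an invented, purely illustrative model, not the particular polarizable model Anton supports; the fixed-point iteration converges only for sufficiently small alpha.

    import numpy as np

    def induced_dipoles(pos, q, alpha, iters=50):
        """Self-consistent induced dipoles for point charges at `pos`."""
        n = len(pos)
        mu = np.zeros((n, 3))
        for _ in range(iters):
            for i in range(n):
                E = np.zeros(3)
                for j in range(n):
                    if i == j:
                        continue
                    r = pos[i] - pos[j]          # vector from j to i
                    d = np.linalg.norm(r)
                    E += q[j] * r / d**3         # field of charge j
                    E += (3 * np.dot(mu[j], r) * r / d**5
                          - mu[j] / d**3)        # field of dipole j
                # the electron cloud "smooshes" in response to the field
                mu[i] = alpha * E
        return mu

A fixed-charge force field simply skips this loop, which is exactly the approximation the talk is calling out.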
The little idiosyncrasies of all the local interactions you can afford to handle more flexibly, because that isn't the inner loop; it's not the thing that takes most of the time. So that's one thing. The inner-loop hardware is very dumb — I actually have a picture of it — and the only thing that's variable is some tables and a few parameters. The engine that does most of the work is not programmable at all. There are other parts that are programmable, where data flows in complicated ways, and they're quite difficult to program, but they're not the source of most of the power in the inner-loop part.

The other part of the answer is communication. The first thing to say is that within the chip, the data flows in a highly regular pattern. You don't have things constantly going off to be stored in memory, getting pulled back out, and then going to random locations. You have a whole wave of data moving through in a very regular pattern, so things tend to go only where they're needed, at just the right time. That makes it much easier to save on control circuitry and to avoid memory bottlenecks and so forth. Then there's external memory: with a conventional microprocessor, a lot of time is actually spent bringing data in from memory. Modern microprocessors go to great lengths to avoid doing that very often — keeping recently used data on the chip in local, faster storage — and even so, access to memory is a big bottleneck. To a first approximation, Anton doesn't have to go to external memory at all. That's not quite true, but for the inner-loop part, all the data resides on the chip; it all moves around there, and you don't have to go out. For very big systems on a small machine you sometimes would, but it's designed to run essentially out of memory internal to the chip.

Then at the inter-chip level, communication is also handled very carefully, and that covers two different aspects. One is latency: if I'm sending data from one chip to another, even a tiny amount — suppose I want to send one bit of information from this chip to that chip — how long does that take? What's the fixed overhead? That's the latency. The other is bandwidth: if I then want to keep pumping data to these other locations, and in fact have all of them send to all the others, or some pattern like that, how much data do I have to push down all of those pipes, and how fast can I do it? Both of those are critical, and we've given a lot of attention to minimizing both. Now I'm going to go quickly, because I do want to hear some questions. (Unintelligible). This is how the machine is wired. It's basically a three-dimensional mesh, but with end-around connections — really a torus — so data that falls off the right side comes back in on the left. There are a couple of reasons for that. One, though it's not exactly a reason, is that this interconnection topology has been used even for general-purpose supercomputers.
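A toy Python model of the communication ideas just described: wraparound neighbors in a 3D torus via modular arithmetic, minimal hop counts (taking the shorter way around each ring), and a transfer-time estimate that separates the latency term from the bandwidth term. The numeric constants are placeholders for illustration, not Anton's specifications.

    def torus_neighbors(node, dims):
        """Six neighbors of a node in a 3D torus; edges wrap around."""
        x, y, z = node
        X, Y, Z = dims
        return [((x + 1) % X, y, z), ((x - 1) % X, y, z),
                (x, (y + 1) % Y, z), (x, (y - 1) % Y, z),
                (x, y, (z + 1) % Z), (x, y, (z - 1) % Z)]

    def torus_hops(a, b, dims):
        """Minimal hop count: go left or right, whichever is shorter."""
        return sum(min((ai - bi) % d, (bi - ai) % d)
                   for ai, bi, d in zip(a, b, dims))

    def transfer_time_ns(hops, nbytes,
                         hop_latency_ns=50.0,          # placeholder
                         bytes_per_ns=6.0):            # placeholder
        # Fixed overhead grows with distance (latency); the payload
        # term grows with message size (bandwidth). Both matter.
        return hops * hop_latency_ns + nbytes / bytes_per_ns

For a one-bit message the latency term dominates; for bulk data the bandwidth term does, which is why the talk treats them as separate design targets.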
But there's a more specific reason: with the method we use in Anton One, we're not conceptually simulating just one box full of water and protein. What we're actually simulating is an infinitely tiled array, where this box is repeated like a crystalline structure going off to infinity. In practice, all that means is that if the system being simulated is distributed among, say, 512 processors and an atom runs off the right side of the box, it magically reappears on the left side of the whole box. These are what are called periodic boundary conditions. They have some nice formal properties that permit certain kinds of solutions to the distant-interaction problem — I won't go into it much more — and this is a convenient architecture for that reason: it preserves the locality of space. I should have said this directly: you divide space into a bunch of little cubes, and the interconnection pattern of the machine mirrors the way those cubes are physically adjacent to each other, so the machine models the space. I'm sure there's a more elegant way to have said that.

Anyway, here's the dumb part that does most of the work. This is what we call the PPIP, the Pairwise Point Interaction Pipeline. There are thirty-two of these on every chip. I'm not going to go through exactly what they do, but the key thing to notice is that there are a lot of little blue boxes, and every one of those blue boxes is doing some arithmetic — something is computing. And it's pretty much a linear pipeline: you don't see a lot of loops going around, and you don't stall while waiting for things to happen. To give you a rough idea, the unit there are more of than anything else is basically calculating a very low-resolution distance between two atoms. All it's trying to do is filter out the pairs that are obviously too far away to interact, in this part of the subsystem. For the pairs that are left, it does a more accurate calculation, and then a bunch of other work: there are table lookups, and logic that does cubic (unintelligible), and things like that. The take-home message is just: lots of dumb things, and each of these little units is very, very small. A multiplier's area is proportional to the square of its bit width. This one, I believe, is now eight bits — I designed it as six, I think, but they wouldn't let me have something that wasn't a power of two, so it's a little bigger. But that's a tiny multiplier, all of these units are very tiny, and the data flow is very regular, and so forth. That's why you get a huge amount of arithmetic density. And I think I left out the flexible part — just very quickly: what I just showed you, thirty-two of those, is a twenty-eight-stage pipeline. That's where you get a lot of the speed. On Anton One that's about a gigahertz clock, and at any given point in time it's doing a lot of operations. I've talked about the particle-particle interactions and the polarizability, and I'm going to run past these slides quickly. What I'll say here — and I've alluded to this already — is that for a given atom, we actually do all of the pairwise, quadratic number of calculations inside a certain box, and then we use other kinds of approximation beyond it: in our case a technique developed in the group, building on ideas that many other people use.
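A toy Python sketch of the match-and-filter idea in the pipeline just described: a cheap, low-precision distance check culls pairs that are obviously out of range, and only the survivors get the full-precision interaction. The quantization width, the margin, and the force expression are all invented for illustration; real hardware does this with fixed-point units, not floating point.

    import numpy as np

    def minimum_image(r, box):
        """Wrap a displacement into the periodic box (nearest image)."""
        return r - box * np.round(r / box)

    def low_res_too_far(ri, rj, box, cutoff, bits=8):
        """Coarse cull using coordinates rounded to 2**bits cells per side."""
        scale = (2 ** bits) / box
        qi, qj = np.round(ri * scale), np.round(rj * scale)
        d2 = np.sum(minimum_image(qi - qj, 2 ** bits) ** 2)
        margin = 2.0                       # allow for quantization error
        return d2 > (cutoff * scale + margin) ** 2

    def interact(ri, rj, qi, qj, box, cutoff):
        if low_res_too_far(ri, rj, box, cutoff):
            return np.zeros(3)             # filtered out early, cheaply
        r = minimum_image(ri - rj, box)    # full-precision path
        d = np.linalg.norm(r)
        return qi * qj / d**3 * r if d < cutoff else np.zeros(3)

The margin makes the coarse test conservative: it may pass a pair that the exact test later rejects, but it never drops a pair that should interact.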
That technique uses (unintelligible) transforms as a way of decomposing the interactions into two parts. What you're left with, in the near part, is the N-body problem — the classic N-body problem of physics, in the form you often see in computational science, which is the range-limited N-body problem: all pairs within a certain distance. And there's an algorithm I came up with that minimizes the total bandwidth of the communication. Rather than go through it and its heritage, let me just show you graphically what it looks like. Here is one of those little cubes of space — the green box. The traditional method went something like this: here's a box, and an atom in this box has to interact with all atoms within a certain distance of it. The whole volume containing atoms that have to interact with some atom inside that box is shown in blue here; it's pretty close to a hemisphere. That's the traditional method: you compute the interaction of a pair of atoms inside a box where at least one of the two atoms lives. It makes intuitive sense that you'd want to do that — at least one of the atoms doesn't have to move anywhere. But it turns out that's actually suboptimal. The NT method tiles space such that any pair of atoms meets in some box that, in general, neither one of them lives in. They might by coincidence, but it turns out the most efficient way, both in absolute terms and asymptotically, is to have them meet on neutral territory — hence the name. You define one region, which I call the plate, and another, which I call the tower, and everything inside the plate has to interact with everything inside the tower. As the number of processors — or processor-like things; I won't explain exactly what I mean by that — goes up, the advantage grows. With sixty-four processors there really isn't one: this is the traditional approach, this is the NT approach. But if I go to 512 processors, then 4K, and finally up to 32K processors, the traditional approach scales with the volume, so with interaction distance r it goes roughly as r cubed, while over here you get scaling as r to the three-halves as the number of processors grows — because the plate becomes an infinitesimally thin plate and the tower becomes an infinitesimally thin rod going up there. Now, what really matters — as somebody once told me, we don't live in Asymptopia, and that's absolutely right — is that you have to look at the constant factors. The constant factors are very good for this application in Anton One, and we'll be using something much like it in Anton Two. But I've been spending a lot of my time, including much of last night, thinking about Anton Three, which is another five, maybe even six years out into the future, and I'm busy trying to figure out ways to rip all the earlier assumptions to shreds. I think this general approach probably will not survive, because by then we'll be worried about a whole different set of constraints. The speed of light through the wires between our chips is going to be one of the limiting factors in how fast we can accelerate MD simulations. Latency will become more important than bandwidth. We'll have tons of area on the chip, but clock speeds won't get much faster. The world is changing gradually, but in distinct ways.
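A toy Python sketch of the neutral-territory assignment just described: a pair of particles meets in a box that, in general, neither lives in. One common convention takes x and y from one particle's home box and z from the other's; each box then imports a "tower" (its column in z) and a "plate" (its slab in x, y), and a pair is computed in the box whose tower holds one particle and whose plate holds the other. This is an editorial illustration; details such as the tie-breaking convention that ensures each pair is computed exactly once are omitted.

    def home_box(p, box_edge):
        """Which cube of space a particle lives in."""
        return tuple(int(c // box_edge) for c in p)

    def meeting_box(box_i, box_j):
        """NT-style meeting point: x, y from i's box, z from j's box."""
        return (box_i[0], box_i[1], box_j[2])

    def tower(box, reach):
        """Import region 1: boxes sharing x, y with `box`, nearby in z."""
        x, y, z = box
        return [(x, y, z + dz) for dz in range(-reach, reach + 1)]

    def plate(box, reach):
        """Import region 2: boxes sharing z with `box`, nearby in x, y."""
        x, y, z = box
        return [(x + dx, y + dy, z)
                for dx in range(-reach, reach + 1)
                for dy in range(-reach, reach + 1)]

With these definitions, meeting_box(box_i, box_j) lies in the column that imports particle i (it shares i's x and y) and in the slab that imports particle j (it shares j's z), so both particles arrive at the same neutral box even though neither may live there.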
So we're trying to think about what that next generation is going to look like, and in particular how we're going to change — not the laws of physics, but the abstract models we use. Are there ways we can cheat? It's becoming so expensive to import distant atoms to somewhere else: does it really matter to get those effects very accurately, or can we do some things, as I suspect is the case, that are even grosser approximations — functions that are wrong but well behaved, where various errors can be expected to cancel out — and spend more of the effort on the local interactions? I don't know the answer to all of this, but it's going to change. The generation after the one coming up is going to look significantly different, or it won't represent a big improvement in speed again — and that's something we'd like to see. Maybe not forever, but we'd really like to see another substantial increase in the length of time we can simulate.

Are those lights saying please get off the stage, or... it looks better with them. Anyway, I'm going to pretty much close here, and just say what each slide is rather than walking through what's in it. Basically, the way we get this — what's inside the chip — is that all the particles in the tower move through in this way (yes, that's kind of the other extreme, isn't it?), and the ones from the plate move through here, so with a linear amount of input we have a potentially quadratic number of interactions. Then, when the results come off the chip, there's a reduction: the forces on the same particle all get combined, and we're back to linear. So the amount of I/O, which has to be limited pretty strictly, is linear, while the amount of computation can be quadratic. That's a very severe oversimplification, but that's the idea.

Last slide, which is about force fields again — I've already mentioned this. We don't know just what the limits of current molecular mechanics force fields are. We do know now — and especially over the last year, I think we've provided pretty strong evidence — that you can learn about a lot of things that weren't known before, and understand a lot of processes, through long simulations even with today's generation of force fields. But we don't know where the limits are. And one of the advantages is that even though some of the limits may be real limits, we can now simulate long enough that hopefully we'll be able to discover what those limits are, in a way we couldn't have if we couldn't run such long simulations. As I say it, that sounds like kind of a lame excuse — if we run into limitations, well, at least we've found where the limitations are — but it's true, and there are actually some scientifically interesting things about it. Because if we find out what the missing pieces are, where are the problems? Biological problems — I'm not really concerned with the underlying physics; we know what that is. Does it really matter if we approximate this piece? Is there something else that has to be represented quantum mechanically? What matters in terms of the biology is non-obvious, at least to me; the answer to that question is not obvious. This may be a naive way to think about it, but here's how I think about it: a lot of it has to do with how proteins actually evolved, or had to evolve.
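A back-of-envelope Python illustration of the linear-I/O, quadratic-compute point made above: the number of particles a chip must import grows with the sum of the tower and plate populations, while the number of interactions it can then compute on-chip grows with their product. The numbers are made up.

    def io_vs_compute(n_tower, n_plate):
        particles_moved = n_tower + n_plate   # linear I/O
        interactions = n_tower * n_plate      # quadratic on-chip work
        return particles_moved, interactions

    for n in (10, 100, 1000):
        moved, work = io_vs_compute(n, n)
        print(f"import {moved:>5} particles -> {work:>8} interactions")

Doubling the imported particles quadruples the work available per byte moved, which is why limiting I/O while keeping the pipelines busy pays off.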
Is it the case that proteins are designed in a way that makes them robust in the face of thermal noise, so that most mutations — as with silent mutations — don't cause something horrible to happen? Maybe proteins are so robust that it's okay to calculate some of these things wrong, in particular categories. On the other hand, we also know that proteins serve as switches, so they ought to have a careful free-energy balance. In some cases you want them balanced just so: something causes a change, and the protein falls down this side of the hill; and then you want another subtle change to make it go down the other side of the hill. So maybe nature has conspired against us. Nature, after all, does a perfect job of solving the (unintelligible) equation — it's really, really good at that — but we can't do that. So maybe for nature it's fine for everything to be carefully balanced, and if anything's wrong it will mess up our simulations, while nature doesn't care, because it does the calculation exactly. Maybe that's the case. I don't have a good enough intuition — I don't know if anybody does — to know which way that's going to pan out. But hopefully, by getting answers to some of those questions, at least we'll learn something about biophysical theory. And if we're lucky, maybe we really will be able to simulate for long enough, and with new force fields accurately enough, that we can design a lot of drugs and save some lives. We'll see. Let's take some questions.

References

  1. ^ David E. Shaw; Martin M. Deneroff; Ron O. Dror; Jeffrey S. Kuskin; Richard H. Larson; John K. Salmon; Cliff Young; Brannon Batson; Kevin J. Bowers; Jack C. Chao; Michael P. Eastwood; Joseph Gagliardo; J.P. Grossman; C. Richard Ho; Douglas J. Ierardi; István Kolossváry; John L. Klepeis; Timothy Layman; Christine McLeavey; Mark A. Moraes; Rolf Mueller; Edward C. Priest; Yibing Shan; Jochen Spengler; Michael Theobald; Brian Towles; Stanley C. Wang (July 2008). "Anton, a special-purpose machine for molecular dynamics simulation". Communications of the ACM. 51 (7). ACM: 91–97. doi:10.1145/1364782.1364802. ISBN 978-1-59593-706-3. S2CID 52827083. (Related paper published in Proceedings of the 34th Annual International Symposium on Computer Architecture (ISCA '07), San Diego, California, June 9–13, 2007.)
  2. ^ Richard H. Larson; John K. Salmon; Ron O. Dror; Martin M. Deneroff; Cliff Young; J.P. Grossman; Yibing Shan; John L. Klepeis; David E. Shaw (2009). "High-Throughput Pairwise Point Interactions in Anton, a Specialized Machine for Molecular Dynamics Simulation" (PDF). IEEE. ISBN 978-1-4244-2070-4. Archived from the original (PDF) on June 5, 2011. Retrieved January 13, 2009.
  3. ^ Jeffrey S. Kuskin; Cliff Young; J.P. Grossman; Brannon Batson; Martin M. Deneroff; Ron O. Dror; David E. Shaw (2009). "Incorporating Flexibility in Anton, a Specialized Machine for Molecular Dynamics Simulation" (PDF). IEEE. ISBN 978-1-4244-2070-4. Archived from the original (PDF) on December 4, 2008. Retrieved January 13, 2009.
  4. ^ Cliff Young; Ron O. Dror; J.P. Grossman; John K. Salmon; David E. Shaw; Joseph A. Bank; et al. (2009). "A 32x32x32, spatially distributed 3D FFT in four microseconds on Anton". Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis. New York, NY: ACM. pp. 1–11. doi:10.1145/1654059.1654083. ISBN 978-1-60558-744-8. S2CID 5611246.
  5. ^ "National Resource for Biomedical Supercomputing". Archived from the original on May 23, 2010. Retrieved May 14, 2010.
  6. ^ David E. Shaw; Ron O. Dror; John K. Salmon; J.P. Grossman; Kenneth M. Mackenzie; Joseph A. Bank; Cliff Young; Martin M. Deneroff; Brannon Batson; Kevin J. Bowers; Edmond Chow; Michael P. Eastwood; Douglas J. Ierardi; John L. Klepeis; Jeffrey S. Kuskin; Richard H. Larson; Kresten Lindorff-Larsen; Paul Maragakis; Mark A. Moraes; Stefano Piana; Yibing Shan; Brian Towles (2009). "Millisecond-scale molecular dynamics simulations on Anton". Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis - SC '09 (PDF). New York, NY, USA: ACM. pp. 1–11. doi:10.1145/1654059.1654099. ISBN 978-1-60558-744-8. S2CID 4390504. Archived from the original on April 23, 2012. Retrieved April 20, 2012.
  7. ^ Pande Group (March 2017). "Client Statistics by OS". Stanford University. Retrieved February 3, 2012.
  8. ^ Vijay Pande (January 17, 2010). "Folding@home: Paper #72: Major new result for Folding@home: Simulation of the millisecond timescale". Retrieved September 22, 2011.
  9. ^ John Markoff (July 8, 2008). "Herculean Device for Molecular Mysteries". The New York Times. Retrieved April 25, 2010.
  10. ^ David E. Shaw; J.P. Grossman; Joseph A. Bank; Brannon Batson; J. Adam Butts; Jack C. Chao; Martin M. Deneroff; Ron O. Dror; Amos Even (2014). "Anton 2: Raising the Bar for Performance and Programmability in a Special-Purpose Molecular Dynamics Supercomputer". SC14: International Conference for High Performance Computing, Networking, Storage and Analysis. New Orleans, LA: ACM. pp. 41–53. doi:10.1109/SC.2014.9. ISBN 978-1-4799-5499-5. S2CID 3354876.

