Value at risk (VaR) is a measure of the risk of loss for investments. It estimates how much a set of investments might lose (with a given probability), given normal market conditions, in a set time period such as a day. VaR is typically used by firms and regulators in the financial industry to gauge the amount of assets needed to cover possible losses.
For a given portfolio, time horizon, and probability p, the p VaR can be defined informally as the maximum possible loss during that time after we exclude all worse outcomes whose combined probability is at most p. This assumes mark-to-market pricing and no trading in the portfolio.^{[1]}
For example, if a portfolio of stocks has a one-day 5% VaR of $1 million, there is a 0.05 probability that the portfolio will fall in value by more than $1 million over a one-day period if there is no trading. Informally, a loss of $1 million or more on this portfolio is expected on about 1 day in 20 (because of the 5% probability).
More formally, p VaR is defined such that the probability of a loss greater than VaR is (at most) p while the probability of a loss less than VaR is (at least) 1−p. A loss which exceeds the VaR threshold is termed a "VaR breach".^{[2]}
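In symbols, with L the loss of the portfolio over the horizon (a standard formulation equivalent to the verbal definition above; the source states it only in words):

```latex
\operatorname{VaR}_p(L) = \inf\{\, \ell \in \mathbb{R} : P(L > \ell) \le p \,\}
```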
For a fixed p, the p VaR does not assess the magnitude of loss when a VaR breach occurs, and it is therefore considered by some to be a questionable metric for risk management. For instance, assume someone makes a bet that flipping a coin seven times will not give seven heads. The terms are that they win $100 if this does not happen (with probability 127/128) and lose $12,700 if it does (with probability 1/128). That is, the possible loss amounts are $0 or $12,700. The 1% VaR is then $0, because the probability of any loss at all is 1/128, which is less than 1%. They are, however, exposed to a possible loss of $12,700, which can be expressed as the p VaR for any p ≤ 0.78%.^{[3]}
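The coin-flip bet can be checked numerically. A minimal sketch, assuming the convention that VaR is reported as the smallest non-negative loss level whose exceedance probability is at most p (the function name is illustrative, not from the source):

```python
from fractions import Fraction

# The coin-flip bet from the text: win $100 (a loss of -100) with
# probability 127/128, lose $12,700 with probability 1/128.
outcomes = [(Fraction(-100), Fraction(127, 128)),
            (Fraction(12_700), Fraction(1, 128))]

def value_at_risk(p, outcomes):
    """Smallest non-negative loss level L with P(loss > L) <= p."""
    def tail_prob(level):
        # Total probability of losing strictly more than `level`.
        return sum(prob for loss, prob in outcomes if loss > level)
    levels = sorted({Fraction(0)} | {loss for loss, _ in outcomes if loss > 0})
    for level in levels:
        if tail_prob(level) <= p:
            return level
    raise ValueError("p too small for these outcomes")

print(value_at_risk(Fraction(1, 100), outcomes))  # 1% VaR: 0
print(value_at_risk(Fraction(1, 200), outcomes))  # 0.5% VaR: 12700
```

At p = 1%, the exceedance probability of a zero loss is 1/128 ≈ 0.78%, already below p, so the reported VaR is $0; once p drops below 1/128, the full $12,700 loss appears.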
VaR has four main uses in finance: risk management, financial control, financial reporting, and computing regulatory capital. VaR is sometimes used in non-financial applications as well.^{[4]} However, it is a controversial risk management tool.
Important related ideas are economic capital, backtesting, stress testing, expected shortfall, and tail conditional expectation.^{[5]}
Transcription
The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high-quality educational resources for free. To make a donation or view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu. KENNETH ABBOTT: As I said, my name is Ken Abbott. I'm the operating officer for Firm Risk Management at Morgan Stanley, which means I'm the everything-else guy. I'm like the normal stuff with a bar over it. The complement of normal: I get all the odd stuff. I consider myself the Harvey Keitel character. You know, the fixer? And so I get a lot of interesting stuff to do. I've covered commodities, I've covered fixed income, I've covered equities, I've covered credit derivatives, I've covered mortgages. Now I'm also the Chief Risk Officer for the buy side of Morgan Stanley: the investment management business and the private equity holdings that we have. And I look after a lot of that stuff, and I sit on probably 40 different committees, because it's become very, very, very bureaucratic. But that's the way it goes. What I want to talk about today is some of the core approaches we use to measure risk in a market risk setting. This is part of a larger course I teach at a couple of places. I'm a triple alum at NYU, no, I'm a double alum, and now I'm on their faculty [INAUDIBLE]. I have a master's in economics from their arts and sciences program. I have a master's in statistics from Stern, when Stern used to have a stat program. And now I teach at [INAUDIBLE]. I also teach at Claremont and I teach at [INAUDIBLE], part of that program. So I've been through this material many times. So what I want to do is lay the foundation for this notion that we call risk, this idea of VaR. [INAUDIBLE] put this back on. Got it. I'll make it work.
I'll talk about it from a mathematical standpoint and from a statistical standpoint, but also give you some of the intuition behind what it is that we're trying to do when we measure this thing. First, a couple of words about risk management. What does risk management do? 25 years ago, maybe three firms had risk management groups. I was part of the first risk management group at Bankers Trust in 1986. No one else had a risk management group as far as I know. Market risk management really came to be in the late '80s. Credit risk management had obviously been around in large financial institutions the whole time. So our job is to make sure that management knows what's on the books. So step one is, what is the risk profile of the firm? How do I make sure that management is informed about this? So it requires two things. One, I have to know what the risk profile is, because I have to know it in order to be able to communicate it. But the second thing, equally important, particularly important for you guys and girls, is that you need to be able to express relatively complex concepts in simple words and pretty pictures. All right? Chances are if you go to work for a big firm, your boss won't be a quant. My boss happens to have a degree from Carnegie Mellon. He can count to 11 with his shoes on. His boss is a lawyer. His boss is the chairman. Commonly, the most senior people are very, very intelligent, very, very articulate, very, very learned. But not necessarily quants. Many of them have had a year or two of calculus, maybe even linear algebra. You can't show them that. Look, when you and I chat and we talk about regression analysis, I could say x transpose x inverse x transpose y. And those of you that have taken a refresher course think, ah, that's beta hat. And we can just stop it there. I can just put this form up there and you may recognize it. I would have to spend 45 minutes explaining this to people on the top floor, because this is not what they're studying.
So we can talk the code amongst ourselves, but when we go outside our little group (which is getting bigger) we have to make sure that we can express ourselves clearly. That's done in clear, effective prose, and in graphs. And I'll show you some of that stuff as we go on. So step one, make sure management knows what the risk profile is. Step two, protect the firm against unacceptably large concentrations. This is the subjective part. I can know the risk, but how big is big? How much is too much? How much is too concentrated? If I have $1 million of sensitivity per basis point, that's a 1/100th of 1% move in a rate. Is that big? Is that small? How do I know how much? How much of a particular stock issue should I own? How much of a bond issue? How much futures open interest? How big a limit should I have on this type of risk? That's where intuition and experience come into play. So that's the second part of our job: to protect against unacceptably large losses. So the third: no surprises. You can liken the trading business to taking calculated risks. Sometimes you're going to lose. Many times you're going to lose. In fact, if you win 51% of the time, life is pretty good. So what you want to do is make sure you have the right information so you can estimate, if things get bad, how bad will they get? And to do that, we leverage a lot of relatively simple notions that we see in statistics. And so I should use a coloring mask here, not a spotlight. We do a couple of things. Just like the way when they talk about the press in your course about journalism, we can shine a light anywhere we want, and we do all the time. You know what? I'm going to think about this particular kind of risk. I'm going to point out that this is really important. You need to pay attention to it. And then I could shade it. I can make it blue, I can make it red, I can make it green. I'd say this is good, this is bad, this is too big, this is too small, this is perfectly fine.
So that's just a little bit of quick background on what we do. So I'm going to go through as much of this as I can. I'm going to fly through the first part, and I want to hit these, because these are the ways that we actually estimate risk. Variance-covariance [? as ?] a quadratic form. Monte Carlo simulation, the way I'll show you, is based on a quadratic form. And historical simulation is Monte Carlo simulation without the Monte Carlo part. It's using historical data. And I'll go through that fairly quickly. Questions, comments? No? Excellent. Stop me. Look, if any one of you doesn't understand something I say, probably many of you don't understand it. I don't know you guys, so I don't know what you know and what you don't know. So if there's a term that comes up you're not sure about, just say, Ken, I don't have a PhD. I work for a living. I make fun of academics. I know you work for a living too. All right. There's a guy I tease at Claremont [INAUDIBLE] in this class, I say, who is this pointy-headed academic [INAUDIBLE]. Only kidding. All right, so I'm going to talk about one-asset value at risk. First I'm going to introduce the notion of value at risk. I'm going to talk about one asset. I'm going to talk about price-based instruments. We're going to go into yield space, so we'll talk about the conversions we have to do there. One thing I'll do after this class is over, since I know I'm going to fly through some of the material, and since this is MIT, I'm sure you're used to just flying through material. And there's a lot of this, the proof of which is left to the reader as an exercise. I'm sure you get a fair amount of that. I will give you papers. If you have questions, my email is on the first page. I welcome your questions. I tell my students that every year. I'm OK with you sending me an email asking me for a reference, a citation, something. I'm perfectly fine with that. Don't worry, oh, he's too busy. I'm fine.
If you've got a question, something is not clear, I've got access to thousands of papers. And I've screened them. I've read thousands of papers; I say this is a good one, that's a waste of time. But I can give you background material on regulation, on bond pricing, on derivative algorithms. Let me know. I'm happy to provide that at any point in time. You get that free with your tuition. A couple of key metrics. I don't want to spend too much time on this. Interest rate exposure, how sensitive am I to changes in interest rates, equity exposure, commodity exposure, credit spread exposure. We'll talk about linearity; we won't talk too much about regularity of cash flow. We won't really get into that here. And we need to know correlation across different asset classes. And I'll show you what that means. At the heart of this notion of value at risk is the idea of an order statistic. Who here has heard of order statistics? All right, I'm going to give you 30 seconds. The best simple description of an order statistic. PROFESSOR: The maximum or the minimum of a set of observations. KENNETH ABBOTT: All right? When we talk about value at risk, I want to know the worst 1% of the outcomes. And what's cool about order statistics is they're well established in the literature. Pretty well understood. And so people are familiar with it. Once we put our toe into the academic water and we start talking about this notion, there's a vast body of literature that says this is how this thing behaves. This is what the distribution looks like. And so we can estimate these things. And so what we're looking at in value at risk is my distribution of returns: how much I make. In particular, if I look historically, I have a position. How much would this position have earned me over the last n days, n weeks, n months? If I look at a frequency distribution of that, I'm likely to get something that's symmetric.
I'm likely to get something that's unimodal. It may or may not have fat tails. We'll talk about that a little later. If my return distribution were beautifully symmetric and beautifully normal and independent, then I could measure the risk with this 1% order statistic. What's the 1% likely worst-case outcome tomorrow? I might do that by integrating the normal function from negative infinity (for all intents and purposes, five or six standard deviations out). Anyway, from negative infinity to negative 2.33 standard deviations. Why? Because the area under the curve, that's 0.01. Now this is a one-sided confidence interval as opposed to a two-sided confidence interval. And this is one of these things that as an undergrad you learn two-sided, and then the first time someone shows you one-sided you're like, wait a minute. What is this? Then you say, oh, I get it. You're just looking at the area. I could build a gazillion two-sided confidence intervals. One-sided, it's got to stop at one place. All right, so this is the set of outcomes, and this is standardized, in standard deviation space: negative infinity to negative 2.33. If I want 95%, or a 5% likely loss, so I could say, tomorrow there's a 5% chance my loss is going to be x or greater, I would go to negative 1.645 standard deviations. Because the integral from negative infinity to negative 1.645 standard deviations is about 0.05. It's not just a good idea, it's the law. Does that make sense? And again, I'm going to say assuming the normal. That's like the old economist joke, assume a can opener when he's on a desert island. You guys don't know that one. I got lots of economics jokes. I'll tell them later on maybe, or after class. If I'm assuming a normal distribution, and that's what I'm going to do, I'm going to set this thing up in a normal distribution framework. Now this approach of assuming normal distributions, I liken it to using Latin. Nobody really uses it anymore, but everything we do is based upon it.
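The one-sided quantiles quoted here (2.33 for 1%, 1.645 for 5%) can be verified with the standard normal distribution in Python's standard library; this is just a numerical check, not part of the lecture:

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# One-sided quantiles: the z-value with the given probability to its left.
z_1pct = std_normal.inv_cdf(0.01)   # about -2.33
z_5pct = std_normal.inv_cdf(0.05)   # about -1.645

# Sanity check: the area to the left of each quantile recovers the tail mass.
print(round(z_1pct, 3), round(std_normal.cdf(z_1pct), 4))
print(round(z_5pct, 3), round(std_normal.cdf(z_5pct), 4))
```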
So that's our starting point. And it's really easy to teach it this way, and then we relax the assumptions, like so many things in life. I teach you the strict case, then we relax the assumptions to get to the way it's done now. So this makes sense? All right. So let's get there. This is way oversimplified, but let's say I have something like this. Who has taken intermediate statistics? We have the notion of stationarity that we talk about all the time. The mean and variance being constant is one simplistic way of thinking about it. Do you have a better way for me to put that to them? Because you know what their background would be. PROFESSOR: No. KENNETH ABBOTT: All right. Just, mean and variance are constant. When I look at the time series itself, the time series mean and the time series variance are not constant. And there also could be other time series stuff going on. There could be seasonality, there could be autocorrelation. This looks something like a random walk, but it's not stationary. It's hard for me to draw inference by looking at that alone. So if we want to try to predict what's going to happen in the future, it's kind of hard. And the game that we're playing here is, we want to know how much money do I need to hold to support that position? Now, who here has taken an accounting course? All right, word to the wise: there are two things I tell students in quant finance programs. First of all, I know you have to take a time series course, I'm sure, this is MIT. If you don't get a time series course, get your money back, because you've got to take time series. Accounting is important. Accounting is important because so much of what we do, the way we think about things, is predicated on the dollars. And you need to know how the dollars are recorded. Quick aside. Balance sheet. I'll give you a 30-second accounting lecture. Assets: what we own. Everything we own, all our stuff, is assets. We came to that stuff one of two ways.
We either paid for it out of our pocket, or we borrowed money. There's no third way. So everything we own, we either paid for out of our pocket or borrowed money. The amount we paid for out of our pocket is the equity. The ratio of this to this is called leverage, among other things. All right? If I'm this company, I have this much stuff, and I bought it with this much debt and this much equity. Again, that's a gross oversimplification. When this gets down to zero, it's game over. Belly up. All right? Does that make sense? Now you've taken a semester of accounting. No, only kidding. But it's actually important to have a grip on how that works. Because what we need to make sure of is that if we're going to take this position and hold it, with some level of certainty, every time we lose money this gets reduced. When this goes down to zero, I go bankrupt. So that's what we're trying to do. We need to protect this, and we do it by knowing how much of this could move against us. Everybody with me? Anybody not with me? It's OK to have questions, it really is. Excellent. All right, so if I do a frequency distribution of this time series, I just say, show me the frequency with which this thing shows up. I get this thing, it's kind of trimodal. It's all over the place. It doesn't tell me anything. If I look at the levels, the frequency distribution, the relative frequency distribution of the levels themselves, I don't get a whole lot of intuition. If I go into return space, which is either looking at the log differences from day to day, or the percentage changes from day to day, or perhaps the absolute changes from day to day (it varies from market to market), oh, look, now we're in familiar territory. So what I'm doing here, and this is why I started out with a normal distribution: this thing is unimodal. It's more or less symmetric. Right? Now is it a perfect measure? No, because it's probably got fat tails.
So it's a little bit like looking for the glasses you lost up on 67th Street down on 59th Street, because there's more light there. But it's a starting point. So what I'm saying to you is once I difference it, no, I won't talk about [INAUDIBLE]. Once I difference the time series, once I take the time series and look at the percentage changes, and I look at the frequency distribution of those changes, I get this, which is far more unimodal. And I can draw inference from that. I can say, ah, now if this thing is normal, then I know that x% of my observations will take place over here. Now I can start drawing inferences. And a thing to keep in mind here, one thing we do constantly in statistics is we do parameter estimates. And remember, every time you estimate something, you estimate it with error. I think that's maybe the single most important thing I learned when I got my statistics degree. Everything you estimate, you estimate with error. People do means, they say, oh, it's x. No, that's the average, and that's an unbiased estimator, but guess what, there's a huge amount of noise. And there's a certain probability that you're wrong by x%. So every time we come up with a number, when somebody tells me the risk is 10, that means it's probably not 10,000, it's probably not zero. Just keep that in mind. Just sort of throw that in on the side for nothing. All right, so when I take the returns of this same time series, I get something that's unimodal, symmetric, may or may not have fat tails. That has important implications for whether or not my normal distribution underestimates the amount of risk I'm taking. Everybody with me on that, more or less? Questions? Now would be the time. Good enough? He's lived this. All right. So once I have my time series of returns, which I just plotted there, I can gauge their dispersion with this measure called variance. And you guys probably know this. Variance is the expected value of x i minus x bar, I love these thick chalks, squared.
And it's the sum of x i minus x bar squared over n minus 1. It's a measure of dispersion. Variance has [INAUDIBLE]. Now, I should say that this is sigma squared hat. Right? Estimate. Parameter estimate. Parameter. Parameter estimate. This is measured with error. Anybody here know what the distribution of this is? Anyone? $5. Close. m chi squared. Worth $2. Talk to me after class. It's a chi squared distribution. What does that mean? That means that we know it can't be 0 or less than 0. If you figure out a way to get variances less than zero, let's talk. And it's got a long right tail, but that's because this is squared. [INAUDIBLE] one point can move it up. Anyway, once I have my returns, I have a measure of the dispersion of these returns called variance. I take the square root of the variance, which is the standard deviation, or the volatility. When I'm doing it with a data set, I usually refer to it as the standard deviation. When I'm referring to the standard deviation of the distribution, I usually call it the standard error. Is that a law or is that just common parlance? PROFESSOR: Both. The standard error is typically for something that's random, like an estimate, whereas the standard deviation is more for a sample. KENNETH ABBOTT: Empirical. See, it's important, because when you first learn this, they don't tell you that. And they flip them back and forth. And then when you take the intermediate courses, they say, no, don't use standard deviation when you mean standard error. And you'll get points off on your exam for that, right? All right, so, the standard deviation is the square root of the variance, also called the volatility. In a normal distribution, 1% of the observations is outside of 2.33 standard deviations. For 95%, it's out past 1.64, 1.645 standard deviations. Now you're saying, wait a minute, where did my 1.96 go that I learned as an undergrad? Two-sided.
So if I go from the mean to 1.96 standard deviations on either side, that encompasses 95% of the total area of the integral from negative infinity to positive infinity. Everybody with me on that? Does that make sense? The two-sided versus one-sided, that's confused me. When I was your age, it confused me a lot. But I got there. All right, so this is how we do it. Excel functions are var and you don't need to know that. All right, so in this case, I estimated the variance of this particular time series. I took the standard deviation by taking the square root of the variance. It's in percentages. When you do this, I tell you, it's like physics: your units will screw you up every time. What am I measuring? What are my units? I still make units mistakes. I want you to know that. And I'm in this business 30 years. I still make units mistakes. Just like physics. I'm in percentage change space, so I want to talk in terms of percentage changes. The standard deviation is 1.8% of that time series I showed you. So 2.33 times the standard deviation is about 4.2%. What that says, given this data set, one time series, I'm saying: I expect to lose, on any given day, if I have that position, 99% of the time, 4.2% of it or less. Very important. Think about that. Is that clear? That's how I get there. I'm making a statement about the probability of loss. I'm saying there's a 1% probability, for that particular time series. Which is, all right? If this is my historical data set, and it's my only historical data set, and I own this, tomorrow I may be 4.2% lighter than I was today, because the market could move against me. And I'm 99% sure, if the future's like the past, that my loss tomorrow is going to be 4.2% or less. That's [? VaR. ?] Simplest case: assuming a normal distribution, single asset, not fixed income. Yes, no? Questions, comments? AUDIENCE: Yes, [INAUDIBLE] positive and [INAUDIBLE]. KENNETH ABBOTT: Yes, yes.
Assuming my distribution is symmetric. Now that's the right assumption to point out. Because in the real world, it may not be symmetric. And when we go into historical simulation, we use empirical distributions, where we don't care if it's symmetric, because we're only looking at the downside. And whether I'm long or short, I might care about the downside or the upside. Because if I'm short, I care about how much it's going to move up. Make sense? That's the right question to ask. Yes? AUDIENCE: [INAUDIBLE] if you're doing it for upside as well? KENNETH ABBOTT: Yes. AUDIENCE: Could it just be the same thing? KENNETH ABBOTT: Yes. In fact, in this case, in what we're doing here of variance-covariance or closed-form VaR, it's for long or short. But getting your signs right, I'm telling you, it's like physics. I still make that mistake. Yes? AUDIENCE: [INAUDIBLE] symmetric. Do you guys still use this process to say, OK KENNETH ABBOTT: I use it all the time as a heuristic. All right? Because let's say, and that's a very good question, let's say I've got five years' worth of data and I don't have time to do an empirical estimate. It could be lopsided. If you tell me a two standard deviation move is x, that means something to me. Now, there's a problem with that. And the problem is that people extrapolate that. Sometimes people talk to me and, oh, it's an eight standard deviation move. Eight standard deviation moves don't happen. I don't think we've seen an eight standard deviation move in the Cenozoic era. It just doesn't happen. Three standard deviations? You will see a three standard deviation move once every 10,000 observations. Now, I learned this the hard way by just, see how many times do I have to do this? And then I looked it up in the table, oh, I was right. When we oversimplify and start to talk about everything in terms of that normal distribution, we really just lose our grip on reality. But I use it as a heuristic all the time.
I'll do it even now, and I know better. But I'll go, what's two standard deviations? What's three standard deviations? Because by and large, and I still do this, I get my data and I line it up and I do frequency distributions. Hold on, I do this all the time with my data. Is it symmetric? Is it fat-tailed? Is it unimodal? So that's a very good question. Any other questions? AUDIENCE: [INAUDIBLE] have we talked about the [? standard t ?] distribution? PROFESSOR: We introduced it in the last lecture. And the problem set this week does relate to that. KENNETH ABBOTT: All right, perfect lead-in. So the statement I made, it's 1% of the time I'd expect to lose more than 4.2 pesos on a 100 peso position. That's my inferential statement. In fact, over the same time period I lost 4.2% or more 1.5% of the time instead of 1% of the time. What that tells me, what that suggests to me, is my data set has fat tails. What that means, a simple way of thinking about it ([INAUDIBLE] care what that means in a metaphysical sense), is a way to interpret it: the likelihood of a loss is greater than would be implied by the normal distribution. All right? So when you hear people say fat tails, generally, that's what they're talking about. There are different ways you could interpret that statement, but when somebody is talking about a financial time series, it has fat tails. Roughly 3/4 of your financial time series will have fat tails. They will also have time series properties; they won't be true random walks. A true random walk says that I don't know whether it's going to go up or down based on the data I have. The time series has no memory. When we start introducing time series properties, which many financial time series have, then there's seasonality, there's mean reversion, there's all kinds of other stuff, other ways that we have to think about modeling the data. Make sense? AUDIENCE: [INAUDIBLE] higher standard deviation than [INAUDIBLE].
KENNETH ABBOTT: Say it once again. AUDIENCE: Better yield, does it mean that we have a higher standard deviation than [INAUDIBLE]? KENNETH ABBOTT: No. The standard deviation is the standard deviation. No matter what I do, this is the standard deviation, that's it. We don't have a higher standard deviation. But the likelihood of the, put it this way, the likelihood of a move of more than 2.33 standard deviations is more than 1%. That's the way I think of it. Make sense? AUDIENCE: Is there any way for you to [INAUDIBLE] to KENNETH ABBOTT: What? AUDIENCE: Sorry, is there any way to put into that graph what a fatter tail looks like? KENNETH ABBOTT: Oh, well, be patient. If we have time. In fact, we do that all the time. And one of our techniques doesn't care. It goes to the empirical distribution. So it captures the fat tails completely. In fact, the homework assignment, which I usually precede this lecture with, has people graphing all kinds of distributions to see what these things look like. We won't have time for that. But if you have questions, send them to me. I'll send you some stuff to read about this. All right, so now you know one-asset VaR; now you're qualified to go work for a big bank. All right? Get your data, calculate returns. Now I usually put in step 2b: graph your data and look at it. All right? Because everybody's data has dirt in it. Don't trust anyone else. If you're going to get fired, get fired for being incompetent; don't get fired for using someone else's bad data. Don't trust anyone. My mother gives me data, Mom, I'm graphing it. Because I think you let some poop slip into my data. Mother Teresa could come to me with a thumb drive [INAUDIBLE] S&P 500. Sorry, Mother Teresa. I'm graphing it before I use it. All right? So I don't want to say that this is usually in here. We do extensive error testing. Because there could be bad data, there could be missing data. And missing data is a whole other lecture that I give. You might be shocked at [INAUDIBLE].
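The single-asset calculation described above (take returns, compute the standard deviation, multiply by 2.33) can be sketched as follows; the price series and position size are made up for illustration:

```python
from statistics import NormalDist, stdev

# Hypothetical daily closing prices standing in for the lecture's time series.
prices = [100.0, 101.2, 100.4, 102.5, 100.8, 101.2, 98.7, 100.2, 100.0,
          100.9, 99.0, 99.6, 98.5, 100.8, 100.2, 100.3, 98.9]

# Step 1: move into return space (daily percentage changes).
returns = [(b - a) / a for a, b in zip(prices, prices[1:])]

# Step 2: measure dispersion with the sample standard deviation (volatility).
sigma = stdev(returns)

# Step 3: scale by the one-sided 99% normal quantile (about 2.33).
z99 = -NormalDist().inv_cdf(0.01)
one_day_var = z99 * sigma  # as a fraction of the position's value

position = 1_000_000
print(f"volatility {sigma:.2%}, 99% one-day VaR ${one_day_var * position:,.0f}")
```

Graphing the return series first, as the talk insists, would come before step 2 in practice.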
So for one-asset VaR: get my data, create my return series. Percentage changes, log changes. Sometimes that's what the difference is. Take the variance, take the square root of the variance, multiply by 2.33. Done and dusted. Go home, take your shoes off, relax. OK. Percentage changes versus log changes. For all intents and purposes, it doesn't really matter, and I will often use one or the other. The way I think about this: all right, there'll be a little bit of bias at the ends. But for the overwhelming bulk of the observations, whether you use percentage changes or log changes doesn't matter. Generally, even though I know the data is closer to log-normally distributed than normally distributed, I'll use percentage changes just because it's easier. Why would we use the log normal distribution? Well, when we're doing simulation, the log normal distribution has this very nifty property of keeping your yields from going negative. But even that I can call into question, because there are instances of yields going negative. It's happened. Doesn't happen a lot, but it happens. All right. So I talked about bad data, talked about one-sided versus two-sided. I'll talk about longs and shorts a little bit later when we're talking multi-asset. I'm going to cover a fixed income piece. We use this thing called a PV01, because what I measure in fixed income markets isn't a price; I measure a yield. I have to get from a change of yield to a change of price. Hm, sounds like a Jacobian, right? It's kind of a poor man's Jacobian. It's a measure that captures the fact that my price-yield relationship is nonlinear. For any small approximation I look at the tangent. And I use my PV01, which is a similar notion to duration, but PV01 is a little more practical. The slope of that tells me how much my price will change for a given change of yield. See, there it is. You knew you were going to use the calculus, right? You're always using the calculus.
You can't escape it. But the price-yield line is nonlinear. But for all intents and purposes, what I'm doing is shifting my yield change into a price change by multiplying my yield change by my PV01, which is my price sensitivity to a 1/100th percent move in yields. Think about that for a second. I would love to spend an hour on this, and on trading strategies, and on bull steepeners and bear steepeners and barbell trades, but we don't have time for that. Suffice it to say, if I'm measuring yields, the thing is going to trade at a 7.89 or a 6.22 or a 4.01 yield. How do I get that into a change in price? Because I can't tell my boss, hey, I had a good day. I bought it at 4.02 and sold it at 4.01. No, how much money did you make? How do I get from change in yield to change in price? Usually PV01. I could use duration. Bond traders, who think in terms of yield to coffee break, yield to lunch time, yield to go home at the end of the day, typically think in terms of PV01. Do you agree with that statement? AUDIENCE: [INAUDIBLE] KENNETH ABBOTT: How often on the fixed income [? desk ?] did you use duration measures? AUDIENCE: Well, actually, [INAUDIBLE]. KENNETH ABBOTT: Because of the investor horizon? OK, the insurance companies. Very important point I want to reach here as a quick aside. You're going to hear this notion of PV01, which is also called PVBP or DV01. That's the price sensitivity to a one basis point move. One basis point is 1/100th of a percent in yield. Duration is the half-life, essentially, of my cash flows. What's the weighted expected time to all my cash flows? If my duration is 7.9 years, my PV01 is probably about $790 per million. In terms of significant digits, they're roughly the same, but they have different meanings and the units are different. Duration is measured in years, PV01 is measured in dollars.
In bond space I typically think in PV01. If I'm selling to long-term investors, they have particular demands because they've got cash flow payments they have to hedge. So they may think of it in terms of duration. For our purposes, we're talking DV01 or PV01 or PVBP; those three terms are more or less equal. Make sense? Yes? AUDIENCE: [INAUDIBLE] in terms of [INAUDIBLE] versus [INAUDIBLE]? KENNETH ABBOTT: We could. In some instances, in some areas and options, we might look at an overall 1% move. But we have to look at what trades in the market. What trades in the market is the yield. When we quote the yield, I'm going to quote it going from 7.02 to 7.01. I'm not going to have the calculator handy to say, for a move from 7.02 to 7.01, what's 7.02 minus 7.01 divided by 7.02? Make sense? It's the path of least resistance. What's the difference between a bond and a bond trader? A bond matures. A little fixed income humor for you. Apparently very little. I don't want to spend too much time on this because we just don't have the time. I provide an example here. If you guys want examples, contact me. I'll send you the spreadsheets I use for other classes if you just want to play around with them. When I talk about PV01, when I talk about yields, I usually have some kind of risk-free rate. Although this whole notion of the risk-free rate, so much of modern finance is predicated on this assumption that there is a risk-free rate, which used to be considered the US Treasury. It used to be considered risk-free. Well, there's a credit spread out there for the US Treasury. I don't mean to throw a monkey wrench into the works. But there's no such thing. I'm not going to question 75 years of academic finance. But it's troublesome. Just like when I was taking economics 30 years ago, inflation just mucked with everything. All of the models fell apart. There were appendices to every chapter on how you have to change this model to address inflation.
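The PV01 idea above, the dollar price change for a one basis point move in yield, can be sketched with a toy bond pricer. The bond here (10-year, annual-pay, 6% coupon, priced at a 6% yield) is a made-up example, not one from the lecture; note the result is in the same ballpark as the "$790 per million for a 7.9-year duration" rule of thumb mentioned above.

```python
import numpy as np

def bond_price(face, coupon_rate, y, years):
    # Simplified annual-pay bond pricer: discount each cash flow at yield y.
    t = np.arange(1, years + 1)
    cfs = np.full(years, face * coupon_rate)
    cfs[-1] += face  # principal returned at maturity
    return np.sum(cfs / (1 + y) ** t)

# PV01: price sensitivity to a one basis point (0.01%) move in yield,
# per $1 million face. The "poor man's Jacobian" from yield to price.
face = 1_000_000
p0 = bond_price(face, 0.06, 0.06, 10)          # par bond
p1 = bond_price(face, 0.06, 0.06 + 0.0001, 10)  # yield up 1bp
pv01 = p0 - p1
print(round(pv01, 2))
```

Because the tangent slope is negative, yields up means price down, which is why the lecture says the interest rate sensitivity goes into the position vector with a negative sign.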
And then inflation went away and everything was better. But this may not go away. I've got two components here. If the yield is 6%, I might have a 4.50 Treasury rate and a 150 basis point credit spread. The credit spread reflects the probability of default. And I don't want to get into measures of risk neutrality here. But if I'm an issuer and I have a chance of default, I have to pay my investors more. Usually when we measure sensitivity we talk about the credit spread sensitivity and the risk-free sensitivity. We say, well, how could they possibly be different? And I don't want to get into detail here, but the notion is, when credit spreads start getting high, it implies a higher probability of default. You have to think about credit spread sensitivity a little differently. Because when you get to a 1,000 basis point, 1,500 basis point credit spread, it's a high probability of default. And your credit models will think differently. Your credit models will say, ah, that means I'm not going to get my next three payments. There's a probability of default, there's a loss given default, and there's recovery. A bunch of other stochastic measures come into play. I don't want to spend any more time on it because it's just going to confuse you now. Suffice it to say we have these yields, and yields are composed of risk-free rates and credit spreads. And I apologize for rushing through that, but we don't have time to do it. Typically you have more than one asset. So in this framework I take 2.33 standard deviations times my dollar investment, or my renminbi investment or my sterling investment. That example was with one asset. If I want to expand this, I can expand it using this notion of covariance and correlation. You guys covered correlation and covariance at some point in your careers? Yes, no? All right? Both of them measure the way one asset moves vis-a-vis another asset. Correlation is scaled between negative 1 and positive 1.
So I think of correlation as an index of linearity. Covariance is not scaled. I'll give you an example of the difference between covariance and correlation. What if I have 50 years of data on crop yields and that same 50 years of data on tons of fertilizer used? I would expect a positive correlation between tons of fertilizer used and crop yields. So the correlation would be between negative 1 and positive 1. The covariance could be any number, and that covariance will change depending on whether I measure my fertilizer in tons, or in pounds, or in ounces, or in kilos. The correlation will always be exactly the same. The linear relationship is captured by the correlation. But in covariance, the units count. If I have covariance, here it is. Covariance matrices are symmetric. They have the variances along the diagonal. And the covariances are on the off-diagonal. Which is to say that the variance is the covariance of an item with itself. The correlation matrix, also symmetric, is the same thing scaled with correlations, where the diagonal is 1.0. If I have the covariances, I can get the correlations, because correlation is covariance divided by the product of the standard deviations. That gets me, sorry, correlation hat. The hat is like the apostrophe in French. You forget it all the time. But the one time you really need it, you won't do it and you'll be in trouble. If you have the covariances, you can get to the correlations. If you have the correlations, you can't get to the covariances unless you know the variances. That's a classic midterm question. I give that almost every year, maybe every other year. I don't have time to spend much more on it. Suffice it to say, this measure of covariance says, when x is a certain distance from its mean, how far is y from its mean and in what direction? Yes? Now this is just a little empirical stuff, because I'm not as clever as you guys. And I don't trust anyone. I read it in the textbook, I don't trust anyone. a, b, here's a plus b.
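The scaling just described, correlation equals covariance divided by the product of the standard deviations, and the point that you can go from covariances to correlations but not back without the variances, can be checked directly. The 3x3 covariance matrix here is made up for illustration.

```python
import numpy as np

# Toy covariance matrix: symmetric, variances on the diagonal,
# covariances on the off-diagonal.
cov = np.array([[4.0, 1.2, 0.8],
                [1.2, 9.0, 2.1],
                [0.8, 2.1, 1.0]])

# rho_ij = cov_ij / (sigma_i * sigma_j): scale by the standard deviations.
sd = np.sqrt(np.diag(cov))
corr = cov / np.outer(sd, sd)
print(np.round(corr, 3))  # diagonal is 1.0, entries in [-1, 1]

# Going back requires knowing the variances -- the classic midterm point.
cov_back = corr * np.outer(sd, sd)
```

Changing the units of one variable rescales a row and column of `cov` but leaves `corr` untouched, which is the fertilizer-in-tons-versus-kilos point above.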
Variance of a plus b is variance of a plus variance of b plus 2 times covariance of a and b. It's not just a good idea, it's the law. I saw it in a thousand statistics textbooks; I tested it anyway. Because if I want to get fired, I'm going to get fired for making my own mistake, not for making someone else's mistake. I do this all the time. And I just prove it empirically here. The proof of which will be left to the reader as an exercise. I hated when books said that. PROFESSOR: I actually kind of think that's a proven point, that you really should never trust output from computer programs or packages KENNETH ABBOTT: Or your mother, or Mother Teresa. PROFESSOR: It's good to check them. Check all the calculations. KENNETH ABBOTT: Mother Teresa will slip you some bad data if she can. I'm telling you, she will. She's tricky that way. Don't trust anyone. I've caught mistakes in software, all right? I had a programmer, it's one of my favorite stories. We're doing one of our first Monte Carlo simulations, and we're factoring a matrix. If we have time, we'll get to it. So I factor a covariance matrix into E transpose lambda E. It's our friend the quadratic form. We're going to see this again. And this is a diagonal matrix of eigenvalues. And I take the square root of that. So I can say this is E transpose lambda to the 1/2, lambda to the 1/2, E. And so my programmer had gotten this, and I said, do me a favor. Take this, and transpose and multiply by itself. So take the square root and multiply it by the other square root, and show me that you get this. Just show me. He said, I got it. I said, you got it? He said, out to 16 decimals. I said, stop. On my block, the square root of 2 times the square root of 2 equals 2.0. All right? 2.0000000. What do you mean, out to 16 decimal places? What planet are you on? And I scratched the surface, and I dug, and I asked a bunch of questions. And it turned out in this code he was passing a float to a [? fixed. ?] All right? Don't trust anyone's software.
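The "test it anyway" check described above takes three lines: the sample identity var(a + b) = var(a) + var(b) + 2 cov(a, b) holds exactly, not just approximately, as long as the same degrees-of-freedom convention is used throughout. The data here is simulated purely to run the check.

```python
import numpy as np

# Empirical check of var(a + b) = var(a) + var(b) + 2*cov(a, b).
rng = np.random.default_rng(42)
a = rng.normal(0, 1, 10_000)
b = 0.5 * a + rng.normal(0, 1, 10_000)  # make b correlated with a

lhs = np.var(a + b, ddof=1)
rhs = np.var(a, ddof=1) + np.var(b, ddof=1) + 2 * np.cov(a, b)[0, 1]
print(np.isclose(lhs, rhs))  # the law holds in the sample, to machine precision
```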
Check it yourself. Someday when I'm dead and you guys are in my position, you'll be thanking me for that. Put a stone on my grave or something. All right, so covariance. Covariance gives me some measure of, when x moves, how far does y move? [? Or ?] for any other asset? Could I have a piece of your cookie? I hardly had lunch. You want me to have a piece of this, right? It's just looking very good there. Thank you. It's foraging. I'm convinced 10 million years ago, my ape ancestors were the first ones at the dead antelope on the plains. All right. So we're talking about correlation and covariance. Covariance is not unit-free. I can use either, but I have to make sure I get my units right. Units screw me up every time. They still screw me up. That was a good cookie. All right. So more facts. Variance of xa plus yb is x squared variance of a plus y squared variance of b plus 2xy covariance of a and b. You guys have seen this before? I assume you have. Now I can get pretty silly with this if I want. xa, yb, you get the picture, right? But what you should be thinking: this is a covariance matrix, sigma squared, sigma squared, sigma squared. It's the sum of the variances plus 2 times the sum of the covariances. So if I have one unit of every asset, and I've got n assets, all I have to do to get the portfolio variance is sum up the whole covariance matrix. Now, you never get only one unit, but just saying. But you notice that this is kind of a regular pattern that we see here. And so what I can do is use a combination of my correlation matrix and a little bit of linear algebra legerdemain to do some very convenient calculations. And here I just give an example of a covariance matrix and a correlation matrix. Note the correlation matrix entries are between negative 1 and positive 1. All right. Let me cut to the chase here. I'll draw it here because I really want to get into some of the other stuff. What this means: I have a covariance structure, sigma.
And I have a vector of positions, x dollars in dollar-yen, y dollars in gold, z dollars in oil. And let's say I've got a position vector, x1, x2, x3, up to xn. I have all my positions recorded as a vector. This is asset one, asset two, and this is in dollars. And I have the covariance structure. The variance of this portfolio that has these assets and this covariance structure, and this is where the magic happens, is x transpose sigma x equals sigma squared hat portfolio. Now you really could go work for a bank. This is how portfolio variance, using the variance-covariance method, is done. In fact, when we were doing it this way 20 years ago, spreadsheets only had 256 columns. So we tried to simplify everything into 256, or sometimes you had to sum it up using two different spreadsheets. We didn't have multitab spreadsheets. That was a dream, multitab spreadsheets. This was Lotus 1-2-3 we're talking about here, OK? You guys don't even know what Lotus 1-2-3 is. It's like an abacus but on the screen. Yes? AUDIENCE: What's x again in this? KENNETH ABBOTT: Position vector. Let's say I tell you that you've got dollar-yen, gold, and oil. You've got $100 of dollar-yen, $50 of oil, and $25 of gold. It would be 100, 50, 25. Now, if you've got $100 of dollar-yen, your position vector would actually show up as negative 100, 50, 25. Why is that? Because if I'm measuring my dollar-yen, and this is just a little aside, typically I measure dollar-yen in yen per dollar. So dollar-yen might be 95. If I'm a dollar investor and I own yen, and yen goes from 95 per dollar to 100 per dollar, do I make or lose money? I lose money. Negative 100. Just store that. You won't be tested on that, but we think about that all the time. Same thing with yields. Typically, when I record my PV01, and I'll record some version of my PV01 in that vector, my interest rate sensitivity, I'm going to record it as a negative. Because when yields go up and I own the bond, I lose money.
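The "where the magic happens" step, portfolio variance as x transpose sigma x, looks like this in code. The position vector follows the sign convention above (dollar-yen negative because it is quoted in yen per dollar); the covariance matrix is a hypothetical one for illustration, not real market data.

```python
import numpy as np

# Position vector in dollars: dollar-yen (negative, per the yen-per-dollar
# quoting convention in the lecture), oil, gold.
x = np.array([-100.0, 50.0, 25.0])

# Hypothetical daily-return covariance matrix for the three assets.
sigma = np.array([[0.0001 , 0.00002, 0.00001],
                  [0.00002, 0.0004 , 0.00005],
                  [0.00001, 0.00005, 0.0009 ]])

# Portfolio variance = x' Sigma x; scale by 2.33 for the 1% one-day VaR.
port_var = x @ sigma @ x
port_sd = np.sqrt(port_var)
var_99 = 2.33 * port_sd
print(round(var_99, 4))
```

With one unit of every asset (`x` all ones), `x @ sigma @ x` is just the sum of the whole covariance matrix, which is the regular pattern the lecture points out.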
Signs, very important. And, again, usually I do this in a two-hour lecture. And we've covered it in less than an hour, so pretty good. All right. I spent a lot more time on the fixed income. [STUDENT COUGHING] Are you taking something for that? That does not sound healthy. I don't mean to embarrass you. But I just want to make sure that you're taking care of yourself, because grad students, I was a grad student, I didn't take care of myself very well. I worry. All right. Big picture, variance-covariance: collect data, calculate returns, test the data, matrix construction, get my position vector, multiply my matrices. All right? Quick and dirty, that's how we do it. That's the simplified approach to measuring this order statistic called value at risk using this particular technique. Questions, comments? Anyone? Anything you think I need to elucidate on that? And this is, in fact, how we did this up until the late '90s. Firms used variance-covariance. I heard a statistic in Europe in 1996 that 80% of the European banks were using this technique to do their value at risk. It was no more complicated than this. I use a little flow diagram. Get your data returns, graph your data to make sure you don't screw it up. Get your covariance matrix, multiply your matrices out: x transpose sigma x, using the position vectors. And then you can do your analysis. Normally I would spend some more time on that bottom row and the different things you can do with it, but that will have to suffice for now. A couple of points I want to make before we move on about the assumptions. Actually, I'll fly through this here so we can get into Monte Carlo simulation. Where am I going to get my data? Where do I get my data? I often get a lot of my data from Bloomberg, I get it from public sources, I get it from the internet. Especially when you get it from, look, if it says so on the internet, it must be true. Right? Didn't Abe Lincoln say, don't believe everything you read on the internet?
That was a quote, I saw that some place. You get data from people, you check it. There are some sources that are very reliable. If you're looking for yield data or foreign exchange data, the Federal Reserve has it. And they have it back 20 years, daily data. It's the H.15 and the H.10. It's there, it's free, it's easy to download, just be aware of it. PROFESSOR: [INAUDIBLE] study posted on the website that goes through computations for regression analysis and asset pricing models, and the data that's used there is from the Federal Reserve for yields. KENNETH ABBOTT: If it's for yields, it's probably from the H.15. [INTERPOSING VOICES] PROFESSOR: Those files, you can see how to actually get that data for yourselves. KENNETH ABBOTT: Now, another great source of data is Bloomberg. The good thing about Bloomberg data is everybody uses it, so it's clean. Relatively clean. I still find errors in it from time to time. But what happens is, when you find an error in your Bloomberg data, you get on the phone to Bloomberg right away and say, I found an error in your data. They say, oh, what date? June 14, you know, 2012. And they'll say, OK, we'll fix it. All right? So everybody does that, and the data set is pretty clean. I found consistently that Bloomberg data is the cleanest in my experience. How much data do we use in doing this? I could use one year of data, I could use two weeks of data. Now, for time series, we usually want 100 observations. That's always been my rule of thumb. I can use one year of data. There are regulators that require you to use at least a year of data. You could use two years of data. In fact, some firms use one year of data. There's one firm that uses five years of data. And there, we could say, well, am I going to weight it? Am I going to weight my more recent data heavily? I could do that with exponential smoothing, which we won't have time to talk about in depth. It's a technique I can use to lend more credence to the more recent data.
Now, I'm a relatively simple guy. I tend to use equally weighted data because I believe in Occam's razor, which is, the simplest explanation is usually the best. I think we get too clever by half when we try to parameterize. How much more of an impact does last week's data have than data from two weeks ago, three weeks ago? I'm not saying that it doesn't have more, what I am saying is, I'm not smart enough to know exactly how much it does. And assuming that everything's equally weighted throughout time is just as strong an assumption. But it's a very simple assumption, and I love simple. Yes? AUDIENCE: [INAUDIBLE] calculate covariance matrix? KENNETH ABBOTT: Yes. All right, quickly. Actually, I think I have some slides on that. Let me just finish this and I'll get to that. Gaps in data. Missing data is a problem. How do I fill in missing data? I can do a linear interpolation, I can use the prior day's data. I can do a Brownian bridge, which is, I just do a Monte Carlo between the two points. I can do it regression-based, I can use regression to project changes in one variable onto changes in another. That's usually a whole other lecture I give on how to handle missing data. Now you've got that lecture for free. That's all you need to know. It's not only a lecture, it's a very hard homework assignment. Now, how frequently do I update my data? Some people update their covariance structures daily. I think that's overkill. We update our data set weekly. That's what we do. And I think that's overkill, but tell that to my regulators. And we use daily data, weekly data, monthly data. We typically use daily data. Some firms may do it differently. All right. Here's your exponential smoothing. Remember, I usually measure covariance as the sum of (xi minus x bar) times (yi minus y bar) divided by n minus 1. What if I stuck an omega in there? And I use this calculation instead, where the denominator is the sum of all the omegas. You should be thinking finite series.
You have to realize, I was a decent math student, I wasn't a great math student. And what I found when I was studying this was, wow, all that stuff that I learned about finite series, who knew? Who knew that I'd actually use it? So I take this, and let's say I'm working backwards in time. So today's observation is t zero. Yesterday's observation is t1, then t2, t3. And let's assume for the time being that this omega is on the order of 0.95. It could be anything. So today would get 0.95 to the 0 divided by the sum of all the omegas. Yesterday would get 0.95 divided by the sum of the omegas. The next would get 0.95 squared divided by the sum of the omegas. Then 0.95 cubed, and they get smaller and smaller. For example, if you use 0.94, 99% of your weight will be in the last 76 observations. I shouldn't say 76 days, 76 observations. So there's this notion that the impact declines exponentially. Does that make sense? People use this pretty commonly, but what scares me about it, and somebody stuck these fancy transitions in between these slides, anyway, is that here's my standard deviation [INAUDIBLE] the rolling six [INAUDIBLE] window. And here's my standard deviation using different weights. The point I want to make here, and it's an important point, is that my assumption about my weighting coefficient has a material impact on the size of my measured volatility. Now when I see this, and this is just me, there's no finance or statistics theory behind this, any time an assumption has this material an impact, bells and whistles go off, and sirens, all right, and red lights flash. Be very, very careful. Now, lies, damn lies, and statistics. You tell me the outcome you want, and I'll tell you what statistics to use. That's where this could be abused. Oh, you want to show high volatility? Well, let's use this. You want to show low volatility? Let's use this. See, I choose to just take the simplest approach. And that's me.
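The weighting scheme just described, omega to the t divided by the sum of all the omegas, is a finite geometric series, and the "99% of the weight in the last 76 observations" claim for omega = 0.94 can be verified directly. A quick sketch:

```python
import numpy as np

# Exponentially declining weights: the observation t steps back gets
# weight proportional to omega**t, normalized by the sum of all the omegas.
omega = 0.94
n = 500  # length of the history (enough for the tail weight to vanish)
t = np.arange(n)
w = omega ** t
w /= w.sum()  # divide by the finite-series sum so the weights add to 1

# Check the claim: with omega = 0.94, about 99% of the weight sits in
# the most recent 76 observations.
print(round(w[:76].sum(), 4))
```

Changing `omega` from, say, 0.97 to 0.90 shifts the effective window dramatically, which is exactly the material-impact warning in the lecture.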
That's not a terribly scientific opinion, but that's what I think. Daily versus weekly, percentage changes versus log changes. Units. Just like dollar-yen, interest rates. Am I long or am I short? If I'm long gold, I show it as a positive number. And if I'm short gold, in my position vector, I show it as a negative number. If I'm long yen, and yen is measured in yen per dollar, then I show it as a negative number. If I'm long yen, but my covariance matrix measures yen as dollars per yen, 0.000094, whatever, then I show it as a positive number. It's just like physics, only worse, because it'll cost you real money. No, I guess physics would be worse, because if you get the units wrong, you blow up, right? This will just cost you money. I've made this mistake. I've made the units mistake. All right, we talked about fixed income. So that's what I want to cover from the bare-bones setup for VaR. Now I'm going to skip the historical simulation and go right to the Monte Carlo, because I want to show you another way we can use covariance structures. [POWERPOINT SOUND EFFECT] That's going to happen two or three more times. Somebody did this, somebody made my presentation cute some years ago. And I just, I apologize. All right, see, there's a lot of meat in this presentation that we don't have time to get to. Another approach to doing value at risk, rather than using this parametric approach, is to simulate the outcomes. Simulate the outcomes 100 times, 1,000 times, 10,000 times, a million times, and say, these are all the possible outcomes based on my simulation assumptions. And let's say I simulate 10,000 times, and I have 10,000 possible outcomes for tomorrow. And I want to measure my value at risk at the 1% significance level. All I would do is take my 10,000 outcomes, and I would sort them and take my hundredth worst. Put it in your pocket, go home. That's it. This is a different way of getting to that order statistic. It lends a lot more flexibility.
So I can go and tweak the way I do that simulation, I can relax my assumptions of normality. I don't have to use the normal distribution, I could use a t distribution, I could tweak my distribution, I could customize it. I could put mean reversion in there, I could do all kinds of stuff. So another way we do value at risk is we simulate possible outcomes. We rank the outcomes, and we just count them. If I've got the 10,000 observations and I want my 5% order statistic, well, I just take my 500th. Make sense? It's that simple. Well, I don't want to make it seem like it's that simple, because it actually gets a little messy in here. But when we do Monte Carlo simulation, we're simulating what we think is going to happen, all subject to our assumptions. And we run through this Monte Carlo simulation. It's a simulation method using sequences of random numbers. The term was coined during the Manhattan Project; it's similar to games of chance. You need to describe your system in terms of probability density functions. What type of distribution? Is this normal? Is it t? Is it chi-squared? Is it F? All right? That's the way we do it. So quickly, how do I do that? I have to have random numbers. Now, there are truly random numbers. Somewhere at MIT you could buy, I used to say tape, but people don't use tape. They'll give you a website where you can get atomic decay. That's random. All right? Anything else is pseudo-random. What you see when you go into MATLAB, you have a random number generator, it's an algorithm. It probably takes some number and takes the square root of that number and then goes 54 decimal places to the right and takes the 55th decimal place, multiplies those two numbers together and then takes the fifth root, and then goes 16 decimal places to the right to get that. It's some algorithm.
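The sort-and-count order statistic described above is simple enough to show end to end. The P&L distribution here is normal and entirely simulated (the lecture's point is that you could swap in a t distribution, an empirical distribution, mean reversion, anything):

```python
import numpy as np

# Simulate 10,000 one-day P&L outcomes (hypothetical $ amounts).
rng = np.random.default_rng(7)
pnl = rng.normal(0, 10_000, 10_000)

# 1% VaR: sort the outcomes and take the 100th worst. Put it in your
# pocket, go home.
worst_to_best = np.sort(pnl)
var_1pct = -worst_to_best[99]  # 100th worst outcome, as a positive loss
print(round(var_1pct, 2))
```

For the 5% order statistic on the same 10,000 trials you would take `worst_to_best[499]`, the 500th worst, exactly as in the lecture.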
True story, before I came to appreciate that these were all highly algorithmically driven, I was in my 20s, I was taking a computer class, and I saw two computers. They were both running random number generators, and they were generating the same random numbers. And I thought I was at the event horizon. I thought that light was bending and the world was coming to an end, all right? Because this stuff can't happen, all right? It was happening right in front of me. It was a pseudo-random number generator. I didn't know, I was 24. Anyway, quasi-random numbers are sort of a way of imposing some order on your random numbers. With random numbers, one particular set of draws may not have enough draws in a particular area to give you the numbers you want. I can impose some conditions upon that. I don't want to get into a discussion of random numbers. How do I get from random uniform? Most random number generators give you a random uniform number between 0 and 1. What you'll typically do is take that random uniform number, map it over to the cumulative density function, and map it down. So this gets you from random uniform space into standard deviation space. We used to worry about how we did this; now your software does it for you. I've gotten comfortable enough, truth be told. I usually trust my random number generators in Excel, in MATLAB. So I kind of violate my own rules, I don't check. But I think most of your standard random number generators are decent enough now. And you can go straight to normal, you don't have to do random uniform and back into random normal. You can get it distributed in any way you want. What I do when I do a Monte Carlo simulation, and this is going to be rushed because we've only got like 20 minutes: if I take a covariance matrix, you're going to have to trust me on this, because again, I'm covering like eight hours of lecture in an hour and a half. You guys go to MIT, so I have no doubt you're going to be all over this.
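The "map it over to the cumulative density function and map it down" step is inverse-transform sampling. A minimal sketch using only the standard library (modern packages do this for you, as the lecture says, but it is worth seeing once):

```python
import random
from statistics import NormalDist

# Map random uniforms on (0, 1) through the inverse cumulative normal
# to get from random uniform space into standard deviation space.
random.seed(1)
nd = NormalDist()  # standard normal, mean 0, sigma 1
u = [random.random() for _ in range(100_000)]
z = [nd.inv_cdf(ui) for ui in u]

# Sanity check: the mapped draws should look standard normal.
mean = sum(z) / len(z)
var = sum((v - mean) ** 2 for v in z) / (len(z) - 1)
print(round(mean, 3), round(var, 3))
```

The same trick with a t or empirical inverse CDF gets you any distribution you want, which is the flexibility the simulation approach buys.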
Let's take this out of here for a second. I can factor my covariance structure. I can factor my covariance structure like this. And this is the transpose of this. I didn't realize that the first time we did this commercially. I saw this instead of this, and I thought we had sent bad data to the customer. I got physically sick. And then I remembered, AB transpose equals B transpose A transpose. These things keep happening. My high school math keeps coming back to me. But I had forgotten this, and I got physically sick because I thought we'd sent bad data, when I was just looking at the transpose. Anyway, I can factor this into this, where this is a matrix of eigenvectors. This is a diagonal matrix of the eigenvalues. All right? This is the vaunted Gaussian copula. This is it. Most people view it as a black box. If you've had any more than introductory statistics, this should be a glass box to you. That's why I wanted to go through this, even though I'd love to spend another hour and a half and do about 50 examples. Because this is how I learned this. I didn't learn it from looking at this equation and saying, oh, I get it. I learned it from actually doing it about 1,000 times in a spreadsheet, and it sank in, like water into a stone. So I factor this matrix, and then I take this, which is the square root matrix, which is the transpose of my eigenvector matrix and a diagonal matrix containing the square roots of my eigenvalues. Now, could this ever be negative and take me into imaginary root land? Well, if my eigenvalues are positive or zero, then that won't be a problem. So here we get into, remember you guys studied positive semidefinite, positive definite. Once again, it's another one of these high school math things. Like, here it is. I had to know this. Suddenly I care whether it's positive semidefinite. Covariance structures have to be positive semidefinite.
If you don't have a complete data set, let's say you've got 100 observations, 100 observations, 100 observations, 25 observations, 100 observations, you may have a negative eigenvalue if you just measure the covariance with the amount of data that you have. My intuition, and I doubt this is the [INAUDIBLE], is that you're measuring with error, and when you have fewer observations you measure with more error. So it's possible, if some of your covariance measures have 25 observations and some of them have 100 observations, that there's more error in some than in others. And so there's the theoretical possibility of a negative eigenvalue. True story, we didn't know this in the '90s. I took this problem to the chairman of the statistics department at NYU and said, I'm getting negative eigenvalues. And he didn't know. He had no idea, and he's a smart guy. You have to fill in your missing data. You have to fill in your missing data. If you've got 1,000 observations, 1,000 observations, 1,000 observations, 200 observations, and you want to make sure you won't have a negative eigenvalue, you've got to fill in those observations. Which is why missing data is a whole other thing we talk about. Again, I could spend a lot of time on that. And I learned that the hard way. But anyway, I take this square root matrix, and if I premultiply row after row of normals by that square root matrix, I will get out an array that has the same covariance structure as that with which I started. Another story here. I've been using the same eigenvalue code for years. I believe in full attribution; I'm not a clever guy. I have not an original thought in my head. And whenever I use someone else's stuff, I give them credit for it. And the guy who wrote the code that did the eigenvalue [? decomposition ?], this is something that was translated from Fortran IV. It wasn't even [INAUDIBLE]. There's a dichotomy in the world. There are people that have written Fortran, and people that haven't.
I'm guessing that there are two people in this room that have ever written a line of Fortran. Anyone here? Just saying. Yeah, with cards or without cards? PROFESSOR: [INAUDIBLE]. KENNETH ABBOTT: I didn't use cards. See, you're an old-timer because you used cards. The punch line is, I've been using this guy's code. And I could show you the code. It's like the Lone Ranger, I didn't even get a chance to thank him. Because he didn't put his name on the code. On the internet now, if you do something clever on the quant newsgroups, you're going to post your name all over it. I've been wanting to thank this guy for like 20 years and I haven't been able to. Anyway, [INAUDIBLE] code that's been translated. Let me show you what this means. Here's some source data. Here are some percentage changes. Just like we talked about. Here is the empirical correlation of those percentage changes. So the correlation of my government 10-year to my AAA 10-year is 0.83. To my AA, 0.84. All right, you see this. And I have this covariance matrix; the correlation matrix is a scaled version of the covariance matrix. And I do a little bit of statistical legerdemain. Eigenvalues and eigenvectors. Take the square root of that. And again, I'd love to spend a lot more time on this, but we just don't have it. Suffice it to say, I call this a transformation matrix; that's my term. This matrix here is this. If we had another hour and a half, I'd take you step by step to get you there. The proof of which is left to the reader as an exercise. I'll leave this spreadsheet for you, I'll send it to you. I have this matrix. This matrix is like a prism. I'm going to pass white light through it, and I'm going to get a beautiful rainbow. Let me show you what I mean. So remember that matrix, this matrix I'm calling T. Remember my matrix is 10 by 10. One, two, three, four, five, six, seven, eight, nine, ten. 10 columns of data. 10 by 10 correlation matrix. Let's check.
Now I've got row vectors of, sorry, uncorrelated random normals. So what I'm doing then is premultiplying each row of uncorrelated random normals by that transformation matrix, row by row. And what I get is correlated random normals. So what I'm telling you here is this array happens to be 10 wide and 1,000 long. And I'm telling you that I started with my historical data; let me see how much data I have there. A couple hundred observations of historical data. And what I've done is, once I have that covariance structure, I can create a data set here which has the same statistical properties as this. Not quite the same, but it can have the same means and the same variances. This is what Monte Carlo simulation is about. I wish we had another hour, because I'd like to spend time on this, and again, when I first saw this, I was like, oh my god. I felt like I got the keys to the kingdom. And I did it manually, did it all on a spreadsheet. Didn't believe anyone else's code; did it all on a spreadsheet. But what that means, quickly, let me just go back over here for a second. I happen to have about 800 observations here. Historical observations. What I did was I happened to generate 1,000 samples here. But I could generate 10,000 or 100,000, or a million or 10 million or a billion, just by doing more random normals. What I'm generating here is, in effect, synthetic time series that have properties similar to my underlying data. That's what Monte Carlo simulation is about. The means and the variances and the covariances of this data set are just like that. Now, again, true story: when somebody first showed me this I did not believe them. So I developed a bunch of little tests. And I said, let me just look at the correlation of my Monte Carlo data versus my original correlation matrix. So 0.83, 0.84, 0.85, 0.85, 0.67, 0.81. You look at the corresponding ones of the random numbers I just generated: 0.81, 0.82, 0.84, 0.84, 0.64, 0.52.
0.54 versus 0.52. 0.18 versus 0.12. 0.51 versus 0.47. Somebody want to tell me why they're not spot on? Sampling error. The more data I use, the closer it will get to that. If I do 1 million, I'd better get right on top of that. Does that make sense? So what I'm telling you here is that I can generate synthetic time series. Now, why would I generate so many? Well, because, remember, I care what's going on out in that tail. If I only have 100 observations and I'm looking empirically at my tail, I've only got one observation out in the 1% tail. And that doesn't tell me a whole lot about what's going on. If I can simulate that distribution exactly, I can say, you know what, I want a billion observations in that tail. Now we can look at that tail. If I have 1 billion observations, let's say I'm looking at some kind of normal distribution. I'm circling it out here; I can really dig in and see what the properties of this thing are. In fact, this can really only take two distributions, and really, it's only one. But that's another story. So in Monte Carlo simulation I've simulated these outcomes so we can get a lot more meat in this tail to understand what's happening out there. Does it drop off quickly? Does it not drop off quickly? That's kind of what it's about. So we're about out of time. We just covered like four weeks of material, all right? But you guys are from MIT. I have complete confidence in you. I say that to the people who work for me: I have complete confidence in your ability to get that done by tomorrow morning. Questions or comments? I know you're sipping from the fire hose here. I fully appreciate that. So those are examples. When I do this with historical simulation I won't generate these Monte Carlo trials; I'll just use historical data, and my fat tails are built into it. But what I've shown you today is this: we developed a one-asset VaR model, then we developed a multi-asset variance-covariance model.
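The transformation the lecturer describes can be sketched in a few lines. This is a rough sketch, not his spreadsheet: it uses a Cholesky factor as the matrix square root (the lecture uses an eigenvalue decomposition, which serves the same purpose for a positive-definite matrix) and a hypothetical 2-by-2 correlation matrix rather than his 10-by-10 one.

```python
import math
import random

def cholesky(a):
    """Lower-triangular L with a = L L^T; a must be symmetric positive definite."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(a[i][i] - s) if i == j else (a[i][j] - s) / L[j][j]
    return L

# Target correlation matrix (a stand-in for the lecture's empirical matrix).
corr = [[1.0, 0.8], [0.8, 1.0]]
T = cholesky(corr)  # the "transformation matrix"

random.seed(0)
n = 100_000
samples = []
for _ in range(n):
    z = [random.gauss(0, 1), random.gauss(0, 1)]  # uncorrelated normals
    x = [sum(T[i][k] * z[k] for k in range(2)) for i in range(2)]  # correlated
    samples.append(x)

# Check: the sample correlation of the synthetic series should be close to 0.8,
# differing only by sampling error.
mx = sum(s[0] for s in samples) / n
my = sum(s[1] for s in samples) / n
cov = sum((s[0] - mx) * (s[1] - my) for s in samples) / n
vx = sum((s[0] - mx) ** 2 for s in samples) / n
vy = sum((s[1] - my) ** 2 for s in samples) / n
r = cov / math.sqrt(vx * vy)
print(round(r, 2))
```

Generating more rows shrinks the gap between the sample correlation and the target, which is exactly the "sampling error" point made above.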
And then I showed you quickly, and in far less time than I would like to have had, how I can use another statistical technique, which is called the Gaussian copula, to generate data sets that will have the same properties as my source historical data. All right? There you have it. [APPLAUSE] Oh, you don't have to. Please, please, please. And I'll tell you, for me, one of the coolest things was actually being able to apply so much of the math I learned in high school and in college and never thought I'd apply again. One of my best moments was actually finding a use for trigonometry. If you're not an engineer, where are you going to use it? Where do you use it? Seasonals. You do seasonal estimation. And what you do is a fast Fourier transform, because I can describe any seasonal pattern with a linear combination of sine and cosine functions. And it actually works. I have my students do it as an exercise every year. I say, go get New York City temperature data, and show me some linear combination of sine and cosine functions that will show me the seasonal pattern of temperature data. And when I first realized I could use trigonometry: yes! It wasn't a waste of time. Polar coordinates, though, I still haven't found a use for that one. But it's there. I know it's there. All right? Go home.
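The seasonal-estimation exercise mentioned above can be sketched on synthetic data (the series below is a made-up annual temperature cycle, not actual New York City data). Over whole years the constant, sine, and cosine regressors are mutually orthogonal, so the least-squares Fourier coefficients reduce to simple projections.

```python
import math

# Three full years of synthetic monthly "temperatures" with an annual cycle
# (hypothetical numbers chosen for illustration).
months = list(range(36))
temps = [55 - 22 * math.cos(2 * math.pi * m / 12) for m in months]

n = len(months)
# Because the regressors 1, sin(2*pi*m/12), cos(2*pi*m/12) are orthogonal over
# complete cycles, the fitted coefficients are projections onto each regressor.
a = sum(temps) / n
b = 2 / n * sum(t * math.sin(2 * math.pi * m / 12) for m, t in zip(months, temps))
c = 2 / n * sum(t * math.cos(2 * math.pi * m / 12) for m, t in zip(months, temps))

print(round(a, 4), round(b, 4), round(c, 4))  # recovers the 55, 0, -22 cycle
```

With real temperature data the recovery is of course approximate, but the same linear combination of sine and cosine terms captures the seasonal pattern.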
Details
Common parameters for VaR are 1% and 5% probabilities and one-day and two-week horizons, although other combinations are in use.^{[6]}
The reason for assuming normal markets and no trading, and for restricting loss to things measured in daily accounts, is to make the loss observable. In some extreme financial events it can be impossible to determine losses, either because market prices are unavailable or because the loss-bearing institution breaks up. Some longer-term consequences of disasters, such as lawsuits, loss of market confidence and employee morale, and impairment of brand names, can take a long time to play out, and may be hard to allocate among specific prior decisions. VaR marks the boundary between normal days and extreme events. Institutions can lose far more than the VaR amount; all that can be said is that they will not do so very often.^{[7]}
The probability level is about equally often specified as one minus the probability of a VaR break, so that the VaR in the example above would be called a one-day 95% VaR instead of one-day 5% VaR. This generally does not lead to confusion because the probability of VaR breaks is almost always small, certainly less than 50%.^{[1]}
Although it virtually always represents a loss, VaR is conventionally reported as a positive number. A negative VaR would imply the portfolio has a high probability of making a profit, for example a one-day 5% VaR of negative $1 million implies the portfolio has a 95% chance of making more than $1 million over the next day.^{[8]}
Another inconsistency is that VaR is sometimes taken to refer to profit-and-loss at the end of the period, and sometimes as the maximum loss at any point during the period. The original definition was the latter, but in the early 1990s when VaR was aggregated across trading desks and time zones, end-of-day valuation was the only reliable number, so the former became the de facto definition. As people began using multi-day VaRs in the second half of the 1990s, they almost always estimated the distribution at the end of the period only. It is also easier theoretically to deal with a point-in-time estimate than a maximum over an interval. Therefore, the end-of-period definition is the most common both in theory and practice today.^{[9]}
Varieties
The definition of VaR is nonconstructive; it specifies a property VaR must have, but not how to compute VaR. Moreover, there is wide scope for interpretation in the definition.^{[10]} This has led to two broad types of VaR, one used primarily in risk management and the other primarily for risk measurement. The distinction is not sharp, however, and hybrid versions are typically used in financial control, financial reporting and computing regulatory capital.^{[11]}
To a risk manager, VaR is a system, not a number. The system is run periodically (usually daily) and the published number is compared to the computed price movement in opening positions over the time horizon. There is never any subsequent adjustment to the published VaR, and there is no distinction between VaR breaks caused by input errors (including Information Technology breakdowns, fraud and rogue trading), computation errors (including failure to produce a VaR on time) and market movements.^{[12]}
A frequentist claim is made, that the longterm frequency of VaR breaks will equal the specified probability, within the limits of sampling error, and that the VaR breaks will be independent in time and independent of the level of VaR. This claim is validated by a backtest, a comparison of published VaRs to actual price movements. In this interpretation, many different systems could produce VaRs with equally good backtests, but wide disagreements on daily VaR values.^{[1]}
For risk measurement a number is needed, not a system. A Bayesian probability claim is made, that given the information and beliefs at the time, the subjective probability of a VaR break was the specified level. VaR is adjusted after the fact to correct errors in inputs and computation, but not to incorporate information unavailable at the time of computation.^{[8]} In this context, "backtest" has a different meaning. Rather than comparing published VaRs to actual market movements over the period of time the system has been in operation, VaR is retroactively computed on scrubbed data over as long a period as data are available and deemed relevant. The same position data and pricing models are used for computing the VaR as determining the price movements.^{[2]}
Although some of the sources listed here treat only one kind of VaR as legitimate, most of the recent ones seem to agree that risk management VaR is superior for making short-term and tactical decisions today, while risk measurement VaR should be used for understanding the past, and making medium-term and strategic decisions for the future. When VaR is used for financial control or financial reporting it should incorporate elements of both. For example, if a trading desk is held to a VaR limit, that is both a risk-management rule for deciding what risks to allow today, and an input into the risk measurement computation of the desk's risk-adjusted return at the end of the reporting period.^{[5]}
In governance
VaR can also be applied to the governance of endowments, trusts, and pension plans. Essentially, trustees adopt portfolio value-at-risk metrics for the entire pooled account and for the diversified parts individually managed. Instead of probability estimates they simply define maximum levels of acceptable loss for each. Doing so provides an easy metric for oversight and adds accountability, as managers are then directed to manage, but with the additional constraint to avoid losses within a defined risk parameter. VaR utilized in this manner adds relevance as well as an easy way to monitor risk measurement control, and is far more intuitive than standard deviation of return. Use of VaR in this context, as well as a worthwhile critique of board governance practices as they relate to investment management oversight in general, can be found in Best Practices in Governance.^{[13]}
Mathematical definition
Given a confidence level α ∈ (0, 1), the VaR of the portfolio at the confidence level α is the smallest number y such that the probability that the loss Y does not exceed y is at least α. Mathematically, VaR_α(Y) is the α-quantile of the loss distribution, i.e.,

VaR_α(Y) = inf{ y ∈ ℝ : F_Y(y) ≥ α } = inf{ y ∈ ℝ : P(Y > y) ≤ 1 − α }.^{[14]}^{[15]}

This is the most general definition of VaR, and the two identities are equivalent (indeed, for any random variable its cumulative distribution function F_Y is well defined). However, this formula cannot be used directly for calculations unless we assume that Y has some parametric distribution.
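As an illustration, here is a minimal sketch of the definition applied to the discrete coin-flip bet from the introduction (the function and its representation of the distribution are ours, purely for illustration): the p-VaR is the smallest threshold whose exceedance probability is at most p.

```python
def var(losses_probs, p):
    """p-VaR of a discrete loss distribution: the smallest threshold y such
    that the probability of a loss strictly greater than y is at most p.
    losses_probs is a list of (loss, probability) pairs."""
    candidates = sorted(loss for loss, _ in losses_probs)
    for y in candidates:
        tail = sum(q for loss, q in losses_probs if loss > y)
        if tail <= p:
            return y
    return candidates[-1]

# The coin-flip bet from the introduction: loss of $0 with probability 127/128,
# loss of $12,700 with probability 1/128.
bet = [(0, 127 / 128), (12_700, 1 / 128)]
print(var(bet, 0.01))    # 1% VaR  -> 0, since P(loss > 0) = 1/128 < 1%
print(var(bet, 0.005))   # 0.5% VaR -> 12700, since 1/128 > 0.5%
```

This reproduces the introduction's point: the 1% VaR is $0 even though a $12,700 loss is possible, and that loss only appears as the p-VaR once p falls below 1/128 ≈ 0.78%.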
Risk managers typically assume that some fraction of the bad events will have undefined losses, either because markets are closed or illiquid, or because the entity bearing the loss breaks apart or loses the ability to compute accounts. Therefore, they do not accept results based on the assumption of a welldefined probability distribution.^{[7]} Nassim Taleb has labeled this assumption, "charlatanism".^{[16]} On the other hand, many academics prefer to assume a welldefined distribution, albeit usually one with fat tails.^{[1]} This point has probably caused more contention among VaR theorists than any other.^{[10]}
Value at risk can also be written as a distortion risk measure given by the distortion function

g(x) = 0 for 0 ≤ x < 1 − α, and g(x) = 1 for 1 − α ≤ x ≤ 1.^{[17]}^{[18]}
Risk measure and risk metric
The term "VaR" is used both for a risk measure and a risk metric. This sometimes leads to confusion. Sources earlier than 1995 usually emphasize the risk measure; later sources are more likely to emphasize the metric.
The VaR risk measure defines risk as mark-to-market loss on a fixed portfolio over a fixed time horizon. There are many alternative risk measures in finance. Given the inability to use mark-to-market (which uses market prices to define loss) for future performance, loss is often defined (as a substitute) as change in fundamental value. For example, if an institution holds a loan that declines in market price because interest rates go up, but has no change in cash flows or credit quality, some systems do not recognize a loss. Also some try to incorporate the economic cost of harm not measured in daily financial statements, such as loss of market confidence or employee morale, impairment of brand names or lawsuits.^{[5]}
Rather than assuming a static portfolio over a fixed time horizon, some risk measures incorporate the dynamic effect of expected trading (such as a stop loss order) and consider the expected holding period of positions.^{[5]}
The VaR risk metric summarizes the distribution of possible losses by a quantile, a point with a specified probability of greater losses. A common alternative metric is expected shortfall.^{[1]}
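A brief sketch of the difference between the two metrics on simulated losses (synthetic normal data, purely illustrative): VaR reads off a quantile of the loss distribution, while expected shortfall averages the losses beyond that quantile.

```python
import random

random.seed(42)
# Simulated daily losses (positive = loss); here just a normal sample for
# illustration, standing in for output of a historical or Monte Carlo engine.
losses = sorted(random.gauss(0, 1_000_000) for _ in range(10_000))

p = 0.05
k = int(len(losses) * (1 - p))   # index of the (1 - p)-quantile
var_5 = losses[k]                # 5% VaR: 95th percentile of losses
tail = losses[k:]                # the worst 5% of outcomes
es_5 = sum(tail) / len(tail)     # expected shortfall: average of that tail

print(var_5 < es_5)  # expected shortfall is never below the corresponding VaR
```

Because expected shortfall averages the tail rather than marking its edge, it is sensitive to the magnitude of losses beyond the VaR point, which is precisely what VaR ignores.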
VaR risk management
Supporters of VaRbased risk management claim the first and possibly greatest benefit of VaR is the improvement in systems and modeling it forces on an institution. In 1997, Philippe Jorion wrote:^{[19]}
[T]he greatest benefit of VAR lies in the imposition of a structured methodology for critically thinking about risk. Institutions that go through the process of computing their VAR are forced to confront their exposure to financial risks and to set up a proper risk management function. Thus the process of getting to VAR may be as important as the number itself.
Publishing a daily number, on time and with specified statistical properties, holds every part of a trading organization to a high objective standard. Robust backup systems and default assumptions must be implemented. Positions that are reported, modeled or priced incorrectly stand out, as do data feeds that are inaccurate or late and systems that are too frequently down. Anything that affects profit and loss that is left out of other reports will show up either in inflated VaR or excessive VaR breaks. "A risk-taking institution that does not compute VaR might escape disaster, but an institution that cannot compute VaR will not."^{[20]}
The second claimed benefit of VaR is that it separates risk into two regimes. Inside the VaR limit, conventional statistical methods are reliable. Relatively short-term and specific data can be used for analysis. Probability estimates are meaningful, because there are enough data to test them. In a sense, there is no true risk because you have a sum of many independent observations with a left bound on the outcome. A casino doesn't worry about whether red or black will come up on the next roulette spin. Risk managers encourage productive risk-taking in this regime, because there is little true cost. People tend to worry too much about these risks, because they happen frequently, and not enough about what might happen on the worst days.^{[21]}
Outside the VaR limit, all bets are off. Risk should be analyzed with stress testing based on longterm and broad market data.^{[22]} Probability statements are no longer meaningful.^{[23]} Knowing the distribution of losses beyond the VaR point is both impossible and useless. The risk manager should concentrate instead on making sure good plans are in place to limit the loss if possible, and to survive the loss if not.^{[1]}
One specific system uses three regimes.^{[24]}
 One to three times VaR are normal occurrences. You expect periodic VaR breaks. The loss distribution typically has fat tails, and you might get more than one break in a short period of time. Moreover, markets may be abnormal and trading may exacerbate losses, and you may take losses not measured in daily marks such as lawsuits, loss of employee morale and market confidence and impairment of brand names. So an institution that can't deal with three times VaR losses as routine events probably won't survive long enough to put a VaR system in place.
 Three to ten times VaR is the range for stress testing. Institutions should be confident they have examined all the foreseeable events that will cause losses in this range, and are prepared to survive them. These events are too rare to estimate probabilities reliably, so risk/return calculations are useless.
 Foreseeable events should not cause losses beyond ten times VaR. If they do, they should be hedged or insured, or the business plan should be changed to avoid them, or VaR should be increased. It's hard to run a business if foreseeable losses are orders of magnitude larger than very large everyday losses. It's hard to plan for these events, because they are out of scale with daily experience. Of course there will be unforeseeable losses more than ten times VaR, but it's pointless to anticipate them; you can't know much about them, and it results in needless worrying. Better to hope that the discipline of preparing for all foreseeable three-to-ten times VaR losses will improve chances for surviving the unforeseen and larger losses that inevitably occur.
"A risk manager has two jobs: make people take more risk the 99% of the time it is safe to do so, and survive the other 1% of the time. VaR is the border."^{[20]}
Another reason VaR is useful as a metric is its ability to compress the riskiness of a portfolio into a single number, making it comparable across different portfolios (of different assets). Within any portfolio it is also possible to isolate specific positions that might better hedge the portfolio and so reduce the VaR. An example of market-maker strategies for trading linear interest rate derivatives and interest rate swaps portfolios is cited.^{[25]}
Computation methods
VaR can be estimated either parametrically (for example, variance-covariance VaR or delta-gamma VaR) or nonparametrically (for example, historical simulation VaR or resampled VaR).^{[5]}^{[7]} Nonparametric methods of VaR estimation are discussed in Markovich^{[26]} and Novak.^{[27]} A comparison of a number of strategies for VaR prediction is given in Kuester et al.^{[28]}
A McKinsey report^{[29]} published in May 2012 estimated that 85% of large banks were using historical simulation. The other 15% used Monte Carlo methods.
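A minimal sketch contrasting the two main families of methods on the same synthetic return series (the position size, return parameters, and sample length are illustrative assumptions, not from any cited source):

```python
import random
import statistics

random.seed(1)
# One year of synthetic daily returns for a $10m position (normal data,
# purely for illustration; real returns are typically fatter-tailed).
returns = [random.gauss(0.0, 0.01) for _ in range(250)]
position = 10_000_000

# Parametric (variance-covariance) one-day 99% VaR: assume normal returns,
# then VaR is a multiple of the estimated standard deviation.
mu = statistics.fmean(returns)
sigma = statistics.stdev(returns)
z = statistics.NormalDist().inv_cdf(0.99)
var_parametric = position * (z * sigma - mu)

# Historical-simulation one-day 99% VaR: the 99th percentile of the
# empirically observed losses, with no distributional assumption.
losses = sorted(-position * r for r in returns)
var_historical = losses[int(0.99 * len(losses))]

print(round(var_parametric), round(var_historical))
```

On truly normal data the two estimates agree up to sampling error; on real, fat-tailed data the historical estimate typically exceeds the parametric one at high confidence levels, which is one reason historical simulation dominates in practice.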
Backtesting
A key advantage of VaR over most other measures of risk, such as expected shortfall, is the availability of several backtesting procedures for validating a set of VaR forecasts. Early examples of backtests can be found in Christoffersen (1998),^{[30]} later generalized by Pajhede (2017),^{[31]} which model a "hit sequence" of losses greater than the VaR and test that these hits are independent of one another and occur with the correct probability. For example, when using a 95% VaR, a loss greater than the VaR should be observed 5% of the time, and these hits should occur independently.
A number of other backtests are available which model the time between hits in the hit sequence; see Christoffersen (2014),^{[32]} Haas (2016),^{[33]} Tokpavi et al. (2014),^{[34]} and Pajhede (2017).^{[31]} As pointed out in several of the papers, the asymptotic distribution is often a poor approximation when considering high levels of coverage, e.g. a 99% VaR, so the parametric bootstrap method of Dufour (2006)^{[35]} is often used to obtain correct size properties for the tests. Backtest toolboxes are available in Matlab or R, though only the first implements the parametric bootstrap method.
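As a sketch of the idea behind such backtests, here is Kupiec's proportion-of-failures (POF) test, a standard unconditional-coverage statistic closely related to the hit-sequence tests cited above (the implementation details are ours; the full Christoffersen tests additionally check independence of hits):

```python
import math

def kupiec_pof(hits, p):
    """Kupiec's proportion-of-failures likelihood-ratio statistic for a VaR
    hit sequence (1 = breach, 0 = no breach).  Under the null hypothesis that
    breaches occur with probability p, the statistic is asymptotically
    chi-squared with one degree of freedom."""
    n, x = len(hits), sum(hits)
    phat = x / n
    if phat in (0.0, 1.0):
        ll_alt = 0.0  # degenerate cases: the log-likelihood collapses to 0
    else:
        ll_alt = (n - x) * math.log(1 - phat) + x * math.log(phat)
    ll_null = (n - x) * math.log(1 - p) + x * math.log(p)
    return -2 * (ll_null - ll_alt)

# 500 trading days of breaches of a 95% VaR (hypothetical hit sequences).
hits_ok = [1] * 25 + [0] * 475    # exactly the expected 5% breach rate
hits_bad = [1] * 50 + [0] * 450   # 10% breach rate: the VaR is too low
print(round(kupiec_pof(hits_ok, 0.05), 2), round(kupiec_pof(hits_bad, 0.05), 2))
```

A statistic above the 95% chi-squared critical value of about 3.84 rejects correct coverage; the well-calibrated sequence scores 0, while the mis-calibrated one is rejected decisively.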
History
The problem of risk measurement is an old one in statistics, economics and finance. Financial risk management has been a concern of regulators and financial executives for a long time as well. Retrospective analysis has found some VaR-like concepts in this history. But VaR did not emerge as a distinct concept until the late 1980s. The triggering event was the stock market crash of 1987. This was the first major financial crisis in which a lot of academically trained quants were in high enough positions to worry about firm-wide survival.^{[1]}
The crash was so unlikely given standard statistical models that it called the entire basis of quant finance into question. A reconsideration of history led some quants to decide there were recurring crises, about one or two per decade, that overwhelmed the statistical assumptions embedded in models used for trading, investment management and derivative pricing. These affected many markets at once, including ones that were usually not correlated, and seldom had discernible economic cause or warning (although after-the-fact explanations were plentiful).^{[23]} Much later, they were named "Black Swans" by Nassim Taleb and the concept extended far beyond finance.^{[36]}
If these events were included in quantitative analysis they dominated results and led to strategies that did not work day to day. If these events were excluded, the profits made in between "Black Swans" could be much smaller than the losses suffered in the crisis. Institutions could fail as a result.^{[20]}^{[23]}^{[36]}
VaR was developed as a systematic way to segregate extreme events, which are studied qualitatively over longterm history and broad market events, from everyday price movements, which are studied quantitatively using shortterm data in specific markets. It was hoped that "Black Swans" would be preceded by increases in estimated VaR or increased frequency of VaR breaks, in at least some markets. The extent to which this has proven to be true is controversial.^{[23]}
Abnormal markets and trading were excluded from the VaR estimate in order to make it observable.^{[21]} It is not always possible to define loss if, for example, markets are closed as after 9/11, or severely illiquid, as happened several times in 2008.^{[20]} Losses can also be hard to define if the risk-bearing institution fails or breaks up.^{[21]} A measure that depends on traders taking certain actions, and avoiding other actions, can lead to self-reference.^{[1]}
This is risk management VaR. It was well established in quantitative trading groups at several financial institutions, notably Bankers Trust, before 1990, although neither the name nor the definition had been standardized. There was no effort to aggregate VaRs across trading desks.^{[23]}
The financial events of the early 1990s found many firms in trouble because the same underlying bet had been made at many places in the firm, in nonobvious ways. Since many trading desks already computed risk management VaR, and it was the only common risk measure that could be both defined for all businesses and aggregated without strong assumptions, it was the natural choice for reporting firmwide risk. J. P. Morgan CEO Dennis Weatherstone famously called for a "4:15 report" that combined all firm risk on one page, available within 15 minutes of the market close.^{[10]}
Risk measurement VaR was developed for this purpose. Development was most extensive at J. P. Morgan, which published the methodology and gave free access to estimates of the necessary underlying parameters in 1994. This was the first time VaR had been exposed beyond a relatively small group of quants. Two years later, the methodology was spun off into an independent for-profit business now part of RiskMetrics Group (now part of MSCI).^{[10]}
In 1997, the U.S. Securities and Exchange Commission ruled that public corporations must disclose quantitative information about their derivatives activity. Major banks and dealers chose to implement the rule by including VaR information in the notes to their financial statements.^{[1]}
Worldwide adoption of the Basel II Accord, beginning in 1999 and nearing completion today, gave further impetus to the use of VaR. VaR is the preferred measure of market risk, and concepts similar to VaR are used in other parts of the accord.^{[1]}
Criticism
VaR has been controversial since it moved from trading desks into the public eye in 1994. A famous 1997 debate between Nassim Taleb and Philippe Jorion set out some of the major points of contention. Taleb claimed VaR:^{[37]}
 Ignored 2,500 years of experience in favor of untested models built by non-traders
 Was charlatanism because it claimed to estimate the risks of rare events, which is impossible
 Gave false confidence
 Would be exploited by traders
In 2008 David Einhorn and Aaron Brown debated VaR in Global Association of Risk Professionals Review.^{[20]}^{[3]} Einhorn compared VaR to "an airbag that works all the time, except when you have a car accident". He further charged that VaR:
 Led to excessive risk-taking and leverage at financial institutions
 Focused on the manageable risks near the center of the distribution and ignored the tails
 Created an incentive to take "excessive but remote risks"
 Was "potentially catastrophic when its use creates a false sense of security among senior executives and watchdogs."
New York Times reporter Joe Nocera wrote an extensive piece, "Risk Mismanagement",^{[38]} on January 4, 2009, discussing the role VaR played in the financial crisis of 2007–2008. After interviewing risk managers (including several of those cited above) the article suggests that VaR was very useful to risk experts, but nevertheless exacerbated the crisis by giving false security to bank executives and regulators. A powerful tool for professional risk managers, VaR is portrayed as both easy to misunderstand, and dangerous when misunderstood.
Taleb in 2009 testified in Congress asking for the banning of VaR for a number of reasons. One was that tail risks are nonmeasurable. Another was that for anchoring reasons VaR leads to higher risk taking.^{[39]}
VaR is not subadditive:^{[5]} VaR of a combined portfolio can be larger than the sum of the VaRs of its components.
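A toy discrete example, with made-up numbers, makes the subadditivity violation concrete: two independent bonds can each have a zero 5% VaR while their combined portfolio does not.

```python
from itertools import product

def var_discrete(dist, p):
    """Smallest y with P(loss > y) <= p for a discrete loss distribution
    given as {loss: probability}."""
    for y in sorted(dist):
        if sum(q for loss, q in dist.items() if loss > y) <= p:
            return y

# A bond losing $100 on default, with a hypothetical 4% default probability.
bond = {0: 0.96, 100: 0.04}
print(var_discrete(bond, 0.05))  # 5% VaR of one bond alone: 0, since 4% < 5%

# Portfolio of two independent such bonds: at least one defaults with
# probability 1 - 0.96**2 = 7.84% > 5%, so the 5% VaR jumps to 100.
portfolio = {}
for (l1, q1), (l2, q2) in product(bond.items(), bond.items()):
    portfolio[l1 + l2] = portfolio.get(l1 + l2, 0) + q1 * q2
print(var_discrete(portfolio, 0.05))  # 100 > 0 + 0: subadditivity fails
```

Diversification here genuinely spreads risk, yet the measured VaR of the combination exceeds the sum of the components' VaRs, which is exactly why VaR is not coherent.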
For example, the average bank branch in the United States is robbed about once every ten years. A single-branch bank has about a 0.0004% chance of being robbed on a specific day, so the risk of robbery would not figure into one-day 1% VaR. It would not even be within an order of magnitude of that, so it is in the range where the institution should not worry about it; it should insure against it and take advice from insurers on precautions. The whole point of insurance is to aggregate risks that are beyond individual VaR limits, and bring them into a large enough portfolio to get statistical predictability. It does not pay for a one-branch bank to have a security expert on staff.
As institutions get more branches, the risk of a robbery on a specific day rises to within an order of magnitude of VaR. At that point it makes sense for the institution to run internal stress tests and analyze the risk itself. It will spend less on insurance and more on in-house expertise. For a very large banking institution, robberies are a routine daily occurrence. Losses are part of the daily VaR calculation, and tracked statistically rather than case-by-case. A sizable in-house security department is in charge of prevention and control; the general risk manager just tracks the loss like any other cost of doing business. As portfolios or institutions get larger, specific risks change from low-probability/low-predictability/high-impact to statistically predictable losses of low individual impact. That means they move from the range of far outside VaR, to be insured, to near outside VaR, to be analyzed case-by-case, to inside VaR, to be treated statistically.^{[20]}
VaR is a static measure of risk. By definition, VaR is a particular characteristic of the probability distribution of the underlying (namely, VaR is essentially a quantile). For a dynamic measure of risk, see Novak,^{[27]} ch. 10.
There are common abuses of VaR:^{[7]}^{[10]}
 Assuming that plausible losses will be less than some multiple (often three) of VaR. Losses can be extremely large.
 Reporting a VaR that has not passed a backtest. Regardless of how VaR is computed, it should have produced the correct number of breaks (within sampling error) in the past. A common violation of common sense is to estimate a VaR based on the unverified assumption that everything follows a multivariate normal distribution.
VaR, CVaR and EVaR
The VaR is not a coherent risk measure since it violates the subadditivity property, which is

VaR(X + Y) ≤ VaR(X) + VaR(Y).

However, it can be bounded by coherent risk measures like Conditional Value-at-Risk (CVaR) or entropic value at risk (EVaR). In fact, for X ∈ L_{M+} (with L_{M+} the set of all Borel measurable functions whose moment-generating function exists for all positive real values) we have

VaR_{1−α}(X) ≤ CVaR_{1−α}(X) ≤ EVaR_{1−α}(X),

where

CVaR_{1−α}(X) = (1/α) ∫₀^α VaR_{1−t}(X) dt and EVaR_{1−α}(X) = inf_{z>0} { z⁻¹ ln(M_X(z)/α) },

in which M_X(z) is the moment-generating function of X at z. In the above equations the variable X denotes the financial loss, rather than wealth as is typically the case.
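A numeric sketch of these bounds for a normally distributed loss, using textbook closed forms for the three measures (stated here as assumptions rather than derived):

```python
import math
from statistics import NormalDist

# For a normal loss X ~ N(mu, sigma^2), standard closed forms (assumed here):
#   VaR_{1-a}(X)  = mu + sigma * z,  where z is the (1-a)-quantile of N(0,1)
#   CVaR_{1-a}(X) = mu + sigma * phi(z) / a   (phi = standard normal density)
#   EVaR_{1-a}(X) = mu + sigma * sqrt(-2 ln a)
mu, sigma, a = 0.0, 1.0, 0.05
nd = NormalDist()
z = nd.inv_cdf(1 - a)

var_ = mu + sigma * z
cvar = mu + sigma * nd.pdf(z) / a
evar = mu + sigma * math.sqrt(-2 * math.log(a))

print(round(var_, 3), round(cvar, 3), round(evar, 3))
assert var_ <= cvar <= evar  # the coherence bounds hold
```

At the 95% level this gives roughly 1.64σ, 2.06σ, and 2.45σ: each successive measure is more conservative because it weighs more of the tail.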
See also
 Capital Adequacy Directive
 Conditional value-at-risk
 Cyber risk quantification based on cyber value-at-risk or CyVaR
 EMP for stochastic programming— solution technology for optimization problems involving VaR and CVaR
 Entropic value at risk
 Profit at risk
 Margin at risk
 Liquidity at risk
 Risk return ratio
 Valuation risk
References
 ^ ^{a} ^{b} ^{c} ^{d} ^{e} ^{f} ^{g} ^{h} ^{i} ^{j} Jorion, Philippe (2006). Value at Risk: The New Benchmark for Managing Financial Risk (3rd ed.). McGrawHill. ISBN 9780071464956.
 ^ ^{a} ^{b} Holton, Glyn A. (2014). Value-at-Risk: Theory and Practice (2nd ed.), e-book.
 ^ ^{a} ^{b} David Einhorn (June–July 2008), Private Profits and Socialized Risk (PDF), GARP Risk Review, archived (PDF) from the original on April 26, 2016
 ^ McNeil, Alexander; Frey, Rüdiger; Embrechts, Paul (2005). Quantitative Risk Management: Concepts Techniques and Tools. Princeton University Press. ISBN 9780691122557.
 ^ ^{a} ^{b} ^{c} ^{d} ^{e} ^{f} Dowd, Kevin (2005). Measuring Market Risk. John Wiley & Sons. ISBN 9780470013038.
 ^ Pearson, Neil (2002). Risk Budgeting: Portfolio Problem Solving with Value-at-Risk. John Wiley & Sons. ISBN 9780471405566.
 ^ ^{a} ^{b} ^{c} ^{d} Aaron Brown (March 2004), The Unbearable Lightness of CrossMarket Risk, Wilmott Magazine
 ^ ^{a} ^{b} Crouhy, Michel; Galai, Dan; Mark, Robert (2001). The Essentials of Risk Management. McGrawHill. ISBN 9780071429665.
 ^ Jose A. Lopez (September 1996). "Regulatory Evaluation of Value-at-Risk Models". Wharton Financial Institutions Center Working Paper 9651.
 ^ ^{a} ^{b} ^{c} ^{d} ^{e} Kolman, Joe; Onak, Michael; Jorion, Philippe; Taleb, Nassim; Derman, Emanuel; Putnam, Blu; Sandor, Richard; Jonas, Stan; Dembo, Ron; Holt, George; Tanenbaum, Richard; Margrabe, William; Mudge, Dan; Lam, James; Rozsypal, Jim (April 1998). Roundtable: The Limits of VaR. Derivatives Strategy.
 ^ Aaron Brown (March 1997), The Next Ten VaR Disasters, Derivatives Strategy
 ^ Wilmott, Paul (2007). Paul Wilmott Introduces Quantitative Finance. Wiley. ISBN 9780470319581.
 ^ Lawrence York (2009), Best Practices in Governance
 ^ Artzner, Philippe; Delbaen, Freddy; Eber, JeanMarc; Heath, David (1999). "Coherent Measures of Risk" (PDF). Mathematical Finance. 9 (3): 203–228. doi:10.1111/14679965.00068. Retrieved February 3, 2011.
 ^ Foellmer, Hans; Schied, Alexander (2004). Stochastic Finance. de Gruyter Series in Mathematics. 27. Berlin: Walter de Gruyter. pp. 177–182. ISBN 9783110183467. MR 2169807.
 ^ Nassim Taleb (December 1996 – January 1997), The World According to Nassim Taleb, Derivatives Strategy
 ^ Julia L. Wirch; Mary R. Hardy. "Distortion Risk Measures: Coherence and Stochastic Dominance" (PDF). Retrieved March 10, 2012.
 ^ Balbás, A.; Garrido, J.; Mayoral, S. (2008). "Properties of Distortion Risk Measures". Methodology and Computing in Applied Probability. 11 (3): 385. doi:10.1007/s11009-008-9089-z.
 ^ Jorion, Philippe (April 1997). The Jorion-Taleb Debate. Derivatives Strategy.
 ^ ^{a} ^{b} ^{c} ^{d} ^{e} ^{f} Aaron Brown (June–July 2008). "Private Profits and Socialized Risk". GARP Risk Review.
 ^ ^{a} ^{b} ^{c} Espen Haug (2007). Derivatives Models on Models. John Wiley & Sons. ISBN 9780470013229.
 ^ Ezra Zask (February 1999), Taking the Stress Out of Stress Testing, Derivatives Strategy
 ^ ^{a} ^{b} ^{c} ^{d} ^{e} Kolman, Joe; Onak, Michael; Jorion, Philippe; Taleb, Nassim; Derman, Emanuel; Putnam, Blu; Sandor, Richard; Jonas, Stan; Dembo, Ron; Holt, George; Tanenbaum, Richard; Margrabe, William; Mudge, Dan; Lam, James; Rozsypal, Jim (April 1998). "Roundtable: The Limits of Models". Derivatives Strategy.
 ^ Aaron Brown (December 2007). "On Stressing the Right Size". GARP Risk Review.
 ^ Darbyshire, J. H. M. (2016). The Pricing and Hedging of Interest Rate Derivatives: A Practical Guide to Swaps. ISBN 9780995455511.
 ^ Markovich, N. (2007). Nonparametric Analysis of Univariate Heavy-Tailed Data. Wiley.
 ^ ^{a} ^{b} Novak, S.Y. (2011). Extreme value methods with applications to finance. Chapman & Hall/CRC Press. ISBN 9781439835746.
 ^ Kuester, Keith; Mittnik, Stefan; Paolella, Marc (2006). "Value-at-Risk Prediction: A Comparison of Alternative Strategies". Journal of Financial Econometrics. 4: 53–89. doi:10.1093/jjfinec/nbj002.
 ^ McKinsey & Company. "McKinsey Working Papers on Risk, Number 32" (PDF).
 ^ Christoffersen, Peter (1998). "Evaluating interval forecasts". International Economic Review. 39 (4): 841–62. CiteSeerX 10.1.1.41.8009. doi:10.2307/2527341. JSTOR 2527341.
 ^ ^{a} ^{b} Pajhede, Thor (2017). "Backtesting Value-at-Risk: A Generalized Markov Framework". Journal of Forecasting. 36 (5): 597–613. doi:10.1002/for.2456.
 ^ Christoffersen, Peter (2014). "Backtesting Value-at-Risk: A Duration-Based Approach". Journal of Financial Econometrics.
 ^ Haas, M. (2006). "Improved duration-based backtesting of value-at-risk". Journal of Risk. 8.
 ^ Tokpavi, S. "Backtesting Value-at-Risk: A GMM Duration-Based Test". Journal of Financial Econometrics.
 ^ Dufour, J.-M. (2006). "Monte Carlo tests with nuisance parameters: A general approach to finite-sample inference and nonstandard asymptotics". Journal of Econometrics.
 ^ ^{a} ^{b} Taleb, Nassim Nicholas (2007). The Black Swan: The Impact of the Highly Improbable. New York: Random House. ISBN 9781400063512.
 ^ Nassim Taleb (April 1997), The Jorion-Taleb Debate, Derivatives Strategy
 ^ Joe Nocera (January 4, 2009), Risk Mismanagement, The New York Times Magazine
 ^ Nassim Taleb (September 10, 2009). "Report on The Risks of Financial Modeling, VaR and the Economic Breakdown" (PDF). U.S. House of Representatives. Archived from the original (PDF) on November 4, 2009.
External links
 Discussion
 "Value At Risk", Ben Sopranzetti, Ph.D., CPA
 "Perfect Storms" – Beautiful & True Lies In Risk Management, Satyajit Das
 "Gloria Mundi" – All About Value at Risk, Barry Schachter
 Risk Mismanagement, Joe Nocera, The New York Times Magazine article.
 "VaR Doesn't Have To Be Hard", Rich Tanenbaum
 Tools
 "The Pricing and Trading of Interest Rate Derivatives", J H M Darbyshire, MSc.
 Online realtime VaR calculator, Razvan Pascalau, University of Alabama
 Value-at-Risk (VaR), Simon Benninga and Zvi Wiener. (Mathematica in Education and Research, Vol. 7, No. 4, 1998.)
 Derivatives Strategy Magazine. "Inside D. E. Shaw" Trading and Risk Management 1998