From Wikipedia, the free encyclopedia

Itô integral $Y_t(B)$ (blue) of a Brownian motion $B$ (red) with respect to itself, i.e., both the integrand and the integrator are Brownian. It turns out $Y_t(B) = (B_t^2 - t)/2$.

Itô calculus, named after Kiyosi Itô, extends the methods of calculus to stochastic processes such as Brownian motion (see Wiener process). It has important applications in mathematical finance and stochastic differential equations.

The central concept is the Itô stochastic integral, a stochastic generalization of the Riemann–Stieltjes integral in analysis. The integrands and the integrators are now stochastic processes:

$$Y_t = \int_0^t H_s \, dX_s,$$

where H is a locally square-integrable process adapted to the filtration generated by X (Revuz & Yor 1999, Chapter IV), which is a Brownian motion or, more generally, a semimartingale. The result of the integration is then another stochastic process. Concretely, the integral from 0 to any particular t is a random variable, defined as a limit of a certain sequence of random variables. The paths of Brownian motion fail to satisfy the requirements to be able to apply the standard techniques of calculus. So with the integrand a stochastic process, the Itô stochastic integral amounts to an integral with respect to a function which is not differentiable at any point and has infinite variation over every time interval. The main insight is that the integral can be defined as long as the integrand H is adapted, which loosely speaking means that its value at time t can only depend on information available up until this time. Roughly speaking, one chooses a sequence of partitions of the interval from 0 to t and constructs Riemann sums. Every time we are computing a Riemann sum, we are using a particular instantiation of the integrator. It is crucial which point in each of the small intervals is used to compute the value of the function. The limit then is taken in probability as the mesh of the partition is going to zero. Numerous technical details have to be taken care of to show that this limit exists and is independent of the particular sequence of partitions. Typically, the left end of the interval is used.
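The left-endpoint construction described above can be illustrated numerically. The following is a minimal sketch, not a rigorous construction: it assumes a uniform partition of [0, t], simulates one Brownian path from independent Gaussian increments, and compares the left-endpoint Riemann sum for the integral of B with respect to B against the closed form (B_t^2 - t)/2. The function name `ito_integral_b_db` and its parameters are illustrative choices.

```python
import math
import random


def ito_integral_b_db(t=1.0, n_steps=20_000, seed=0):
    """Left-endpoint Riemann sum for the Ito integral of B with respect to B on [0, t].

    Returns (riemann_sum, b_t), where b_t is the simulated endpoint B_t.
    """
    rng = random.Random(seed)
    dt = t / n_steps
    b = 0.0      # current value of the simulated Brownian path, B_0 = 0
    total = 0.0  # running Riemann sum
    for _ in range(n_steps):
        db = rng.gauss(0.0, math.sqrt(dt))  # increment B_{t_{i+1}} - B_{t_i}
        total += b * db                     # integrand taken at the LEFT endpoint
        b += db
    return total, b


riemann_sum, b_t = ito_integral_b_db()
closed_form = (b_t ** 2 - 1.0) / 2.0  # (B_t^2 - t) / 2 with t = 1
print(riemann_sum, closed_form)
```

On a fine partition the two printed numbers should be close; the gap between them is half the difference between t and the sum of squared increments, which shrinks as the mesh goes to zero.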

Important results of Itô calculus include the integration by parts formula and Itô's lemma, which is a change of variables formula. These differ from the formulas of standard calculus, due to quadratic variation terms.

In mathematical finance, this evaluation strategy for the integral corresponds to first deciding what to do, then observing the change in prices. The integrand is how much stock we hold, the integrator represents the movement of the prices, and the integral is how much money we have in total, including what our stock is worth, at any given moment. The prices of stocks and other traded financial assets can be modeled by stochastic processes such as Brownian motion or, more often, geometric Brownian motion (see Black–Scholes). Then, the Itô stochastic integral represents the payoff of a continuous-time trading strategy consisting of holding an amount Ht of the stock at time t. In this situation, the condition that H is adapted corresponds to the necessary restriction that the trading strategy can only make use of the available information at any time. This prevents the possibility of unlimited gains through clairvoyance: buying the stock just before each uptick in the market and selling before each downtick. Similarly, the condition that H is adapted implies that the stochastic integral will not diverge when calculated as a limit of Riemann sums (Revuz & Yor 1999, Chapter IV).
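The role of adaptedness in the trading interpretation can be sketched with a small simulation. This is an illustrative toy under simplifying assumptions: the price is modeled as a plain Brownian motion on a uniform time grid, and both strategies (`h_adapted`, which looks only at the previous price move, and `h_clairvoyant`, which peeks at the upcoming move) are hypothetical examples rather than anything taken from the cited references.

```python
import math
import random


def trading_gains(n_steps=1_000, t=1.0, seed=0):
    """Gains of an adapted and a clairvoyant +/-1 strategy on one simulated path.

    Returns (adapted_gain, clairvoyant_gain).
    """
    rng = random.Random(seed)
    dt = t / n_steps
    adapted_gain = 0.0
    clairvoyant_gain = 0.0
    previous_move = 0.0
    for _ in range(n_steps):
        dx = rng.gauss(0.0, math.sqrt(dt))              # upcoming price change
        h_adapted = 1.0 if previous_move > 0 else -1.0  # uses past information only
        h_clairvoyant = 1.0 if dx > 0 else -1.0         # uses the future increment
        adapted_gain += h_adapted * dx
        clairvoyant_gain += h_clairvoyant * dx
        previous_move = dx
    return adapted_gain, clairvoyant_gain


print(trading_gains())
```

The clairvoyant gain equals the sum of the absolute price moves, which grows roughly like the square root of the number of steps as the grid is refined, while the adapted gain stays of order one. This is one way to see why non-adapted integrands can make the Riemann sums diverge.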

YouTube Encyclopedic

  • 18. Itō Calculus
  • Stochastic Calculus for Quants | Understanding Geometric Brownian Motion using Itô Calculus
  • Ito Lemma
  • Ito's Lemma
  • 212(a) - Ito's Formula for Brownian Motion

Transcription

The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation or view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.
PROFESSOR: Let's begin. Today we're going to continue the discussion on Ito calculus. I briefly introduced you to Ito's lemma last time, but let's begin by reviewing it and stating it in a slightly more general form. Last time what we did was the quadratic variation of Brownian motion. We defined the Brownian process, Brownian motion, and then showed that it has quadratic variation, which can be written in this form-- d B squared is equal to dt. And then we used that to show the simple form of Ito's lemma, which says that if f is a function of the Brownian motion, then d of f is equal to f prime times d Bt plus 1/2 f double prime times dt. This additional term is characteristic of Ito calculus. In classical calculus we only have this term, but here we have this additional term. And if you remember, this happened exactly because of this quadratic variation. Let's review it, and let's do it in a slightly more general form. As you know, we have a function f depending on two variables, t and x. Now we want to evaluate the function f at t, Bt. The second coordinate, we're planning to put in the Brownian motion there. Then again, let's do the same analysis. Can we describe d of f in terms of these differentiations? To do that, let me start from the Taylor expansion. f at a point t plus delta t, x plus delta x, by Taylor expansion for two variables, is f of t, x plus partial f over partial t at t comma x times delta t, plus partial f over partial x times delta x. Those are the first-order terms. Then we have the second-order terms. Then the third-order terms, and so on. That's just Taylor expansion. If you look at it, we have a function f.
We want to look at the difference of f when we change the first variable a little bit and the second variable a little bit. We start from f of t, x. In the first-order terms, you take the partial derivative, so take del f over del t, and then multiply by the t difference. Second term, you take the partial derivative with respect to the second variable-- partial f over partial x-- and then multiply by delta x. That much is enough for classical calculus. But then, as we have seen before, we ought to look at the second-order terms. So let's first write down what they are. That's exactly what happens in Taylor expansion, if you remember. If you don't remember, just believe me. This is 1 over 2 times the second partial derivatives. Let's write it in terms of-- yes?
AUDIENCE: [INAUDIBLE]
PROFESSOR: Oh, yeah, you're right. Thank you. Is it good enough? Let's write all these deltas as d's. I'll just write it like that. I'll just not write down t, x. And what we have is f plus del f over del t dt plus del f over del x dx plus the second-order terms. The only important terms-- first of all, these terms are important. But then, if you want to use x equals B of t-- so if you're now interested in f of t comma B of t. Or more generally, if you're interested in f of t plus dt, Bt plus d Bt, then these terms are important. If you subtract f of t, Bt, what you get is these two terms. Del f over del t dt plus del f over del x-- I'm just writing this as a second variable differentiation-- times d Bt. And then the second-order terms. Instead of writing it all down, dt squared is insignificant, and dt times d Bt is also insignificant. The only thing that matters will be this one. This is d Bt squared, which you saw is equal to dt. From the second-order terms, you'll have this term surviving. 1 over 2 times the second partial derivative of f with respect to x, times dt. That's it.
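The board work for this step is not captured in the transcript. A reconstruction of the standard computation it appears to describe, written with the lecture's symbols f(t, x) and B_t and the heuristic rules (dt)^2 = 0, dt dB_t = 0, (dB_t)^2 = dt, might read as follows. It is a sketch of the textbook argument, not a verbatim copy of what was written in class.

```latex
\begin{aligned}
f(t+\Delta t,\, x+\Delta x)
  &= f(t,x) + \frac{\partial f}{\partial t}\,\Delta t + \frac{\partial f}{\partial x}\,\Delta x \\
  &\quad + \frac{1}{2}\left[\frac{\partial^2 f}{\partial t^2}(\Delta t)^2
     + 2\,\frac{\partial^2 f}{\partial t\,\partial x}\,\Delta t\,\Delta x
     + \frac{\partial^2 f}{\partial x^2}(\Delta x)^2\right] + \cdots \\[4pt]
df(t, B_t)
  &= \frac{\partial f}{\partial t}\,dt + \frac{\partial f}{\partial x}\,dB_t
     + \frac{1}{2}\,\frac{\partial^2 f}{\partial x^2}\,(dB_t)^2 \\
  &= \left(\frac{\partial f}{\partial t} + \frac{1}{2}\,\frac{\partial^2 f}{\partial x^2}\right)dt
     + \frac{\partial f}{\partial x}\,dB_t .
\end{aligned}
```

All partial derivatives in the last two lines are evaluated at (t, B_t).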
If you rearrange it, what we get is partial f over partial t plus 1/2 this plus-- and that's the additional term. If you ask me why these terms are not important and this term is important, I can't really say it rigorously. But if you think about d Bt squared equals dt, then d Bt is kind of like the square root of dt. It's not a good notation, but if you do that-- these two terms are significantly smaller than dt because you're taking a power of it. dt squared becomes a lot smaller than dt, dt [INAUDIBLE] is a lot smaller than dt. But this one survives because it's equal to dt here. That's just the high level description. That's a slightly more sophisticated form of Ito's lemma. Let me write it down here. And let's just fix it now. If f is a function of t and Bt, then d of f is equal to-- Any questions? Just remember, from the classical calculus terms, we're only adding this one term there. Yes?
AUDIENCE: Why do we have x there?
PROFESSOR: Because the second variable is supposed to be x. I don't want to write down a partial derivative with respect to a Brownian motion here because it doesn't look good. It just means, take the partial derivative with respect to the second variable. So just view this as a function f of t, x, evaluate it, and then plug in x equals Bt in the end, because I don't want to write down partial Bt here. Other questions? Consider a stochastic process X of t such that d Xt is equal to mu times dt plus sigma times d Bt. This is almost like a Brownian motion, but you have this additional term. This is called a drift term. Basically, this happens if Xt is equal to mu t plus sigma times Bt. Mu and sigma are constants. From now on, what we're going to study is stochastic processes of this type, whose difference can be written in terms of a drift term and a Brownian motion term. We want to do a slightly more general form of Ito's lemma, where we want f of t, Xt here. That will be the main object of study.
I'll finally state the strongest Ito's lemma that we're going to use. f is some smooth function and Xt is a stochastic process like that. Xt satisfies-- where Bt is the Brownian motion. Then d of f of t, Xt can be expressed as-- it's just getting more and more complicated. But it's based on this one simple principle, really. It all happened because of quadratic variation. Now I'll show you why this form deviates from this form when we replace B by X. Remember, here all the other terms didn't matter; the only second-order term that mattered was the second partial of f times dx squared. To prove this, note that df is partial f over partial t dt plus partial f over partial x d Xt plus 1/2 times the second partial of f times d Xt squared. It's exactly the same, but where we previously had d Bt, I'm replacing it with d Xt. Now what changes is d Xt can be written like that. If you just plug it in, what you get here is partial f over partial x times mu dt plus sigma d Bt. Then what you get here is 1/2 times the second partial, and then mu dt plus sigma d Bt, squared. Out of those three terms here we get mu squared dt squared plus 2 times mu sigma dt d Bt plus sigma squared d Bt squared. Only this one survives, just as before. These ones disappear. And then you just collect the terms. So dt-- there's one dt here. There's mu times that here, and that one will become a dt. It's 1/2 of sigma squared times the second partial of f, times dt. And there's only one d Bt term here. Sigma-- I made a mistake, sigma. This will be the form that you'll use the most, because you want to evaluate some stochastic process-- some function that depends on time and that stochastic process. You want to understand the difference, df. The X would have been written in terms of a Brownian motion and a drift term, and then that's the Ito lemma for you. But if you just see this for the first time, it just looks too complicated. You don't understand where all the terms are coming from. But in reality, what it's really doing is just taking this Taylor expansion.
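The formula being assembled here is written on the board and does not appear in the transcript. Under the lecture's stated assumptions (constant mu and sigma, smooth f), the standard statement and the collection of terms it describes can be sketched as:

```latex
\begin{aligned}
dX_t &= \mu\,dt + \sigma\,dB_t, \\
(dX_t)^2 &= \mu^2\,(dt)^2 + 2\mu\sigma\,dt\,dB_t + \sigma^2\,(dB_t)^2 = \sigma^2\,dt, \\
df(t, X_t) &= \frac{\partial f}{\partial t}\,dt + \frac{\partial f}{\partial x}\,dX_t
              + \frac{1}{2}\,\frac{\partial^2 f}{\partial x^2}\,(dX_t)^2 \\
           &= \left(\frac{\partial f}{\partial t} + \mu\,\frac{\partial f}{\partial x}
              + \frac{1}{2}\,\sigma^2\,\frac{\partial^2 f}{\partial x^2}\right)dt
              + \sigma\,\frac{\partial f}{\partial x}\,dB_t .
\end{aligned}
```

Setting mu = 0 and sigma = 1 recovers the version for f(t, B_t).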
Remember these two classical terms, and remember that there's one more term here. You can derive it if you want to. Really try to know where it all comes from. It all started from this one fact, quadratic variation, because that made some of the second-derivative terms survive, and because of those, you get these kind of complicated terms. Questions? Let's do some examples. That's too much. Sorry, I'm going to use it a lot, so let me record it. Example number one. Let f of x be equal to x squared, and then you want to compute d of f at Bt. I'll give you three minutes just to try it for practice. Did you manage to do this? It's a very simple example. Assume it's just a function of two variables, but it doesn't depend on t. You don't have to do that, but let me just do that. Partial f over partial t is 0. Partial f over partial x is equal to 2x, and the second derivative is equal to 2 at t, x. Now we just plug in t comma Bt, and what you have is mu equals 0, sigma equals 1, if you want to write it in this formula. What you're going to have is 2 times Bt d Bt plus 1 over 2 times 2 dt. You should write it down. You can either use these parameters and just plug in each of them to figure it out. Or a different way to do it is really write it down, remember the proof. This is partial f over partial t dt plus partial f over partial x dx plus 1/2-- remember this one. And dx is d Bt here. That one is 0, that one was 2x, so 2 Bt d Bt. Use it one more time, so you get dt. Make sense? Let's do a few more examples. [On the board: f of t, x equals e to the mu t plus sigma x.] And you want to compute d of f at t comma B of t. Let's do it this time. Again, partial f over partial t dt plus partial f over partial x d Bt. Those are the first-order terms. The second-order term is 1/2 partial squared f over partial x squared times d Bt squared, which is equal to dt. Let's do it. Partial f over partial t, you get mu times f. This one is just equal to mu times f. Maybe I'm going too quick. Mu times e to the mu t plus sigma x, times dt.
Partial f over partial x is sigma times e to the mu t plus sigma x, and then d Bt plus-- if you take the second derivative, you do that again, what you get is 1/2, and then sigma squared times e to the mu t plus sigma x, times dt. Yes?
AUDIENCE: In the original equation that you just wrote, isn't it 1/2 times sigma squared, and then the second derivative? Up there.
PROFESSOR: Here?
AUDIENCE: Yes.
PROFESSOR: 1/2?
AUDIENCE: Times sigma squared.
PROFESSOR: Oh, sigma-- OK, that's a good question. But that sigma is different. That's if you plug in Xt here. If you plug in Xt where d Xt is equal to mu prime dt plus sigma prime d Bt, then that sigma prime will become a sigma prime squared here. But here the function has mu and sigma, so maybe it's not a good notation. Let me use a and b here instead. The sigma here is different from here.
AUDIENCE: Yeah, that makes a lot more sense.
PROFESSOR: If you replace a and b, but I already wrote down all mu's and sigma's. That's a good point, actually. But that's when you want to consider a general stochastic process here other than Brownian motion. But here it's just a Brownian motion, so it's the most simple form. And that's what you get. Mu plus 1/2 sigma squared-- and these are just all f itself. That's the good thing about the exponential. f times dt plus sigma times d Bt. Make sense? And there's a reason I was covering this example. It's because-- let's come back to this question. You want to model a stock price using a Brownian motion, Brownian process, S of t. But you don't want St to be a Brownian motion. What you want is the percentage difference to be a Brownian motion, so you want this percentage difference to behave like a Brownian motion with some variance. The question was, is St equal to e to the sigma times B of t in this case? And I already told you last time that no, it's not true. We can now see why it's not true. Take this function, St equals e to the sigma Bt; that's exactly where mu is equal to 0 here.
What we got here was d of St, in this case, is equal to-- mu is 0, so we get 1/2 of sigma squared times dt plus sigma times d Bt. We originally were targeting sigma times d Bt, but we got this additional term which we didn't want in the first place. In other words, we have this drift. I wasn't really clear in the beginning, but our goal was to model a stock price where the expected value is 0 at all times. Our guess was to take e to the sigma Bt, but it turns out that in this case we have a drift, if you just take the natural e to the sigma Bt. To remove that drift, what you can do is subtract that term somehow. If you can get rid of that term-- then you can see that if you set this mu to be minus 1 over 2 sigma squared, you can remove that term. That's why it doesn't work. So instead use S of t equals e to the minus 1 over 2 sigma squared t plus sigma Bt. That's the geometric Brownian motion without drift. And the reason it has no drift is because of that. If you actually do the computation, the dt term disappears. Question? So far we have been discussing differentiation. Now let's talk about integration. Yes?
AUDIENCE: Could you-- we do get this solution as [INAUDIBLE]. Could you also describe what it means? What does it mean, this solution of Bt? Does that mean if we have a sample Bt, then we could get a sample Bt [INAUDIBLE]?
PROFESSOR: Oh, what this means, yes. Whenever you have the Bt value, just at each time take the exponential value. Because why we want to express this in terms of a Brownian motion is, for Brownian motion we have a pretty good understanding. It's a really good process you understand fairly well, and you have good control on it. But the problem is you want to have a process whose percentage difference behaves like a Brownian motion. And this gives you a way of describing it in terms of Brownian motion, as an exponential function of it. Does that answer your question?
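The drift discussed in this part of the lecture can be checked with a short Monte Carlo experiment. The sketch below assumes only that B_t is normal with mean 0 and variance t; the function name `mean_of_exponentials` and the parameter values are illustrative. It estimates the mean of e^{sigma B_t}, which should be close to e^{sigma^2 t / 2} and therefore grows with t, and the mean of e^{-sigma^2 t / 2 + sigma B_t}, which should stay close to 1.

```python
import math
import random


def mean_of_exponentials(sigma=0.5, t=1.0, n_samples=200_000, seed=0):
    """Monte Carlo means of exp(sigma*B_t) and exp(-sigma^2*t/2 + sigma*B_t)."""
    rng = random.Random(seed)
    naive_total = 0.0
    corrected_total = 0.0
    for _ in range(n_samples):
        b_t = rng.gauss(0.0, math.sqrt(t))  # B_t ~ N(0, t)
        naive_total += math.exp(sigma * b_t)
        corrected_total += math.exp(-0.5 * sigma ** 2 * t + sigma * b_t)
    return naive_total / n_samples, corrected_total / n_samples


naive, corrected = mean_of_exponentials()
print(naive, math.exp(0.5 * 0.5 ** 2 * 1.0))  # both should be near e^{sigma^2 t / 2}
print(corrected)                              # should be near 1
```

The upward bias of the first estimate is the effect of the extra dt term, and the correction in the exponent removes it.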
AUDIENCE: Right, distribution means that if we have a sample Bt, that would be the corresponding sample Bt [INAUDIBLE]?
PROFESSOR: That's a good question, actually. Think of it as a pointwise evaluation. That is not always correct, but for most of the things that we will cover, it's safe to think about it that way. But if you think about it pathwise all the time, eventually it fails. But that's a very advanced topic. So what this question is, basically Bt is a probability space, it's a probability distribution over paths. For this equation, if you just look at it, it looks right, but it doesn't really make sense, because Bt-- if it's a probability distribution, what is e to the Bt? Basically, what it's saying is Bt is a probability distribution over paths. If you take a path omega according to the Brownian motion probability distribution, then for this path it's well defined, this function. So the probability density function of this path is equal to the probability density function of e to the whatever that is in this distribution. Maybe it confused you more. Just consider this as some path, some well defined function, and you have a well defined function. Integral definition. I will first give you a very, very stupid definition of integration. We say that we define F as the integration if d of F is equal to f d Bt plus-- We define it as an inverse of differentiation. Because differentiation is now well defined-- we just defined integration as the inverse of it, just as in classical calculus. So far, it doesn't have that good a meaning, other than being an inverse of it, but at least it's well defined. The question is, does it exist? Given f and g, does it exist, does integration always exist, and so on. There's lots of questions to ask, but at least this is some definition. And the natural question is, does there exist a Riemann sum type description?
That means-- if you remember how we defined the integral in calculus, you have a function f, integration of f from a to b. The Riemann sum description was, you just chop the interval into very fine pieces-- a0, a1, a2, a3, dot, dot, dot-- and then sum the area of these boxes, and take the limit. And this is the limit of Riemann sums. Slightly more, if you want, is it's the limit as n goes to infinity of the function 1 over n times the sum of [INAUDIBLE] f of t b over n minus f of t minus 1 over n. Does this ring a bell? Question?
AUDIENCE: [INAUDIBLE]
PROFESSOR: No, you're right. Good point, no we don't. Thanks. Does the integral defined in this way have this Riemann sum type description, is the question. So keep that in mind. I will come back to this point later. In fact, it turns out to be a very deep question and a very important question, because if you remember, as I hope you remember, in the Riemann sum, it didn't matter which point you took in this interval. That was the whole point. You have the function. In the interval a i to a i plus 1, you take any point in the middle and make a rectangle according to that point. And then, no matter which point you take, when you go to the limit, you had exactly the same sum all the time. That's how you define the limit. But what's really interesting here is that it's no longer true. If you take the left point all the time, and you take the right point all the time, the two limits are different. And again, that's due to quadratic variation, because that much variance can accumulate over time. That's the reason we didn't start with a Riemann sum type definition of the integral. But I'll just make one remark. The Ito integral is the limit of Riemann sums when you always take the leftmost point of each interval. So you chop this curve, the time interval, into pieces, and for each rectangle, pick the leftmost point, and use it as a rectangle. And you take the limit.
That will be your Ito integral defined. It will be exactly equal to this thing, the inverse of our Ito differentiation. I won't be able to go into detail. What's more interesting is, instead, what happens if you take the rightmost point all the time: you get an equivalent theory of calculus. It's just like Ito's calculus. It looks really, really similar and it's coherent itself, so there is no logical flaw in it. It all makes sense, but the only difference is instead of a plus in the second-order term, you get minuses. Let me just make this remark, because it's just a theoretical part, this thing, but I think it's really cool. Remark-- there's an equivalent version. Maybe equivalent is not the right word, but a very similar version of Ito calculus such that basically, what it says is d Bt squared is equal to minus dt. Then that changes a lot of things. But this part, it's not that important. Just cool stuff. Let's think about this a little bit more, this fact. Taking the leftmost point all the time means if you want to make a decision for your time interval-- so at time t of i and time t of i plus 1, let's say it's the stock price. You want to say that you had so many stocks in this time interval. Let's say you had so many stocks in this time interval according to the values between this and this. In the real world, the only choice you have is to make the decision at time t of i. Your choice cannot depend on the future time. You can't suddenly say, OK, in this interval the stock price increased a lot, so I'll assume that I had a lot of stocks in this interval. In this interval, I knew it was going to drop, so I'll just take the rightmost point. I'll assume that I only had this many stocks. You can't do that. Your decision has to be based on the leftmost point, because of time. You can't see the future. And the reason Ito's calculus works well in our setting is because of this fact, because it has inside it the fact that you cannot see the future.
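The claim that left-endpoint and right-endpoint sums have different limits can be seen in a small experiment. The sketch below is illustrative only: it assumes a uniform grid on [0, t], uses the integrand B itself, and the function name `left_and_right_sums` is a made-up label. On one simulated path it computes both Riemann sums for the integral of B with respect to B.

```python
import math
import random


def left_and_right_sums(t=1.0, n_steps=20_000, seed=0):
    """Left- and right-endpoint Riemann sums of B dB on one simulated path.

    Returns (left_sum, right_sum, b_t).
    """
    rng = random.Random(seed)
    dt = t / n_steps
    b = 0.0
    left_sum = 0.0
    right_sum = 0.0
    for _ in range(n_steps):
        db = rng.gauss(0.0, math.sqrt(dt))
        left_sum += b * db          # B evaluated at the left endpoint
        right_sum += (b + db) * db  # B evaluated at the right endpoint
        b += db
    return left_sum, right_sum, b


left, right, b_t = left_and_right_sums()
print(left, (b_t ** 2 - 1.0) / 2.0)   # left sum vs (B_t^2 - t)/2
print(right, (b_t ** 2 + 1.0) / 2.0)  # right sum vs (B_t^2 + t)/2
print(right - left)                   # sum of squared increments, close to t = 1
```

The gap between the two sums is exactly the sum of squared increments, which is the quadratic variation and stays near t no matter how fine the grid is.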
Every decision is made based on the leftmost time. If you want to make a decision for your time interval, you have to do it in the beginning. That intuition is hidden inside of the theory, and that's why it works so well. Let me reiterate this part a little bit more. It's the definition of these things where-- at time t, you're only allowed to use the information up to time t. Definition: delta t is an adapted process-- sorry-- adapted to another stochastic process Xt if, for all values of the time variable, delta t depends only on X0 up to Xt. There's a lot of vague statements inside here, but what I'm trying to say is just assume X is the Brownian motion underlying the stock price. Your stock is changing. You want to come up with a strategy, and you want to say that mathematically this strategy makes sense. And what it's saying is if the decision your strategy makes at time t is based only on the past values of your stock price, then that's an adapted process. This defines the processes that are reasonable, that cannot see the future. And in terms of strategy, if delta t is a portfolio strategy, these are the only meaningful strategies that you can use. And because of what I said before, because we're always taking the leftmost point, adapted processes also fit very well with Ito's calculus. They'll come into play all together. Just a few examples. First, a very stupid example. Xt is adapted to Xt. Of course, because at time t, Xt really depends only on Xt, nothing else. Two, Xt plus 1 is not adapted to Xt. This is maybe a little bit vague, so we'll call it Yt equals Xt plus 1. Yt is the value at t plus 1, and it's not based on the values up to time t. Just a very artificial example. Another example, delta t equals the minimum [of X of s for s up to t] is adapted. And I'll let you think about it. The fourth is quite interesting. Suppose T is fixed, some large integer, or some large real number. Then you let delta t be the maximum of X of s [for s up to T]. It's not adapted.
What is this? This means at time t, I'm going to take this value, the maximum of all values inside this part, the future. This refers to the future. It's not an adapted process. Any questions? Now we're ready to talk about the properties of Ito's integral. Let's quickly review what we have. First, I defined Ito's lemma-- that means differentiation in Ito calculus. Then I defined integration using differentiation-- integration was an inverse operation of the differentiation. But this integration also had an alternative description in terms of Riemann sums, where you're taking just the leftmost point as the reference point for each interval. And then, as you see, this naturally had this concept of using the leftmost point. And to abstract that concept, we've come up with this adapted process, a very natural process, which is like the real life procedures, real life strategies we can think of. Now let's see what happens when you take the integral of adapted processes. The Ito integral has really cool properties. The first thing is about the normal distribution. Bt has normal distribution N of 0, t. So your Brownian motion at time t has a normal distribution with mean 0 and variance t. That means if your stochastic process is some constant times B of t, of course, then you have mean 0 and variance c squared t. It's still a normal variable. That means if you integrate, that's the integration of some sigma. That's the integration of sigma d Bt. If sigma is a fixed constant, when you take the Ito integral of sigma times d Bt, this constant, at each time you get a normal distribution. And this is like saying the sum of normal distributions is also a normal distribution. It has this hidden fact, because an integral is like a sum in the limit. And this can be generalized. If delta t is a process depending only on the time variable-- so it does not depend on the Brownian motion-- then the process X of t equals the integration of delta t d Bt has a normal distribution at all times.
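Returning to the adaptedness examples given just before the review: a discrete-time sketch can make the distinction concrete. It assumes the third example means the running minimum of X up to time t and the fourth means the maximum of X over the whole interval up to T; the transcript does not spell these out, so both readings, and the function names, are assumptions. A process is treated as adapted if its values up to time t do not change when the path is altered after time t.

```python
def running_min(path):
    """delta_t = min of X_s for s <= t: needs only values up to time t (adapted)."""
    out = []
    current = float("inf")
    for x in path:
        current = min(current, x)
        out.append(current)
    return out


def global_max(path):
    """delta_t = max of X_s over the whole path, for every t: needs the future (not adapted)."""
    m = max(path)
    return [m for _ in path]


# Two hypothetical paths that agree at indices 0..2 and differ afterwards.
path_a = [0.0, 1.0, -0.5, 0.3, 2.0]
path_b = [0.0, 1.0, -0.5, -3.0, 7.0]
k = 3  # compare the strategies on the shared part of the paths

print(running_min(path_a)[:k] == running_min(path_b)[:k])  # True
print(global_max(path_a)[:k] == global_max(path_b)[:k])    # False
```

The running minimum gives the same values on the shared part of the two paths, while the global maximum already differs at time 0 because it depends on what happens later.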
Just like this, we don't know the exact variance yet. The variance will depend on the sigmas, but still, it's like a sum of normal variables, so we'll have a normal distribution. In fact, it just gets better and better. The second fact is called Ito isometry. That was cool. Can we compute the variance? Yes?
AUDIENCE: Can you put that board up?
PROFESSOR: Sure.
AUDIENCE: Does it go up?
PROFESSOR: This one doesn't go up. That's bad. I wish it did go up. This has a name called Ito isometry. It can be used to compute the variance. Bt is a Brownian motion, delta t is adapted to the Brownian motion. Then the expectation of your Ito integral-- that's the Ito integral of your adapted process. That's the variance-- we take the square of it-- is equal to something cool. The square just comes in. Quite nice, isn't it? I won't prove it, but let me tell you why. We already saw this phenomenon before. This is basically quadratic variation. And the proof also uses it. If you take delta s equal to 1-- sorry, I was using Korean-- 1 at all times, then what we have here is a Brownian motion, Bt. So on the left you get the expectation of Bt squared, and on the right, what you get is t. Because when delta s is equal to 1 at all times, when you integrate from 0 to t you get t, so you have t on the right hand side. That's what it's saying. And that was the content of quadratic variation, if you remember. We're summing the squares-- maybe not exactly this, but you're summing the squares over small intervals. So that's a really good fact that you can use to compute the variance. You have an Ito integral; its square can be computed this simple way. That's really cool. And one more property. This one will be really important. You'll see it a lot in future lectures. It's this: when is an Ito integral a martingale? What's a martingale? A martingale means, if you have a stochastic process, at any time t, whatever happens after that, the expected change from time t is equal to 0.
It doesn't have any natural tendency to go up or go down. No matter at which point you stop your process and look at your future, it doesn't have a natural tendency to go up or go down. In formal language, it can be defined as-- where Ft is the events X0 up to Xt. So if you take the conditional expectation based on whatever happened up to time t, that expectation will just be whatever value you have at that time. Intuitively, that just means you don't have any natural tendency to go up or go down. The question is, when is an Ito integral a martingale? [If g is] adapted to B of t, then it is a martingale. As long as g is not some crazy function, as long as g is reasonable-- only can be reasonable if it's [INAUDIBLE]. If you don't know what it means, you can safely ignore it. Basically, if g doesn't-- it's not a crazy function if it doesn't grow too fast, then in most cases this integral is always a martingale. If you flip it-- remember, the integral was defined as the inverse of differentiation. So if d Xt is equal to some function mu that depends on both t and Bt times dt plus sigma d Bt, what this means is Xt is a martingale if that is 0 at all times, always. And if it's not 0, you have a drift, so it's not a martingale. That gives you some classification. Now, if you look at a differential equation of this stochastic-- this is called a stochastic differential equation-- if you know a stochastic process, if you look at its stochastic differential equation, if it doesn't have a drift term, it's a martingale. If it has a drift term, it's not a martingale. That'll be really useful later, so try to remember it. The whole point is when you write down a stochastic process in terms of something times dt, something times d Bt, really this term contributes towards the tendency, the slope of whatever is going to happen in the future. And this is like the variance term. It adds some variance to your stochastic process. But still, it doesn't add or subtract value over time, it fairly adds variation.
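Two of the properties stated in this part of the lecture, the Ito isometry and the absence of drift in an Ito integral, can be checked numerically. The sketch below is a rough Monte Carlo illustration under stated assumptions: the adapted integrand is taken to be the Brownian motion itself, the integral is approximated by left-endpoint sums on a uniform grid over [0, 1], and the function name `simulate_ito_integral` is illustrative. For this choice the isometry predicts that the mean of the squared integral and the mean of the time integral of B squared are both about t^2 / 2 = 0.5, while the mean of the integral itself should be about 0.

```python
import math
import random


def simulate_ito_integral(n_paths=4_000, n_steps=200, t=1.0, seed=0):
    """Monte Carlo estimates for I = int_0^t B_s dB_s using left-endpoint sums.

    Returns (mean of I, mean of I^2, mean of int_0^t B_s^2 ds).
    """
    rng = random.Random(seed)
    dt = t / n_steps
    sqrt_dt = math.sqrt(dt)
    sum_i = 0.0
    sum_i_squared = 0.0
    sum_energy = 0.0
    for _ in range(n_paths):
        b = 0.0
        integral = 0.0  # int B dB along this path
        energy = 0.0    # int B^2 ds along this path
        for _ in range(n_steps):
            db = rng.gauss(0.0, sqrt_dt)
            integral += b * db
            energy += b * b * dt
            b += db
        sum_i += integral
        sum_i_squared += integral ** 2
        sum_energy += energy
    return sum_i / n_paths, sum_i_squared / n_paths, sum_energy / n_paths


mean_i, mean_i_squared, mean_energy = simulate_ito_integral()
print(mean_i)          # near 0: no drift
print(mean_i_squared)  # near 0.5: left side of the isometry
print(mean_energy)     # near 0.5: right side of the isometry
```

The agreement is only approximate because of sampling noise and the finite grid, but the two sides of the isometry should be close to each other.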
Remember that. That's very important fact. You're going to use it a lot. For example, you're going to use it for pricing theory. In pricing theory, you come up with this stochastic process or some strategy. You look at its value. Let's say Xt is your value of your portfolio over time. If that portfolio has-- then you match it with your financial-- let me go over it slowly again. First you have a financial derivative, like option of a stock. Then you have your portfolio strategy. Assume that you have some strategy that, at the expiration time, gives you the exact value of the option. Now you look at the difference between these two stochastic processes. Basically what the thing is, when your variance goes to 0, your drift also has to go to 0. So when you look at the difference, if you can somehow get rid of this variance term, that means no matter what you do, that will govern the value of your portfolio. If it's positive, that means you can always make money, because there's no variance. Without variance, you make money. That's called arbitrage, and you cannot have that. But I won't go into further detail because [INAUDIBLE] will cover it next time. But just remember that flavor. So when you write something down in a stochastic differential equation form, that term is a drift term, that term is a variance term. And if you don't have drift, it's a martingale. That is very important. Any questions? That's kind of the basics of Ito calculus. I will give you some exercises on it, mostly just basic computation exercises, so that you'll get familiar with it. Try to practice it. And let me cover one more thing called Girsanov theorem. It's related, but these are really basics of the Ito calculus, so if you have any questions on this, please ask me right now before I move on to the next topic. The last thing I want to talk about today. Here is an underlying question. Suppose you have two Brownian motions. This is without drift. 
And you have another B tilde Brownian motion with drift. These are two probability distributions over paths. According to Bt, you're more likely to have some Brownian motion that has no drift. That's a sample path. According to B tilde, you have some drift. Your Brownian motion will close it. A typical path will follow this line and will follow that line. The question is this-- can we switch from this distribution to this distribution by a change of measure? Can we switch between the two probability distributions by a change of measure? Let me go a little bit more what it really means. Assume that you're just looking at a Brownian motion from time 0 up to time T, some fixed time interval. Then according to Bt, let's say this is a sample path omega. You have some probability of omega-- this is a p.d.f. given by this Brownian motion B. And then you have another p.d.f., P tilde of omega, which is a p.d.f. given by B tilde. The question is, does there exist a Z depending on omega such that P of omega is equal to Z of omega times P tilde of omega? Do you understand the question? Clearly, if you just look at it, they're quite different. The paths that you get according to the two distributions are quite different. It's not clear why we should expect it at all. You'll see the answer soon. But let me discuss all this in a different context. Just forget about all the Brownian motion and everything just for a moment. In this concept, changing from one probability distribution to another distribution, it's a very important concept in analysis and probability just in general, theoretically. And there's a name for this Z, for this changing measure. If Z exists, it's called the Radon-Nikodym derivative. Before doing that, let me talk a little bit more. Suppose P is a probability distribution over Omega. It's a probability distribution. So this is some set, and P describes the probability that you have each element in the set. And you have another probability distribution, P tilde.
We define P and P tilde to be equivalent if P of A is greater than zero if and only if P tilde of A is greater than zero, for all A. These probability distributions describe the probability of the subsets. Think about a very simple case. Omega is equal to 1, 2, and 3. P gives 1/3 probability to 1, 1/3 probability to 2, 1/3 probability to 3. P tilde gives 2/3 probability to 1, 1/6 probability to 2, 1/6 probability to 3. We have two probability distributions over some space. They are equivalent if, whenever you take a subset of your background set-- let's say 1, 2. When A is equal to 1, 2, according to probability distribution P, the probability you fall into this set A is equal to 2/3. According to P tilde, you have 5/6. They're not the same. The probability itself is not the same, but this condition is satisfied: when it's 0, it's 0. And when it's not 0, it's not 0. And you can just check that it's always true, because they're all positive probabilities. On the other hand, if you take instead, say, 1/3 and 0, now you take your A to be 3. Then you have 1/3 against 0. This means, according to probability distribution P, there is some probability that you'll get 3. But according to probability distribution P tilde, you don't have any probability of getting 3. So they're not equivalent in this case. If you think about it, then it's really clear. The theorem says-- this is a very important theorem in analysis, actually. The theorem-- there exists a Z such that P of omega is equal to Z of omega times P tilde of omega if and only if P and P tilde are equivalent. You can change from one probability measure to another probability measure just in terms of multiplication, if and only if they're equivalent. And you can see that it's not the case for this when they're not equivalent. You can't make a zero probability to 1/3 probability by multiplication. So in the finite world this is just a very intuitive theorem, but what this is saying is it's true for all probability spaces. And this Z is called the Radon-Nikodym derivative.
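The finite example from the lecture can be checked directly. The sketch below is only an illustration: the function name `radon_nikodym` is hypothetical, and the second distribution `P_TILDE_BAD = (2/3, 1/3, 0)` is an assumed completion of the "1/3 and 0" case mentioned by the speaker. On a finite set, Z is just the pointwise ratio of the two probabilities, and it fails to exist as soon as P tilde gives probability 0 to an outcome that P charges.

```python
from fractions import Fraction


def radon_nikodym(p, p_tilde):
    """Return Z with p[w] == Z[w] * p_tilde[w] for every outcome w.

    Returns None when no such Z exists, i.e. when p_tilde gives
    probability 0 to an outcome that p gives positive probability.
    """
    z = {}
    for w in p:
        if p_tilde[w] == 0:
            if p[w] != 0:
                return None          # cannot turn 0 into a positive number
            z[w] = Fraction(0)       # value on a null outcome is arbitrary
        else:
            z[w] = p[w] / p_tilde[w]
    return z


# The two distributions on {1, 2, 3} from the lecture.
P = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}
P_TILDE = {1: Fraction(2, 3), 2: Fraction(1, 6), 3: Fraction(1, 6)}

# Assumed non-equivalent example: P tilde gives probability 0 to outcome 3.
P_TILDE_BAD = {1: Fraction(2, 3), 2: Fraction(1, 3), 3: Fraction(0)}
```

Calling `radon_nikodym(P, P_TILDE)` gives Z = 1/2 on outcome 1 and Z = 2 on outcomes 2 and 3, while `radon_nikodym(P, P_TILDE_BAD)` returns `None`.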
Our question is, are these two Brownian motions equivalent? The paths that this Brownian motion without drift takes and the Brownian motion with drift takes-- are they kind of the same but just skewed in distribution, or are they really fundamentally different? That's the question. And what Girsanov's theorem says is that they are equivalent. To me, it came as a little bit nonintuitive. I would imagine that it's not equivalent, these two. These paths have a very natural tendency. As it goes to infinity, these paths and these paths will really look a lot different, because when you go really, really far, the paths which have drift will be just really close to your line mu of t, while the paths which don't have drift will be really close to the x axis. But still, they are equivalent. You can change from one to another. I'll just state that theorem without proof. And this will also be used in pricing theory. I'm not an expert enough to tell why, but basically what it's saying is, you switch some stochastic process into a stochastic process without drift, thus making it into a martingale. And martingale has a lot of meaning in pricing theory, as you'll see. There's also an application for it. That's why I'm trying to cover it, although it's quite a technical theorem. Try to remember at least the statement and the spirit of what it means. It just means these two are equivalent, you can change from one to another by a multiplicative function. Let me just state it in a simple form. GUEST SPEAKER: If I could just interject a comment. PROFESSOR: Sure. GUEST SPEAKER: With these changes of measure, it turns out that all of these theories with continuous time processes should have an interpretation if you've discretized time, and should consider sort of a finer and finer discretization of the process.
And with this change of measure, if you consider problems in discrete stochastic processes like random walks, basically how-- say if you're gambling against a casino or against another player, and you look at how your winnings evolve as a random walk, depending on your odds, your odds could be that you will tend to lose. So there's basically a drift in your wealth as this random process evolves. You can transform that process, basically by taking out your expected losses, to a process which has zero change in expectation. And so you can convert these gambling problems where there's drift to a version where the process, essentially, has no drift and is a martingale. And the martingale theory in stochastic process courses is very, very powerful. There's martingale convergence theorems. So you know that the limit of the martingale is-- there's a convergence of the process, and that applies here as well. PROFESSOR: You will see some surprising applications. GUEST SPEAKER: Yeah. PROFESSOR: And try to at least digest the statement. That way, when the guest speaker comes and says by Girsanov's theorem, you actually know what it is. There's a spirit. This is a very simple version. There's a lot of complicated versions, but let me just do it. So P is a probability distribution over paths from the interval 0, T to the reals. What this means is just paths from that stochastic process defined from time 0 to time T. These are paths defined by a Brownian motion with drift mu. And then P tilde is a probability distribution defined by Brownian motion without drift. Then P and P tilde are equivalent. Not only are they equivalent, we can actually compute their Radon-Nikodym derivative. And the Radon-Nikodym derivative Z which is defined as T of-- which we denote like this has this nice form. That's a nice closed form. Let me just tell you a few implications of this. Now, assume you have some, let's say, value of your portfolio over time. That's the stochastic process.
And you measure it according to this probability distribution. Let's say it depends on some stock price as the stock price is modeled using a Brownian motion with drift. What this is saying is, now, instead of computing this expectation in your probability space-- so this is defined over the probability space P, our sigma [INAUDIBLE] P defined by this probability distribution. You can instead compute it in-- you can compute as expectation in a different probability space. You transform the problems about Brownian motion with drift into a problem about Brownian motion without a drift. And the reason I have Z tilde instead of Z here is because I flipped. What you really should have is Z tilde here as expectation of Z. If you want to use this Z. I don't expect you to really be able to do computations and do that just by looking at this theorem once. Just really trying to digest what it means and understand the flavor of it, that you can transform problems in one probability space to another probability space. And you can actually do that when the two distributions are defined by Brownian motions when one has drift and one doesn't have a drift. How we're going to use it is we're going to transform a non-martingale process into a martingale process. When you change into martingale it has very good physical meanings to it. That's it for today. And you only have one more math lecture remaining and maybe one or two homeworks, but if you have two, the second one won't be that long. And you'll have a lot of guest lectures, exciting guest lectures, so try not to miss them.
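The change of measure described in the lecture can be illustrated numerically. The sketch below rests on stated assumptions: the transcript does not reproduce the formula on the board, so the density used here, exp(mu * w(T) - mu^2 * T / 2) for the law with drift mu relative to the driftless law, is the standard Cameron-Martin/Girsanov form and may be the reciprocal of the Z the lecturer wrote. The function name `reweighted_expectation` is hypothetical. Because this density depends on the path only through its endpoint w(T), sampling endpoints of driftless paths is enough.

```python
import math
import random


def reweighted_expectation(f, mu, T=1.0, n_samples=20_000, seed=3):
    """Estimate E[f(X_T)] for X_t = W_t + mu * t using only driftless samples.

    Each driftless endpoint w is weighted by z = exp(mu * w - mu**2 * T / 2),
    the assumed density of the drifted law with respect to the driftless law.
    Returns (weighted estimate of E[f(X_T)], average weight).
    """
    rng = random.Random(seed)
    total = 0.0
    total_weight = 0.0
    for _ in range(n_samples):
        w = rng.gauss(0.0, math.sqrt(T))            # endpoint of a driftless path
        z = math.exp(mu * w - 0.5 * mu * mu * T)    # change-of-measure weight
        total += z * f(w)
        total_weight += z
    return total / n_samples, total_weight / n_samples
```

With `f = lambda x: x` and `mu = 0.5`, the weighted estimate should be close to mu * T = 0.5, the mean of the drifted process at time T, and the average weight should be close to 1, since the density integrates to 1.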

Notation

The process Y defined before as

$$Y_t = \int_0^t H\,dX \equiv \int_0^t H_s\,dX_s$$

is itself a stochastic process with time parameter t, which is also sometimes written as Y = H · X (Rogers & Williams 2000). Alternatively, the integral is often written in differential form dY = H dX, which is equivalent to $Y - Y_0 = H \cdot X$. As Itô calculus is concerned with continuous-time stochastic processes, it is assumed that an underlying filtered probability space is given

$$(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, \mathbb{P}).$$

The σ-algebra $\mathcal{F}_t$ represents the information available up until time t, and a process X is adapted if $X_t$ is $\mathcal{F}_t$-measurable. A Brownian motion B is understood to be an $\mathcal{F}_t$-Brownian motion, which is just a standard Brownian motion with the properties that $B_t$ is $\mathcal{F}_t$-measurable and that $B_{t+s} - B_t$ is independent of $\mathcal{F}_t$ for all s, t ≥ 0 (Revuz & Yor 1999).

Integration with respect to Brownian motion

The Itô integral can be defined in a manner similar to the Riemann–Stieltjes integral, that is as a limit in probability of Riemann sums; such a limit does not necessarily exist pathwise. Suppose that B is a Wiener process (Brownian motion) and that H is a right-continuous (càdlàg), adapted and locally bounded process. If $\{\pi_n\}$ is a sequence of partitions of [0, t] with mesh width going to zero, then the Itô integral of H with respect to B up to time t is a random variable

$$\int_0^t H\,dB = \lim_{n\to\infty} \sum_{[t_{i-1},\,t_i] \in \pi_n} H_{t_{i-1}}\,(B_{t_i} - B_{t_{i-1}}).$$

It can be shown that this limit converges in probability.
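As a rough numerical illustration of these Riemann sums, the sketch below takes H = B on a uniform partition of [0, t]. The function name `left_and_right_sums` and the choice of partition are assumptions made for the example. It compares evaluating the integrand at the left endpoint (the Itô choice) with the right endpoint; the two sums differ by the sum of squared increments, which is close to t.

```python
import math
import random


def left_and_right_sums(t=1.0, n=10_000, seed=0):
    """Riemann sums for the integral of B against itself on [0, t].

    Returns (B_t, left_sum, right_sum): left_sum evaluates the integrand
    at the left end of each subinterval, right_sum at the right end.
    """
    rng = random.Random(seed)
    dt = t / n
    b = 0.0            # current value of the simulated Brownian path
    left_sum = 0.0
    right_sum = 0.0
    for _ in range(n):
        db = rng.gauss(0.0, math.sqrt(dt))   # increment B_{t_i} - B_{t_{i-1}}
        left_sum += b * db                    # integrand at the left endpoint
        right_sum += (b + db) * db            # integrand at the right endpoint
        b += db
    return b, left_sum, right_sum
```

For a fine partition, `left_sum` should be close to (B_t^2 - t) / 2, the known value of the Itô integral of B with respect to itself, whereas `right_sum` is larger by roughly t. This is why the choice of evaluation point matters.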

For some applications, such as martingale representation theorems and local times, the integral is needed for processes that are not continuous. The predictable processes form the smallest class that is closed under taking limits of sequences and contains all adapted left-continuous processes. If H is any predictable process such that $\int_0^t H^2\,ds < \infty$ for every t ≥ 0 then the integral of H with respect to B can be defined, and H is said to be B-integrable. Any such process can be approximated by a sequence $H_n$ of left-continuous, adapted and locally bounded processes, in the sense that

$$\int_0^t (H - H_n)^2\,ds \to 0$$

in probability. Then, the Itô integral is

$$\int_0^t H\,dB = \lim_{n\to\infty} \int_0^t H_n\,dB$$

where, again, the limit can be shown to converge in probability. The stochastic integral satisfies the Itô isometry

$$\mathbb{E}\left[\left(\int_0^t H_s\,dB_s\right)^2\right] = \mathbb{E}\left[\int_0^t H_s^2\,ds\right]$$

which holds when H is bounded or, more generally, when the integral on the right hand side is finite.
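The isometry can be sanity-checked by simulation. The following is a minimal Monte Carlo sketch, assuming the integrand H = B and a uniform time grid; the function name `ito_isometry_check` and all parameter values are illustrative. For this choice both sides equal t^2 / 2 in the continuous limit.

```python
import math
import random


def ito_isometry_check(t=1.0, n_steps=100, n_paths=5000, seed=1):
    """Monte Carlo estimates of both sides of the Itô isometry for H = B.

    Returns (lhs, rhs), where lhs estimates E[(int_0^t B dB)^2] and
    rhs estimates E[int_0^t B^2 ds], both using left-endpoint sums.
    """
    rng = random.Random(seed)
    dt = t / n_steps
    sd = math.sqrt(dt)
    lhs = 0.0
    rhs = 0.0
    for _ in range(n_paths):
        b = 0.0
        stochastic_integral = 0.0
        time_integral = 0.0
        for _ in range(n_steps):
            db = rng.gauss(0.0, sd)
            stochastic_integral += b * db    # left-endpoint Itô sum
            time_integral += b * b * dt      # Riemann sum of B^2
            b += db
        lhs += stochastic_integral ** 2
        rhs += time_integral
    return lhs / n_paths, rhs / n_paths
```

With t = 1 both estimates should land near 0.5, up to Monte Carlo error and a small discretization bias.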

Itô processes

A single realization of an Itô process with μ = 0 and σ = ψ(t − 5), where ψ is the Ricker wavelet. Away from the peak of the wavelet, where σ is close to zero, the path of the Itô process stays nearly constant.

An Itô process is defined to be an adapted stochastic process that can be expressed as the sum of an integral with respect to Brownian motion and an integral with respect to time,

$$X_t = X_0 + \int_0^t \sigma_s\,dB_s + \int_0^t \mu_s\,ds.$$

Here, B is a Brownian motion and it is required that σ is a predictable B-integrable process, and μ is predictable and (Lebesgue) integrable. That is,

$$\int_0^t \left(\sigma_s^2 + |\mu_s|\right) ds < \infty$$

for each t. The stochastic integral can be extended to such Itô processes,

$$\int_0^t H\,dX = \int_0^t H_s \sigma_s\,dB_s + \int_0^t H_s \mu_s\,ds.$$

This is defined for all locally bounded and predictable integrands. More generally, it is required that Hσ be B-integrable and Hμ be Lebesgue integrable, so that

$$\int_0^t \left(H^2 \sigma^2 + |H \mu|\right) ds < \infty.$$

Such predictable processes H are called X-integrable.
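An Itô process of this form can be approximated on a time grid by adding a drift increment and a Brownian increment at each step (the Euler–Maruyama scheme, which is not part of the text above). The sketch below makes simplifying assumptions: μ and σ are deterministic functions of time, a special case of predictable processes, and the names `simulate_ito_process` and `sample_mean` are hypothetical.

```python
import math
import random


def simulate_ito_process(x0, mu, sigma, t=1.0, n_steps=100, rng=None):
    """Approximate X_t = x0 + int_0^t sigma dB + int_0^t mu ds on a uniform grid.

    mu and sigma are functions of time only (an assumption of this sketch).
    Returns the simulated value of X at time t.
    """
    rng = rng or random.Random()
    dt = t / n_steps
    x = x0
    for i in range(n_steps):
        s = i * dt                                  # left endpoint of the step
        db = rng.gauss(0.0, math.sqrt(dt))          # Brownian increment
        x += mu(s) * dt + sigma(s) * db
    return x


def sample_mean(x0, mu, sigma, n_paths=5000, seed=2, **kwargs):
    """Average of X_t over independent simulated paths."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        total += simulate_ito_process(x0, mu, sigma, rng=rng, **kwargs)
    return total / n_paths
```

With `mu = lambda s: 0.0` the sample mean stays near `x0`, consistent with the Brownian integral having zero mean; with a constant drift `mu = lambda s: 0.5` it moves to roughly `x0 + 0.5 * t`.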

An important result for the study of Itô processes is Itô's lemma. In its simplest form, for any twice continuously differentiable function f on the reals and Itô process X as described above, it states that f(X) is itself an Itô process satisfying

$$df(X_t) = f'(X_t)\,dX_t + \frac{1}{2} f''(X_t)\,\sigma_t^2\,dt.$$

This is the stochastic calculus version of the change of variables formula and chain rule. It differs from the standard result due to the additional term involving the second derivative of f, which comes from the property that Brownian motion has non-zero quadratic variation.
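As a short worked example of the extra second-derivative term, take f(x) = x² and X = B, a Brownian motion started at 0 (so μ = 0 and σ = 1 in the notation above). The derivation below is a sketch under these assumptions:

```latex
% Itô's lemma with f(x) = x^2 and X = B (so \mu = 0, \sigma = 1, B_0 = 0):
\begin{align*}
d(B_t^2) &= f'(B_t)\,dB_t + \tfrac{1}{2} f''(B_t)\,dt = 2B_t\,dB_t + dt, \\
B_t^2 &= 2\int_0^t B_s\,dB_s + t, \\
\int_0^t B_s\,dB_s &= \tfrac{1}{2}\left(B_t^2 - t\right).
\end{align*}
```

The ordinary chain rule would give B_t²/2 for the integral; the correction −t/2 is exactly the contribution of the quadratic variation of Brownian motion.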

Semimartingales as integrators

The Itô integral is defined with respect to a semimartingale X. These are processes which can be decomposed as X = M + A for a local martingale M and finite variation process A. Important examples of such processes include Brownian motion, which is a martingale, and Lévy processes. For a left continuous, locally bounded and adapted process H the integral H · X exists, and can be calculated as a limit of Riemann sums. Let $\pi_n$ be a sequence of partitions of [0, t] with mesh going to zero,

$$\int_0^t H\,dX = \lim_{n\to\infty} \sum_{t_{i-1},\,t_i \in \pi_n} H_{t_{i-1}}\,(X_{t_i} - X_{t_{i-1}}).$$

This limit converges in probability. The stochastic integral of left-continuous processes is general enough for studying much of stochastic calculus. For example, it is sufficient for applications of Itô's Lemma, changes of measure via Girsanov's theorem, and for the study of stochastic differential equations. However, it is inadequate for other important topics such as martingale representation theorems and local times.

The integral extends to all predictable and locally bounded integrands, in a unique way, such that the dominated convergence theorem holds. That is, if $H_n \to H$ and $|H_n| \le J$ for a locally bounded process J, then

$$\int_0^t H_n\,dX \to \int_0^t H\,dX$$

in probability. The uniqueness of the extension from left-continuous to predictable integrands is a result of the monotone class lemma.

In general, the stochastic integral H · X can be defined even in cases where the predictable process H is not locally bounded. If K = 1 / (1 + |H|) then K and KH are bounded. Associativity of stochastic integration implies that H is X-integrable, with integral H · X = Y, if and only if $Y_0 = 0$ and K · Y = (KH) · X. The set of X-integrable processes is denoted by L(X).

Properties

The following properties can be found in works such as (Revuz & Yor 1999) and (Rogers & Williams 2000):

  • The stochastic integral is a càdlàg process. Furthermore, it is a semimartingale.
  • The discontinuities of the stochastic integral are given by the jumps of the integrator multiplied by the integrand. The jump of a càdlàg process at a time t is $X_t - X_{t-}$, and is often denoted by $\Delta X_t$. With this notation, Δ(H · X) = H ΔX. A particular consequence of this is that integrals with respect to a continuous process are always themselves continuous.
  • Associativity. Let J, K be predictable processes, and K be X-integrable. Then, J is K · X integrable if and only if JK is X-integrable, in which case $J \cdot (K \cdot X) = (JK) \cdot X$.
  • Dominated convergence. Suppose that $H_n \to H$ and $|H_n| \le J$, where J is an X-integrable process. Then $H_n \cdot X \to H \cdot X$. Convergence is in probability at each time t. In fact, it converges uniformly on compact sets in probability.
  • The stochastic integral commutes with the operation of taking quadratic covariations. If X and Y are semimartingales then any X-integrable process will also be [X, Y]-integrable, and [H · X, Y] = H · [X, Y]. A consequence of this is that the quadratic variation process of a stochastic integral is equal to an integral of a quadratic variation process, $[H \cdot X] = H^2 \cdot [X]$.

Integration by parts

As with ordinary calculus, integration by parts is an important result in stochastic calculus. The integration by parts formula for the Itô integral differs from the standard result due to the inclusion of a quadratic covariation term. This term comes from the fact that Itô calculus deals with processes with non-zero quadratic variation, which only occurs for infinite variation processes (such as Brownian motion). If X and Y are semimartingales then

$$X_t Y_t = X_0 Y_0 + \int_0^t X_{s-}\,dY_s + \int_0^t Y_{s-}\,dX_s + [X, Y]_t,$$

where [X, Y] is the quadratic covariation process.

The result is similar to the integration by parts theorem for the Riemann–Stieltjes integral but has an additional quadratic variation term.
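The role of the covariation term can be seen on a discrete grid, where the product rule holds exactly once the sum of products of increments is included. The sketch below is an illustration under stated assumptions: X and Y are two Brownian motions with correlation `rho`, for which [X, Y]_t = rho * t is the standard result, and the function name `integration_by_parts_terms` is hypothetical.

```python
import math
import random


def integration_by_parts_terms(rho=0.6, t=1.0, n=10_000, seed=4):
    """Discrete terms of the integration by parts formula for correlated
    Brownian motions X and Y, both started at 0.

    Returns (X_t * Y_t, sum of X dY, sum of Y dX, sum of dX dY), with the
    integrands evaluated at the left endpoint of each step.
    """
    rng = random.Random(seed)
    sd = math.sqrt(t / n)
    x = y = 0.0
    int_x_dy = int_y_dx = covariation = 0.0
    for _ in range(n):
        g1 = rng.gauss(0.0, 1.0)
        g2 = rng.gauss(0.0, 1.0)
        dx = sd * g1
        dy = sd * (rho * g1 + math.sqrt(1.0 - rho * rho) * g2)
        int_x_dy += x * dy          # approximates the integral of X dY
        int_y_dx += y * dx          # approximates the integral of Y dX
        covariation += dx * dy      # approximates [X, Y]_t
        x += dx
        y += dy
    return x * y, int_x_dy, int_y_dx, covariation
```

The first returned value equals the sum of the other three up to floating-point rounding, and the last one is close to `rho * t`. Dropping it, as the classical formula would, leaves a discrepancy that does not vanish as the grid is refined.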

Itô's lemma

Itô's lemma is the version of the chain rule or change of variables formula which applies to the Itô integral. It is one of the most powerful and frequently used theorems in stochastic calculus. For a continuous n-dimensional semimartingale $X = (X^1, \ldots, X^n)$ and twice continuously differentiable function f from $\mathbb{R}^n$ to $\mathbb{R}$, it states that f(X) is a semimartingale and,

$$df(X_t) = \sum_{i=1}^n \frac{\partial f}{\partial x_i}(X_t)\,dX^i_t + \frac{1}{2} \sum_{i,j=1}^n \frac{\partial^2 f}{\partial x_i\,\partial x_j}(X_t)\,d[X^i, X^j]_t.$$

This differs from the chain rule used in standard calculus due to the term involving the quadratic covariation $[X^i, X^j]$. The formula can be generalized to include an explicit time-dependence in f and in other ways (see Itô's lemma).

Martingale integrators

Local martingales

An important property of the Itô integral is that it preserves the local martingale property. If M is a local martingale and H is a locally bounded predictable process then H · M is also a local martingale. For integrands which are not locally bounded, there are examples where H · M is not a local martingale. However, this can only occur when M is not continuous. If M is a continuous local martingale then a predictable process H is M-integrable if and only if

$$\int_0^t H^2\,d[M] < \infty$$

for each t, and H · M is always a local martingale.

The most general statement for a discontinuous local martingale M is that if $(H^2 \cdot [M])^{1/2}$ is locally integrable then H · M exists and is a local martingale.

Square integrable martingales

For bounded integrands, the Itô stochastic integral preserves the space of square integrable martingales, which is the set of càdlàg martingales M such that $\mathbb{E}[M_t^2]$ is finite for all t. For any such square integrable martingale M, the quadratic variation process [M] is integrable, and the Itô isometry states that

$$\mathbb{E}\left[(H \cdot M_t)^2\right] = \mathbb{E}\left[\int_0^t H^2\,d[M]\right].$$

This equality holds more generally for any martingale M such that $H^2 \cdot [M]_t$ is integrable. The Itô isometry is often used as an important step in the construction of the stochastic integral, by defining H · M to be the unique extension of this isometry from a certain class of simple integrands to all bounded and predictable processes.

p-Integrable martingales

For any p > 1, and bounded predictable integrand, the stochastic integral preserves the space of p-integrable martingales. These are càdlàg martingales such that $\mathbb{E}(|M_t|^p)$ is finite for all t. However, this is not always true in the case where p = 1. There are examples of integrals of bounded predictable processes with respect to martingales which are not themselves martingales.

The maximum process of a càdlàg process M is written as $M^*_t = \sup_{s \le t} |M_s|$. For any p ≥ 1 and bounded predictable integrand, the stochastic integral preserves the space of càdlàg martingales M such that $\mathbb{E}[(M^*_t)^p]$ is finite for all t. If p > 1 then this is the same as the space of p-integrable martingales, by Doob's inequalities.

The Burkholder–Davis–Gundy inequalities state that, for any given p ≥ 1, there exist positive constants c, C that depend on p, but not on M or on t, such that

$$c\,\mathbb{E}\left[[M]_t^{p/2}\right] \le \mathbb{E}\left[(M^*_t)^p\right] \le C\,\mathbb{E}\left[[M]_t^{p/2}\right]$$

for all càdlàg local martingales M. These are used to show that if $(M^*_t)^p$ is integrable and H is a bounded predictable process then

$$\mathbb{E}\left[\left((H \cdot M)^*_t\right)^p\right] \le C\,\mathbb{E}\left[(H^2 \cdot [M]_t)^{p/2}\right] < \infty$$

and, consequently, H · M is a p-integrable martingale. More generally, this statement is true whenever $(H^2 \cdot [M])^{p/2}$ is integrable.

Existence of the integral

Proofs that the Itô integral is well defined typically proceed by first looking at very simple integrands, such as piecewise constant, left continuous and adapted processes where the integral can be written explicitly. Such simple predictable processes are linear combinations of terms of the form $H_t = A\,1_{\{t > T\}}$ for stopping times T and $\mathcal{F}_T$-measurable random variables A, for which the integral is

$$H \cdot X_t \equiv 1_{\{t > T\}}\,A\,(X_t - X_T).$$

This is extended to all simple predictable processes by the linearity of H · X in H.

For a Brownian motion B, the property that it has independent increments with zero mean and variance $\operatorname{Var}(B_t) = t$ can be used to prove the Itô isometry for simple predictable integrands,

$$\mathbb{E}\left[(H \cdot B_t)^2\right] = \mathbb{E}\left[\int_0^t H_s^2\,ds\right].$$

By a continuous linear extension, the integral extends uniquely to all predictable integrands satisfying

$$\mathbb{E}\left[\int_0^t H^2\,ds\right] < \infty,$$

in such a way that the Itô isometry still holds. It can then be extended to all B-integrable processes by localization. This method allows the integral to be defined with respect to any Itô process.

For a general semimartingale X, the decomposition X = M + A into a local martingale M plus a finite variation process A can be used. Then, the integral can be shown to exist separately with respect to M and A and combined using linearity, H · X = H · M + H · A, to get the integral with respect to X. The standard Lebesgue–Stieltjes integral allows integration to be defined with respect to finite variation processes, so the existence of the Itô integral for semimartingales will follow from any construction for local martingales.

For a càdlàg square integrable martingale M, a generalized form of the Itô isometry can be used. First, the Doob–Meyer decomposition theorem is used to show that a decomposition $M^2 = N + \langle M \rangle$ exists, where N is a martingale and $\langle M \rangle$ is a right-continuous, increasing and predictable process starting at zero. This uniquely defines $\langle M \rangle$, which is referred to as the predictable quadratic variation of M. The Itô isometry for square integrable martingales is then

$$\mathbb{E}\left[(H \cdot M_t)^2\right] = \mathbb{E}\left[\int_0^t H_s^2\,d\langle M \rangle_s\right],$$

which can be proved directly for simple predictable integrands. As with the case above for Brownian motion, a continuous linear extension can be used to uniquely extend to all predictable integrands satisfying $\mathbb{E}[H^2 \cdot \langle M \rangle_t] < \infty$. This method can be extended to all local square integrable martingales by localization. Finally, the Doob–Meyer decomposition can be used to decompose any local martingale into the sum of a local square integrable martingale and a finite variation process, allowing the Itô integral to be constructed with respect to any semimartingale.

Many other proofs exist which apply similar methods but which avoid the need to use the Doob–Meyer decomposition theorem, such as the use of the quadratic variation [M] in the Itô isometry, the use of the Doléans measure for submartingales, or the use of the Burkholder–Davis–Gundy inequalities instead of the Itô isometry. The latter applies directly to local martingales without having to first deal with the square integrable martingale case.

Alternative proofs exist only making use of the fact that X is càdlàg, adapted, and the set $\{H \cdot X_t : |H| \le 1 \text{ is simple previsible}\}$ is bounded in probability for each time t, which is an alternative definition for X to be a semimartingale. A continuous linear extension can be used to construct the integral for all left-continuous and adapted integrands with right limits everywhere (càglàd or L-processes). This is general enough to be able to apply techniques such as Itô's lemma (Protter 2004). Also, a Khintchine inequality can be used to prove the dominated convergence theorem and extend the integral to general predictable integrands (Bichteler 2002).

Differentiation in Itô calculus

The Itô calculus is first and foremost defined as an integral calculus as outlined above. However, there are also different notions of "derivative" with respect to Brownian motion:

Malliavin derivative

Malliavin calculus provides a theory of differentiation for random variables defined over Wiener space, including an integration by parts formula (Nualart 2006).

Martingale representation

The following result allows martingales to be expressed as Itô integrals: if M is a square-integrable martingale on a time interval [0, T] with respect to the filtration generated by a Brownian motion B, then there is a unique adapted square integrable process α on [0, T] such that

$$M_t = M_0 + \int_0^t \alpha_s\,dB_s$$

almost surely, and for all t ∈ [0, T] (Rogers & Williams 2000, Theorem 36.5). This representation theorem can be interpreted formally as saying that α is the "time derivative" of M with respect to Brownian motion B, since α is precisely the process that must be integrated up to time t to obtain $M_t - M_0$, as in deterministic calculus.

Itô calculus for physicists

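In physics, usually stochastic differential equations (SDEs), such as Langevin equations, are used, rather than stochastic integrals. Here an Itô stochastic differential equation (SDE) is often formulated via

$$\dot{x}_k = h_k + g_{kl}\,\xi_l,$$

where $\xi_l$ is Gaussian white noise with

$$\langle \xi_k(t_1)\,\xi_l(t_2) \rangle = \delta_{kl}\,\delta(t_1 - t_2)$$

and Einstein's summation convention is used.

If $y = y(x_k)$ is a function of the $x_k$, then Itô's lemma has to be used:

$$\dot{y} = \frac{\partial y}{\partial x_j}\,\dot{x}_j + \frac{1}{2}\,\frac{\partial^2 y}{\partial x_k\,\partial x_l}\,g_{km}\,g_{lm}.$$

An Itô SDE as above also corresponds to a Stratonovich SDE which reads

$$\dot{x}_k = h_k + g_{kl}\,\xi_l - \frac{1}{2}\,\frac{\partial g_{kl}}{\partial x_m}\,g_{ml}.$$
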
SDEs frequently occur in physics in Stratonovich form, as limits of stochastic differential equations driven by colored noise if the correlation time of the noise term approaches zero. For a recent treatment of different interpretations of stochastic differential equations see for example (Lau & Lubensky 2007).
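A one-dimensional example may make the difference between the two forms concrete. It is a sketch under assumptions not taken from the text: the drift is h = 0, the noise amplitude is g(x) = σx with constant σ and x > 0, and the standard Itô-to-Stratonovich drift correction −½ g ∂g/∂x is used.

```latex
% One-dimensional example: h = 0, g(x) = \sigma x (an illustrative choice), x > 0.
\begin{align*}
\text{Itô form:} \quad & \dot{x} = \sigma x\,\xi, \\
\text{Stratonovich form:} \quad & \dot{x} = -\tfrac{1}{2}\,\frac{\partial g}{\partial x}\,g + g\,\xi
  = -\tfrac{1}{2}\sigma^2 x + \sigma x\,\xi, \\
\text{Itô's lemma for } y = \ln x: \quad & \dot{y} = \frac{\dot{x}}{x} + \tfrac{1}{2}\left(-\frac{1}{x^2}\right)\sigma^2 x^2
  = \sigma\,\xi - \tfrac{1}{2}\sigma^2 .
\end{align*}
```

The same process therefore has no drift when written in Itô form and a drift of −½σ²x when written in Stratonovich form, so an SDE taken from the physics literature is only fully specified once its interpretation is stated.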

References

  • Bichteler, Klaus (2002), Stochastic Integration With Jumps (1st ed.), Cambridge University Press, ISBN 0-521-81129-5
  • Cohen, Samuel; Elliott, Robert (2015), Stochastic Calculus and Applications (2nd ed.), Birkhäuser, ISBN 978-1-4939-2867-5
  • Kleinert, Hagen (2004), Path Integrals in Quantum Mechanics, Statistics, Polymer Physics, and Financial Markets (4th ed.), Singapore: World Scientific, ISBN 981-238-107-4. Fifth edition available online, with generalizations of Itô's lemma for non-Gaussian processes.
  • He, Sheng-wu; Wang, Jia-gang; Yan, Jia-an (1992), Semimartingale Theory and Stochastic Calculus, Science Press, CRC Press Inc., ISBN 978-0849377150
  • Karatzas, Ioannis; Shreve, Steven (1991), Brownian Motion and Stochastic Calculus (2nd ed.), Springer, ISBN 0-387-97655-8
  • Lau, Andy; Lubensky, Tom (2007), "State-dependent diffusion", Phys. Rev. E, 76 (1): 011123, arXiv:0707.2234, Bibcode:2007PhRvE..76a1123L, doi:10.1103/PhysRevE.76.011123, PMID 17677426
  • Nualart, David (2006), The Malliavin calculus and related topics, Springer, ISBN 3-540-28328-5
  • Øksendal, Bernt K. (2003), Stochastic Differential Equations: An Introduction with Applications, Berlin: Springer, ISBN 3-540-04758-1
  • Protter, Philip E. (2004), Stochastic Integration and Differential Equations (2nd ed.), Springer, ISBN 3-540-00313-4
  • Revuz, Daniel; Yor, Marc (1999), Continuous martingales and Brownian motion, Berlin: Springer, ISBN 3-540-57622-3
  • Rogers, Chris; Williams, David (2000), Diffusions, Markov processes and martingales - Volume 2: Itô calculus, Cambridge: Cambridge University Press, ISBN 0-521-77593-0
  • Mathematical Finance Programming in TI-Basic, which implements Ito calculus for TI-calculators.
This page was last edited on 31 March 2024, at 17:33
Basis of this page is in Wikipedia. Text is available under the CC BY-SA 3.0 Unported License. Non-text media are available under their specified licenses. Wikipedia® is a registered trademark of the Wikimedia Foundation, Inc. WIKI 2 is an independent company and has no affiliation with Wikimedia Foundation.