
From Wikipedia, the free encyclopedia

Computer illustration of HLA-B*3508 with EBV (EPLPQGQLTAY) in the binding pocket.
B*3508-β2MG with bound peptide
2h6p
major histocompatibility complex (human), class I, B35
Alleles B*3501, 3502, 3503, . . .
Structure (See HLA-B)
Symbol(s) HLA-B 2fyy, 2fz3
EBI-HLA B*3501 2h6p, 2cik, 1a1n
EBI-HLA B*3502
EBI-HLA B*3503
EBI-HLA B*3504
EBI-HLA B*3505
EBI-HLA B*3506
EBI-HLA B*3507
EBI-HLA B*3508 2nw3, 2nx5
EBI-HLA B*3509
EBI-HLA B*3510
EBI-HLA B*3511
EBI-HLA B*3512
Locus chr.6 6p21.31

HLA-B35 (B35) is an HLA-B serotype. The serotype identifies the more common HLA-B*35 gene products.[1] B35 is one of the largest B serotype groups; it currently has 97 known nucleotide variants and 86 polypeptide isoforms. (For terminology help see: HLA-serotype tutorial). This variant is particularly susceptible to HIV infection.

YouTube Encyclopedic

  • The Human Genome I
  • Mod-18 Lec-35 Host response mechanisms during infectious diseases -- part 2

Transcription

(English captions by Andrea Matsumoto from the University of Michigan) Let's get going again. I'm very flattered that you're all still here after listening to my voice for two hours. We're going to talk about the genome and I'm going to continue this on Friday morning with a slightly different part of it. What I'm going to talk about, these are what I'm going to cover in this series of two lectures. Hopefully in the past two hours we've kind of given you the background of DNA sequence variation, how we look at it, what it means, the ways that you can get a single gene disorder like sickle-cell anemia or thalassemia, all the different things that can go wrong in the DNA. We're going to talk more globally about the whole genome now and how that evolves and varies and what that means and, in particular, on Friday, how we're using this to detect much more complicated diseases. Pretty much everything I've been talking about so far applies to single gene diseases, what we often call Mendelian disorders. It's a single gene that has something wrong with it. That's sickle-cell anemia and thalassemia and hemophilia and BRCA-1. There are probably about, now I think the catalog's up to about 3 or 4 thousand, diseases we know of that are due to a single gene disorder. There are lots of other diseases that are very common that are much more complicated. It's not one single gene. Diabetes, heart disease, most of what one sees in practice in a general medical practice. That's going to be the subject of Friday. What I want to talk about today, we've talked about some of this, I'm going to reemphasize and go into some more detail about what actually makes up our genome and how it's put together. We're going to talk about recombination. We're going to talk about what I think are the two most conceptually difficult things that we're going to cover, maybe in this whole course but certainly in my lectures. I think they're important concepts to have a general feel for. 
What recombination is, what linkage is, what linkage disequilibrium is. I'm guessing that this makes absolutely no sense to you all now, sounds like complete gibberish, and I'm hoping at the end of this next hour or 45 minutes this will sound at least vaguely familiar. We already talked about all this. DNA sequence variation, there's a lot of it. 0.1% between any two humans, 1-2% if we compare ourselves to chimps. You all are familiar with polymorphisms, you understand the distinction. This is really important. Any change in the ancestral DNA sequence is a mutation. The millions and millions of variations we all have, all arose as a mutation at some point. Everything different between us and chimps is a mutation in either us or the chimp. Anytime the sequence is different it arose from a mutation. Mutation is not synonymous with bad, causes disease. Without mutation we would all still be bacteria or something in the primordial soup. Mutation is not bad though it's commonly used that way. This is a very common misuse to say when we do this BRCA sequence in our patient is this a mutation or is this a polymorphism? The idea being polymorphism is benign, mutation is bad. That's wrong. Polymorphic just simply means it's common. Mutation, everything's a mutation. What you really want to know is if it's a disease causing variant or a non-disease causing variant. Is it common, a polymorphism, or not? Okay. To reemphasize, to beat this point a bit more, a normal change if it's very common is a polymorphism. If it's rare it's a private polymorphism or a rare variant. If it causes disease it's a disease causing variant or disease causing mutation. If it's common it could still be disease causing but now it's a polymorphism as well. This is a little bit of history but I think it's informative. What I'm going to talk about now is linkage. This is less important than it used to be but it's an important concept, I think, for you to understand. 
I spend much less time talking about this than I used to because it's becoming more obscure. I think you really should understand the concept. This is basically the idea that if you look in our genome at genes or parts of the chromosomes that are near each other, they tend to get inherited together. We can use inheritance of some marker in the DNA to keep track of some other gene that's causing a trait. This is actually the first example of this from 1968. This was a heteromorphism on chromosome one. As I was telling you, we used to do cytogenetics, we still do it somewhat, but it's becoming used less often. Take a cell, spread open the chromosomes during metaphase, stain them, and you can look at them like this. The largest chromosome is chromosome one, the next largest is chromosome two, and chromosome 22 is the smallest, and then you have the X and the Y. We just numbered them based on size and here's what they look like. Well Donahue observed that if you looked at people, a lot of normal people, you found some people had this funny thing on chromosome one. Part of the DNA is heteromorphic, it's spread out a little bit, what's that about? He observed in this family that if you had this heteromorphism, that's the people colored in yellow, this got inherited along with the Duffy blood group. Which is a blood group with a subtle difference common polymorphism on the surface of red blood cells. That's a place, by the way, where polymorphisms are really important because, ABO for example is a common polymorphism on your red blood cells. If you have type O or type A you're perfectly fine. But if you're type O you will have antibodies against A or B and if we give you a transfusion from somebody who's type A you will have a transfusion reaction, which could be fatal. So when we give a transfusion we don't just give a transfusion to anybody. We have to match it for a whole bunch of polymorphisms. 
That's a place where being polymorphically different, even though they're not disease causing, it's really important in the way we do medicine. This is one of those blood groups and look at this. This guy has Duffy A and Duffy B and mom is BB. Both kids must've gotten B from mom so they had to get A from dad and they both got the heteromorphism. This child got A from dad because he's AA and also got the heteromorphism. Basically, everyone in this family who got the Duffy A allele from a parent who had the heteromorphism got the heteromorphism along with it. They're being inherited together, that's linkage. This heteromorphism is linked to the Duffy blood group. Does this heteromorphism cause the Duffy blood group? What do you think? Nope. We can prove it because we can find a lot of people who have Duffy A who don't have the heteromorphism. You don't have to have the heteromorphism to get the Duffy A blood group but within this family they're traveling together. That's what linkage is. Why is it? Why does it occur? It occurs because of this phenomenon of recombination. During meiosis when we make an egg or a sperm, we line up together the maternal copy of each chromosome and the paternal copy of each chromosome. They pair up very closely. They form something called a synaptonemal complex where the two strands are right next to each other and exchange occurs where you shuffle a little bit. A bit of Dad's chromosome one gets swapped for Mom's chromosome one by recombination. Just like we showed in the alpha globin example where you got a deletion. That was a wrong recombination but normally the sequences line up perfectly and you just exchange from one to the other. This happens all the time. It happens on average in one meiosis, in one set of chromosomes that you get from your mom or dad, on average it's at least once per chromosome, typically this is happening. 
So if we look at you and we look at your mom or your dad, you'll have one chromosome that came from dad, one that came from mom. But, the one that came from dad will have part of one of his mother's chromosomes, and part of his father's chromosome and the same for the one you got from mom. We're always doing a little bit of shuffling. This is another way we generate diversity among us during evolution. So this is going to occur at least once, sometimes a couple times along the chromosome. Now imagine that you have two different genes that are fairly near each other on the chromosome. Depending on where that recombination occurs... Let's take this example where they're fairly far apart. This chromosome has big A and big B. This chromosome has little A and little B. They're fairly far apart. A recombination has occurred in the middle so we end up with these chromosomes. You either got the part that didn't recombine, big A with big B. Or you got a recombination and you got a chromosome that's got big A and little B. Or you got the other one that's got little A and big B. Or you got the little A little B one that did not recombine. Does everyone follow this? Well, what if A and B are really close to each other and the recombination occurred above it so that they got swapped together? Chromosomes look exactly the same. You still have these mixed chromosomes but if we look at the genes, big A and big B and little A and little B always stay together. That's linkage. Generally what happens is the farther apart the genes are the more likely they are to recombine during meiosis and the closer together they are the less likely they are to recombine during meiosis. Does that make sense? This is a very tough concept but it's a really important one. It turns out we know, from years of genetics studies, roughly the frequency with which this happens, the chance of two genes recombining with each other. 
The chance of recombination occurring between these two genes is roughly 1% for every million base pairs. So, if two genes are a million base pairs apart, 1% of the time they're going to recombine. Two centimorgans apart, 2% of the time they're going to recombine. If they're 50 centimorgans, 50% of the time they're going to recombine. What's the most they could ever recombine? 50% because if it was more than that you end up coming back. 50% is the maximum you approach and that's the furthest. You do two opposites the most you can do is recombine so they get totally randomly scrambled and it's 50%. The two ends of the chromosome are far enough apart they get inherited independently just like two different chromosomes get inherited. It's not an absolute thing; across the genome there are some places that recombine more than others but, statistically on average, it's 1% per million base pairs. Okay, so let's take a look here. Here we have disease D, could be any disease, and we have a marker, a gene nearby, a SNP (single-nucleotide polymorphism), a polymorphism, a common one nearby. You could have either big A or little A. In this particular case, the disease mutation in this father with the disease, an autosomal dominant disease, occurred on a chromosome that happens to have the big A. Now every child is going to get a big A and a normal copy of the gene from the mom and they're either going to get a big A or little A from dad. You'll notice that these three who got the disease all got the disease gene from dad and they got the big A. These three individuals got little A and the normal copy from dad and didn't get the disease, right? It's linkage between this gene and the disease. Here's the exception. What's going on here? This person got a little A but got the disease because there was a recombination between the two. The closer together the less often this recombination is going to occur, the further apart the more often. 
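The rule of thumb in this passage (about 1% recombination per million base pairs, never more than 50%) can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the linear rule is the lecture's average and ignores local hot and cold spots, and the Haldane map function is a standard textbook smoothing that the lecture itself does not mention.

```python
import math

# Lecture's average: roughly 1% recombination (1 centimorgan) per million base pairs.
PERCENT_PER_MB = 1.0


def recombination_fraction(distance_bp):
    """Linear rule of thumb, capped at the 50% maximum described in the lecture."""
    percent = distance_bp / 1_000_000 * PERCENT_PER_MB
    return min(percent / 100.0, 0.5)


def haldane_fraction(distance_bp):
    """Haldane map function (an assumption, not from the lecture).

    It approaches 50% smoothly instead of hitting a hard cap.
    """
    morgans = distance_bp / 1_000_000 * PERCENT_PER_MB / 100.0
    return 0.5 * (1.0 - math.exp(-2.0 * morgans))


def gamete_frequencies(r):
    """Expected gametes from a parent carrying AB on one chromosome and ab on the other."""
    return {
        "AB": (1 - r) / 2,  # non-recombinant
        "ab": (1 - r) / 2,  # non-recombinant
        "Ab": r / 2,        # recombinant
        "aB": r / 2,        # recombinant
    }


if __name__ == "__main__":
    for mb in (1, 2, 50, 200):
        r = recombination_fraction(mb * 1_000_000)
        print(mb, "Mb apart:", r, gamete_frequencies(r))
```

At 50% all four gamete types are equally likely, which is the "inherited independently" case for loci at opposite ends of a chromosome.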
We can use this kind of polymorphic variation to track the inheritance of a disease gene in a family. Even if we have no idea what the disease mutation is. Even if we have no idea what the disease gene is. As long as we know it's a gene and we know this marker is near it we can use it to trace the disease by linkage. In fact, this is how we used to find disease genes when we had no idea what the disease was, what the gene was. This is neurofibromatosis and a DNA polymorphism. This is one of those old RFLPs (restriction fragment length polymorphism) by Southern blotting. It doesn't really matter how you do it, the point is, it's a common polymorphism. You have either this big band or you have the little band in all normal people you could look at anybody and that's what you would see. In this particular family, dad who has the disease has one of each. Mom has two small bands. We look at this kid, he had to get a small from mom in which case he got the big from dad. This kid got the big one from dad. This kid got the little from dad. If you look you'll see that, with one exception I think, everybody here who got the dad's big copy got neurofibromatosis and if they got the little copy they didn't get neurofibromatosis. Out here there's one exception. This guy got neurofibromatosis from dad, has the big copy and little copy, right? And, here the mom who's normal has two big copies. The important thing to remember, whether it's big or little is not the disease gene. Here's somebody who has two bigs and is normal. Here's somebody who has two littles and is normal. That is not the disease gene, it's just what's nearby. In this guy, again, it's the big copy that has the gene and everybody on this side of the family is going to get a big from mom. It's whether they get the big or little from the dad that is going to determine whether they get neurofibromatosis. Here's the one exception. 
This individual who got the small copy, what should've been the normal copy from dad, and also got neurofibromatosis. How did that happen? Recombination between the marker and the disease. Out of all these individuals, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, out of these 17 individuals this happened once. Little over 5% of the time. Turns out this is about 5 million base pairs away and we expect it to happen 5% of the time. Yeah? [Student question] Closer there'd be less recombination. It's if they're further apart, right? Make sense? This was the way positional cloning became possible. You could now take a disease you know absolutely nothing about what caused the disease and simply if you could have a set of these markers that covered all the chromosomes, just look at a family like this and just see of all the markers we're looking at which one tracks best with a disease in a family. The gene, whatever it is, must be near there. That's what positional cloning is all about. Here is the first successful mapping of a human disease gene this way. This is a disease called Huntington's disease. Autosomal dominant. If you get this gene you get the disease. We had no idea at all what the gene is. You've heard about this I think, or you will. This is a terrible neurodegenerative disease. You're perfectly fine and healthy until you're in your 30s or 40s then you get this rapidly progressive movement disorder, dementia, and you die. Not so great. When you're 20-something it doesn't seem so bad. When you're my age it seems really awful. But it's bad and living your whole life, not to make light of it but, living your whole life knowing you've got this hanging over you is obviously pretty terrible. It's a very significant disease and we can diagnose it perfectly now. The gene wasn't known at the time. It's very common in a particular village in Venezuela and this is a family from that village. This is a polymorphic marker that happened to be on chromosome 4. 
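The neurofibromatosis count earlier in this passage (one recombinant among 17 individuals, "little over 5%", about 5 million base pairs) is simple arithmetic. A minimal sketch, assuming every one of the 17 individuals is an informative meiosis and using the lecture's average of 1% per million base pairs; the function names are illustrative only:

```python
def estimate_recombination(recombinants, informative_meioses):
    """Observed recombination fraction between a marker and a disease gene."""
    if informative_meioses <= 0:
        raise ValueError("need at least one informative meiosis")
    return recombinants / informative_meioses


def approx_distance_mb(recombination_fraction, percent_per_mb=1.0):
    """Convert a recombination fraction to a rough distance in millions of base pairs."""
    return recombination_fraction * 100.0 / percent_per_mb


theta = estimate_recombination(1, 17)
print(f"observed recombination: {theta:.1%}")                  # 5.9%
print(f"rough distance: {approx_distance_mb(theta):.1f} Mb")   # 5.9 Mb
```

With only 17 individuals the estimate is noisy, so a result near 6% is consistent with the roughly 5 million base pairs quoted in the lecture.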
You remember I showed you that slide of the markers we knew of on chromosome 4. They only had three markers on chromosome 4. They were incredibly lucky that one of them happened to lie right near that gene. Amazing luck. That's where their luck stopped. They weren't able to find the gene for another 10 years but they were able to figure out where it was. If you look at this family, and I encourage you to go through this on your own with the slides, you can trace where this is being inherited, which allele this is being inherited with. This allowed us to know that the Huntington's disease gene was segregating almost perfectly with this marker, which was G8. The eighth marker that was identified. I remember when I heard about them trying this. This couldn't possibly work. You need hundreds of markers to do that. At the time to have three or four hundred markers in the human genome seemed impossible. Now we have 25 million. At the time it seemed we would never have that many. They only had eight and they tried anyway and they were so lucky that this one happened to fall right next to the gene. So we knew the gene lay on this part of chromosome 4. It meant you could do this typing for this polymorphic marker that's got nothing to do with anything except it happens to be near the Huntington's gene, and you could predict accurately who was going to get Huntington's and who wasn't. The first thing you would do from this approach, by checking markers throughout the genome in a family and just seeing what got inherited along with the disease gene, you could figure out where the gene was. Then you had to do all kinds of really complicated stuff that you don't want to know about because it's ancient history, how many of you can use a slide rule? I didn't think so. This is like a slide rule, how many of you have even heard of a slide rule? Okay, that makes me feel a little better. But, you know, who would ever use that kind of thing now? 
There was a time when this seemed miraculous and it's ancient history. Now we do it all in silico. Once we know where the gene is we can sit down at the computer and look at all the genes that are in that area and figure which ones it might be and start checking them until we find the actual mutation. And, Bingo! Everybody who's got this disease has a mutation on this gene and we know we're there. Cystic Fibrosis was the first gene that was found purely by this kind of approach, using this kind of genetics. It was a huge splash. One of the major leading groups was headed here at the University of Michigan by Francis Collins who's now head of the NIH (National Institutes of Health). He used to give this lecture, by the way, I had to take it over for him when he left. As much as I love giving this lecture I wish he'd stayed. He used to come back for a while and give it. Anyway, Cystic Fibrosis is now routinely screened for, I was mentioning this to somebody earlier. Women who go to the obstetrician, this is now offered as a standard of care if you'd like it. It's found in all human populations. It's particularly high in Europe where one in 25 people are carriers. Again, probably eight or so people in this audience are carriers of CF (cystic fibrosis). It's an autosomal recessive disease. Being a carrier, like sickle-cell, is fine. It's only when you have two copies you have the disease. We had no clue as to what might be causing that until this methodology made it possible to isolate the gene. This was an enormous effort. I actually pointed out, I think on the Huntington's one... no. Both of these were efforts of huge consortiums of thousands of people and tons of labs around the world working on these things to do it. It's now become almost routine, as I'll show you in a minute. Easier than routine, in fact we don't even bother with most of this anymore. We talked about the different types of variants you can use. 
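The carrier figures quoted for cystic fibrosis can be turned into a short calculation. This sketch assumes random mating and that both parents are unaffected carriers or non-carriers; the audience size of 200 is a hypothetical value chosen because it reproduces the "eight or so" carriers mentioned in the lecture, which does not state the actual head count.

```python
from fractions import Fraction

# From the lecture: about one in 25 people of European ancestry carries a CF variant.
CARRIER_FREQ = Fraction(1, 25)


def expected_carriers(group_size, carrier_freq=CARRIER_FREQ):
    """Expected number of carriers in a group of the given size."""
    return float(group_size * carrier_freq)


def affected_birth_risk(carrier_freq=CARRIER_FREQ):
    """Chance that a child of two randomly chosen parents has the disease.

    Autosomal recessive: both parents must be carriers, and each then
    passes on the disease copy half the time (1/2 * 1/2 = 1/4).
    """
    both_parents_carriers = carrier_freq * carrier_freq
    return both_parents_carriers * Fraction(1, 4)


print(expected_carriers(200))    # 8.0 (hypothetical audience of 200)
print(affected_birth_risk())     # 1/2500
```

Carriers are common while affected births are rare because the disease needs two copies, so the carrier frequency is squared and then quartered.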
I talked about these microsatellites or simple tandem repeats. These were very useful for linkage. Maybe you can see why now because you have so many different alleles. Most people are different if you look at this. There are so many different bands you can get. If you can find three or four hundred of these spaced evenly across the genome like this you are ready to do this. All you needed was the family. If you had a family that had the disease you do this and you can find where the gene is. That was what genetics was all about for the past maybe 10 years. When we found the gene then you had to look for mutations. We talked about the different types of mutations, most are silent, et cetera, all these different variants. We talked about the fact that we now have many millions of these variants. This was hard to do in the days of the Huntington's disease gene. Nowadays you can take a little SNP chip and for a few hundred dollars you can check a million SNPs throughout the genome in everybody. We don't even bother with STRs; this is how everything is done now, with these common single-nucleotide variants. The next huge quantum leap forward for us was the sequence of the human genome. This used to be big news. This is now 11 years old, it's not such big news anymore. This became possible because of sequencing technology. I told you about how when I started out I spent a year and a half sequencing four nucleotides. These automated machines like this, this is one of the early genome centers, could sequence millions of base pairs in a day. This was thousands of people working hard. You could do millions. It still took a long time to do the whole genome. Nowadays as I told you we can do billions in my lab, one person. It's not so hard. At a fraction, a tiny fraction of the cost and I'll show you that in a minute. Anyway this made it possible to sequence large amounts of DNA and to sequence the entire human genome. When I was a medical student we knew that DNA was genetic material. 
We knew there were As, Cs, Gs, and Ts. But we actually weren't exactly sure how big the genome was. We were totally mixed up about that. The idea, if you would have told a group of scientists that someday you would have the entire human genome to look at they would have laughed you out of the room and said it's impossible. But it's been done and it's totally astounding. This was the article in Nature in 2001 when we announced the genome was done. It wasn't really done but it was almost done. This has totally exploded the way we find disease genes. Completely revolutionized it. That work, the Cystic Fibrosis gene, thousands of people in many labs, hundreds of labs around the world working night and day for multiple years to find the gene. Now it's an afternoon in a good lab if you have the right DNA samples. You can see what happened. When I was in medical school the number of diseases for which we knew the cause at a molecular level, the responsible gene, was one, sickle-cell anemia and thalassemia. Look what's happened. This is where the human genome came out and things just really took off. This stops at 2005, it's exploded since then. I'll show you on a later slide that we're now up to about 5 thousand. Basically the bottom line is now if we have a human genetic disease, a single gene disorder, and we have a large enough family, and even that isn't required anymore, where we can see how it's being inherited and we have DNA on those individuals, we can find the gene in a research lab and identify the responsible gene. Then precisely diagnose the disease in individuals in that family and other families. Yeah? [Student question] The question was about multifactorial diseases like Type II Diabetes. It's impacted it in a huge way although it's different and it's been much harder than we thought and our progress there is much, much less. Its impact in the clinic is somewhere between, I would say, zero and none. 
That is the entire subject of my next lecture so we'll talk about that. So we go from having the genome, knowing where the gene is, to finding the single gene that causes the disease. We have this range of genes from one of the littlest, the Globin gene, it's about 15 hundred base pairs, to one of the biggest, Dystrophin, which causes Muscular Dystrophy, 2 million base pairs. And we can figure out how many genes we have. This was a cause of great consternation to many people when we actually looked at the genome because we had all been taught for years and years and years based on old, very definitive experiments that there were 100 thousand genes in the human genome. Turns out we only have 20 thousand. Many people were profoundly upset by this because earthworms also have almost the same number of genes and so does the fly. Most people found this really insulting to our species but we've learned to live with it. The fact is, we really, in terms of the number of genes, we are no more complex than a fly or worm. We seem to have more complex splicing and many people take comfort in that. They think, 'well okay, we don't have more genes than a worm and a fly but ours are more complicated and they do more things.' I'm not really sure that's entirely true either. We don't even have all that much more than a yeast, fair bit more than a bacterium, which is really kind of astounding. It certainly seems like we are more complicated. We have a fairly big genome, it's bigger than some of those species but plants have bigger genomes than we do. That's really distressing to some people. We still like to think we're better than all those species. This is out of date, we now have complete human genome sequences probably in about 10 thousand people. I was just off by a couple of logs here. This is changing so quickly it'll be millions soon. This is also way off, we now have the complete sequence... almost any species anybody's interested in, you just do the whole genome now. 
We do some experiments in my lab where we look at bacteria and mutants of the bacteria and if we want to find the mutation in the bacteria the easiest way now is just sequence the whole bacterial genome. Why bother looking for a single change? Just sequence the whole thing and see what's different. Problem is, the bacterial genome, which is only about 5 million base pairs, it's so tiny that it's a waste to put that on the machine by itself so we'll usually pool together 10 or 20 different ones so we can do them all at once. It's really amazing how this has exploded. What do we know about the human genome, what have we learned? The human genome as you know has 23 pairs of chromosomes, a total of 46 chromosomes which the genes are spread across. A total of 3 billion base pairs, that's the haploid genome. The DNA, you get 3 billion base pairs from mom, 3 billion from dad. A total of 6 billion but it's duplicated so you get 3 billion unique in the haploid genome. A lot of it, most of it, is repetitive sequences, control regions, spaces between genes. We have no idea what most of this does. Really no idea. Some people claim they have an idea, we don't really know what's going on. About 30% is where the genes are. Even the genes, most of it is space, it's introns. With the introns, the Dystrophin gene is 2 million base pairs. Only 20 thousand of that codes for protein. The other 1.98 million is introns. So most of this is introns, space in the genes. Most of which we don't know what that does either. Of this, again, only 1-1.5% is actually the coding sequence. If we look across all the species, if we take our sequence and line it up with chimps and dogs and cats and mice and worms and flies we find very highly conserved regions. Including most all the genes. The exons are pretty highly conserved, they really stick out. 
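The genome-size figures in this passage are easy to check with integer arithmetic. A small sketch using only the round numbers quoted in the lecture (3 billion base pairs per haploid genome, a 2 million base pair Dystrophin gene with about 20 thousand coding base pairs); the helper name is illustrative:

```python
HAPLOID_GENOME_BP = 3_000_000_000          # one copy, from mom or from dad
DIPLOID_GENOME_BP = 2 * HAPLOID_GENOME_BP  # both copies together: 6 billion


def gene_breakdown(gene_length_bp, coding_bp):
    """Return (non-coding base pairs, coding fraction) for a single gene."""
    noncoding_bp = gene_length_bp - coding_bp
    return noncoding_bp, coding_bp / gene_length_bp


noncoding, coding_fraction = gene_breakdown(2_000_000, 20_000)
print(DIPLOID_GENOME_BP)           # 6000000000
print(noncoding)                   # 1980000 base pairs of intron sequence
print(f"{coding_fraction:.0%}")    # 1%
```

The 1% coding fraction for Dystrophin lines up with the 1-1.5% figure given for coding sequence across the genome as a whole.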
They're much more the same compared to other... There's a protein we work on in our lab and we were just astounded the other day we looked, we should have known this but, it's a part of the ER (endoplasmic reticulum) and this protein in humans is 50% identical to the protein in yeast. That's a long evolutionary distance. If you look at an intron you won't find any similarity at all to the yeast. There's another 1-1.5% of the genome that's just as highly conserved as the exons but we have no idea what it does. It must be doing something important if it's that highly conserved but it doesn't code for protein that we can tell. We're just starting to figure out what all that does. There's still lots to do. If any of you are actually interested in working on this stuff in a lab there's still plenty of questions to answer. The sequence now, it covers almost all, there are a few parts of the genome that have been very hard to sequence so it's not really really totally complete and there still are some errors, which people kind of assume if it's in the genome it must be right. There's less than one error in 100 thousand, which is astounding. In the old days when we sequenced, if we got 90% accurate we were thrilled. This is the whole genome that has been done at this level but that means there are still some mistakes in there. In 3 billion base pairs you know that's still 30 thousand mistakes. Often it was a big issue, whose genome do you sequence? It was actually a pool of genomes so you couldn't identify the person but, what's normal? Factor V Leiden, I told you about this variant we studied in my lab. 5% of Europeans have it. Causes an increased risk of thrombosis but you still do pretty well if 5% of people have it. Turns out the normal sequence, we just noticed it the other day, the normal sequence in the genome has Factor V Leiden. Even though it's only a 2.5% allele, it's a polymorphism, but it's the rare allele. In many cases it's the rarer allele that's in the index genome. 
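Two numbers in this passage follow from short calculations. The sketch below reads the accuracy figure as roughly one error per 100 thousand bases, which is an interpretation chosen because it reproduces the 30 thousand mistakes quoted for 3 billion base pairs. It also relates the 2.5% Factor V Leiden allele frequency to the 5% of people who carry it, assuming Hardy-Weinberg proportions (an assumption the lecture does not state). Function names are illustrative.

```python
def expected_errors(genome_bp, bases_per_error):
    """Expected number of wrong bases at a rate of one error per `bases_per_error` bases."""
    return genome_bp // bases_per_error


def heterozygote_frequency(allele_freq):
    """Fraction of people carrying exactly one copy, under Hardy-Weinberg (2pq)."""
    return 2 * allele_freq * (1 - allele_freq)


print(expected_errors(3_000_000_000, 100_000))   # 30000
print(expected_errors(3_000_000_000, 10))        # 300000000 at the old 90% accuracy
print(f"{heterozygote_frequency(0.025):.1%}")    # 4.9%, close to the 5% of carriers
```

Because each person has two copies of the gene, an allele present on 2.5% of chromosomes shows up in about twice that fraction of people.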
Important thing to keep in mind. There are still a few holes. This is less than when I made the slide. Wonderful thing, this is all freely available to anybody. If you'd like to read the entire human genome you can just go to (') you can sit there, it would take a while, but you can read through the entire thing. There are browsers and all that make it actually mean something. I told you already, might be because I'm still wrestling with this, we have fewer genes than we thought. We like to think that we make more complicated proteins. I already talked about this. These are the base pairs in GenBank. This is the repository at the NIH, part of the National Library of Medicine, where all the sequence you want to look at you can just freely get to with a web browser any time you'd like. You can see it's grown exponentially and I forget where it is now. This is a terabyte, a thousand gigabytes, and we're actually way over that now. I think it might even be getting close to petabytes, that's the next thing after terabytes. That's a thousand terabytes. It's unbelievable how much sequence there is. This is a wonderful place to go. If anybody's interested or excited about this I would suggest sometime you sit down and you go to the NCBI website. This is part of the National Library of Medicine. This is where GenBank is. You'll know these guys also because they also run PubMed, for most of us it's the most important part of the NIH. PubMed is extremely useful and this is for DNA obviously what PubMed is for everything else for the literature. There's lots of things you can read about here and you can look at. You can look up any gene you want and it will tell you all sorts of things about it. There are other web browsers. This is the one at UC Santa Cruz that many of us use, that researchers kind of like. You can go in there and look at your favorite gene. This is one of mine. 
It's a gene that we work on in our lab that causes a disease called TTP, which we found by positional cloning in ancient times, 10 years ago. Things had already improved by then. I told you how cystic fibrosis was thousands of people working on it night and day for years and years. ADAMTS13 was one Michigan MD PhD student working in my lab for 9 months. Now she could probably do it in about 2 weeks. It's gotten incredibly fast to do this. This is the ADAMTS13 gene and you can see all the other genes around it and what parts are conserved. You can look at all sorts of things. This is a genome browser, extremely useful. It's mainly for research purposes at this point in time. So, we already talked about this linkage concept and I'm just emphasizing it again here because this is so, so fundamentally important. Linkage in this case was because this A and B gene tended to be inherited, whatever genotype you had at A and at B tended to be inherited together. That's linkage. This is now the most difficult concept that I'm going to tell you about. This is linkage disequilibrium. Related to linkage but something a little different. I'm going to spend some time talking about this. If you don't fully get this don't feel too badly about it. It's a difficult concept but I think it's important. What I'm showing you here is the HLA locus. This is one of the most polymorphic parts of our genome. This is the histocompatibility locus, the major histocompatibility complex. It's highly, highly variable. Unless you're related, unless you look at your siblings, most of us have different HLA types than other people. This is part of how our body's immune system tells self from non-self. That's why if we put a skin graft on you from somebody else you reject it. Or if we give you bone marrow from somebody else you reject it because your body says, 'oh wait a second, that's not my HLA type, that's foreign,' and it gets rid of it. So if we do a transplant we type, we try to get a match for HLA. 
It's a very important medical thing. It's incredibly polymorphic and there are a whole bunch of genes in here. One of them I'm going to say more about turns out to be a gene called HFE. It's a gene that causes hemochromatosis, which is a very common human genetic disease. Here's HLA typing and this is the old way that we used to do it but I think it's instructive. There's a bunch of HLA genes in here, A, B, and C, and the D types, and you can type for all of them. There are many different types. You could be A1, 2, 3, 4, 5, you know, 40 or 50 or so different alleles of each one of these. Here's a family and we've typed dad and the kids. So dad, at A, has two alleles, A1 and A29. At B he has B7 and B8. At D he has DR3 and DR4. Here are the types of all the kids. Now, he's got two chromosomes and we don't know whether A1 goes with B7 or B8 or DR3 or DR4. We don't know how they're lined up. That's called phase. This is his genotype. Knowing which ones are on which chromosomes is phase. You can figure that out. Here's how we can deduce that. If we look at daughter number one here she's A2, A29. Which one did she get from dad? A29, so that's here. She's got B7 and B35, which did she get from dad? B7, et cetera. We can tell that this chromosome came from dad. That's told us one of his chromosomes. By deduction, what's left over is the other chromosome of dad's. And, by deduction, her other chromosome must be one of mom's chromosomes. So by that kind of logic, again I would encourage you to go through this on your own with the figure and the handout and make sure you can derive this from this. These are genotypes. This is their phase, what's actually on each chromosome. This is extremely important when we're typing patients to figure out who we want to use for bone marrow transplant. Let's say this kid has leukemia and we want to use one of her siblings as a transplant donor. Who would you pick? 4 or 5 you might say, right? 
They've got the same chromosomes as she did from dad except look at this one. A2, B35, it's supposed to be DR13 but it's DR1; how'd that happen? Recombination, crossing over. Very good, exactly. In mom it just so happened there was a recombination right here and DR1 ended up right here. So this kid is fairly close but one part of the HLA locus is different. This is a better match so that's the person we'd use for the transplant. This is a perfect HLA match. This person is haploidentical, has one chromosome the same. This person is completely not identical. By chance, each of your siblings has a 1 in 4 chance of being HLA identical, a 1 in 4 chance of being completely different, and a 50% chance of being half identical, or haploidentical. Here's a curious fact. Hemochromatosis. Fairly common genetic disease, 1 in about 800 or something. Roughly 8 or 9% of people are carriers, fairly common disease. Curious thing. If you look at people with hemochromatosis, 70% of them are HLA A3. 30% of them have other HLA types. How could that be? In the general population only 10% of people are A3. So does A3 cause hemochromatosis? No, because you could have two A3s and not have it. Do you have to have A3 to have hemochromatosis? No, there are plenty of hemochromatosis patients who don't. So what's going on here? This is linkage disequilibrium. What has happened here is there is a founder effect. Like the sickle thing where there were four different mutations in history that caused sickle-cell anemia, in the case of hemochromatosis there was one founder. That mutation, when it happened, by chance, the new mutation in that individual just happened to occur on a chromosome that had A3. The HFE gene turns out to be very close to HLA A. They tend to be inherited together; recombination between them is very rare, extremely rare. What happens is, even though this happened 20 thousand years ago, over all that time they're still being inherited together. 
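
The phase deduction and the sibling-matching odds described above can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the father's genotype and the daughter's A and B types come from the lecture, but her DR pair (`DR4`, `DR13`) is assumed for illustration, and the names `phase_father` and `sibling_sharing` are hypothetical. Recombination, like the DR1 crossover mentioned above, is ignored.

```python
from itertools import product

# Father's unphased genotype, as given in the lecture.
FATHER = {"A": {"A1", "A29"}, "B": {"B7", "B8"}, "DR": {"DR3", "DR4"}}
# Daughter 1: A and B types are from the lecture; the DR pair is an assumption.
DAUGHTER_1 = {"A": {"A2", "A29"}, "B": {"B7", "B35"}, "DR": {"DR4", "DR13"}}


def phase_father(father, child):
    """Split the father's genotype into the haplotype he passed on and the other one."""
    transmitted, untransmitted = {}, {}
    for locus, alleles in father.items():
        shared = alleles & child[locus]
        if len(shared) != 1:
            raise ValueError(f"cannot phase locus {locus} from this child")
        (allele,) = shared
        transmitted[locus] = allele
        # If the father is homozygous, both chromosomes carry the same allele.
        untransmitted[locus] = next(iter(alleles - shared), allele)
    return transmitted, untransmitted


def sibling_sharing():
    """Probability that a sibling shares 0, 1 or 2 parental haplotypes with the patient."""
    patient = ("dad1", "mom1")
    counts = {0: 0, 1: 0, 2: 0}
    # Each sibling gets one of dad's two chromosomes and one of mom's two.
    for sibling in product(("dad1", "dad2"), ("mom1", "mom2")):
        shared = sum(s == p for s, p in zip(sibling, patient))
        counts[shared] += 1
    return {k: v / 4 for k, v in counts.items()}


if __name__ == "__main__":
    print(phase_father(FATHER, DAUGHTER_1))
    print(sibling_sharing())
```

Sharing two haplotypes corresponds to an HLA-identical sibling, one to a haploidentical sibling, and zero to a complete mismatch, which reproduces the 1 in 4, 1 in 2, 1 in 4 split.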
Not always, there have been some recombinations that separated them. They're still being inherited together. Now, if A3 was located a long ways away, we would have expected everything to have been reshuffled by now. You would expect the HLA types in hemochromatosis patients to be identical to the ones in normal chromosomes. That would be called linkage equilibrium. What's happened is there hasn't been enough time yet for this to get all shuffled and there's still a big excess of A3 with hemochromatosis, so this is called linkage disequilibrium. It seems like an obscure fact, it is somewhat obscure, but it's going to become important on Friday when we talk about complex diseases. I'll reemphasize at the end, I'm sure this is very confusing. I've tried for years and I still haven't figured out a good way to make this really straightforward and would love suggestions. If you're still a bit puzzled don't worry about it. We'll go over it again on Friday. I hope it will get more clear then on how we look at complex diseases. The extension of this is laying the groundwork for Friday. Let's take a look at some SNPs. Here are three SNPs, common SNPs, variants. You can have either a C or a T here. You can have either an A or a C here. You can have either an A or a G here. How many possible combinations and permutations are there? Well, you think 2 x 2 x 2, 8 ways you can put all this together. These are all 8 possibilities. Guess what? We go look at everybody in the room, what do we find? Only these. This is another example of linkage disequilibrium. The reason again is probably when this variant occurred... Well, this isn't a good example because these aren't the same... The various changes that occurred, this T to C occurred on an allele that had this. There obviously have to be, you can observe just these two, but these are the most common. It has to do with the history of how the mutations occurred and where they occurred and what chromosome they occurred on. 
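
The 2 x 2 x 2 count, and the idea that only some combinations actually show up, can be illustrated with a short Python sketch. The three SNPs are the ones named above; the haplotype frequencies are hypothetical, since the lecture does not give them, and `ld_coefficient` is an illustrative name. It computes the standard two-locus measure D = p(AB) - p(A)p(B), which is zero at linkage equilibrium.

```python
from itertools import product

# The three SNPs from the lecture: C/T, A/C and A/G.
SNPS = [("C", "T"), ("A", "C"), ("A", "G")]

# Every possible three-SNP haplotype: 2 x 2 x 2 = 8.
ALL_HAPLOTYPES = ["".join(h) for h in product(*SNPS)]


def ld_coefficient(freqs, allele1, allele2):
    """D = p(allele1, allele2) - p(allele1) * p(allele2) for two-SNP haplotypes."""
    p_hap = freqs.get(allele1 + allele2, 0.0)
    p1 = sum(f for h, f in freqs.items() if h[0] == allele1)
    p2 = sum(f for h, f in freqs.items() if h[1] == allele2)
    return p_hap - p1 * p2


# Hypothetical frequencies for the first two SNPs only.
# Only two of the four combinations are observed: strong disequilibrium.
OBSERVED = {"CA": 0.5, "TC": 0.5}
# All four combinations at equal frequency: linkage equilibrium.
SHUFFLED = {"CA": 0.25, "CC": 0.25, "TA": 0.25, "TC": 0.25}

if __name__ == "__main__":
    print(len(ALL_HAPLOTYPES), ALL_HAPLOTYPES)
    print(ld_coefficient(OBSERVED, "C", "A"))  # positive: C and A travel together
    print(ld_coefficient(SHUFFLED, "C", "A"))  # zero: fully reshuffled
```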
They're still in linkage disequilibrium. This turns out to be a phenomenon throughout our genome. Part of what's happened is this phenomenon here. There were many humans, not so many as we have now, in Africa. When we left Africa, those of us who left Africa, a limited number left. This is true any time a human population moves. If there is a limited number of founders you can have an allele that's very rare here. If it happens to be present in one of these individuals, as that population expands it'll be much more common. Human populations around the world have somewhat different frequencies of these various polymorphisms. Most all of them arose back here. You'll have different frequencies but they're looking still at the same ancestral chromosomes. Since this is all fairly recent in evolutionary terms there hasn't been enough time for everything to get shuffled. If two SNPs are sitting a few thousand base pairs away from each other there hasn't been enough evolutionary time for them to reach linkage equilibrium. They're tending to be inherited in blocks. What we find when we sequence lots and lots of genomes is, there are blocks. The average size is about 20 thousand base pairs. Large blocks that tend to be inherited together. Just like the HLA A3 hemochromatosis example, it's not perfect. These blocks do get broken up, but if you have an A here you tend to have a G here if they're really close. The genome gets inherited in blocks like this. This is laying the groundwork for what we're going to talk about on Friday. When we're looking for common variations causing common diseases, like heart disease and diabetes, we're going to look with these blocks. We're going to say, are there any particular blocks that get inherited? We don't know what the change is in that 20 thousand base pairs but if this block is more common in people with diabetes there's probably something somewhere in there. This is all based on linkage disequilibrium. 
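
Why nearby variants stay together while distant ones get reshuffled can be made concrete with the textbook decay relation D_t = D_0 (1 - r)^t, where r is the recombination fraction per generation between two sites and t is the number of generations. The sketch below is an illustration under assumptions that are not from the lecture: 25 years per generation (so 20 thousand years is about 800 generations) and a recombination fraction of roughly 0.0002 for sites about 20 thousand base pairs apart. The function name `ld_remaining` is hypothetical.

```python
def ld_remaining(recomb_fraction, generations):
    """Fraction of the original disequilibrium left: (1 - r) ** t."""
    return (1.0 - recomb_fraction) ** generations


# Assumption: 20 thousand years at about 25 years per generation.
GENERATIONS = 20_000 // 25

if __name__ == "__main__":
    # Assumed r for two sites inside one 20-thousand-base-pair block.
    print(ld_remaining(0.0002, GENERATIONS))
    # Unlinked sites (r = 0.5) lose essentially all disequilibrium.
    print(ld_remaining(0.5, GENERATIONS))
```

Under these assumed numbers most of the original association survives inside a block, while for unlinked sites it is effectively gone, which matches the qualitative picture of blocks that have not yet had time to shuffle.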
These blocks have not shuffled enough to all be random. Parts of the genome are still tending to be inherited together. What we're going to do is look at a bunch of people with a disease and look at a bunch of unaffected people, and if somewhere in here on this chromosome a mutation occurred that causes... Let's say this is the region around the A3 gene of HLA and a mutation occurs right here in the HFE gene. When we look many years later at people with hemochromatosis we're going to find much more of the red. It'll still be in the unaffected but at a lower ratio. This is again because of linkage disequilibrium. All we know is that the responsible gene and mutation is somewhere in this big block, this haplotype block, this block of linkage disequilibrium, but we still don't know what it is. This is how we made this deduction that I told you earlier about sickle-cell anemia. It turns out, if you look around the hemoglobin gene, in all people with sickle-cell anemia there are basically only four different colored blocks around the hemoglobin gene. Everybody has one of those four colored blocks. One of those linkage disequilibrium blocks, one of those haplotypes. They're associated with the four different occurrences of the same sickle-cell mutation. We can now trace it over evolutionary time. Again, we just take everybody with hemoglobin S, with sickle-cell anemia, and we're going to find a different distribution of these haplotype blocks than we find in the rest of the population. That's this evolutionary history. Using this in a more general way is, again, what we're going to talk about in the next lecture. I've talked about how technology has changed. Astoundingly, we can now sequence many millions of base pairs in a day, and this next-gen sequencing has totally revolutionized what we're doing. The human genome was done without this next generation massively parallel sequencing that we're now doing. It was done with these old-fashioned machines. 
These newer machines are dramatically faster and cheaper. This is what's making it possible for us to sequence thousands, tens of thousands, of genomes and eventually all of our genomes. We're going to talk about the implications of that for medicine, genome-wide association studies (GWAS) for human disease. That will all be the focus of my last lecture to you on Friday. We'll stop there. Thanks for toughing it out for these three hours. I'll hang around if people have any questions and I'll see you Friday.
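
The case-versus-unaffected comparison described in the lecture can be summarized with an odds ratio. The sketch below uses the lecture's hemochromatosis figures (70% of patients carry HLA A3 versus 10% of the general population) and assumes, purely for illustration, 100 patients and 100 unaffected people; the function name `odds_ratio` is hypothetical and this is not presented as the method used in any particular study.

```python
def odds_ratio(cases_with, cases_without, controls_with, controls_without):
    """Odds of carrying the marker in cases divided by the odds in controls."""
    return (cases_with * controls_without) / (cases_without * controls_with)


if __name__ == "__main__":
    # Assumed sample of 100 patients and 100 unaffected people.
    # 70 of 100 patients carry A3; 10 of 100 unaffected people carry A3.
    print(odds_ratio(70, 30, 10, 90))
```

An odds ratio well above 1 says the marker is enriched in patients. As the lecture stresses, that points to something nearby on the same block, not necessarily to the marker itself as the cause.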

Serotype

B35 serotype recognition of some HLA B*35 allele-group gene products[2]

B*35 allele   B35 (%)   Sample size (N)
3501          98        2023
3502          72        1013
3503          94        1226
3504          100       26
3505          94        231
3506          79        10
3508          91        349
3509          68        3
3510          75        12
3511          67        17
3512          78        422
3514          79        29
3520          42        23

References

  1. ^ Marsh, S. G.; Albert, E. D.; Bodmer, W. F.; Bontrop, R. E.; Dupont, B.; Erlich, H. A.; Fernández-Viña, M.; Geraghty, D. E.; Holdsworth, R.; Hurley, C. K.; Lau, M.; Lee, K. W.; Mach, B.; Maiers, M.; Mayr, W. R.; Müller, C. R.; Parham, P.; Petersdorf, E. W.; Sasazuki, T.; Strominger, J. L.; Svejgaard, A.; Terasaki, P. I.; Tiercy, J. M.; Trowsdale, J. (2010). "Nomenclature for factors of the HLA system, 2010". Tissue Antigens. 75 (4): 291–455. doi:10.1111/j.1399-0039.2010.01466.x. PMC 2848993. PMID 20356336.
  2. ^ derived from IMGT/HLA
This page was last edited on 21 July 2019, at 12:06
Basis of this page is in Wikipedia. Text is available under the CC BY-SA 3.0 Unported License. Non-text media are available under their specified licenses. Wikipedia® is a registered trademark of the Wikimedia Foundation, Inc. WIKI 2 is an independent company and has no affiliation with Wikimedia Foundation.