
Gary King (politician)

From Wikipedia, the free encyclopedia

Gary King
30th Attorney General of New Mexico
In office: January 1, 2007 – January 1, 2015
Governor: Bill Richardson, Susana Martinez
Preceded by: Patricia Madrid
Succeeded by: Hector Balderas
Personal details
Born: Gary Kenneth King, September 29, 1954 (age 64), Stanley, New Mexico, U.S.
Political party: Democratic
Spouse(s): Yolanda Jones (1987–present)
Residence: Moriarty, New Mexico[1]
Education: New Mexico State University, Las Cruces (BS); University of Colorado, Boulder (MS, PhD); University of New Mexico, Albuquerque (JD)

Gary Kenneth King (born September 29, 1954) is an American lawyer who served as the 30th Attorney General of New Mexico from January 1, 2007, to January 1, 2015. A member of the Democratic Party, he won his party's nomination and lost the general election to become Governor of New Mexico in 2014.

YouTube Encyclopedic

  • "The Big Deal about Big Data" with Dr. Gary King
  • Gary King Discusses Big Data
  • Gary King, Harvard, Institute for Quantitative Social Science


[MUSIC PLAYING] Welcome. My name is Ziyad Marar. I'm the executive vice president of SAGE, and I'm responsible for our global publishing strategy and really delighted to be here and to introduce Gary King. The event is in context. of us wanting to look hard at research methods and I'll come to that in a second. But it's great to see people representing policymaking and the learning societies and media and SAGE colleagues in one place. I hope you will actively engage with Gary when he's up and running. The format will be I'll do just a few minutes of intro, and then Gary will talk for about 30 minutes and then take questions and answers, after which we will go for drinks. But he reassures me that if you want to interrupt him along the way, you're more than welcome to do that. But also, I'd like to say a quick thank you to the American Political Science Association and the American Statistical Association, both of whom have helped make this event happen. So thank you to everyone involved there. So it's often said that data is to the 21st century what oil was to the 20th century. And that extraordinary fact has almost become mundanely in our lives today. I think none of us are startled anymore to hear these gigantic numbers being bandied around, 1.5 billion people on Facebook or, as I looked up this morning, 40 thousand Google searches per second, where they become sort of the hum-drum backdrop of our lives. Some people positive about them, others not so much, the negatives be inclined to say, the business model of the internet is surveillance versus on the other hand people saying, never has there been before such an opportunity to improve human flourishing with all that's available through big data. And there's no surprise that organizations of various kinds have responded to this with alacrity. These organizations, many of them are commercial organizations, are looking to get close to their customers and to send messages out as timely as possible. 
And that's big and understandable stuff. And then, similarly, in the research community, the natural scientists and the physical scientists have been responding incredibly well. But I think that with the social scientists, we get a bit of a mixed story. And that's because, while there's a huge amount of uptake, there's also a fair amount of caution and doubt expressed as well. And I think one of the reasons we wanted this event was really to help push the story of data-intensive social science a step along. And so the expressions of concern range from the nature of social science, problem domains, and issues around methodology and ethics. A researcher studying social inequality will worry about the people who are left off the digital grid more than a market researcher might. And people in social sciences will tend to worry more about privacy and the idea of informed consent, which isn't really an issue for high energy physicists. So we've got particular dynamics that apply to the social sciences. Nevertheless, I think there's a huge opportunity for social science to reinvent itself in the light of big data. And this is particularly important for us at SAGE because our history has been as innovators in research methods since our founding. We were founded 51 years ago by Sara Miller McCune, who thought of research methods as a really profound connective tissue across the research communities, across geographies, across fields, across levels from students to professors. And we've innovated, whether it's through the Little Green Books or the rise of qualitative research or mixed methods or evaluation throughout history and even today are still innovating by launching our Big Data & Society open access journal and more ways to talk about. What next? Well, I think data in the sense of social science is what's next. And we feel that we can play a role in helping bring together these communities. 
And so to our speaker, who is the very acme, I think, of what's-- the emblem, I think, of data-intensive social science in my head. And not only is Professor Gary King an incredibly eminent academic on many fronts, but he is such a pioneer in this particular area. I know he needs no introduction, but as you have seen in the documentation, he is the Albert J. Weatherhead University Professor at Harvard University. And that university professor role is held by 24 professors in the university. It's an incredibly elite and impressive role that he occupies. But it's not just about the contribution he makes as an individual academic. It's the fact that he's also the director of the Institute for Quantitative Social Science, which has led the way, I think, for many researchers to be able to take on this opportunity. And more than that, I think Gary has been very good at explaining to institutions how they could reconfigure themselves to take advantage of this opportunity as well. And so I think it requires massive amounts of collaboration to do it really well. And I think [INAUDIBLE] is one of the best things we've got to turn to. So we're short of time, so I'm going to just say it's an incredibly important moment in social science. SAGE is very committed to supporting that. And we couldn't think of a better keynote speaker to have than Professor Gary King. Please welcome him with me today. Thank you.

[APPLAUSE] Thanks. That's good enough. I only wanted the applause, so we can just end now, I think. And thanks for the introduction. I really appreciate it. If anybody has any questions, like after I start talking, you should feel free to either yell them out or scream obscenities or whatever it is you choose. That's totally fine. So that's me. These are SAGE's slides. Thanks to the ASA and APSA and SAGE. And is there another one? Why is there no one other? No, no, thank you all for coming and for sponsoring this. I really appreciate it. And our members of Congress. 
So I'm going to talk about the big deal about big data. And I already gave the answer, which is it's not about the data, OK? So that's not the innovation in big data. So what's the innovation? Well, let me explain. So data is easy to come by. It's a free byproduct of IT improvements in like every organization. If you buy it, it's becoming commoditized, so it's becoming cheaper and cheaper and cheaper. If you ignore it, wherever you look at the end of the year, you're going to have more data than you did at the beginning of the year, if you ignore it. If you pay any attention at all, you're going to have tons more data without putting in much effort at all, OK? However, what are you going to do with all that data? It's not very helpful by itself, right? Because you have to manage it. It's valuable, so you have to keep it, right? So basically it's an expense. It doesn't actually do anything good for you. Hold on. There's more to the story, OK? [LAUGHTER] The value is the analytics. The revolution is the analytics. The revolution that we didn't know how to do-- the thing that we did not know how to do before but we know how to do now, what we're learning how to do now, is how to make the data actionable. That's really the secret. I love the phrase "big data," not because it describes anything accurately, but because the media has figured out a way to describe what we do and what SAGE publishes and what many of you do in a way that my mom now thinks she understands what I do. [LAUGHTER] OK? And really, the public genuinely gets this now. Well, they don't completely get it, but they get it much more than they did previously. And basically we failed to communicate this to them, but somebody in the media or the media collectively has figured that out. And that's really valuable. But the value is not data. It's not the big. It's the analytics. You can customize the output for exactly what you want. I like to contrast it with Moore's law. So you know what Moore's law is? 
It's a prediction, empirically accurate prediction, it turns out, that the speed and power of computers will double every 18 months, which has produced a lot of value. That's a really good thing, OK? However, Moore's law is like nothing compared to like one student data scientist who works on algorithms, who in an afternoon can increase the speed of the computer plus the algorithm by a factor of 1,000. That's like not that big a deal. That happens quite regularly. Moore's law never got close to that, OK? No objection to Moore's law, OK? We need that, all right? But don't miss where the real value is here. I'll give one more example. And it won't be curing the common cold. We'll work on that. A colleague had lots of data coming in and ran a particular analysis every year. And the data got bigger, right? There was more data this year. There's more data the next year. There's more data the next year. And all of a sudden, there was more data than could fit on his computer. And so he called down to the IT shop and said, spec me out a new computer, right? I need a bigger computer to be able to do this. And so my guys at the Institute for Quantitative Social Science, which I run-- thank you for mentioning that. My guys and one woman-- because if you ever find a female sysop, you hire her. That's my instructions to them. And we have one, OK? And maybe she is the only one, but she's one. In any event, my guys specced out the cost of this computer. And the cost of the computer was $2 million. Now, that's a beefy computer, OK? It is possible to raise money to buy a $2 million computer. It's a real thing, the $2 million dollar computer. We don't really want to buy a $2 million computer, but that's what it would cost to do. And so we intercepted this, a graduate student and I, and we worked on it for almost two hours. And now this professor runs this algorithm on his laptop in 20 minutes. 
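To put the Moore's law comparison above in numbers, the following is a minimal sketch of the arithmetic only. It assumes the 18-month doubling period quoted in the talk and asks how long steady doubling would take to reach the factor-of-1,000 speedup attributed to a single afternoon of algorithm work. The function name `months_to_reach` is hypothetical and chosen for illustration.

```python
import math

# Doubling period quoted in the talk (an empirical rule of thumb, not a law of nature).
DOUBLING_PERIOD_MONTHS = 18


def months_to_reach(speedup: float, doubling_months: float = DOUBLING_PERIOD_MONTHS) -> float:
    """Months of steady doubling needed to reach a given speedup factor."""
    # After k doublings the speedup is 2**k, so k = log2(speedup).
    return math.log2(speedup) * doubling_months


if __name__ == "__main__":
    months = months_to_reach(1000)
    print(f"{months:.0f} months, roughly {months / 12:.1f} years")
```

Under these assumptions a 1,000-fold speedup needs about ten doublings, which is roughly fifteen years of hardware progress at the quoted rate.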
So this is actually-- the most amazing thing about this story is that it's not even amazing. It happens like all the time, right? The innovation is the analytics. OK, it's low-cost. It's low-infrastructure. All you need to do is hire my students, OK? [LAUGHTER] It's mostly human capital. That's really what it is. It's important to understand also that if you go from no analytics to just ordinary, off-the-shelf analytics, you have a huge improvement. If you go, however, it's important to understand, from ordinary, off-the-shelf analytics to innovative analytics tuned to your problem, it's orders of magnitude better. So it's worth remembering that. And what's ordinary is drastically changing over time. Because that's really the revolution. We've learned more about causal inference, the effect of something on something else, and prediction, what's going to happen in the future, more in the last 20 or 30 years than at any point in human history. It's quite amazing, OK? And the cool thing is we all get to be a part of this. It's happening in the social sciences, OK? So let me give you some context. There's exciting data, but without the analytics, it's quite useless. So an example is exercise. The way they used to measure exercise is they would do a survey, and they would say, how many times in the last week did you exercise? How many times did you get on a treadmill? You know the answer to that question, but that's not necessarily related to how much energy you output, right? So how could you measure it? Well, you all have cell phones in your pockets, I bet. We could put a little piece of software, if you don't mind. There's also an accelerometer in your cell phone, which says how much you move. And the little piece of software would just send us the information, to be used only for scientific purposes, I promise, OK? Now we have, let's say, 500,000 people with this little piece of software on their cell phone. 
And we have continuous time information on how much they're moving. Think of how much more valuable that is than how many times in the last week did you exercise. It's hugely more valuable. However, what are you going to do with this trace of information about how much you exercise second by second for 500,000 people over the course of, I don't know, a month. What are you going to do with information, right? Moreover, how are you going to distinguish the couch potato sitting sleeping on the train like this compared to the person on a stationary bike doing an all-out sprint who's not moving according to the accelerometer with her cell phone next door, right? How do you actually distinguish that? You need the analytics in order to interpret the data appropriately. It's not just the data. Another example is in order for democracy to work, you need activists. You need people pursuing office. If no one pursues office, Democrats, Republicans, whatever, then there really isn't any point in having democracy. So political scientists track activists. It's important to all of us. So the way they used to do it was a few thousand interviews. In fact, there's a famous book by some colleagues of mine, where they took a random sample of the United States public of 15,000 people. And they asked them screener questions, weeded it down to maybe the 2,000 that are actually activists and then asked a long series of questions and wrote a big book on the one snapshot of 2,000 people. And I realized that today, there are 650 million social media posts that are available publicly for us to analyze-- 650 million. And people write about everything, including basically everything about politics, things that would drive you crazy and things that you would like and everything in between and everything else. That's enormously more information than the information we used to get, OK? Only, what are you supposed to do with 650 million social media posts, right? 
Quite a lot of them are about cat videos, right? Like what are you supposed to do? How do you analyze that data? Once you have the analytics, then it turns out we can turn that into meaningful information. Or take social contacts. The way they would do this in many, many high quality surveys is, please give me a list of your five best friends. And we'd ask just for the first names so that we can then ask you more questions about them. Alternatively, if you give us permission, and we will protect it, we can have a continuous record of phone calls and e-mails and text messages and Bluetooth and social media connections and you name it, right? So the amount of information could be enormous. But what are you supposed to do with that information? It requires detailed and sophisticated analytic tools to be able to extract information from it. Or economic development in developing countries-- one option is you get the information from the government. I once traced the source of all of the information about AIDS in Africa. Like where does that come from, right? Because they don't really have the institutional structure to be able to estimate this. There are villages far off from-- like, where is this actually coming from? It turns out that where it was coming from was a guy in the World Health Organization named Alan. He actually-- [LAUGHTER] Really. And what Alan would do is he'd get the data from the governments, who would give him numbers when he would ask. And then he'd look at them and he'd say, that's not right. And he'd cross it off, and he'd write the other number. And then that became the official record of AIDS in Africa, OK? Alan, by the way, was really good. You wanted his numbers rather than the government's. [LAUGHTER] But still, that's not an empirical method, OK? Or not the empirical method you want. So now if you want to measure, let's say, GDP, we can get satellite images of human-generated light at night. You know those photographs, right? 
Or road networks or infrastructure or things like that. An undergraduate of mine studied Chinese investment in Africa and used differences in satellite imagery for where the investment was and whether the investment actually had any effect over a 10-year period. You could actually see it, rather than just talking to somebody and asking them what they remember was here 10 years ago. There's many, many more things. But in each, without the new analytics, the data are useless, OK? Let me just put this in context of the social sciences. In 1995, Science magazine asked 60 scientists about the future of their field. They said, what's going to happen over the next 25 years? And they asked for these half-column descriptions from 60 scientists. Every one of the natural and physical scientists said, we're going to make the most amazing discoveries and inventions, and we're going to cure these amazing diseases. And every single one of the smaller number of social scientists that answered the question said, well, we used to be studying this, and it's really exciting because now we're going to be studying this. And that pissed me off. [LAUGHING] Right? It is important to be studying this and that. But it's also important to be actually solving problems. And the change in the social sciences is we're changing from studying them to actually solving problems. So I'm going to try to give you some of those examples today. So here's my first. How do you like this sentence? How to Read a Trillion Social Media Posts and Classify Deaths Without Physicians. Now, I use this sentence mostly because it's never been uttered before in the history of humanity. [LAUGHING] But also because I'm going to show you the same underlying methods enable you to solve completely different problems. So that's the power of social science methodology, that you can solve problems that you couldn't have solved otherwise, and you can also solve problems that you never even realized existed, OK? 
So here's some examples, basically examples of bad analytics. First one is verbal autopsies. What the heck is that, OK? If somebody dies in the United States, there's a death certificate. And someone sees the body, a medical personnel, doctor sees the body, signs the death certificate that says the person died from lung cancer, whatever it was, OK? In most of the developing world, that is, in most countries in the world, when someone dies, they basically go off to the bush and are never heard from again. No one sees the body. Autopsies are culturally prohibited, right? You basically don't know, OK? Moreover, these are the places where we have to really figure out what people are dying from because that's where diseases emerge that might affect us, OK? And actually, it's going to have disastrous effects in these places without hospital infrastructure and things like that, OK? So how do you measure the prevalence of different diseases in countries without this kind of data? So what they do is verbal autopsies. Well, what's a verbal autopsy? You go to a household where someone's died, and you find the next of kin or the caretaker, and you ask a series of sort of uncomfortable questions. Did the person have stomach pain before they died? Did they have bleeding from their eyes? Did they have tire tracks across their back? [LAUGHTER] That's a joke. [LAUGHING] Right? You ask them questions to try to figure out what the cause of death was. And then you give the answers, maybe 50 answers to the 50 questions to a physician. And the physician says, ah, that was tuberculosis. And when some smart-aleck scientist came along and said-- social scientist, by the way-- came along and said, let's give that to another physician. You give it to another physician and they say, ah, malaria. And it turned out that when you did this systematically with lots of examples, the physicians were useless in this context. Physicians to be useful needed to see the body and needed to do tests. 
Verbal autopsies, just the physicians were useless. So what do you do? Well, let's think about that, OK? Put that over here, OK? And think about what they're trying to do. What they're trying to do is classify individuals into categories of death, OK? Now hold that thought, OK? Another thing people do is sentiment analysis by word counts. So you see this a lot, OK? In the media, you see people investing on the basis of Twitter and things like that. And what they're doing is they're counting the number of tweets with certain words. Let me give you an example of the kinds of things that can happen when you do this. There is a group that was trying to predict US unemployment rates. And they thought, well, let's do this with social media. We can do this better than the government. Let's count the number of times people say "classified" or "jobs" or "unemployment" or all these things, right? And we'll count a number of tweets that have these words. And they did. And they plotted them over time, and they correlated with official government US unemployment rates. And they preceded them. So they were able to predict US unemployment rates with the word count. What happens, though, is you can do a little of this well, but then it fails catastrophically, OK? As a side point, it's quite unlike heart surgery. You don't do that like a little bit. You either do heart surgery or you don't do heart surgery. You don't feel unwell at dinner and ask someone to pass a steak knife to fix something. OK, let me go back to the story. OK, so this is something people do a little, and actually, they were predicting unemployment rates to some degree from this. And then they noticed one day there was a spike in the number of these words used. And they thought, let's invest. And they start investing and investing. And what happened? What they didn't notice was that was the day Steve Jobs died. 
And we didn't notice that Steve's last name-- [LAUGHTER] --was one of the words that they were counting, OK? So catastrophic failure. Both of these fail. Now, the interesting thing is they're completely unrelated substantive problems, but they have the same solution. And the reason why is the key to both methods, the reason both methods were failing, is they were classifying. As it happens in public health, people in public health are not your doctor. People in public health don't care about anyone. They only care about everyone, OK? So that's interesting. They don't care what you die of. They only care about the proportion of people dying from a particular disease. And when we study Twitter, no one cares what stapumpkin222 says, OK? It doesn't matter what anybody has to say. It only matters what everybody has to say, OK? If you think about that, the individual classification decisions are not the quantity of interest. What's the quantity of interest? What do we actually care about? What we actually care about is the percent in the category. We care about the percent of people dying from malaria in the United States. That's zero-- zero. That's an effective number, zero. We need to devote no resources to malaria. OK, good. OK? In other parts of the world, it's not zero, so we have to devote more resources. We care about the percentage. We don't care who died from malaria. Well, you and I might, but as public health professionals, we wouldn't. Similarly, for estimating sentiment or US unemployment rates from word counts, we don't really care about any one of these tweets. We only care about the percentage that fall in a category. Now, you're probably thinking, wait a second. How do you get the percentage? You just put them all in the categories, add them up, and you get the percentage. And if you were thinking that, you'd be absolutely right, except you just assumed that every time you put it in the category, you got it right. 
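The word-count failure described above (the spike on the day Steve Jobs died) can be reproduced with a toy keyword counter. This is a minimal sketch, not the forecasting group's actual system: the keyword set uses the words named in the talk, and every post below is invented for illustration.

```python
import re

# Keywords named in the talk; the real group's word list is not given in the source.
KEYWORDS = {"jobs", "unemployment", "classified"}


def keyword_hits(posts):
    """Count posts containing at least one keyword, with no regard to meaning."""
    hits = 0
    for post in posts:
        tokens = set(re.findall(r"[a-z]+", post.lower()))
        if tokens & KEYWORDS:
            hits += 1
    return hits


# Invented example posts (assumptions, not real data).
ordinary_day = [
    "Still looking for jobs in my town",
    "Great weather today",
    "Checked the classified ads again",
]
news_day = ordinary_day + [
    "RIP Steve Jobs",
    "Steve Jobs changed how I use a phone",
    "Sad news about Jobs tonight",
]

if __name__ == "__main__":
    print(keyword_hits(ordinary_day), keyword_hits(news_day))
```

The count more than doubles on the second day even though nothing about the labor market has changed, because the counter matches a surname rather than a topic.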
If you don't get it right all the time, those are two different things, right? So let's take all pairs of countries every year since World War II. 1,000 of those million pairs of countries were at war. So now let's come up with the prediction of the proportion of pairs of countries at war. Well, a really good prediction that will be right nearly 100% of the time is there's never been any war, right? So that's useless information, right? The percent correctly predicted is really high, and we don't get a good estimate of the thing that we really care about. So classification, although it seemed like that was the way to get there, only gets us there if we're perfect, and we're never perfect, OK? OK. So what does that do? So what that does is all of a sudden we realize that methodologically, we care about the second thing, not the first thing. And if there's another way of getting to the second thing, estimating the percentages in the categories, we don't even care whether we do a worse job in the first thing. And that's what we did. We developed a set of methods that gets us estimates of the percent in a category that's accurate on average. It's called unbiased, is the statistical term. It doesn't even classify at all individually. I realize it sounds like magic. If anybody wants, I'm happy to explain all the math. In fact, if anybody flinches, I will. [LAUGHING] But basically, we came up with a way of estimating the percent in a category. Once we realized what the quantity of interest is, we used the tools that we've all developed. We developed new statistical procedures to be able to do this. And the really cool thing was it solved these two completely unrelated problems. Actually, I was working on the first problem for the World Health Organization, and we solved this problem. And for a year, I was trying to solve the sentiment analysis problem. And I tried every method that the computer scientists had come up with. And everything was a complete disaster. 
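The gap between getting individual classifications right and getting the category percentage right can be shown with the war example above. The sketch below is an illustration under stated assumptions, not the method described in the talk: it uses a standard misclassification correction (inverting assumed, known error rates) as a stand-in, and the 90% sensitivity and specificity figures are invented.

```python
def accuracy_of_always_no(positives: int, total: int) -> float:
    """Share of cases a 'never at war' prediction gets right."""
    return (total - positives) / total


def classify_and_count_share(true_share: float, sensitivity: float, specificity: float) -> float:
    """Expected share labelled positive when an imperfect classifier is applied to everyone."""
    return sensitivity * true_share + (1 - specificity) * (1 - true_share)


def corrected_share(observed_share: float, sensitivity: float, specificity: float) -> float:
    """Recover the underlying share by inverting the (assumed known) error rates."""
    false_positive_rate = 1 - specificity
    return (observed_share - false_positive_rate) / (sensitivity - false_positive_rate)


# Figures from the talk: 1,000 of a million country pairs at war.
TRUE_SHARE = 1_000 / 1_000_000

if __name__ == "__main__":
    print(f"'never war' accuracy: {accuracy_of_always_no(1_000, 1_000_000):.3f}, estimated share: 0")
    observed = classify_and_count_share(TRUE_SHARE, sensitivity=0.9, specificity=0.9)
    print(f"classify-and-count share: {observed:.4f}")
    print(f"corrected share: {corrected_share(observed, 0.9, 0.9):.4f}")
```

A classifier that is right 90% of the time on each case still reports roughly a hundred times too many wars when its labels are simply added up, which is why the percentage has to be estimated as its own quantity.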
It just was nowhere near close. And so one day-- and I'd try things, and I'd give them to my graduate students. And I'd say, here, I found this new thing in the literature. Why don't you try this? And like every morning they would roll their eyes at me, like, this one's not going to work. None of the others worked. And then one day I realized mathematically these two problems are exactly the same, and our method works very well for verbal autopsies, so it will work here. And I'd tell my graduate students, they will work, and they're rolling their eyes. But nonetheless, I said mathematically this-- and it actually works. So the consequence of this is modern-day analytics lead to us developing these algorithms. Actually Harvard patented them and licensed them to a start-up company called Crimson Hexagon. They have around 200 employees. They go around the world. They collect social media posts. They have about a trillion social media posts, so that's actually a real number there. And they do brand monitoring for people. I helped found this company. I put this up here. Crimson Hexagon, I'm very proud, was named seven of the top 10 most innovative companies on the web, which I mention primarily because my brother works at Microsoft, and Microsoft was number nine. [LAUGHTER] And also, the same exact method, in fact the same code, is used by the World Health Organization and others to estimate the prevalence by cause of death in countries all over the world. So that's the cool stuff that we get to be involved in in social science methodology. OK. Now I'm going to completely switch subjects, OK? So if you fall asleep, it's best to fall asleep right between slides and then-- [LAUGHTER] OK? So these are more examples. OK, we're in Washington, right? Just so you know. [LAUGHING] So the United States Social Security Administration is the incarnation of the Social Security program, right? 
The single largest government program, lifted a whole generation out of poverty, extremely popular, highly partisan, the third rail of American politics, it's called, right? In essence, the program provides benefits to retirees and people who are disabled and families. In order for the program to succeed, to survive, the Social Security Administration must forecast how much is in the Social Security Trust Fund. That's what everything depends upon, OK? These forecasts are used by the Social Security Administration to make the whole thing work. So if retirees draw benefits out too long-- if someone invents a pill and we live to 200, we'll be cheering until we realize there's not enough money in Social Security to pay for us all. And the trust fund will go insolvent, and we won't be able to have enough money for retirement. So the forecasts are essential to this program and to keeping people out of poverty. It really matters. Moreover, many other United States government programs depend upon these forecasts, and those programs are run based upon what the forecasts say. More than half of United States government expenditures depend upon these forecasts-- more than half. OK, so we looked at these forecasts. The interesting thing is these forecasts have been made for 85 years, right? And they weren't forecasting 85 years ahead of time, so we had this amazing scientific opportunity to evaluate forecasts, real, out-of-sample forecasts, because the year they were forecasting actually occurred. So we can actually see how well they did. So we used a very complicated statistical method to compare the truth and the estimate. The method is known as subtraction-- [LAUGHTER] --to evaluate these forecasts. The methods, as it turns out, had been little changed in 85 years. They're mostly qualitative. And these 85 years are not just any 85 years in the history of the country or the world. 
These are the 85 years in which we've learned more about forecasting, as I mentioned a minute ago, than in any time in human history, and this is the time that the United States Social Security Administration chose not to update the methods by which they ensure the solvency of our retirement, OK? So this seemed like it was worthy of us paying somewhat more attention to. OK, so we did quite an extensive study. What are the results of this? Well, until about the year 2000, the forecasts were about unbiased, which is what you would want. It's not fair to ask of the actuaries who make the forecast that their forecasts should be spot-on. They are forecasting the future, after all. The future is uncertain. You heard it here. Until 2000, for 75 years, or however many that is, they made forecasts. Sometimes they were too high. Sometimes they were too low. But on average, they were about right. That's what we seek. Since 2000, however, they became systematically biased. And it's not just one forecast. It's a whole bunch of forecasts. And every forecast was biased in the same direction. Every forecast was biased in the direction of making the system look healthier than it really is-- every single one. Why is that? As a side point, why is that? Well, as-- you were looking ahead. I noticed it. OK. [LAUGHTER] Why is that? Well, as social scientists, we do quantitative work. We're quantifying things. But we never leave behind the qualitative evidence. Because it's impossible to quantify all information. There's always going to be essential information that human beings have that we're not going to be able to completely quantify. And so we also did interview people. And we figured out what the reason was. Like how come this all was biased? Is it just some partisan scheme? No, actually. That's not what it is. What happened was Social Security became much more partisan. That's absolutely true. Both the Democrats and the Republicans are at fault. They both pushed very hard. 
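The evaluation by subtraction described above, and the difference between forecasts that are "about unbiased" and forecasts that all miss in the same direction, can be sketched in a few lines. All numbers here are invented for illustration and are not Social Security data; the function names are hypothetical.

```python
from statistics import mean


def forecast_errors(forecasts, actuals):
    """Error for each year: the forecast minus what actually happened."""
    return [f - a for f, a in zip(forecasts, actuals)]


def bias(errors):
    """Average error; close to zero means too-high and too-low misses cancel out."""
    return mean(errors)


# Invented figures (assumptions): an index where the true value is 100 each year.
actuals = [100.0, 100.0, 100.0, 100.0]
early_forecasts = [102.0, 97.0, 101.0, 100.0]   # misses in both directions
late_forecasts = [103.0, 104.0, 102.0, 105.0]   # every miss is on the optimistic side

early_errors = forecast_errors(early_forecasts, actuals)
late_errors = forecast_errors(late_forecasts, actuals)

if __name__ == "__main__":
    print("early bias:", bias(early_errors))
    print("late bias:", bias(late_errors))
    print("all late errors same sign:", all(e > 0 for e in late_errors))
```

Individual errors in the first set are as large as three points, yet they average out to zero. In the second set every error has the same sign, which is the pattern the talk calls systematic bias.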
They're not actually at fault, but they changed the environment. The actuaries did what we would want good public servants to do. They hunkered down and protected themselves from the Democrats and Republicans, who might like the forecast to go this way or go this way, right? Because it would be convenient for their political arguments in changing public policy. The actuaries resisted that, just like we wanted. But the actuaries also resisted pretty much everything else, including the data. And as it happens-- [LAUGHTER] And they insulated themselves basically from the facts and the data. Since about 2000, as it happens, Americans started living unexpectedly longer lives, not in every category, not every person, but on average, they started living unexpectedly longer lives. If you're taking statins, keep taking them, OK? Although don't take medical advice from a political scientist. [LAUGHTER] But we started living longer lives. Like these kinds of innovations actually mattered, and you can see it in the data. And that's a really terrific thing. But it changed the forecast. Like you have different inputs. You should change the outputs, OK? People smoked less. That's good. But they ate more. That's bad. The smoking less was better than the eating more, which was bad, and that wound up working out so that mortality was decreasing at a rate faster than expected. That's not necessarily going to continue to happen, but they ignored the data, and as a result of ignoring the data, they just missed what some people estimated from our data was a trillion-dollar error, OK? OK, so in addition to this complicated method of subtraction, we actually created an actual complicated method of forecasting. We came up with a better method of forecasting. How should the actuaries actually forecast? We developed new social science statistical methods that can forecast much more accurately. 
They are, for example, logically consistent, like, unfortunately, older people have higher mortality rates than younger people. You would expect that. The methods the Social Security Administration are using even to this day don't require that. There's quite a number of other logical consistencies. They also produce much more accurate forecasts. The last time we ran this, the trust fund needed $800 billion more than the actuaries actually indicated. So you know, bummer, but-- [LAUGHTER] At least I think we have better information. This, by the way, doesn't say anything about what public policy should be, right? Our elected politicians get to make these decisions somehow, or not make the decisions. That's also a decision. But we hope to give them much better facts on which to make the decisions, and I think we've done that, OK? Deep breath, fall asleep, wake up, new subject. [LAUGHTER] OK, gerrymandering-- redrawing legislative district boundaries, the boundaries of a legislative district. This is a really troubling area, a hugely troubling area. It's the most conflictual form of politics in the United States, the most predictable form of conflictual politics in the United States, short of violence. And in many areas, it doesn't stop short of violence. There's a lot of examples, which we collect, of fist fights on the floor of legislatures, almost all over redistricting. This is a big deal. For incumbents, there's hardly anything more frustrating than redistricting. Because what happens is somebody in a basement, who looks a little like me, is redrawing district boundaries and deciding who gets to keep their job or not, right? That's a hard job these folks have, right? But fortunately they don't do this for professors, right? And for voters, it's the same kind of frustration. They're not going to lose their job as a result of redistricting, but they'll lose their representative. And they will be in a different district. 
And they may have had information about who they would vote for, and now all of a sudden they don't because there's a different configuration of candidates. So this is a really difficult area. It's very, very partisan. It produces a train wreck of litigation on cue. Like every 10 years after the census, they allocate congressional seats to states, and then every state has to redistrict, and then every state has to have litigation. It's a complete train wreck. There's huge amounts of money wasted. Both sides do whatever they can to get advantage. So analytically from a social science perspective, what are the problems here? Well, first of all, there was no agreed-upon standard of partisan fairness, all right? How do you know what a fair redistricting plan is? And secondly, there was no method by which we could estimate whether a plan met the standard. So we needed standards, and we needed methods. Thirdly, in order to figure out whether the Voting Rights Act applies, the courts required estimates of whether blacks and whites and Asians and Hispanics were voting the same way. But we actually have this thing in the United States called the secret ballot. And so although the law says, you must tell us how different people are voting, the law also says, you may not know. [LAUGHTER] OK, that's where the social scientists come in. Because we can estimate these things. And that's-- I'm getting ahead of myself. So we have solutions, OK? So the solutions-- the solutions were not about the data. The data have been available for years, right? There is census data, and there's election data, and there's where the districts are. And there's data at the precinct level or the census block level. We have lots of data. First we needed a standard. So I and some co-authors developed a standard for partisan fairness, for what it means for a set of legislative districts to be fair to both parties. So we developed the standard. 
It is agreed to pretty much by all, that is, by both parties in almost all major redistricting litigation, including cases that have been decided by the Supreme Court this term and many other terms by Supreme Court justices that have written about it and many other courts all across the land. So they basically agree with the standard of partisan fairness. We also developed a sequence of statistical methods, each one of which is better than the previous, or at least so we claimed in order to get it published. But at least that was the theory. And the methods estimate on the basis of the data and the plans, the different plans offered by the different parties and citizen groups, et cetera, how fair each plan was. And these basically are agreed to pretty much by everybody. The parties who hate each other, they all use the same methods, which is really terrific. We've also developed what are now very widely used methods for estimating individual or group behavior from aggregate data. So if you only know the percentage of African Americans in this district, and you know the percentage of people voting for the Democrats, and there's lots of African Americans and lots of people voting for the Democrats, it might be that the African Americans are voting for the Democrats, but it could actually be that lots of whites that live in districts with African Americans are the ones voting for the Democrats. Well, then that's known as an ecological inference problem. So we've come up with methods that enable us to get around these problems. They're uncertain, as you might imagine, because there's this information lost. But they are now used in pretty much every court by experts on all sides. So these are just some ways that social scientists have actually made some contributions. Are you ready for a different example? [LAUGHING] OK, keywords. [LAUGHTER] It's the obvious next example, don't you think? Humans, I'm going to make the case, are horrible at choosing keywords. 
Now, wait a second. That seems ridiculous. We all do Google searches every day. I'm going to convince you that you are horrible at doing Google searches. Well, you're horrible at thinking of keywords. Here's an experiment, OK? You can play along if you want. We actually ran this experiment with 43 undergraduates at a nice college in Northeastern United States in Massachusetts, sort of around Cambridge, Massachusetts. [LAUGHTER] So we gave this to 43 undergraduates. We said we have 10,000 Twitter posts, each containing the word "Boston"-- so the 10,000 posts each used the word "Boston"-- from the time period around the Boston Marathon bombings, when all of us up there were hunkered down in our houses because they closed the roads and they said, you can't go out. Please list any keywords-- this is the task I want you to do. Please list any keywords which come to mind that select posts in this set of tweets related to the bombings but won't pick posts that are unrelated to the bombing. So don't pick the word "the," please, right? Because it's in both. You have to pick words that are just related to the bombings, OK? So you think about what that is. How many words can you think of? Just think of it, OK? OK. I won't pick anybody out from the crowd. I promise. OK, so here's some examples, OK? Explosion, all right? Lockdown. Tsar-- however you say this guy's name. Terrorist. These are the kinds of words that people chose. Maybe some of you chose these words. My guess is very few chose these words. Am I right? I'll show you why, OK? The median number of keywords thought of when we gave them plenty of time for our undergraduates was eight. Each person came up with-- some more, some less, but on average they came up with about eight. The number of unique keywords chosen or thought of across all 43 undergraduates was 139 unique keywords, OK? Each person could only think of about eight. So let's ask the question. 
For each unique keyword, how many of them were thought of by one person and, when given a chance, the 42 other people, every single one of them, failed to think of it? That's pretty dramatic, right? How many? Turns out 2/3 of the time-- 2/3 of the time. So what does that mean exactly? That means humans recognize keywords well. If I show you a keyword and I ask you, is this relevant to the bombings and not relevant to the not-bombings, you'd know right away. I gave you these examples right away. You knew they were relevant, right? So we're very good at recognition. We suck at recall, OK? In fact, we're so bad at recall that I can convince you that all brains have an inhibitory process that causes us to forget things, which is sort of a cool fact, OK? Why would that happen? I can explain why it happens, but let me make it happen to you right now, OK? Do you know the idea of it's on the tip of my tongue? That's an inhibitory process, something in your brain causing you to not remember something. So no one speak, OK? Think of your bank password. That's why I asked you not to speak. Now think of your previous bank password. Now let's assume you don't rotate them because the bank tells you not to do that. Now think of the bank password before that, OK? My guess is almost nobody can think of it, OK? But if I showed it to you-- that would be a really cool magic trick. [LAUGHTER] And that is how I paid for college. But if I showed it to you, don't you agree that you would recognize it? OK, so that's the thing. We're good at recognition. We suck at recall. Surprising, but it's true. OK, so what do we do about that? We've developed some new technology. We call it thresher. And this technology, like most really good technologies, does not fully automate the human away. Because fully automated technologies usually do stupid things, like a driverless car without anyone to tell the driverless car where to go. It wouldn't be very useful. It would drive around in circles, right? 
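The tally behind the keyword experiment described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's code: the participant IDs, the word lists, and the function names are all hypothetical. It counts how many participants named each keyword, then reports the share of unique keywords that only one participant came up with.

```python
from collections import Counter

def keyword_counts(responses):
    """Number of participants who named each keyword (case-insensitive)."""
    counts = Counter()
    for words in responses.values():
        # A set, so one participant repeating a word is counted once.
        counts.update({w.lower() for w in words})
    return counts

def share_named_once(responses):
    """Fraction of unique keywords that exactly one participant thought of."""
    counts = keyword_counts(responses)
    singles = sum(1 for c in counts.values() if c == 1)
    return singles / len(counts)

# Hypothetical responses, invented for illustration only.
responses = {
    "p1": ["explosion", "lockdown"],
    "p2": ["Explosion", "terrorist"],
    "p3": ["suspect", "explosion"],
}

counts = keyword_counts(responses)
share = share_named_once(responses)
print(len(counts), "unique keywords;", share, "named by only one participant")
```

A high share means most useful keywords are recalled by just one person, even though anyone shown the word would recognize it as relevant.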
Fully automated doesn't really do anything. Fully human is inadequate. I just showed you. You are inadequate. We are all inadequate because we're humans, right? So what this technology does is it suggests words to you so that you can do recognition, which you're good at, rather than recall, which you're bad at, OK? OK, so let me give you an example. We're going to find those hiding in plain sight. We were studying China and Chinese censorship. And I'll give you an example of Chinese censorship in a minute. But we use this technology, this thresher technology, in order to find the following. So first of all, I'll give you an example. OK, so anybody speak Chinese? Do you know what that is? Freedom, that's right. That's the word "freedom." If you use the word "freedom" in China on certain websites but not others, certain social media websites, it will be filtered out, OK? If you write it in your social media post and you use that word, you won't see it. It will be filtered out-- not every one, but a lot of them, OK? So what do people do, right? Well, if you had to rewrite a sentence without a word, it would be no problem, right? You can find some way of writing around it, OK? What do they do? They use this word. Those who speak Chinese, do you know what this is? First of all, look closely, for those-- this is not the same as this. It's close, but it's not the same, OK? This means "eye field," an eye and a big field of daisies. It has no meaning at all, OK? So what people say is, we need more "eye field" in China. And if you speak Chinese and you see that, then you recognize that the writer means that. It's very clever, don't you think? And this is known as a homograph. Here's another example. This means harmonious society, which is the official slogan of the Communist party. This is filtered out. They don't want you to talk about that. So what do they do? Well, they don't use that. They use this. What is that? Well, it turns out this means "river crab," OK? 
And they say the policy of river crab totally sucks, OK? Well, then why are they doing that? The reason why is because if you say these two words, which I have tried so many times, and anybody who speaks Chinese in the audience doesn't think I came anywhere near close. But this is, of course, irrelevant. But this is known as a homophone. These two sound almost the same. Can you say them? Yeah. Say it. [SPEAKING CHINESE] That's right. [LAUGHTER] Well, here's another example. This is Bo Guagua. He was the son of Bo Xilai, and Bo Xilai was the guy that came down in the biggest scandal in China in 20 years. So you used the word Bo Guagua, and they censor it out, OK? So how did people write about this? This was the most fascinating event in China. If you were in China, you would want to write about it and read about it and figure it out, OK? And we in the United States study China for all these cases. We wanted to follow the thread of the conversation. When the people we were studying were incredibly creative and innovative and making up new things, we still wanted to follow the thread of the conversation. If we had to sit there and think of words, like in the Boston bombings example, we would fail because we had these inadequacies. But in this case, we would fail even if we were incredibly good because how do you think of something like that, right? OK, Bo Guagua. Instead, they changed the first Chinese letter to the English letter B. OK, I can sort of get that. They leave off the next letter. That sort of makes sense. OK, they use the second two. OK. Then they use B melon. [LAUGHTER] Why is that? Well, one of these characters by itself actually means melon, OK? Or ABB. Why would that be? We have to research this. Turns out that the princelings, the sons and daughters of the Politburo people, for whatever reason have names that are like Bo Guagua, like ABB. And so ABB is an abbreviation for princeling. And so these are ABB, right? 
So these are basically versions of slang, some of which you might have actually thought of if you had a really good day, OK? But the Chinese people are doing this continuously. How would you be able to follow this? We use this technology to follow this kind of thing, OK? Let me tell you about our China study. Because it's sort of fun. This is a more general thing. OK, the previous approach to studying censorship in China was talk to somebody that had a post taken down. Like somebody would write a post and notice that it was taken down. That's what they do in China. They read the posts, and they take down the ones they don't like. It's sort of amazing because they have millions and millions of posts, right? Well, the problem is that one person watching that would be like an ant on an anthill. They don't get to see the big picture, right? So what we did is we noticed that we were able to surprisingly-- actually we were very surprised-- that we figured out how to download all Chinese-language social media posts before the Chinese government could read and censor them. So we had the entire corpus of Chinese-language social media posts that the Chinese people couldn't read because they weren't allowed to. But we could, not because we were allowed to, but because we had them, OK? And so then we developed a network of computers around the world to check on each of the posts to see whether it was taken down, OK? And then so now we have two piles of posts, one censored, one uncensored. And we used methods of automated text analysis that we developed in social science methodology to try to figure out what the Chinese government is after. OK, about 13% are censored overall, just to orient you. Now, I'll tell you what we found. Overall, everybody knows the goal of censorship. The goal of censorship is to stop criticism and protest about the state, its leaders, and their policies. So we knew that goal, and so we went in with that goal. And we analyzed the data. 
And the first thing we learned is that it was utterly and completely wrong. Like we got nothing remotely related to this goal. So we thought, well, OK, that's a good starting point. It would be nice if we had an ending point also. And so we started to look at the data in many different ways until we finally hit upon a way that made some sense. We asked, what could be the goal? What we did is we broke up the question. And we said, well, maybe the goal is to stop criticism of the state, its leaders, and their policies. Or maybe the goal is to stop collective action, stop protest. And we separated the two. Once we separated the two-- we had the theoretical idea to separate the two-- all of a sudden everything incredibly clarified. It's like when you're at the eye doctor and they go, is it one or is it two? It's two, right? Because all of a sudden you can see that there's someone across the room. OK, what happened? The first was wrong. The second was right. Let me explain what that means. In China, what everybody thought was you couldn't criticize the government. What is actually the case is you can criticize the government. You can say whatever you want about the government. You can say, the leaders of this town are all stealing money. Here's how much. These are the bank accounts they have the money hidden in overseas. And by the way, they all have mistresses, and here are their names. That won't be censored. But if you say, and let's go protest, that will be censored, OK? In fact, if you say, the leaders of this town are doing such a great job, let's have a big rally in their favor-- censored. They don't care what you think about them. They're a bunch of dictators. What should you think of them, right? They only care what you can do. They're not afraid of the United States government and our big military power. They have nuclear weapons. What are we going to do, attack them? Right? It's not going to happen. What are they afraid of? 
They're afraid of their own people, right? So their own people-- how could their own people affect them? Their own people could affect them if they rise up and they have another big Tiananmen Square and it spreads contagiously across the country, OK? So they stop collective action. The implications of this are really interesting. The implication is that social media, then, is not merely something cool to study. It becomes actionable. It's actionable for the Chinese leaders, and, therefore, it's actionable for us. For the Chinese leaders, they measure criticism to judge local officials. If you're in charge of China, what do you want? Well, you have more power than Barack Obama and Bill Gates combined. You want to keep a good thing going, OK? How do you keep a good thing going? Well, you make sure there's no collective action. How do you do that? Well, there's between 50,000 and 700,000 governmental units across China, depending on how you count, OK? You appoint all of your best friends to run them, OK? And there are a whole lot of other people to run them. And then what happens if one of them isn't doing a good job? Well, then people may protest, and that protest may spread contagiously across the country, and then you might lose your job. How do you watch all of those people? Well, actually social media is a really good way. You could see why they're being criticized. And it turns out we can use how much they're being criticized to predict whether or not they're going to lose their job. Because they're using that. And of course, they censor to stop events with collective action potential. So we all thought they won't allow criticism, but they allow criticism. It's useful for them. We as academics can use the criticism and the censorship to predict officials in trouble and likely to be replaced. 
Or dissidents, like before they're arrested, if censorship goes up very substantially on a particular dissident, they need to duck, because they get arrested four or five days later. Or a peace treaty-- a peace treaty between Vietnam and China, they found oil in the South China Sea between those two countries. And both countries started taking a real liking to that body of water. And the media is talking about potential conflict between the two countries. And they're sort of shooting across each other's bow or the equivalent. And so what happened? Censorship was soaring. And then one day we noticed-- no one else did because nobody else had the data. We noticed it just stopped. They stopped censoring completely. What's the media talking about at that time? They're talking about maybe there's going to be a war. What's going to happen? This is going to be some conflagration, yes. But we noticed censorship had stopped. Four or five days later, they signed a peace treaty with Vietnam. We don't know for sure, but we think what happened was they made a deal, and they told the censors, don't worry about it, and they signaled. What's happening here is that you have a giant operation within the Chinese government designed to slow the flow of information, but it's so large that it conveys a lot about the intentions and the goals of the Chinese leaders. It's like a big elephant tiptoeing around. It leaves big footprints. And if you look at scale with big data social science methods, you can actually see these things ahead of time, emerging scandals and things like that, disagreements between central leaders and local leaders. We can also see those things. I'm going to-- let's see. I am going to not be able to see the time. OK, I'm going to skip two slides and give you one final-- agree with everything on these slides, right? [LAUGHTER] OK. All right. Let me tell you about one final message, which is written up there, OK? So let me ask you the following question, OK? 
What university research has had the biggest impact on your lives personally? Now let's think about what the biggest kinds of things universities have done. Well, first of all, all the progress in the last 400 years has come from science. And most of that, not all of it by any means, but quite a lot of it has come from university research. You can think of this as just research if you want. University was gratuitous. What research has had the biggest impact on you, what university research? Maybe the genetics revolution-- we spend a lot of money on the genetics revolution. The general genetics revolution has taught us an enormous amount about the basic biology of what's going on. But it's sort of been a failure up until now in terms of curing diseases, at least relative to its predictions. We think that's going to change, but we're not there yet. Well, how about the Higgs-like particle or gravity waves more recently? Actually, to me those are just like the coolest things that you could possibly see. But how much does it have an effect on your lives personally? Some, some. Just the wonder is totally worth the cost and way more than we pay for these things, OK? I'm totally in favor. But I just want to put it in context. How about exoplanets and the Mars rovers? Yes, yes, yes, yes, yes. Please pay for this. How about doubling life expectancy in the last century? Yes. [LAUGHING] OK? There's a lot of things that belong in this list. My only argument here and my last slide today is that on this list also belongs-- I was hoping it was going to come up-- also belongs quantitative social science. It also belongs on this list of the most important ways that university research has impacted you in your lives. Let's think about what it's actually done. First of all, what is it? Well, it's actually big data or big data applied to people. We sometimes call it data science or data analytics or statistics or [INAUDIBLE] or political methodology. 
Or there's actually like 100 different terms. And you could use thresher technology to discover all those words that you wouldn't think of. What has quantitative social science done? Well, it's transformed most Fortune 500 companies into data producers, data analyzers, and data monetizers. It's transformed most Fortune 500 companies. It's established whole new industries. It's altered friendship networks. Facebook is basically a social science innovation. So what is social media? Social media is the largest increase in the expressive capacity of the human race in the history of the world. That's cool, OK? It's changed political campaigns completely. Hasn't always produced the outcomes we want, but we don't mess with the outcomes, OK? It's transformed public health. It's changed legal analysis. It's not discovering anymore. It's e-discovery. It's impacted crime and policing in huge ways, military as well. We invented economics, transformed sports. Have you seen Moneyball? It's transformed sports. Who would have thought quantitative social science would have an effect on sports? It sets standards for evaluating public policy. We did an experiment in Mexico, where we evaluated their health care system. And we did a randomized experiment, the largest randomized experiment in health policy ever, where we randomly assigned hospitals and doctors and medicines and money to pay for it all to communities and learned a tremendous amount. And they got to improve their health system. And millions and millions of people got health care that wouldn't have got it otherwise because we did this evaluation. It was a really gratifying kind of thing to be able to participate in. Anyway, there's also three et ceteras, because there's lots of other things that quantitative social science had an impact on. And so if you're anywhere part of or touched this world, I just want you to remember that this belongs on that list, OK? And I thank you for all of your time. 
[APPLAUSE] Nobody interrupted me, which I apologize for not insisting that you interrupt me. So interrupt me now. [INAUDIBLE] politicized to the extent that-- oh, I'm sorry. Oh, excellent. Mortality statistics go back hundreds of years. So how could Republicans or Democrats disagree on mortality statistics? I mean, there's no hope if that's the case. [LAUGHTER] Well, let's put it in context. So politicians don't have to agree with the facts. They can use facts any way they want. We're in a democracy. The truth is one of many considerations. And so the politicians can decide on whatever basis they want. We're not going to tell them how to decide. I agree with you that it would be nice to have the facts on the table. And the current mortality statistics, the fact is, everybody does agree with them. Forecasts about them, we should be using best practices, absolutely. The fact that we're not hurts us all, Democrats, Republicans, retirees, workers, everybody. I completely agree. There's no reason for that. OK, thank you. Sure. Run, run, run. [LAUGHING] Sorry. Thank you for a great talk. First, a comment on the Social Security Administration-- there were many over years following 2000 who said that the actuaries were just extrapolating. They weren't taking into account myriad factors, like you point out, health care and changing lifestyles and the like. And Sam Preston and others repeatedly showed in their research how that did change their thinking. So a lot of this has to do with the culture of an organization more than, I think, the politics. Perhaps one of the greatest challenges is trying to infer meaning from language. And will we be able to put this into terms of what's written what people really mean? And do you think quantitative social science will give us any insight? 
[CHUCKLING] It is the culture in each of these-- you asked two different questions, which essentially have the same underlying piece, which is human beings communicate in very complicated ways. We sometimes call it culture or language. And there's a way that when we quantify that, we will lose some of that information. It's not only a worry. It's real. In fact the point of quantification is mostly to throw away other information so that we can have the abstract summary. If we summarize the wrong piece of it, then we completely lose everything, absolutely, totally agree. That's why I think the best methods that we come up with are the ones that are not fully automated but not fully human, the ones that are human empowered but computer assisted. When you all write, when you write letters, if you use Microsoft Word or Emacs or Google Docs or whatever it is, what is that? That's computer-assisted writing. Like Microsoft Word doesn't tell you how to write. It helps you write, OK? So we have a project to do computer-assisted reading. Well, what is that, actually? In fact, let me make it more outrageous. It's computer-assisted conceptualization. Well, wait a second. Computers are not allowed to do that. Only we do that. Well, actually, we have a pretty lame working memory-- really lame working memory-- really, really, really lame. We have this real limitation. Let me explain how limited our working memory is. When you're writing something, if you re-write one sentence, what's the first thing you do after you rewrite that one sentence and rearrange 10 words? You read it again. You read it again. You can't remember 10 words and rearrange them? No, no, because you are inadequate. You are humans. [LAUGHING] Right? That's the problem, right? So the idea that we can be helped-- yeah, of course, right? I mean, we can have computer-assisted conceptualization. So we have some methods of doing computer-assisted conceptualization. 
There's a long way there before we get replaced by the robot, which I don't think is going to happen. But yeah, so we have to pay very close attention to the culture of organizations and the deep qualitative information that individuals have. And that is absolutely the problem in the Social Security Administration. There is a culture in the Office of the Chief Actuary that is they are at the center of attention. They've done a very good job of curating the data, but they don't want to share the data with anybody. Years ago that made sense because other people would screw it up. But really what they should do is they should take the data. They should just make it available to the scientific community. They didn't even share the data with other parts of the Social Security Administration, OK? If they shared it with the scientific community and provided replication data sets like we've been doing in academia, as you might know, then you would have hundreds of social scientists trying to do better. And if they did better, they would cast no aspersions on the actuaries. It would be great for the actuaries if someone did better. Someone at the end of the day, by the way, has to make the call. And the person who has to make the hard call is the chief actuary or the Office of the Chief Actuary. Or they make the recommendation to the Social Security board. They make the call, I guess. That's fine. But they should have the best advice. They shouldn't be doing it by themselves. The key contribution of science, the reason why the social sciences are moving from studying problems individually, sort of in our separate monasteries, to a scientific model where we're actually solving problems, is because of the community. Because it's much easier to fool ourselves than it is to fool other people. And when we work together and we share data, then one of us can check on the other, and together we can do way better. 
That little paragraph actually accounts for almost all progress in the last 400 years. So that's a long answer to your question.
Katie.
Can I ask a follow-up question on access to data? I mean, one of the barriers to having more social scientists working like this is that it's often very difficult to get access to this kind of data. Either it's corporately owned and controlled, or, as you mentioned, government isn't willing to give it up, so how have you been able to get access to the data? And what more should we all be doing to push that forward?
Yeah, so access to data is absolutely essential. Let's think of it in a couple of different ways. So one is, at a minimum, academics seem to be sharing data with each other. That's actually harder than you would think, but we've made a lot of progress over the years. In 1995, actually, I wrote an article, as you know, called "Replication, Replication," where we encouraged scholars to make data available when they publish their article. We've made enormous progress since then, and actually, it's now the norm within academia. Governments have actually made a lot of progress in making data available. As we've seen with Social Security, not all parts of government, but local governments right now are competing with each other to make the bus schedules available so the high school student can write the app to tell you when the bus is going to come. That's a terrific thing. They should continue on that path, and they are, and that's a really great thing. The next stage is companies. It used to be that almost all the data in the world was inside the university because we created it. Now almost all the data in the world is in companies and governments. Governments eventually make data available. Companies don't have to. And they now have more data than anybody. There is a treaty to be signed-- to be negotiated first before we sign it, I think.
But there's a treaty to be made, a grand treaty, between the academic world, the commercial world, and government. And if we could sign this treaty, what would happen, I think, and what should happen is that the companies-- what do they want? They want access to the data. They don't want the random terror of government coming in one day and saying, you know what? After two weeks, you have to delete all your data and all their money also, right? They don't want that. They basically would like to be regulated in some way, as long as it's predictable. Predictability in business is very, very useful. So that's sort of what companies want. What academic researchers want is access to that data. The data that Google and Facebook and Microsoft and all these companies have, we could learn so much more about human behavior. We would live longer, healthier, happier lives, and most of the issues that members of Congress care about are actually the issues that social scientists study, like the vast majority of the issues. And we would be able to make progress on everything from teenage pregnancy to unemployment-- you name it, every single one of these issues-- if we could get access to these data. But it's not fair to ask these companies for access to the data if we're going to hurt them commercially. It's not fair, right? So they have to get something from it. They have to get some promise that the people won't come in and give them random terror on-- they call them the privacy nuts. They're actually very important to us. So that's why I think there's a treaty to be made, like all three partners would be way better off if we could make this happen. Politically this is a very difficult thing. But I think somebody should work on that, maybe even in a building like this. [LAUGHTER] Thank you. 
Increasingly, large public agencies that serve vulnerable children and families, for example, state child welfare and juvenile justice agencies, are very enamored of data and predictive analytics, predictive risk modeling. For example, looking into emergency room records to identify families most at risk of abusing the children raises all kinds of questions about bias and surveillance. I would be very interested in your take on this problem. I know Columbia University is developing a fair test procedure, an algorithm to help those machine-learning pipelines. But I would welcome your thoughts about that issue.
Yeah, so this is a really important issue in this area, but also quite generally. The advantage of big data is it becomes more and more informative. And that means we can solve more and more problems. But it's also potentially more and more intrusive, OK? And so what's going to happen-- I mean, there's going to be a lot of treaties signed. Some of them will be signed like every year and they will be different, right? But whatever they are, there's going to be some compromise, because all of these issues are really important, like we want to maintain individual privacy. And also we'd like to solve these societal problems. My main answer to this is please don't forget all of those parts, right? Some people in politics say it's a privacy violation. Nothing is ever only a privacy violation, right? It also produces a good. So the good is-- I mean, how much privacy would you be willing to give up to live 10 years longer than your expected lifespan? [LAUGHTER] Right? It's never put that way. It's only put, can I look at your email without your permission? Because the answer to that question is no, OK? But if the benefit were clear also, then it's not like the answer is necessarily yes, if your life were miserable, right? These are difficult questions, OK? But I just want us to remember the good, OK?
And there are ways of using very private data, at least by academics, in ways that can benefit everybody, OK? A majority of children who have cancer in the United States are in randomized experiments. That is, we decide on their care based on flips of coins, in very scientific ways that benefit them also. But the parents of these kids have, every single one of them, given permission. And you know what? We all benefit because of that. When we give up data in the interest of science, big deal, right? So it could be that somebody at Google or the governments or somewhere is like looking at something they shouldn't be looking at. And if we find them, we're going to send them to jail, OK? But still, they could look at it and they really like annoy us, OK? But if it's something that could cause us all to live 10 years longer or solve the crime problem or eliminate the problem of people dying from pain medicines or any number of other issues, I just don't want anybody to forget the good. That's all.
Where do we put the--
Have you seen collaborations developing between quantitative fields and social science fields? I mean, you have your institute, but do you see more interdisciplinary work in the future, more traditional academic walls or societal walls breaking down as people merge their talents?
Yeah, I think that's a great question. I'll give one answer with two parts. In the social sciences, more and more social scientists, economists, anthropologists, political scientists, sociologists, psychologists are being trained as social scientists at large. Not completely, but more and more. More and more, there's co-authorships across fields. More and more, we know what people are doing in different fields. When we wrote our paper on Social Security, to respond to Myron again, we were very interested-- it was very important for us to figure out the culture of that organization and the psychology of people in a very difficult position with enormous political pressure.
And so we studied that. Actually, in psychology there were people that worked very hard on these kinds of issues and how we might be biased under certain circumstances. And we had other kinds of ways of recognizing that we were inadequate, frankly. That's what that literature shows. And it was very important to us, incredibly important. And so there's much more of that. So that's one part, is yes. The other part is also yes. The other part is the methodological subfields of these different disciplines, political methodology within political science, econometrics within economics, sociological methodology within sociology, psychometrics and psychological statistics within psychology and cliometrics within history and chemometrics within chemistry and you name it, OK? All these are methodological subdisciplines submerged within substantive fields. But they are now talking to each other. And they're forming a sort of meta scientific discipline. We actually have a name for it. We're calling it data science, whatever the heck that is. It had to be a name different than all the existing names, right? Even though there's a lot of existing names. And it's great that there's these communications, because the solutions in one field turn out to be solutions in the others. We have a seminar at Harvard every Wednesday at noon. You're all invited. Free lunch, by the way. It's in applied statistics. It's billed as a tour of Harvard statistical innovations and applications with weekly stops in different disciplines. And I'll give you one example from this. We had an astronomer come-- oh, no. We had a political scientist come one week, predicting the number of presidential vetoes a year. Now, if you look at that, it goes like this. The average is around 15, some zero, some more. It goes up, all right? And there's some statistical methods to predict that and some innovations in statistical methods to deal with a particular type of counts in presidential veto data.
Next week someone came from the Chandra X-ray Observatory. What the heck is that? So that's a satellite that orbits the planet. It's a telescope in the X-ray spectrum. OK, so what's that? Like what does the data look like? So it's basically like a little checkerboard, and in each of the squares of the checkerboard, it counts photons, little particles of light. So it's counting photons. So it counts, right? And if you look at the data, you say, well, how many counts are there per period or whatever it is? And the person standing in front of the room said, well, it goes up and down. Sometimes it's zero. Usually it's about 15. And so it turned out that the political scientists got some methods that the astronomers were developing. And the astronomers got some methods that the political scientists were using. And there was seamless communication of information, even though neither one had any idea what the other one was talking about substantively. [LAUGHTER] So that's the kind of thing that I think we'd make a lot of progress from. So it was a great question.
[INAUDIBLE] He wants to stop us. [LAUGHTER] OK. Sure, sure, sure.
[INAUDIBLE] I think we should, because I know your time is limited, too. So I also wanted to say that the line that was occurring to me from William Gibson, the cyberpunk novelist, was, the future has already arrived. It's just unevenly distributed. I'd like to give a huge thank you to Gary for giving us a glimpse of that exciting future. [APPLAUSE] [MUSIC PLAYING]


Early life

King is the son of Bruce King, a three-time Governor of New Mexico,[2] and Alice M. King. He attended New Mexico State University and obtained a bachelor's degree in chemistry in 1976. He received his Ph.D. in organic chemistry from the University of Colorado, Boulder in 1980.

He then attended the University of New Mexico School of Law, where he received his J.D. In 1984, King formed the law firm of King and Stanley in Moriarty, New Mexico; in 1990, he assumed the position of Corporate General Counsel and Senior Environmental Scientist with Advanced Sciences, Inc., an environmental consulting firm.

In 1987, he married Yolanda Jones; more than 1,000 guests attended their wedding. Yolanda Jones King was the director of Engineering & Technical Management at the Air Force Nuclear Weapons Center (AFNWC) at Kirtland Air Force Base. She also served as chair for the NATO RTO Sensors and Electronics Technology Panel.

Gary King often accompanied his wife to meetings. They traveled to countries such as Taiwan, France, Italy, the Netherlands, Slovenia, Romania, Poland and the Czech Republic.[3]

Political career

King ran for Governor of New Mexico in 1998 but lost the Democratic primary to Mayor of Albuquerque Martin Chavez. In 1998, he became the Policy Advisor to the Assistant Secretary for Environmental Management at the U.S. Department of Energy (DOE) in Washington, D.C. Within a year, he became the Department's Director of the Office of Worker and Community Transition. While at the DOE, he developed and implemented a program fostering cooperation between federal, state, local and Native American governments to enhance cleanup activities. He served for 12 years in the New Mexico House of Representatives.[citation needed] In 2004, King ran for New Mexico's 2nd congressional district seat, losing to incumbent Republican Steve Pearce by 60% to 40%. In 2006, King was elected Attorney General of New Mexico. He was re-elected in 2010, winning against Curry County District Attorney Matthew Chandler.[citation needed]

As the 30th Attorney General of New Mexico (2007 to 2015), King spearheaded the effort to pass legislation that made human trafficking a felony. A United Nations committee invited King to present this legislation as a model for other nations seeking to end the practice of human slavery.[3]

On March 2, 2011, King argued on behalf of the respondent, New Mexico, before the United States Supreme Court in Bullcoming v. New Mexico. On July 10, 2012, King officially announced that he was seeking the Democratic nomination for Governor of New Mexico.[citation needed]

2014 gubernatorial race

On June 3, 2014, King won the New Mexico Democratic primary for governor, defeating Alan Webber, Lawrence Rael, Howie Morales and Linda Lopez, all of whom immediately endorsed him after losing the primary election. King unsuccessfully ran against incumbent Republican Governor Susana Martinez in the general election. He told fellow Democrats at a fundraiser that Martinez "does not have a Latino heart".[4]


  1. Bruce King Obituary - Santa Fe, NM | Santa Fe New Mexican. Retrieved 2018-08-31.
  2. "Gary King's Biography - Project Vote Smart". 1954-09-29. Retrieved 2014-03-15.
  3. Carol A. Clark: For New Mexico’s Attorney General and His Wife – It Really Was Chemistry, Los Alamos Daily Post, June 2, 2014.
  4. "King on Martinez: 'no Latino heart'". Accessed October 18, 2014.

Legal offices
Attorney General of New Mexico: preceded by Patricia Madrid, succeeded by Hector Balderas
Party political offices
Democratic nominee for Governor of New Mexico: preceded by Diane Denish, succeeded by Michelle Lujan Grisham

This page was last edited on 26 August 2019, at 11:37
Basis of this page is in Wikipedia. Text is available under the CC BY-SA 3.0 Unported License. Non-text media are available under their specified licenses. Wikipedia® is a registered trademark of the Wikimedia Foundation, Inc. WIKI 2 is an independent company and has no affiliation with Wikimedia Foundation.