From Wikipedia, the free encyclopedia

Jeff Offutt
Offutt in 2002
Born: April 30, 1961 (age 62), Lexington, Kentucky, U.S.
Alma mater: Georgia Institute of Technology[1]
Scientific career
Fields: Software engineering, computer science
Institutions: George Mason University, Clemson University
Thesis: Automatic Test Data Generation (1988)
Doctoral advisor: Richard DeMillo
Website: cs.gmu.edu/~offutt/

Jeff Offutt is a professor of software engineering at the University at Albany, SUNY.[2][1] His primary interests are software testing and analysis, web software engineering, and software evolution and change-impact analysis.[3]

He is the author, with Paul Ammann, of Introduction to Software Testing, published by Cambridge University Press. With Robert M. Hierons, he is editor-in-chief of the journal Software Testing, Verification and Reliability. He also helped create the IEEE International Conference on Software Testing, Verification and Validation and was the first chair of its steering committee.

In 2019, Offutt received the Outstanding Faculty Award from the State Council of Higher Education for Virginia, the highest honor for faculty at Virginia's public and private colleges and universities. The award recognizes accomplishments in teaching, research, and public service.[4] He won the Teaching Excellence Award, Teaching with Technology, from George Mason University in 2013.[5]

Offutt is known for many fundamental contributions to the field of software testing, in particular mutation testing,[6][7] model-based testing,[8] bypass testing of web applications,[9] and automatic test data generation.[10][11]

Offutt received a bachelor's degree in mathematics and data processing (a double major) from Morehead State University in 1982, and a master's degree (1985) and PhD (1988) in computer science from the Georgia Institute of Technology. He was on the faculty of Clemson University before joining George Mason University in 1992.

He is the son of Andrew J. Offutt and brother of Chris Offutt. He is married to Jian and has three children, Stephanie, Joyce, and Andrew.

YouTube Encyclopedic

  • GTAC 2010: Automatically Generating Test Data for Web Applications
  • Myths about MOOCs, Ebooks, and Software Engineering Education
  • Transforming Your Environmental Workflows with Mobile Apps and the Operations Dashboard

Transcription

>> OFFUTT: I want to say it's really an honor to come here. I've heard of GTAC before. I've heard of the company Google a couple of times before and it's--I mean it's really--it's really a pleasure to come here. I'll tell you something that was--that's really been exciting is I found that every single talk yesterday was interesting and I learned something. And that's--I wasn't so sure that would be true coming from, you know, as an academic researcher coming to a more industrial conference, but it's been fascinating. Good talks, very interesting subjects, I took a bunch of notes of things I want to look at as potential future research problems. So really I want to get to my talk quickly and I've cut out a lot of slides because I want to sit down and listen to the rest of you all. The talk after mine, by the way, is fascinating by a brilliant young scientist who, by the way, has had a great education. I'll tell you more about that later. So when I--when Google first asked me to come to GTAC, I said, "Well, here are a couple of things I might talk about." And they said, "Well, this one isn't very interesting and this one is and this one is. Can you do both?" And so I thought, "Well, can you give me three or four hours?" And they said, "No, only James Whittaker gets three or four hours at GTAC." So they said, "But you need to combine those and also fit it into an hour." So I think I've done that, at least I've done the first part. The other thing that's really pleasing to me is on the plane over here, I was looking at my slides and I was thinking, there's one part that I'll get to in a few minutes. And I thought, "Well, a lot of testers aren't going to get this because it's fairly technical, low level, really relies on program analysis that, you know, requires a lot of programming knowledge. And then yesterday, I thought, "Jeez! I should add more to that section because that may be the most interesting part of the talk for a lot of us. You know, I do want to make one other point, by the way, my book has never been recycled in the diapers, either for children or adults. I'm proud to say that. It was printed on recycled paper and I'm not sure where they got that paper, by the way. Now, I have some clue now. So let me get to it. So there are kind of two real ideas I'm going to talk about. One is more related to web apps than the other. And there really is the section three and five in my mini outline. And I'm going to start off with some motivation, I think as a lot of people here are testers, you probably don't need a lot of motivation. You may have seen some of these but then again some of you may be able to do what I do, which is steal some of the--some of the facts and use them to motivate other people for what you're doing. There's nothing else--there's some interesting stories, some things to think about. So I've been doing this for awhile. I finished my PhD in '88 and that was on the subject of testing. And what was frustrating in the '90s was I always felt like I was teaching something or selling something that most people didn't really care about. Because in the '90s, the quality of your testing didn't really have a big impact on the bottom line for most companies, right. It was not really competitive. But in the 21st century, we're going through major changes, and that's pretty exciting to somebody like me. 
A big change, we have a lot of things in our civilization that are being controlled by software and that's pretty exciting and because it's sort of controlling fundamental infrastructure. It has to be really good. The other thing is compared to the '80s and '90s, we have a much bigger market. It's more competitive. We have a lot more users, you know, there are maybe a few million back when I finished my PhD, now there are several billion. I don't--there's something like 6 billion people in the world. I don't know what percentage of them is using software on a constant basis, but it's way more than half, it's very high. Another thing that's--that I think is really interesting is how often we put software in places that we don't really think about. So I have this little thing, I have this--this little thing as software and, you know, I gave up my slides a few minutes ago. This is--there's a fair amount of software in here. Then I was thinking in my room last night, I was thinking about what I brought with me. Well, don't laugh; I'm a bit of old school in some ways. So this is an old Palm Pilot. It was good when I got this. It still works. A lot of software embedded in there. I have to listen to music, especially on, you know, 17-hour plane rides. My phone, of course, which is--it's not a very smart phone but it's still a lot of--a lot of software in there. I brought a camera. This even has some software, so if my hand is shaking it would still take a decent picture. That's pretty clever embedded software and, you don't realize even--even this brick to power up my computer, it's got some very clever software that allows, that converts 240 into 120 volts, so when I travel abroad. So if you just think of--look in your pockets and think about all the software you have is probably more than we had in the entire world 30 years ago. And all that software has to work very well because it's not the software that we're caring about; it's the device that it's embedded in. And then the--one of the big changes in the field now is agile processes. And I'm like a lot of people, I'm still kind of open-minded whether agile processes are going to work or not but one thing I know is it puts testing front and center. And from my perspective, that's a good thing. So we're in the middle of a revolution and it's changing--dramatically changing what testing does to the success of the software and the bottom line of companies. And that's very exciting to someone like me who's been a researcher for years and an educator in software. Now, a good friend of mine named Mark Carmen--I'm not sure if you can see that down there--gave me this quote and I just love this quote that we have a civilization with a skin now and the skin is software. And you think about--just think about the amount of software it took for us to travel from our homes through Hydrobed, to make reservations, airplane reservations, to get our luggage here, to check in at the hotel, to check in at the airplane, even--I mean I drove to my airport with a car that has over a hundred chips on it. It's a skin that surrounds us all the time and I just thought that that was a wonderful metaphor for what software is doing, and all that software has to work well. Here's a kind of a scary example of what happens when the software doesn't work very well. So this is Airbus 319, and I got this example from a friend of mine, Mary Jean Herald, who was actually on a plane when this happen once. 
The pilots were up in front of the plane flying from--this was from Atlanta to London, so somewhere over the North Atlantic in the middle of the night, it lost their autopilot. Okay, that's an inconvenience but think--they still can fly planes without autopilot. Then they lost their flight deck lighting and the intercom, which is pretty disturbing. Then the next thing that happens, they lost their flight and navigation displays. And if you've been ever been driving on a dark road and you turn off your lights, you can imagine what that feels like when everything goes black, especially when you're out in the middle of the ocean, in the middle of the night. So what do you think the pilots did? Any guesses? Pilots are trained not to panic easily. So I hope they--those are the planes we don't know what happened. Any other guesses? Yeah. That would be a good idea. I think the Air Force pilots would do that. You know, actually they did the same thing that you and I do when our computer screen goes black. They held the button down and then they prayed. I don't know if they're Christian, Muslim, Atheist but I know they prayed because they're flying a glider in the middle of the night over a very cold ocean with 500 lives behind them. And they pushed the button and indeed the software came back up and everything worked fine, and they made an announcement to the passengers, "We're sorry for the inconvenience for the interruptions to your in-flight entertainment service. We will try to put your movie back where it was when we had the slight interruption." And most of the passengers never knew what happened. This has happened on this plane several dozen times in the past few years and they still haven't solved the problem, but they know it's software. But they haven't solved this. So, the manual now for this plane has a page that says, how to recognize this problem and what to do. And I'm not sure if the manual includes the prayer because I don't think it's actually necessary to that point. But they have it in there. Last night, by the way, I read a short article online; Nissan is recalling several million cars. Why? Because of a software problem. That was just announced yesterday. Here's some other software failures that have been documented with some money attached to some of them. So there this NIST, NIST is a US government agency, National Institute of Standards and Technology. They did this detailed study and found that we're throwing away billions of dollars every year because of bad software. And they estimated that if we just tested better, we could cut that in half. That's a fair amount of money. Some of you may remember that this is northeast of North America, around the Great Lakes region, starting in Canada then propagated all around the Great Lakes. It was a smart software error that caused--that in the alarm system that caused one power station to go offline and that propagated and cost, I think it was something on the order of 30 or 40 millions of dollars damage to various systems. This--one of my favorites is this. Do we have anybody from Amazon here? Yeah. This is great. One of my friends got a bonus with this. So it's a buy one, get one free offer, except there was an if statement that was written backwards. So if you apply the coupon to get one free, you could still get the second one free. So I had a friend--was actually a student who actually got two items like that. I don't know how much they lost. Probably not that much because I'm sure they fixed it quickly. 
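For illustration only, the "if statement written backwards" bug he describes might look like this minimal, hypothetical sketch (the class, method, and prices are invented, not Amazon's actual code):

```java
// Hypothetical sketch of an inverted-condition coupon bug, loosely
// illustrating the "if statement written backwards" failure described above.
public class CouponPricing {

    /**
     * Intended rule: in a buy-one-get-one-free offer, the customer pays for
     * the first item and gets the second free. The buggy version negates the
     * condition, so applying the coupon makes *both* items free.
     */
    public static double totalDue(double unitPrice, boolean couponApplied) {
        if (!couponApplied) {          // BUG: condition is inverted; should be (couponApplied)
            return unitPrice;          // charge for one item, second is free
        }
        return 0.0;                    // coupon path accidentally gives both items free
    }

    public static void main(String[] args) {
        System.out.println(totalDue(25.00, true));   // prints 0.0 instead of 25.0
    }
}
```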
But that's--it's a very small thing. A big sea change occurred 2007. So Symantec tracks software security vulnerabilities, they have for many years. And they found in 2007, we crossed the line. Most security vulnerabilities are now due to the software errors, not the network problems or database or photography. It's now software. So if you don't test your software, you can't have secure software. And then there are these estimates about financial services and credit card sales applications where--and this is just in the USA by the way, these losses; every hour, millions of dollars. NIST pass on to consumers. So I don't really know what the worldwide monetary losses. But it's staggering and it's probably between a 5% to 10% drain on the world economy just because of bad software. Now, who can--who can improve the situation? It's people like you where I can only have a mild influence. People actually do things that can make a big difference. So how do we do this? This is an idea that came to me out of the textbook I wrote and this is really a process kind of view. And I'm going to talk about this briefly and then say how we fit the more technical aspects of testing in here. So this may look a little bit strange. I'm going to walk you through this graph. We--I look at testing as activities that we perform down here, I called the implementation abstraction level and up here the design abstraction level. So we take some software artifact that maybe the source code, that maybe a URL design document or requirements or user manual or something that describes something about the software. Lots of artifacts can be used to test from. And then we go through some analysis and we create some model, a graph or logical expression, something along that--we actually have four models we define; then we apply some engineering principles to create requirements on our test. Let's say, our test have to do something, has to cover every edge in a graph, for example. So those requirements are--describe how the test should be designed. We also have this sort of other path where we look at our software and we use a human-based approach to develop test requirements that are separate. And the interesting thing is you--these different kinds of approaches will detect different kinds of faults. And there are some faults you can't really get to with criteria and there are some faults that you'll never get to with the human-based approach. A human-based approach, if you only use that will also tend to yield lots and lots of tests in a fairly inefficient testing. Those tests are sometimes refined into something more detailed, depending on exactly what those--what the requirements look like. Then we generate values and that comes back down out of these design abstraction level, that's what I call it. So that's where we get actual values. Up here, we're essentially doing math. Just like real engineers, right. A civil engineer uses mostly algebra and calculus to model things about structures like a building or a dam or a lake and or an airplane and then they use those models. Those--that--those calculus models of the artifact they're trying to build to do all sorts of things. Compute what kind of materials were needed, what safety is--security issues, et cetera. This is the same kind of approach. It's really traditional engineering using mathematical structures up here to do some of our design work in an abstract way. It makes it more efficient. 
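As a rough sketch of the "cover every edge" style of test requirement described above (an illustration, not his actual tooling), a control-flow graph can be modeled as an adjacency list and the requirements listed mechanically:

```java
import java.util.*;

// Minimal sketch: derive "cover every edge" test requirements from a
// control-flow graph modeled as an adjacency list. Node numbers are arbitrary.
public class EdgeCoverageRequirements {
    public static void main(String[] args) {
        Map<Integer, List<Integer>> cfg = new LinkedHashMap<>();
        cfg.put(1, List.of(2, 3));   // branch at node 1
        cfg.put(2, List.of(4));
        cfg.put(3, List.of(4));
        cfg.put(4, List.of());       // exit node

        // Each edge is one test requirement: some test must traverse it.
        List<String> requirements = new ArrayList<>();
        for (var entry : cfg.entrySet())
            for (int target : entry.getValue())
                requirements.add("cover edge (" + entry.getKey() + " -> " + target + ")");

        requirements.forEach(System.out::println);
    }
}
```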
Then once we have the test values, we add some additional values, largely to deal with the kind of controllability and observability issues that Bob talked about yesterday. We automate those in test scripts, execute them, evaluate them, and the results down here provide feedback: up here, we need more tests, or maybe fewer tests, or better tests. So there is that kind of feedback loop through here. These activities, it turns out, can be grouped fairly readily into a few categories: test design at the top, test automation down here, test execution, and then test evaluation. What we're doing by this separation is separating the tasks into different kinds of activities that can be performed by different kinds of people, and the interesting part is, the kind of knowledge and skills you need to do test design up here are very different from the kind of knowledge and skills you need for test automation. And that's very different from the kind of knowledge and skills you need to execute and evaluate the results of the tests. Okay, so we need different people to do those. In any test organization, if you take someone who is really good at test automation and have that person design tests, you're probably not going to get good tests, you're not going to have a happy employee, and you're going to waste a lot of resources. That's poor management of your human resources. And what happens eventually is the people who are good at this kind of work leave. They want to go over to development and do something more interesting. So separating the tasks allows us to assign the right activities to the right people, and that's just plain good people management. Not that I'm a manager; in fact I stay in academia partly because I'm afraid a company would make me manage. But doing this separation allows us to raise our abstraction level, and in this process of test design, by being separate from dealing with values, we're just dealing with mathematical abstractions. It makes that process much, much simpler, just like the algebra and the calculus do for traditional hard engineering fields. So this is a process that I've been talking about for a while. Every company I've talked to that has started trying this, and every project manager, has found that it helps them get more tests faster and cheaper, and their employees are much, much happier in their jobs. And that's very important, because here is another graph that shows the cost of not doing the right kind of testing at the right time. This is something I pulled out of a report from the SEI, and I redrew their graph because it took me about an hour to understand their chart, but they documented a number of projects--it was something like 30 or 40 projects--and calculated what happened: when faults appeared in the program, when faults were found, and the relative cost of finding and fixing faults. So let me just walk you through this. The yellow here is when faults originated, when they were put into the program: requirements, design, programming. Then the green here is when we're actually able to detect those faults.
So, a lot more faults were detected during system test, 20% during program unit testing and integration testing, few during requirements, a few during--I'm sorry, during design, a few during requirements and then not very many in production. But the real key are these red bars. The red bars represent the cost. So, the yellow and the green, they're percentages. The red is the unit cost. So, unit cost is fixed at one for finding and fixing a fault during early in the process. By the time we get to the integration test, it costs about five times as much to find and fix the faults in software. By the time we get to system testing, it's 10 times. And then if a pro--if a fault gets through to production, like all the examples I showed a couple of slides ago, that's about 50 times the cost of finding and fixing the faults early on. They're not as many but if we just assume something simple, $1,000 unit cost, 100 faults just to make the math easy, then we get these costs and you can see 6K finding problems during requirements, 20K during unit testing, and then 360K and 250K. So, the bulk of the cost is actually out here in system test and production, even though there are relatively few faults found in production, the cost per fault is so high that the cost of finding those faults starts to sky rocket. So, my view as a teacher and as a researcher is a big part of my job is to take these green bars over in design, program, unit--or unit test and integration test and pull those up by finding more faults there. And thereby taking these green bars and pushing them down so that then the cost starts to change. So, these new circles are the cost if we push those down, you know, I just took some estimates of how we might be able to do a better job testing. You can see that we wind up spending a little bit more money over here but we spend a lot less money over here. So, just with these nu--sort of, arbitrary numbers, that's a huge cost. Roughly a third of that--of the cost gets saved just by finding more faults earlier. That's a big win. Now, I know companies that don't do any testing until the system level. What do we call those? We call those companies that produce bad software, right? And I know companies that spend a lot more time finding tests early but there's a lot more we can do. So, how do we--how do we do that? How do we do--how do we get better tests? Well, one thing that's really clear is we need better tools. When I look at the tools on the market, from my perspective of seeing all the researches over the last 20 years, I'm very disappointed. Almost as disappointed as I am in their quality of PowerPoint, which hasn't really improved much in the last 20 years, but not quite. We also need to do--to have better practices and techniques. And the interesting thing that I've learned about Google, is that Google is doing a lot of this. A lot f companies aren't. We need more education, that's partly my fault as a--as a professor and we need different ways to organize our management strategies. The other thing that happens, a lot of testing QA teams don't have much technical knowledge. You know, how do you get on the testing QA team? If you get a degree in sociology and you want a job or, you know, you get hired by the developers and you're so bad, they ship you off to the testing group. That's not always true but that happens far too often and if I live in a place where there are a lot of government contractors and they're famous for doing that. They're very slow to adopt to anything. 
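The cost arithmetic above can be reproduced in a few lines; the per-phase fault counts below are assumptions chosen to match the figures quoted in the talk, while the $1,000 unit cost and the relative-cost multipliers come from the talk itself:

```java
// Back-of-the-envelope sketch of the "cost of late fault detection" arithmetic.
// The fault counts are assumptions chosen to reproduce the figures quoted in
// the talk; the relative-cost multipliers and the $1,000 unit cost are his.
public class FaultCostEstimate {
    public static void main(String[] args) {
        String[] phase      = {"requirements", "unit test", "system test", "production"};
        int[]    faults     = {6, 20, 36, 5};      // assumed distribution of where faults are found
        int[]    multiplier = {1, 1, 10, 50};      // relative cost of finding and fixing a fault
        int unitCost = 1_000;                      // dollars per fault at the cheapest phase

        int total = 0;
        for (int i = 0; i < phase.length; i++) {
            int cost = faults[i] * multiplier[i] * unitCost;
            total += cost;
            System.out.printf("%-12s $%,d%n", phase[i], cost);   // 6,000 / 20,000 / 360,000 / 250,000
        }
        System.out.printf("total        $%,d%n", total);         // dominated by the late phases
    }
}
```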
They need new ideas, in fact. We need more expertise, and when I compare that to development, the amount of knowledge a programmer needs today is vastly more than what a programmer needed 10 years ago. That's been increasing, and the testing knowledge required is going to increase the same way. We also need more specialization, like the chart I showed you a minute ago. We had a lot of specialization in development that occurred in the '90s and earlier in this decade. That needs to happen now, and we also need to reduce the manual expense. We're doing a lot of work by hand that could be done automatically, and one of the biggies is getting the test data: going through that design process at the top of that chart and then finding the values. That's largely done by hand right now. A lot of that could be automated. So I'm going to talk a little bit about automatic test data generation and a couple of techniques for doing this. I had some students look at some tools. These are fairly small studies, just to give an idea of what's going on with some of the tools. One student wanted to evaluate some automatic test data generators, and we had some constraints. We wanted to try it with Java programs--Java classes--and we didn't want to spend money on the tools because we didn't have any, so they had to be free. That's a bit of a limitation. And we evaluated these by seeding mutants to represent faults and seeing how many mutants were killed, or how many faults were killed. Then we also added a couple of test criteria: random test data generation, which is really, really simple--she wrote it partially by hand and partially with a tool she wrote in a couple of days--and then edge coverage, which we did by hand on the control flow graphs of the classes. Not a really big study, but the results were interesting. She came to me with these results and said, "JCrasher is the best tool." I said, "Is it?" There was not a lot of difference, but there is some. Okay. But that's not the interesting part. She said, "What do you mean?" And I said, "Okay. Go add random testing and add edge coverage." And she said, "Why?" And I said, "Just do it." So she came back a few weeks later and said, "Well, JCrasher was still the best. Edge coverage did pretty well. Random..." I said, "You're missing the point. What are these three tools that you've got doing? They're doing almost exactly the same thing that random test data generation is doing." Right? You can stand here and throw darts at a dartboard without looking and get the same results that these three tools are getting. But yet here is the simplest, dumbest test criterion, edge coverage, way outperforming those. 68% isn't that impressive when you're talking about finding all of the mutants in a program, but it's still way better than these. So they are essentially generating random values. That's what they're doing. I had a couple of students do a fairly similar study but with a slightly different intent. They wanted to look at different criteria. I don't know if you know all of those. I'm not going to spend any time describing them; I could recommend an out-of-date book if you want to read about those.
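For readers unfamiliar with mutation, a mutant is a small syntactic change to the program, and a test "kills" the mutant if the original and mutated versions produce different results. A minimal, hypothetical illustration:

```java
// Minimal illustration of a mutant (not taken from the studies described above).
// A mutation tool makes one small syntactic change; a test kills the mutant
// if it makes the original and the mutant produce different results.
public class MutationExample {

    static boolean isPositiveOriginal(int x) { return x > 0; }

    // Mutant: relational operator > replaced by >=
    static boolean isPositiveMutant(int x)   { return x >= 0; }

    public static void main(String[] args) {
        // A test with x = 5 does NOT kill the mutant (both return true).
        System.out.println(isPositiveOriginal(5) == isPositiveMutant(5));   // true: mutant survives
        // A test with x = 0 kills it: the outputs differ, exposing the boundary.
        System.out.println(isPositiveOriginal(0) == isPositiveMutant(0));   // false: mutant killed
    }
}
```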
We generated tests again for Java classes, and this time hand-seeded the faults, because one of the criteria we were looking at was mutation. Again, not a huge study, and we were looking at individual classes, so fairly low level: 88 faults. And this is what we found. The green bar is how many faults out of 88 were found by edge coverage, edge-pair, and all-uses data flow. Prime paths is a way to extract a finite number of meaningful paths from a graph, and then mutation--well, if you're really interested in test criteria and have thought a lot about them, I think the most interesting thing here is the blue bars beside mutation. The blue bars represent the number of tests that were needed, normalized so that they fit on the same graph, because they are a fair amount more than the actual faults that were found. But the fact is we needed a lot fewer tests with mutation, and if you're familiar with the criteria, that's pretty intriguing. If you're not, don't worry about it, but what this means is we have some really powerful criteria. That's sort of the summary. So if we look at these two studies--I mean, they can't be compared directly, but we can sort of summarize them together--we have some test data generators out there that really aren't very good. And all three of those are widely used, by the way. We found many dozens of companies that are using them, thinking they are good tools. Edge coverage is one of the weakest criteria we've ever developed, but it's much better than any of those tools. So we have a long way to go. The hardest part was generating those test values. And the good thing for me out of these studies is that we need to test better and we know how. We are not using all the ideas for how, but we actually know how to test better. So that leads me to one way of getting the test values--that was the hardest part, getting the values. This is an idea that's been around a while called dynamic domain reduction. So what does that mean? Well, automatic test data generation tries to find inputs that will be effective at finding faults. There are two things that a test data generator has to do. It has to satisfy the syntax requirements on the inputs--the right range of values, the right type; I mean, it has to put an integer where an integer goes, for example--and some semantic goals, like a coverage criterion or whatever you want your tests to do on the program. If you're a theoretician, all of these problems are formally undecidable. That frustrates the theoretical people a lot. I don't think it matters to somebody who wants to test software. The syntax depends on the level we're testing at, so satisfying syntax requirements for unit testing is very different than for system testing, right? We have well-defined parameters for method calls in a class. We have a different input language for something like PowerPoint. Semantic goals also can vary: maybe I want some random values, maybe I want special values, or invalid values, or I have a test criterion I want to satisfy.
I'm going to talk about one method that is really applied for unit testing to satisfy some test criteria, and this research actually started way back in the late '70s with Fortran and Pascal, individual functions, with a method called symbolic execution, which is now used quite widely in compilers for things like optimization. They create constraints that describe values and then use something like linear programming to solve those constraints. This worked pretty well on very small functions and usually didn't quite satisfy something as simple as statement coverage. You probably can't read this, but if you want to look at the slides later, those are some references to some early papers; I think the first one I found was 1975, so this started a while ago. Then in the early '90s, we came up with some better techniques, some heuristics for solving the constraints instead of the LP solvers, which have some really severe limitations--some better algorithms that were called symbolic evaluation instead of execution, a kind of subtle difference that allowed us to test larger functions. Edge coverage got easy, dataflow we started doing pretty well on, and reasonably well on mutation; this is actually when I came into this research area as a Ph.D. student. Then later in the '90s, we developed an idea called dynamic symbolic evaluation. Now there is a concept called concolic, which is the same idea under a new name--a combination of concrete and symbolic, I think; you may have heard that name. And then a technique I want to tell you about called dynamic domain reduction, which solves the constraints in a fairly effective and efficient way, and this allowed us to handle things like loops, which we couldn't before, arrays, pointers, anything you put on the stack, and get very high mutation scores. So those other test criteria were solved pretty easily, and now we actually get high mutation scores. I'm going to talk about dynamic domain reduction, but the thing that's happening now in this area is people looking at using search-based procedures, genetic algorithms, et cetera. Those have promise, and I'll tell you why in a second. They're a lot simpler, they may scale higher, but they aren't actually doing about as well as these techniques did, right now. They haven't really gotten that solidly good yet. So let me walk you through dynamic domain reduction. The problem with the previous techniques is that the systems of constraints would explode very quickly, and reasonably large computers would start to run out of virtual memory. So you were pretty limited in the size of the methods you could test, because the system of constraints that described the tests that were needed had to be described completely. Dynamic domain reduction says: instead of generating all of the constraints and then satisfying them, we're going to satisfy them on the fly, so it's a more dynamic approach. For each input variable, we define some input domain--that's the range of values that it needs to have. So for an integer, maybe negative max int to max int, or zero to a hundred, depending on the problem.
Then we pick a test path. Instead of solving all test paths at once, we solve one test path at a time. We walk through that path, symbolically evaluating it, and at each step, instead of keeping the constraints, we use the constraint that tells us how to take that edge to reduce the domain of one or more of the variables, and then throw the constraint away. When we hit expressions, we evaluate those with some domain-symbolic algorithms that also reduce the domains of the variables. And when we finish, if we have values, we know for certain those values will ensure that that path is executed. If the domain is empty, then we have to re-evaluate the path; we may have to make different decisions. Let me show an example. This is a really simple example, a very small method that takes three integers and decides which one is in the middle. We have a lot of decisions: if y is greater than or equal to z, we go this way; if y is less than z, we go that way. So let's start with our initial domains--just to make it simple, I centered these around zero, from -10 to 10, for three integer variables--and let's pick a path. Let's say we want to take this path, 1 to 2 to 3 to 5 to 10. So we want an input that will take that path; that comes out of a test requirement. We have to get down to here. How do we do that? Well, we start with the first branch. To take this first branch from one to two, y has to be less than z, so we adjust the symbolic domains of the variables y and z so that all of the possible values satisfy that constraint, and we do something that's called choosing a split point. In this case, we chose right in the middle so that both variables are balanced with approximately the same number of possible values. So we split on zero: y's domain is now -10 to 0, z's domain is 1 to 10, and all of the values in there will ensure that I take this edge. That's a guess, right? We chose a value to split on, and it may have been a wrong choice, so if we get to the bottom and we don't have any solutions, we have to go back up and make a different decision. So this has a built-in searching process, and the algorithm that we built used something like interpolation from numerical analysis, where you guess in the middle, then halfway between the beginning and the middle, then halfway between the middle and the end, and you keep bisecting until you find something or you run out of effort--we run out of energy to keep searching. It doesn't always work, okay, but it always terminates, and if it works, we know we have a good solution. Then the next branch is two to three, x is greater than or equal to y; we choose another split point for x and y, -5 in this case. So now x's domain is reduced to -5 to 10, and y's domain is -10 to -5. Then we take the next edge, 3 to 5, x is less than z. We choose another split point, leaving these domains: x in -5 to 2, y in -10 to -5, and z in 3 to 10. And the last edge, 5 to 10, is always taken, so there's no constraint or decision associated with it. Our final result is this set of domains, and any value from those will ensure that that path is executed--for example, 0, -10, 8. So this works well when the path is one that a relatively large number of values will satisfy; then it's a very efficient algorithm. If only a small number of values will find that path, you still have a good chance of finding it, but sometimes you need to do more searching.
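The three-integer "middle" method he traces is a version of a well-known example from the testing literature; the sketch below records the domain reductions from the talk as comments, so treat the node-number mapping as approximate:

```java
// A version of the "middle of three integers" example, with the
// dynamic-domain-reduction trace from the talk recorded as comments.
// Path traced in the talk: take the y < z branch, then x >= y, then x < z.
public class Middle {

    /** Returns whichever of x, y, z is the middle value. */
    static int middle(int x, int y, int z) {
        int mid = z;
        if (y < z) {              // edge 1 -> 2: split at 0  => y in [-10, 0], z in [1, 10]
            if (x < y) {
                mid = y;
            } else if (x < z) {   // edge 2 -> 3 (x >= y): split at -5 => x in [-5, 10], y in [-10, -5]
                mid = x;          // edge 3 -> 5 (x < z):  split again => x in [-5, 2],  z in [3, 10]
            }
        } else {
            if (x > y) {
                mid = y;
            } else if (x > z) {
                mid = x;
            }
        }
        return mid;
    }

    public static void main(String[] args) {
        // Any triple drawn from the final domains takes the chosen path,
        // e.g. the (0, -10, 8) input mentioned in the talk: the result is x = 0.
        System.out.println(middle(0, -10, 8));   // prints 0
    }
}
```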
So, for example, if you have a decision that says, if x equals 5, then you have to set x to 5 at that point which is a very small domain that may cause interference with the other variables that it maybe compared with later, okay. But, you, but it does eventually terminate and it actually works very, very well for methods with 50 to 100 lines which is actually a fairly large method by today's standards; not a system level technique, a unit level technique. Okay, so this, the hard part about this--now you can get into the issues of loops and pointers and rays, that makes it more complicated but these algorithms have ways to deal with those. Again, not complete, not assured that we'll always find a solution, but if we find a solution, that always works. The very complicated algorithm but they're actually very, very powerful, and I've seen--there may be more, but I've seen four companies try to build commercial tools based on kind of similar algorithms if not these, two of them, they failed. They couldn't get the algorithms written correctly and they generate what are essentially random values. One I got a chance to analyze very carefully on a consulting--the problem is they couldn't make a business case which was lucky for Google because the founder of Agitar now works for Google and makes funny videos that one of which you saw yesterday. And the tool is now owned by McCabe software, they won't let me look at it. There was also a tool that Microsoft is developing called Pex. I haven't actually used this but from the descriptions and the research papers and the like, it's actually--it's very similar, it's using similar technologies but I don't know the details. The search-based procedures that are, they work on now, they're much easier; the algorithms are much easier to built. So, as an example, the tool that implemented dynamic domain reduction was significantly harder to build than a compiler. Okay, so this is not simple stuff and not a little bit harder, but significantly harder than a compiler for language like Java. So, but, we're looking at, you know, easier algorithms but they're still less effective. The problem with this approach is nobody's found, nobody has yet found a way to make a business model work out in selling this kind of tools partly because it's really hard to build these tools. So another question is, that works for Java classes, what can we do out sort of a higher level? For example, if we want to test web applications and generate test automatically. I came on to this question through--what I did call input validation testing. So an input validation testing, you have some domain of inputs that the software is expecting and you want to make sure that the software processes those inputs correctly and not inputs out of that domain or at least it deals with them in an appropriate matter; shouldn't crash, it shouldn't return incorrect results if you give it a value outside of the valid range. So, if you're a wise programmer, you check your inputs before you use them, right? I say wise programmer because there are good programmers who don't, but wise programmers do because if you don't eventually it will cause problems. And there are some interesting, there are some hard questions in this. How do you recognize invalid inputs? And then if you find an invalid input, what should you do with it? What should you--you just throw it back to the user and say I don't like this, do something different and there are some other options you can have. 
That's not really what I'm focusing on. It turns out it's not hard to validate input, but it's really easy to get it wrong. Some of it is practical; some of it is really very fundamental. Here's something that makes it really hard to check my inputs, and I think of this in terms of how we represent the input domain. I think about goal domains. A goal domain is what we want to have, and they're often very irregular. So if you think about a credit card, what's a valid credit card number? It's more complicated than we might think. The first digit identifies the industry, and there are some digits that are valid, some that aren't. The first six digits and the length specify who issued the card. The final digit is a check digit, so there is a formula over all the previous digits that should yield the check digit as the last digit. And then all the other digits specify your specific account. Thanks for turning off the air conditioner; I think my voice feels better now. If you are interested, there are more details at that link--that's where I got these details. So that's the goal domain. But what do we often see in the specification? Somebody writes down, well, we need you to check that the first digit is one of these, and the length is between 13 and 16, which is American Express, Visa. What's the common implemented domain--what do programmers usually do? Check to see all digits are numeric. All digits are numeric. That's what most software checks for, not even this. And by the way, this isn't fully correct. If you're traveling on a military ID, there are lots of websites you can't use, because I think the first digit is seven, maybe it's two--it's either two or seven--and a lot of web applications won't accept that number as being valid. So how do you represent these? Well, at an abstract level, our desired inputs, the goal domain, is kind of irregular, right? There are all sorts of bumps and nooks and crannies in the region. The specified domain is similar, but there are valid values that are not accepted and there are invalid values that are accepted. And then we have what programmers often do: a very smooth circle to make the software very simple. Are all the digits numeric? Which is close, but it's not exactly right. So as a tester, what does that mean? We have this region around the edge of our input domains where we can expect to find a lot of problems. And this led me to this idea I call bypass testing. Oh, and by the way, we also find a lot of security vulnerabilities here, sort of an accidental side effect that I didn't think about until somebody pointed it out to me at a conference. So what is bypass testing? Well, think about when you're using a web application. I'm sitting here with my client, I'm sending sensitive data to a server--my credit card, my address, et cetera--and it's being checked partially on my client, partially on the server, right? So there are some checks in the HTML and the JavaScript running on my computer. If bad data gets through to the server, all sorts of bad things can happen: the database might be corrupted, the server might crash, we might have security issues. And the thing is, when I'm doing it, I'm okay, but there are bad guys out there who have the ability to bypass all the checks out here and send malicious data on to the server.
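The check-digit formula he mentions for card numbers is presumably the standard Luhn check (the talk does not name it); a minimal sketch:

```java
// Minimal sketch of the Luhn check-digit formula, which is presumably the
// "formula over all the previous digits" mentioned above for card numbers.
public class LuhnCheck {

    /** Returns true if the card number's last digit matches the Luhn checksum. */
    static boolean isValid(String number) {
        int sum = 0;
        boolean doubleIt = false;                      // every second digit from the right is doubled
        for (int i = number.length() - 1; i >= 0; i--) {
            int d = number.charAt(i) - '0';
            if (doubleIt) {
                d *= 2;
                if (d > 9) d -= 9;                     // digits of the doubled value are summed
            }
            sum += d;
            doubleIt = !doubleIt;
        }
        return sum % 10 == 0;                          // valid numbers sum to a multiple of 10
    }

    public static void main(String[] args) {
        System.out.println(isValid("79927398713"));    // true: a standard Luhn test value
        System.out.println(isValid("79927398710"));    // false: wrong check digit
    }
}
```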
And by the way, my next-door neighbor works for the CIA, and he's assured me, yes, Bin Laden is a Mac guy. He hasn't updated it in a while--it's hard to buy the newest technology when you're living in a cave--but he's a Mac guy. Then we have, maybe, botnets or crazy people thinking, "Let's see what I can do," and other kinds of dangerous people sending malicious data, sometimes on purpose, sometimes accidentally. The point is, we can bypass all the checks on the client. A lot of that's done with JavaScript or with HTML. Users can turn all that off, right? I can disable the JavaScript, I can modify the HTML. And what bypass testing does is do that intentionally, to violate as many validation constraints as possible. The first way this happened is, one of my students was automating some tests of a web application, I think using HttpUnit. He came to me and said, "Should I embed the JavaScript in the HttpUnit tests?" And I thought, "Oh, wait a minute. There's my input validation." It's one of those light bulbs. I thought, "Oh, you don't have to run the JavaScript. We're bypassing all of that." So this validates whether the input validation is done well. It also checks how robust the software is. And there's some security evaluation--this is not a complete security solution, but there's something there. So for the first paper I mentioned here, I had a master's student build a tool and do a case study on a bunch of web applications. How does this work? Well, first, we look at the visible input restrictions. That's on the client in the web app, right: the HTML tags--you know, if it's a radio button, that specifies the values that can come in--and the attributes--you can specify the length, for example, of a text field. Those are HTML attributes that restrict the input domain. Then we have checks in JavaScript running on the client. Then we model these as constraints on the input: okay, we specified in HTML that this field can only have a value of 20, 30, or 40, because those are the radio button values. So we describe these as constraints, model those, and then we intentionally violate those constraints. So instead of 20, 30, or 40, we send in zero or a hundred. And we have some rules for violating the constraints that are sort of mutation-like--if you're familiar with mutation--and it's easy to tune this to get more tests or fewer tests, with more violations or fewer violations, depending on how much effort you want to put into your testing. Then these are encoded into some sort of test evaluation framework like HttpUnit or Selenium or whatever. And that framework bypasses all of these checks, right, because you're not sending it through the user interface, through the HTML; you're sending it through a call from a Java program running on your client that is sent to the server. So I had a master's student build a tool to implement this, and he came to me and said, "Okay, the tool's running; it doesn't do everything we want, but it does a lot of it. What should I try this on?" I said, "We don't need the source, right? We just need the URL to the front-end of the program. Try it on some commercial websites." And he said, "Well, which ones?" And I said, "Which ones do you use?" And he said, "Well, I use the bank. I use Google. I use Amazon." I said, "Okay, go try it on all of those."
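A minimal sketch of the kind of bypass test he describes: submitting a value outside a radio button's allowed set directly to the server, skipping the client-side HTML and JavaScript checks. The URL and field name are hypothetical, and the JDK's built-in HTTP client stands in for HttpUnit or Selenium:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical sketch of a single bypass test: the HTML radio button only
// allows quantity=20|30|40, but we POST quantity=100 straight to the server,
// skipping the client-side HTML/JavaScript validation entirely.
public class BypassTestSketch {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();

        // Constraint taken from the page: quantity in {20, 30, 40}. We violate it.
        String body = "quantity=100";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://example.com/order"))      // hypothetical endpoint
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());

        // A robust server should reject the value gracefully; a stack trace or a
        // 500 error here is the kind of failure/exposure the study counted.
        System.out.println(response.statusCode());
    }
}
```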
And as he left, I said, "Oh, Macelius, whatever you do, don't log in." And he said, "Why not?" And I said, "Well, what happens if your program causes the bank to dump a million dollars into your account?" He said, "Oh, that's not so good, because they'd come and get me, wouldn't they?" I said, "Yeah, don't log in." So let me just describe these results. Let's start on the right here. The short story is the blue is good, the red is bad. The blue is valid responses; the red is some sort of invalid response. And I didn't pick this, by the way--I'm not trying to flatter anybody--but Google turned out looking very, very good. This was the main search engine page and some options. We didn't try Google Mail, so I don't know if that's as good. Amazon did very well. Our cable company wasn't too bad. My service is bad, but their software works pretty well. But then if we go down here, here's his life insurance company. About 70% of the tests we ran found some kind of failure. 70%. This is production software, been deployed, been used for a while, and 70% of the tests found a problem. No matter how you measure your tests, 70% of your tests finding a problem is spectacular efficiency, right? As a tester, you're jumping up and down: I know how to find faults in software. But this wasn't under test. This is way past test. The details over here: we divide it up into fault and failure, where the software accepted the bad inputs and did something with them, and then exposure, where it crashed and sent a message back that said, "Error in line 25 of program, blah, blah, blah." Most users find that annoying and frustrating, and we go to another company. But to the bad guys, that's information they use to crack into the system. So that's actually a really bad thing. Here are some other examples. His bank: 12% or 13% of the tests found errors. That's not that many, except that's a bank and he didn't log in, right? So there are some things he wasn't able to test. Because we didn't log in, this is kind of conservative. The other thing that's conservative about this study is we didn't have access to the backend. Remember Bob's discussion of observability yesterday: web applications have somewhat low observability because it's hard to see what happens in the database and other backend storage artifacts, memory or long-term storage. We didn't have access to that, so some of these tests may have done bad things that we don't know about, so they may actually have found more faults than showed up in our study. We weren't working with the companies; we were just using their software and found all of these faults. So what this tells me is we have a lot of really bad web app software out there. We worked with a company, EVIA, to help them learn how to implement this in their process a year or so later. They had some production-ready software, software that was finished. They turned it over to production, which put it in the right package, figured out how to deploy it, and pushed it out to all the customers. The software is something that would notify people of problems on a phone switching network, so it had algorithms for who to notify and how to notify them. Again, the tests here are of invalid inputs, and we expect some kind of exceptional behavior, not just a success.
We didn't--again, we didn't check the backend and we went through six of their screens for the software and generated these many tests for each of the screen, total of 184 tests, 92 of them failed, 63 unique failures. So, 33% of the tests found problems. They thought they were finished testing. The interesting thing is the developer was very pissed off. And he spend a lot of time yelling that nobody would actually give inputs like this so he didn't have to worry about it. And the manager fixed it. So this is--no matter how--how you count that up, one-third of your test found the--finding failures, that's a very effective testing. Okay, so how do we get this--so here are a couple of ideas that haven't really got that much traction in the industry and one is 15 years old, one is just a few years old, they haven't been used a lot--but how do we get there. I was on a panel at a conference a couple of years ago. And we were tasked with coming up with reasons why some of the ideas in testing aren't being more widely used. And we came up with these four. So lack of test education, there's a guy out in West Coast somewhere, Washington State, he had some small company out there, that has--that was quoted in an article as saying, "Half of our engineers are testers." Our programmers spent half of their time testing. I don't know if that's an exaggeration or not, but that's a lot of testing effort. He says that three quarters of Microsoft's time is spent on testing. There's some--I heard some character at the conference last year tell me that people at his company--there's some search company. They look for--there are people that look for things, Google or something. He claimed that they spent half of their time doing unit testing. That's a lot of testing. I teach at a university in the US. You know how many US universities require undergraduate computer science to study testing? Zero, yeah. And this is something like 3,000 institutions of higher learning. A lot of computer science departments, none of them require class on testing. What about master's degrees in computer science? Zip, yeah, no university in North America requires software testing to get a bachelor's or a master's in Computer Science. Yet, these guys say, "That's half--this could be half their job when they graduate." But what do we teach them? We teach them one, you know, week-long lecture about--in semester-long class about how to test based on books that were written 20, 30 years ago so they don't learn anything about testing in fact. Do you know how many undergraduate testing classes there are in the US? About 30. We actually did a survey in 2005 and found 15, and that was--I'm pretty confident that was reasonably close. There are more now. Three or four are being created every year, but that's still a tiny number compared to the universities. Interestingly, when I asked my students in my graduate classes how many people took a class in testing as an undergraduate, more than a quarter of them say they did. And guess which quarter? People from India. So universities in India seem to be teaching testing, and about half of our students at my university are graduate students who are now from India. So a lot of them actually took the class in testing but none of the people who studied in the US. So that's one problem. We don't teach people how to test and so they're not doing a very good job in a lot of cases. Another problem is the process. 
When you adapt new strategies, you have to change your process and that's really expensive especially for big company. Most of the companies that are using these ideas are sort of small startup companies. It's really hard to change the process especially in a big--in a big company. Another problem is a lot of the tools we have are really hard to use. And companies buy the tools and two or three people use them and then they get stuck on the shelf and they get dusty. What's the tool--the only tool that's widely used for software testing is J unit. All right, that--I'm sure it's by far the most widely used tool and probably by a lot. But for all of these, you actually have to know the theory. You have to take a graduate course in software testing to use a lot of the tools. This is for usability. So I drive a car all the time, I don't know how that engine works, why should I know how a testing tool works? We have a lot of people programming that have no idea how compiler works. Why do you have to know how a testing tool works to use a testing tool? Because the usability was designed poorly. Then the other thing, you know, I talked about this little bit with my examples, we have a lot of very bad tools. Tools that's just aren't effective, they don't do very much, but people don't know--they actually think they're pretty good. And one of the key technical problems is generating those test values and very few tools help you design or generate test values, right, that's a major issue. So something like J unit, it's a box. It's a very useful box, but it's a box. What you put in that box? Will you put in, hopefully, good tests? How do you get good tests? Well, we do that by very slow process, everything by hand. Why haven't we seen more Automatic Test Data Generation? Well, the tools are either very weak or they're really hard to use. They're very hard to develop. And, you know, I just [INDISTINCT] found out companies don't want to pay for this. They haven't concluded that the return on investment in these kind of tools is worth it to their bottom line. The data that I've seen out of STI and this shows otherwise, but that is a good return on investment. Another issue is folks like me, we want theoretical perfection. When I first saw Agitar--agitator, I thought, "They're ignoring other theory like they're not solving all of the criteria. They're just making guesses." And it really bothered me and then I looked at the results, I thought but they're creating really good tests. And I thought it but they're skipping all the theory but, you know, they're creating a really good tests. And the engineer in me finally said, "That's a good thing, creating good tests." And I read that, I found this little book, I don't know if any of you have saw this, "The Way of Testivus." Let me just open this up and randomly, "An imperfect test today is better than a perfect test someday." And most of academics just can't accept that. It's a pretty good book. And I'll throw my book away and use this book. It won't take my students very long to read it either, that's good. Okay, that's all my props. So, the testers have to understand all this stuff and that's just too much. What a practical tester this one. I had a--I gave a talk last year that said, it was titled, "Testers Ain't Mathematicians." Ain't the sort of my vernacular from Appalachia for it or not. I don't know if I use anything in India. But testers ain't mathematicians and it's true, they're not and they don't want to be. 
And we shouldn't expect or require them to be. But that process, the model-driven test design process, allows one mathematician to serve a lot of testers. So you don't need very many mathematicians, and the rest don't have to be. So what do we need? We need to integrate Automatic Test Data Generation with development. The unit-level tools have to be designed for developers. They have to be easy to use, and they have to give good tests, though, as Testivus tells us, not perfect tests. So here's what I think a unit-level tool for Automatic Test Data Generation should look like. First, the users should not have to know much about testing, because they're programmers; they don't want to be testers. Second, you ignore all of those theoretical problems. It took me a while just to be able to say that aloud, but I've made progress: you just ignore those. It's engineering, right? It's not science. It has to integrate with the IDE; if you're using Eclipse, it should be an Eclipse plug-in. It should automate the tests with some framework like JUnit, or if you have another favorite test framework, that's fine. And the process should make finding faults, the semantic problems in your software, the same as finding the syntactic problems. Compilers come back and tell you your syntax mistakes; why don't they tell you your semantic mistakes as well? They're not going to find all of them, because we're ignoring all the completeness and feasibility issues, but they can find a lot of them very quickly, especially the basic ones. And so, after my Java class compiles cleanly (or C++, if you're, you know, working in the '90s), Automatic Test Data Generation should kick in: produce tests, automate them, run them, compare them against expected values, and come back and show the results to the programmer. Okay, here's your next set of problems that you get to deal with, and then you can start debugging. It's not going to find all the problems, it's not going to be complete, but we can raise the bar during unit testing and shrink the system testing job, so we have fewer problems to deal with at the system level. At the system level, we should be able to generate tests based on the input domain, and we should pull that out of the user interface; we shouldn't need the source, right? The tests should be automated, but we have to have a way for humans to come in and add some tests. I've been starting a collection of faults I've found in software that I don't think any criterion could find. I'm not going to go into those because, you know, the next speakers want to talk soon, but you have to have a way for humans to come in and add additional tests for the things that the criteria are blind to. If we can integrate that together, where there's a language a human can use to write test requirements that then get satisfied automatically, that reduces a lot of the manual effort as well, saving money. So in the process, as soon as I integrate my system and check everything in to the library, those tests are created and run; it should be part of the integration tool. So instead of making the testers do all the work, it should support the testers, allowing them to do, you know, the work that requires a human brain. So here's the global issue with test design. We have human-based test design: we have knowledge of the domain, knowledge of testing, and intuition, and you have to generate values.
Then we have criteria-based test design, where we're using engineering principles to generate values that cover things like the source or the requirements. A lot of people go around saying that you have to do one or the other and that the other one is wrong, and in fact I did that for a while. Then I heard a talk by a guy I really associated with human-based testing; I'd always thought, oh, he never makes any sense. And I heard this talk and I kept thinking, "He makes sense. But he's wrong. This talk really makes sense, but I know he's wrong." And I finally realized he's not wrong; I was wrong. Or maybe we were both wrong, because there's actually room for both. He improved my view of testing enormously by teaching me that we actually need both. I'm not sure he knows that; I didn't tell him. But there's no reason to be competitive, we actually need both. So to be able to test effectively and efficiently, we have to get the two to cooperate. And that's actually pretty hard to do. So to summarize: researchers like me always want perfect solutions. We took all this theory as computer science undergrads, right? We had to take three semesters of calculus, four or five other math classes, [INDISTINCT] theory, algorithms. So we think theory is important. But what's the degree they get? Computer science. What's the job they get? Software engineering. I think we're teaching them all wrong. We need less theory and more engineering. Industry needs engineering tools and it needs engineers. So we actually need to teach more of that engineering kind of thing. And, you know, the bottom line is that we've got some really good ideas for automatically generating test data, and they're ready for transition. I think the tools should be free, and preferably open source; I just haven't seen anybody go to that much effort. The hard part is that it's a lot of effort. Building JUnit for free was much easier than building something like this; that's a major headache. So, here's my contact information. And, you know, as far as I know, this book has never been used to make diapers. I tried to find one in my trash can, but I think it had already been recycled. So I think Alberto really did throw my book away, because I had one. So okay, Andan, do I have time for questions, or should we go on to the next speakers? >> ANDAN: We'll just do a couple of questions because we are running short of time. Maybe you can take the rest of the questions offline later. >> And, Jeff... >> OFFUTT: Yeah. >> Automated Test Data Generation, mutation, and data constraints are essential parts of fuzzing. And most fuzzing is done at the system level rather than at the unit level. So I would like to understand how your concepts of test data generation, mutation, et cetera, translate into the fuzzing domain. >> OFFUTT: That's an interesting question, because the ideas behind fuzzing have been around for years. They come out of the concepts we've had for things like mutation, general rules for violating the input domains and valid values, stress testing, things like that. And suddenly there is this term, fuzz testing. Nobody knew the previous ideas, but suddenly everybody knows fuzz testing. That's what fuzz testing is: it's using these ideas, but it's a kind of a cooler term, right?
Just like the ideas behind JUnit: I mean, I was having students build projects like that in the '80s as class projects. Not as sophisticated as JUnit, mind you, but the same basic idea. And they weren't being used until somebody came up with a really cool name, JUnit, and started marketing it heavily. So yes, that's what fuzz testing is. And concolic testing is the same idea as dynamic symbolic testing, with a little bit of a twist in that it uses assertions in a slightly different way, and it actually works better than some of these ideas that were developed more recently. And so fuzz testing is associated with concolic testing as well. But it's the same thing. >> [INDISTINCT] >> OFFUTT: Yeah. >> [INDISTINCT] The reason I asked this question is that in what you demonstrated, the flow of the code is known, so the data constraints can be known, and then you can come up with efficient test data. >> OFFUTT: Right. >> In fuzzing, only the protocol the data follows is known, not the code which is going to process it. >> OFFUTT: Right. >> In the way you demonstrated with web applications, we don't know what is happening on the other side. So I was trying to understand how your concepts and algorithms for Automated Test Data Generation could help when only the protocol is known. For example, you could do network protocol fuzzing, or you could do file fuzzing, so is there any translation? Is there any research work done on that? >> OFFUTT: Fuzzing actually looks a lot more like the bypass testing ideas than like dynamic domain reduction. You're right, you need some structure, like the code, for that. But for bypass testing, you don't. Fuzzing is semi-formalized rules for creating data. That can be encoded in something more specific, with rules for what the values should look like, which is not hard, and I've seen a couple of papers that actually did that. Then those are rules that you can start to embed in the tool to create the values. So there are ideas in fuzzing that aren't in the papers we have on bypass testing, but they can be used in exactly the same way. And you could also have some randomization, where you have rules that take a value and randomly change it in various ways; sometimes that can be very effective. In fact, Agitator did some of that when it was having trouble going down a path: it would randomly change some of the values, get close to the path it wanted, and often get down that path. And then there are search procedures that do that in more clever ways, using the path constraints as part of the optimization function, or just as part of the solution, for things like genetic algorithms. >> OFFUTT: You have a... >> [INDISTINCT] So where do expected results come from? >> OFFUTT: So he's asking: with these techniques, where do the expected results come from? That's actually pretty hard. For some of them you're going to have to have a person decide. There's a fair amount of research going on right now to see whether, with these kinds of techniques at the unit testing level, you can come up with expected output, or at least approximate expected output, but that's a problem that has not been fully solved.
At the bypass testing level, it's actually much simpler to come up with expected output, because the expected behavior is that you get a message that says something like "this value is not valid." But at the unit level, that's much harder. >> [INDISTINCT] >> OFFUTT: Yeah. >> [INDISTINCT] >> OFFUTT: A failure was either not getting a 403 or not getting a message that said, "Your data wasn't valid." So the application behaved as if it were normal behavior, as if the data were valid; that was invalid behavior. Is there another question, or should we go on? >> [INDISTINCT] >> OFFUTT: Okay. Okay, well, again, thank you all for having me. I'm looking forward to the rest of the talks.
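
The bypass testing idea discussed in the question-and-answer session above can be illustrated with a short sketch: a test sends form data directly to the server, skipping whatever client-side validation the HTML form would apply, and treats anything other than an error response or an explicit rejection message as a failure. The sketch below assumes JUnit 5 and Java 11 or later; the registration endpoint, the "age" field, the specific invalid values, and the "not valid" message are hypothetical placeholders chosen for illustration, not details from the talk or the cited bypass testing paper.

    // A minimal sketch of a bypass test, assuming JUnit 5 and Java 11 or later.
    // The endpoint, the "age" field, and the "not valid" message are hypothetical
    // placeholders, not details taken from the talk or the cited paper.
    import java.net.URI;
    import java.net.URLEncoder;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;
    import java.nio.charset.StandardCharsets;
    import java.util.stream.Stream;

    import org.junit.jupiter.params.ParameterizedTest;
    import org.junit.jupiter.params.provider.MethodSource;

    import static org.junit.jupiter.api.Assertions.assertTrue;

    class BypassRegistrationTest {

        private final HttpClient client = HttpClient.newHttpClient();

        // Inputs the HTML form's client-side checks would normally block:
        // a required field left empty, a value far beyond any maxlength,
        // a value outside a numeric range, and markup the UI would reject.
        static Stream<String> valuesThatBypassTheForm() {
            return Stream.of("", "9".repeat(5000), "-1", "<script>alert(1)</script>");
        }

        @ParameterizedTest
        @MethodSource("valuesThatBypassTheForm")
        void serverRejectsInputThatSkipsClientValidation(String badAge) throws Exception {
            // Build the request by hand so the browser-side validation never runs.
            String body = "age=" + URLEncoder.encode(badAge, StandardCharsets.UTF_8);
            HttpRequest request = HttpRequest.newBuilder(
                            URI.create("http://localhost:8080/register")) // assumed endpoint
                    .header("Content-Type", "application/x-www-form-urlencoded")
                    .POST(HttpRequest.BodyPublishers.ofString(body))
                    .build();

            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());

            // Per the talk, the expected behavior is an error status or an explicit
            // "data wasn't valid" message; processing the input as normal is a failure.
            boolean rejected = response.statusCode() >= 400
                    || response.body().toLowerCase().contains("not valid");
            assertTrue(rejected, "server accepted input that the form would have blocked");
        }
    }

In the published bypass testing work, the invalid values are derived from the interface's own validation rules rather than listed by hand; the hard-coded stream above only stands in for that generation step to keep the sketch short.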

References

  1. ^ a b "People: Jeff Offutt". Volgenau School of Engineering. Archived from the original on 5 June 2010. Retrieved 8 March 2012.
  2. ^ "Jeff Offutt | University at Albany". www.albany.edu. Retrieved 2023-11-29.
  3. ^ "Computer Science Department Faculty". George Mason University. Retrieved March 7, 2012.
  4. ^ "2019 Outstanding Faculty Awards". State Council of Higher Education for Virginia. Retrieved May 29, 2019.
  5. ^ "2013 Teaching Excellence Awards". George Mason University, Mason News. Retrieved May 1, 2013.
  6. ^ DeMillo, Rich; Jeff Offutt (September 1991). "Constraint-Based Automatic Test Data Generation". IEEE Transactions on Software Engineering. 17 (9): 900–910. CiteSeerX 10.1.1.118.8072. doi:10.1109/32.92910.
  7. ^ Offutt, Jeff (2011). "A Mutation Carol: Past, Present and Future". Information & Software Technology. 53 (10): 1098–1107. CiteSeerX 10.1.1.360.8045. doi:10.1016/j.infsof.2011.03.007.
  8. ^ Offutt, Jeff; Aynur Abdurazik (October 1999). "Generating Tests from UML Specifications". Second International Conference on the Unified Modeling Language (UML99): 416–429.
  9. ^ Offutt, Jeff; Ye Wu; Xiaochen Du; Hong Huang (November 2004). Bypass Testing of Web Applications. IEEE Proceedings of the 15th International Symposium on Software Reliability Engineering (ISSRE). pp. 187–197. doi:10.1109/ISSRE.2004.13.
  10. ^ Offutt, Jeff; Zhenyi Jin; Jie Pan (January 1999). "The Dynamic Domain Reduction Approach to Test Data Generation". Software: Practice and Experience. 29 (2): 167–193. doi:10.1002/(sici)1097-024x(199902)29:2<167::aid-spe225>3.3.co;2-m.
  11. ^ Offutt, Jeff (August 1988). Automatic Test Data Generation (Ph.D.). Georgia Institute of Technology.
