
Internet Systems Consortium


Internet Systems Consortium
Founded: 1994
Type: Network engineering
Focus: DNS, BIND, DHCP, Kea, Internet
Area served: Worldwide
Key people: Jeff Osborn (President)
Employees: 35
Website: www.isc.org
Formerly called: Internet Software Consortium
ASN: 3557

Internet Systems Consortium, Inc., also known as ISC, is a Delaware-registered 501(c)(3) non-profit corporation that supports the infrastructure of the universal, self-organizing Internet by developing and maintaining core production-quality software, protocols, and operations.[1][2] ISC has developed several key technologies that enable the global Internet, including BIND, ISC DHCP, and Kea. Other software projects no longer in active development include OpenReg and ISC AFTR (an implementation of an IPv4/IPv6 transition protocol based on Dual-Stack Lite).

ISC operates one of the 13 global authoritative DNS root servers, F-Root.[3][4]

Over the years, a number of additional software systems (for example, INN and Lynx) were operated under ISC to better support the Internet's infrastructure. ISC also expanded its operational activities to include Internet hosting facilities for other open-source projects such as NetBSD, XFree86, and kernel.org, secondary name service (SNS) for more than 50 top-level domains, and DNS-OARC (Operations, Analysis, and Research Center) for monitoring and reporting on the Internet's DNS.

ISC is actively involved in the community design process; it authors and participates in the development of IETF standards, including the production of open-source software used as reference implementations of the DNS.[5]

ISC is primarily funded by the sale of technical support contracts for its open source software.[6]

YouTube Encyclopedic

  • Internet Systems Consortium's SIE & Google Protobufs
  • Recorded demo of the new Stork dashboard for the Kea DHCP server and the BIND9 DNS server.
  • Using the Kea DHCP Server - Session 1 of 6. DHCP Fundamentals

Transcription

>> ...and SIE is the Security Information Exchange, which is a more or less free-ish way [INDISTINCT] to get access to a bunch of data, which includes things like passive DNS and spam feeds and things like this. Eric Ziegast is an ISC... >> ZIEGAST: I'm the program manager for SIE. >> [INDISTINCT] job. >> Okay, good. And Paul Vixie is, you know, one of the ISC founder folk and [INDISTINCT]. >> [INDISTINCT]. >> VIXIE: You mentioned. >> ZIEGAST: Okay, all right. So, SIE, Security Information Exchange. People are familiar with different kinds of exchanges. You have exchanges of money, like NASDAQ; you have Internet exchanges, and Paul actually founded one of them, the Palo Alto Internet Exchange, where people need to exchange Internet traffic, and they exchange pieces of copper, or fiber these days, between all of the telco gear. Back in 2007, Paul Vixie and David Dagon had this idea that there are a lot of pools of data out there that a lot of people don't have access to, and we need to find a way to get it all together. So, one of the things we're doing -- because this is security data, people don't like sharing it, because they can get in trouble for it: either they're snooping on their customers, or it can cause harm to people just because of the fact that someone knows something that they shouldn't have. So what we've created is a legal and privacy framework, which is basically a contract and a bunch of privacy directives that say, "All right, everyone is in this together; you can share stuff freely within the infrastructure, but you don't take the raw stuff out. It has to go through lots of processing before you can take stuff out of there." The legal document keeps everyone honest. Another reason we're here is to centralize the data collection. When you have pools of data that are all over the Internet, it's hard to do any cross-correlation.
You don't have any standard way of--you don't have any standard way of sharing the data. They have different formats. You can get all the formats the same and you can get all the data together in one datacenter or with in a framework of datacenters all connected to each other, there's a better chance that you're actually going to be able to do some cross-analysis, which is one of the things we're trying to do. You may have some passive DNS, you may have some NetFlow, you may have some Darknet data or whatever you have, but if that's all you have, you're not going to be able to do much with it, but if you can combine some of that, you're going to find--you're going to be able to find out a lot more and find it out a lot more quickly than if you don't have access to it at all. One of the reasons we're doing this is to create a network effect between the security researchers. One model can suggest that it's like stone soup or ever heard of the parable. We're bringing the--we're bringing the soup in the stone, which includes our infrastructure in a network and the tools that we have and then a lot of people are bringing their carrots and onions and potatoes to make it taste better and the more people add stuff, the better tasting the soup gets, and eventually you can actually do some really effective work with it. Typically these days you have relationships between these various participants, businesses, ISPs, law enforcement. The people on the top typically have some of the data or the victims and the people at the bottom usually have the ability to do something with it and they all have their own independent non-disclosure agreements, contracts, whatever you need to manage, you keep everything private and functional and keeping it going on this. That's a lot of paperwork and that's a lot of trust that has to be built up between people and it's inefficient. 
So one of the things we're doing with SIE is to create the efficient sharing within a common legal and privacy framework. People can bring their data in the SIE, have it a sort of clearing house and a place where it's all available freely and you can share freely within there and you could sign a single agreement that everyone else signs and everyone can work within SIE. We're not going to replace everyone's sharing; we're just helping to enable things which might be inefficient right now. Typically, for the infrastructure, we have a bunch of sensor operators out there. You know, it might be something sniffing packets off the wire, it might be something attached to your mail server that the spam flows into or, you know, you have a WebCrawler and you're going out searching but you just go ahead and create--or basically a packet-sized bundles and they'll all get uploaded via rsync and some scripts that to our redundant servers so that we can broadcast that unto an Ethernet infrastructure within the datacenter. Inside of that broadcast infrastructure are a bunch of researchers who are out bringing their own machines and hopefully some of their own data, the comparatives. As we build out, we're going to build a node on the East Coast and will relay data between the nodes. And additionally, each of the researchers will be able to talk to each other over a private network. And as we add other areas like, for example, going to Europe or Asia or wherever, we may have a relay or people can upload stuff into the cloud and at some point, oh, boy, there's a wonderful--there's a cloud thing, let's strike that. We have relays where the data can enter and at some point it may get promoted so that it can talk to all of the rest in there, it's on equal basis. And we basically take any of the unique data that's going from one area and pass it to others so the others can see it. 
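The architecture described above, sensors feeding a common bus that every researcher listens to at once, can be sketched in miniature. The real SIE uses Ethernet-level broadcast on a VLAN so all subscribers see each frame simultaneously; this hypothetical sketch substitutes UDP datagrams on the loopback interface so it is runnable anywhere, and the function names are illustrative only.

```python
import socket

def make_subscriber() -> socket.socket:
    """Bind a UDP socket that will receive records, standing in for a
    researcher's machine attached to the SIE broadcast VLAN."""
    sub = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sub.bind(("127.0.0.1", 0))   # pick any free port on loopback
    sub.settimeout(2.0)
    return sub

def publish(addr, record: bytes) -> None:
    """Emit one record toward the bus; in real SIE this would be an
    Ethernet broadcast frame on the channel's VLAN."""
    pub = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    pub.sendto(record, addr)
    pub.close()

if __name__ == "__main__":
    sub = make_subscriber()
    publish(sub.getsockname(), b"passive-dns: www.example.com A 192.0.2.1")
    data, _ = sub.recvfrom(9000)   # 9000 bytes ~ a jumbo-frame buffer
    print(data.decode())
```

True one-to-many fan-out would use a broadcast or multicast destination address rather than the unicast loopback address shown here; the point of the sketch is only the publish-once, receive-passively shape of the bus.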
It may not be all of the data; there's a lot of data and a lot of it doesn't need to get shared, but maybe some of the aggregate stuff will get shared between the nodes. Here are some other types of data that are out there. We started with passive DNS and DNS blacklist data; we have some darknet and NetFlow, but there are a lot of other different types of data which aren't necessarily conducive to plain old packet captures, and we had to invent something so that we can describe these various types of data so that they can be efficiently shared on the broadcast network. The first thing we worked with was passive DNS. Florian Weimer of Germany pioneered the capturing, and we modified some tools so we can do a better job of collecting it on the name servers, to collect more of the data. Typically, passive DNS -- at least the way that we collect it -- is: you have a name server, and your clients all go up through a recursive name server, or caching server, to find out the names that they're looking up. If you're looking up www.google.com, well, the client is not going to talk to a Google name server; it's going to talk to the recursor at their ISP, and that recursor sends a request out to the com name servers, then to the google.com name servers, and they'll feed the data back to the recursive server. And then once the answer is found, the recursive server returns it to the client. The position where we listen is on the downward arrows, the answers going back to the recursive name server. There is a great benefit to that, in that it helps protect the privacy of the clients, because the recursive name server is doing the queries at that level, not the clients.
So if you have a large population, say a thousand people or a million people, you won't necessarily know who's making the query, but you at least find out the information that's out there. And that's part of the goal: to actually get a better map of what's out there, as far as IP addresses mapping to names, mapping to name servers, that you would not normally see if you didn't have these sensors out there. So we asked a bunch of ISPs and universities and friends to donate some data, and we're very appreciative of that. And we have been building up and trying to get more data and more data types from different sources. But mainly, the way to do DNS data collection back then was tcpdump or dnscap. There are some inadequacies in those programs, they couldn't capture everything, so we created a new program called NCAP, and ncaptool, and added a bunch of features into that so that you could replicate the passive DNS data onto the broadcast infrastructure that we're setting up. We added some features for doing plug-ins. They can do filtering on the data as it passes through. You can filter out the things that you don't need to see, so that you can spend more time processing the data that you want. For the other data types, for example spam or link pairs or malware or whatever, we created NMSG, and we'll get to that later. Also, to enable some collaboration between the researchers, we set up a VPN between our sites so that researchers at one site can talk to another using unicast. Basically your typical access to a web server or a database, or whois or DNS lookups or whatever, but it's all private within the framework. Some of the hardware that we need for this needs to handle high packet rates; we need a fast switch that won't drop packets.
The servers will typically be 64-bit with a lot of RAM and storage. You know, if all you're doing is logging, disk is fine, but if you want to do anything active with it, you're going to need SSD or a lot more RAM, because there are high packet rates for a lot of what we're dealing with. Ncaptool: here are some of the things that we did to improve upon pcap or dnscap. With larger DNS packets these days, you'll actually have fragments; what used to be able to fit in 512 bytes doesn't fit anymore, especially if we're doing things like DNSSEC and increasing the number of servers and the amount of data that's coming back with each of these requests. So we need to be able to reassemble the packets, and NCAP does that automatically, whereas you might actually miss that data if you were using just pcap. We drop the link-layer info; we don't need to carry around the Ethernet MAC address, we're really just interested in layer three and above. We normalized the network formats, so that whether we collect on a Sun, or on OpenBSD on a PC, or on an HP running RISC, it's, you know, just basically network byte order. Nanosecond timestamps instead of millisecond, and then we added some user-defined flags so we can actually track which sensor actually gave us the data. What's key to SIE is the fact that we have this common infrastructure where everyone can listen to an Ethernet bus of the same data. So when it comes into one of our nodes, you know, we broadcast it on a local area network, on a VLAN. And everyone who's on that VLAN will get that packet at the same time; we're not just sending it to each researcher, we're actually broadcasting it, and that makes for a lot of efficiencies. And we need to be able to take data from files or put it out to files, and we can do all sorts of passive [INDISTINCT] the packets.
Mostly we can ship the packets around in many different ways. One of the best benefits we had with NCAP is when we started making modules to do deduplication, which is very necessary. There's pattern matching, internal database lookups, like if you want to match what you're seeing off the wire against an internal table of something that you know, and that was really important. Typically, when people are doing security data gathering, they'll put everything into a database, and it may not scale. At some point they'll become disk-bound, and we need to be able to keep the information flowing in real time, not just trapped in a database which will eventually slow down. And we can't just log the data, because that's really not useful; people need to create real-time tools to be able to analyze what's on the wire. What we ended up doing is we built what's called a loosely coupled multiprocessor, where one machine would start out with the data and broadcast it onto the network, and another machine would do some different processing. And then once that machine does its processing, it broadcasts it back out onto the network, and then a bunch of other machines that are interested in that do further processing. So you have a whole bunch of machines all together on the same broadcast network, and they're all doing the processing in real time. I have a diagram describing that more. We partition our data, the various data types, into different VLANs: so you've got passive DNS on one, deduplicated passive DNS on another, you have some NetFlow on one, you have some spam on another, and then you can choose which VLANs or channels you want to subscribe to, you know, to cut down on the overhead of what you have to process. So this is the typical use of a filter, with DNS coming in on the left.
This is the route--let's say you're passive DNS data that's coming in from the sensors, it's very high, at a very high rate of speed, you have program which operates on a server runs it completely out of RAM that does a deduplication of what's in there. You know, you may have a hundred people looking up www.google.com but you don't really care about that. You're really interested of the fact that www.google.com is out there and here's the information that came with it. So, deduplication takes it down to a reasonable level where people can actually do the processing with it. And then you can do some additional filtering like we have something that helps detect fast flux. Fast flux in one example would be, let's say, when your name servers keep changing their IP addresses, that's very useful for helping to detect Darknets because that is one of the behaviors that they use. So, here's a graph from last night where we get about 40,000 packets per second and that's why it's somewhere around, you know, 80 to 100 megabits of all just DNS packets. And to do any real analysis with it, you know, once you got, you know, like a, you know, 20 or 40 servers that can kind of split or take a part of that feed off a bit, you know, it's really going to be inefficient to process. So, here the deduplication, you know, part in the graph, if you just look at the numbers that--now, you're down about 5,000 packets per second, which is a little easier to process. That's like an eight to one, we're getting about an eight to one benefit out of our deduplication, and it last about every four hours. It'll roll over for fresh data, so gives you an idea of how that works. And then just we're finding fast flux, well, you know, you're about two to four packets per second of just things that are changing at fast flux. 
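The deduplication stage described above, an in-RAM table that collapses repeated answers and rolls over every four hours for fresh data, can be sketched like this. The class name, record layout, and window length are illustrative assumptions, not ISC's actual implementation.

```python
import time

WINDOW = 4 * 60 * 60   # four hours, matching the roll-over in the talk

class Deduplicator:
    """Pass a record through only the first time it is seen in the
    current window; clear the table when the window rolls over."""

    def __init__(self, window: float = WINDOW, clock=time.monotonic):
        self.window = window
        self.clock = clock          # injectable clock, handy for testing
        self.seen = {}              # record -> time first seen this window
        self.window_start = clock()

    def accept(self, record) -> bool:
        now = self.clock()
        if now - self.window_start >= self.window:
            self.seen.clear()       # roll over: start a fresh window
            self.window_start = now
        if record in self.seen:
            return False            # duplicate within this window
        self.seen[record] = now
        return True

if __name__ == "__main__":
    dedup = Deduplicator()
    # A hundred clients all resolving the same name...
    stream = [("www.google.com", "A", "142.250.1.1")] * 100
    kept = [r for r in stream if dedup.accept(r)]
    print(len(kept))   # 100 identical answers collapse to 1
```

Keeping the table purely in RAM and discarding it wholesale at the window boundary is what lets a stage like this keep up with tens of thousands of packets per second, in contrast with the database-bound approach criticized later in the talk.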
And we can actually just watch that stuff scroll by on the screen and use your human intelligence to figure stuff out. Like, you know, if you have a bank phishing site and it's changing its name servers, you just see it scroll by on the screen and say, "Hey, there's a domain I'm interested in." The concept of the loosely coupled multiprocessor is very important. Dave Boggs was doing this at Xerox PARC and DEC back in the '80s; he co-invented Ethernet. A lot of people were just using it as "here's how I get my packets from A to B," but he started to use it as actual broadcast. So you can actually take one piece of data and broadcast it to multiple servers efficiently. And another thing we're doing is leaning toward real-time analysis. A lot of people in the research field these days, they've built their big databases and they do queries against the databases. Well, if you do that and it's taking too long, you know, the bad guys are moving on after a few hours, and if it's going to take you a whole day to figure something out, well, you're pretty much losing. So if you can basically take the stuff you know and compare it with what's on the feed, you can actually give yourself an advantage. Do you want to say anything more about... >> Some of this isn't new; it's what's old is new again. This kind of thing used to work, back when computers were fast enough relative to the real world, by putting in database triggers, so that everything got dumped to the database, but then when certain lines were crossed you would learn: "Okay, you just tried to put something in the database that caused the following exceptional behavior; you should analyze this."
And, certainly, in the case of SQL, there's no way you're going to keep up with even 5,000 per second, let alone hundreds of thousands of things per second, with database triggers [INDISTINCT], and all the SSD in the world isn't going to help you do that. So I will say that this is, to me, the money bullet point in this presentation: the security community has gotten into the habit of storing pcap files, or storing things in database tables and then having [INDISTINCT] that go look for things, and we are not keeping up. The bad guys are winning. I'm tired of that. So the idea of teaching people once again how to look at things in real time, and look for proper correlations in real time, was at the heart of this project, originally. >> ZIEGAST: There is some benefit to analysis in arrears, in that we can perhaps go back and look for things that would help you change your patterns in what you're looking for in real time. Much like a stock analyst would look at the historical trends of a stock to try to project what's going on in the future. Well, you know, at some point you have to have something automated that's pulling that trigger for you to buy or sell, and that's perhaps the kind of analysis we'd be doing for Internet security data. So, here are some of the things that we've got on SIE right now. Raw passive DNS is what comes in from the sensors, and then we filter it; fast flux is one example. We do some comparisons to things like CBL or SORBS, the Spamhaus CBL. That's interesting because you may know the IP address of something that's bad, but you don't necessarily know all the names it's using. So, using this, you can actually, in real time, gather all the names that are being used by a particular machine, or that people are looking up against it. It helps to do some cross-correlation and preventive work. We gather some DNS queries from some DNS blacklists, from dynamic DNS providers. Some top-level domains; we put isc.org in there.
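The CBL-style cross-correlation described above, knowing a bad IP address and harvesting, in real time, every name observed pointing at it, reduces to a streaming filter. This is a sketch under assumed names: the (name, rrtype, rdata) record layout stands in for whatever the deduplicated passive DNS channel actually carries.

```python
from collections import defaultdict

def names_for_bad_ips(stream, bad_ips):
    """Watch a stream of passive DNS records and map each known-bad IP
    to the set of names seen resolving to it."""
    hits = defaultdict(set)
    for name, rrtype, rdata in stream:
        # Only address records can tie a name to an IP on a blacklist.
        if rrtype in ("A", "AAAA") and rdata in bad_ips:
            hits[rdata].add(name)
    return hits

if __name__ == "__main__":
    # 198.51.100.7 is a documentation-range address standing in for a CBL entry.
    bad = {"198.51.100.7"}
    stream = [
        ("www.example.com", "A", "93.184.216.34"),
        ("login-bank.example", "A", "198.51.100.7"),
        ("update-check.example", "A", "198.51.100.7"),
    ]
    print(dict(names_for_bad_ips(stream, bad)))
```

Because the stream is consumed as it arrives rather than queried out of a database later, the name set is available while the bad host is still active, which is exactly the real-time advantage the speakers are arguing for.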
AS112 is a project for all the DNS lookups that shouldn't leak out; someone looking up 10.in-addr.arpa, for example, or something within there. Well, you know, that's basically misconfigured networks or name servers out there. Some of the root servers no longer operate, but they still have an IP address that people are sending queries to, so we're gathering a little bit of that too. We don't gather from the root servers, you know; we operate our own root server, but we actually have a firewall between us and them, and a lot of that analysis actually goes out to DNS-OARC. So, NCAP was great for packet data, but there's more that we need to capture. We need to be extensible; one of the mistakes we made with NCAP when we set out is that we didn't put version numbers in there. We need to be able to create new formats as new data becomes available. It needs to be fast and scalable. You know, if you're looking for a way to describe this stuff, you can imagine, oh, let's just use XML. Well, that doesn't work very well. That's part of why we're here at Google today: because you guys had something that was very applicable. It needs to be fast. It needs to reuse all these features from NCAP for working with our infrastructure. And we also need to have filtering methods that we can plug in and developers can use, so that we can keep up with the times. And we gave a lot of this to Robert, and Robert pretty much picked it up. He created NMSG. So now we're going to switch over to Robert. All right. Do you want to take my yellow? >> EDMONDS: Yeah, let's switch cables. >> ZIEGAST: I was doing 800x600, but we'll see, whatever you have works. >> EDMONDS: So, something Eric reminded me of, from RFC 1034: "The sheer size of the database and frequency of updates suggest that it must be maintained in a distributed manner... Approaches that attempt to collect a consistent copy of the entire database will become more and more expensive and difficult, and hence should be avoided." That was 1987.
Well, unfortunately, we like doing expensive and difficult things. >> Oh, he was talking about the HOSTS.TXT stuff there. >> EDMONDS: Yeah. Well, this is where, in fact, all of the hostnames... Let's see here. So, we have this NMSG file format. It's the successor to the NCAP packet capture format, and the idea is that we're not only capturing packet information but also things that are not necessarily best represented as packets or datagrams on the wire. So the idea is, we don't know what types of information we're going to store, so we should make it store opaque blobs of information. And perhaps, at run time, we will load a module and be able to learn how to interpret that blob. So, we have blobs on the order of 10 to 10,000 bytes in length. We probably don't want to optimize for transmitting DVD ISOs over a UDP broadcast network. We're interested in things like DNS and email and HTTP, things of that order of size. We decided to optimize for UDP over jumbo-frame Ethernet, in order to minimize the number of socket receives that have to be done to get a particular quantum of data. And it turns out that Google has a ready-made encoding format called Protocol Buffers. It's essentially an extensible binary wire format for encoding fields of data of primitive types: integers, floats, binaries. Unfortunately, Protocol Buffers are not self-delimiting and they're not self-describing, so we had to add some additional framing and some additional intelligence in order to be able to use it for our UDP broadcast medium. And for the protocol engineers, it's essentially a fixed-length header part and a variable-length part which can encode one or more payloads. The NCAP format captures one packet and represents one packet when it's rebroadcast. But since we're batching that data, or buffering that data, we can pack more than one payload into a jumbo Ethernet frame.
So, the average DNS packet is probably less than 512 bytes. We're going to fit perhaps 16 of those into a jumbo-frame Ethernet packet, or even more, so why not minimize the number of socket calls, system calls, that you have to perform in order to read that data off the network? What if your payload is larger than your jumbo Ethernet frame? We should be able to fragment that; I'm sure you've seen spams, emails, that are longer than eight kilobytes. We want to avoid having a truncation bit that indicates "this blob has been truncated and you have to deal with that." Instead we can fragment a payload, and the receiver reassembles it and passes the reassembled payload to the client application. So, there's a four-byte magic value at the beginning of the frame, or the beginning of the file buffer. There's a flags octet: there's a bit that means fragment, and a bit that means compressed. There's a version octet; the current version is two. And we can represent up to about four gigabytes of payload; we have nothing anywhere near that size yet. So, fragmentation: there's a bit that says this frame is a fragment, meaning the payload was fragmented into multiple frames and the receiver has to reassemble it. We don't, you know, like IP fragmentation, because that limits us to 64 kilobytes, so we do the fragmentation, or segmentation, in the application layer, much like TCP. And there's a bit that says the data is compressed, so we can fit even more DNS data or email data into a given Ethernet frame. And if you see both of the bits set, it was compressed and then fragmented, because doing it in the other order is problematic: you won't use as many bytes of the frame if you fragment and then compress. So, the payload header: this is the variable-length part, and it is encoded using Google Protocol Buffers. There is a vendor ID and a message type.
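The framing just described, a magic value, a flags octet with compress and fragment bits, a version octet, multiple payloads packed per frame, and compress-before-fragment, can be sketched as follows. This is a simplified, hypothetical reconstruction for illustration: the magic value, field sizes, and length-prefix layout are assumptions, and the real NMSG wire format differs in detail.

```python
import struct
import zlib

MAGIC = b"NMSG"          # illustrative 4-byte magic, not the real value
VERSION = 2
FLAG_COMPRESS = 0x01
FLAG_FRAGMENT = 0x02
MAX_FRAME = 8192         # stand-in budget for a jumbo Ethernet frame

def encode(payloads, max_frame=MAX_FRAME):
    """Pack payloads into frames: concatenate with length prefixes,
    compress first, then fragment if the result exceeds one frame."""
    body = b"".join(struct.pack("!I", len(p)) + p for p in payloads)
    body = zlib.compress(body)
    header = MAGIC + bytes([FLAG_COMPRESS, VERSION])
    room = max_frame - len(header)
    if len(body) <= room:
        return [header + body]
    # Fragment the already-compressed body, as the talk recommends.
    frags = []
    for off in range(0, len(body), room):
        frags.append(MAGIC + bytes([FLAG_COMPRESS | FLAG_FRAGMENT, VERSION])
                     + body[off:off + room])
    return frags

def decode(frames):
    """Reassemble frames, decompress, and recover the payload list."""
    body = b""
    for f in frames:
        assert f[:4] == MAGIC and f[5] == VERSION
        body += f[6:]
    body = zlib.decompress(body)
    payloads, off = [], 0
    while off < len(body):
        (n,) = struct.unpack_from("!I", body, off)
        payloads.append(body[off + 4:off + 4 + n])
        off += 4 + n
    return payloads
```

Note the ordering the speaker insists on: compressing the concatenated payloads first and slicing the compressed stream afterward wastes no frame space, whereas fragmenting first would compress each slice independently and fit fewer bytes per frame.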
The message types are per vendor, so if you want to create your own payload message types, we will assign you a vendor ID and you can assign whatever message type values you want. A timestamp: 64 bits of seconds plus 32 bits of nanoseconds, so you get a nanosecond-precision timestamp. And then we have a few optional fields for classification: source, operator, and group, so cooperating senders and receivers can further classify their data. And the payload itself: this is the opaque blob of information. Each vendor ID and message type tuple identifies a particular unique type of message. And we don't necessarily require that the blobs be encoded with GPB, but they frequently are, and we optimize for that particular case. So now we have the libnmsg C client API, and this is for client applications that want to process, send, receive, read, write, you know, multiplexing, demultiplexing, all sorts of [INDISTINCT] with NMSG containers and payloads. We include both a simple single-threaded interface, and we have a multi-threaded octopus I/O engine that you can utilize if you want. The multi-threaded code is good in that we can spread the load across multiple CPUs when we decode those messages. And if you happen to have a chunk of code that processes those messages, and you make it reentrant so it can be called multiple times from multiple threads, you get whatever speedup is possible from that. We are currently developing Python and Perl bindings. The Python bindings are stabilizing; we haven't yet made a stable release of them. And [INDISTINCT] is working on the Perl bindings, because I do not use Perl. >> ZIEGAST: [INDISTINCT] age. >> EDMONDS: Well, Python is probably just a few years younger than Perl. There's also a message module interface, so that we can extend the message types that readers and writers understand without having to recompile and relink all the readers and writers.
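The payload header fields listed above (vendor ID, message type, a 64-bit-seconds plus 32-bit-nanoseconds timestamp, and the opaque blob) can be sketched as a simple encoder. Real NMSG encodes this header with Protocol Buffers; the fixed struct layout here is an assumption used only to keep the sketch dependency-free.

```python
import struct
import time

# vendor (u32), msgtype (u32), ts_sec (u64), ts_nsec (u32), then the blob.
HEADER = struct.Struct("!IIQI")

def pack_payload(vendor, msgtype, blob, ts=None):
    """Prefix an opaque blob with a vendor/msgtype/timestamp header."""
    ts = time.time() if ts is None else ts
    sec = int(ts)
    nsec = int((ts - sec) * 1e9)
    return HEADER.pack(vendor, msgtype, sec, nsec) + blob

def unpack_payload(buf):
    """Split a packed payload back into header fields and the blob."""
    vendor, msgtype, sec, nsec = HEADER.unpack_from(buf)
    return {"vendor": vendor, "msgtype": msgtype,
            "time": (sec, nsec), "payload": buf[HEADER.size:]}
```

The (vendor, msgtype) pair is what lets a receiver look up the right message module to interpret the blob; everything after the header stays opaque to the transport, just as the talk describes.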
So, essentially, this is a DSL that exports a particular structure and deals in particular fields, and may optionally provide function pointers that perform specific processing: specific, you know, pretty-printing, parsing, that type of... >> ZIEGAST: So, for example, DNS. >> EDMONDS: Yes, DNS has a variety of interesting and particular wire formats for its data fields. Let me see. The traditional length-prefixed label encoding for names; there was a security vulnerability recently based on this concept, with SSL certificates that happened to have embedded nulls in the labels, which is valid according to the DNS wire protocol. So, there's an ISC DNS message type that provides a specific function to turn a label-encoded DNS name into a human-readable, you know, dot-delimited name, and we don't want to put that type of logic in libnmsg itself. We want to push that out into a plug-in for that particular message type, since we try to make the core library as agnostic of the upper layers as possible. And typically, the message module is a really short amount of code that [INDISTINCT] some generated object code from the Protocol Buffers compiler. Yes, we can make it as long as it needs to be; the additional complexity we can keep out of the core library. And that's the end of my presentation. >> ZIEGAST: Cool. This is all available online; you can download it from the ISC FTP site. And you can actually see some of the encodings we have, where ISC is vendor ID 1. You know, if you wanted to start using this yourself, you could make yourself a vendor ID. >> EDMONDS: Why don't you... you would ask us for a vendor ID? >> ZIEGAST: Oh, yeah. Yeah, everyone just picks their own, you know; Jon Postel isn't moderating that anymore. No, yeah, we'll just start with us, and we'll make sure in the source code that everyone plays nicely. So now that we have NMSG, we actually have some new channels that we can make available to people.
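The embedded-null issue mentioned above comes straight from the DNS wire format: a name is a sequence of length-prefixed labels, so a label may legally contain any byte, including NUL, which breaks naive C-string handling. This sketch shows roughly what a pretty-printing message module does, decoding such a name into a printable dotted form and escaping unsafe bytes in BIND's \DDD style; it ignores compression pointers, which a full parser would also have to handle.

```python
def decode_dns_name(wire: bytes) -> str:
    """Decode a wire-format DNS name (length-prefixed labels ending in a
    zero octet) into a printable, dot-delimited string."""
    labels, off = [], 0
    while off < len(wire):
        n = wire[off]
        if n == 0:                    # zero-length root label ends the name
            break
        off += 1
        label = wire[off:off + n]
        off += n
        # Escape anything outside printable ASCII, plus literal dots,
        # so an embedded NUL shows up as \000 instead of truncating.
        labels.append("".join(
            chr(b) if 0x21 <= b <= 0x7e and b != 0x2e else "\\%03d" % b
            for b in label))
    return ".".join(labels) or "."

if __name__ == "__main__":
    # "www.example.com" with a NUL byte smuggled into the first label,
    # the same trick as the SSL certificate vulnerability mentioned above.
    wire = b"\x04www\x00\x07example\x03com\x00"
    print(decode_dns_name(wire))   # www\000.example.com
```

Keeping this logic in a plug-in rather than in the core library matches the design argument in the talk: the transport stays agnostic, and only the DNS message module needs to know DNS's quirks.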
Spam: we have a bunch of spamtraps out there, and there's actually another provider that's sending us their spam reports. We basically take all the envelope information and some of the headers, extract some URLs out of it, and then packetize that, and people can actually use it. A search provider has given us some URL link pairs, so that, if someone wanted to, they could actually make a map of "here is the normal web," and then if something's not in that, maybe you want to pay some special attention to it. NetFlow is actually packetized; we don't do anything with that, NetFlow already has its own set of tools, like the SiLK toolkit. People are aware of Conficker. We got ourselves involved a lot with that, and helped aggregate and collect all of the data that was going through the sinkholes. So we created some types that we had all the web servers report in. And we also captured the DNS data, and we actually captured some of the P2P data, and we created channels for them, so people could actually see that stuff coming in, in real time. We are doing some development work for malware; I'll say more on that on another slide. We're also getting some darknet feeds from ourselves and another ISP, and we expect to be getting some more. But that's just normal packets; it's not necessarily NMSG, unless we choose to create an NMSG module to describe the stuff. Particularly with malware, you know, you might start with hashes, or MD5s; at a layer above that you might have people passing around "I'm interested in this," or "I saw this too." And above that, you might have something more descriptive, you know, maybe even encapsulating some XML, as IODEF is very popular for that. >> Is the NetFlow channel just a single blob that has all forms of NetFlow and sFlow, or is it a single vendor?
>> ZIEGAST: It's–we have some–we have some version 5 just from our own routers, but getting NetFlow from our own routers has been difficult. It is not--it is not an active channel. But, yeah, I would expect that sFlow version 9 is pretty much the standard for that because of--there's IPv6 and... >> Right. >> ZIEGAST: ...all that other stuff. Some other people are working with NetFlow and using--combine with passive DNS. But we're not doing any of that to build in ourselves right now. There are already tools out there. So for passive DNS, you know, you can sniff stuff off the wire that ".255" just basically typifies. If there's a channel number that's 202 and here's the broadcast address, we're listening on port 8430 where all of these packets are splitting out there. And, you know, you see some information like the time stamp, there's the type ISC ncap, the identifier of the sensor operator who submitted it which kind of randomized and kept separate, but you kind of tell where the data is coming from one source. There's the name server that it came from, our name server I commented out, that's the sensor operator, as you can see where it came in. And some of the flags that go with it, the first part is the answer part or, actually, the first one is the query; the zero means if there's no answer, but then it told you where the name servers were in the name server section in additional info that actually handed the IP addresses. So you can take all of that, you can put that into a structure if you're using "libnmsg". But you can even just do plain old text processing based on that. >> EDMONDS: Oh, you should use a library. >> ZIEGAST: It's much more efficient. >> EDMONDS: Don't force server through your text. >> ZIEGAST: But, you know, some of us old timers used awk, sed, perl and such, and don't keep up with that. 
So as some people have done some useful things without being efficient, but, yes, do take the time in the auction where in the C or python or whatever it carries out there and save yourself some money on your hardware. So another one, here's a sinkhole for Conficker. So here we have another type, you know, "4 ISC http" and so we get where the request came from. So, these are people who are infected for coming back to try to talk to the command and control over the web, and we just happened to take over the domain that they're using for that. So we can kind of do some things. We do some POF. We look at the–all the stuff that we could get out of the request and that would help identify for which particular strain of Conficker they were infected with. And now they can create a database, people can use some of these data for a mediation tools and some people do. So, you know, Chris Lee did a lot of work. He's out of–was at Georgia Tech in our shadow server. He put a lot of it together and make it useful for people. So for spam, we do a bunch of preprocessing scripts that basically take the email message, you know, out of your standard input and then extracts the things that people are interested in. This should be the helo from RCP to the IP address that came from, you know, there are receipt headers in the URLs that are found in the message. No one's really interested in the image blobs yet, but if someone does become interested, we might do something with it. We have plugins for postfix and qpsmtpd, qpsmtpd which is a pro-tool which is actually a very fast and efficient from Linux and VST servers. So we have several spam traps of that type where it's just basically taking unused domains and people will just keep sending spam to it anyway. They are seeded or populated. And that's very useful because that really is spam. We have methods where you can actually tag "this is spam" report. 
So if someone says or reporting address, an abuse address, or they have a button, they click on their client; we can actually create packets that say, "Here's a user report". Now, there's a little that you have to go a little statistical because there are false pauses and that that some people may take some marketing and say its spam where it may not necessarily be pure spam. Something we haven't implemented yet but would be interesting, would be, say, as every mail message comes in, take the headers from that or the envelope info and say what spamassassin score was with it, and then you can basically–if you have everyone reporting into a central source, you have a really good chance of real time reputation, maybe that some commercial services out, you might be able to do a public domain one. All this spam is a great starting point for analysis, you'll find–typically find that people are using botnets or in some cases even just buying services of the bolt requesting. I will lead you to some other data. Here's an example of some spam that you get off at spam channel. Again, the time stamp and the time sensor, it's a spam trap. I confiscated of this, except that's a real domain. And there's a URL in there, that "ff24490.gif" or yet that points to basically a--but, you know, a biox, you know, kind of advertisement. And so this is obviously some kind of a bad domain that's being used for phishing or, actually, in this case, just spam. So one of the things we do is we take some of the passive DNS that we have in there and we look up that domain, it points to some IP address, well, lo and behold, here are all the other domains that are being used by that IP address that we've collected via passive DNS. So once you find one, you can actually do blocking or add blocking on all of the other domains even before they're used, like maybe they don't use them all right away. So you can actually be–start to be proactive. 
You can find some other information like ".j8w.ru" is probably something close to what the real name for that server is. And then you can go program around down and find some stuff. When you have multiple data types, like passive DNS or spam or other information about the networks, you can do a whole bunch of data combine and find some more interesting info. Jose Nazario and Thortsen Holz back in Malware08, they made a paper where they create a point system for all these different things when you combined them together and add up the points. They actually say, "Hey, that's a fast-flux spot". You know, like being a multiple networks within IP address ranges or how often or how many host do you have in the A records. Dave Dagon and Wenke Lee, they just basically took passive DNS, a little bit of a string matching like looking for the record virus inside of the packets coming by and they are very fast and successful in finding FakeAV sites even before they were blacklisted by other services. Richard Clayton is a professor who was investigating one of the centers that's being done in the UK. So he takes–he takes a list of hosts of the passive DNS, combine with some active scans that he does himself, and he can pretty much determine which of the URLs were getting blocked, including at one point internet archive, which is a pretty heavy website to do blocking. Andrew Fried is a consultant with us. He's going to be talking of Blackhat DC in January about a lot of stuff that he did of combining the spam, the spam's BGP info, passive DNS, analysis of some top level demands zone files to basically go after things like Zeus/Avalanche or phishing or whatever. He's very active in the community in just saying, "Hey, you know, here are all these new domains, they get to stuck in the server bowl and they start to get blocked." 
But he used to do this full time for the IRS back when they were having phishing problems, and now he's actually helping not only the IRS but all these other people who are getting hit with a lot of the same methods. Ed Stoner worked with CERT and in Flocon in January. He's going to be talking about how he puts together a passive DNS with NetFlow to help expand on what botnet knowledge you already have. You can already get some botnet knowledge out of NetFlow but if you take passive DNS to the next level, you can basically use your IPs to help find more names which help you find more IPs to help you find more names and eventually get map of everything. We're actively looking to start the malware channel. Some other people are creating products for DNS reputation based on the data we're interested and perhaps offering scanning because, you know, people who are doing scanning right now for DNS, they're finding out the bad guys in figuring out who they are so they're getting blocked. So let me offer some scanning infrastructure for people. Automated abuse or distributed denial of service attack reporting, it might be a way to standardize and have some real term report and saying, "Hey, I'm getting flooded". And then you tell a whole bunch of other people which might include, you know, your antivirus vendors or ISPs direct LAN, whereas as opposed to picking up the telephone. With the URL search data, you can perhaps find that a whole bunch of people are coming to some place at ones, and that might be because Britney Spears did something that day or it could be that there's a new virus that everyone's downloading at once. But if you have everyone looking at a new URL all at ones, you can perhaps take a look at that and--you know, BGP updates, you can--that security data as well, that could be helpful for finding out people's networks getting stolen up from under them. And here's how people can help. 
You know, if there's sharing methods that people are using right now that are between themselves, they want to incorporate more people into that. We can help reduce some of the overhead by having it to put in one central place and using our broadcast infrastructure to get to all the people who need it. We don't have to resend or recopy that data between multiple phases. If you don't like working with service agreements and NDAs and stuff like that, we can help simplify things. Something else you can do is bring some servers SID–SIE and actually take a look at what's out there. I mean, there's a lot of good minds here at Google. I can imagine some of them might be interested on the security side of seeing what's out there and seeing what they can figure out and see if they could combine it with some of the data that you guys already have. And you can also install sensors, you know, particularly with ISPs or corporation or security companies. Everyone's got some kind of data that is perhaps worthless junk to them, but to someone else they can--they could do something effective with it when they combined it something else. So people can go ahead and send us some more data. We'd appreciate it, and so the rest is security community that works with us. So we're SIE, you can send us email with that info. It will get to all three of us. We got a website. There's my phone number. And for nmsg, you can go and get download it yourself. We have a developer mailing list that we recently set up where people can talk about how they use it. This is generic, it's open source. This is not SIE specific. You can use the stuff internally and then tell us how you're using it, it might be interesting. Ncap's available from us as well. We thank you guys for making Google Protocol Buffers and we especially thank all the sensor operators who are out there donating data to make all these useful.

History

The company was originally founded as the Internet Software Consortium, Inc. by Paul Vixie, Rick Adams, and Carl Malamud, to continue the development of the BIND software. The founders believed that BIND's maintenance and development should be managed and funded by an independent organization. ISC was designated as a root name server operator by IANA, originally as NS.ISC.ORG and later as F.ROOT-SERVERS.NET.[citation needed]

In January 2004, ISC reorganized under the new name Internet Systems Consortium, Inc.[7]

In July 2013, ISC spun off the Security Business Unit to Farsight Security, Inc., a new company started by ISC founder Paul Vixie.[8]

In early 2020, ISC closed its headquarters in Redwood City, California and moved its operations to Newmarket, New Hampshire.[9]

Open Source

ISC develops and maintains open-source networking software, including BIND and two DHCP implementations: ISC DHCP and Kea DHCP. ISC also distributes INN and several older, unmaintained projects.[1] Some of its early software was developed by engineers commercially employed by Nominum, among others.[10]
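The two DHCP implementations differ most visibly in configuration style: ISC DHCP uses its own dhcpd.conf syntax, while Kea is configured with JSON. As a rough illustration only (the interface name and addresses below are placeholders, not a recommended setup), a minimal Kea DHCPv4 configuration might look like:

```json
{
  "Dhcp4": {
    "interfaces-config": { "interfaces": [ "eth0" ] },
    "lease-database": { "type": "memfile" },
    "subnet4": [
      {
        "id": 1,
        "subnet": "192.0.2.0/24",
        "pools": [ { "pool": "192.0.2.10 - 192.0.2.200" } ],
        "option-data": [
          { "name": "routers", "data": "192.0.2.1" }
        ]
      }
    ]
  }
}
```

Because the configuration is plain JSON, it can be generated and validated by ordinary tooling, which was one of the motivations for Kea's design as a successor to ISC DHCP.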

ISC license

ISC developed and used the ISC license, which is functionally similar to the simplified BSD and MIT licenses. The ISC license is OpenBSD's preferred license for new code.[11]
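Like the MIT and simplified BSD licenses it resembles, the ISC license is short enough to quote in full; its standard template reads (with <YEAR> and <OWNER> filled in by the copyright holder):

```text
Copyright <YEAR> <OWNER>

Permission to use, copy, modify, and/or distribute this software for any
purpose with or without fee is hereby granted, provided that the above
copyright notice and this permission notice appear in all copies.

THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
```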

All current versions of ISC-hosted software are available under the Mozilla Public License 2.0.[12]

DNS root server

ISC operates the DNS "F" root server,[1] the first such server to be distributed using anycast. In 2007 it was announced that ISC and ICANN would sign an agreement regarding the operation of F, the first such agreement made between ICANN and a root-server operator.[13]

Usenet moderators list

ISC maintains and publishes (on ftp.isc.org) the central Usenet moderators list, and relays submissions for moderated groups, so that individual server operators do not have to track moderator changes.[14]

Internet Domain Survey

Number of Internet hosts worldwide in 1970–2015[15]

The Internet Domain Survey searched the Domain Name System (DNS) to discover every Internet host. The survey began when only a few hundred hosts were Internet-linked.[16] The earliest published reports, dated 1993, were performed by Network Wizards owner Mark K. Lottor. The Internet host count was 1,313,000 in January 1993 and 1,062,660,523 in the January 2017 survey.[17]
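A quick back-of-the-envelope check, using only the two survey figures quoted above, shows how steep that growth was when expressed as a compound annual rate:

```python
# Host counts from the January 1993 and January 2017 surveys.
hosts_1993 = 1_313_000
hosts_2017 = 1_062_660_523
years = 2017 - 1993  # 24 years between the two figures

# Compound annual growth rate: (end / start) ** (1 / years) - 1
cagr = (hosts_2017 / hosts_1993) ** (1 / years) - 1
print(f"average annual growth: {cagr:.1%}")  # roughly 32% per year
```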

ISC ended its sponsorship and publication of the Internet Domain Survey in 2019.[18]

References

  1. ^ a b c "The History of ISC". Internet Systems Consortium. Retrieved 2020-09-14.
  2. ^ Internal Revenue Service (2007-12-15). "501(c)(3) exemption letter". Internet Systems Consortium. Archived from the original (PDF) on 2008-11-18. Retrieved 2009-04-11.
  3. ^ "F-Root". Internet Systems Consortium. 10 July 2019. Retrieved 2020-09-14.
  4. ^ "Milestone Agreement Reached Between ICANN, and F Root Server Operator, Internet Systems Consortium". ICANN. Retrieved 2020-09-14.
  5. ^ "IETF Standards Written by ISC Contributors". Internet Systems Consortium. ISC. Retrieved 14 September 2020.
  6. ^ "2019 ISC Annual Report" (PDF). ISC. Retrieved 2020-09-14.
  7. ^ "ISC Mission". Internet Systems Consortium. Retrieved 2020-09-14.
  8. ^ "ISC Spins Off Its Security Business Unit". Internet Systems Consortium. 2 July 2013. Retrieved 2020-09-14.
  9. ^ "ISC 2019 Year in Review". Internet Systems Consortium. 13 January 2020. Retrieved 14 September 2020.
  10. ^ "Nominum Inc history". Archived from the original on 2008-12-20. Retrieved 2008-12-04. David Conrad founded Nominum in 1999 to develop BIND9 and ISC DHCP3 for the Internet Software Consortium
  11. ^ "OpenBSD: Copyright Policy". openbsd.org. Retrieved 5 March 2024.
  12. ^ "ISC Software Licenses". Internet Systems Consortium. Retrieved 14 September 2020.
  13. ^ "Milestone Agreement Reached Between ICANN, and F Root Server Operator, Internet Systems Consortium". ICANN. Retrieved 2020-09-14.
  14. ^ Usenet Hierarchy FAQ Section 4.
  15. ^ "Internet host count history". Internet Systems Consortium. Archived from the original on May 18, 2012. Retrieved May 16, 2012.
  16. ^ "ISC Internet Domain Survey". Internet Systems Consortium. Archived from the original on 2008-11-17. Retrieved 2009-04-11.
  17. ^ "Internet Domain Survey, January 2017". Internet Systems Consortium. Retrieved 2017-02-14.
  18. ^ "ISC ends Internet Domain Survey". Internet Systems Consortium. 26 August 2019. Retrieved 14 September 2020.

External links

  • ISC Official site
This page was last edited on 5 March 2024, at 13:40