Hello,
If you do not look up every IP to apply the domain-name-based rules, how would you guarantee that you weren't missing an important packet? You have my interest. What services could be farmed off that would side-step real-time lookup? And would it be realistic in light of its memory requirements?
I do have a caching name server running. The initial IP resolution is still horribly slow, and with my active server, my cache is already spinning its wheels. I dedicate a huge chunk of RAM to it, but my traffic load is still larger than the memory I have for a DNS cache.
According to the netfilter team, the SOHO script had a typo in it (from the author who wrote it) that allowed a bad packet flag combination. BullDog blocked where netfilter did not. I had the flags set differently than the script (from an example in snort). I have rewritten the script now that I understand netfilter's complex command line, but back then, as I'm sure many would agree, being new to netfilter, I had to rely on their prebuilt scripts.
A perfect example for DNS blocking is a persistent harvester/spammer that I've had misadventures with. I would always see this combination:
1. DNS inquiry.
2. Web page scanning by another IP in the same domain.
3. A third IP address in the same domain connects to deliver the spam.
By carefully identifying this pattern, I have been able to block the initial inquiry and thus (in this case) block a spammer that was unloading 3000-7000 spams a day on my server.
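To make the idea concrete, here is a minimal sketch of a domain-based check, assuming the reverse name of an address is already available. This is not BullDog's code; the names in_blocked_domain, should_block and the resolve callable are hypothetical, and the addresses and domains are reserved documentation examples.

```python
def in_blocked_domain(hostname, blocked_domains):
    """Return True if hostname equals, or is a subdomain of, a blocked domain."""
    host = hostname.lower().rstrip(".")
    for domain in blocked_domains:
        domain = domain.lower().rstrip(".")
        # Require a dot boundary so "notspam.example" does not match "spam.example".
        if host == domain or host.endswith("." + domain):
            return True
    return False


def should_block(ip, resolve, blocked_domains):
    """Decide on an IP by its reverse name.

    resolve is a caller-supplied lookup (hypothetical): it takes an IP string
    and returns a hostname, or None when no name is found. Addresses that do
    not resolve are not blocked by this rule.
    """
    hostname = resolve(ip)
    if hostname is None:
        return False
    return in_blocked_domain(hostname, blocked_domains)


# Example with a fake lookup table standing in for real reverse DNS.
names = {
    "192.0.2.10": "ns1.spam.example",
    "192.0.2.20": "crawler.spam.example",
    "198.51.100.5": "mail.notspam.example",
}
blocked = {"spam.example"}
results = {ip: should_block(ip, names.get, blocked) for ip in names}
```

Because the rule keys on the domain rather than on a single address, the second and third hosts in the pattern are caught even though their IPs were never seen before.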
I have used grep in the past to filter and count. I have also piped the tcpdump output to a file to examine it. All traffic is verified as destined for my server. It is definitely busy, as I host 5 domains on it. On average, I can expect a million people a day through various services.
tcpdump uses libpcap. libpcap does not stop the normal flow. If it were libipq based, then the whole network would have come to a screaming halt. I am thinking of going that route for simplicity and for a few advanced features offered by tying directly into the kernel queue. If I do go libipq, I've got even less time to process a packet and return the verdict. This is definitely "hot on the wire" and I know there is still a lot of weight I need to shave to get there.
I'm sure a real-time system could be done better with the udns or adns libraries. This is an area that I am definitely open to exploring. The DNS database is nice if you need to search domain information heavily. But I do agree that there may be better options. I've tried not to limit my thinking in developing BullDog. What I have now is working quite well and certainly better than I stated. However, there is always room for improvement.
Your times are pretty consistent with mine.
Try setting up a caching name server. I have enough diverse traffic that my results showed very little if any improvement. A cache works well if the data stream is predictable. My server runs all the basic services and then some, for a diverse section of the internet. The only thing consistent about my traffic is what comes off my LAN.
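A toy illustration of why a cache helps a predictable stream and not a diverse one. This is only a sketch with a small LRU cache and made-up keys; it says nothing about how a real name server sizes or expires its cache.

```python
from collections import OrderedDict


def hit_rate(stream, capacity):
    """Fraction of lookups in stream (a list) answered from an LRU cache."""
    cache = OrderedDict()
    hits = 0
    for key in stream:
        if key in cache:
            hits += 1
            cache.move_to_end(key)         # mark as most recently used
        else:
            cache[key] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used
    return hits / len(stream) if stream else 0.0


# Predictable traffic: the same two names over and over.
predictable = ["a.example", "b.example"] * 50
# Diverse traffic: every lookup is a name never seen before.
diverse = ["host%d.example" % i for i in range(100)]

predictable_rate = hit_rate(predictable, capacity=10)
diverse_rate = hit_rate(diverse, capacity=10)
```

With the repeating stream nearly every lookup after the first two is a hit; with the all-new stream the cache never gets a chance to answer anything, no matter how much memory it is given.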
This is great if you wish to flood your table list. BullDog only puts entries in the iptables list that have actually hit the system.
As an extreme visual demonstration, repeat your rule with the aol domain. A couple of health pests will crush your system resources. If you don't crash your system, I'd be very surprised and would like to see the memory layout. Even with 2G of RAM, that rule on aol.com (or any large domain) would sink me. This is another clear advantage of BullDog. The probability that every IP address in the world that I don't want is going to attack all at once is very low, whereas the probability of me crashing my server by loading such a large static rule list is (according to netfilter and my own tests) absolute.
Actually no. Netfilter has a documented issue where if the table grows beyond 30,000 entries total, the entire system begins to degrade. Supposedly 2.6 is going to correct this issue. In keeping up with the archives, I haven't seen this happen yet.
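The difference from a preloaded static list can be sketched as follows. This is an illustration only, not BullDog's implementation: the OnDemandBlocker class is hypothetical, the INPUT chain and DROP target are assumptions, and the sketch only builds the iptables argument list without running it.

```python
class OnDemandBlocker:
    """Track offenders and emit one rule per address, only on first sight."""

    def __init__(self):
        self.blocked = set()

    def block(self, ip):
        """Return the iptables argv for a new block, or None if already listed."""
        if ip in self.blocked:
            return None
        self.blocked.add(ip)
        # Assumed rule shape: append a DROP rule for this source address.
        return ["iptables", "-A", "INPUT", "-s", ip, "-j", "DROP"]


blocker = OnDemandBlocker()
first = blocker.block("192.0.2.1")    # new offender: a rule is produced
repeat = blocker.block("192.0.2.1")   # same offender again: nothing to add
```

The rule list grows with the number of addresses that have actually misbehaved, not with the size of the domain they belong to.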
I will describe the life of a packet header in BullDog:
A packet hits the collector.
The collector sends the entire packet to a processor, where any number of tests can be performed.
If that packet passes the tests, the packet header is stored in a splay tree. All future packets are compared to that packet header. If they match, the processor is short-circuited. The splay tree is allowed to grow only to a certain size. Then it is flushed and starts all over. Not every packet is pushed through the same funnel. A DNS lookup only occurs on the FIRST sight of a packet; from then on, the system knows to allow or block based upon the rules.
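The flow above can be sketched roughly like this. It is a simplification under stated assumptions, not BullDog's code: a plain dict stands in for the splay tree, the VerdictCache name and the run_tests callable are hypothetical, and the sketch remembers both allow and block verdicts.

```python
class VerdictCache:
    """Run the expensive tests once per header, then short-circuit."""

    def __init__(self, max_entries, run_tests):
        self.max_entries = max_entries
        self.run_tests = run_tests   # expensive tests, e.g. a DNS lookup
        self.verdicts = {}           # header -> True (allow) / False (block)

    def verdict(self, header):
        # Seen before: skip the processor entirely.
        if header in self.verdicts:
            return self.verdicts[header]
        # First sight: run the full tests.
        result = self.run_tests(header)
        # Bounded size: once the limit is reached, flush and start over.
        if len(self.verdicts) >= self.max_entries:
            self.verdicts.clear()
        self.verdicts[header] = result
        return result


calls = []

def example_tests(header):
    """Stand-in for the real tests; records each time it is invoked."""
    calls.append(header)
    return header[0] != "192.0.2.66"

cache = VerdictCache(max_entries=2, run_tests=example_tests)
h1 = ("192.0.2.1", 80)
cache.verdict(h1)   # first sight: tests run
cache.verdict(h1)   # repeat: answered from the cache
```

A flush throws away known-good entries too, so the next packet from each source pays for the tests again; that is the price of keeping memory bounded.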
