Reply to comment:

Hi Anonymous, <br /><br />Thank you for your feedback - although it would be beneficial to us and the community to know who you are. <br /><br />I'd like to avoid starting a programming language war here. I do agree with you that Java is bloated, and I would like to use something else to gain more performance. However, the distributed computing framework Hadoop offers is implemented in Java, and the benefits it brings to the table far outweigh the performance penalty. <br /><br />The power of this system for us really comes at scale - we use libtrace, libpcap, etc. extensively and will continue to do so. For us it is very useful to be able to iterate over TBs of data within a reasonable amount of time, and to lower that time simply by adding computing capacity. With libtrace, our only way of scaling was vertical; once we hit that limit, we had to compute batches on different machines and later merge the results into one, which was an error-prone process. <br /><br />I cannot agree, however, with your claim that you can process 100GB of data in a single thread with libtrace in three minutes. As I mentioned, we use it extensively on modern hardware and have never reached performance levels anywhere close to this. I just ran a 1GB PCAP (uncompressed) and it took 1.5 minutes to read it. This was with all output discarded, and therefore produced no output I/O. <br /><br />Regards, <br />Wolfgang
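
P.S. For anyone who wants to check the arithmetic behind that last paragraph, here is a quick back-of-the-envelope sketch using only the figures from the comment (assuming decimal gigabytes, i.e. 1 GB = 1000 MB; the exact numbers will vary with the trace):

```python
# Compare the claimed libtrace throughput (100 GB in 3 minutes)
# with the observed one (1 GB PCAP read in 1.5 minutes).

def throughput_mb_s(gigabytes, seconds):
    """Average read throughput in MB/s, assuming 1 GB = 1000 MB."""
    return gigabytes * 1000 / seconds

claimed = throughput_mb_s(100, 3 * 60)   # claimed: 100 GB in three minutes
observed = throughput_mb_s(1, 1.5 * 60)  # observed: 1 GB in 1.5 minutes

print(f"claimed:  {claimed:.0f} MB/s")         # ~556 MB/s
print(f"observed: {observed:.0f} MB/s")        # ~11 MB/s
print(f"gap:      {claimed / observed:.0f}x")  # ~50x
```

The claim would require sustaining well over 500 MB/s in a single thread, roughly 50x what we observe in practice.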