Economic Incentives for Cooperation to Fight Spam

Around 90% of all email is spam, and the situation is not getting better. The IIAR project at the University of Texas is building econometric models for economic incentives for Internet email providers, so that they can better prevent and deal with spam and botnets.

Users of Gmail or Yahoo Mail may not notice, because those services filter quite effectively, but around 90% of all email is spam, and the situation is not getting better. Worse, most of that spam comes from botnets, which infest a large proportion of all computers on the Internet and also support phishing, DDoS attacks, and other malicious activities.

While the black hats cooperate for money in their underground economy of spammers, bot herders, warez vendors, exploit writers, etc., the white hats still mostly try to go it alone, organization by organization. Sure, there are many coordination projects and conferences, but none of them is as effective as the criminal economy.

To illustrate: which email service providers are effective at preventing and dealing with spam, and which are not? There are many lists of spamming countries, but those are not very useful: how do you tell a whole country to clean up its act? Sure, it can pass laws against spamming, but laws are slow and spammers adapt quickly.

There are a few rankings of spam by domain or Autonomous System Number (ASN), such as the one done by CBL (the Composite Blocking List). Can something like that be leveraged into a reputation system to name and shame bad or negligent actors and reward good actors? This is an example of one type of economic incentive the IIAR project is exploring. (IIAR stands for Incentives, Insurance, and Audited Reputation.) Reputation systems are effective in many other fields, ranging from automobiles to graduate schools. Maybe they can be brought to bear on Internet security.
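As a rough illustration of what a ranking by ASN involves, the sketch below totals spam counts per ASN and sorts them from worst to best. It is not CBL's method or the IIAR project's: the function name `rank_asns`, the input layout, and all numbers are hypothetical, and the ASNs are ones reserved for documentation use.

```python
from collections import defaultdict


def rank_asns(observations):
    """Rank ASNs by total observed spam volume, highest first.

    observations: iterable of (asn, spam_message_count) pairs, for example
    one pair per listed IP address. This input layout is an assumption.
    Returns a list of (rank, asn, total_volume) tuples.
    """
    totals = defaultdict(int)
    for asn, count in observations:
        totals[asn] += count
    # Sort by volume descending; break ties by ASN so the order is stable.
    ordered = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return [(rank, asn, volume)
            for rank, (asn, volume) in enumerate(ordered, start=1)]


# Hypothetical sample data using documentation-only ASNs.
sample = [(64496, 120), (64497, 900), (64496, 300), (64498, 50)]
ranking = rank_asns(sample)
for rank, asn, volume in ranking:
    print(f"{rank}. AS{asn}: {volume} messages")
```

A usable reputation score would presumably need more than raw volume, for instance normalizing by the size of each ASN's address space, but that choice is outside what this sketch assumes.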

The IIAR project operates at multiple levels, from operations to business theory. Operations include investigating effects of specific botnet takedowns, as presented at NANOG 48 (see FireEye's Ozdok Botnet Takedown In Spam Blocklists and Volume Observed by IIAR Project, CREC, UT Austin. John S. Quarterman, Quarterman Creations, Prof. Andrew Whinston, PI CREC, UT Austin).

Figure 1 below shows the top 10 botnets ranked by the volume of traffic they produced. More graphs are available in the presentation mentioned above.

Figure 1: Top 10 botnets by volume of traffic created

The project has dug further into those results and will follow up with more information about what happened to Mega-D, which ASNs were affected, and how long the effects persisted.

(Mega-D, also known as Ozdok, is a botnet that at its peak was responsible for sending between 30% and 35% of spam worldwide. It was taken down by FireEye, a security company, in November 2009. For more information, see the presentation above or refer to http://en.wikipedia.org/wiki/Mega-D )
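The two questions above, how much spam volume changed for an ASN after a takedown and how long the effect lasted, can be sketched as a simple before-and-after comparison. This is only a minimal illustration under stated assumptions, not the project's analysis: the function names, the 50% threshold, the takedown day, and every volume figure are hypothetical.

```python
from datetime import date
from statistics import mean


def takedown_effect(daily_volume, takedown_day):
    """Compare mean daily spam volume before and after a takedown.

    daily_volume: dict mapping date -> spam volume for one ASN.
    Returns (mean_before, mean_after, relative_change).
    """
    before = [v for d, v in daily_volume.items() if d < takedown_day]
    after = [v for d, v in daily_volume.items() if d >= takedown_day]
    if not before or not after:
        raise ValueError("need observations on both sides of the takedown")
    mean_before = mean(before)
    mean_after = mean(after)
    if mean_before == 0:
        raise ValueError("baseline is zero; relative change is undefined")
    return mean_before, mean_after, (mean_after - mean_before) / mean_before


def days_suppressed(daily_volume, takedown_day, baseline, fraction=0.5):
    """Count consecutive observed days, starting at the takedown, on which
    volume stays below fraction * baseline. The threshold is an assumption."""
    count = 0
    for day in sorted(d for d in daily_volume if d >= takedown_day):
        if daily_volume[day] >= fraction * baseline:
            break
        count += 1
    return count


# Placeholder day within November 2009; NOT the actual takedown date.
TAKEDOWN_DAY = date(2009, 11, 15)
# Hypothetical volumes for a single ASN.
volume = {
    date(2009, 11, 13): 1000,
    date(2009, 11, 14): 1200,
    date(2009, 11, 15): 300,
    date(2009, 11, 16): 250,
    date(2009, 11, 17): 1100,
}
mean_before, mean_after, change = takedown_effect(volume, TAKEDOWN_DAY)
suppressed = days_suppressed(volume, TAKEDOWN_DAY, mean_before)
print(f"change: {change:+.0%}, suppressed for {suppressed} days")
```

Running the same comparison across many ASNs would show which ones were affected by a given takedown; a proper statistical treatment would also need to account for normal day-to-day variation.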

The IIAR project has just had a theoretical paper accepted to CEAS 2010, the Collaboration, Electronic messaging, Anti-Abuse and Spam Conference: "A Game Theoretic Model and Empirical Analysis of Spammer Strategies", by Manoj Parameswaran, Huaxia Rui, Serpil Sayin and Andrew Whinston. This paper looks at spam volume per single IP address (thanks to CBL for the custom data) and makes a first cut at building a game theoretic model to explain some aspects of how spammers and blocklists interact.

Figure 2: Spam volume by time for a single IP address

Figure 2 above shows spam volume (data provided by CBL) on the y axis and time on the x axis for a single IP address, one of thousands we have examined in this manner. Such low-level analysis is useful for characterizing behavior that can be further analyzed at the ASN level through aggregate metrics. There will be more on that in a later post.

The project has a daily database of half a dozen anti-spam blocklists going back more than a year. Mining this database permits examinations at levels from single IP addresses upwards through netblocks to ASNs, organizations, groups of organizations, and countries. Due to custom data (thanks to CBL, PSBL, UTCS, and Team Cymru), the project can also map botnets to IP addresses and to ASNs. A wealth of empirical, statistical, and theoretical investigations are possible, and the team is working on several lines simultaneously, publishing in venues ranging from Internet operational conferences to business journals.
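To make the idea of moving from single IP addresses up through netblocks to ASNs concrete, here is a minimal sketch of such a roll-up for one day of blocklist data. The netblock-to-ASN table, the function names, and the sample addresses are all hypothetical (the prefixes and ASNs are ones reserved for documentation), and the project's actual database schema is not described in this article.

```python
import ipaddress
from collections import Counter

# Hypothetical netblock-to-ASN table built from documentation-only values.
NETBLOCK_TO_ASN = {
    ipaddress.ip_network("192.0.2.0/24"): 64496,
    ipaddress.ip_network("198.51.100.0/24"): 64497,
}


def find_netblock(ip_text):
    """Return the known netblock containing ip_text, or None if unmapped."""
    ip = ipaddress.ip_address(ip_text)
    for net in NETBLOCK_TO_ASN:
        if ip in net:
            return net
    return None


def roll_up(listed_ips):
    """Count listed addresses per netblock and per ASN.

    Addresses that fall outside every known netblock are skipped.
    """
    per_netblock = Counter()
    per_asn = Counter()
    for ip_text in listed_ips:
        net = find_netblock(ip_text)
        if net is None:
            continue
        per_netblock[net] += 1
        per_asn[NETBLOCK_TO_ASN[net]] += 1
    return per_netblock, per_asn


# Hypothetical blocklist entries for one day; the last one is unmapped.
listed_today = ["192.0.2.10", "192.0.2.77", "198.51.100.5", "203.0.113.9"]
per_netblock, per_asn = roll_up(listed_today)
print(dict(per_asn))
```

The linear scan over netblocks keeps the sketch short. With real routing data, a longest-prefix-match structure would be the more usual choice, and further tables would be needed to group ASNs into organizations and countries.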

Conclusion

The IIAR project is building econometric models for economic incentives for Internet mail providers, so that they can better prevent and deal with spam and botnets. We are doing this because years of observations by the principals convinced us that the white hats were not cooperating at nearly the same level as the black hats. The black hats have their criminal economy to spur them on; the white hats also need economic incentives.

The project team is submitting more presentations as the project explores several threads of investigation, for example digging more into the empirical data related to the recently accepted conference paper with the IP-level game-theoretic model. We are also pursuing the line of thought that was presented at NANOG, going further by correlating which botnets match the affected ASNs and looking for other ASNs similarly affected. Another goal is to statistically analyze effects at the IP address, ASN, and organizational levels. All this work also funnels into one of our long-term goals, which is to build a reputation system that will produce some of the long-needed economic incentives for better cooperation on Internet security.

Feedback please

We would welcome feedback on this study from the RIPE community. Would you like to see more cooperation on fighting spam, and do you have any suggestions on how to achieve it?

We are looking for as many specific incidents of botnet takedowns, exploit discoveries, and vulnerability patches as we can find. We'd be most happy to correspond with any takedown company, ISP, software vendor, security company, research organization, etc. that can supply such information.

We could also use more sources of attributions of botnet infestations to specific IP addresses. If you have some, please let us know. For all information, we can promise either anonymity or credit, as desired by the source.

Further, we could use some good test subjects. If you would like to know what we can see for your ASN in changes in spam volume, numbers of addresses listed, attribution to botnets, etc., please let us know.

We hope to be producing very rough drafts of some aspects of a reputation system in the coming months, and we could use a few reviewers for those. If you're interested, please let us know.

Please leave your comments and suggestions under the article. You can also send mail to labs@ripe.net or directly to the author (see below).

Acknowledgements

This material is based upon work supported by the National Science Foundation under Grant No. 0831338.

For the international cast of characters (U.S., India, China, Finland, and Turkey) and links to the grant and other publications, see the project's home page.

Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

John S. Quarterman
antispam@quarterman.com