TWC: Medium: Collaborative: Know Thy Enemy: Data Mining Meets Networks for Understanding Web-Based Malware Dissemination
is based upon work supported by the National Science Foundation
under Grant No. CNS-1314603. Any opinions, findings, and
conclusions or recommendations expressed in this material are those
of the author(s) and do not necessarily reflect the views of the
National Science Foundation.
How does web-based malware spread? We use the term web-based malware to describe malware that is distributed through websites and through malicious posts in social networks. We are in an arms race against web-based malware distributors, and as in any war, knowledge is power: the more we know about them, the better we can defend ourselves. Our goal is to understand the dissemination of web-based malware by creating "MalScope," a suite of methods and tools that uses cutting-edge approaches to build spatiotemporal models, generators, and sampling techniques for malware dissemination. From a scientific point of view, this project brings together two disciplines: Data Mining and Network Security. The outcome is a suite of novel, sophisticated, and scalable techniques and models that will enhance our understanding of malware dissemination at a large scale. We use two types of web-based malware dissemination data: (1) user machines accessing dangerous sites and downloading web-based malware; and (2) Facebook users being exposed to malicious posts. We already have such data and will continue to obtain more from our industry partners (e.g., Symantec's WINE project) and open-access projects, or to collect it on our own (e.g., MyPageKeeper).
The broader impact of our work is that it will enable the development of security solutions for end-users and industry. A 15-minute network outage costs a 200-employee company about $40K, while identity theft costs about $1,500 per person on average. By knowing the enemy better, security researchers and industry can more effectively stop the interconnected manifestations of Internet threats: identity theft, the creation of botnets, and DoS attacks. The PIs have a track record of technology transfer, with collaborators at industrial labs (Yahoo, MSR, Symantec, AT&T, IBM), national labs (LLNL, Sandia), open-source software ("Pegasus"), and spin-off startups (StopTheHacker). Educational impacts include developing a new course and providing publicly available educational material and open-source software.
Data mining, web-based malware dissemination, graph mining.
- NSF, Award Number: CNS-1314603, Duration: September 1, 2013 - August 31, 2017 (Estimated)
The following professors are co-PIs on this project:
The following Rutgers graduate students work on the project:
Publications:
- Sucheta Soundarajan, Tina Eliassi-Rad, and Brian Gallagher. A Guide to Selecting a Network Similarity Method, Proceedings of the 2014 SIAM International Conference on Data Mining (SDM'14), Philadelphia, PA, April 2014.
- Long T. Le, Tina Eliassi-Rad, Foster Provost, and Lauren Moores. Hyperlocal: Inferring Location of IP Addresses in Real-time Bid Requests for Mobile Ads, Proceedings of the 6th ACM SIGSPATIAL International Workshop on Location-Based Social Networks (LBSN'13), Orlando, FL, November 2013.
- Michele Berlingerio, Danai Koutra, Tina Eliassi-Rad, and Christos Faloutsos. Network Similarity via Multiple Social Theories, Proceedings of the 5th IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM'13), Niagara Falls, Canada, August 2013.
Tutorials with co-PIs:
- Danai Koutra, Tina Eliassi-Rad, and Christos Faloutsos. Node Similarity, Graph Similarity and Matching: Theory and Applications, The 2014 SIAM International Conference on Data Mining (SDM'14), Philadelphia, PA, April 2014.
- Tina Eliassi-Rad and Christos Faloutsos. Discovering Roles and Anomalies in Graphs: Theory and Applications, The 2013 European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2013), Prague, Czech Republic, September 2013.
Invited talks at conferences and workshops:
- Tina Eliassi-Rad. Discovering Roles in Graphs: Algorithms and Applications. International Conference on Network Science (NetSci'14), Berkeley, CA, June 2-6, 2014.
- Tina Eliassi-Rad. Structural Features Threaten Privacy across Social Graphs. Charles River Workshop on Private Analysis of Social Networks, Boston, MA, May 2014.
- Tina Eliassi-Rad. Discovering Roles in Graphs: Algorithms and Applications. SIAM SDM Workshop on Mining Networks and Graphs: A Big Data Analytics Challenge (MNG 2014), Philadelphia, PA, April 2014.
- Tina Eliassi-Rad. Measuring Tie Strength in Implicit Social Networks. Spring Workshop on Mining and Learning (SMiLe), Oostende, Belgium, March 2014.
- Tina Eliassi-Rad. Affecting Dissemination on Graphs. DIMACS/RUCIA Workshop on Information Assurance in the Era of Big Data, Piscataway, NJ, February 2014.
- Tina Eliassi-Rad. Discovering Roles in Graphs: Algorithms and Applications. IPAM Workshop on Mathematics of Social Learning, Los Angeles, CA, January 2014.
- Tina Eliassi-Rad. Measuring Tie Strength in Implicit Social Networks. DIMACS Workshop on Statistical Analysis of Network Dynamics and Interactions, Piscataway, NJ, November 2013.
All other papers, talks, tutorial slides, and other resources are available here.
Last updated June 21, 2014 by Tina Eliassi-Rad.