TWC: Medium: Collaborative: Know Thy Enemy: Data Mining
Meets Networks for Understanding Web-Based Malware Dissemination
Tina Eliassi-Rad (PI) | Webpage: http://eliassi.org
Network Science Institute | Phone: (617) 373-6475
College of Computer and Information Science | Email: tina AT eliassi.org; eliassi AT ccs.neu.edu
Northeastern University | Address: 360 Huntington Avenue, Mailstop 1010-177, Boston, MA 02115
This
material is based upon work supported by the National Science Foundation under
Grant No. CNS-1314603. Any opinions, findings, and conclusions or
recommendations expressed in this material are those of the author(s) and do
not necessarily reflect the views of the National Science Foundation.
1. GENERAL INFORMATION
1.1. Abstract
How does web-based malware spread?
We use the term web-based malware to describe malware that is distributed through
websites and malicious posts in social networks. We are in an arms race
against web-based malware distributors, and as in any war, knowledge is power.
The more we know about them, the better we can defend ourselves. Our goal is to
understand the dissemination of web-based malware by creating "MalScope," a suite of methods and tools that uses
cutting-edge approaches to build spatiotemporal models, generators and sampling
techniques for malware dissemination. From a scientific point of view, this project
brings together two disciplines: Data Mining and Network Security. The outcome
is a suite of novel, sophisticated, and scalable techniques and models that
will enhance our understanding of malware dissemination at a large scale. We
use two types of web-based malware dissemination data: (1) user machines
accessing dangerous sites and downloading web-based malware; and (2) Facebook
users being exposed to malicious posts. We already have data and will continue to
obtain more from our industry partners (e.g., Symantec's WINE project) and
open-access projects, or collect it on our own (e.g., MyPageKeeper).
The broader impact of our work is that it will enable the development of
security solutions for end-users and industry. A 15-minute network outage costs
a 200-employee company about $40K, while identity theft costs about $1,500 per
person on average. By knowing the enemy better, security researchers and
industry can more effectively stop the interconnected manifestations of
Internet threats: identity theft, the creation of botnets, and DoS attacks. The PIs have a track record of technology
transfer, with collaborators at industrial labs (Yahoo, MSR, Symantec,
AT&T, IBM), national labs (LLNL, Sandia), open-source software
("Pegasus"), and spin-off startups (StopTheHacker).
Educational impacts include developing a new course, providing publicly
available educational material, and releasing open-source software.
1.2. Keywords
Data mining, web-based malware
dissemination, graph mining.
1.3. Funding agency
- NSF,
Award Number: CNS-1314603, Duration: September 1, 2013 - August 31,
2018 (Estimated)
2. PEOPLE INVOLVED
The
following professors are co-PIs on this project:
The
following graduate students and postdocs have worked on the project:
3. RESOURCES
Selected
Papers:
- Sucheta Soundarajan, Tina
Eliassi-Rad, Brian Gallagher, and Ali Pinar. ε-WGX:
Adaptive Edge Probing for Enhancing Incomplete Networks, Proceedings
of the 9th International ACM Web Science Conference (WebSci'17), Troy, NY,
June 2017.
- Kijung Shin, Tina Eliassi-Rad, and Christos Faloutsos. CoreScope: Graph Mining Using k-Core
Analysis--Patterns, Anomalies, and Algorithms, Proceedings of the
16th IEEE International Conference on Data Mining (ICDM'16), Barcelona,
Spain, December 2016.
- Sucheta Soundarajan, Tina
Eliassi-Rad, Brian Gallagher, and Ali Pinar. MaxReach: Reducing Network Incompleteness through Node
Probes, Proceedings of the 2016 IEEE/ACM International Conference on
Advances in Social Networks Analysis and Mining (ASONAM'16), San
Francisco, CA, August 2016.
- Chen
Chen, Hanghang Tong,
B. Aditya Prakash, Tina Eliassi-Rad, Michalis Faloutsos, and Christos
Faloutsos. Eigen-Optimization
on Large Graphs by Edge Manipulation, ACM Transactions on Knowledge
Discovery in Data (TKDD), 10(4), Article 49, June 2016.
- Sucheta Soundarajan, Acar Tamersoy, Elias Khalil,
Tina Eliassi-Rad, Duen Horng
Chau, Brian Gallagher, and Kevin Roundy. Generating
Graph Snapshots from Streaming Edge Data, Proceedings of the 25th
International World Wide Web Conference (WWW'16), Montreal, Canada, April
2016.
- Chen
Chen, Hanghang Tong,
B. Aditya Prakash, Charalampos Tsourakakis, Tina Eliassi-Rad, Christos Faloutsos, and
Duen Horng Chau. Node Immunization on
Large Graphs: Theory and Algorithms, IEEE Transactions on Knowledge
and Data Engineering (TKDE), 28(1):113-126, January 2016.
- Sucheta
Soundarajan, Tina Eliassi-Rad, Brian Gallagher, and Ali Pinar. MaxOutProbe:
An Algorithm for Increasing the Size of Partially Observed Networks,
NIPS Workshop on Networks in the Social and Information Sciences,
Montreal, Canada, December 2015.
- Long
T. Le, Tina Eliassi-Rad, and Hanghang Tong. MET: A Fast Algorithm for
Minimizing Propagation in Large Graphs with Small Eigen-Gaps, Proceedings
of the 2015 SIAM International Conference on Data Mining (SDM'15),
Vancouver, British Columbia, Canada, April 2015.
- Keith
Henderson, Brian Gallagher, and Tina Eliassi-Rad. EP-MEANS: An
Efficient Nonparametric Clustering of Empirical Probability Distributions,
Proceedings of the 30th ACM SIGAPP Symposium On Applied Computing
(SAC'15), Salamanca, Spain, April 2015.
- Sucheta
Soundarajan, Tina Eliassi-Rad, and Brian Gallagher. A Guide to
Selecting a Network Similarity Method, Proceedings of the 2014 SIAM
International Conference on Data Mining (SDM'14), Philadelphia, PA,
April 2014.
- Long
T. Le, Tina Eliassi-Rad, Foster Provost, and Lauren Moores.
Hyperlocal: Inferring
Location of IP Addresses in Real-time Bid Requests for Mobile Ads, Proceedings
of the 6th ACM SIGSPATIAL International Workshop on Location-Based Social
Networks (LBSN'13), Orlando, FL, November 2013.
- Michele
Berlingerio, Danai Koutra, Tina Eliassi-Rad, and Christos Faloutsos. Network
Similarity via Multiple Social Theories, Proceedings of the 5th
IEEE/ACM International Conference on Advances in Social Networks Analysis
and Mining (ASONAM'13), Niagara Falls, Canada, August 2013.
Selected
tutorials with co-PIs:
- Tina
Eliassi-Rad, Sucheta Soundarajan, Ali Pinar, and Brian Gallagher. Problems with Incomplete Networks:
Biases, Skewed Results, and Solutions, The 2016 SIAM International Conference on Data Mining (SDM’16),
Miami, FL, May 2016.
- Danai Koutra, Tina Eliassi-Rad,
and Christos Faloutsos. Node and Graph
Similarity: Theory and Applications, The 2014 IEEE International
Conference on Data Mining (ICDM'14), Shenzhen, China, December 2014.
- Danai Koutra, Tina Eliassi-Rad,
and Christos Faloutsos. Node Similarity,
Graph Similarity and Matching: Theory and Applications, The 2014
SIAM International Conference on Data Mining (SDM'14), Philadelphia,
PA, April 2014.
- Tina
Eliassi-Rad and Christos Faloutsos. Discovering Roles and Anomalies
in Graphs: Theory and Applications, The 2013 European Conference on
Machine Learning and Principles and Practice of Knowledge Discovery In
Databases (ECML PKDD 2013), Prague, Czech Republic, September 2013.
Selected
invited talks at conferences and workshops:
- Tina
Eliassi-Rad. So Many Choices, So Few Guidelines: Sifting Through Functions
On Networks. StatPhys26 Satellite
Meeting on Complex Networks: From Theory to Interdisciplinary Applications,
Marseille, France, July 2016.
- Tina
Eliassi-Rad. So Many Choices, So Few Guidelines: Sifting Through Functions
On Networks. The 2016 International
Conference on Social Computing, Behavioral-cultural Modeling, &
Prediction and Behavior Representation in Modeling and Simulation
(SBP-BRiMS’16), Washington, DC, June 2016.
- Tina
Eliassi-Rad. The Reasonable Effectiveness of Roles in Networks. IEEE ICDM Workshop on Behavior
Analysis, Modeling, and Steering (BEAMS’15), Atlantic City, NJ,
November 2015.
- Tina
Eliassi-Rad. The Reasonable Effectiveness of Roles in Networks. Workshop on Information in Networks
(WIN’15), New York, NY, October 2015.
- Tina
Eliassi-Rad. The Reasonable Effectiveness of Roles in Networks. NetSci'15 Satellite on Higher-Order
Models in Network Science (HONS'15), Zaragoza, Spain, June 2015.
- Tina
Eliassi-Rad. Discovering Roles in Graphs: Algorithms and Applications. MIT Lincoln Laboratory's 5th Annual
Graph Exploitation Symposium (GraphEx),
Dedham, MA, August 2014.
- Tina
Eliassi-Rad. Discovering Roles in Graphs: Algorithms and Applications. International
Conference on Network Science (NetSci'14), Berkeley, CA, June 2-6,
2014.
- Tina
Eliassi-Rad. Structural Features Threaten Privacy across Social Graphs. Charles
River Workshop on Private Analysis of Social Networks, Boston, MA, May
2014.
- Tina
Eliassi-Rad. Discovering Roles in Graphs: Algorithms and Applications. SIAM SDM Workshop
on Mining Networks and Graphs: A Big Data Analytics Challenge (MNG
2014), Philadelphia, PA, April 2014.
- Tina
Eliassi-Rad. Measuring Tie Strength in Implicit Social Networks. Spring Workshop on Mining and
Learning (SMiLe), Oostende, Belgium, March
2014.
- Tina
Eliassi-Rad. Affecting Dissemination on Graphs. DIMACS/RUCIA
Workshop on Information Assurance in the Era of Big Data, Piscataway,
NJ, February 2014.
- Tina
Eliassi-Rad. Discovering Roles in Graphs: Algorithms and Applications. IPAM Workshop on
Mathematics of Social Learning, Los Angeles, CA, January 2014.
- Tina
Eliassi-Rad. Measuring Tie Strength in Implicit Social Networks. DIMACS Workshop
on Statistical Analysis of Network Dynamics and Interactions,
Piscataway, NJ, November 2013.
All other papers, talks, tutorial slides, and other
resources are available here.
Last updated September 1, 2017 by
Tina Eliassi-Rad.