
Nate Veldt

Texas A&M University College of Engineering

Research

For a list of publications, see my publications page, Google Scholar page, or DBLP profile.

For information about my current students, click here.

If you are interested in doing research with me as a graduate student, take a look at my information for prospective students page.

Funding Sources

My research is generously supported by the Army Research Office (ARO), the Air Force Office of Scientific Research (AFOSR), the Department of Energy (DoE), and contracts from Lawrence Livermore National Laboratory.

Research overview: algorithms for data science

My research focuses on algorithms and computational methods for data analysis, especially data that can be modeled by a graph or network. My overarching goal is to develop methods that are fast, satisfy strong mathematical guarantees, and can be used to extract new insights from datasets coming from biology, sociology, medicine, e-commerce, and many other application domains.

https://veldt.engr.tamu.edu/wp-content/uploads/sites/235/2021/07/research-slide-video.mp4
Bridging the theory-practice gap in data analysis

There’s often a large gap between the best theoretical results and the practical tools that people actually use to analyze and extract information from large, complex datasets. In my research I aim to help bridge that gap whenever possible. This often involves one of two high-level approaches:

  1. Take a theoretical tool or idea that comes with strong guarantees, and make it more practical for real-world use without sacrificing those theoretical guarantees. Some examples from my research:
      • My work on localized graph clustering (ICML 2016, SIAM Data Mining 2019) built on earlier algorithms that came with strong theoretical guarantees but were challenging to implement and use in practice. I developed methods that come with similar theoretical guarantees and are also easy to implement, and then used my implementations to tackle several large-scale MRI segmentation problems.
      • Correlation clustering is a problem that originated in the theoretical computer science literature, but it has also been used for applications such as image segmentation, cross-lingual link detection, and cancer mutation analysis. The best theoretical algorithms for correlation clustering rely on solving an extremely large optimization problem, an approach that almost no one uses in practice because of its high memory requirements. In 2018-2020, I developed new practical optimization methods (SIMODS 2019, SIAM CSC 2020) that scaled up to problems with over a trillion constraints. More recently, I have been developing approximation algorithms for the problem that can be made even more scalable, based on covering linear programs (ICML 2022, CIKM 2023, ICML 2024, SOSA 2026).
  2. Develop a deeper theoretical understanding of a technique that is widely used in practice but previously came with no formal theoretical guarantees. A key example from my research:
      • My PhD thesis focused on a framework for graph clustering called LambdaCC (WWW 2018), which unifies clustering algorithms and objective functions that were not previously known to be related. While many other clustering methods come with no formal approximation guarantees, LambdaCC comes with approximation algorithms that are guaranteed to output a solution within a bounded factor of the optimal solution. In follow-up work, I developed techniques for learning how to set hyperparameters to best fit an application of interest (WWW 2019), as well as theoretically rigorous techniques for approximating the generalized objective across all parameter regimes (MFCS 2020).
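
For readers who have not seen correlation clustering before, the objective mentioned in the examples above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation from any of the cited papers. It assumes the unweighted form of the LambdaCC objective, in which every edge of the input graph is treated as a "similar" pair with weight 1 - lam and every non-edge as a "dissimilar" pair with weight lam, and it scores a clustering by the total weight of the pairs it gets wrong. The function name `lambdacc_disagreements` and the toy graph are hypothetical and exist only for this example.

```python
from itertools import combinations


def lambdacc_disagreements(nodes, edges, clusters, lam):
    """Total weight of disagreements for a clustering (illustrative sketch).

    Assumed weighting (unweighted LambdaCC-style objective):
      - an edge split across two clusters costs (1 - lam)
      - a non-edge placed inside one cluster costs lam
    `clusters` maps each node to a cluster label; `lam` lies in (0, 1).
    """
    edge_set = {frozenset(e) for e in edges}
    cost = 0.0
    # Brute force over all node pairs: fine for toy graphs, not for large ones.
    for u, v in combinations(nodes, 2):
        together = clusters[u] == clusters[v]
        is_edge = frozenset((u, v)) in edge_set
        if is_edge and not together:
            cost += 1.0 - lam
        elif not is_edge and together:
            cost += lam
    return cost


# Toy graph: two triangles joined by the single edge (2, 3).
nodes = list(range(6))
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

two_triangles = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_cluster = {v: "a" for v in nodes}

for name, clustering in [("two triangles", two_triangles), ("one cluster", one_cluster)]:
    print(name, lambdacc_disagreements(nodes, edges, clustering, lam=0.25))
```

Under this assumed weighting, a small value of lam makes it cheap to group non-adjacent nodes and so favors fewer, larger clusters, while a value close to 1 favors many small, dense clusters. In that sense lam plays the role of a resolution parameter.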
Research on hypergraphs

Much of my recent research has focused on analyzing complex systems of multiway interactions by modeling them as hypergraphs. 

Here is one recent article that highlights some of my ongoing research: Hypergraphs are worth the hype.

Some examples of applications

I aim to develop tools that can be broadly applied to different types of datasets and application domains. Here is a non-exhaustive list of application domains and problems that have arisen in my work:

  • Sociology. E.g., quantifying homophily (the tendency for people to interact and connect with people that are similar to them) in group settings (Science Advances 2023); understanding how factors such as dorm, major, and graduation year affect social network connections among different students on university campuses (WWW 2019).
  • Communication networks. Two collaborators and I modeled and analyzed the dataset of Anthony Fauci’s released emails by considering various ways it could be modeled as a graph or hypergraph (ICWSM 2022).
  • Biology. E.g., finding sets of related genes in a gene-interaction network (WWW 2018). This has also been a key application in recent research on the Shigella organism; see Page 121 here.
  • E-commerce. E.g., detecting sets of related products in an Amazon product hypergraph (KDD 2020, WWW 2021).
  • Medical imaging. E.g., quickly identifying the left atrial cavity in a full body MRI scan (SIAM Data Mining 2019).

Research communities

My research is very interdisciplinary, and I enjoy collaborating with other researchers across computer science, mathematics, statistics, and other fields. My work has appeared in top computing conferences such as ICML, WebConf, ICLR, WSDM, CIKM, and KDD. I’m also an active SIAM member and publish in SIAM venues such as SIAM Data Mining, SIAM CSC, SIAM SOSA, and the SIAM Journal on Mathematics of Data Science.

I’m especially excited to be a member of the SIAM Activity Group on Applied and Computational Discrete Algorithms which “brings together researchers who design and study combinatorial and graph algorithms motivated by applications.” The goals and focus of SIAM ACDA closely align with my research and my interests. You can check out their webpage for more details on the activity group and the ACDA conference. Currently I am serving a 2-year term as the Secretary for ACDA (one of four elected officers for the activity group).

Research videos

I occasionally post videos about my research on my YouTube Research Channel, ranging from short promotional videos, to research talks, to behind-the-scenes videos. For more information you can check out my Videos page.

Here’s a video from a few years ago that gives an overview of a KDD 2020 paper and also touches on some early work on graph clustering with resolution parameters:

© 2016–2026 Nate Veldt