I'm an Associate Professor in the Department of Computer Science and Engineering at the University of Notre Dame. My research fields are data mining, machine learning, and natural language processing. My data science research focuses on graph and text data for applications such as intelligent assistance, recommender systems, question answering, scientific discovery, and mental healthcare. It lies at the intersection of knowledge graphs, graph machine learning, information extraction, text mining, and text generation. [C.V.]
My recent projects focus on knowledge-augmented NLP, open-domain question answering, text generation and large language models for education and mental health, graph data augmentation, and graph property prediction for material discovery.
I direct the Data Mining towards Decision Making (DM2) Laboratory, supported by the National Science Foundation (NSF), the National Institutes of Health (NIH), the Office of Naval Research (ONR), Amazon, Snap, Condé Nast, and ND International.
What's New
- December 2023: IfQA, led by Wenhao, was selected for the Outstanding Paper Award in the EMNLP QA track!
- October 2023: Zhihan's work, Mengxia's work, and Wenhao's work were accepted to EMNLP on LLM instruction generation, comparative reasoning, and question answering!
- October 2023: The third KnowledgeNLP workshop (Knowledge-augmented Methods for NLP) will be held at ACL 2024!
- September 2023: Gang's work was accepted to NeurIPS on graph generative diffusion models!
- September 2023: Awarded a new grant from NSF for AI equality! We are excited to work with other PIs to broaden the impact of AI for future work. Thank you, NSF!
- August 2023: The second KnowledgeNLP workshop (Knowledge-augmented Methods for NLP) was held successfully at KDD 2023, with 200+ attendees! Slides of keynote talks and oral presentations are available!
- June 2023: Awarded a new grant from NSF for intelligent scientific text analytics! We are excited to develop advanced natural language intelligence for sciences. Thank you, NSF!
- May 2023: Gang's work was accepted to KDD on graph regression and imbalance learning!
- April 2023: Noah's work was accepted to ACL on large language models for question answering!
Latest Publications
- Get an A in Math: Progressive Rectification Prompting,
AAAI, 2024.
- Pre-training Language Models for Comparative Reasoning,
EMNLP, 2023.
- IfQA: A Dataset for Open-domain Question Answering under Counterfactual Presuppositions,
EMNLP, 2023.
- Auto-Instruct: Automatic Instruction Generation and Ranking for Black-Box Language Models,
Findings of EMNLP, 2023.
- Exploring Contrast Consistency of Open-Domain Question Answering Systems on Minimally Edited Questions,
EMNLP (from TACL), 2023.
- Data-Centric Learning from Unlabeled Graphs with Diffusion Model,
NeurIPS, 2023.
- Generate rather than Retrieve: Large Language Models are Strong Context Generators,
ICLR, 2023.
- Semi-Supervised Graph Imbalanced Regression,
KDD, 2023.
- Large Language Models are Built-in Autoregressive Search Engines,
Findings of ACL, 2023.
- A Survey of Multi-task Learning in Natural Language Processing: Regarding Task Relatedness and Training Methods,
EACL, 2023.
- Explaining AI-informed Network Intrusion Detection with Counterfactuals,
INFOCOM, 2023.
- Rationalizing Graph Neural Networks with Data Augmentation,
TKDD, 2023.
- Transfer Learning across Graph Convolutional Networks: Methods, Theory, and Applications,
TKDD, 2023.
- Graph Data Augmentation for Graph Machine Learning: A Survey,
IEEE Data Engineering Bulletin, 2023.
Recent Talks
- Effective and Efficient Knowledge-Intensive NLP
[abstract]:
Covers RACo (EMNLP 2022), GenRead (ICLR 2023), and EDMem (EMNLP 2022).
- Data Augmentation for Graph Regression
[abstract]:
Covers GREA (KDD 2022), SGIR (KDD 2023), and DCT (NeurIPS 2023).
- Enhancing Language Generation with Knowledge Graphs
[abstract]:
Covers FASum (NAACL 2021), MoKGE (ACL 2022), and EDMem (EMNLP 2022).
- Novel Methods that Learn to Augment Graph Data
[abstract]:
Covers GAug (AAAI 2021), Eland (CIKM 2021), CFLP (ICML 2022), and GREA (KDD 2022).
- Structured Knowledge is Still Essential to Understand Sciences
[abstract]:
Covers SciKG (KDD 2019), MIMO (EMNLP 2019), Tablepedia (WWW 2020), TCN (WWW 2021), and GenTaxo (KDD 2021).
- Graph Learning for Behavior Modeling:
Covers TUBE (KDD 2019), M2TUBE (TNNLS 2022), CalendarGNN (KDD 2020), CoEvoGNN (DLG 2020 Best Paper / TKDE 2021), GAL (CIKM 2021), and PamFul (TNNLS 2021), spanning user profiling, recommendation, and suspicious behavior detection.
Last updated on December 10, 2023.