I'm an Associate Professor in the Department of Computer Science and Engineering at the University of Notre Dame. My research fields are data mining, machine learning, and natural language processing. My data science research focuses on graph and text data for applications such as intelligent assistance, recommender systems, question answering, scientific discovery, and mental healthcare. It sits at the intersection of knowledge graphs, graph machine learning, information extraction, text mining, and text generation. [C.V.]
My recent projects focus on knowledge-augmented NLP, open-domain question answering, text generation and large language models for education and mental health, graph data augmentation, and graph property prediction for material discovery.
I direct the Data Mining towards Decision Making (DM2) Laboratory, supported by the National Science Foundation (NSF), the National Institutes of Health (NIH), the Office of Naval Research (ONR), Amazon, Snap, Condé Nast, and ND International.
What's New
- August 2023: The second KnowledgeNLP workshop (Knowledge-augmented Methods for NLP) was held successfully at KDD 2023 with 200+ attendees! Slides from the keynote talks and oral presentations are available!
- June 2023: Awarded a new grant from NSF for intelligent scientific text analytics! We are excited to develop advanced natural language intelligence for sciences. Thank you, NSF!
- May 2023: Gang's work on graph regression and imbalanced learning was accepted to KDD!
- April 2023: Noah's work on large language models for open-domain question answering was accepted to ACL!
Latest Publications
- Graph Data Augmentation for Graph Machine Learning: A Survey,
IEEE Data Engineering Bulletin, 2023.
- Semi-Supervised Graph Imbalanced Regression,
KDD, 2023.
- Large Language Models are Built-in Autoregressive Search Engines,
Findings of ACL, 2023.
- Exploring Contrast Consistency of Open-Domain Question Answering Systems on Minimally Edited Questions,
TACL, 2023.
- Generate rather than Retrieve: Large Language Models are Strong Context Generators,
ICLR, 2023.
- A Survey of Multi-task Learning in Natural Language Processing: Regarding Task Relatedness and Training Methods,
EACL, 2023.
- AutoGDA: Automated Graph Data Augmentation for Node Classification,
LoG, 2022.
- A Unified Encoder-Decoder Framework with Entity Memory,
EMNLP, 2022.
- Retrieval Augmentation for Commonsense Reasoning: A Unified Approach,
EMNLP, 2022.
- Graph Rationalization with Environment-based Augmentations,
KDD, 2022.
- Learning from Counterfactual Links for Link Prediction,
ICML, 2022.
- Diversifying Content Generation for Commonsense Reasoning with Mixture of Knowledge Graph Experts,
ACL, 2022.
- Dict-BERT: Enhancing Language Model Pre-training with Dictionary,
ACL, 2022.
- Deep Multimodal Complementarity Learning,
IEEE Transactions on Neural Networks and Learning Systems, 2022.
- A Survey of Knowledge-Enhanced Text Generation,
ACM Computing Surveys, 2022.
Recent Talks
- Effective and Efficient Knowledge-Intensive NLP
Covers RACo (EMNLP 2022), GenRead (ICLR 2023), and EDMem (EMNLP 2022).
- Data Augmentation for Graph Regression
Covers GREA (KDD 2022), SGIR (KDD 2023), and DCT (DLG 2023).
- Enhancing Language Generation with Knowledge Graphs
Covers FASum (NAACL 2021), MoKGE (ACL 2022), and EDMem (EMNLP 2022).
- Novel Methods that Learn to Augment Graph Data
Covers GAug (AAAI 2021), Eland (CIKM 2021), CFLP (ICML 2022), and GREA (KDD 2022).
- Structured Knowledge is Still Essential to Understand Sciences
Covers SciKG (KDD 2019), MIMO (EMNLP 2019), Tablepedia (WWW 2020), TCN (WWW 2021), and GenTaxo (KDD 2021).
- Graph Learning for Behavior Modeling
Covers TUBE (KDD 2019), M2TUBE (TNNLS 2022), CalendarGNN (KDD 2020), CoEvoGNN (DLG 2020 Best Paper / TKDE 2021), GAL (CIKM 2021), and PamFul (TNNLS 2021), spanning user profiling, recommendation, and suspicious behavior detection.
Last updated on August 16, 2023.