Sitemap

A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.

Pages

Posts

Future Blog Post

This post will show up by default. To disable scheduling of future posts, edit _config.yml and set future: false.

Blog Post number 4

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 3

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 2

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 1

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Portfolio

Publications

Short Text Similarity based on Probabilistic Topics

Published in Knowledge and Information Systems, 2010

Calculates similarity between short texts based on their probabilistic topic distributions.

Recommended citation: Xiaojun Quan, Gang Liu, Zhi Lu, Xingliang Ni, Liu Wenyin. (2010). "Short Text Similarity based on Probabilistic Topics." Knowledge and Information Systems 2010.

Discovering Phishing Target Based on Semantic Link Network

Published in Future Generation Computer Systems, 2010

Uses semantic link networks to identify the targets of phishing attacks.

Recommended citation: Liu Wenyin, Ning Fang, Xiaojun Quan, Bite Qiu, Gang Liu. (2010). "Discovering Phishing Target Based on Semantic Link Network." Future Generation Computer Systems 2010.

Automatic Categorization of Questions for User-Interactive QA

Published in Information Processing & Management, 2011

Develops automated methods for categorizing questions in interactive QA platforms to improve retrieval.

Recommended citation: Wanpeng Song, Liu Wenyin, Naijie Gu, Xiaojun Quan, Tianyong Hao. (2011). "Automatic Categorization of Questions for User-Interactive QA." Information Processing & Management 2011.

Term Weighting Schemes for Question Categorization

Published in IEEE Transactions on Pattern Analysis and Machine Intelligence, 2011

Evaluates and proposes novel term weighting schemes specifically designed for question categorization tasks.

Recommended citation: Xiaojun Quan, Liu Wenyin, Bite Qiu. (2011). "Term Weighting Schemes for Question Categorization." IEEE Transactions on Pattern Analysis and Machine Intelligence 2011.

Short Text Clustering by Finding Core Terms

Published in Knowledge and Information Systems, 2011

Improves short text clustering by identifying and utilizing core terms to represent text content.

Recommended citation: Xingliang Ni, Xiaojun Quan, Zhi Lu, Liu Wenyin, Bei Hua. (2011). "Short Text Clustering by Finding Core Terms." Knowledge and Information Systems 2011.

User Interest Modeling and Its Application for Question Recommendation

Published in Information Processing & Management, 2012

Models user interests to recommend relevant questions in user-interactive question answering systems.

Recommended citation: Xingliang Ni, Yao Lu, Xiaojun Quan, Liu Wenyin, Bei Hua. (2012). "User Interest Modeling and Its Application for Question Recommendation." Information Processing & Management 2012.

Antiphishing through Phishing Target Discovery

Published in IEEE Internet Computing, 2012

Proposes a method to combat phishing by discovering the intended targets of phishing websites.

Recommended citation: Liu Wenyin, Gang Liu, Bite Qiu, Xiaojun Quan. (2012). "Antiphishing through Phishing Target Discovery." IEEE Internet Computing 2012.

Link Graph Analysis for Business Site Selection

Published in IEEE Computer, 2012

Applies link graph analysis techniques to the problem of optimal business site selection.

Recommended citation: Xiaojun Quan, Hui Xiong, Wenyu Dou, Liu Wenyin, Yong Ge. (2012). "Link Graph Analysis for Business Site Selection." IEEE Computer 2012.

Feature Selection for High-Dimensional Imbalanced Data

Published in Neurocomputing, 2013

Addresses the challenges of feature selection in high-dimensional imbalanced datasets.

Recommended citation: Liuzhi Yin, Yong Ge, Keli Xiao, Xuehua Wang, Xiaojun Quan. (2013). "Feature Selection for High-Dimensional Imbalanced Data." Neurocomputing 2013.

Towards Building a Social Emotion Detection System for Online News

Published in Future Generation Computer Systems, 2014

Describes the architecture and implementation of a system for detecting social emotions in online news.

Recommended citation: Jingsheng Lei, Yanghui Rao, Xiaojun Quan, Qing Li, Liu Wenyin. (2014). "Towards Building a Social Emotion Detection System for Online News." Future Generation Computer Systems 2014.

Affective topic model for social emotion detection

Published in Neural Networks, 2014

Introduces an affective topic model to capture latent emotional themes in social media text.

Recommended citation: Yanghui Rao, Qing Li, Liu Wenyin, Qingyuan Wu, Xiaojun Quan. (2014). "Affective topic model for social emotion detection." Neural Networks 2014.

Latent Discriminative Models for Social Emotion Detection

Published in ACM Transactions on Information Systems, 2015

Proposes latent discriminative models that incorporate emotional dependency for social emotion detection.

Recommended citation: Xiaojun Quan, Qifan Wang, Ying Zhang, Luo Si, Liu Wenyin. (2015). "Latent Discriminative Models for Social Emotion Detection." ACM Transactions on Information Systems 2015.

Towards Non-Monotonic Sentence Alignment

Published in Information Sciences, 2015

Investigates algorithms for non-monotonic sentence alignment, addressing complex cross-lingual correspondences.

Recommended citation: Xiaojun Quan, Chunyu Kit. (2015). "Towards Non-Monotonic Sentence Alignment." Information Sciences 2015.

Generating Multi-hop Reasoning Questions to Improve MRC

Published in WWW, 2020

Proposes generating multi-hop reasoning questions as a data augmentation strategy to improve Machine Reading Comprehension.

Recommended citation: Jianxing Yu, Xiaojun Quan, Qinliang Su, Jian Yin. (2020). "Generating Multi-hop Reasoning Questions to Improve MRC." WWW 2020.

Joint Chinese Word Segmentation and Part-of-speech Tagging

Published in ACL, 2020

A joint model for Chinese Word Segmentation and POS tagging utilizing two-way attentions of auto-analyzed knowledge.

Recommended citation: Yuanhe Tian, Yan Song, Xiang Ao, Fei Xia, Xiaojun Quan, Tong Zhang, Yonggang Wang. (2020). "Joint Chinese Word Segmentation and Part-of-speech Tagging." ACL 2020.

Low-Resource Generation of Multi-hop Reasoning Questions

Published in ACL, 2020

Addresses the challenge of generating multi-hop reasoning questions in low-resource scenarios.

Recommended citation: Jianxing Yu, Wei Liu, Shuang Qiu, Qinliang Su, Kai Wang, Xiaojun Quan, Jian Yin. (2020). "Low-Resource Generation of Multi-hop Reasoning Questions." ACL 2020.

Multi-Domain Dialogue Acts and Response Co-Generation

Published in ACL, 2020

A co-generation framework that simultaneously generates dialogue acts and responses in multi-domain settings.

Recommended citation: Kai Wang, Junfeng Tian, Rui Wang, Xiaojun Quan, Jianxing Yu. (2020). "Multi-Domain Dialogue Acts and Response Co-Generation." ACL 2020.

Constituency Lattice Encoding for Aspect Term Extraction

Published in COLING, 2020

Incorporates constituency lattice information into encoding for more accurate aspect term extraction.

Recommended citation: Yunyi Yang, Kun Li, Xiaojun Quan, Weizhou Shen, Qinliang Su. (2020). "Constituency Lattice Encoding for Aspect Term Extraction." COLING 2020.

Multi-hop Reasoning Question Generation and Its Application

Published in IEEE Transactions on Knowledge and Data Engineering, 2021

Explores the generation of multi-hop reasoning questions and its applications in QA systems. DOI: 10.1109/TKDE.2021.3073227.

Recommended citation: Jianxing Yu, Qinliang Su, Xiaojun Quan, Jian Yin. (2021). "Multi-hop Reasoning Question Generation and Its Application." IEEE Transactions on Knowledge and Data Engineering 2021.

Multi-Document Transformer for Personality Detection

Published in AAAI, 2021

A Multi-Document Transformer architecture designed to aggregate information from multiple user documents for personality detection.

Recommended citation: Feifan Yang, Xiaojun Quan, Yunyi Yang, Jianxing Yu. (2021). "Multi-Document Transformer for Personality Detection." AAAI 2021.

Syntax-Enhanced Pre-trained Model

Published in ACL, 2021

Integrates syntactic information into pre-trained models to improve their understanding of sentence structure.

Recommended citation: Zenan Xu, Daya Guo, Duyu Tang, Qinliang Su, Linjun Shou, Ming Gong, Wanjun Zhong, Xiaojun Quan, Daxin Jiang, Nan Duan. (2021). "Syntax-Enhanced Pre-trained Model." ACL 2021.

Compound Aspect Extraction by Augmentation and Constituency Lattice

Published in IEEE Transactions on Affective Computing, 2022

Focuses on compound aspect extraction using data augmentation and constituency lattices. DOI: 10.1109/TAFFC.2022.3161683.

Recommended citation: Xiaojun Quan, Zhengcheng Min, Kun Li, Yunyi Yang. (2022). "Compound Aspect Extraction by Augmentation and Constituency Lattice." IEEE Transactions on Affective Computing 2022.

GL-RG: Global-Local Representation Granularity for Video Captioning

Published in IJCAI, 2022

Combines global and local representation granularities to generate more precise and descriptive video captions.

Recommended citation: Liqi Yan, Yiming Cui, Qifan Wang, Xiangyu Zhang, Fuli Feng, Dongfang Liu, Xiaojun Quan. (2022). "GL-RG: Global-Local Representation Granularity for Video Captioning." IJCAI 2022.

XPrompt: Exploring the Extreme of Prompt Tuning

Published in EMNLP, 2022

Explores the boundaries of prompt tuning to achieve parameter efficiency without sacrificing performance.

Recommended citation: Fang Ma, Chen Zhang, Lei Ren, Jingang Wang, Qifan Wang, Wei Wu, Xiaojun Quan, Dawei Song. (2022). "XPrompt: Exploring the Extreme of Prompt Tuning." EMNLP 2022.

Multi-Party Conversation Modeling for Emotion Recognition

Published in IEEE Transactions on Affective Computing, 2023

A comprehensive study on modeling multi-party conversations for emotion recognition. DOI: 10.1109/TAFFC.2023.3273589.

Recommended citation: Xiaojun Quan, Siyue Wu, Junqing Chen, Weizhou Shen, Jianxing Yu. (2023). "Multi-Party Conversation Modeling for Emotion Recognition." IEEE Transactions on Affective Computing 2023.

Joint Generator-Ranker Learning for Natural Language Generation

Published in ACL, 2023

A joint learning framework that iteratively optimizes a generator and a ranker for high-quality natural language generation.

Recommended citation: Weizhou Shen, Yeyun Gong, Yelong Shen, Song Wang, Xiaojun Quan, Nan Duan, Weizhu Chen. (2023). "Joint Generator-Ranker Learning for Natural Language Generation." ACL 2023.

MixPAVE: Mix-Prompt Tuning for Few-shot Product Attribute Value Extraction

Published in ACL, 2023

A Mix-Prompt Tuning approach for few-shot product attribute extraction in e-commerce scenarios.

Recommended citation: Li Yang, Qifan Wang, Jingang Wang, Xiaojun Quan, Fuli Feng, Yu Chen, Madian Khabsa, Sinong Wang, Zenglin Xu, Dongfang Liu. (2023). "MixPAVE: Mix-Prompt Tuning for Few-shot Product Attribute Value Extraction." ACL 2023.

MUSTIE: Multimodal Structural Transformer for Web Information Extraction

Published in ACL, 2023

Introduces a multimodal structural transformer designed for efficient and robust web information extraction.

Recommended citation: Qifan Wang, Jingang Wang, Xiaojun Quan, Fuli Feng, Zenglin Xu, Shaoliang Nie, Sinong Wang, Madian Khabsa, Hamed Firooz, Dongfang Liu. (2023). "MUSTIE: Multimodal Structural Transformer for Web Information Extraction." ACL 2023.

APrompt: Attention Prompt Tuning for Efficient Adaptation of Pre-trained Language Models

Published in EMNLP, 2023

Proposes Attention Prompt Tuning (APrompt) for parameter-efficient adaptation of pre-trained language models.

Recommended citation: Qifan Wang, Yuning Mao, Jingang Wang, Hanchao Yu, Shaoliang Nie, Sinong Wang, Fuli Feng, Lifu Huang, Xiaojun Quan, Zenglin Xu, Dongfang Liu. (2023). "APrompt: Attention Prompt Tuning for Efficient Adaptation of Pre-trained Language Models." EMNLP 2023.

Dual-Feedback Knowledge Retrieval for Task-Oriented Dialogue Systems

Published in EMNLP, 2023

Proposes a dual-feedback mechanism in which the generator provides positive and negative feedback to train the retriever in task-oriented dialogue (TOD) systems.

Recommended citation: Tianyuan Shi, Liangzhi Li, Zijian Lin, Tao Yang, Xiaojun Quan, Qifan Wang. (2023). "Dual-Feedback Knowledge Retrieval for Task-Oriented Dialogue Systems." EMNLP 2023.

MCC-KD: Multi-CoT Consistent Knowledge Distillation

Published in EMNLP, 2023

Generates multiple rationales for each question and enforces consistency among predictions by minimizing bidirectional KL-divergence.

Recommended citation: Hongzhan Chen, Siyue Wu, Xiaojun Quan, Rui Wang, Ming Yan, Ji Zhang. (2023). "MCC-KD: Multi-CoT Consistent Knowledge Distillation." EMNLP 2023.

Knowledge Fusion of Large Language Models

Published in ICLR, 2024

Introduces FuseLLM, which leverages the generative distributions of source LLMs to externalize their collective knowledge and transfer it to a target LLM.

Recommended citation: Fanqi Wan, Xinting Huang, Deng Cai, Xiaojun Quan, Wei Bi, Shuming Shi. (2024). "Knowledge Fusion of Large Language Models." ICLR 2024.

Alignment-Enhanced Chinese Grammatical Error Corrector

Published in ACL, 2024

Proposes an alignment-enhanced corrector that trains both a correction model and an alignment model to address overcorrection in Chinese grammatical error correction (GEC).

Recommended citation: Haihui Yang, Xiaojun Quan. (2024). "Alignment-Enhanced Chinese Grammatical Error Corrector." ACL 2024.

SocialBench: Sociality Evaluation of Role-Playing Conversational Agents

Published in ACL, 2024

Introduces SocialBench, the first benchmark designed to systematically evaluate the sociality of role-playing agents at individual and group levels.

Recommended citation: Hongzhan Chen, Hehong Chen, Ming Yan, Wenshen Xu, Gao Xing, Weizhou Shen, Xiaojun Quan, Chenliang Li, Ji Zhang, Fei Huang. (2024). "SocialBench: Sociality Evaluation of Role-Playing Conversational Agents." ACL 2024.

Knowledge Verification to Nip Hallucination in the Bud

Published in EMNLP, 2024

Mitigates hallucinations by verifying and minimizing inconsistency between external knowledge in alignment data and the intrinsic knowledge of LLMs.

Recommended citation: Fanqi Wan, Xinting Huang, Leyang Cui, Xiaojun Quan, Wei Bi, Shuming Shi. (2024). "Knowledge Verification to Nip Hallucination in the Bud." EMNLP 2024.

Self-Evolution Fine-Tuning for Policy Optimization

Published in EMNLP, 2024

Introduces SEFT, which trains an adaptive reviser to elevate low-quality responses and uses it to guide policy optimization on unannotated data.

Recommended citation: Ruijun Chen, Jiehao Liang, Shiping Gao, Fanqi Wan, Xiaojun Quan. (2024). "Self-Evolution Fine-Tuning for Policy Optimization." EMNLP 2024.

Small LLMs Are Weak Tool Learners: A Multi-LLM Agent

Published in EMNLP, 2024

Proposes a multi-LLM agent framework that decomposes tool learning into planner, caller, and summarizer roles to overcome the limitations of small models.

Recommended citation: Weizhou Shen, Chenliang Li, Hongzhan Chen, Ming Yan, Xiaojun Quan, Hehong Chen, Ji Zhang, Fei Huang. (2024). "Small LLMs Are Weak Tool Learners: A Multi-LLM Agent." EMNLP 2024.

Lookahead Routing for Large Language Models

Published in NeurIPS, 2025

Presents Lookahead Routing, a method for improving efficiency and performance in large language model inference.

Recommended citation: Canbin Huang, Tianyuan Shi, Yuhua Zhu, Ruijun Chen, Xiaojun Quan. (2025). "Lookahead Routing for Large Language Models." NeurIPS 2025.

Probabilistic Token Alignment for Large Language Model Fusion

Published in NeurIPS, 2025

Proposes probabilistic token alignment to improve the effectiveness of large language model fusion.

Recommended citation: Runjia Zeng, James Chenhao Liang, Cheng Han, Zhiwen Cao, Jiahao Liu, Xiaojun Quan, Yingjie Victor Chen, Lifu Huang, Tong Geng, Qifan Wang, Dongfang Liu. (2025). "Probabilistic Token Alignment for Large Language Model Fusion." NeurIPS 2025.

FuseRL: Dense Preference Optimization for Heterogeneous Model Fusion

Published in arXiv Preprint, 2025

Introduces a reinforcement learning framework for model fusion, combining weighted supervised fine-tuning and weighted preference optimization.

Recommended citation: Longguang Zhong, Fanqi Wan, Ziyi Yang, Guosheng Liang, Tianyuan Shi, Xiaojun Quan. (2025). "FuseRL: Dense Preference Optimization for Heterogeneous Model Fusion." arXiv Preprint 2025.

Advantage-Guided Distillation for Preference Alignment in Small Language Models

Published in ICLR, 2025

Proposes Advantage-Guided Distillation for Preference Alignment (ADPA) to guide the alignment of small language models using nuanced distribution-level signals from teacher models.

Recommended citation: Shiping Gao, Fanqi Wan, Jiajian Guo, Xiaojun Quan, Qifan Wang. (2025). "Advantage-Guided Distillation for Preference Alignment in Small Language Models." ICLR 2025.

FuseChat-3.0: Preference Optimization Meets Heterogeneous Model Fusion

Published in ICLR SCI-FM Workshop, 2025

A study at the intersection of preference optimization and heterogeneous model fusion, enhancing chat capabilities through multi-model integration.

Recommended citation: Ziyi Yang, Fanqi Wan, Longguang Zhong, Canbin Huang, Guosheng Liang, Xiaojun Quan. (2025). "FuseChat-3.0: Preference Optimization Meets Heterogeneous Model Fusion." ICLR SCI-FM Workshop 2025.

Weighted-Reward Preference Optimization for Implicit Model Fusion

Published in ICLR, 2025

Introduces Weighted-Reward Preference Optimization (WRPO), an implicit fusion method enabling capability transfer between LLMs without requiring vocabulary alignment.

Recommended citation: Ziyi Yang, Fanqi Wan, Longguang Zhong, Tianyuan Shi, Xiaojun Quan. (2025). "Weighted-Reward Preference Optimization for Implicit Model Fusion." ICLR 2025.

Discriminative Policy Optimization for Token-Level Reward Models

Published in ICML, 2025

Revisits token-level reward assignment by decoupling reward modeling from language generation and deriving a token-level reward model (Q-RM) through discriminative policy optimization.

Recommended citation: Hongzhan Chen, Tao Yang, Shiping Gao, Ruijun Chen, Xiaojun Quan, Hongtao Tian, Ting Yao. (2025). "Discriminative Policy Optimization for Token-Level Reward Models." ICML 2025.

BlockPruner: Fine-grained Pruning for Large Language Models

Published in ACL, 2025

Targeting redundancies in multi-head attention and MLP blocks, this work proposes a fine-grained, training-free structured pruning approach for LLMs.

Recommended citation: Longguang Zhong, Fanqi Wan, Ruijun Chen, Xiaojun Quan, Liangzhi Li. (2025). "BlockPruner: Fine-grained Pruning for Large Language Models." ACL 2025.

Cool-Fusion: Fuse Large Language Models without Training

Published in ACL, 2025

A training-free fusion approach that ensembles heterogeneous LLMs at the text level and uses reranking to select the best generated segments.

Recommended citation: Cong Liu, Xiaojun Quan, Yan Pan, Weigang Wu, Xu Chen, Liang Lin. (2025). "Cool-Fusion: Fuse Large Language Models without Training." ACL 2025.

Mutual-Taught for Co-adapting Policy and Reward Models

Published in ACL, 2025

Presents Mutual-Taught, a self-training method that iteratively co-adapts policy and reward models during alignment without extra human annotation.

Recommended citation: Tianyuan Shi, Canbin Huang, Fanqi Wan, Longguang Zhong, Ziyi Yang, Weizhou Shen, Xiaojun Quan, Ming Yan. (2025). "Mutual-Taught for Co-adapting Policy and Reward Models." ACL 2025.

FuseChat: Knowledge Fusion of Chat Models

Published in EMNLP, 2025

Part of the FuseLLM series, this work proposes a framework to fuse knowledge from multiple chat models into a unified, more robust chat model.

Recommended citation: Fanqi Wan, Longguang Zhong, Ziyi Yang, Ruijun Chen, Xiaojun Quan. (2025). "FuseChat: Knowledge Fusion of Chat Models." EMNLP 2025.

ReAlign: Structured Revision for Small Language Model Alignment

Published in EMNLP, 2025

Introduces ReAlign, combining on-policy learning stability with reviser-assisted supervision to improve alignment in small language models.

Recommended citation: Ruijun Chen, Jiajian Guo, Hongzhan Chen, Fanqi Wan, Qifan Wang, Xiaojun Quan. (2025). "ReAlign: Structured Revision for Small Language Model Alignment." EMNLP 2025.

ThinkSwitcher: When to Think Hard, When to Think Fast

Published in EMNLP, 2025

Proposes a dynamic framework enabling Large Reasoning Models to switch between short and long Chain-of-Thought modes based on query complexity.

Recommended citation: Guosheng Liang, Longguang Zhong, Ziyi Yang, Xiaojun Quan. (2025). "ThinkSwitcher: When to Think Hard, When to Think Fast." EMNLP 2025.

ProFuser: Progressive Fusion of Large Language Models

Published in AAAI, 2026

Introduces ProFuser, a progressive fusion approach for combining multiple large language models effectively.

Recommended citation: Tianyuan Shi, Fanqi Wan, Canbin Huang, Xiaojun Quan, Chenliang Li, Ming Yan, Ji Zhang, Minhua Huang, Wu Kai. (2026). "ProFuser: Progressive Fusion of Large Language Models." AAAI 2026.

Talks

Teaching

Natural Language Processing

Undergraduate & Postgraduate Course, Sun Yat-sen University, 2025

Undergraduate: 2019–2025 (Annually)
Postgraduate: 2021, 2024, 2025