Sitemap
A list of all the posts and pages found on the site. For you robots out there, an XML version is also available for digesting.
Pages
Posts
Future Blog Post
Published:
This post will show up by default. To disable scheduling of future posts, edit _config.yml and set future: false.
Blog Post number 4
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 3
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 2
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Publications
A Revisit of Fake News Dataset with Augmented Fact-checking by ChatGPT
2023, Zizhong Li, Haopeng Zhang, Jiawei Zhang
This paper revisits the existing fake news dataset verified by human journalists with augmented fact-checking by large language models (ChatGPT), and we name the augmented fake news dataset ChatGPT-FC.
Unveiling the Magic: Investigating Attention Distillation in Retrieval-Augmented Generation
2024, Zizhong Li, Haopeng Zhang, Jiawei Zhang
This paper conducts a comprehensive review of the attention distillation workflow and identifies key factors influencing the learning quality of retrieval-augmented language models. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics.
Learning by Ranking: Data-Efficient Knowledge Distillation from Black-Box LLMs for Information Retrieval
2024, Zizhong Li, Haopeng Zhang, Jiawei Zhang
This paper introduces Intermediate Distillation, a data-efficient knowledge distillation training scheme that treats LLMs as black boxes and distills their knowledge via an innovative LLM-ranker-retriever pipeline, using only the LLMs' ranking generation as the supervision signal.
Token-Level Precise Attack on RAG: Searching for the Best Alternatives to Mislead Generation
2025, Zizhong Li, Haopeng Zhang, Jiawei Zhang
This paper proposes Token-level Precise Attack on the RAG (TPARAG), which leverages a lightweight white-box LLM as an attacker to generate and iteratively optimize malicious passages at the token level and is suitable for both white-box and black-box RAG systems.