Kaili Huang

Senior MLE @ Apple | LLM Post-Training, RL & Alignment
Ex-Microsoft AI & ByteDance AI Lab (later Seed) | MSCS @ Stanford | BE @ Tsinghua


I am a machine learning engineer and researcher focused on LLM post-training, reinforcement learning (RL), ranking and retrieval, evaluation, and large-scale ML systems.

Across Apple, Microsoft AI, and ByteDance AI Lab (later Seed), I have worked on LLM post-training, RL, reward design, policy optimization, retrieval-augmented generation (RAG), multimodal embedding, human-in-the-loop evaluation, safety-oriented data workflows, and production ML systems. My current work at Apple focuses on SFT/RL post-training for large-scale LLM retrieval agents; my previous work at Microsoft AI focused on LLM evaluation, RAG alignment, multimodal retrieval, and efficient neural retrieval.

I received my M.S. in Computer Science from Stanford University and my B.Eng. from Tsinghua University. At Stanford, I was fortunate to learn from and work with leading professors and researchers across NLP, reinforcement learning, information retrieval, and ML systems, including Christopher Manning, Tengyu Ma, Christopher Potts, Monica Lam, and Matei Zaharia. At Tsinghua, I worked with renowned NLP professors Minlie Huang and Xiaoyan Zhu on dialogue systems and conversational AI.

news

Jun 28, 2026 I will be attending ACL 2026 in San Diego. Excited to meet with friends there!
May 02, 2026 I’m happy to share that I joined Apple as a Senior Machine Learning Engineer this January, working on LLM post-training, RL policy optimization, and alignment evaluation for large-scale retrieval. I look forward to pushing the boundaries of reasoning-driven retrieval and alignment. Read more on LinkedIn.
Apr 06, 2025 Excited to share that our paper, ColBERT-serve: Efficient Multi-Stage Memory-Mapped Scoring, was accepted and presented at the 47th European Conference on Information Retrieval (ECIR 2025). The work introduces a memory-mapped multi-stage scoring approach that makes late-interaction neural retrieval (ColBERT) far more efficient and scalable for serving LLM-powered retrieval systems.

selected publications

  1. ACL
    KdConv: A Chinese Multi-domain Dialogue Dataset Towards Multi-turn Knowledge-driven Conversation
    Hao Zhou, Chujie Zheng, Kaili Huang, and 2 more authors
    In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL), Jul 2020
  2. TACL
    CrossWOZ: A Large-Scale Chinese Cross-Domain Task-Oriented Dialogue Dataset
    Qi Zhu, Kaili Huang, Zheng Zhang, and 2 more authors
    Transactions of the Association for Computational Linguistics (TACL), Jun 2020
  3. NLPCC
    A Large-Scale Chinese Short-Text Conversation Dataset
    Yida Wang, Pei Ke, Yinhe Zheng, and 4 more authors
    In Natural Language Processing and Chinese Computing (NLPCC). Best Student Paper Award, Oct 2020
  4. ECIR
    ColBERT-serve: Efficient Multi-Stage Memory-Mapped Scoring
    Kaili Huang, Thejas Venkatesh, Uma Dingankar, and 9 more authors
    In Advances in Information Retrieval: 47th European Conference on Information Retrieval, ECIR 2025, Lucca, Italy, April 6-10, 2025, Proceedings, Part IV, Apr 2025
  5. arXiv
    DeepThink: Aligning Language Models with Domain-Specific User Intents
    Yang Li, Mingyu Luo, Yang Gong, and 4 more authors
    arXiv preprint arXiv:2502.05497, Feb 2025