Daeyoung Kim

Research Engineer

I am a Research Engineer at KT (Korea Telecom), where I develop an omni-modal model. I received my M.S. in AI from KAIST, advised by Prof. Edward Choi, and my B.S. in Computer Science and Information Security Convergence from Korea University.

My research interest is building reliable and robust multimodal AI. I am passionate about developing systems that understand and process information from diverse sources, such as images, text, and speech, while ensuring their trustworthiness.

Download CV

Research Interests

  • Multimodal AI
  • Reliable Machine Learning
  • Natural Language Processing

Education

  • M.S. in Artificial Intelligence
    KAIST, 2021 - 2023
  • B.S. in Computer Science and Information Security Convergence
    Korea University, 2015 - 2021

Work Experience

KT (Korea Telecom)

Research Engineer | Seoul, South Korea

Dec 2024 - Present

  • Constructed synthetic document and instruction-tuning datasets for a Vision Language Model (VLM)
  • Designed the core architecture and trained an omni-modal (image, text, speech) model

NCSOFT

Research Engineer | Seongnam, South Korea

Feb 2023 - Dec 2024

  • Managed the full training pipeline for VARCO-VISION, a Vision Language Model (VLM)
  • Developed VARCO-Text, an LLM-based writing assistant, and its datasets
  • Performed alignment tuning for VARCO LLM 2.0 with custom Korean datasets
  • Designed and built EvalBiasBench, a benchmark to identify and mitigate LLM-as-an-Evaluator biases

NAVER

Research Intern | Seongnam, South Korea

Jul 2022 - Jan 2023

  • Scaled an LLM-based sentence embedding model from 137M to 7B parameters
  • Developed a zero-shot text classification method using sentence encoders

NAVER

Research Intern | Seongnam, South Korea

Jul 2020 - Aug 2020

  • Developed a sentiment and intent classification model for online comments with custom data augmentation

Publications

* denotes equal contribution.

VARCO-VISION: Expanding Frontiers in Korean Vision-Language Models

Jeongho Ju*, Daeyoung Kim*, SunYoung Park*, and Youngjune Kim

Technical Report, 2024

PDF Model

OffsetBias: Leveraging Debiased Data for Tuning Evaluators

Junsoo Park*, Seungyeon Jwa*, Meiying Ren, Daeyoung Kim, and Sanghyuk Choi

In Findings of Empirical Methods in Natural Language Processing (EMNLP), 2024

PDF GitHub Dataset Model

Towards the Practical Utility of Federated Learning in the Medical Domain

Seongjun Yang*, Hyeonji Hwang*, Daeyoung Kim, Radhika Dua, Jong-Yeup Kim, Eunho Yang, and Edward Choi

In Proc. of the Conference on Health, Inference, and Learning (CHIL), 2023

PDF GitHub

Revisiting the Importance of Amplifying Bias for Debiasing

Jungsoo Lee*, Jeonghoon Park*, Daeyoung Kim*, Juyoung Lee, Edward Choi, and Jaegul Choo

In Proc. of the AAAI Conference on Artificial Intelligence (AAAI), 2023 (Oral Presentation)

PDF GitHub

Uncertainty-Aware Text-to-Program for Question Answering on Structured Electronic Health Records

Daeyoung Kim, Seongsu Bae, Seungho Kim, and Edward Choi

In Proc. of the Conference on Health, Inference, and Learning (CHIL), 2022

PDF GitHub

Question Answering for Complex Electronic Health Records Database using Unified Encoder-Decoder Architecture

Seongsu Bae, Daeyoung Kim, Jiho Kim, and Edward Choi

In Proc. of Machine Learning for Health (ML4H), 2021 (Oral Presentation)

PDF

Empowering Sentence Encoders with Prompting and Label Retrieval for Zero-shot Text Classification

Jimin Hong*, Jungsoo Park*, Daeyoung Kim*, Seongjae Choi, Bokyung Son, and Jaewook Kang

Preprint, 2022

PDF