
Kun Qian

Computer Science Researcher

San Jose, California, USA

kunqian.usa at gmail.com


About me

My name is Kun Qian (钱坤 in Chinese). I am currently a researcher at IBM Research. My research lies at the intersection of data integration, artificial intelligence, and NLP. The goal of my work is to build intelligent human-in-the-loop machine learning systems for entity understanding and knowledge creation. Before joining IBM Research, I earned my PhD in Computer Science at the University of California, Santa Cruz (UCSC) under the supervision of three excellent advisers: Balder ten Cate, Phokion Kolaitis, and Wang-Chiew Tan. Before coming to the US, I also worked with James Cheng at Nanyang Technological University in Singapore.

Find me on social media.


Curriculum Vitae


Education

PhD in Computer Science
2012-2017
University of California, Santa Cruz (USA)

Master in Software Engineering
2007-2010
Beihang University (China)
Visited Kyushu University (Japan), 2008-2009

Bachelor in Software Engineering
2003-2007
Chongqing University (China)


Skills

Object-oriented:
Python, Java

Web development:
Angular 7, Angular Material, AngularJS, JavaScript, HTML/CSS, W3.CSS, Django

Deep learning framework:
PyTorch

Others:
Jupyter Notebook, LaTeX


Professional Affiliations & Services

Academic membership:
AAAI (since 2019)

Program committee member and journal referee:
AAAI 2020
IEEE ICDE 2020 (industry track)
IEEE Big Data 2019
ACM TODS (2019, 2018)
IEEE TKDE (2019)
WebDB@SIGMOD (2018)

External reviewer:
CIKM (2017, 2018)
KDD 2017
AAAI 2017
ADAMA 2017




Avatar
This is a picture drawn by my wife, whose studio can be found here.

Research Interests

Topics of interest (last updated: December 2019)

Human-in-the-loop machine learning, active learning, deep learning, artificial intelligence, data integration, and data exchange.


I am currently interested in building intelligent human-in-the-loop machine learning systems for entity understanding and knowledge curation. I am also interested in developing intuitive user interfaces for these systems so that users without coding skills can learn high-quality models with little effort. Representative systems include LUSTRE (ICDE 2018 demo track) and SystemER (VLDB 2019 demo track).

Since 2019, I have been particularly interested in designing advanced active learning techniques that help deep-learning-based entity understanding approaches perform well even in low-resource settings (i.e., when little or no labeled data is available). One representative system is PARTNER (AAAI 2020 demo track). One of my latest projects concerns explainability for NLP tasks, and we are giving a tutorial on this topic at AACL 2020.


Selected Publications

* Authors are ordered alphabetically in two cases: (1) the work is a technical tutorial, or (2) it is theoretical work done with my PhD advisers (Balder ten Cate, Phokion Kolaitis, and Wang-Chiew Tan), where we follow the convention of the theory community.


  • (New) Explainability for Natural Language Processing. [Tutorial]
    Shipi Dhanorkar, Yunyao Li, Lucian Popa, Kun Qian*, Christine T Wolf, and Anbang Xu
    The 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics (AACL-IJCNLP 2020).
  • (New) PARTNER (Title temporarily omitted).
    Kun Qian, Poornima Chozhiyath Raman, Yunyao Li, and Lucian Popa.
    The 34th AAAI Conference on Artificial Intelligence (AAAI 2020, demo).
  • Learning-based Human-in-the-loop Methods for Entity Resolution. [Tutorial]
    Sairam Gurajada, Lucian Popa, Kun Qian*, and Prithviraj Sen.
    The 28th ACM International Conference on Information and Knowledge Management (CIKM 2019).
  • Learning Explainable Entity Resolution Algorithms for Small Business Data using SystemER.
    Kun Qian, Douglas R Burdick, Sairam Gurajada, and Lucian Popa.
    Data Science for Macro-modeling with Financial and Economic Datasets (DSMM'19) @ SIGMOD'19.
  • Low-resource Deep Entity Resolution with Transfer and Active Learning.
    Jungo Kasai, Kun Qian, Sairam Gurajada, Yunyao Li, and Lucian Popa.
    The 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019).
  • SystemER: A Human-in-the-loop System for Explainable Entity Resolution.
    Kun Qian, Lucian Popa, and Prithviraj Sen.
    The 45th International Conference on Very Large Data Bases (VLDB 2019). (to appear).
    Los Angeles, USA
    PDF Poster Video demo
    Recorded in March 2019
  • Knowledge Refinement via Rule Selection.
    Phokion Kolaitis, Lucian Popa, and Kun Qian*.
    The 33rd AAAI Conference on Artificial Intelligence (AAAI 2019).
    Hawaii, USA.
  • Exploiting Structure in Representation of Named Entities using Active Learning.
    Nikita Bhutani, Kun Qian, Yunyao Li, H.V. Jagadish, Mauricio A. Hernandez, and Mitesh Vasa.
    The 27th International Conference on Computational Linguistics (COLING 2018).
    Santa Fe, USA.
  • LUSTRE: An Interactive System for Entity Structured Representation and Variant Generation.
    Kun Qian, Nikita Bhutani, Yunyao Li, H.V. Jagadish, and Mauricio A. Hernandez.
    The 34th IEEE International Conference on Data Engineering (IEEE ICDE 2018).
    Paris, France.
    Preprint Video demo
    Recorded in December 2017
    Video demo (long version recorded in 2019)
  • Active Learning of GAV Schema Mappings.
    Balder ten Cate, Phokion Kolaitis, Kun Qian*, and Wang-Chiew Tan.
    The 37th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems (PODS 2018).
    Houston, USA
    PDF Slides Poster Presentation
    June 2018
  • Discovering Information Integration Specifications from Data Examples.
    Kun Qian.
    PhD dissertation, UC Santa Cruz.
  • Active Learning for Large-Scale Entity Resolution.
    Kun Qian, Lucian Popa, and Prithviraj Sen.
    The 26th ACM International Conference on Information and Knowledge Management (CIKM 2017).
    Singapore.
  • Approximation Algorithms for Schema-Mapping Discovery from Data Examples.
    Balder ten Cate, Phokion Kolaitis, Kun Qian*, and Wang-Chiew Tan.
    ACM Transactions on Database Systems (ACM TODS 2017).
  • Approximation Algorithms for Schema-Mapping Discovery from Data Examples.
    Balder ten Cate, Phokion Kolaitis, Kun Qian*, and Wang-Chiew Tan.
    Alberto Mendelzon International Workshop on Foundation of Databases and the Web (AMW 2015).

Talks

  • Low-resource Deep Entity Resolution with Transfer and Active Learning.
    @ 2019 CROSS Research Symposium & Oktoberfest

Work Experience

Research Engineer @ IBM Research - Almaden (USA)
February 2017 - Present

Project Officer @ Nanyang Technological University (Singapore)
October 2010 - August 2011