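Hai Zhao (赵海) Homepage: Word Segmentation, Natural Language, Computational Linguistics, Machine Learning

Professor

Department of Computer Science and Engineering, Shanghai Jiao Tong University

Address: 800 Dongchuan Road, Minhang District, Shanghai

Email: zhaohai at cs.sjtu.edu.cn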

ACL-2019 Area Chair, Morphology and Word Segmentation

ACL-2018 Senior Area Chair, Morphology and Word Segmentation

ACL-2017 Area Chair, Parsing

ACL-2016 Publication Chair

News, and a letter to prospective master's and PhD applicants

Shanghai Jiao Tong University, photographed on March 16, 2010

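Research Interests

Natural language processing, machine learning, bioinformatics, artificial intelligence

Teaching

Natural Language Processing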

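Evaluations

[2010] NEWS-2010 (joint participation with Yan Song (宋彦))
Named Entities Workshop 2010, named entity translation evaluation task

     Ranked first in both English-to-Chinese and Chinese-to-English translation.

The official results are here.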

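[2009] CoNLL-2009 (joint participation with Wenliang Chen (陈文亮))
International evaluation of the 13th Conference on Computational Natural Language Learning (CoNLL-2009): multilingual syntactic and semantic learning

     Ranked first in overall score across seven languages among the 7 systems submitted to the semantic-only task, and first in overall semantic score among all 20 systems submitted to this evaluation.

     Ranked second in overall score across seven languages among the 13 systems submitted to the joint syntactic-semantic task,

                     first in overall score on the semantic part,
                     and first on the joint learning task individually for English, Catalan, and Spanish.
The official results are here; our system reports are here and here.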

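[2008] CoNLL-2008
International evaluation of the 12th Conference on Computational Natural Language Learning (CoNLL-2008): joint learning of syntactic and semantic dependencies

     Ranked fourth among 20 submitted results.

The official results are here; our system report is here.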

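[2007] Bakeoff-4
The First Chinese Language Processing Evaluation of the Chinese Information Processing Society of China and the Fourth International Chinese Language Processing Bakeoff (Bakeoff-4, Bakeoff-2007, 2008)

     Won first place in all five closed tests of this Bakeoff's word segmentation track, among 166 segmentation results submitted by 28 research teams.

     Won three second places and one third place among 33 named entity recognition results.

The official results of Bakeoff-4 are here. Our system report is here.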

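[2006] Bakeoff-3
The Third International Chinese Word Segmentation Bakeoff (Bakeoff-3, Bakeoff-2006)

     Won four first places and two third places among 101 segmentation results submitted by 29 research teams.

The official results of Bakeoff-3 are here. Our system report is here.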

Papers

[2023]

Zhuosheng Zhang#, Hai Zhao, Longxiang Liu#. 2023.
Channel-aware Decoupling Network for Multi-turn Dialogue Comprehension.
IEEE Transactions on Neural Networks and Learning Systems (TNNLS).
[arXiv:2301.03953(PDF)]

Junlong Li, Zhuosheng Zhang, Hai Zhao*. 2022.
Dialogue-adaptive language model pre-training from quality estimation
Neurocomputing Volume 516, Issue C, Pages 27-35
[arXiv preprint: 2009.04984(PDF)]

Zhuosheng Zhang#, Kehai Chen, Rui Wang#, Masao Utiyama, Eiichiro Sumita, Zuchao Li, Hai Zhao*. 2023.
Universal Multimodal Representation for Language Understanding.
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), DOI: 10.1109/TPAMI.2023.3234170
[IEEE] [arXiv(PDF)]

[2022]

Zuchao Li, Hai Zhao*, Junru Zhou, Kevin Parnow, Shexia He. 2022.
Dependency and Span, Cross-Style Semantic Role Labeling on PropBank and NomBank.
ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP), Vol.21(6), Article No.: 130, pp 1–16
[ACM(PDF)]
[arXiv(PDF)]

Jiayi Wang, Rongzhou Bao, Zhuosheng Zhang, Hai Zhao*. 2022.
Rethinking Textual Adversarial Defense for Pre-trained Language Models
IEEE/ACM Transactions on Audio, Speech and Language Processing, vol. 30, pp. 2526-2540, DOI: 10.1109/TASLP.2022.3192097
[IEEE]
[arXiv preprint: 2208.10251(PDF)]

Yilin Zhao, Zhuosheng Zhang and Hai Zhao*. 2022.
Reference Knowledgeable Network for Machine Reading Comprehension
IEEE/ACM Transactions on Audio, Speech and Language Processing, volume 30, pages 1461-1473, DOI: 10.1109/TASLP.2022.3164219
[IEEE]
[arXiv preprint: 2012.03709(PDF)]

Zhuosheng Zhang, Shuohang Wang, Yichong Xu, Yuwei Fang, Wenhao Yu, Yang Liu, Hai Zhao, Chenguang Zhu and Michael Zeng. 2022.
Scaling Multi-task Pre-training with Task Prefix.
Findings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022), pages 5671–5685
[ACL Anthology (PDF)]
[arXiv:2210.06277(PDF)]

Hongqiu Wu, Ruixue Ding, Hai Zhao*, Boli Chen, Penguin Xie, Fei Huang, Min Zhang. 2022.
Forging multiple training objectives for pre-trained language models via meta-learning.
Findings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022), pages 6483–6495
[ACL Anthology (PDF)]

Kailai Sun, Zuchao Li*, Hai Zhao*. 2022.
Reorder and then Parse, Fast and Accurate Discontinuous Constituency Parsing.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022), pages 10575-10588
[ACL Anthology (PDF)]

Zhuosheng Zhang, Hai Zhao, Ming Zhou. 2022.
Instance Regularization for Discriminative Language Model Pre-training.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022), pages 11255–11265
[ACL Anthology (PDF)]

Bohong Wu, Hai Zhao*. 2022.
Sentence Representation Learning with Generative Objective rather than Contrastive Objective.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022), pages 3356–3368
[ACL Anthology (PDF)]

Yiyang Li, Hai Zhao*, Zhuosheng Zhang. 2022.
Back to the Future: Bidirectional Information Decoupling Network for Multi-turn Dialogue Modeling.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022), pages 2761–2774
[ACL Anthology (PDF)]

Jiajia Li, Ping Wang*, Zuchao Li*, Xi Liu, Masao Utiyama, Eiichiro Sumita, Hai Zhao, Haojun Ai. 2022.
A Fuzzy Training Framework for Controllable Sequence-to-Sequence Generation.
IEEE Access, Vol.10: 92467-92480
[IEEE]

Yiyang Li, Hongqiu Wu, Hai Zhao*. 2022.
Semantic-Preserving Adversarial Code Comprehension.
The 29th International Conference on Computational Linguistics (COLING 2022), pages 3017–3028
[ACL Anthology (PDF)]

Jialin Chen, Zhuosheng Zhang, Hai Zhao*. 2022.
Modeling Hierarchical Reasoning Chains by Linking Discourse Units and Key Phrases for Reading Comprehension.
The 29th International Conference on Computational Linguistics (COLING 2022), pages 1467–1479
[ACL Anthology (PDF)]

Ziming Cheng, Zuchao Li*, Hai Zhao*. 2022.
BiBL: AMR Parsing and Generation with Bidirectional Bayesian Learning.
The 29th International Conference on Computational Linguistics (COLING 2022), pages 5461-547
[ACL Anthology (PDF)]

Yifei Yang#, Zuchao Li#, Hai Zhao*. 2022.
Nested Named Entity Recognition as Corpus Aware Holistic Structure Parsing.
The 29th International Conference on Computational Linguistics (COLING 2022), pages 2472–2482
[ACL Anthology (PDF)]

Yifei Yang, Hai Zhao*. 2022.
Aspect-based Sentiment Analysis as Machine Reading Comprehension.
The 29th International Conference on Computational Linguistics (COLING 2022), pages 2461–2471
[ACL Anthology (PDF)]

Zuchao Li, Junru Zhou, Hai Zhao*, Zhisong Zhang, Haoran Li, Yuqi Ju. 2022.
Neural Character-Level Syntactic Parsing for Chinese.
Journal of Artificial Intelligence Research, Vol.73 (2022): 461-509
[ACM JAIR(PDF)]

Zuchao Li, Kevin Parnow, Hai Zhao*. 2022.
Incorporating rich syntax information in Grammatical Error Correction.
Information Processing and Management, Volume 59, Issue 3, pp.1-20.
[ACM]

Zuchao Li, Masao Utiyama*, Eiichiro Sumita, Hai Zhao*. 2022.
Restricted or Not: A General Training Framework for Neural Machine Translation.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop, pages 245–251.
[ACL Anthology (PDF)]

Zuchao Li, Yiran Wang, Masao Utiyama*, Eiichiro Sumita, Hai Zhao*, Taro Watanabe. 2022.
What Works and Doesn’t Work, A Deep Decoder for Neural Machine Translation.
Findings of the Association for Computational Linguistics: ACL 2022, pages 459–471.
[ACL Anthology (PDF)]

Bohong Wu, Zhuosheng Zhang, Jinyuan Wang and Hai Zhao*. 2022.
Sentence-aware Contrastive Learning for Open-Domain Passage Retrieval.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022), pages 1062–1074.
[ACL Anthology (PDF)]

Baorong Huang#, Zhuosheng Zhang#, Hai Zhao*. 2022.
Tracing Origins: Coreference-aware Machine Reading Comprehension.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022), pages 1281–1291.
[ACL Anthology (PDF)]

Yilin Zhao, Hai Zhao*, Libin Shen, and Yinggong Zhao. 2022.
Lite Unified Modeling for Discriminative Reading Comprehension.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022), pages 8682–8695.
[ACL Anthology (PDF)]

Jiayi Wang, Rongzhou Bao, Zhuosheng Zhang, and Hai Zhao*. 2022.
Distinguishing Non-natural from Natural Adversarial Samples for More Robust Pre-trained Language Model.
Findings of the Association for Computational Linguistics: ACL 2022, pages 905-915.
[ACL Anthology (PDF)]

Xinbei Ma, Zhuosheng Zhang, and Hai Zhao*. 2022.
Structural Characterization for Dialogue Disentanglement.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 285–297.
[ACL Anthology (PDF)]

Zuchao Li, Hai Zhao*, Fengshun Xiao, Masao Utiyama, Eiichiro Sumita. 2022.
Explicit Alignment Learning for Neural Machine Translation.
IJCAI 2022, pages: 4230-4237.
[IJCAI (PDF)]

Zhuosheng Zhang, Hai Zhao*, Rui Wang. 2020.
DUMA: Reading Comprehension With Transposition Thinking.
IEEE/ACM Transactions on Audio, Speech and Language Processing, Vol.30: 269-279, DOI: 10.1109/TASLP.2021.3138683
[IEEE]
[arXiv preprint: 2001.09415(PDF)]

Ying Luo#, Hai Zhao*, Zhuosheng Zhang#, Bingjie Tang. 2022.
Open Named Entity Modeling from Embedding Distribution.
IEEE Transactions on Knowledge and Data Engineering, Vol.34(11):5472 - 5483, DOI: 10.1109/TKDE.2021.3049654
[IEEE] [arXiv(PDF)]

Zuchao Li, Zhuosheng Zhang, Hai Zhao*, Rui Wang*, Kehai Chen, Masao Utiyama, and Eiichiro Sumita. 2021.
Text Compression-aided Transformer Encoding.
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), Vol.44(7): 3840 - 3857. DOI: 10.1109/TPAMI.2021.3058341
[IEEE] [arXiv(PDF)]

Zhuosheng Zhang, Yuwei Wu, Junru Zhou, Sufeng Duan, Hai Zhao*, Rui Wang. 2020.
SG-Net: Syntax Guided Transformer for Language Representation.
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), Vol.44(6): 3285 - 3299, DOI: 10.1109/TPAMI.2020.3046683
[IEEE] [arXiv(PDF)]

[2021]

Zhuosheng Zhang, Hai Zhao*, Rui Wang. 2020.
Machine Reading Comprehension: The Role of Contextualized Language Models and Beyond.
arXiv preprint: 2005.06249.
[arXiv(PDF)]

Kailai Sun#, Zuchao Li#, Hai Zhao. 2021.
Multilingual Pre-training with Universal Dependency Learning.
35th Conference on Neural Information Processing Systems (NeurIPS 2021), Sydney, Australia
[NeurIPS]

Zhuosheng Zhang#, Hai Zhao, Longxiang Liu#. 2023.
Which Apple Keeps Which Doctor Away? Colorful Word Representations With Visual Oracles.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, Volume: 30, Page(s): 49 - 59, DOI: 10.1109/TASLP.2021.3130972
[Openreview]

Rongzhou Bao, Zhuosheng Zhang, Hai Zhao*. 2021.
Span Fine-tuning for Pre-trained Language Models.
Findings of the Association for Computational Linguistics: EMNLP 2021, pp.1970-1979.
[ACL Anthology (PDF)]

Yiyang Li, Hai Zhao*. 2021.
Self- and Pseudo-self-supervised Prediction of Speaker and Key-utterance for Multi-party Dialogue Reading Comprehension.
Findings of the Association for Computational Linguistics: EMNLP 2021, pp.2053–2063.
[ACL Anthology (PDF)]

Jiawei Wang, Hai Zhao*, Yinggong Zhao, Libin Shen. 2021.
What If Sentence-hood is Hard to Define: A Case Study in Chinese Reading Comprehension.
Findings of the Association for Computational Linguistics: EMNLP 2021, pp.2348–2359.
[ACL Anthology (PDF)]

Zhuosheng Zhang, Siru Ouyang, Hai Zhao*, Masao Utiyama, Eiichiro Sumita. 2021.
Smoothing Dialogue States for Open Conversational Machine Reading.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP 2021), pp.3685–3696.
[ACL Anthology (PDF)]

Hongjiang Jing#, Zuchao Li, Hai Zhao*, Shu Jiang#. 2021.
Seeking Common but Distinguishing Difference, A Joint Aspect-based Sentiment Analysis Model.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP 2021), pp.3910–3922.
[ACL Anthology (PDF)]

Zuchao Li, Masao Utiyama, Eiichiro Sumita, Hai Zhao*. 2021.
Unsupervised Neural Machine Translation with Universal Grammar.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP 2021), pp.3249–3264.
[ACL Anthology (PDF)]

Kashif Munir, Hai Zhao*, Zuchao Li. 2021.
Learning Context-Aware Convolutional Filters for Implicit Discourse Relation Classification.
IEEE/ACM Transactions on Audio, Speech and Language Processing, Vol.29: 2421-2433, DOI: 10.1109/TASLP.2021.3096041

Shu Jiang, Rui Wang, Zuchao Li, Masao Utiyama, Kehai Chen, Eiichiro Sumita, Hai Zhao*, Bao-Liang Lu*. 2021.
Document-level Neural Machine Translation with Associated Memory Network.
IEICE Transactions on Information and Systems, Vol.E104-D, No.10, pp.-, Oct. 2021.

Zuchao Li, Hai Zhao*, Shexia He and Jiaxun Cai. 2021.
Syntax Role for Neural Semantic Role Labeling.
Computational Linguistics, vol.43:1-48.
[arXiv (PDF)]

Shu Jiang, Zhuosheng Zhang, Hai Zhao*, Jiangtong Li, Yang Yang, Bao-Liang Lu, Ning Xia. 2021.
When SMILES smiles, Practicality Judgment and Yield Prediction of Chemical Reaction via Deep Chemical Language Processing.
IEEE Access, Vol.9: 85071-85083, DOI: 10.1109/ACCESS.2021.3083838
[IEEE]

Kashif Munir, Hai Zhao*, and Zuchao Li. 2021.
Neural Unsupervised Semantic Role Labeling.
ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP).
[arXiv (PDF)]

Mingxuan Wang, Hongxiao Bai, Lei Li, Hai Zhao. 2021.
Cross-lingual Supervision Improves Unsupervised Neural Machine Translation.
Proceedings of NAACL-2021: HLT: Industry Papers, pp.89-96.
[ACL Anthology (PDF)]

Rui Wang and Hai Zhao. 2021.
Advances and Challenges in Unsupervised Neural Machine Translation.
Proceedings of EACL: Tutorial.

Yian Li and Hai Zhao*. 2021.
Pre-training Universal Language Representation.
Proceedings of ACL-IJCNLP, Vol.1: 5122–5133.
[ACL Anthology (PDF)]

Zhuosheng Zhang and Hai Zhao*. 2021.
Structural Pre-training for Dialogue Comprehension.
Proceedings of ACL-IJCNLP, Vol.1: 5134–5145.
[ACL Anthology (PDF)]

Hongqiu Wu, Hai Zhao*, Min Zhang. 2021.
Code Summarization with Structure-induced Transformer.
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, pp.1078–1090.
[ACL Anthology (PDF)]

Yi Xu and Hai Zhao*. 2021.
Dialogue-oriented Pre-training.
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, pp.2663–2673.
[ACL Anthology (PDF)]

Jeonghyeok Park and Hai Zhao*. 2021.
Enhancing Language Generation with Effective Checkpoints of Pre-trained Language Model.
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, pp.2686–2694.
[ACL Anthology (PDF)]

Siru Ouyang, Zhuosheng Zhang, Hai Zhao*. 2021.
Dialogue Graph Modeling for Conversational Machine Reading.
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, pp.3158–3169.
[ACL Anthology (PDF)]

Rongzhou Bao, Jiayi Wang, Hai Zhao*. 2021.
Defending Pre-trained Language Models from Adversarial Word Substitution Without Performance Sacrifice.
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, pp.3248–3258.
[ACL Anthology (PDF)]

Kevin Parnow, Zuchao Li, Hai Zhao*. 2021.
Grammatical Error Correction as GAN-like Sequence Labeling.
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, pp.3284–3290.
[ACL Anthology (PDF)]

Shuailiang Zhang, Hai Zhao*, Junru Zhou, Xi Zhou, Xiang Zhou. 2021.
Semantics-Aware Inferential Network for Natural Language Understanding.
AAAI-2021. Online. Feb. 02-09, 2021.
[arXiv(PDF)]

Yi Xu, Hai Zhao*, Zhuosheng Zhang. 2021.
Topic-Aware Multi-turn Dialogue Modeling.
AAAI-2021. Online. Feb. 02-09, 2021.
[arXiv(PDF)]

Longxiang Liu, Zhuosheng Zhang, Hai Zhao*, Xi Zhou, Xiang Zhou. 2021.
Filling the Gap of Utterance-aware and Speaker-aware Representation for Multi-turn Dialogue.
AAAI-2021. Online. Feb. 02-09, 2021.
[arXiv(PDF)]

Zhuosheng Zhang, Junjie Yang, Hai Zhao*. 2021.
Retrospective Reader for Machine Reading Comprehension.
AAAI-2021. Online. Feb. 02-09, 2021.
[PDF (Both ranking first on SQuAD2.0 among all ensemble and single models)]

Zhuosheng Zhang, Junlong Li, Hai Zhao*. 2021.
Multi-turn Dialogue Comprehension with Pivot Turns and Knowledge.
IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), Vol. 29. DOI: 10.1109/TASLP.2021.3058616
[IEEE] [arXiv(PDF)]

Kashif Munir, Hai Zhao*, and Zuchao Li. 2021.
Adaptive Convolution for Semantic Role Labeling.
IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), vol.29: 782-791, 2021, DOI: 10.1109/TASLP.2020.3048665
[IEEE] [arXiv(PDF)]

[2020]

Junru Zhou, Zhuosheng Zhang, Hai Zhao*, and Shuailiang Zhang. 2020.
LIMIT-BERT : Linguistics Informed Multi-Task BERT.
The 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020). ACL Findings. pp.4450-4461. Online, November 16-20, 2020.
[arXiv(PDF)] [ACL Anthology (PDF)]

Junru Zhou, Zuchao Li and Hai Zhao*. 2020.
Parsing All: Syntax and Semantics, Dependencies and Spans.
The 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020). ACL Findings. pp.4438-4449. Online, November 16-20, 2020.
[arXiv(PDF)] [ACL Anthology (PDF)]

Zuchao Li, Hai Zhao*, Rui Wang*, Masao Utiyama and Eiichiro Sumita. 2020.
Reference Language based Unsupervised Neural Machine Translation.
The 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020). ACL Findings. pp.4151-4162. Online, November 16-20, 2020.
[arXiv(PDF)] [ACL Anthology (PDF)]

Zuchao Li, Hai Zhao*, Rui Wang and Kevin Parnow. 2020.
High-order Semantic Role Labeling.
The 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020). ACL Findings. pp.1134-1151. Online, November 16-20, 2020.
[ACL Anthology (PDF)]

Sufeng Duan and Hai Zhao*. 2020.
Attention Is All You Need for Chinese Word Segmentation.
The 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020). pp.3862-3872. Online, November 16-20, 2020.
[ACL Anthology (PDF)]

Ying Luo, Hai Zhao*, Junlang Zhan. 2020.
Named Entity Recognition Only from Word Embeddings.
The 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020). pp.8995-9005. Online, November 16-20, 2020.
[ACL Anthology (PDF)]

Ying Luo, Hai Zhao*. 2020.
Bipartite Flat-Graph Network for Nested Named Entity Recognition.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL). pp.6408–6418. Online, July 5-10, 2020.
[arXiv(PDF)] [ACL Anthology]

Zuchao Li, Chaoyu Guan, Hai Zhao*, Rui Wang, Kevin Parnow, and Zhuosheng Zhang. 2020.
Memory Network for Linguistic Structure Parsing.
IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), vol.28: 2743-2755, doi: 10.1109/TASLP.2020.3030500.
[IEEE] [arXiv(PDF)]

Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita, Tiejun Zhao, Muyun Yang, and Hai Zhao. 2020.
Towards More Diverse Input Representation for Neural Machine Translation.
IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), vol.28: 1586-1597, doi: 10.1109/TASLP.2020.2996077.
[IEEE] [PDF]

Zuchao Li, Rui Wang*, Kehai Chen, Masao Utiyama, Eiichiro Sumita, Zhuosheng Zhang, Hai Zhao*. 2020.
Data-dependent Gaussian Prior Objective for Language Generation.
ICLR-2020.
[PDF (one of 30 papers with FULL review score among 2,594 submissions)] [code]

Zhuosheng Zhang, Kehai Chen, Rui Wang*, Masao Utiyama, Eiichiro Sumita, Zuchao Li, Hai Zhao*. 2020.
Neural Machine Translation with Universal Visual Representation.
ICLR-2020.
[PDF]

Zhuosheng Zhang, Yuwei Wu, Junru Zhou, Sufeng Duan, Hai Zhao*, Rui Wang*. 2020.
SG-Net: Syntax-Guided Machine Reading Comprehension.
AAAI-2020, Pages 9636-9643.
[arXiv:1908.05147v3(PDF)] [AAAI (PDF)]

Zhuosheng Zhang, Yuwei Wu, Hai Zhao*, Zuchao Li, Shuailiang Zhang, Xi Zhou, Xiang Zhou. 2020.
Semantics-aware BERT for Natural Language Understanding.
AAAI-2020, Pages 9628-9635.
[arXiv:1909.02209v2(PDF)] [AAAI (PDF)]

Shuailiang Zhang, Hai Zhao*, Yuwei Wu, Zhuosheng Zhang, Xi Zhou, Xiang Zhou. 2020.
DCMN+: Dual Co-Matching Network for Multi-choice Reading Comprehension.
AAAI-2020, Pages 9563-9570.
[arXiv:1908.11511v2(PDF)] [AAAI (PDF)]

Zuchao Li, Rui Wang*, Kehai Chen, Masao Utiyama, Eiichiro Sumita, Zhuosheng Zhang, Hai Zhao*. 2020.
Explicit Sentence Compression for Neural Machine Translation.
AAAI-2020, Pages 8311-8318.
[AAAI (PDF)]

Zuchao Li, Hai Zhao*, Kevin Parnow. 2020.
Global Greedy Dependency Parsing.
AAAI-2020, Pages 8319-8326.
[arXiv:1911.08673(PDF)] [AAAI (PDF)]

Junlang Zhan, Hai Zhao*. 2020.
Span Model for Open Information Extraction on Accurate Corpus.
AAAI-2020, Pages 9523-9530.
[arXiv:1901.10879v5(PDF)] [AAAI (PDF)]

Ying Luo, Fengshun Xiao, Hai Zhao*. 2020.
Hierarchical Contextualized Representation for Named Entity Recognition.
AAAI-2020, Pages 8441-8448.
[arXiv:1911.02257v2(PDF)] [AAAI (PDF)]

Xinsong Zhang, Tianyi Liu, Pengshuai Li, Weijia Jia, and Hai Zhao. 2020.
Robust Neural Relation Extraction via Multi-Granularity Noises Reduction.
IEEE Transactions on Knowledge and Data Engineering. DOI: 10.1109/TKDE.2020.2964747
[IEEE] [PDF]

[2019]

Hongxiao Bai, Hai Zhao*. 2019.
SJTU at MRP 2019: A Transition-Based Multi-Task Parser for Cross-Framework Meaning Representation Parsing.
CoNLL-2019, pp.86-94.
[PDF]

Zuchao Li, Hai Zhao*, Zhuosheng Zhang, Rui Wang*, Masao Utiyama, and Eiichiro Sumita. 2019.
SJTU-NICT at MRP 2019: Multi-Task Learning for End-to-End Uniform Semantic Graph Parsing.
CoNLL-2019, pp.45-54. [News]
[PDF]

Shexia He, Zuchao Li, Hai Zhao*. 2019.
Syntax-aware Multilingual Semantic Role Labeling.
EMNLP-2019, pp.5353-5362, Hong Kong, China, November 3–7, 2019.
[PDF]

Sufeng Duan, Hai Zhao*, Junru Zhou and Rui Wang. 2019.
Syntax-aware Transformer Encoder for Neural Machine Translation.
IALP-2019.

Yiqing Zhang, Hai Zhao*, Zhuosheng Zhang. 2019.
Examination-Style Reading Comprehension with Neural augmented Retrieval.
IALP-2019.

Juncheng Cao, Hai Zhao*, Kai Yu. 2019.
Cross Aggregation of Multi-Head Attention for Neural Machine Translation.
NLPCC-2019.

Zuchao Li, Junru Zhou, Hai Zhao*, Rui Wang. 2019.
Cross-Domain Transfer Learning for Dependency Parsing.
NLPCC-2019.

Jeonghyeok Park, Hai Zhao*. 2019.
Korean-to-Chinese Machine Translation using Chinese Character as Pivot Clue.
PACLIC-33.

Zhuosheng Zhang, Yuwei Wu, Zuchao Li, Hai Zhao*. 2019.
Explicit Contextual Semantics for Text Comprehension.
PACLIC-33.

Zuchao Li, Jiaxun Cai, Hai Zhao*.
Effective Representation for Easy-First Dependency Parsing.
The 16th Pacific Rim International Conference on Artificial Intelligence (PRICAI 2019), Yanuca Island, Cuvu, Fiji, August 26-30, 2019.
[PDF]

Zhuosheng Zhang, Hai Zhao*, Kangwei Ling, Jiangtong Li, Zuchao Li, Shexia He, Guohong Fu.
Effective Subword Segmentation for Text Comprehension.
IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), Vol.27(11): 1664-1674, Nov. 2019, doi: 10.1109/TASLP.2019.2922537.
[IEEE] [PDF]

Junru Zhou, Hai Zhao*.
Head-driven Phrase Structure Grammar Parsing on Penn Treebank.
The 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019), pp.2396–2408, Florence, Italy, July 28th to August 2nd, 2019.
[PDF]

Zhuosheng Zhang, Yafang Huang, Hai Zhao*.
Open Vocabulary Learning for Neural Chinese Pinyin IME.
The 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019), pp.1584-1594, Florence, Italy, July 28th to August 2nd, 2019.
[PDF] [code and dataset]

Fengshun Xiao, Jiangtong Li, Hai Zhao*, Rui Wang and Kehai Chen.
Lattice-based Transformer Encoder for Neural Machine Translation.
The 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019), pp.3090–3097, Florence, Italy, July 28th to August 2nd, 2019.
[PDF]

Pengshuai Li, Xinsong Zhang, Weijia Jia*, Hai Zhao*.
GAN Driven Semi-distant Supervision for Relation Extraction.
2019 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), pp.3026-3035, June 2–7, 2019, Minneapolis, USA.
[PDF]

Chaoyu Guan, Yuhao Cheng, Hai Zhao*.
Semantic Role Labeling with Associated Memory Network.
2019 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), pp.3361-3371, June 2–7, 2019, Minneapolis, USA.
[PDF]

Huan Zhang, Hai Zhao*.
Minimum Divergence vs. Maximum Margin: An Empirical Comparison on Seq2seq Models.
Proceedings of ICLR 2019, May 6-9, New Orleans, USA.
[PDF] [Official PDF]

Zuchao Li, Shexia He, Hai Zhao*, Yiqing Zhang, Zhuosheng Zhang, Xi Zhou, Xiang Zhou.
Dependency or Span, End-to-End Uniform Semantic Role Labeling.
Proceedings of AAAI 2019, January 27 - February 1, 2019, Honolulu, Hawaii, USA.
[PDF] [arXiv]
[code]

Xinsong Zhang, Pengshuai Li, Weijia Jia*, Hai Zhao*.
Multi-labeled Relation Extraction with Attentive Capsule Network.
Proceedings of AAAI 2019, January 27 - February 1, 2019, Honolulu, Hawaii, USA.
[PDF]

Xiaobin Wang, Deng Cai, Linlin Li*, Guangwei Xu, Hai Zhao*, Luo Si.
Unsupervised Learning helps Supervised Neural Word Segmentation.
Proceedings of AAAI 2019, January 27 - February 1, 2019, Honolulu, Hawaii, USA.
[PDF]

[2018]

Yiqun Xiao, Jiaxun Cai, Yang Yang*, Hai Zhao, and Hongbin Shen,
Prediction of MicroRNA Subcellular Localization by Using a Sequence-to-Sequence Model,
ICDM 2018, Nov. 17-20, 2018, Singapore.

Yafang Huang and Hai Zhao*,
Chinese Pinyin Aided IME, Input What You Have Not Keystroked Yet,
Proceedings of EMNLP 2018, pp.2923–2929, October 31 - November 4, 2018, Brussels, Belgium.
[PDF]

Zuchao Li, Shexia He, Jiaxun Cai, Zhuosheng Zhang and Hai Zhao*, Gongshen Liu, Linlin Li, Luo Si
A Unified Syntax-aware Framework for Semantic Role Labeling,
Proceedings of EMNLP 2018, pp.2401–2411, October 31 - November 4, 2018, Brussels, Belgium.
[PDF]

Zhisong Zhang, Rui Wang*, Masao Utiyama, Eiichiro Sumita and Hai Zhao*,
Exploring Recombination for Efficient Decoding of Neural Machine Translation,
Proceedings of EMNLP 2018, pp.4785–4790, October 31 - November 4, 2018, Brussels, Belgium.
[PDF]

Yingting Wu, Hai Zhao*, Jia-Jun Tong,
Multilingual Universal Dependency Parsing from Raw Text with Low Resource Language Enhancement,
Proceedings of CoNLL 2018, pp.74-80, October 31 - November 1, 2018, Brussels, Belgium.
[PDF]

Zuchao Li, Shexia He, Zhuosheng Zhang, Hai Zhao*,
Joint Learning for Universal Dependency Parsing,
Proceedings of CoNLL 2018, pp.65-73, October 31 - November 1, 2018, Brussels, Belgium.
[PDF]

Yingting Wu, Hai Zhao*,
Finding Better Subword Segmentation for Neural Machine Translation,
The Seventeenth China National Conference on Computational Linguistics, CCL 2018, LNAI Vol.11221: 53-64, October 19-21, 2018, Changsha China.
[PDF]

Zhuosheng Zhang, Yafang Huang, Pengfei Zhu, Hai Zhao*,
Effective Character-augmented Word Embedding for Machine Reading Comprehension,
Proceedings of The Seventh CCF Conference on Natural Language Processing and Chinese Computing (NLPCC 2018), LNAI Vol.11108: 27-39, August 26-30, 2018, Hohhot, China.

Pengfei Zhu, Zhuosheng Zhang, Jiangtong Li, Yafang Huang, Hai Zhao*,
Lingke: A Fine-grained Multi-turn Chatbot for Customer Service,
Proceedings of the 27th International Conference on Computational Linguistics (COLING 2018), System Demonstrations, pp.108–112, August 20-26, 2018, Santa Fe, New Mexico, USA.
[PDF]

Zhuosheng Zhang, Jiangtong Li, Pengfei Zhu, Hai Zhao*, Gongshen Liu
Modeling Multi-turn Conversation with Deep Utterance Aggregation,
Proceedings of the 27th International Conference on Computational Linguistics (COLING 2018), pp.3740–3752, August 20-26, 2018, Santa Fe, New Mexico, USA.
[PDF]

Zhuosheng Zhang, Yafang Huang and Hai Zhao*
Subword-augmented Embedding for Cloze Reading Comprehension,
Proceedings of the 27th International Conference on Computational Linguistics (COLING 2018), pp.1802–1814, August 20-26, 2018, Santa Fe, New Mexico, USA.
[PDF]

Zhuosheng Zhang and Hai Zhao*
One-shot Learning for Question-Answering in Gaokao History Challenge,
Proceedings of the 27th International Conference on Computational Linguistics (COLING 2018), pp.449–461, August 20-26, 2018, Santa Fe, New Mexico, USA.
[PDF]

Hongxiao Bai and Hai Zhao*
Deep Enhanced Representation for Implicit Discourse Relation Recognition,
Proceedings of the 27th International Conference on Computational Linguistics (COLING 2018), pp. 571–583, August 20-26, 2018, Santa Fe, New Mexico, USA.
[PDF]

Jiaxun Cai, Shexia He, Zuchao Li and Hai Zhao*
A Full End-to-End Semantic Role Labeler, Syntax-agnostic or Syntax-aware?
Proceedings of the 27th International Conference on Computational Linguistics (COLING 2018), pp.2753–2765, August 20-26, 2018, Santa Fe, New Mexico, USA.
[PDF]

Zuchao Li, Jiaxun Cai, Shexia He and Hai Zhao*
Seq2seq Dependency Parsing,
Proceedings of the 27th International Conference on Computational Linguistics (COLING 2018), pp.3203–3214, August 20-26, 2018, Santa Fe, New Mexico, USA.
[PDF]

Huang Yafang, Li Zuchao, Zhang Zhuosheng, Hai Zhao*
Neural-based Chinese Pinyin Aided Input Method with Customizable Association,
Proceedings of ACL 2018, System Demonstrations, pp.140-145, Melbourne, Australia, July 15-20, 2018
[PDF]

Lianhui Qin, Lemao Liu, Victoria Bi, Yan Wang, Xiaojiang Liu, Zhiting Hu, Hai Zhao and Shuming Shi
Automatic Article Commenting: the Task and Dataset,
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018, Volume 2: Short Papers), pp.151-156, Melbourne, Australia, July 15-20, 2018
[PDF]

Shexia He, Zuchao Li, Hai Zhao*, Hongxiao Bai, Gongshen Liu
Syntax for Semantic Role Labeling, to Be, or Not to Be,
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018, Volume 1: Long Papers), pp.2061-2071, Melbourne, Australia, July 15-20, 2018
[PDF]

Zhuosheng Zhang, Jiangtong Li, Hai Zhao*, Bingjie Tang
SJTU-NLP at SemEval-2018 Task 9: Neural Hypernym Discovery with Term Embeddings,
Proceedings of The 12th International Workshop on Semantic Evaluation, pp.903-908, New Orleans, Louisiana, June 1-6, 2018
[PDF]

Rui Wang, Hai Zhao*, Sabine Ploux, Bao-Liang Lu, Masao Utiyama, Eiichiro Sumita
Graph-based Bilingual Word Embedding for Statistical Machine Translation,
ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP), Vol.17(4), Article 31, July 2018, DOI: 10.1145/3203078.
[ACM] [PDF]

Haonan Li, Zhisong Zhang, Yuqi Ju, Hai Zhao*
Neural Character-level Dependency Parsing for Chinese,
Proceedings of The Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18), pp.5205-5212, New Orleans, Louisiana, USA, February 2–7, 2018
[PDF]

[2017]

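Hai Zhao, Deng Cai, Changning Huang, Chunyu Kit
Chinese Word Segmentation: Another Decade Review (2007-2017) (in Chinese; original title: 中文分词十年再回顾(2007-2017))
In Frontiers of Empirical and Corpus Linguistics (实证及语料库语言学前沿), edited by Chunyu Kit and Meichun Liu, China Social Sciences Press, Beijing, July 2017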
[PDF] [arXiv]

Hao Wang, Hai Zhao*, Zhisong Zhang
A Transition-based System for Universal Dependency Parsing,
CoNLL 2017, pp.191-197, Vancouver, Canada, July 2017
[PDF]

Deng Cai, Hai Zhao*, Zhisong Zhang, Yuan Xin, Yongjian Wu, Feiyue Huang
Fast and Accurate Neural Word Segmentation for Chinese,
ACL 2017, pp.608-615, Vancouver, Canada, July 2017
[PDF]

Lianhui Qin, Zhisong Zhang, Hai Zhao*, Zhiting Hu, Eric P. Xing
Adversarial Connective-exploiting Network for Implicit Discourse Relation Classification,
ACL 2017, pp.1006-1017, Vancouver, Canada, July 2017
[PDF]

Deng Cai, Hai Zhao*
Pair-Aware Neural Sentence Modeling for Implicit Discourse Relation Classification,
IEA/AIE (2) 2017, LNCS, volume 10351: 458-466
[PDF]

[2016]

Rui Wang, Hai Zhao*, Bao-Liang Lu, Masao Utiyama* and Eiichiro Sumita,
Connecting Phrase based Statistical Machine Translation Adaptation,
COLING-2016, pp.3135-3145, Osaka, Japan, December, 2016
[PDF]

Lianhui Qin, Zhisong Zhang, and Hai Zhao*
Implicit Discourse Relation Recognition with Context-aware Character-enhanced Embeddings,
COLING-2016, pp.1914-1924, Osaka, Japan, December, 2016
[PDF]

Lianhui Qin, Zhisong Zhang, and Hai Zhao*
A stacking gated neural architecture for implicit discourse relation classification.
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp.2263-2270, Austin, USA, November, 2016
[PDF]

Chenxi Pang, Hai Zhao*, Zhongyi Li,
I Can Guess What You Mean: A Monolingual Query Enhancement for Machine Translation,
LNCS Vol.10035: 50-63, CCL-2016, Yantai, China, Oct 15-16, 2016
[PDF]

Zhongyi Li, Hai Zhao*, Chenxi Pang, Lili Wang, Huan Wang
A Constituent Syntactic Parse Tree based Discourse Parser,
CoNLL-2016 Shared Task, pp.60-64, Berlin, Germany, August 7-12, 2016

Lianhui Qin, Zhisong Zhang, Hai Zhao*
Shallow Discourse Parsing using Convolutional Neural Network,
CoNLL-2016 Shared Task, pp.70-77, Berlin, Germany, August 7-12, 2016

Hai Zhao#, Deng Cai#, Yang Xin, Yuzhu Wang, Zhongye Jia.
A Hybrid Model for Chinese Spelling Check.
ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP), Vol.16(3), Article 21, March 2017, doi: 10.1145/3047405.
[ACM] [PDF]

Zhisong Zhang, Hai Zhao*, Lianhui Qin
Probabilistic Graph-based Dependency Parsing with Convolutional Neural Network,
ACL-2016, pp. 1382-1392, Berlin, Germany, August 7-12, 2016
[PDF]

Rui Wang, Hai Zhao*, Sabine Ploux*, Bao-Liang Lu, Masao Utiyama
A Bilingual Graph-based Semantic Model for Statistical Machine Translation,
IJCAI-2016, pp.2950-2956, New York, USA, July 9-15, 2016
[PDF]

Peilu Wang, Yao Qian, Hai Zhao*, Frank K. Soong, Lei He, Ke Wu
Learning Distributed Word Representations For Bidirectional LSTM Recurrent Neural Network,
NAACL-2016, pp.527-533, San Diego, USA, June 12-15, 2016
[PDF]

Rui Wang, Masao Utiyama, Isao Goto, Eiichiro Sumita, Hai Zhao*, Bao-Liang Lu,
Converting Continuous-Space Language Models into N-gram Language Models with Efficient Bilingual Pruning for Statistical Machine Translation,
ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP), Vol.15(3), Article 11, January 2016, doi: 10.1145/2843942.
[ACM] [PDF]

Jingyi Zhang, Masao Utiyama, Eiichiro Sumita, Hai Zhao, Graham Neubig, Satoshi Nakamura,
Learning local word reorderings for hierarchical phrase-based statistical machine translation,
Machine Translation, Springer, 2016
[PDF]

[2015]

Peilu Wang, Yao Qian, Frank K. Soong, Lei He, Hai Zhao
Word Embedding for Recurrent Neural Network based TTS Synthesis,
Proc. of Acoustics, Speech and Signal Processing (ICASSP), pp. 4879-4883, Brisbane, Australia, 2015

Yan Ge, Hai Zhao, Yulin Qin, et al.
Data Mining of the Media Images of Countries and Regions
Academic Monthly (学术月刊), Vol.47(7): 163-170, July 2015

Changge Chen, Hai Zhao*, Yang Yang
Deceptive Opinion Spam Detection using Deep Level Linguistic Features,
The 4th CCF Conference on Natural Language Processing & Chinese Computing (NLPCC 2015),
October 9-13, 2015, Nanchang, China

Shuo Zang, Hai Zhao*, Chunyang Wu, Rui Wang,
A Novel Word Reordering Method for Statistical Machine Translation,
The 2015 11th International Conference on Natural Computation (ICNC'15) and the 2015 12th International Conference on Fuzzy Systems and Knowledge Discovery (FSKD'15),
August 15-17, 2015, Zhangjiajie, China

Changge Chen, Peilu Wang, Hai Zhao*,
Shallow Discourse Parsing Using Constituent Parsing Tree,
CoNLL 2015, July 30, 2015, Beijing, China

Jingyi Zhang, Masao Utiyama, Eiichiro Sumita, Hai Zhao*,
Learning Word Reorderings for Hierarchical Phrase-based Statistical Machine Translation,
ACL-IJCNLP 2015, pp.542-548, July 26-31, 2015, Beijing, China
[PDF]

Rui Wang, Hai Zhao*, Bao-Liang Lu, Masao Utiyama and Eiichiro Sumita,
Bilingual Continuous-Space Language Model Growing for Statistical Machine Translation,
IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol.23(7): 1209-1220, July 2015, doi: 10.1109/TASLP.2015.2425220.
[IEEE] [PDF]

[2014]

Rui Wang, Hai Zhao, Bao-Liang Lu, Masao Utiyama and Eiichiro Sumita
Neural Network Based Bilingual Language Model Growing for Statistical Machine Translation
EMNLP 2014: 189-195, Doha, Qatar, October, 2014

Jingyi Zhang, Masao Utiyama and Eiichiro Sumita, Hai Zhao
Learning Hierarchical Translation Spans
EMNLP 2014: 183-188, Doha, Qatar, October, 2014

Yang Xin, Hai Zhao, Yuzhu Wang and Zhongye Jia
An Improved Graph Model for Chinese Spell Checking
SIGHAN-2014, Wuhan, China, October, 2014

Xiaolin Wang, Hai Zhao, Bao-Liang Lu
A Meta-Top-down Method for Large-scale Hierarchical Classification
IEEE Transactions on Knowledge and Data Engineering, Vol.26(3): 500-513, March 2014, doi: 10.1109/TKDE.2013.30.
[IEEE] [PDF]

Xiaolin Wang, Yangyang Chen, Hai Zhao, Bao-Liang Lu
Parallelized Extreme Learning Machine Ensemble Based on Min-Max Modular Network
Neurocomputing, Vol.128:31-41, March 2014

Zhongye Jia, Hai Zhao
A Joint Graph Model for Pinyin-to-Chinese Conversion with Typo Correction
In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL 2014), Vol.1: 1512-1523, Baltimore, Maryland
[PDF]

Peilu Wang, Zhongye Jia, Hai Zhao
Grammatical Error Detection and Correction using a Single Maximum Entropy Model
Proceedings of the Eighteenth Conference on Computational Natural Language Learning (CoNLL-2014), pp.74-82, Baltimore, Maryland, June 2014

[2013]

Rui Wang, Masao Utiyama, Isao Goto, Eiichiro Sumita, Hai Zhao, and Bao-Liang Lu
Converting Continuous-Space Language Models into N-gram Language Models for Statistical Machine Translation
EMNLP-2013: 845-850, Seattle, USA, October, 2013

Xiao-Lin Wang, Hai Zhao, and Bao-Liang Lu
Labeled Alignment for Recognizing Textual Entailment
IJCNLP-2013: 605-613, Nagoya, Japan, October, 2013

Zhongye Jia, Hai Zhao
Kyss 1.0: a Framework for Automatic Evaluation of Chinese Input Method Engines
IJCNLP-2013: 1195-1201, Nagoya, Japan, October, 2013

Zhongye Jia, Peilu Wang, Hai Zhao
Graph Model for Chinese Spell Checking
SIGHAN-7: 88-92, Nagoya, Japan, October, 2013

Zhongye Jia, Peilu Wang, Hai Zhao
Grammatical Error Correction as Multiclass Classification with Single Model
CoNLL-2013: 74-81, Sofia, Bulgaria, August, 2013

Jingyi Zhang, Hai Zhao
Improving Function Word Alignment with Frequency and Syntactic Information
IJCAI-2013: 2211-2217, Beijing, China, August, 2013
[PDF]

Xiaolin Wang, Hai Zhao, Bao-Liang Lu
BCMI-NLP Labeled-Alignment-Based Entailment System for NTCIR-10 RITE-2 Task
NTCIR-10: 474-478, Tokyo, Japan, June, 2013

Hai Zhao, Jingyi Zhang, Masao Utiyama and Eiichiro Sumita
An Improved Patent Machine Translation System Using Adaptive Enhancement for NTCIR-10 PatentMT Task
NTCIR-10: 376-379, Tokyo, Japan, June, 2013

Hai Zhao, Xiaotian Zhang, and Chunyu Kit
Integrative Semantic Dependency Parsing via Efficient Large-scale Feature Selection
Journal of Artificial Intelligence Research, Volume 46:203-233, 2013
[PDF]

Hai Zhao, Masao Utiyama, Eiichiro Sumita, and Bao-Liang Lu
An Empirical Study on Word Segmentation for Chinese Machine Translation
A. Gelbukh (Ed.): CICLing 2013, Part II, LNCS 7817, pp. 248-263, 2013
[PDF]

[2012]

Shaohua Yang, Hai Zhao, Xiaolin Wang and Bao-Liang Lu
Spell Checking for Chinese
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 730-736, Istanbul, Turkey, May, 2012

Chunyang Wu and Hai Zhao
Regression with Phrase Indicators for Estimating MT Quality
Proceedings of the 7th Workshop on Statistical Machine Translation of NAACL-2012, pages 152-156, Montreal, Quebec, Canada, June 7-8, 2012

Heming Shou and Hai Zhao
Hybrid Rule-based Algorithm for Coreference Resolution
Proceedings of the Joint Conference on EMNLP and CoNLL, pages 118-121, Jeju Island, Korea, July, 2012

Xiaotian Zhang, Chunyang Wu and Hai Zhao
Chinese Coreference Resolution via Ordered Filtering
Proceedings of the Joint Conference on EMNLP and CoNLL, pages 95-99, Jeju Island, Korea, July, 2012

Shaohua Yang, Hai Zhao and Bao-Liang Lu
A Machine Translation Approach for Chinese Whole-Sentence Pinyin-to-Character Conversion
PACLIC-26, Bali, Indonesia, November, 2012

Xiaotian Zhang, Yao Qian, Hai Zhao, Frank Soong
Break index labeling of Mandarin text via syntactic-to-prosodic tree mapping
The 8th International Symposium on Chinese Spoken Language Processing (ISCSLP-2012), Hong Kong, December 5-8, 2012

Xiaotian Zhang, Hai Zhao and Cong Hui
A Machine Learning Approach to Convert CCGbank to Penn Treebank
the 24th International Conference on Computational Linguistics (COLING 2012), pp.535-542, Mumbai, India, 8-15 December 2012

Qiongkai Xu and Hai Zhao
Using Deep Linguistic Features for Finding Deceptive Opinion Spam
the 24th International Conference on Computational Linguistics (COLING 2012), Mumbai, India, 8-15 December 2012

Xuezhe Ma and Hai Zhao
Fourth-Order Dependency Parsing
the 24th International Conference on Computational Linguistics (COLING 2012), Mumbai, India, 8-15 December 2012
[PDF]

[2011]

Jian Zhang, Hai Zhao, Liqing Zhang, Bao-Liang Lu
An Empirical Comparative Study on Two Large-Scale Hierarchical Text Classification Approaches
International Journal of Computer Processing of Oriental Languages (IJCPOL) 23(4):309-326 (2011)

Xiaolin Wang, Hai Zhao and Bao-Liang Lu
Enhance Top-down method with Meta-Classification for Very Large-scale Hierarchical Classification
IJCNLP-2011, Chiang Mai, Thailand, November 9-11, 2011

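Xiaotian Zhang, Hai Zhao
Unsupervised Chinese Phrase-Structure Parsing Based on Tree-Structure Pattern Mining
The 11th China National Conference on Computational Linguistics, Luoyang, China, August 20-22, 2011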

Hai Zhao and Chunyu Kit
Integrating unsupervised and supervised word segmentation: The role of goodness measures
Information Sciences, Vol.181(1): 163-183, 2011, Elsevier
[PDF]

[2010]

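Hai Zhao
Natural Language Processing as a Branch of Artificial Intelligence: A Stagnant Technology
The 7th Young Scholars Symposium on Natural Language Processing, Shenyang, China, September 18-19, 2010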
[PPT] [MP3(29M)]

Xuezhe Ma, Xiaotian Zhang, Hai Zhao, Bao-Liang Lu
Dependency Parser for Chinese Constituent Parsing
CIPS-SIGHAN-2010, August, 2010, Beijing, China

Yan Song, Chunyu Kit and Hai Zhao
Reranking with Multiple Features for Better Transliteration
NEWS-2010, pp.62-65, July, 2010, Uppsala, Sweden

Cong Hui, Hai Zhao, Yan Song, Bao-Liang Lu
An Empirical Study on Development Set Selection Strategy for Machine Translation Learning
WMT-2010, pp.67-71, July, 2010, Uppsala, Sweden

Shaodian Zhang, Hai Zhao, Guodong Zhou and Bao-Liang Lu
Hedge Detection and Scope Finding by Sequence Labeling with Procedural Feature Selection
CoNLL-2010, pp.92-99, July, 2010, Uppsala, Sweden

Jian Zhang, Hai Zhao, and Bao-Liang Lu
A Comparative Study on Two Large-Scale Hierarchical Text Categorization Tasks' Solutions
IWWIP-2010, July, 2010, Qingdao, China

Hai Zhao, Chang-Ning Huang, Mu Li, Bao-Liang Lu.
A Unified Character-Based Tagging Framework for Chinese Word Segmentation
ACM Transactions on Asian Language Information Processing (TALIP), Vol.9(2), Article 5, June 2010, DOI: 10.1145/1781134.1781135.
[ACM] [PDF]

Gang Jin, Qi Kong, Jian Zhang, Xiaolin Wang, Cong Hui, Hai Zhao, and Bao-Liang Lu
Multiple Strategies for NTCIR-08 Patent Mining at BCMI
NTCIR-8, June, 2010, Tokyo, Japan

Minzhang Huang, Hai Zhao, Bao-Liang Lu
Pruning Training Samples Using a Supervised Clustering Algorithm
ISNN (2) 2010: 250-257, June, 2010, Shanghai, China

Hai Zhao, Yan Song, Chunyu Kit
How Large a Corpus Do We Need: Statistical Method Versus Rule-based Method.
LREC 2010, May, 2010, Malta
[PDF]

[2009]

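Yan Song, Dongfeng Cai, Guiping Zhang, Hai Zhao
A Chinese Word Segmentation Method Based on Joint Character-Word Decoding
Journal of Software (软件学报), Vol.20, No.9, pp.2366-2375, 2009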

Hai Zhao, Wenliang Chen, Chunyu Kit
Semantic Dependency Parsing of NomBank and PropBank: An Efficient Integrated Approach via a Large-scale Feature Selection
EMNLP 2009: Conference on Empirical Methods in Natural Language Processing, pp.30-39, Singapore, August 6-7, 2009

Junhui Li, Guodong Zhou, Hai Zhao, Qiaoming Zhu, Peide Qian
Improving Nominal SRL in Chinese Language with Verbal SRL Information and Automatic Predicate Recognition
EMNLP 2009: Conference on Empirical Methods in Natural Language Processing, pp.1280-1288, Singapore, August 6-7, 2009

Hai Zhao, Yan Song, Chunyu Kit, and Guodong Zhou
Cross Language Dependency Parsing using a Bilingual Lexicon
Joint conference of the 47th Annual Meeting of the Association for Computational Linguistics and the 4th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing (ACL-IJCNLP 2009), pp.55-63, Singapore, August 2-5, 2009

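Hai Zhao, Chunyu Kit, Yan Song
Integrated Chinese Lexical-Syntactic Analysis Based on Character Dependency Trees
The 10th China National Conference on Computational Linguistics (CNCCL-2009), pp.82-88, Yantai, China, July 24-26, 2009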
[PDF]

Hai Zhao, Wenliang Chen, Jun’ichi Kazama, Kiyotaka Uchimoto, and Kentaro Torisawa
Multilingual Dependency Learning: Exploiting Rich Features for Tagging Syntactic and Semantic Dependencies
Thirteenth Conference on Computational Natural Language Learning, (CoNLL-09), pp. 61-66, Boulder, CO, USA, June 4-5, 2009
[PDF]

Hai Zhao, Wenliang Chen, Chunyu Kit, and Guodong Zhou
Multilingual Dependency Learning: A Huge Feature Engineering Method to Semantic Dependency Parsing
Thirteenth Conference on Computational Natural Language Learning, (CoNLL-09), pp. 55-60, Boulder, CO, USA, June 4-5, 2009
[PDF]

Hai Zhao
Character-Level Dependencies in Chinese: Usefulness and Learning
The 12th Conference of the European Chapter of the Association for Computational Linguistics, (EACL-09), pp.879-887, Athens, Greece, March 30 - April 3, 2009
[PDF]

Hai Zhao and Chunyu Kit
A Simple and Efficient Model Pruning Method for Conditional Random Fields
The 22nd International Conference on the Computer Processing of Oriental Languages (ICCPOL 2009), LNCS, Vol.5459, pp.149-159, Hong Kong, March 26-27, 2009
[PDF]

[2008]

Hai Zhao and Chunyu Kit
Parsing Syntactic and Semantic Dependencies with Two Single-Stage Maximum Entropy Models
Twelfth Conference on Computational Natural Language Learning, (CoNLL-2008), pp.203-207, Manchester, UK, August 16-17, 2008
[PDF]

Hai Zhao and Chunyu Kit
Scaling Conditional Random Fields by One-Against-the-Other Decomposition
Journal of Computer Science and Technology, Vol. 23(4): 612-619, July, 2008

Hai Zhao and Chunyu Kit
Exploiting Unlabeled Text with Different Unsupervised Segmentation Criteria for Chinese Word Segmentation
The 9th International Conference on Intelligent Text Processing and Computational Linguistics (CICLing-2008), Haifa, Israel, February 17-23, 2008
Also in Research in Computing Science, Vol. 33: 93-104, 2008
[PDF] [Photo]

Hai Zhao and Chunyu Kit
Unsupervised Segmentation Helps Supervised Learning of Character Tagging for Word Segmentation and Named Entity Recognition
The Sixth SIGHAN Workshop on Chinese Language Processing (SIGHAN-6), pp.106-111, Hyderabad, India, January 11-12, 2008
[PDF]

Hai Zhao and Chunyu Kit
An Empirical Comparison of Goodness Measures for Unsupervised Chinese Word Segmentation with a Unified Framework
The Third International Joint Conference on Natural Language Processing (IJCNLP-2008), Vol. 1: 9-16, Hyderabad, India, January 8-10, 2008
[PDF] [Photo]

[2007]

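Hai Zhao, Chunyu Kit
Chinese Word Segmentation Based on Effective Substring Tagging
Journal of Chinese Information Processing (中文信息学报), Vol. 21(5): 8-13, 2007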
[PDF]

Hai Zhao and Chunyu Kit
Incorporating Global Information into Supervised Learning for Chinese Word Segmentation
The 10th Conference of the Pacific Association for Computational Linguistics (PACLING-2007), pp.66-74, Melbourne, Australia, September 19-21, 2007
[PDF] [Photo]

Hai Zhao and Chunyu Kit
Scaling Conditional Random Field with Application to Chinese Word Segmentation
The Third International Conference on Natural Computation (ICNC'07), Vol. 5: 95-99, Haikou, China, August 24-27, 2007

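Hai Zhao, Chunyu Kit
Chinese Word Segmentation Based on Substring Tagging: In Search of a Better Tagging Unit
The 9th China National Conference on Computational Linguistics, Dalian, China, August 6-8, 2007; collected in Maosong Sun and Qunxiu Chen (eds.), Frontiers of Content Computing: Research and Applications (内容计算的研究与应用前沿), pp.45-51, Tsinghua University Press
[Photo]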

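Chang-Ning Huang, Hai Zhao
Chinese Word Segmentation: A Decade Review (Invited paper)
Journal of Chinese Information Processing (中文信息学报), Vol. 21(3): 8-20, 2007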

[2006]

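Chang-Ning Huang, Hai Zhao
Building Words from Characters: A New Method for Chinese Word Segmentation (Invited paper)
The 25th Anniversary Conference of the Chinese Information Processing Society of China, Beijing, China, November 21-22, 2006; collected in Youqi Cao and Maosong Sun (eds.), Frontiers of Chinese Information Processing (中文信息处理前沿进展), Tsinghua University Press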
[PPT]

Hai Zhao, Chang-Ning Huang, Mu Li, and Bao-Liang Lu
Effective Tag Set Selection in Chinese Word Segmentation via Conditional Random Field Modeling
The 20th Pacific Asia Conference on Language, Information and Computation (PACLIC-20), pp.87-94, Wuhan, China, November 1-3, 2006
[PDF]

Chang-Ning Huang and Hai Zhao
Which Is Essential for Chinese Word Segmentation: Character versus Word (Invited paper)
The 20th Pacific Asia Conference on Language, Information and Computation (PACLIC-20), pp.1-12, Wuhan, China, November 1-3, 2006

Hai Zhao, Chang-Ning Huang, and Mu Li
An Improved Chinese Word Segmentation System with Conditional Random Field
Proceedings of the Fifth SIGHAN Workshop on Chinese Language Processing (SIGHAN-5), pp.162-165, Sydney, Australia, July 22-23, 2006
[PDF]

Hai Zhao and Bao-Liang Lu
A Modular Reduction Method for k-NN Algorithm with Self-Recombination Learning
The Third International Symposium on Neural Networks (ISNN-2006), LNCS Vol. 3971: 530-536, Chengdu, China, May 30 - June 1, 2006
[PDF]

More

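Software Releases

The software released here performs basic natural language processing tasks. Most packages are simplified versions of systems we entered in public evaluations. We release them in the hope that they will be useful, but with no warranty of any kind. They are free to use for non-profit research and educational purposes. Bug reports and suggestions for improvement are welcome.
Chinese Word Segmentation Leaderboard: SIGHAN Bakeoff 2005
The four segmentation corpora released for SIGHAN Bakeoff 2005 have long been the standard evaluation benchmark for Chinese word segmentation.
We maintain this leaderboard to collect as many system results as possible and to show how the technology has progressed.
User submissions are now open!
Note that the leaderboard only accepts results reported in sufficiently serious publications or produced by online systems.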

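BaseSeg: a multi-standard Chinese word segmenter with out-of-vocabulary word recognition
[Download (53.5M)] (if you need the C++ source code, please email me)
Function: BaseSeg (current version 1.5) is a Chinese word segmenter, including out-of-vocabulary (OOV) word recognition, for the four segmentation standards of Bakeoff-3.
Technique: BaseSeg is built on CRF++. It is trained with the n-gram feature settings from our SIGHAN-5 paper.
Performance: BaseSeg ranks among the top three results of Bakeoff-3. On the four Bakeoff-3 test corpora AS, CityU, CTB and MSRA, its overall F-scores are 0.954, 0.969, 0.932 and 0.961 respectively, and it has the highest OOV recognition performance on every test corpus.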
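
For readers unfamiliar with how a word segmentation F-score is usually computed, the following is a minimal Python sketch. It assumes the common convention that a predicted word counts as correct only when both of its boundaries match the gold segmentation. It is not the official Bakeoff scoring script, and the function names `word_spans` and `segmentation_prf` are hypothetical, introduced only for this illustration.

```python
def word_spans(words):
    """Map a segmented sentence (list of words) to a set of (start, end) character spans."""
    spans, start = set(), 0
    for word in words:
        spans.add((start, start + len(word)))
        start += len(word)
    return spans


def segmentation_prf(gold_sents, pred_sents):
    """Return (precision, recall, F-score) over parallel lists of segmented sentences."""
    correct = n_gold = n_pred = 0
    for gold, pred in zip(gold_sents, pred_sents):
        gold_spans, pred_spans = word_spans(gold), word_spans(pred)
        correct += len(gold_spans & pred_spans)  # words with both boundaries right
        n_gold += len(gold_spans)
        n_pred += len(pred_spans)
    precision = correct / n_pred if n_pred else 0.0
    recall = correct / n_gold if n_gold else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


if __name__ == "__main__":
    gold = [["我们", "在", "上海"]]
    pred = [["我们", "在", "上", "海"]]  # the last gold word is split in two
    print(segmentation_prf(gold, pred))
```

In this toy case two of the four predicted words match the three gold words, so precision is 2/4, recall is 2/3, and the F-score is their harmonic mean.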

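BaseNER: a named entity recognizer for unsegmented Chinese text
[Download (23.7M)]
Function: BaseNER (current version 1.0) is a high-performance tool for named entity recognition and classification that supports two Bakeoff-3 annotation standards.
Technique: BaseNER is built on CRF++. It is trained with the n-gram feature settings from our SIGHAN-6 paper.
Performance: for the CityU and MSRA named entity annotation standards, its overall F-scores for named entity recognition and classification are 0.8815 and 0.8524 respectively (on the corresponding Bakeoff-3 test sets).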

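BasePoS: a part-of-speech tagger for Chinese and English
[Download (8.5M)]
Function: BasePoS (current version 1.0) is a part-of-speech tagger for Chinese and English. For Chinese, the input text must already be segmented; BasePoS can be used together with BaseSeg.
Technique: BasePoS is built on a maximum entropy model. The Chinese model is trained on the CTB part-of-speech training corpus of Bakeoff-4, and the English model on Sections 02-21 of the PTB corpus.
Performance: Chinese tagging accuracy is 0.941 (on the Bakeoff-4 CTB test corpus), and English tagging accuracy on PTB Section 24 is 0.966.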

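Character Dependency Annotation Set
[Download request]
Function: character dependency annotations and the annotation guideline document, used to build complete character dependency trees (in combination with the word dependencies in an existing treebank)
Technique: based on my EACL-2009 paper (Zhao, 2009) and follow-up research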


(Last updated: February 2023)