• Program Overview

  • Technical Program
    • Monday, 26 November, 2018
    • Tuesday, 27 November, 2018
      • 09:00-10:00
      • 10:30-12:30
        • O-1: Speech Enhancement
          • A Front-End Speech Enhancement System for Robust Automotive Speech Recognition
            Haikun Wang, University of Science and Technology of China; Zhongfu Ye, University of Science and Technology of China; Jingdong Chen, Northwestern Polytechnical University
          • Speech Enhancement using Convolutional Neural Network with Skip Connections
            Yupeng Shi, Shenzhen University; Weicong Rong, Shenzhen University; Nengheng Zheng, Shenzhen University
          • A Novel Unified Framework for Speech Enhancement and Bandwidth Extension Based on Jointly Trained Neural Networks
            Bin Liu, Chinese Academy of Sciences; Jianhua Tao, Chinese Academy of Sciences, CASIA, University of Chinese Academy of Sciences; Yibin Zheng, Chinese Academy of Sciences, University of Chinese Academy of Sciences
          • Speech Enhancement Based on Reducing the Detail Portion of Speech Spectrograms in Modulation Domain via Discrete Wavelet Transform
            Shih-kuang Lee, National Chi Nan University; Syu-Siang Wang, Academia Sinica; Yu Tsao, Academia Sinica; Jeih-weih Hung, National Chi Nan University
          • Two-Stage Enhancement of Noisy and Reverberant Microphone Array Speech for Automatic Speech Recognition Systems Trained with Only Clean Speech
            Quandong Wang, University of Chinese Academy of Sciences, Institute of Acoustics, Georgia Institute of Technology; Sicheng Wang, Georgia Institute of Technology; Fengpei Ge, Institute of Acoustics; Chang Woo Han, Samsung Research; Jaewon Lee, Samsung Research; Lianghao Guo, Institute of Acoustics; Chin-Hui Lee, Georgia Institute of Technology
          • Utterance-level Permutation Invariant Training with Discriminative Learning for Single Channel Speech Separation
            Cunhang Fan, Chinese Academy of Sciences, University of Chinese Academy of Sciences; Bin Liu, Chinese Academy of Sciences; Jianhua Tao, Chinese Academy of Sciences, University of Chinese Academy of Sciences; Zhengqi Wen, Chinese Academy of Sciences; Jiangyan Yi, Chinese Academy of Sciences; Ye Bai, Chinese Academy of Sciences, University of Chinese Academy of Sciences
        • O-2: Speech Synthesis and Voice Conversion
          • A Method for Emotional Speech Synthesis Based on Speaker Adaptive Training
            Xiaoyong Lu, Northwest Normal University, Engineering Research Center of Gansu Province for Intelligent Information Technology and Application; Yanqin Li, Northwest Normal University, Engineering Research Center of Gansu Province for Intelligent Information Technology and Application; Hongwu Yang, Northwest Normal University, Engineering Research Center of Gansu Province for Intelligent Information Technology and Application
          • Investigation of Stacked Deep Neural Networks and Mixture Density Networks for Acoustic-to-Articulatory Inversion
            Xurong Xie, Chinese University of Hong Kong, Chinese Academy of Sciences (Shenzhen); Xunying Liu, Chinese University of Hong Kong, Chinese Academy of Sciences (Shenzhen); Tan Lee, Chinese University of Hong Kong; Lan Wang, Chinese Academy of Sciences (Shenzhen)
          • Emotional Speech Synthesis Based on DNN and PAD Emotional State Model
            Weizhao Zhang, Northwest Normal University, Engineering Research Center of Gansu Province for Intelligent Information Technology and Application; Hongwu Yang, Northwest Normal University, Engineering Research Center of Gansu Province for Intelligent Information Technology and Application, National and Provincial Joint Engineering Laboratory of Learning Analysis Technology in Online Education; Pengpeng Zhi, Northwest Normal University
          • Research on Dungan Speech Synthesis Based on Deep Neural Network
            LiJia Chen, Northwest Normal University; Hongwu Yang, Northwest Normal University, Engineering Research Center of Gansu Province for Intelligent Information Technology and Application, National and Provincial Joint Engineering Laboratory of Learning Analysis Technology in Online Education; Hui Wang, Northwest Normal University
          • Voice Conversion Based on Cross-Domain Features Using Variational Auto Encoders
            Wen-Chin Huang, Academia Sinica, National Taiwan University; Hsin-Te Hwang, Academia Sinica; Yu-Huai Peng, National Tsing Hua University; Yu Tsao, Academia Sinica; Hsin-Min Wang, Academia Sinica
          • Frame Selection in SI-DNN Phonetic Space with WaveNet Vocoder for Voice Conversion without Parallel Training Data
            Feng-Long Xie, Harbin Institute of Technology, Microsoft Research Asia; Frank Soong, Microsoft Research Asia; Xi Wang, Microsoft Cloud and AI; Lei He, Microsoft Cloud and AI; Haifeng Li, Harbin Institute of Technology
      • 14:00-16:00
        • Poster 1: Speaker, Language and Emotion Recognition
          • Manifold-based Incremental Community Detection Method for Online Speaker Identification
            Hongcui Wang, Zhejiang University of Water Resources and Electric Power; Dongxiao He, Tianjin University; Jianwu Dang, Tianjin University, Japan Advanced Institute of Science and Technology; Xi Liang, Zhejiang University of Water Resources and Electric Power
          • Max Margin Cosine Loss for Speaker Identification on Short Utterances
            Ruifang Ji, Chinese Academy of Sciences, University of Chinese Academy of Sciences; Junhua Cao, Chongqing Public Security Bureau; Xinyuan Cai, Chinese Academy of Sciences; Bo Xu, Chinese Academy of Sciences
          • Automatic Personality Perception from Speech in Mandarin
            Minxian Zhu, Beijing Institute of Technology; Xiang Xie, Beijing Institute of Technology; Liqiang Zhang, Beijing Institute of Technology; Jing Wang, Beijing Institute of Technology
          • Text-dependent Speaker Verification Using Word-based Scoring
            Shengyu Yao, Chinese Academy of Sciences, University of Chinese Academy of Sciences; Houjun Huang, Chinese Academy of Sciences; Ruohua Zhou, Chinese Academy of Sciences, University of Chinese Academy of Sciences; Yonghong Yan, Chinese Academy of Sciences
          • End-to-end Language Identification using NetFV and NetVLAD
            Jinkun Chen, Sun Yat-sen University; Weicheng Cai, Duke Kunshan University, Sun Yat-sen University; Danwei Cai, Duke Kunshan University; Zexin Cai, Duke Kunshan University, China; Haibin Zhong, Jiangsu Jinling Science and Technology Group Limited; Ming Li, Duke Kunshan University
          • Robust Front-End Processing For Emotion Recognition In Noisy Speech
            Meghna Pandharipande, TCS Research and Innovation; Rupayan Chakraborty, TCS Research and Innovation; Ashish Panda, TCS Innovation Labs, Mumbai, India; Sunil Kumar Kopparapu, TCS Research and Innovation
          • Replay Attacks Detection Using Phase and Magnitude Features with Various Frequency Resolutions
            Meng Liu, Tianjin University; Longbiao Wang, Tianjin University; Zeyan Oo, Nagaoka University of Technology; Jianwu Dang, Tianjin University, Japan Advanced Institute of Science and Technology; Dongbo Li, Tianjin University; Seiichi Nakagawa, Chubu University
          • Novel Demodulation-Based Features using Classifier-level Fusion of GMM and CNN for Replay Detection
            Madhu R. Kamble, Dhirubhai Ambani Institute of Information and Communication Technology (DA-IICT); Hemlata Tak, Dhirubhai Ambani Institute of Information and Communication Technology (DA-IICT); Maddala V. Siva Krishna, Indian Institute of Information Technology (IIIT); Hemant A. Patil, Dhirubhai Ambani Institute of Information and Communication Technology (DA-IICT)
          • Chinese Causal Relation: Conjunction, Order and Focus-to-Stress Assignment
            Liang Zhang, China University of Political Science and Law; Aijun Li, Institute of Linguistics, CASS; Yingyi Luo, Institute of Linguistics, Chinese Academy of Social Sciences
          • Parallel Double Audio Fingerprinting
            Tianyu Liang, Tsinghua University; Xianhong Chen, Tsinghua University; Can Xu, Tsinghua University; Liang He, Tsinghua University
          • LSTM-Based Pitch Range Estimation from Spectral Information of Brief Speech Input
            Wei Zhang, Beijing Language and Culture University; Qi Zhang, Beijing Language and Culture University; Yanlu Xie, Beijing Language and Culture University; Jinsong Zhang, Beijing Language and Culture University
          • Acoustic and Kinematic Examination of Dysarthria in Cantonese patients of Parkinson's disease
            Yue Sun, Guizhou University, Chinese Academy of Sciences; Manwa L. Ng, University of Hong Kong; Chongyuan Lian, Chinese Academy of Sciences; Lan Wang, Chinese Academy of Sciences; Feng Yang, Shenzhen Children's Hospital; Nan Yan, Chinese Academy of Sciences
        • Poster 2: Speech Recognition II
          • Enhanced Denoising Auto-Encoder for Robust Speech Recognition in Unseen Noise Conditions
            Sonal Joshi, TCS Innovation Labs; Ashish Panda, TCS Innovation Labs; Biswajit Das, TCS Innovation Labs
          • Bidirectional LSTM with Extended Input Context
            Gaofeng Cheng, Chinese Academy of Sciences, University of Chinese Academy of Sciences; Lu Huang, Tsinghua University; Jiasong Sun, Tsinghua University; Yonghong Yan, Chinese Academy of Sciences, University of Chinese Academy of Sciences
          • A Comparable Study of Modeling Units for End-to-End Mandarin Speech Recognition
            Wei Zou, Didi Chuxing; Dongwei Jiang, Didi Chuxing; Shuaijiang Zhao, Didi Chuxing; Guilin Yang, Didi Chuxing; Xiangang Li, Didi Chuxing
          • Keyword Spotting Based On CTC and RNN For Mandarin Chinese Speech
            Yiyan Wang, Beijing Unisound Information Technology Co., Ltd.; Yanhua Long, Shanghai Normal University
          • Space-Time Residual LSTM Architecture for Distant Speech Recognition
            Long Wu, Chinese Academy of Sciences, University of Chinese Academy of Sciences; Li Wang, Chinese Academy of Sciences; Pengyuan Zhang, Chinese Academy of Sciences, University of Chinese Academy of Sciences; Ta Li, Chinese Academy of Sciences; Yonghong Yan, Chinese Academy of Sciences, University of Chinese Academy of Sciences
          • An Analysis of Decoding for Attention-based End-to-end Mandarin Speech Recognition
            Dongwei Jiang, Didi Chuxing; Wei Zou, Didi Chuxing; Shuaijiang Zhao, Didi Chuxing; Guilin Yang, Didi Chuxing; Xiangang Li, Didi Chuxing
          • A Study on Acoustic Modeling for Child Speech Based on Multi-Task Learning
            Jiarui Wang, The Chinese University of Hong Kong; Si Ioi Ng, The Chinese University of Hong Kong; Dehua Tao, The Chinese University of Hong Kong; Wing Yee Ng, The Chinese University of Hong Kong; Tan Lee, The Chinese University of Hong Kong
          • Distant-talking Speech Recognition Based on Multi-objective Learning using Phase and Magnitude-based Feature
            Dongbo Li, Tianjin University; Longbiao Wang, Tianjin University; Jianwu Dang, Tianjin University, Japan Advanced Institute of Science and Technology; Meng Ge, Tianjin University; Haotian Guan, Intelligent Spoken Language Technology (Tianjin) Co., Ltd.
          • Speech Enhancement Based on A New Architecture of Wasserstein Generative Adversarial Networks
            ShuaiShuai Ye, Beijing University of Posts and Telecommunications; Ting Jiang, Beijing University of Posts and Telecommunications; Shan Qin, Beijing University of Posts and Telecommunications; Weixia Zou, Beijing University of Posts and Telecommunications; Chengyun Deng, Beijing University of Posts and Telecommunications
          • An Investigation of Transfer Learning Mechanism for Acoustic Scene Classification
            Hengshun Zhou, University of Science and Technology of China; Xue Bai, University of Science and Technology of China; Jun Du, University of Science and Technology of China
          • Microphone Array Acoustic Source Localization System Based on Deep Learning
            Junhao Ding, Shenzhen University; Bin Ren, Dongguan University of Technology; Nengheng Zheng, Shenzhen University
          • Evaluating Modeling Units and Sub-word Features in Language Models for Turkish ASR
            Chang Liu, Chinese Academy of Sciences, University of Chinese Academy of Sciences; Yike Zhang, Chinese Academy of Sciences, University of Chinese Academy of Sciences; Pengyuan Zhang, Chinese Academy of Sciences, University of Chinese Academy of Sciences; Yaofeng Wang, East China Normal University
          • Chinese Poetry Generation with Flexible Styles
            Jiyuan Zhang, Tsinghua University, Beijing National Research Center for Information Science and Technology; Dong Wang, Tsinghua University, Beijing National Research Center for Information Science and Technology
        • Poster 3: Phonetics, Phonology and Speech Prosody
          • Perceivable information structure in discourse prosody – Detecting prominent prosodic words in spoken discourse using F0 contour
            Chao-yu Su, Academia Sinica; Chiu-yu Tseng, Academia Sinica
          • Declination and boundary effect in Cantonese declarative sentence
            Chunyu Ge, Chinese Academy of Social Sciences; Aijun Li, Chinese Academy of Social Sciences
          • Interaction of Syntax, Semantics and Pragmatics on Discourse Prosody in Standard Chinese
            Xinyi Wen, Chinese Academy of Social Sciences, University of Chinese Academy of Social Sciences; Yuan Jia, Chinese Academy of Social Sciences; Aijun Li, Chinese Academy of Social Sciences
          • A Preliminary Study on Quantitative Calculation of Prosodic Strength in Mandarin Speech
            Wei Zhang, Beijing Language and Culture University; Yanlu Xie, Beijing Language and Culture University; Jinsong Zhang, Beijing Language and Culture University
          • L2 Mispronunciation Verification Based on Acoustic Phone Embedding and Siamese Networks
            Zhenyu Wang, Beijing Language and Culture University; Jinsong Zhang, Beijing Language and Culture University; Yanlu Xie, Beijing Language and Culture University
          • Comparing Mandarin Lexical Stress Produced by Native Speakers and L2 Learners in Hong Kong
            Lei Liu, School of Chinese Language and Culture; Xuemei Zhai, School of Foreign Languages and Cultures; Wentao Gu, School of Chinese Language and Culture
          • A study on the pitch realization of focus in Chinese
            Ziyu Xiong, Chinese Academy of Social Sciences; Maolin Wang, Jinan University
          • Effect of Anticipatory Vowel-to-Vowel Coarticulation at Different Prosodic Boundaries in Chinese
            Ziyu Xiong, Chinese Academy of Social Sciences; Maolin Wang, Jinan University
          • Co-articulation between Consonant and Vowel in Cantonese and Taiwanese CVC Syllables
            Wai-Sum Lee, City University of Hong Kong; Yueh-chin Chang, National Tsing Hua University; Feng-fan Hsieh, National Tsing Hua University
          • Cross-Dialectal Perception of the Third-Tone Sandhi in Standard Chinese – Evidence from Eye Movements
            Qian Li, Chinese Academy of Social Sciences; Yingyi Luo, Chinese Academy of Social Sciences; Aijun Li, Chinese Academy of Social Sciences
          • An Acoustic Comparison between Two Pairs of Assimilatory and Dissimilatory Tone Sandhi Processes in Nanjing Mandarin in Categoricalness/Gradience
            Xin Li, Utrecht University; René Kager, Utrecht University
          • Response Acts in Chinese Conversation: the Coding Scheme and Analysis
            Aijun Li, Chinese Academy of Social Sciences
          • End-to-End Mongolian Text-to-Speech System
            Jingdong Li, Inner Mongolia University; Hui Zhang, Inner Mongolia University; Rui Liu, Inner Mongolia University; Xueliang Zhang, Inner Mongolia University; Feilong Bao, Inner Mongolia University
          • Syntactic Structure and Communicative Function of Echo Questions in Chinese Dialogues
            Gan Huang, Chinese Academy of Social Sciences; Lin Zhu, Beijing International Studies University; Aijun Li, Chinese Academy of Social Sciences
        • Demos
          • An Automated Assessment Tool for Child Speech Disorders
            Si Ioi Ng, The Chinese University of Hong Kong; Dehua Tao, The Chinese University of Hong Kong; Jiarui Wang, The Chinese University of Hong Kong; Yi Jiang, The Chinese University of Hong Kong; Wing Yee Ng, The Chinese University of Hong Kong; Tan Lee, The Chinese University of Hong Kong
          • Hearing aids APP design based on deep learning technology
            Ji Yan Han, National Yang Ming University; Wei Zhong Zheng, National Yang Ming University; Ren Jie Huan, National Yang Ming University; Yu Tsao, Academia Sinica; Ying-Hui Lai, National Yang Ming University
          • iOS-based Ear Scale application for Clinical Audiology and Otology Usage
            Wen-Huei Liao, Taipei Veterans General Hospital; Pei-Chun Li, Mackay Medical College; Shuenn Tsong Young, Mackay Medical College; Ying-Hui Lai, National Yang-Ming University; Yu Tsao, Academia Sinica
          • Voice Conversion Challenge 2018
            Zhen-Hua Ling, University of Science and Technology of China; Junichi Yamagishi, National Institute of Informatics & University of Edinburgh; Jaime Lorenzo-Trueba, National Institute of Informatics; Tomoki Toda, Nagoya University; Daisuke Saito, University of Tokyo; Fernando Villavicencio, ObEN; Tomi Kinnunen, University of Eastern Finland
      • 16:30-18:30
        • O-3: Deep Learning for Speech and Language Processing I
          • Disordered Speech Assessment Using Kullback-Leibler Divergence Features with Multi-Task Acoustic Modeling
            Yuanyuan Liu, The Chinese University of Hong Kong; Ying Qin, The Chinese University of Hong Kong; Siyuan Feng, The Chinese University of Hong Kong; Tan Lee, The Chinese University of Hong Kong; P.-C. Ching, The Chinese University of Hong Kong
          • An End-to-End Approach to Automatic Speech Assessment for People with Aphasia
            Ying Qin, The Chinese University of Hong Kong; Tan Lee, The Chinese University of Hong Kong; Yuzhong Wu, The Chinese University of Hong Kong; Anthony Pak Hin Kong, University of Central Florida
          • Non-intrusive Speech Quality Assessment Using Deep Belief Network and Backpropagation Neural Network
            Yahui Shan, Beijing Institute of Technology; Jing Wang, Beijing Institute of Technology; Xiang Xie, Beijing Institute of Technology; Liuchen Meng, Beijing Institute of Technology; Jingming Kuang, Beijing Institute of Technology
          • A Progressive Deep Learning Approach to Child Speech Separation
            Xin Wang, University of Science and Technology of China; Jun Du, University of Science and Technology of China; Lei Sun, University of Science and Technology of China; Qing Wang, University of Science and Technology of China; Chin-Hui Lee, Georgia Institute of Technology
          • Convolutional Neural Turing Machine for Speech Separation
            Jen-Tzung Chien, National Chiao Tung University; Kai-Wei Tsou, National Chiao Tung University
          • Multilingual Speech Recognition Training and Adaptation with Language-Specific Gate Units
            Danyang Liu, Institute of Acoustics, University of Chinese Academy of Sciences; Xinxin Wan, The National Computer Network Emergency Response Technical Team/Coordination Center of China; Ji Xu, Institute of Acoustics; Pengyuan Zhang, Institute of Acoustics, University of Chinese Academy of Sciences
        • O-4: Corpus-Based Linguistics and Education
          • Acquisition of English Tense-lax Vowels by Chinese EFL Learners from Different Dialectal Regions
            Yuan Jia, Chinese Academy of Social Sciences; Cuiping Li, Chinese Academy of Social Sciences, Hunan University
          • An Acoustic Study of English Monophthongs Acquisition by Chinese EFL Learners from Northeast Region
            Yuan Jia, Chinese Academy of Social Sciences; Huimin Zhang, Shaanxi Normal University
          • Chinese EFL Learners' Acquisition of English Monophthongs – A Typological Study of Fuzhou, Ningbo, and Beijing
            Yuan Jia, Chinese Academy of Social Sciences; Xinyin Sun, Shaanxi Normal University
          • An Empirical Study of English Vowels Acquisition of EFL Learners in Tianjin and Zibo
            Bin Li, Graduate School of Chinese Academy of Social Sciences; Yuan Jia, Chinese Academy of Social Sciences
          • A Refined Query-by-Example Approach to Spoken-Term-Detection on ESL Learners' Speech
            Jingyong Hou, Northwestern Polytechnical University; Wenping Hu, Microsoft Research Asia; Frank K. Soong, Microsoft Research Asia; Lei Xie, Northwestern Polytechnical University
          • Improve the Accuracy of Non-native Speech Annotation with a Semi-automatic Approach
            Wei Wang, Beijing Advanced Innovation Center for Language Resources, Xinjiang University; Wei Wei, Beijing Advanced Innovation Center for Language Resources; Yanlu Xie, Beijing Advanced Innovation Center for Language Resources; Minghao Guo, Beijing Advanced Innovation Center for Language Resources; Jinsong Zhang, Beijing Advanced Innovation Center for Language Resources

    • Wednesday, 28 November, 2018
      • 09:00-10:00
      • 10:30-12:30
        • O-5: Speech Recognition I
          • Data Augmentation Using Conditional Generative Adversarial Networks for Robust Speech Recognition
            Peiyao Sheng, Shanghai Jiao Tong University; Zhuolin Yang, Shanghai Jiao Tong University; Hu Hu, Shanghai Jiao Tong University; Tian Tan, Shanghai Jiao Tong University; Yanmin Qian, Shanghai Jiao Tong University
          • Improving Gated Recurrent Unit Based Acoustic Modeling with Batch Normalization and Enlarged Context
            Jie Li, Kwai; Yahui Shan, Beijing Institute of Technology; Xiaorui Wang, Kwai; Yan Li, Kwai
          • Gated Module Neural Network for Multilingual Speech Recognition
            Yuan-Fu Liao, National Taipei University of Technology; Matus Pleva, Technical University of Kosice; Daniel Hladek, Technical University of Kosice; Jan Stas, Technical University of Kosice; Peter Viszlay, Technical University of Kosice; Martin Lojka, Technical University of Kosice; Jozef Juhár, Technical University of Kosice
          • Subspace Based Sequence Discriminative Training of LSTM Acoustic Models with Feed-Forward Layers
            Lahiru Samarakoon, Fano Labs; Brian Mak, The Hong Kong University of Science and Technology; Albert Y.S. Lam, Fano Labs
          • WaveNet MH-SRU: Deep and Wide Multiple-history Simple Recurrent Unit for Speech Recognition
            Hengguang Huang, The Hong Kong University of Science and Technology; Brian Mak, The Hong Kong University of Science and Technology
          • Hybrid CTC-Attention based End-to-End Speech Recognition using Subword Units
            Zhangyu Xiao, Tsinghua University; Zhijian Ou, Tsinghua University; Wei Chu, Liulishuo; Hui Lin, Liulishuo
        • O-6: Speech Analysis and Assessment
          • Combining Phase-based Features for Replay Spoof Detection System
            Kantheti Srinivas, Dhirubhai Ambani Institute of Information and Communication Technology (DA-IICT); Rohan Kumar Das, National University of Singapore; Hemant A. Patil, Dhirubhai Ambani Institute of Information and Communication Technology (DA-IICT)
          • Pitch Synchronized Relative Phase with Peak Error Detection For Noise-robust Speaker Recognition
            Meng Ge, Tianjin University; Longbiao Wang, Tianjin University; Seiichi Nakagawa, Toyohashi University of Technology; Yuta Kawakami, Nagaoka University of Technology; Jianwu Dang, Tianjin University, Japan Advanced Institute of Science and Technology; Xiangang Li, Didi Chuxing
          • Visual Information Affects Auditory Frequency Discrimination with Random Stimulus Sequences: Evidence from ERPs
            Lei Wang, Southern University of Science and Technology, The University of Hong Kong; Fei Chen, Southern University of Science and Technology
          • Investigation of the Comprehension Process during Silent Reading based on Eye Movements
            Di Zhou, Japan Advanced Institute of Science and Technology; Jinfeng Huang, Japan Advanced Institute of Science and Technology; Jianwu Dang, Japan Advanced Institute of Science and Technology, Tianjin University
          • A Multi-modal Soft Targets Approach for Pronunciation Erroneous Tendency Detection
            Ju Lin, Beijing Language and Culture University; Wei Zhang, Beijing Language and Culture University; Linxuan Wei, Beijing Language and Culture University; Yanlu Xie, Beijing Language and Culture University; Jinsong Zhang, Beijing Language and Culture University
          • A Study on Landmark Verification of Mandarin Alveolar-palatal Consonants
            Zhenyu Wang, Beijing Language and Culture University; Qi Zhang, Beijing Language and Culture University; Shuang Zheng, Beijing Language and Culture University; Jinsong Zhang, Beijing Language and Culture University; Yanlu Xie, Beijing Language and Culture University
      • 13:30-15:30
        • Panel: Chinese Spoken Language Processing - Retrospections and Future Prospects
          Moderator: Prof. Lin-shan Lee, National Taiwan University
          Panelists: Prof. Chin-Hui Lee, Georgia Institute of Technology; Prof. Haizhou Li, National University of Singapore; Prof. Helen Meng, The Chinese University of Hong Kong; Prof. Chiu-yu Tseng, Academia Sinica; Dr. Frank Soong, Microsoft Research Asia; Dr. Yiqing Zu, iFLYTEK Research; Prof. Jianhua Tao, Chinese Academy of Sciences

    • Thursday, 29 November, 2018
      • 09:00-10:00
      • 10:30-12:30
        • O-7: Speaker Recognition
          • DNN i-vector based Fishervoice and PLDA SVM scoring for NIST SRE 2016
            Jinghua Zhong, The Chinese University of Hong Kong; Helen Meng, The Chinese University of Hong Kong
          • Novel Amplitude Weighted Frequency Modulation Features for Replay Spoof Detection
            Madhu R. Kamble, Dhirubhai Ambani Institute of Information and Communication Technology (DA-IICT); Hemant A. Patil, Dhirubhai Ambani Institute of Information and Communication Technology (DA-IICT)
          • Angular Softmax Loss for End-to-end Speaker Verification
            Yutian Li, Tsinghua University; Feng Gao, Tsinghua University; Zhijian Ou, Tsinghua University; Jiasong Sun, Tsinghua University
          • Deep Discriminant Analysis for i-vector Based Robust Speaker Recognition
            Shuai Wang, Shanghai Jiao Tong University; Zili Huang, Shanghai Jiao Tong University; Yanmin Qian, Shanghai Jiao Tong University; Kai Yu, Shanghai Jiao Tong University
          • Exploring a Unified Attention-Based Pooling Framework for Speaker Verification
            Yi Liu, Tsinghua University; Liang He, Tsinghua University; Weiwei Liu, Chinese People’s Liberation Army; Jia Liu, Tsinghua University
          • Generative Adversarial Networks based X-vector Augmentation for Robust Probabilistic Linear Discriminant Analysis in Speaker Verification
            Yexin Yang, Shanghai Jiao Tong University; Shuai Wang, Shanghai Jiao Tong University; Man Sun, Shanghai Jiao Tong University; Yanmin Qian, Shanghai Jiao Tong University; Kai Yu, Shanghai Jiao Tong University
        • O-8: Speech Prosody, Production and Perception
          • Emphasis Detection for Voice Dialogue Applications Using Multi-channel Convolutional Bidirectional Long Short-Term Memory Network
            Long Zhang, Tsinghua University; Jia Jia, Tsinghua University; Fanbo Meng, Beijing Sogou Technology Co. Ltd.; Suping Zhou, Tsinghua University; Wei Chen, Beijing Sogou Technology Co. Ltd.; Cunjun Zhang, Tsinghua University, Beijing Sogou Technology Co. Ltd.; Runnan Li, Tsinghua University
          • Topic and Prosody Interaction in Chinese Discourse
            Yuan Jia, Chinese Academy of Social Sciences; Xiaoxiao Ma, Chinese Academy of Social Sciences, Nankai University
          • Measuring Prosodic Transfer in Vector Space by Weighted Tonal Events
            Xuanda Chen, Chinese Academy of Social Sciences; Yuan Jia, Chinese Academy of Social Sciences; Ziyu Xiong, Chinese Academy of Social Sciences
          • An ERP Study to Evaluate the Quality of Speech Processed by Wiener Filtering
            Fang Yu, Southern University of Science and Technology; Chin-Tuan Tan, The University of Texas at Dallas; Fei Chen, Southern University of Science and Technology
          • Estimation of glottal source waveforms and vocal tract shapes from speech signals based on ARX-LF model
            Yongwei Li, Japan Advanced Institute of Science and Technology; Ken-Ichi Sakakibara, Health Science University of Hokkaido; Masato Akagi, Japan Advanced Institute of Science and Technology
          • The DKU-JNU-EMA Electromagnetic Articulography Database on Mandarin and Chinese Dialects with Tandem Feature based Acoustic-to-Articulatory Inversion
            Zexin Cai, Duke Kunshan University; Xiaoyi Qin, Sun Yat-sen University; Danwei Cai, Duke Kunshan University; Ming Li, Duke Kunshan University; Xinzhong Liu, Jinan University; Haibin Zhong, Jiangsu Jinling Science and Technology Group Limited
      • 14:00-16:00
        • O-9: Deep Learning for Speech and Language Processing II
          • GTDNN-Based Voice Conversion Using DAEs with Binary Distributed Hidden Units
            Yi-Yang Ding, University of Science and Technology of China; Ya-Jun Hu, University of Science and Technology of China; Zhen-Hua Ling, University of Science and Technology of China
          • Unsupervised query by example spoken term detection using features concatenated with Self-Organizing Map distances
            Haiwei Wu, Sun Yat-sen University; Ming Li, Duke Kunshan University; Zexin Cai, Duke Kunshan University; Haibin Zhong, Jiangsu Jinling Science and Technology Group Limited
          • Multi-Head Attention for End-to-End Neural Machine Translation
            Ivan Fung, The Hong Kong University of Science and Technology; Brian Mak, The Hong Kong University of Science and Technology
          • Unusable Spoken Response Detection with BLSTM Neural Networks
            Zhaoheng Ni, City University of New York; Rutuja Ubale, Educational Testing Service Research; Yao Qian, Educational Testing Service Research; Michael Mandel, City University of New York; Su-Youn Yoon, Educational Testing Service Research; Abhinav Misra, Educational Testing Service Research; David Suendermann-Oeft, Educational Testing Service Research
          • Speech Super-Resolution Using Parallel WaveNet
            Mu Wang, Tsinghua University; Zhiyong Wu, Tsinghua University, The Chinese University of Hong Kong; Shiyin Kang; Xixin Wu; Jia Jia, Tsinghua University; Dan Su; Dong Yu; Helen Meng, Tsinghua University, The Chinese University of Hong Kong
          • Speech Emotion Recognition using Convolutional Neural Network with Audio Word-based Embedding
            Kun-Yi Huang, National Cheng Kung University; Chung-Hsien Wu, National Cheng Kung University; Qian-Bei Hong, National Cheng Kung University, Academia Sinica; Ming-Hsiang Su, National Cheng Kung University; Yuan-Rong Zeng, National Cheng Kung University
        • O-10: Spoken Language Technology
          • Formosa Speech Recognition Challenge 2018: Data, Plan and Baselines
            Yuan-Fu Liao, National Taipei University of Technology; Wu-Hua Hsu, National Taipei University of Technology; Yu-Chen Lin, National Taipei University of Technology; Yung-Hsiang Shawn Chang, National Taipei University of Technology; Matus Pleva, Technical University of Kosice; Jozef Juhár, Technical University of Kosice; Guang-Feng Deng, Institute for Information Industry
          • CLMAD: A Chinese Language Model Adaptation Dataset
            Ye Bai, Chinese Academy of Sciences; Jianhua Tao, Chinese Academy of Sciences; Jiangyan Yi, Chinese Academy of Sciences; Zhengqi Wen, Chinese Academy of Sciences; Cunhang Fan, Chinese Academy of Sciences
          • From Speech Signals to Semantics – Tagging Performance at Acoustic, Phonetic and Word Levels
            Yao Qian, Educational Testing Service Research; Rutuja Ubale, Educational Testing Service Research; Patrick Lange, Educational Testing Service Research; Keelan Evanini, Educational Testing Service Research; Frank Soong, Microsoft Research Asia
          • Using Dempster-Shafer Evidence Theory for Dialog State Tracking
            Minglu Liu, iFLYTEK Co., Ltd.; Miao Li, Tsinghua University; Ji Wu, Tsinghua University; Xiangling Fu; Ji Gao, Beijing University of Posts and Telecommunications
          • Prediction of Voice Disorder Severity: Contributions from Sustained Vowels and Continuous Speech
            Yuanyuan Liu, The Chinese University of Hong Kong; Tan Lee, The Chinese University of Hong Kong; Thomas Law, The Chinese University of Hong Kong; Kathy Lee, The Chinese University of Hong Kong; P.-C. Ching, The Chinese University of Hong Kong
          • A Maximum Likelihood Approach to Masking-based Speech Enhancement Using Deep Neural Network
            Qing Wang, University of Science and Technology of China; Jun Du, University of Science and Technology of China; Li Chai, University of Science and Technology of China; Li-Rong Dai, University of Science and Technology of China; Chin-Hui Lee, Georgia Institute of Technology