Business requirement:

  • Extract question-answer pairs from human customer-service chat logs and load them into the bot's knowledge base

Survey:

Industry approach 1: extractive:
  • QA matching: takes the question as the starting point, i.e., assumes the question is already identified and finds its answer in the dialogue context (mainly the turns that follow), forming a QA pair
    • 【ECAI-2020】Matching Questions and Answers in Dialogues from Online Forums:https://arxiv.org/pdf/2005.09276.pdf
    • 【LREC-2020】Cross-sentence Pre-trained Model for Interactive QA matching:https://www.aclweb.org/anthology/2020.lrec-1.666/
  • conversation structure modeling (dialogue structure analysis / conversation structure discovery / conversation disentanglement / discourse parsing): takes the answer as the starting point, i.e., assumes the answer is already identified and finds the question it responds to (the reply-to relation) in the preceding context, forming a QA pair; most current research targets multi-party dialogue
    two-party dialogue
    • 【AAAI-2017】Discovering Conversational Dependencies between Messages in Dialogs
    • 【AAAI-2019】Learning to Align Question and Answer Utterances in Customer Service Conversation with Recurrent Pointer Networks:https://ojs.aaai.org/index.php/AAAI/article/view/3778
    multi-party dialogue
    • 【NAACL-2018】Learning to Disentangle Interleaved Conversational Threads with a Siamese Hierarchical Network and Similarity Ranking:multi-party, SHCNN, CISIR
    • 【AAAI-2019】A Deep Sequential Model for Discourse Parsing on Multi-Party Dialogues:multi-party
    • 【NAACL-2019】Context-Aware Conversation Thread Detection in Multi-Party Chat:multi-party
    • 【AAAI-2020】Who Did They Respond to? Conversation Structure Modeling Using Masked Hierarchical Transformer:https://ojs.aaai.org/index.php/AAAI/article/view/6524
Industry approach 2: generative (not pursued for now):
  • Papers
    【ACL-2020】Generating Diverse and Consistent QA pairs from Contexts with Information-Maximizing Hierarchical Conditional VAEs:https://www.aclweb.org/anthology/2020.acl-main.20/
    【ACL-2020】Harvesting and Refining Question-Answer Pairs for Unsupervised QA:https://www.aclweb.org/anthology/2020.acl-main.600/
    【EMNLP-2020】QADiscourse - Discourse Relations as QA Pairs: Representation, Crowdsourcing and Baselines:https://www.aclweb.org/anthology/2020.emnlp-main.224/
    【ACL-2020】A Smart System to Generate and Validate Question Answer Pairs for COVID-19 Literature:https://www.aclweb.org/anthology/2020.sdp-1.4.pdf
    【AAAI-2020】On the Generation of Medical Question-Answer Pairs:https://ojs.aaai.org/index.php/AAAI/article/view/6410
    【ACL-2016】QA-It: Classifying Non-Referential It for Question Answer Pairs:https://www.aclweb.org/anthology/P16-3020
    【NAACL-2016】Watson Discovery Advisor: Question-answering in an industrial setting:https://www.aclweb.org/anthology/W16-0101.pdf
    【ACL-2019】Synthetic QA Corpora Generation with Roundtrip Consistency:https://www.aclweb.org/anthology/P19-1620.pdf
    【EMNLP-2017】Question Generation for Question Answering:https://www.aclweb.org/anthology/D17-1090.pdf
    【ACL-2018】Neural Models for Key Phrase Extraction and Question Generation:https://www.aclweb.org/anthology/W18-2609.pdf

############################################################################################

【ECAI-2020】Matching Questions and Answers in Dialogues from Online Forums:https://arxiv.org/pdf/2005.09276.pdf

Abstract. Matching question-answer relations between two turns in conversations is not only the first step in analyzing dialogue structures, but also valuable for training dialogue systems. This paper presents a QA matching model considering both distance information and dialogue history by two simultaneous attention mechanisms called mutual attention. Given scores computed by the trained model between each non-question turn with its candidate questions, a greedy matching strategy is used for final predictions. Because existing dialogue datasets such as the Ubuntu dataset are not suitable for the QA matching task, we further create a dataset with 1,000 labeled dialogues and demonstrate that our proposed model outperforms the state-of-the-art and other strong baselines, particularly for matching long-distance QA pairs.

  1. Challenges:

    • mix-matched QA pairs
    • incremental QA
    • one question with multiple answers, multiple questions with one answer, many questions with many answers, and questions with no answer
    • long-distance QA pairs
  2. Problem definition:

    • QA matching: match each question in a dialogue with its corresponding answer (which may span multiple utterances);
    • The paper assumes a sufficiently good classifier already separates questions (Q) from non-question turns (NQ), so the problem it tackles is: given a Q, find its corresponding NQ(s) (possibly more than one);
  3. Model

    • Sentence representation
      • Q and NQ take the hidden state of every word; the histories H(Q) and H(NQ) take the hidden state of the last word
    • Mutual attention
      • For a Q-NQ pair and the corresponding H(Q)-H(NQ): attention between Q and H(NQ) yields Q' (the question representation), and attention between NQ and H(Q) yields NQ' (the answer representation)
    • Match-LSTM
      • Attention from Q' over each NQ'(i) is fed into an LSTM; the hidden state at the last word serves as the joint representation P of (Q, NQ, H(Q), H(NQ))
    • Prediction layer
      • Binary classification
      • P is concatenated with a one-hot distance encoding d and fed into an FC layer
    • Greedy matching
      • For each NQ, take the Q with the highest score, provided that score exceeds 0.5 (see the sketch after this list)
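
A minimal sketch of the greedy matching step (assuming the pairwise scores have already been computed by the trained model; `greedy_match` is an illustrative helper, not the authors' released code):

```python
import numpy as np

def greedy_match(scores, threshold=0.5):
    """For each non-question turn NQ_i, pick the candidate question Q_j
    with the highest matching score, but only if that score exceeds the
    threshold; otherwise NQ_i stays unmatched (None)."""
    pairs = {}
    for i, row in enumerate(scores):
        j = int(np.argmax(row))
        pairs[i] = j if row[j] > threshold else None
    return pairs

# Toy example: 2 non-question turns, 3 candidate questions each.
scores = np.array([[0.1, 0.8, 0.3],    # NQ_0 matches Q_1
                   [0.2, 0.4, 0.45]])  # no score > 0.5 -> unmatched
print(greedy_match(scores))            # {0: 1, 1: None}
```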
  4. Results

  5. Open-source code: https://github.com/JiaQiSJTU/QAmatching

############################################################################################

【AAAI-2019】Learning to Align Question and Answer Utterances in Customer Service Conversation with Recurrent Pointer Networks:https://ojs.aaai.org/index.php/AAAI/article/view/3778

Abstract. Customers ask questions, and customer service staffs answer those questions. It is the basic service manner of customer service (CS). The progress of CS is a typical multi-round conversation. However, there are no explicit corresponding relations among conversational utterances. This paper focuses on obtaining explicit alignments of question and answer utterances in CS. It not only is an important task of dialogue analysis, but also able to obtain lots of valuable train data for learning dialogue systems. In this work, we propose end-to-end models for aligning question (Q) and answer (A) utterances in CS conversation with recurrent pointer networks (RPN). On the one hand, RPN-based alignment models are able to model the conversational contexts and the mutual influence of different Q-A alignments. On the other hand, they are able to address the issue of empty and multiple alignments for some utterances in a unified manner. We construct a dataset from an in-house online CS. The experimental results demonstrate that the proposed models are effective to learn the alignments of question and answer utterances.

  1. Introduction

    • Human customer-service conversations carry no explicit corresponding relations between utterances, which shows up as one-to-many alignments and utterances aligned to nothing (None)

    • The simplest approach to this problem is content matching: find the question that best matches each of the server's answers. Several problems remain:

      • (Du, Poupart, and Xu 2017; Jiang et al. 2018) classify each QA pair in isolation, without using the context around it

      • Questions can be similar (or dissimilar) to one another: from "Q1 is similar to Q2" and "Q1 is answered by A1", one can infer that Q2 is (or is not) also answered by A1

      • Empty alignments (None) and one-to-many alignments still occur

  2. Task description

    • Besides the customer-asks / agent-answers pattern, human conversations also contain agent-asks / customer-answers exchanges; this paper focuses on the former
    • Challenges:
      • One answer may align to multiple questions
      • Chit-chat aligns to no question at all
      • There are many different alignment types
  3. Method

    • Three components:
      • Utterance encoder
        • A CNN encodes each sentence
        • Output: the CNN encoding + a low-dimensional role embedding
      • Conversation encoder
        • An answer only needs to consider the questions before it, so a unidirectional RNN is the natural choice
        • Since customer questions and agent answers carry different kinds of knowledge, two separate RNNs encode customer utterances and server utterances, as shown on the right side of the model structure above
        • Model-II: the customer encoder's input combines the customer utterance encoder's output with the hidden state of the conversation decoder history; the server utterance encoder is handled the same way
      • Alignment decoder (see the sketch after this list)
        • Only server utterances are decoded
        • To handle the None case, the model adds a "None" placeholder

        • Classification
          • At each server answer, softmax(a_j) gives a probability per candidate question; the highest-probability customer question, plus any whose probability is within a given threshold of the maximum, form the alignment result
        • Regression
          • sigmoid(a_ij) gives a probability p_ij; if p_ij > 0.5, question i is taken as aligned to answer j
      • Training
        • A seq2seq-style loss summed over all M conversations (M is the total number of conversations)
        • Classification variant: cross-entropy loss
        • Regression variant: L2-norm loss
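
A minimal sketch of the two decoding rules above (illustrative only; the margin and threshold values are assumptions, and the scoring model itself is omitted):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def decode_classification(logits, margin=0.1):
    """Classification variant: softmax over candidate customer questions,
    with the last slot standing for the "None" placeholder. Keep every
    question whose probability is within `margin` of the maximum, which
    allows one-to-many alignments."""
    p = softmax(np.asarray(logits, dtype=float))
    if int(np.argmax(p)) == len(p) - 1:
        return []                      # chit-chat: aligned to no question
    return [j for j in range(len(p) - 1) if p.max() - p[j] <= margin]

def decode_regression(scores, threshold=0.5):
    """Regression variant: an independent sigmoid per candidate question;
    question i is aligned to this answer whenever sigmoid(a_ij) > 0.5."""
    p = 1.0 / (1.0 + np.exp(-np.asarray(scores, dtype=float)))
    return [i for i, pi in enumerate(p) if pi > threshold]

# Toy example: 3 candidate questions plus a trailing "None" slot.
print(decode_classification([2.0, 1.95, 0.3, 0.1]))  # -> [0, 1]
print(decode_regression([1.2, -0.4, 0.7]))           # -> [0, 2]
```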
  4. Results

############################################################################################

【AAAI-2020】Who Did They Respond to? Conversation Structure Modeling Using Masked Hierarchical Transformer:https://ojs.aaai.org/index.php/AAAI/article/view/6524

Abstract. Conversation structure is useful for both understanding the nature of conversation dynamics and for providing features for many downstream applications such as summarization of conversations. In this work, we define the problem of conversation structure modeling as identifying the parent utterance(s) to which each utterance in the conversation responds to. Previous work usually took a pair of utterances to decide whether one utterance is the parent of the other. We believe the entire ancestral history is a very important information source to make accurate prediction. Therefore, we design a novel masking mechanism to guide the ancestor flow, and leverage the transformer model to aggregate all ancestors to predict parent utterances. Our experiments are performed on the Reddit dataset (Zhang, Culbertson, and Paritosh 2017) and the Ubuntu IRC dataset (Kummerfeld et al. 2019). In addition, we also report experiments on a new larger corpus from the Reddit platform and release this dataset. We show that the proposed model, that takes into account the ancestral history of the conversation, significantly outperforms several strong baselines including the BERT model on all datasets.

  1. Introduction
    • Problem: given the content of a conversation, analyze its structure
    • The structure of a multi-party conversation can be viewed as "reply_to" relations over pairs of messages
  2. Related work
    • Next utterance prediction:
      • Generate the next utterance from the utterance history
      • Retrieve the next utterance from a set of candidate responses, given the utterance history
    • Conversation disentanglement
      • The "reply_to" problem
    • Discourse parsing
      • Treats relations such as "question-answer" as a classification problem
  3. Model
    • Problem: given utterance S_l in a conversation, find which of the preceding utterances S_0, S_1, …, S_{l-1} is S_l's parent utterance
    • Essentially pair classification
    • Masked Hierarchical Transformer model
      • FC layer shape: 1x300; its output is one-dimensional
      • Mask matrix: an L x L matrix where M_ij = 1 means utterance i can attend to utterance j, and 0 means no attention is allowed (see the sketch after this list)
      • Two-stage training: the shared utterance encoder is a pre-trained BERT, while the masked transformer is randomly initialized. Training everything at once with a large learning rate causes catastrophic forgetting in the pre-trained BERT, while a learning rate small enough to avoid that makes training far too slow. Stage one therefore freezes the shared utterance encoder and trains the masked transformer with a large learning rate; stage two fine-tunes the whole model with a small learning rate
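
One plausible construction of the ancestor mask described above (a sketch of the masking idea, not the authors' implementation; whether each utterance also attends to itself is an assumption here):

```python
import numpy as np

def ancestor_mask(parents):
    """Build the L x L mask: M[i][j] = 1 iff utterance j is an ancestor of
    utterance i along the reply-to chain, so each utterance attends only
    to its ancestral history. `parents[i]` is the parent index of
    utterance i, or None for a root."""
    L = len(parents)
    M = np.zeros((L, L), dtype=int)
    for i in range(L):
        M[i][i] = 1                    # assumed: attend to itself
        j = parents[i]
        while j is not None:           # walk up the reply-to chain
            M[i][j] = 1
            j = parents[j]
    return M

# Thread: u1 replies to u0, u2 replies to u1, u3 replies to u0.
print(ancestor_mask([None, 0, 1, 0]))
```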
  4. Results:
  5. Ablation study
  6. Gap between training and prediction:
    • Sampling
    • Beam search instead of greedy decoding (see the sketch below)
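
A minimal sketch of the beam-search remedy (the gap: training masks use gold parents, prediction must use the model's own; `score_fn` and `toy_scores` are hypothetical stand-ins for the model's parent scorer):

```python
import math

def beam_search_parents(score_fn, num_utts, beam_size=3):
    """Predict parents utterance by utterance, keeping the `beam_size`
    highest-scoring partial structures instead of greedily committing to
    the single best parent at each step. `score_fn(parents_so_far, i)`
    returns {candidate_parent: log_prob} for utterance i."""
    beams = [([None], 0.0)]                 # utterance 0 is the root
    for i in range(1, num_utts):
        candidates = []
        for parents, lp in beams:
            for parent, logp in score_fn(parents, i).items():
                candidates.append((parents + [parent], lp + logp))
        beams = sorted(candidates, key=lambda b: b[1], reverse=True)[:beam_size]
    return beams[0][0]                      # best full parent assignment

# Toy scorer: every earlier utterance is an equally likely parent.
def toy_scores(parents_so_far, i):
    return {j: math.log(1.0 / i) for j in range(i)}

print(beam_search_parents(toy_scores, 4))   # e.g. [None, 0, 0, 0]
```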

############################################################################################

【ACL-2020】Generating Diverse and Consistent QA pairs from Contexts with Information-Maximizing Hierarchical Conditional VAEs

Abstract. One of the most crucial challenges in question answering (QA) is the scarcity of labeled data, since it is costly to obtain question-answer (QA) pairs for a target text domain with human annotation. An alternative approach to tackle the problem is to use automatically generated QA pairs from either the problem context or from large amount of unstructured texts (e.g. Wikipedia). In this work, we propose a hierarchical conditional variational autoencoder (HCVAE) for generating QA pairs given unstructured texts as contexts, while maximizing the mutual information between generated QA pairs to ensure their consistency. We validate our Information Maximizing Hierarchical Conditional Variational AutoEncoder (InfoHCVAE) on several benchmark datasets by evaluating the performance of the QA model (BERT-base) using only the generated QA pairs (QA-based evaluation) or by using both the generated and human-labeled pairs (semi-supervised learning) for training, against state-of-the-art baseline models. The results show that our model obtains impressive performance gains over all baselines on both tasks, using only a fraction of data for training.

  1. Introduction
    • Problem 1: the one-to-many problem
      • Current QG systems share a weakness: the context contains a great deal of information, yet a seq2seq model can only generate generic questions (the one-to-many problem)
      • To address this, the paper proposes a hierarchical conditional VAE (HCVAE): question and answer each get their own latent space, with the answer's latent space conditioned on the question's
      • Generate the answer first, then generate the question from the answer and the context; focusing on different parts of the context each time produces diverse questions
    • Problem 2: another QG challenge is consistency between the generated question and answer
      • A question generated from an answer and context must be answerable by that answer given that context
      • The paper enforces this by maximizing the mutual information between the generated answer and question
  2. Related work
    • QG & QAG
      • Adding the paragraph as supplementary information
      • Generating questions with reinforcement learning
      • Generating questions with pre-trained models
    • Semi-supervised QA with QG
    • Variational autoencoders
      • Used for language modeling, dialogue generation, and machine translation
  3. Method
    • Model P(x, y | c), where c is the context, x the question, and y the answer (one plausible factorization is sketched below)
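
To make the objective concrete, here is one plausible factorization consistent with the notes above (question latent z_x, answer latent z_y, answer decoded before the question; the exact dependency structure should be checked against the paper):

```latex
% Generative story: sample a question latent from the context, an answer
% latent conditioned on it, decode the answer, then decode the question
% from the answer, the question latent, and the context.
p(x, y \mid c) = \iint p(x \mid y, z_x, c)\, p(y \mid z_y, c)\,
                 p(z_y \mid z_x, c)\, p(z_x \mid c)\, \mathrm{d}z_x\, \mathrm{d}z_y
```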

  4. Experiments
    • Evaluation: QAE / R-QAE. QAE trains a QA model only on the generated pairs and evaluates it on human-labeled data; R-QAE does the reverse. High QAE together with low R-QAE indicates good (accurate yet novel) generated QA pairs, as sketched below
    • Results
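
A minimal sketch of how the two metrics pair up (`train_qa` and `evaluate_em_f1` are hypothetical stand-ins for a QA training/evaluation pipeline, not code from the paper):

```python
def qae(train_qa, evaluate_em_f1, synthetic_pairs, human_dev_set):
    """QAE: train a QA model only on generated pairs, test on human-labeled
    data; higher means the pairs teach real QA ability."""
    model = train_qa(synthetic_pairs)
    return evaluate_em_f1(model, human_dev_set)

def r_qae(train_qa, evaluate_em_f1, human_train_set, synthetic_pairs):
    """R-QAE (reverse): train on human-labeled data, test on the generated
    pairs; lower suggests the generated pairs are novel rather than
    trivial restatements of what a human-data model already answers."""
    model = train_qa(human_train_set)
    return evaluate_em_f1(model, synthetic_pairs)
```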
