Abstract

Task: social relation extraction (SRE), which aims to infer the social relation between two people.
Method: FL-MSRE, a few-shot learning based approach to extracting social relations from both texts and face images.
Datasets: three multimodal datasets annotated from four classical masterpieces and their corresponding TV series.

Introduction

SRE is of great value in practice: it can capture social connections and help machines better understand human behaviors.
The paper sets out to clarify two questions:
(1) Can introducing face image information into a text-based model improve the performance for SRE?
(2) Can facial features extracted from different images achieve similar performance as those extracted from the same image?
Contributions:
(1) Present multimodal social relation datasets.
(2) Propose a novel approach, FL-MSRE, for SRE.
(3) Extensive experiments demonstrate......

Related Work

Task:
Previous work: unimodal (text only or image only) -> ours: multimodal (both text and image).
Previous work: multimodal (with only one image) -> ours: multimodal (a list of images).
Few-shot learning: prototypical network.
Multimodal learning: omitted.

Multimodal Social Relation Datasets

Omitted.

The Proposed Approach FL-MSRE

Problem Formulation

Every entity e consists of two parts: a bounding box b_e = (x1, y1, x2, y2) and a character name c_e.
The task follows the N-way K-shot setting.
Input tuple (s, h, t, g_h, g_t, r): s is the sentence, h the head entity, t the tail entity, g_h the image containing the face of h, g_t the image containing the face of t, and r the relation between h and t.

Overview

Multimodal Encoder

Sentence Encoder

Prototypical Network
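The problem formulation and the prototypical-network step noted above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation. It assumes each instance has already been turned into a feature vector by some encoder (not shown), and it uses squared Euclidean distance, the usual choice for prototypical networks; these notes do not record which metric the paper uses. The class names `Entity` and `Instance`, the file names, and the relation labels are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2)


@dataclass
class Entity:
    name: str  # character name c_e
    box: Box   # face bounding box b_e


@dataclass
class Instance:
    sentence: str    # s
    head: Entity     # h
    tail: Entity     # t
    head_image: str  # g_h: image containing the face of h (path is an assumption)
    tail_image: str  # g_t: image containing the face of t
    relation: str    # r


def mean_vector(vectors: List[List[float]]) -> List[float]:
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def squared_distance(a: List[float], b: List[float]) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_prototypes(support: Dict[str, List[List[float]]]) -> Dict[str, List[float]]:
    """One prototype per relation: the mean of its K support vectors."""
    return {rel: mean_vector(vecs) for rel, vecs in support.items()}


def classify(query: List[float], prototypes: Dict[str, List[float]]) -> str:
    """Assign the query to the relation with the nearest prototype."""
    return min(prototypes, key=lambda rel: squared_distance(query, prototypes[rel]))


# One input tuple (s, h, t, g_h, g_t, r) with made-up values.
example = Instance(
    sentence="A and B are married.",
    head=Entity(name="A", box=(10, 20, 60, 80)),
    tail=Entity(name="B", box=(15, 25, 70, 90)),
    head_image="frame_001.jpg",
    tail_image="frame_002.jpg",
    relation="couple",
)

# Toy 2-way 2-shot episode; the vectors stand in for encoder outputs.
support = {
    "couple": [[1.0, 0.0], [0.8, 0.2]],
    "siblings": [[0.0, 1.0], [0.2, 0.8]],
}
prototypes = build_prototypes(support)
prediction = classify([0.9, 0.1], prototypes)
print(prediction)
```

In an N-way K-shot episode, `support` holds N relations with K vectors each, and every query is labeled with the relation whose prototype is closest.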

Experiments

The Baseline Approach

BERT: the BERT encoder is also fine-tuned with a prototypical network.

Image Sampling

Two methods for image sampling: the same image and different images. Each has its own advantages.

Experiment Setting

Dataset Analysis and Splits

Implementation Details

Result Analysis

Cross-Dataset Analysis

Answering the Two Questions

Case Study

Conclusion

 

