[Paper Review] Simple task-specific bilingual word embeddings

CoShin 2021. 3. 2. 12:50

Simple task-specific bilingual word embeddings

1. Summary

This paper presents a method for learning word embeddings across two different languages. Its advantages are that (a) it is independent of the word embedding algorithm and (b) it does not require parallel data. The authors validate the method experimentally on cross-lingual POS tagging.

2. Methods

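The inputs are a source corpus, a target corpus, and a set of bilingual equivalences R. A word w in the source corpus is replaced with a word w' from the target corpus, where R must guarantee that w and w' have the same meaning. Which words get replaced is decided at random, with about half of them being substituted. If a word has several equivalents, any word from the same equivalence class may be used; under POS equivalence, for example, any word in the same POS class will do.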
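The substitution step described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `mix_corpora`, the replacement probability `p` (defaulting to 0.5 to mirror the "half" mentioned above), the concatenation and sentence shuffling of the two corpora, and the toy English/German data are all hypothetical choices made for illustration.

```python
import random


def mix_corpora(source, target, R, p=0.5, seed=0):
    """Build a mixed corpus by randomly swapping words for their equivalents.

    source, target: lists of sentences, each sentence a list of tokens.
    R: dict mapping a word to a list of equivalent words in the other language.
    p: probability of replacing an eligible word (assumed value).
    """
    rng = random.Random(seed)
    mixed = []
    # Assumption: both corpora are concatenated so one model sees both languages.
    for sentence in source + target:
        new_sentence = []
        for w in sentence:
            candidates = R.get(w, [])
            if candidates and rng.random() < p:
                # Any member of the equivalence class is acceptable.
                new_sentence.append(rng.choice(candidates))
            else:
                new_sentence.append(w)
        mixed.append(new_sentence)
    # Assumption: sentence order is shuffled so the languages are interleaved.
    rng.shuffle(mixed)
    return mixed


# Toy data, made up for illustration only.
source = [["the", "dog", "sleeps"], ["the", "cat", "eats"]]
target = [["der", "hund", "schläft"], ["die", "katze", "isst"]]
R = {"dog": ["hund"], "hund": ["dog"], "cat": ["katze"], "katze": ["cat"]}

mixed = mix_corpora(source, target, R)
for sentence in mixed:
    print(" ".join(sentence))
```

The mixed corpus can then be passed to any off-the-shelf embedding trainer, which is what makes the approach independent of the embedding algorithm.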

3. Experiments

- 100,000 cross-lingual word pairs

- CBoW word embeddings with a window size of 4

- Google universal tagset used for the POS tagging evaluation dataset

- Translation equivalents created with Google Translate
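To make the "window size 4" setting concrete, the sketch below lists the (context, center word) pairs a CBoW-style model would be trained on for a single sentence. It is plain Python and not tied to any library; the helper name `cbow_pairs` and the example sentence are hypothetical. It also uses a fixed window, whereas real word2vec implementations randomly shrink the window per position, which is left out here for simplicity.

```python
def cbow_pairs(sentence, window=4):
    """Yield (context_words, center_word) pairs for one tokenized sentence."""
    for i, center in enumerate(sentence):
        # Up to `window` words on each side of the center word.
        left = sentence[max(0, i - window):i]
        right = sentence[i + 1:i + 1 + window]
        yield left + right, center


# A made-up mixed sentence in which "hund" has been substituted for "dog".
sentence = ["the", "hund", "sleeps", "on", "the", "sofa"]
for context, center in cbow_pairs(sentence):
    print(center, "<-", context)
```

In a mixed corpus, a substituted word such as "hund" is predicted from context words of the other language, which is what pulls equivalent words toward each other in the shared embedding space.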
