Learn NLP with Transformer (Chapter 8)

8. Sequence labeling. Task08: text classification. This study follows the Datawhale open-source course: https://github.com/datawhalechina/learn-nlp-with-transformers The content is largely derived from the original text and adjusted according to my own learning approach. Personal summary: first, the structure of the sequence labeling task ...

Posted by cspgsl on Mon, 27 Sep 2021 02:22:23 -0700

20210925_NLP transformer_ Text classification of NLP

6. Text classification. Source: Datawhale, issue 29, NLP transformer. Erenup (more notes), Peking University, principal; Zhang Fan, Datawhale, Tianjin University, Chapter 4; Zhang Xian, Harbin Institute of Technology, Chapter 2; Li Luoqiu, Zhejiang University, Chapter 3; Cai Jie, Peking University, Chapter 4; hlzhang, McGill University, Chapter 4; T ...

Posted by kbc1 on Sat, 25 Sep 2021 02:32:28 -0700

Read the BERT model (program)

Google BERT model. Environment and installation. Google BERT model download address: https://github.com/google-research/bert Environment requirements: TensorFlow 1.11.0 with Python 2 and/or Python 3 (TensorFlow 1.12.0 with Python 3.6 also works in practice). Project deployment: first, download the relevant files from GitHub, ...

Posted by Qnuts on Sat, 18 Sep 2021 20:44:48 -0700

Detailed SVD and common Embedding applications

I wrote this article because, after using SVD and DeepWalk embeddings in a recommendation task, the model's performance improved. SVD turned out to be useful well beyond dimensionality reduction and left a lot to think about, so this article summarizes some SVD and embedding methods. 1. Singular Value Decomposition (SVD) ...

Posted by EsOne on Sat, 18 Sep 2021 04:18:58 -0700

Natural language processing - using the jieba word segmenter

1. jieba Chinese word segmentation import jieba text = "In most cases, vocabulary is the basis of our understanding of sentences and articles, so we need a tool to decompose the complete text into finer grained words." cut_result = jieba.cut(text, cut_all=True) # Full mode print(cut_result) print("\n Full mode : " + "/ ".join(cut_result)) ...

Posted by bgomillion on Sat, 18 Sep 2021 04:13:59 -0700

Lesson 1: word clouds

1. Install jieba (word segmentation) and wordcloud: pip3 install jieba (the 3 may need to be dropped). 2. Open and name the text used to generate the word cloud: use with open ... as. 3. Word segmentation: import a custom dictionary (load_userdict; sep_list). 4. Word frequency statistics: define an empty dictionary; use a loop ...
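Step 4 above (an empty dictionary filled in a loop) can be sketched as follows. This is only an illustration using the standard library: the function name `count_words` and the sample token list are hypothetical, and the tokens are assumed to be already segmented (in the lesson they would come from jieba).

```python
def count_words(tokens):
    """Count how often each token occurs, using a plain dict and a loop."""
    freq = {}  # start from an empty dictionary
    for tok in tokens:
        tok = tok.strip()
        if not tok:
            continue  # skip whitespace-only tokens
        freq[tok] = freq.get(tok, 0) + 1
    return freq


# Hypothetical, already-segmented tokens (stand-in for a segmenter's output).
tokens = ["cloud", "word", "cloud", " ", "picture", "word", "cloud"]
freq = count_words(tokens)

# Sort by descending count so the most frequent words come first.
top = sorted(freq.items(), key=lambda kv: kv[1], reverse=True)
print(top)
```

A dictionary of this shape (word to count) is the usual input for frequency-based word cloud rendering.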

Posted by TutorMe on Fri, 17 Sep 2021 16:10:35 -0700

Datawhale September Group Learning--Emotional Analysis--Task01

Tip 1: click here for the learning link. Tip 1: Word embeddings: how to transform text into numbers. Preface: Task01 mainly uses an RNN (note: this article does not explain RNN principles in detail) and the IMDB dataset to build a baseline model for text sentiment analysis. 1. Model building process 1.1 Data preprocessing ...

Posted by semtex on Wed, 15 Sep 2021 09:34:10 -0700

Data Driven Testing

The relationship between data-driven testing and automation: the subject here is data-driven testing, not automated testing as such. Given an automated test suite, data-driven testing is a way to optimize the test code. The intent of automated testing is to maintain automated scripts. Learning objectives: common automated test patterns; data-driven ...
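As a rough illustration of the idea, one test method can be driven by a table of cases instead of one method per case. This is a minimal sketch using the standard `unittest` module; the function under test (`add`), the `CASES` table, and `run_suite` are hypothetical and not taken from the original article.

```python
import unittest


def add(a, b):
    """Hypothetical function under test."""
    return a + b


# Test data kept separate from test logic: (a, b, expected).
CASES = [
    (1, 2, 3),
    (0, 0, 0),
    (-1, 1, 0),
]


class AddTest(unittest.TestCase):
    def test_add(self):
        # One test body, executed once per data row.
        for a, b, expected in CASES:
            with self.subTest(a=a, b=b):
                self.assertEqual(add(a, b), expected)


def run_suite():
    """Run AddTest and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AddTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_suite()
```

Adding a new scenario then means adding a row to `CASES` rather than writing another test method.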

Posted by shiggins on Sat, 11 Sep 2021 14:43:06 -0700

Similarity Retrieval Faiss Model

1. Role of Faiss. The usual solution to top-K similarity retrieval is brute-force search, which iterates over all vectors, computes each similarity, and derives the top K. However, when the number of vectors is large, this method is very time-consuming; Faiss addresses this problem well. 2. Introduction to Faiss. The full nam ...
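The exhaustive search described above (scoring every stored vector against the query and keeping the K best) can be sketched in plain Python. This is not Faiss code: cosine similarity is an assumed choice of metric, and the names `cosine`, `brute_force_topk`, and the sample vectors are hypothetical.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def brute_force_topk(query, vectors, k):
    """Score every vector against the query; return the k best as (score, index)."""
    scored = ((cosine(query, v), i) for i, v in enumerate(vectors))
    return heapq.nlargest(k, scored)


# Hypothetical toy data: four 2-D vectors and one query.
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
query = [1.0, 0.1]
top2 = brute_force_topk(query, vectors, 2)
print([index for _, index in top2])
```

Each query costs one similarity computation per stored vector, which is why this approach becomes slow as the collection grows.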

Posted by hjunw on Sat, 04 Sep 2021 09:27:17 -0700