Learn NLP with Transformer (Chapter 8)
8. Sequence labeling
Task08 text classification. This study follows the Datawhale open-source course (https://github.com/datawhalechina/learn-nlp-with-transformers). The content is largely derived from the original text and adjusted to reflect my own learning notes.
Personal summary: first, the structure of the sequence labeling task ...
Posted by cspgsl on Mon, 27 Sep 2021 02:22:23 -0700
20210925_NLP transformer_ Text classification of NLP
6. Text classification
source
Datawhale Issue 29, NLP transformer:
Erenup (more notes), Peking University, lead; Zhang Fan, Datawhale, Tianjin University, Chapter 4; Zhang Xian, Harbin Institute of Technology, Chapter 2; Li Luoqiu, Zhejiang University, Chapter 3; Cai Jie, Peking University, Chapter 4; hlzhang, McGill University, Chapter 4; T ...
Posted by kbc1 on Sat, 25 Sep 2021 02:32:28 -0700
Read the BERT model (program)
Google Bert model
Environment and installation
Environmental requirements
Google BERT model download address: https://github.com/google-research/bert Environmental requirements: TensorFlow 1.11.0 and Python 2 and/or Python 3 (TensorFlow 1.12.0 with Python 3.6 also works in practice)
Project deployment
First, download the relevant files on github, ...
Posted by Qnuts on Sat, 18 Sep 2021 20:44:48 -0700
Detailed SVD and common Embedding applications
I wrote this article because using SVD and DeepWalk embeddings in a recommendation task improved the model's performance. SVD turned out to be useful for more than dimensionality reduction and raised a lot to think about, so this article summarizes some SVD and embedding methods.
1. Singular Value Decomposition SVD ...
Posted by EsOne on Sat, 18 Sep 2021 04:18:58 -0700
Natural language processing - use of Jieba word splitter
1. jieba Chinese word segmentation
import jieba
text = "In most cases, vocabulary is the basis of our understanding of sentences and articles, so we need a tool to decompose the complete text into finer grained words."
cut_result = jieba.cut(text, cut_all=True) # Full mode
print(cut_result)  # jieba.cut returns a generator, so this prints the generator object, not the words
print("\n Full mode : " + "/ ".join(cut_result))
...
Posted by bgomillion on Sat, 18 Sep 2021 04:13:59 -0700
Lesson 1: Word clouds
1. Install jieba (word segmentation) and wordcloud
pip3 install jieba (you may need to use pip instead of pip3)
2. Open and name the text used to generate the word cloud
Use with open(...) as ...
3. Word segmentation
Import a custom dictionary (load_userdict; sep_list)
4. Word frequency statistics
Define an empty dictionary; use a loop ...
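Step 4 can be sketched in plain Python roughly as follows. This is a minimal illustration, not the lesson's own code: it assumes the segmentation step has already produced a list of words, the function name count_words is hypothetical, and the contents of sep_list are made-up sample data standing in for real jieba output.

```python
def count_words(words):
    """Count how often each word occurs, using an empty dict and a loop."""
    freq = {}  # start from an empty dictionary
    for word in words:
        word = word.strip()
        if not word:  # skip empty strings and pure whitespace
            continue
        freq[word] = freq.get(word, 0) + 1
    return freq


# Hypothetical segmentation result; in the lesson this would come from jieba.
sep_list = ["word", "cloud", "word", " ", "text", "cloud", "word"]
freq = count_words(sep_list)

# Sort by count, most frequent first, e.g. to pick the words for the cloud.
top = sorted(freq.items(), key=lambda kv: kv[1], reverse=True)
print(top)
```

The standard library's collections.Counter does the same counting in one call; the explicit dictionary and loop are kept here only because that is the approach the outline describes.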
Posted by TutorMe on Fri, 17 Sep 2021 16:10:35 -0700
Datawhale September Group Learning--Sentiment Analysis--Task01
Tip 1: Click here for the learning link. Tip 1: Word embeddings: how to transform text into numbers
Preface
Task01 mainly uses an RNN architecture (note: this article does not explain the principles of RNNs in detail) and the IMDB dataset to build a baseline model for the text sentiment analysis task.
1. Model building process
1.1 Data Preprocessing
...
Posted by semtex on Wed, 15 Sep 2021 09:34:10 -0700
Data Driven Testing
The relationship between data-driven testing and automation:
Data-driven testing is not a replacement for automated testing. Given that tests are already automated, data-driven testing is a way to optimize the test code. - The intent of automated testing is to maintain automated scripts.
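As a rough illustration of that idea, the sketch below keeps the test data in one table and runs a single test body over every row, so adding a case means adding data rather than code. It uses Python's standard unittest module; the function add, the CASES table, and the helper run_tests are all hypothetical names invented for this example and do not come from the original article.

```python
import unittest


def add(a, b):
    """Hypothetical function under test."""
    return a + b


# Test data lives in one table, separate from the test logic.
CASES = [
    # (a, b, expected)
    (1, 2, 3),
    (0, 0, 0),
    (-1, 1, 0),
]


class TestAdd(unittest.TestCase):
    def test_add(self):
        # One test body runs once per data row; subTest reports each
        # failing row separately instead of stopping at the first one.
        for a, b, expected in CASES:
            with self.subTest(a=a, b=b):
                self.assertEqual(add(a, b), expected)


def run_tests():
    """Run the suite and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestAdd)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_tests()
```

In a real project the rows would often be loaded from an external file (CSV, JSON, a spreadsheet) rather than written inline.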
Learning objectives:
Common automated test patterns
Data Driven ...
Posted by shiggins on Sat, 11 Sep 2021 14:43:06 -0700
Similarity Retrieval Faiss Model
1. faiss role
The usual solution to Top-K similarity retrieval is brute-force search, which iterates over all vectors, computes the similarity of each to the query, and returns the K best. However, when the number of vectors is large, this method is very time-consuming; Faiss addresses this problem well.
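The brute-force approach described above can be sketched in a few lines of standard-library Python. This is only an illustration of the baseline that Faiss is meant to speed up; it does not use Faiss itself. Cosine similarity is an assumption here (the text only says "similarity"), and the names cosine_similarity and brute_force_topk and the sample vectors are made up for the example.

```python
import heapq
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def brute_force_topk(query, vectors, k):
    """Score every stored vector against the query and keep the k best.

    Returns (similarity, index) pairs, best first. The cost grows linearly
    with the number of stored vectors, which is why this becomes slow at scale.
    """
    scored = ((cosine_similarity(query, v), i) for i, v in enumerate(vectors))
    return heapq.nlargest(k, scored)


# Made-up sample data for illustration.
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
query = [1.0, 0.1]
result = brute_force_topk(query, vectors, k=2)
print(result)
```

Every query touches every stored vector, so the work per query is proportional to the collection size times the vector dimension.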
2. Introduction to faiss
The full nam ...
Posted by hjunw on Sat, 04 Sep 2021 09:27:17 -0700