CBOW & skip-gram of Word2Vec
We introduced the distributional hypothesis earlier, mainly by using context to construct a co-occurrence matrix. Cosine similarity, Jaccard similarity and pointwise mutual information can then be used to measure the similarity or relatedness of words based on the co-occurrence matrix. In order to avoid the statistical unrelia ...
Posted by The Jackel on Mon, 04 Oct 2021 10:10:36 -0700
Machine learning_ 1:K-nearest neighbor algorithm
Experimental background
This experiment is based on the classic k-nearest neighbor algorithm from machine learning. I will first introduce the principle of the k-nearest neighbor algorithm and some basic classification experiments, and then show how to use the k-nearest neighbor algorithm for handw ...
Posted by glitch003 on Sun, 03 Oct 2021 18:32:21 -0700
[image segmentation] brain CT image segmentation based on FCM and improved fuzzy clustering FCM matlab source code
The FCM algorithm is a partition-based clustering algorithm. Its idea is to maximize the similarity between objects assigned to the same cluster and minimize the similarity between different clusters. The fuzzy C-means algorithm is an improvement on the ordinary C-means algorithm. The ordinary C-means algorithm partitions the data into hard assignments, while FCM is a f ...
Posted by leony on Sun, 03 Oct 2021 15:39:00 -0700
Principle and implementation of knn algorithm
Lei Zhang
For Python implementations of basic machine learning algorithms, please refer to:
zlxy9892/ml_code
1 Principle
knn is a very basic machine learning algorithm that can solve classification or regression problems. If you are just beginning to learn machine learning, knn is a very good starting point. It ...
Posted by maxkbr on Sun, 03 Oct 2021 15:06:37 -0700
[watermelon book reading notes] 04 decision tree part I
Chapter IV decision tree
1 basic process
2 division selection
As the division process continues, we want the samples contained in each branch node of the decision tree to belong to the same category as much as possible, that is, the "purity" of the nodes should keep increasing.
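Node "purity" is commonly quantified with information entropy, which is the quantity that information gain is built on. Below is a minimal sketch, assuming the class labels of a node are given as a plain Python list; the function name `entropy` and the sample labels are illustrative and do not come from the original notes.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels.

    0.0 means the node is pure (one class only); larger values mean
    the classes are more mixed.
    """
    total = len(labels)
    counts = Counter(labels)
    # Sum of p * log2(1 / p) over the classes present in the node.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical nodes: one evenly mixed, one containing a single class.
mixed_node = ["good", "good", "bad", "bad"]
pure_node = ["good", "good", "good", "good"]

print(entropy(mixed_node))  # 1.0
print(entropy(pure_node))   # 0.0
```

A split is considered better when the weighted entropy of the child nodes is lower than the entropy of the parent node.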
2.1 information gain
2.1.1 what is info ...
Posted by thunder708 on Sat, 02 Oct 2021 19:02:22 -0700
[natural language processing] Introduction to PyTorch (essential basic knowledge)
PyTorch Foundation
In this book, we use PyTorch extensively to implement our deep learning models. PyTorch is an open-source, community-driven deep learning framework. Unlike Theano, Caffe and TensorFlow, PyTorch implements a "tape-based automatic differentiation" method that allows us to dynamically define and execute computational gr ...
Posted by persia on Fri, 01 Oct 2021 16:39:16 -0700
pclpy statistical filtering
1, Algorithm principle
1. Principle overview
When a lidar generates a point cloud data set, it sometimes receives echo signals incorrectly, which leads to some wrong values. Because of measurement errors in the scanning results, errors will inevitably occur in further processing of the point cloud, and some noise ...
Posted by nimzie on Thu, 30 Sep 2021 17:47:53 -0700
Introduction to irace parameter tuning
Installation and usage of irace
Since many of irace's functions work only on GNU/Linux systems, it is recommended that you run this tuning tool on Linux; the instructions below also assume Linux.
Note: irace requires the R language runtime environment. The R language runtime environment dependency may co ...
Posted by cvsherri on Thu, 30 Sep 2021 10:33:00 -0700
Tianchi competition - Forecast of repurchase of tmall users
Learning big data analysis and machine learning from scratch, with a brief write-up of my competition experience. The score is 0.623537, ranking 629 / 5602.
1, Competition background
Merchants sometimes carry out large-scale promotional activities (such as discounts or cash coupons) on specific dates (such as Boxing Day sales, "Black Friday" ...
Posted by dch27 on Wed, 29 Sep 2021 13:21:38 -0700
matplotlib of data analysis
Introduction to matplotlib
matplotlib official website
Data analysis: compiling statistics on large amounts of data and organizing them to draw conclusions and provide data support for subsequent decision-making
Why learn matplotlib?
It can visualize the data and present it more intuitively
Make the data more objective and per ...
Posted by grudz on Tue, 28 Sep 2021 03:17:13 -0700