1. Notes on Natural Language Processing
1. Differences between generators and lists
A generator produces its values one at a time through the next() mechanism instead of allocating storage for every element; a list has to build its entire storage space up front.
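As a minimal sketch of that difference (the variable names are made up for illustration), the snippet below compares the size of a list object with the size of an equivalent generator object, and shows that a generator can only be consumed once:

```python
import sys

# A list materialises all 100,000 values immediately.
squares_list = [n * n for n in range(100_000)]

# A generator expression only stores its current state.
squares_gen = (n * n for n in range(100_000))

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(list_size > gen_size)  # the list object is far larger

# Values are produced lazily, one next() at a time, while sum() iterates.
total = sum(squares_gen)

# The generator is now exhausted; the list can still be reused.
leftover = list(squares_gen)
```

Note that sys.getsizeof reports only the container itself, not the integers it refers to, so the real gap in memory use is even larger than the two numbers suggest.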
2. The meaning of the u, r, and b prefixes before a string
u indicates a Unicode string.
r indicates a raw string, in which escape sequences are not processed: print('\n') moves the cursor to the next line, but print(r'\n') prints \n exactly.
b indicates a bytes literal (in Python 2.x, bytes is the same type as str).
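A small sketch of the three prefixes, assuming Python 3 (the variable names are only for illustration):

```python
# Without a prefix, the backslash escape is processed: one newline character.
escaped = '\n'

# With r, the backslash is kept literally: a backslash followed by the letter n.
raw = r'\n'

# In Python 3 every str is already Unicode, so the u prefix changes nothing.
text = u'abc'

# With b, the literal is a bytes object rather than a str.
data = b'abc'

print(len(escaped), len(raw))
print(type(text).__name__, type(data).__name__)
print(data.decode('ascii') == text)
```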
3. Processing steps
1. Download the original wiki dump package
2. Parse the dump with the wiki process.py script to get Traditional Chinese text
3. Use opencc to convert the text into Simplified Chinese
4. Use jieba for word segmentation and part-of-speech tagging
5. Model with gensim
6. Develop the interface
4. File upload
Select a corpus file to upload:
# View function code snippet
file = request.files['file']
file.save(save_path)  # save_path is a placeholder for the destination path on the server
5. File Download
# https://www.jianshu.com/p/8daa3d011cfd
from flask import send_file, send_from_directory
import os

@app.route("/download/<filename>", methods=['GET'])
def download_file(filename):
    # Two parameters are needed: the path to the local directory
    # and the file name (with extension)
    directory = os.getcwd()  # Assume the file is in the current directory
    return send_from_directory(directory, filename, as_attachment=True)
6. HTTP 413
Caused by an uploaded file that is too large.
Add the following to the nginx configuration:
client_max_body_size 200m;
7. yield
Used to construct generators. yield returns the value of the expression to its right, much like return; the next call to next() resumes execution at the statement after the yield.
def foo(num):
    print("starting...")
    while num < 10:
        num = num + 1
        yield num

for n in foo(0):  # Implicitly calls next()
    print(n)
------------------------------------------
starting...
1
2
3
4
5
6
7
8
9
10
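The for loop hides the calls to next(). As a rough sketch, the same mechanism can be driven by hand; countdown is a hypothetical generator used only for illustration:

```python
def countdown(start):
    """Yield start, start - 1, ..., 1 (illustrative helper)."""
    while start > 0:
        yield start   # pause here and hand the value to the caller
        start -= 1    # execution resumes here on the following next()

gen = countdown(2)    # nothing runs yet; this only creates the generator
first = next(gen)     # runs until the first yield
second = next(gen)    # resumes after the yield, loops, yields again

try:
    next(gen)         # the loop ends, so the generator is finished
    exhausted = False
except StopIteration:
    exhausted = True

print(first, second, exhausted)
```

A for loop does exactly this and stops quietly when StopIteration is raised.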
8. Navigation bar label selection effect
1. Add an id to each navigation bar <li> tag.
2. Mark the matching tag as active when the page loads:
$(document).ready(function(){
    var location = window.location.href;
    var id = location.substring(location.lastIndexOf('/') + 1);
    // If id is empty, the selector is just "#" and jQuery reports a syntax error, so check it first
    if (id) {
        $("#" + id).addClass("active");
    }
});
Server Deployment
1. First (gunicorn+flask+Nginx)
https://blog.csdn.net/qq_36114862/article/details/81380956
1) Install gunicorn
pip install gunicorn
2) Start gunicorn
gunicorn --workers=3 main:app -b 127.0.0.1:5000
3) Install Nginx
sudo apt install nginx
4) Start Nginx
sudo /etc/init.d/nginx start  # Generally starts by itself
5) Check the Nginx configuration file
sudo nginx -t
6) Modify the Nginx configuration file
sudo vim /etc/nginx/sites-available/default
server {
    listen 80;
    server_name _;  # External address
    location / {
        proxy_pass http://127.0.0.1:5000;  # Same IP and port that gunicorn is bound to
        proxy_redirect off;
        proxy_set_header Host $http_host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
7) Restart Nginx
sudo service nginx restart
Appendix: Collection of issues encountered in deployment
1. Adjust Python Priority
sudo update-alternatives --install /usr/bin/python python /usr/bin/python2.7 10
sudo update-alternatives --install /usr/bin/python python /usr/bin/python3.7 20
2. virtualenv usage (important):
1. First, install virtualenv; using pip under the default python2 is fine:
[sudo] pip install virtualenv
2. Create virtual environments:
virtualenv -p /usr/bin/python3 py3env
3. Activate virtual environment:
source py3env/bin/activate
You'll notice that the shell prompt now starts with (py3env), so you can safely develop with python3. Try installing a third-party library first:
pip install httplib2
That's it!
If you want to exit the python3 virtual environment, enter the command: deactivate
The key step to note is creating the virtual environment: it directly creates a python3 environment instead of a python2.x environment.
3. SyntaxError: Non-ASCII character '\xe6' in file test.py on line 1, but no encoding declared
Solution: add the following line at the top of the .py file:
# -*- coding: utf-8 -*-
4. No module named distutils.spawn (virtualenv with python3)
Solution: sudo apt-get install python3-distutils
5. pip install gensim read timeout
Solution: switch to a mirror source
pip install -i https://pypi.mirrors.ustc.edu.cn/simple gensim
6. vim undo
Press u in command mode
7.Nginx Restart
sudo service nginx restart
8. HTTP 502
There are many possible causes. The ones found so far:
1. The port number in the nginx configuration file does not match the port gunicorn is listening on
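One quick way to check for this mismatch is to test whether anything is accepting connections on the address nginx proxies to. The sketch below uses only the standard library; is_port_open is a hypothetical helper, and 127.0.0.1:5000 is the address from the deployment steps in these notes:

```python
import socket

def is_port_open(host, port, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise
        return sock.connect_ex((host, port)) == 0

# The address given to proxy_pass in the nginx configuration
gunicorn_up = is_port_open("127.0.0.1", 5000)
print("gunicorn reachable:", gunicorn_up)
```

If this prints False while gunicorn is running, gunicorn is probably bound to a different port than the one in proxy_pass.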
9.requirements.txt
1. Generate
pip freeze > requirements.txt
2. Install
pip install -r requirements.txt
10. The /upload file upload request is not recognized
Add enctype='multipart/form-data' to the <form> tag
11. Add Log Module
1. Create the directory structure
2. Create the log configuration file
3. The following code creates the logger object
import logging.config
logging.config.fileConfig("./app/config/logger.ini")
logger = logging.getLogger("main")
logger.info('first log')
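The notes do not show the contents of logger.ini. The sketch below is one plausible minimal configuration, not the original file: the handler and formatter names (console, simple) and the log format are assumptions, and the file is written to a temporary directory so that the example is self-contained.

```python
import logging
import logging.config
import os
import tempfile

# Assumed contents for logger.ini; only the logger name "main" comes from the notes.
LOGGER_INI = """
[loggers]
keys=root,main

[handlers]
keys=console

[formatters]
keys=simple

[logger_root]
level=WARNING
handlers=console

[logger_main]
level=INFO
handlers=console
qualname=main
propagate=0

[handler_console]
class=StreamHandler
level=INFO
formatter=simple
args=(sys.stdout,)

[formatter_simple]
format=%(asctime)s %(name)s %(levelname)s %(message)s
"""

with tempfile.TemporaryDirectory() as tmp_dir:
    ini_path = os.path.join(tmp_dir, "logger.ini")
    with open(ini_path, "w") as handle:
        handle.write(LOGGER_INI)
    # Same call as in the notes, pointed at the temporary file
    logging.config.fileConfig(ini_path, disable_existing_loggers=False)

logger = logging.getLogger("main")
logger.info("first log")
```

The qualname entry is what ties the [logger_main] section to logging.getLogger("main"); propagate=0 keeps each message from being printed a second time by the root logger.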
11. No model file was generated after file upload
The server ships with Python 3.5, while the development environment uses 3.7.2, so install 3.7.2 on the server:
1. Download Python
wget https://www.python.org/ftp/python/3.7.2/Python-3.7.2.tgz
2. Unzip
tar -zxvf Python-3.7.2.tgz
3. Build and install
./configure && sudo make && sudo make install
4. The following issues were encountered during the installation of Python 3.7.2
1.zipimport.ZipImportError: can't decompress data; zlib not available
sudo apt install zlib1g-dev
sudo apt install zlib1g
Question 13 came up after entering the above commands
2.ModuleNotFoundError: No module named '_ctypes'
sudo apt-get install libffi-dev
3.No module named _ssl
1. vi /root/Python-3.7.1/Modules/Setup.dist
2. Uncomment the four lines below the following comment:
Socket module helper for socket(2)
_socket socketmodule.c timemodule.c
3. sudo apt-get install libssl-dev (you can try going straight to steps 3 and 4)
4. Recompile and install
./configure --prefix=/usr/local/python37
make
make install
4.No module named '_bz2'
sudo apt-get install libbz2-dev
cd python3.7
./configure
make
make install
12. View memory
free -m
13. Word2Vec error: execution hangs for some time and is then killed
14. Switch the mirror source when installing gensim
pip install -i https://pypi.mirrors.ustc.edu.cn/simple gensim
15. Running gensim raises a smart_open deprecation warning
pip uninstall -r requirements.txt
After uninstalling all packages, reinstall the packages exported from a working environment