Use Python to analyze funds! Get a head start on making money!

Keywords: Python JSON Excel Mac

If you don't manage your money, your money won't look after you! Can Python help you with your finances?

Preview of the results

Trend chart of cumulative yield

Basic information results

How to use:

Python 3 + some third-party libraries

import requests
import pandas
import numpy
import matplotlib
import lxml

Configure config.json: code lists the fund codes to analyze, and useCache controls whether locally cached data is used.

{
  "code":[
    "002736",
    "003328",
    "003547"
  ],
  "useCache":true
}

Run fund_analysis.py
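As a minimal sketch of how a script might read this configuration (the helper name `load_config` is an assumption for illustration, not taken from the original script):

```python
import json
from pathlib import Path


def load_config(path="config.json"):
    """Read the config file and return (fund_codes, use_cache)."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    # Keep codes as strings so leading zeros such as "002736" survive.
    codes = [str(code) for code in config["code"]]
    # Treat a missing useCache key as "do not use the cache".
    use_cache = bool(config.get("useCache", False))
    return codes, use_cache
```

Note that strict JSON does not allow a trailing comma after the last list item; `json.loads` raises `json.JSONDecodeError` if one is present.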

How it works

Data acquisition:

Open a fund's page on the Tiantian Fund website (fund.eastmoney.com) and watch the files it loads in the Chrome developer tools. One of them is a JS file that contains the fund's basic information.

To get the cumulative yield, you need to interact with the page. Click "3 years" in the cumulative yield chart and watch the requests in the developer tools; the data source is easy to spot. This is JSON data.

The fund fee rate table is on another page, and its source address can be found in the same way. This is HTML data.

Then, after analyzing the request headers, we use requests to simulate a browser and fetch the data (if this is unclear, please refer to the previous article). Finally, the data is saved locally as a cache. Taking the cumulative yield JSON as an example, the main code is as follows.

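import json
import os

import requests

fundCode = '002736'  # one of the fund codes from config.json
filePath = f'./cache/{fundCode}.json'
requests_url = 'http://api.fund.eastmoney.com/pinzhong/LJSYLZS'
# Headers copied from the browser so the request looks like a normal page visit
headers = {
    'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.87 Safari/537.36',
    'Accept': 'application/json',
    'Referer': f'http://fund.eastmoney.com/{fundCode}.html',
}
params = {
    'fundCode': fundCode,
    'indexcode': '000300',
    'type': 'try',  # value seen in the request sent when "3 years" is clicked
}
requests_page = requests.get(requests_url, headers=headers, params=params)
# Save the response locally as a cache (create the cache folder if needed)
os.makedirs('./cache', exist_ok=True)
with open(filePath, 'w') as f:
    json.dump(requests_page.json(), f)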
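The caching step can be combined with the useCache switch roughly as follows. This is only a sketch: the helper `load_or_fetch` and its `fetch` callback are assumptions made for illustration, and the original script may organize this differently.

```python
import json
import os


def load_or_fetch(file_path, fetch, use_cache=True):
    """Return cached JSON from file_path if allowed, else call fetch() and cache the result."""
    if use_cache and os.path.exists(file_path):
        with open(file_path, encoding="utf-8") as f:
            return json.load(f)
    data = fetch()  # e.g. a function that performs the HTTP request
    # Create the cache directory on first use.
    os.makedirs(os.path.dirname(file_path) or ".", exist_ok=True)
    with open(file_path, "w", encoding="utf-8") as f:
        json.dump(data, f)
    return data
```

Passing the download as a callback, for example `lambda: requests.get(requests_url, headers=headers, params=params).json()`, keeps the network call out of the helper, so a cached run never touches the website.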

Data analysis:

For the JS file of basic information, read the file as a string and extract the required data with regular expressions.

For example, the one-year yield can be obtained with the following code.

import re
syl_1n = re.search(r'syl_1n\s?=\s?"([^\s]*)"', data).group(1)
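The same idea can be wrapped in a small helper that also copes with a variable that is missing from the file. This is a sketch under assumptions: the function name `extract_js_string` and the sample string are made up for illustration and are not real fund data.

```python
import re


def extract_js_string(data, name):
    """Return the quoted value assigned to `name` in JS source, or None if absent."""
    # [^"]* stops at the first closing quote, even when the JS has no whitespace.
    pattern = rf'\b{re.escape(name)}\s*=\s*"([^"]*)"'
    match = re.search(pattern, data)
    return match.group(1) if match else None


# Made-up sample in the shape the regular expression expects.
sample_js = 'var other="x";var syl_1n="1.23";var last="y";'
print(extract_js_string(sample_js, "syl_1n"))   # 1.23
print(extract_js_string(sample_js, "missing"))  # None
```

Matching `[^"]*` instead of `[^\s]*` is a deliberate change here: on JS without whitespace between statements, a greedy non-whitespace match can run past the closing quote into the next assignment.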

For the cumulative yield JSON data, parse it directly with json, then filter and process the data you need.

The data is stored in the form all_data[fund code][time] = cumulative yield, and the missing values are then filled in with a pandas DataFrame.

from pandas import DataFrame
df = DataFrame(all_data).sort_index().ffill()  # forward-fill missing dates
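To make the fill step concrete, here is a small sketch with made-up numbers (the values and dates are illustrative only, not real fund data). A nested dict keyed by fund code and then by time becomes a table with one column per fund, and forward-filling copies the last known value into any gap.

```python
import pandas as pd

# Illustrative numbers only: fund "002736" has no entry for 2019-12-03.
all_data = {
    "002736": {"2019-12-02": 1.0, "2019-12-04": 1.5},
    "003328": {"2019-12-02": 2.0, "2019-12-03": 2.2, "2019-12-04": 2.4},
}

# Outer keys become columns, inner keys become the index.
df = pd.DataFrame(all_data).sort_index().ffill()
print(df)
```

After the fill, the missing 2019-12-03 value for "002736" is taken from 2019-12-02, so both funds can be drawn on a shared time axis.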

For the HTML data of the fund fee rate table, XPath is used. The XPath expression can be copied directly from Chrome.

For the management fee rate, refer to the following code.

import lxml.html

selector = lxml.html.fromstring(data)
# Management fee rate
mg_rate = selector.xpath('/html/body/div[1]/div[8]/div[3]/div[2]/div[3]/div/div[4]/div/table/tbody/tr/td[2]/text()')[0]
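An absolute path copied from the browser stops working as soon as the page layout shifts, so a relative XPath that looks for the label cell may be more robust. The following is only a sketch: the helper name `find_rate`, the English labels, and the sample HTML are assumptions for illustration and do not reflect the real page.

```python
import lxml.html


def find_rate(html_text, label):
    """Return the text of the cell right after the cell containing `label`, or None."""
    selector = lxml.html.fromstring(html_text)
    # $label is an XPath variable that lxml fills from the keyword argument.
    cells = selector.xpath(
        '//td[contains(text(), $label)]/following-sibling::td[1]/text()',
        label=label,
    )
    return cells[0].strip() if cells else None


# Made-up table; the real page uses different markup and labels.
sample_html = """
<html><body>
  <table>
    <tr><td>Management fee rate</td><td>1.50% (per year)</td></tr>
    <tr><td>Custody fee rate</td><td>0.25% (per year)</td></tr>
  </table>
</body></html>
"""
print(find_rate(sample_html, "Management fee rate"))
```

Also keep in mind that browsers insert a tbody element into tables that lack one, so a path copied from Chrome may contain a tbody step that does not exist in the raw HTML that requests downloads.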

Data storage:

The plot method of a DataFrame draws charts quickly, and to_excel saves the data to an Excel sheet. You can refer to the following code.

import time

import matplotlib.pyplot as plt
from pandas import DataFrame

# Save data
fig, axes = plt.subplots(2, 1)
# Process basic information: write it to Excel and draw a horizontal bar chart
df2 = DataFrame(all_data_base)
df2.stack().unstack(0).to_excel(f'result_{time.time()}.xlsx', sheet_name='out')
df2.iloc[1:5, :].plot.barh(ax=axes[0], grid=True, fontsize=25)
# Process cumulative yield: sort by time and forward-fill missing values
df = DataFrame(all_data).sort_index().ffill()
df.plot(ax=axes[1], grid=True, fontsize=25)
fig.savefig(f'result_{time.time()}.png')
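Because time.time() is called separately for each file, the Excel file and the image end up with slightly different names. If matching names are wanted, one option is to build both from a single timestamp. This is a sketch only; the helper `result_paths` and the name format are assumptions, not part of the original script.

```python
import time


def result_paths(timestamp=None):
    """Return (xlsx_path, png_path) that share one human-readable timestamp."""
    # time.localtime(None) uses the current time.
    stamp = time.strftime("%Y%m%d_%H%M%S", time.localtime(timestamp))
    return f"result_{stamp}.xlsx", f"result_{stamp}.png"


xlsx_path, png_path = result_paths()
print(xlsx_path, png_path)
```

The two paths can then be passed to to_excel and savefig, so each run's table and chart are easy to pair up.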

Summary

Data acquisition uses basic web-crawling techniques with the requests library. Data analysis and storage mainly rely on regular expressions, the XPath parsing library lxml, and the data processing library pandas.

Analyzing a fund involves far more than this data (position distribution, fund manager information, and so on). This article is only a starting point that I hope gives you an idea. If you have thoughts to share or something is unclear, you are welcome to leave a comment or send a private message!

This article is only for personal learning and communication. Please do not use it for other purposes!

Posted by fingerprn on Fri, 06 Dec 2019 05:31:52 -0800