Recently, WeChat Moments and Weibo have been flooded with posts about the animated film "Nezha: Birth of the Demon Child".
My memories of Nezha still come from the cartoon I watched as a child: "It's him, it's him, our little friend Nezha."
After 14 days in theaters, the film's total box office stands at 3.19 billion yuan, which ranks eighth in Chinese box-office history, and barring surprises it will finish in the top five.
To give you a more intuitive picture, I used Python to crawl and analyze the film's box-office data.
Data source address: http://piaofang.baidu.com/
The code is as follows:
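    import json
    import re
    import time
    from datetime import datetime, timedelta

    from scrapy import Selector  # assumed import; parsel.Selector offers the same API

    # Method of the crawler class. `cls.session` is assumed to be a
    # requests.Session() stored as a class attribute.
    @classmethod
    def spider(cls):
        # Visit the landing page first, presumably so the session picks up cookies.
        cls.session.get("https://piaofang.baidu.com/?sfrom=wise_film_box")
        lz_list = []   # rows for Nezha
        szw_list = []  # rows for the film it is compared with
        # One request per day, from today back through 13 days ago.
        for r in [datetime.now() - timedelta(days=i) for i in range(0, 14)]:
            params = {
                "pagelets[]": "index-overall",
                "reqID": "28",
                "sfrom": "wise_film_box",
                "date": r.strftime("%Y-%m-%d"),
                "attr": "3,4,5,6",
                "t": int(time.time() * 1000),
            }
            response = cls.session.get("https://piaofang.baidu.com/", params=params).text
            # The payload is the argument of BigPipe.onPageletArrive(...).
            # json.loads replaces eval(), which would execute whatever the server sends.
            result = json.loads(re.findall(r"BigPipe\.onPageletArrive\((.*?)\)", response)[0])
            selector = Selector(text=result.get("html"))
            for dd in selector.css(".detail-list .list dd"):
                name = dd.css("h3 b ::text").extract_first() or ""
                # The string literals below are English renderings; the live page
                # presumably serves Chinese text, so match against that when running this.
                if "Na Zha" in name or "Raging fire" in name:
                    total_box = dd.css("h3 span ::attr(data-box-office)").extract_first()  # gross box office
                    box = dd.css("div span[data-index='3'] ::text").extract_first()  # real-time box office
                    ratio = dd.css("div span[data-index='4'] ::text").extract_first()  # box-office share
                    movie_ratio = dd.css("div span[data-index='5'] ::text").extract_first()  # screening share
                    # Normalize amounts to floats in units of ten thousand yuan.
                    # "Billion" stands for the Chinese unit yi (10^8), hence the factor of 10000.
                    dic = {
                        "date": r.strftime("%Y-%m-%d"),
                        "total_box": (float(total_box.replace("Billion", "")) * 10000
                                      if "Billion" in total_box
                                      else float(total_box.replace("ten thousand", ""))),
                        "box": (float(box.replace("Billion", "")) * 10000
                                if "Billion" in box
                                else float(box.replace("ten thousand", ""))),
                        "ratio": ratio,
                        "movie_ratio": movie_ratio,
                    }
                    if "Na Zha" in name:
                        lz_list.append(dic)
                    else:
                        szw_list.append(dic)
        return lz_list, szw_list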
This is a class method: because it uses class variables, it carries the @classmethod decorator. You could also write it as an ordinary function.
The code above crawls the box-office data for "Nezha: Birth of the Demon Child" and "Hero of Fire" since their release.
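The parsing half of the crawler can be tried without touching the network. What follows is a minimal sketch, not the author's code: `SAMPLE_RESPONSE` is a made-up stand-in for the server's reply (the real payload may well differ), and `extract_payload` and `to_ten_thousand` are hypothetical helper names. It assumes the payload is valid JSON and that amounts carry one of the two unit labels used in the crawler. Unlike the crawler, it uses a greedy match so that parentheses inside the HTML do not cut the payload short.

```python
import json
import re

# Hypothetical sample reply; the real payload is assumed to be a JSON object
# wrapped in a BigPipe.onPageletArrive(...) call.
SAMPLE_RESPONSE = (
    'BigPipe.onPageletArrive({"id": "index-overall", '
    '"html": "<dd>(demo)</dd>"});'
)

# Greedy match up to the last ")" so parentheses inside the HTML are kept.
PAYLOAD_RE = re.compile(r"BigPipe\.onPageletArrive\((.*)\)", re.S)


def extract_payload(text):
    """Return the object passed to BigPipe.onPageletArrive as a dict."""
    match = PAYLOAD_RE.search(text)
    if match is None:
        raise ValueError("no BigPipe payload found")
    # json.loads only reads data; it never executes the response.
    return json.loads(match.group(1))


def to_ten_thousand(value, big_unit="Billion", small_unit="ten thousand"):
    """Convert an amount string to a float in units of ten thousand yuan.

    `big_unit` stands for the Chinese unit yi (10^8), which is 10000 times
    `small_unit` (10^4). Both labels are assumptions taken from the crawler.
    """
    value = value.strip()
    if big_unit in value:
        return float(value.replace(big_unit, "")) * 10000
    return float(value.replace(small_unit, ""))


payload = extract_payload(SAMPLE_RESPONSE)
print(payload["html"])
print(to_ten_thousand("31.9Billion"))
print(to_ten_thousand("8500ten thousand"))
```

Returning a float in both branches keeps every value in the same unit and type, so the numbers can go straight into a chart.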
Data visualization
The charts are built with the pyecharts module.
Gross box office chart
Judging by the trend, and with the two weekend days added, 4 billion is not just a dream.
Part of the code is as follows:
    from pyecharts import options as opts
    from pyecharts.charts import Line

    @staticmethod
    def line_base(l1, l2) -> str:
        # l1: Nezha rows, l2: rows for the other film; both are newest first.
        lh_list = [y["total_box"] for y in l2]
        lh_list.extend([0 for _ in range(3)])  # The first three days were 0.
        c = (
            Line(init_opts=opts.InitOpts(bg_color="", page_title="Gross box office"))
            .add_xaxis([y["date"] for y in reversed(l1)])
            .add_yaxis(
                "Nezha: Birth of the Demon Child",
                [y["total_box"] for y in reversed(l1)],
                is_smooth=True,
                markpoint_opts=opts.MarkPointOpts(data=[opts.MarkPointItem(type_="max")]),
            )
            .add_yaxis(
                "Hero of Fire",
                list(reversed(lh_list)),  # pass a list, not a reversed iterator
                is_smooth=True,
                markpoint_opts=opts.MarkPointOpts(data=[opts.MarkPointItem(type_="max")]),
            )
            .set_global_opts(
                title_opts=opts.TitleOpts(
                    title="Gross box office",
                    subtitle="Unit: ten thousand yuan",
                    subtitle_textstyle_opts={"color": "red"},
                ),
                toolbox_opts=opts.ToolboxOpts(),
            )
        )
        # render() writes the HTML file and returns its path.
        return c.render("line.html")
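The chart code pads the second film's series with three hard-coded zeros because it opened later. As a hedged alternative (not the author's method), the padding can be derived from the dates themselves. In this sketch `align_series` is a hypothetical helper, and the rows are invented placeholders in the same shape the crawler builds.

```python
def align_series(dates, rows, key="total_box", fill=0):
    """Return one value per date, using `fill` where a film has no row."""
    by_date = {row["date"]: row[key] for row in rows}
    return [by_date.get(day, fill) for day in dates]


# Invented placeholder rows, newest first, as the crawler returns them.
film_a = [
    {"date": "2019-08-03", "total_box": 30.0},
    {"date": "2019-08-02", "total_box": 20.0},
    {"date": "2019-08-01", "total_box": 10.0},
]
film_b = [{"date": "2019-08-03", "total_box": 5.0}]  # opened two days later

dates = [row["date"] for row in reversed(film_a)]  # oldest first for the x-axis
print(align_series(dates, film_a))
print(align_series(dates, film_b))
```

Looking values up by date means the series stay aligned even if the gap between the two release dates is not exactly three days.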
Next, look at the share of screenings.
Well, it tastes like a doughnut, as one basketball superstar said.
What about the box office share?
It had only 38% of the screenings, yet it took half of the total box office.
Nezha is so strong!
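To put a number on that comparison, one can divide the box-office share by the screening share; a value above 1 means the film earns more than its screen allocation alone would suggest. This is a small sketch using the rounded figures quoted above (about 50% and 38%). `parse_percent` and `share_efficiency` are hypothetical helper names, and the "38%" string format is an assumption about what the crawler stores.

```python
def parse_percent(text):
    """Turn a string such as "38%" into a fraction (0.38)."""
    return float(text.strip().rstrip("%")) / 100


def share_efficiency(box_office_share, screening_share):
    """Box-office share earned per unit of screening share (1.0 = proportional)."""
    if screening_share <= 0:
        raise ValueError("screening_share must be positive")
    return box_office_share / screening_share


efficiency = share_efficiency(parse_percent("50%"), parse_percent("38%"))
print(round(efficiency, 2))  # prints 1.32
```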