Programmer's algorithm fun Q62: the largest rectangle in the calendar

Keywords: Python datetime Calendar

Contents

1. Problem description

2. Problem solving analysis

2.1 matrix representation of calendar of each month

2.2 handling of holidays and compensatory public holidays

2.3 find the maximum rectangle

2.3.1 brute-force search

2.3.2 sliding window search

3. Code and test

4. Postscript

1. Problem description

 

2. Problem solving analysis

2.1 matrix representation of calendar of each month

         In Python, the calendar of a month can be obtained as a matrix with monthcalendar() from the calendar module. See another blog post, "Common and interesting usage of Python calendar module": https://blog.csdn.net/chenxy_bwave/article/details/121251954 . The following code prints the calendar in two ways. The first statement sets Sunday as the first day of the week.

import calendar
import numpy as np
calendar.setfirstweekday(calendar.SUNDAY)
calendar.prmonth(2014,4)
print(np.array(calendar.monthcalendar(2014,4)))

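         The output is as follows:

     April 2014
Su Mo Tu We Th Fr Sa
       1  2  3  4  5
 6  7  8  9 10 11 12
13 14 15 16 17 18 19
20 21 22 23 24 25 26
27 28 29 30
[[ 0  0  1  2  3  4  5]
 [ 6  7  8  9 10 11 12]
 [13 14 15 16 17 18 19]
 [20 21 22 23 24 25 26]
 [27 28 29 30  0  0  0]]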

2.2 handling of holidays and compensatory public holidays

    First consider holiday data.
    Holiday data is stored in a text file, one date per line in YYYY/MM/DD format (this is the form that readfile() in section 3 parses).

     The dates can be read line by line as strings and converted into datetime.datetime objects with Python's datetime module. The year, month and day are then extracted, and the corresponding elements of the matrix obtained in the previous section are set to 0 (meaning non-working days).
     Next, consider the weekend days that become working days to compensate for holidays. They are stored in the same format and can be handled in the same way, except that this time the corresponding element is set to a non-zero value (for example, 1).
     In the code below, the readfile() function reads the data from these two files, extracts the year, month and day of each date, and stores them in a dict with (year, month) as the key and a list of the days in that year and month as the value. The datetime module is used for this processing. For a brief introduction to the datetime module, see the blog post "Date time processing in Python: a practical example of the datetime package": https://blog.csdn.net/chenxy_bwave/article/details/120977111
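The parsing step described above can be sketched in isolation as follows. This is only a minimal illustration, not the code used in section 3: the helper name group_days_by_month and the sample dates are made up for the example, and each line is assumed to start with a zero-padded YYYY/MM/DD date.

```python
from collections import defaultdict
from datetime import datetime

def group_days_by_month(lines):
    """Group date strings by (year, month); hypothetical helper for illustration."""
    days = defaultdict(list)
    for line in lines:
        text = line.strip()
        if not text:
            continue  # ignore blank lines
        # Only the first 10 characters (YYYY/MM/DD) are parsed
        d = datetime.strptime(text[:10], "%Y/%m/%d")
        days[(d.year, d.month)].append(d.day)
    return dict(days)

# Made-up sample input, not the contents of the real holiday file
sample = ["2014/04/29\n", "2014/05/03\n", "2014/05/05\n"]
print(group_days_by_month(sample))  # {(2014, 4): [29], (2014, 5): [3, 5]}
```

Keying the dictionary by (year, month) means that the main loop can look up all special days of a month with a single dictionary access.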


2.3 find the maximum rectangle

2.3.1 brute-force search

         First of all, the problem of finding the maximum rectangle can certainly be solved by brute-force search.

         For example, how many rectangles in a 2*2 matrix have the top-left cell (0,0) of the matrix as their top-left corner? Exactly four (1*1, 1*2, 2*1 and 2*2). For a general n*m matrix, there are n*m rectangles with the cell (0,0) as their top-left corner. Scan these n*m rectangles, exclude every rectangle that contains a 0 element (or, more simply, treat its area as 0), and take the maximum area of the remaining ones. This gives the maximum area of "a rectangle whose top-left corner is the cell (0,0)". In the same way, we can find the maximum area of the rectangles whose top-left corner is (0,1), (0,2), ..., (1,0), (1,1), and so on. The maximum of these maxima is the area of the largest rectangle without a 0 element in the current matrix.
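Although this scheme is not used in the final code, a minimal sketch may make it concrete. The function name max_rect_brute_force and the small demo matrix are assumptions made for illustration; 0 marks a non-working day, as in the rest of this article.

```python
import numpy as np

def max_rect_brute_force(c):
    """Largest area of an all-nonzero rectangle in c (illustrative, not optimised)."""
    rows, cols = c.shape
    best = 0
    for i in range(rows):                  # row of the top-left corner
        for j in range(cols):              # column of the top-left corner
            for i2 in range(i, rows):      # row of the bottom-right corner
                for j2 in range(j, cols):  # column of the bottom-right corner
                    # A rectangle containing a 0 simply does not count
                    if np.all(c[i:i2 + 1, j:j2 + 1]):
                        best = max(best, (i2 - i + 1) * (j2 - j + 1))
    return best

# Small made-up matrix, not a real calendar
demo = np.array([[0, 1, 2, 0],
                 [3, 4, 5, 6],
                 [7, 8, 9, 0]])
print(max_rect_brute_force(demo))  # 6
```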

         What is the complexity of such a brute-force search? For simplicity, assume the original matrix is square with size n*n.

         First, the top-left cell of the rectangle can be any of the n*n cells.

         Secondly, the number of possible rectangles for each top-left cell depends on its coordinates. If its coordinates are (i,j), with i and j counted from 0, the number of possible rectangles is (n-i)*(n-j).

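         In this way, the total number of rectangles whose area needs to be evaluated is:

$$\sum_{i=0}^{n-1}\sum_{j=0}^{n-1}(n-i)(n-j) = \left(\frac{n(n+1)}{2}\right)^2$$

          That is on the order of n^4 rectangles, and each of them must also be checked for 0 elements. This scheme serves only as a baseline for comparison; it is not used in the code in section 3.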
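As a quick sanity check of the count, the per-cell term (n-i)*(n-j) can be summed directly and compared with the closed form (n(n+1)/2)^2, which follows because the double sum factors into two copies of 1 + 2 + ... + n. The function name count_rectangles is made up for this sketch.

```python
def count_rectangles(n):
    """Sum (n-i)*(n-j) over all top-left cells (i, j) of an n*n matrix."""
    return sum((n - i) * (n - j) for i in range(n) for j in range(n))

for n in (2, 5, 7):
    closed_form = (n * (n + 1) // 2) ** 2
    print(n, count_rectangles(n), closed_form)
# 2 9 9
# 5 225 225
# 7 784 784
```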

2.3.2 sliding window search

         Brute-force search is cell-based: each cell is considered in turn as the top-left corner of a rectangle. The problem can also be approached from another angle with a sliding window, that is, by sliding rectangular windows of different sizes and shapes over the calendar matrix. Because the maximum area is wanted, the windows are tried in order of decreasing area. The first window position that contains no 0 then gives the maximum rectangular area required by the original problem.

         Windows of several shapes can have the same area; for example, the 4*2 and 2*4 windows both have area 8. So first build a dictionary with the area as the key and the list of possible shapes as the value. The code is as follows:

# 2. Construct the dictionary mapping each area to the rectangle shapes with that area
area_shape = dict()
for i in range(1,6):
    for j in range(1,8):
        if i*j not in area_shape:
            area_shape[i*j] = []
        area_shape[i*j].append((i,j))
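The search that uses this dictionary can be sketched as a standalone function. This is an illustrative variant rather than the exact loop of section 3: the name max_rect_sliding and the returned tuple are assumptions, and the demo matrix is April 2014 with only the Sunday and Saturday columns set to 0 (no holidays applied).

```python
import numpy as np

def max_rect_sliding(c, area_shape):
    """Return (area, (i0, j0), (h, w)) of the first all-nonzero window, largest area first."""
    rows, cols = c.shape
    for area in sorted(area_shape, reverse=True):
        for h, w in area_shape[area]:
            for i0 in range(rows - h + 1):      # empty if the window is taller than c
                for j0 in range(cols - w + 1):  # empty if the window is wider than c
                    if np.all(c[i0:i0 + h, j0:j0 + w]):
                        return area, (i0, j0), (h, w)
    return 0, None, None  # no non-zero element at all

# Rebuild the area -> shapes dictionary so that the sketch is self-contained
area_shape = {}
for i in range(1, 6):
    for j in range(1, 8):
        area_shape.setdefault(i * j, []).append((i, j))

demo = np.array([[0,  0,  1,  2,  3,  4, 0],
                 [0,  7,  8,  9, 10, 11, 0],
                 [0, 14, 15, 16, 17, 18, 0],
                 [0, 21, 22, 23, 24, 25, 0],
                 [0, 28, 29, 30,  0,  0, 0]])
print(max_rect_sliding(demo, area_shape))  # (16, (0, 2), (4, 4))
```

Returning from the function as soon as a window is found avoids the chain of flag checks needed to leave several nested loops at once.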

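         With the above preparations, the processing flow for a month is as follows: generate the month matrix with monthcalendar() and set the Sunday and Saturday columns to 0; set the elements of holidays to 0; set the elements of extra workdays to a non-zero value; then try the window areas from 35 down to 1, sliding every window shape of each area over the matrix, and stop at the first window that contains no 0. The area of that window is the answer for the month.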

         Note 1: when resetting the values of the corresponding elements of holidays and extra workdays, its corresponding position in the matrix needs to be determined according to the date information. First, you need to determine the weekday (day of the week) corresponding to the first day of the current month, so that you can determine the position of the first day of the current month in the matrix, and then you can deduce the position of the specified date in the matrix. This processing corresponds to the following code (the processing of extra workday is the same):

        # Set holidays to 0
        if (y,m) in h:
            holidays = h[(y,m)]
            for hday in holidays:
                # Find the position of the current holiday in month calendar matrix
                i = (hday + fst_wkday - 1)//7
                j = (hday + fst_wkday - 1)%7
                c[i,j] = 0
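The index arithmetic can be checked against the calendar module itself. In this sketch the helper name day_position is hypothetical, and a calendar.Calendar instance with firstweekday=calendar.SUNDAY is used in place of the global setfirstweekday() call so that the example has no side effects.

```python
import calendar

def day_position(year, month, day):
    """Row and column of `day` in a Sunday-first month matrix (illustrative helper)."""
    first_wd, _ = calendar.monthrange(year, month)  # 0 = Monday ... 6 = Sunday
    offset = (first_wd + 1) % 7                     # column of day 1 when weeks start on Sunday
    return divmod(day + offset - 1, 7)

# Cross-check every day of April 2014 against a Sunday-first calendar matrix
cal = calendar.Calendar(firstweekday=calendar.SUNDAY)
weeks = cal.monthdayscalendar(2014, 4)
for d in range(1, 31):
    i, j = day_position(2014, 4, d)
    assert weeks[i][j] == d

print(day_position(2014, 4, 29))  # (4, 2)
```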

3. Code and test

# -*- coding: utf-8 -*-
"""
Created on Thu Nov 11 09:35:28 2021

@author: chenxy
"""

from datetime import datetime
import numpy as np
import calendar

#  Set SUNDAY to the first weekday
calendar.setfirstweekday(calendar.SUNDAY)
calendar.prmonth(2014,4)
print(np.array(calendar.monthcalendar(2014,4)))

def readfile(filename:str)->dict:
    '''
    Read holiday file and extra-workday file
    Parameters
    ----------
    filename : string        
    Returns
    -------
    A dictionary to store the data

    '''    
    print('Read {0} line by line, and store the dates into a dictionary...'.format(filename))
    dat = dict()
    with open(filename,'r') as f:
        for line in f:
            if not line.strip():
                continue # Skip blank lines
            # Only the first 10 characters (YYYY/MM/DD) are parsed, so the trailing '\n' is ignored
            date_object = datetime.strptime(line[:10], "%Y/%m/%d")
            y,m,d = date_object.year,date_object.month,date_object.day
            if (y,m) not in dat:
                dat[(y,m)] = []
            dat[(y,m)].append(d)
    return dat

# 1. Read the data file
h = readfile('q62-holiday.txt')
e = readfile('q62-extra-workday.txt')
    
# 2. Construct the dictionary mapping each area to the rectangle shapes with that area
area_shape = dict()
for i in range(1,6):
    for j in range(1,8):
        if i*j not in area_shape:
            area_shape[i*j] = []
        area_shape[i*j].append((i,j))
        
# 3. Loop over year/month to find the maximum rectangle of each month
max_area = dict()
for y in range(2014,2015):
    for m in range(4,7):
        # calendar.prmonth(y,m)
        c = np.array(calendar.monthcalendar(y,m))
        # Set the first and the last column to 0
        c[:,0] = 0
        c[:,6] = 0
        
        # print('The original month calendar:\n',c)
        # find the first weekday of the current month
        fst_wkday, num_days = calendar.monthrange(y, m)
        # monthrange() counts weekdays from Monday=0; shift so that Sunday=0,
        # which matches the Sunday-first matrix
        fst_wkday = (fst_wkday + 1)%7
        
        # Set holidays to 0
        if (y,m) in h:
            holidays = h[(y,m)]
            for hday in holidays:
                # Find the position of the current holiday in month calendar matrix
                i = (hday + fst_wkday - 1)//7
                j = (hday + fst_wkday - 1)%7
                c[i,j] = 0

        # Set extra-workday to 100--any positive value is OK
        if (y,m) in e:
            extras = e[(y,m)]
            for eday in extras:
                # Find the position of the current extra workday in month calendar matrix
                i = (eday + fst_wkday - 1)//7
                j = (eday + fst_wkday - 1)%7
                c[i,j] = 100        
        # print('The month calendar after holidays and extra workdays setting:\n',c)
        # Search for the maximum rectangular only covering workday
        n_rows, n_cols = c.shape # A month may span 4, 5 or 6 weeks
        found = False
        for a in range(35,0,-1):
            # print(a)
            if a in area_shape:
                ij_list = area_shape[a]
                for (i,j) in ij_list:
                    for i0 in range(n_rows-i+1): # Empty when the window is taller than the matrix
                        for j0 in range(n_cols-j+1):
                            rect = c[i0:i0+i,j0:j0+j]
                            # print(a,i,j,i0,j0, rect)
                            if np.all(rect):
                                max_area[(y,m)] = a
                                found = True
                                break
                        if found:
                            break
                    if found:
                        break
                if found:
                    break

print(max_area)

         Output: {(2014, 4): 16, (2014, 5): 20, (2014, 6): 16}

4. Postscript

        Because I was not familiar with date and calendar processing, I spent some time learning Python's calendar and datetime modules. Finding the maximum rectangle should be the core algorithm of this problem, yet it ended up overshadowed by the date and calendar handling.


Posted by senorfrog on Fri, 12 Nov 2021 08:06:40 -0800