Go Excel to protobuf and upgrade to python3

Keywords: Excel Attribute Google Python

Title: go Excel to protobuf to python3
categories: Go
tags: [go, excel, protobuf, python3]
date: 2020-01-15 16:27:24
comments: false

Go Excel to protobuf and upgrade to python3

Previous articles

The XLS? Deploy? Tool.py conversion tool found on the Internet is python2 version. Here, upgrade it to python3

Upgrade to python3

  1. py3 supports Chinese, does not need unicode conversion, kill

  2. String count is modified, comment.count("\n") is modified to str.count(comment, "\n")

  3. Log printing, print '' changed to print("")

  4. Exception exception handling? Handle exception baseexception, e change to exception baseexception as e

  5. protoc needs to use more than 3.0 to generate binary data of the serialized Pb = B'sss' string in XX [PB2. Py. Reference: https://github.com/protocolbuffers/protobuf/issues/4272

  6. Error reporting: No module named 'google'. Several Google libraries need to be installed. Please refer to: https://stackoverflow.com/questions/36183486/importerror-no-module-named-google

    $ pip3 install google
    $ pip3 install google-cloud   
    $ pip3 install google-cloud-vision
    
  7. Error reporting: No module named 'xlrd', excel parsing library needs to be installed

    $ pip3 install xlrd
    

Modified source XLS? Deploy? Tool? Py3.py

#! /usr/bin/env python
#coding=utf-8

##
# @file:   xls_deploy_tool.py
# @author: jameyli <lgy AT live DOT com>
# @Brief: XLS configuration import table tool

# Main functions:
#     1. Generate configuration definition, and automatically generate PB definition of configuration according to excel
#     2. Import configuration data to generate PB's serialized binary data or text data
#
# Explain:
#   1. The first four lines of Excel are used for structure definition, and the rest are data. They are distinguished according to the first line and explained respectively:
#       required must have attribute
#       optional properties
#           Line 2: attribute type
#           Line 3: property name
#           Line 4: Comments
#           Data lines: property values
#       Repeated indicates that the next attribute is repeated, that is, array
#           The second line: the maximum number of repeat. The attribute will be listed repeatedly in excel
#           On November 29, 2011, the second row was modified. If it is a type definition, it means that the column is repeated
#           But currently only plastic surgery is supported
#           Line 3: useless
#           Line 4: Comments
#           Data lines: actual number of repetitions
#       Required struct
#       Optional struct optional structure properties
#           Second line: number of structural elements
#           Line 3: structure name
#           Line 4: attribute name in the upper structure
#           Data line: not required

#    1  | required/optional | repeated  | required_struct/optional_struct   |
#       | ------------------| ---------:| ---------------------------------:|
#    2 | attribute type | number of structural elements|
#    3 | property name | structure type name|
#    4 | comment describes the property name of | in the upper structure|
#    5 attribute value|

#
#
# The design is ideal at the beginning. I hope the configuration definition and configuration data are consistent, and the same excel is used
# I don't know if it can be realized
#
# The function is basically implemented, and it is verified that ohye can be resolved through CPP
#
# 2011-06-17 revised:
#   Table name use uppercase
#   Structure definitions are underlined in uppercase
# 2011-06-20 bug modification:
#   Spaces in excel naming
#   Situation when repeated num = 0
# 2011-11-24 add function
#   Default value
# 2011-11-29 add function
# The second row of repeated indicates that the column is repeated if it is a type definition
# But currently only plastic surgery is supported

# TODO::
# 1 humanization of time allocation
# 2. Distinguish server/client configuration
# 3 repeated optimization
# 4 struct optimization

# Dependence:
# 1 protobuf
# 2 xlrd
##


import xlrd # for read excel
import sys
import os
import codecs

# Number of spaces in TAP
TAP_BLANK_NUM = 4

FILED_CS_BELONG_ROW = 0
FIELD_RULE_ROW = 1
# This line also indicates the maximum number of repetitions, or the number of structural elements
FIELD_TYPE_ROW = 2
FIELD_NAME_ROW = 3
FIELD_COMMENT_ROW = 4

OUTPUT_FULE_PATH_BASE="cfg_"

########################################LogHelp###########################################
class LogHelp :
    """Log auxiliary class"""
    _logger = None
    _close_imme = True

    @staticmethod
    def set_close_flag(flag):
        LogHelp._close_imme = flag

    @staticmethod
    def _initlog():
        import logging

        LogHelp._logger = logging.getLogger()
        logfile = 'xls_deploy_tool.log'
        hdlr = logging.FileHandler(logfile)
        formatter = logging.Formatter('%(asctime)s|%(levelname)s|%(lineno)d|%(funcName)s|%(message)s')
        hdlr.setFormatter(formatter)
        LogHelp._logger.addHandler(hdlr)
        LogHelp._logger.setLevel(logging.NOTSET)
        # LogHelp._logger.setLevel(logging.WARNING)

        LogHelp._logger.info("\n\n\n")
        LogHelp._logger.info("logger is inited!")

    @staticmethod
    def get_logger() :
        if LogHelp._logger is None :
            LogHelp._initlog()

        return LogHelp._logger

    @staticmethod
    def close() :
        if LogHelp._close_imme:
            import logging
            if LogHelp._logger is None :
                return
            logging.shutdown()

# log macro
LOG_DEBUG=LogHelp.get_logger().debug
LOG_INFO=LogHelp.get_logger().info
LOG_WARN=LogHelp.get_logger().warn
LOG_ERROR=LogHelp.get_logger().error


########################################SheetInterpreter###########################################
class SheetInterpreter:
    """adopt excel Configure build configured protobuf Definition file"""
    def __init__(self, xls_file_path, sheet_name, op2):
        self._xls_file_path = xls_file_path
        self._sheet_name = sheet_name
        self._cs_belong = op2.lower()

        try :
            self._workbook = xlrd.open_workbook(self._xls_file_path)
        except BaseException as e :
            print("open xls file(%s) failed!"%(self._xls_file_path))
            raise

        try :
            self._sheet =self._workbook.sheet_by_name(self._sheet_name)
        except BaseException as e :
            print("open sheet(%s) failed!"%(self._sheet_name))

        # Number of rows and columns
        self._row_count = len(self._sheet.col_values(0))
        self._col_count = len(self._sheet.row_values(0))

        self._row = 0
        self._col = 0

        # Write all the output to a list first, and then to a file uniformly
        self._output = []
        # Number of indented spaces in typesetting
        self._indentation = 0
        # Use list when field number structure is nested
        # Add a structure, add an element to the line, and pop up after the structure definition is completed
        self._field_index_list = [1]
        # Whether the current line is output to avoid duplicate definition of the same structure
        self._is_layout = True
        # Save the names of all structures
        self._struct_name_list = []

        self._pb_file_name = OUTPUT_FULE_PATH_BASE + sheet_name.lower() + ".proto"


    def Interpreter(self) :
        """External interface"""
        LOG_INFO("begin Interpreter, row_count = %d, col_count = %d", self._row_count, self._col_count)

        #File header of proto
        self._LayoutFileHeader()

        # Specify the syntax as proto2
        self._output.append("syntax = \"proto2\";\n\n")

        #Package name of proto
        self._output.append("package datacfg;\n")

        #Structural head
        self._LayoutStructHead(self._sheet_name)

        #Increase indentation
        self._IncreaseIndentation()

        while self._col < self._col_count :
            self._FieldDefine(0)

        #decrease indent
        self._DecreaseIndentation()

        #Generate structure tail
        self._LayoutStructTail()

        #Output array definition
        self._LayoutArray()

        #output to a file
        self._Write2File()

        LogHelp.close()

        # Convert PB to py
        try :
            command = "protoc --python_out=. " + self._pb_file_name
            os.system(command)
        except BaseException as e :
            print("protoc failed!")
            raise

    #Definition of each field
    def _FieldDefine(self, repeated_num) :
        LOG_INFO("row=%d, col=%d, repeated_num=%d", self._row, self._col, repeated_num)

        belong_rule = str(self._sheet.cell_value(FILED_CS_BELONG_ROW, self._col)).lower()
        field_rule = str(self._sheet.cell_value(FIELD_RULE_ROW, self._col))

        if belong_rule != self._cs_belong and belong_rule != "cs" and belong_rule != "":
            self._col += 1
            return False
        if (field_rule ==""):
            self._col += 1
            return False

        if field_rule == "required" or field_rule == "optional" :
            field_type = str(self._sheet.cell_value(FIELD_TYPE_ROW, self._col)).strip()
            field_name = str(self._sheet.cell_value(FIELD_NAME_ROW, self._col)).strip()
            field_comment = (self._sheet.cell_value(FIELD_COMMENT_ROW, self._col))

            LOG_INFO("%s|%s|%s|%s", field_rule, field_type, field_name, field_comment)
            comment = field_comment
            self._LayoutComment(comment)

            if repeated_num >= 1:
                field_rule = "repeated"

            self._LayoutOneField(field_rule, field_type, field_name)

            actual_repeated_num = 1 if (repeated_num == 0) else repeated_num
            self._col += actual_repeated_num

        elif field_rule == "repeated" :
            # 2011-11-29 revised
            # If the second row of repeated is a type definition, it means that the current field is repeated and the data is separated by semicolons in a single column
            second_row = str(self._sheet.cell_value(FIELD_TYPE_ROW, self._col)).strip()
            LOG_DEBUG("repeated|%s", second_row);
            # exel may have a decimal point
            if second_row.isdigit() or second_row.find(".") != -1 :
                # There will be a structure at the back
                repeated_num = int(float(second_row))
                LOG_INFO("%s|%d", field_rule, repeated_num)
                self._col += 1
                self._FieldDefine(repeated_num)
            else :
                # Generally, it is a simple single field, with semicolons separating the values
                field_type = second_row
                field_name = str(self._sheet.cell_value(FIELD_NAME_ROW, self._col)).strip()
                field_comment = (self._sheet.cell_value(FIELD_COMMENT_ROW, self._col))
                LOG_INFO("%s|%s|%s|%s", field_rule, field_type, field_name, field_comment)

                comment = field_comment
                self._LayoutComment(comment)

                self._LayoutOneField(field_rule, field_type, field_name)

                self._col += 1

        elif field_rule == "required_struct" or field_rule == "optional_struct":
            field_num = int(self._sheet.cell_value(FIELD_TYPE_ROW, self._col))
            # struct_name = str(self._sheet.cell_value(FIELD_NAME_ROW, self._col)).strip()
            # field_name = str(self._sheet.cell_value(FIELD_COMMENT_ROW, self._col)).strip()
            field_name = str(self._sheet.cell_value(FIELD_NAME_ROW, self._col)).strip()
            struct_name = "InternalType_" + field_name;
            field_comment = (self._sheet.cell_value(FIELD_COMMENT_ROW, self._col))
            comment = field_comment

            LOG_INFO("%s|%d|%s|%s", field_rule, field_num, struct_name, field_name)


            if (self._IsStructDefined(struct_name)) :
                self._is_layout = False
            else :
                self._struct_name_list.append(struct_name)
                self._is_layout = True

            self._col += 1
            col_begin = self._col
            self._StructDefine(struct_name, field_num, comment)
            col_end = self._col

            self._is_layout = True

            if repeated_num >= 1:
                field_rule = "repeated"
            elif field_rule == "required_struct":
                field_rule = "required"
            else:
                field_rule = "optional"

            self._LayoutOneField(field_rule, struct_name, field_name)

            actual_repeated_num = 1 if (repeated_num == 0) else repeated_num
            self._col += (actual_repeated_num-1) * (col_end-col_begin)
        else :
            self._col += 1
            return True
        return True

    def _IsStructDefined(self, struct_name) :
        for name in self._struct_name_list :
            if name == struct_name :
                return True
        return False

    def _StructDefine(self, struct_name, field_num, comment) :
        """nested structure definitions """
        # self._col += 1
        self._LayoutComment(comment)
        self._LayoutStructHead(struct_name)
        self._IncreaseIndentation()
        self._field_index_list.append(1)

        while field_num > 0 :
            if self._FieldDefine(0):
                field_num -= 1
        self._field_index_list.pop()
        self._DecreaseIndentation()
        self._LayoutStructTail()

    def _LayoutFileHeader(self) :
        """generate PB Description of the document"""
        self._output.append("/**\n")
        self._output.append("* @file:   " + self._pb_file_name + "\n")
        self._output.append("* @author: jameyli <jameyli AT tencent DOT com>\n")
        self._output.append("* @brief:  This file is generated automatically by the tool. It is not recommended to modify it manually \ n“)
        self._output.append("*/\n")
        self._output.append("\n")


    def _LayoutStructHead(self, struct_name) :
        """Generate structure header"""
        if not self._is_layout :
            return
        self._output.append("\n")
        self._output.append(" "*self._indentation + "message " + struct_name + "{\n")

    def _LayoutStructTail(self) :
        """Generate structure tail"""
        if not self._is_layout :
            return
        self._output.append(" "*self._indentation + "}\n")
        self._output.append("\n")

    def _LayoutComment(self, comment) :
        # Use C-style notes instead to prevent branches
        if not self._is_layout :
            return
        if str.count(comment, "\n") > 1 :
            if comment[-1] != '\n':
                comment = comment + "\n"
                comment = comment.replace("\n", "\n" + " " * (self._indentation + TAP_BLANK_NUM),
                        str.count(comment, "\n")-1 )
                self._output.append(" "*self._indentation + "/** " + comment + " "*self._indentation + "*/\n")
        else :
            self._output.append(" "*self._indentation + "/** " + comment + " */\n")

    def _LayoutOneField(self, field_rule, field_type, field_name) :
        """Output one line definition"""
        if not self._is_layout :
            return
        if field_name.find('=') > 0 :
            name_and_value = field_name.split('=')
            self._output.append(" "*self._indentation + field_rule + " " + field_type \
                    + " " + str(name_and_value[0]).strip() + " = " + self._GetAndAddFieldIndex()\
                    + " [default = " + str(name_and_value[1]).strip() + "]" + ";\n")
            return

        if (field_rule != "required" and field_rule != "optional") :
            self._output.append(" "*self._indentation + field_rule + " " + field_type \
                    + " " + field_name + " = " + self._GetAndAddFieldIndex() + ";\n")
            return

        if field_type == "int32" or field_type == "int64"\
                or field_type == "uint32" or field_type == "uint64"\
                or field_type == "sint32" or field_type == "sint64"\
                or field_type == "fixed32" or field_type == "fixed64"\
                or field_type == "sfixed32" or field_type == "sfixed64" \
                or field_type == "double" or field_type == "float" :
                    self._output.append(" "*self._indentation + field_rule + " " + field_type \
                            + " " + field_name + " = " + self._GetAndAddFieldIndex()\
                            + " [default = 0]" + ";\n")
        elif field_type == "string" or field_type == "bytes" :
            self._output.append(" "*self._indentation + field_rule + " " + field_type \
                    + " " + field_name + " = " + self._GetAndAddFieldIndex()\
                    + " [default = \"\"]" + ";\n")
        elif field_type == "DateTime" :
            field_type = "uint64"
            self._output.append(" "*self._indentation + field_rule + " " + field_type \
                    + " /*DateTime*/ " + field_name + " = " + self._GetAndAddFieldIndex()\
                    + " [default = 0]" + ";\n")
        elif field_type == "TimeDuration" :
            field_type = "uint64"
            self._output.append(" "*self._indentation + field_rule + " " + field_type \
                    + " /*TimeDuration*/ " + field_name + " = " + self._GetAndAddFieldIndex()\
                    + " [default = 0]" + ";\n")
        else :
            self._output.append(" "*self._indentation + field_rule + " " + field_type \
                    + " " + field_name + " = " + self._GetAndAddFieldIndex() + ";\n")
        return

    def _IncreaseIndentation(self) :
        """Increase indentation"""
        self._indentation += TAP_BLANK_NUM

    def _DecreaseIndentation(self) :
        """decrease indent"""
        self._indentation -= TAP_BLANK_NUM

    def _GetAndAddFieldIndex(self) :
        """Get the sequence number of the field, And add the serial number"""
        index = str(self._field_index_list[- 1])
        self._field_index_list[-1] += 1
        return index

    def _LayoutArray(self) :
        """Output array definition"""
        self._output.append("message " + self._sheet_name + "Array {\n")
        self._output.append("    repeated " + self._sheet_name + " items = 1;\n}\n")
        #self._output.append("    optional bytes xlsFileInfo = 2;\n}\n")

    def _Write2File(self) :
        """output to a file"""
        pb_file = codecs.open(self._pb_file_name, "w+", 'utf-8')
        pb_file.writelines(self._output)
        pb_file.close()


########################################DataParser###########################################
class DataParser:
    """analysis excel Data"""
    def __init__(self, xls_file_path, sheet_name, op2):
        self._xls_file_path = xls_file_path
        self._sheet_name = sheet_name
        self._cs_belong = op2

        try :
            self._workbook = xlrd.open_workbook(self._xls_file_path)
        except BaseException as e :
            print("open xls file(%s) failed!"%(self._xls_file_path))
            raise

        try :
            self._sheet =self._workbook.sheet_by_name(self._sheet_name)
        except BaseException as e :
            print("open sheet(%s) failed!"%(self._sheet_name))
            raise

        self._row_count = len(self._sheet.col_values(0))
        self._col_count = len(self._sheet.row_values(0))

        self._row = 0
        self._col = 0

        try:
            self._module_name = OUTPUT_FULE_PATH_BASE + self._sheet_name.lower() + "_pb2"
            sys.path.append(os.getcwd())
            exec('from '+self._module_name + ' import *');
            self._module = sys.modules[self._module_name]
        except BaseException as e :
            print("load module(%s) failed"%(self._module_name))
            raise

    def Parse(self) :
        """External interface:Analytical data"""
        LOG_INFO("begin parse, row_count = %d, col_count = %d", self._row_count, self._col_count)

        item_array = getattr(self._module, self._sheet_name+'Array')()

        # Find the column that defines the ID first
        id_col = 0
        for id_col in range(0, self._col_count) :
            info_id = str(self._sheet.cell_value(self._row, id_col)).strip()
            if info_id == "" :
                continue
            else :
                break

        #Since a row of CS ﹐ long has been added, start with Row 6 (index 5)
        for self._row in range(5, self._row_count) :
            # There's no need to skip and comment out, because the id in the best configuration table is not the real id, so it's not strictly required
            # If the id is empty, skip the line change,
            #info_id = str(self._sheet.cell_value(self._row, id_col)).strip()

            #if info_id == "" :
                #LOG_WARN("%d is None", self._row)
                #continue
            item = item_array.items.add()
            self._ParseLine(item)


        LOG_INFO("parse result:\n%s", item_array)


        #unicode(field_value).encode('utf-8')xls_file_path.decode('gb2312')
        #item_array.xlsFileInfo = (self._xls_file_path).decode('gb2312').encode('utf-8') + ": " + self._sheet_name.encode('utf-8')
        self._WriteReadableData2File(str(item_array))


        data = item_array.SerializeToString()

        self._WriteData2File(data)


        #Comment this line for test. By Kevin at 17:23:35, January 12, 2013
        LogHelp.close()


    def _ParseLine(self, item) :
        LOG_INFO("%d", self._row)

        self._col = 0
        while self._col < self._col_count :
            self._ParseField(0, 0, item)

    def _ParseField(self, max_repeated_num, repeated_num, item) :
        LOG_INFO("repeated_num: %s", repeated_num)
        # Parse skip logic
        belong_rule = str(self._sheet.cell_value(FILED_CS_BELONG_ROW, self._col)).lower()
        if belong_rule != self._cs_belong and belong_rule != "cs" and belong_rule != "":
            self._col += 1
            return False

        field_rule = str(self._sheet.cell_value(FIELD_RULE_ROW, self._col)).strip()

        if (field_rule == ""):
            self._col += 1
            return False

        if field_rule == "required" or field_rule == "optional" :
            field_name = str(self._sheet.cell_value(FIELD_NAME_ROW, self._col)).strip()
            if field_name.find('=') > 0 :
                name_and_value = field_name.split('=')
                field_name = str(name_and_value[0]).strip()
            field_type = str(self._sheet.cell_value(FIELD_TYPE_ROW, self._col)).strip()

            LOG_INFO("%d|%d", self._row, self._col)
            LOG_INFO("%s|%s|%s", field_rule, field_type, field_name)

            if max_repeated_num == 0 :
                field_value = self._GetFieldValue(field_type, self._row, self._col)
                # Set value only when there is value
                if field_value != None :
                    LOG_INFO("filed_name: %s, field_value is: %s", field_name,field_value)
                    item.__setattr__(field_name, field_value)
                else :
                    LOG_INFO("None value!")
                self._col += 1
            else :
                if repeated_num == 0 :
                    if field_rule == "required" :
                        print("required but repeated_num = 0")
                        raise
                else :
                    for col in range(self._col, self._col + repeated_num):
                        field_value = self._GetFieldValue(field_type, self._row, col)
                        # Set value only when there is value
                        if field_value != None :
                            item.__getattribute__(field_name).append(field_value)
                self._col += max_repeated_num

        elif field_rule == "repeated" :
            # 2011-11-29 revised
            # If the second row of repeated is a type definition, it means that the current field is repeated and the data is separated in a single column
            second_row = str(self._sheet.cell_value(FIELD_TYPE_ROW, self._col)).strip()
            LOG_DEBUG("repeated|%s", second_row);
            # exel may have a decimal point
            if second_row.isdigit() or second_row.find(".") != -1 :
                # There will be a structure at the back
                max_repeated_num = int(float(second_row))
                read = self._sheet.cell_value(self._row, self._col)
                repeated_num = 0 if read == "" else int(self._sheet.cell_value(self._row, self._col))

                LOG_INFO("%s|%d|%d", field_rule, max_repeated_num, repeated_num)

                if max_repeated_num == 0 :
                    print("max repeated num shouldn't be 0")
                    raise

                if repeated_num > max_repeated_num :
                    repeated_num = max_repeated_num

                self._col += 1
                self._ParseField(max_repeated_num, repeated_num, item)

            else :
                # Generally, it is a simple single field, with semicolons separating the values
                # Generally, it can only be numerical type
                field_type = second_row
                field_name = str(self._sheet.cell_value(FIELD_NAME_ROW, self._col)).strip()
                field_value_str = (self._sheet.cell_value(self._row, self._col))
                #field_value_str = unicode(self._sheet.cell_value(self._row, self._col)).strip()

                # LOG_INFO("%d|%d|%s|%s|%s",
                #         self._row, self._col, field_rule, field_type, field_name, field_value_str)

                #2013-01-24 jamey
                #Increase length judgment
                if len(field_value_str) > 0:
                    if field_value_str.find(";\n") > 0 :
                        field_value_list = field_value_str.split(";\n")
                    else :
                        field_value_list = field_value_str.split(";")

                    for field_value in field_value_list :
                        if field_type == "bytes" or field_type == "string" :
                            if (field_type == "bytes"):
                                item.__getattribute__(field_name).append(field_value.encode("utf8"))
                            else:
                                isFloatString = False
                                try:
                                    fValue = float(field_value)
                                    isFloatString = True
                                except BaseException as e :
                                    fvalue = 0.0
                               
                                if  isFloatString:
                                    check_field_value = int(fValue)
                                    if abs(fValue) - abs(check_field_value) < 0.0000000001:
                                        field_value = str(int(fValue))
                                item.__getattribute__(field_name).append(field_value.encode("utf8"))       
                        else:
                            if field_type == "double" or field_type == "float":
                                item.__getattribute__(field_name).append(float(field_value))
                            else:
                                item.__getattribute__(field_name).append(int(float(field_value)))
                self._col += 1

        elif field_rule == "required_struct" or field_rule == "optional_struct":
            field_num = int(self._sheet.cell_value(FIELD_TYPE_ROW, self._col))
            # struct_name = str(self._sheet.cell_value(FIELD_NAME_ROW, self._col)).strip()
            # field_name = str(self._sheet.cell_value(FIELD_COMMENT_ROW, self._col)).strip()
            field_name = str(self._sheet.cell_value(FIELD_NAME_ROW, self._col)).strip()
            struct_name = "InternalType_" + field_name;

            LOG_INFO("%s|%d|%s|%s", field_rule, field_num, struct_name, field_name)


            self._col += 1
            col_begin = self._col

            # Cycle at least once
            if max_repeated_num == 0 :
                struct_item = item.__getattribute__(field_name)
                self._ParseStruct(field_num, struct_item)

            else :
                if repeated_num == 0 :
                    if field_rule == "required_struct" :
                        print("required but repeated_num = 0")
                        raise
                    # Read before delete
                    struct_item = item.__getattribute__(field_name).add()
                    LOG_INFO("add struct_item 1")
                    self._ParseStruct(field_num, struct_item)
                    item.__getattribute__(field_name).__delitem__(-1)

                else :
                    for num in range(0, repeated_num):
                        struct_item = item.__getattribute__(field_name).add()
                        LOG_INFO("add struct_item 2")
                        self._ParseStruct(field_num, struct_item)

            col_end = self._col

            max_repeated_num = 1 if (max_repeated_num == 0) else max_repeated_num
            actual_repeated_num = 1 if (repeated_num==0) else repeated_num
            self._col += (max_repeated_num - actual_repeated_num) * ((col_end-col_begin)/actual_repeated_num)

        else :
            self._col += 1
            return True
        return True

    def _ParseStruct(self, field_num, struct_item) :
        """Nested structure data reading"""

        # Skip structure definition
        # self._col += 1
        while field_num > 0 :
            if self._ParseField(0, 0, struct_item) == True:
                field_num -= 1

    def _GetFieldValue(self, field_type, row, col) :
        """take pb Type to python type"""

        field_value = self._sheet.cell_value(row, col)
        LOG_INFO("%d|%d|%s", row, col, field_value)

        try:
            if field_type == "int32" or field_type == "int64"\
                    or  field_type == "uint32" or field_type == "uint64"\
                    or field_type == "sint32" or field_type == "sint64"\
                    or field_type == "fixed32" or field_type == "fixed64"\
                    or field_type == "sfixed32" or field_type == "sfixed64" :
                        if len(str(field_value).strip()) <=0 :
                            return None
                        else :
                            return int(field_value)
            elif field_type == "double" or field_type == "float" :
                    if len(str(field_value).strip()) <=0 :
                        return None
                    else :
                        return float(field_value)
            elif field_type == "string" :
                field_value = (field_value)
                isFloatString = False
                try:
                    fValue = float(field_value)
                    isFloatString = True
                except BaseException as e :
                    fvalue = 0.0
               
                if  isFloatString:
                    check_field_value = int(fValue)
                    if abs(fValue) - abs(check_field_value) < 0.0000000001:
                        field_value = str(int(fValue))

                if len(field_value) <= 0 :
                    return None
                else :
                    return field_value
            elif field_type == "bytes" :
                field_value = (field_value).encode('utf-8')
                if len(field_value) <= 0 :
                    return None
                else :
                    return field_value
            elif field_type == "DateTime" :
                field_value = (field_value).encode('utf-8')
                if len(field_value) <= 0 :
                    return 0
                else :
                    import time
                    time_struct = time.strptime(field_value, "%Y-%m-%d %H:%M:%S")
                    timt_stamp = int(time.mktime(time_struct))
                    return timt_stamp
            elif field_type == "TimeDuration" :
                field_value = (field_value).encode('utf-8')
                if len(field_value) <= 0 :
                    return 0
                else :
                    import datetime
                    import time
                    time_struct=0
                    try :
                        time_struct = time.strptime(field_value, "%HH")
                    except BaseException as e :
                        time_struct = time.strptime(field_value, "%jD%HH")
                    return 3600 * (time_struct.tm_yday * 24 + time_struct.tm_hour)
            elif field_type == "bool" :
                return bool(field_value)
            else :
                return None
        except BaseException as e :
            print("parse cell(%u, %u) error, please check it, maybe type is wrong."%(row, col))
            raise

    def _WriteData2File(self, data) :
        file_name = OUTPUT_FULE_PATH_BASE + self._sheet_name.lower() + ".bytes"
        file = open(file_name, 'wb+')
        file.write(data)
        file.close()

    def _WriteReadableData2File(self, data) :
        file_name = OUTPUT_FULE_PATH_BASE + self._sheet_name.lower() + ".txt"
        file = open(file_name, 'wb+')
        file.write(data.encode("UTF-8"))
        file.close()


########################################__main__###########################################
if __name__ == '__main__' :
    """Entrance"""
    if len(sys.argv) < 3 :
        print("Usage: %s sheet_name(should be upper) xls_file" %(sys.argv[0]))
        sys.exit(-1)

    # option 0 generates proto and data 1 generates proto 2 generates data
    op = 0
    belong_op = "c"

    if len(sys.argv) > 3 :
        belong_op= str(sys.argv[3])

    if len(sys.argv) > 4 :
        op = int(sys.argv[4])

    sheet_name =  sys.argv[1]

    '''
    if (not sheet_name.isupper()):
        print("sheet_name should be upper")
        sys.exit(-2)
    '''

    xls_file_path =  sys.argv[2]

    #print((xls_file_path.decode('gb2312')))

    if op == 0 or op == 1:
        try :
            tool = SheetInterpreter(xls_file_path, sheet_name, belong_op)
            tool.Interpreter()
        except BaseException as e :
            print("Interpreter Failed!!!")
            print(e)
            sys.exit(-3)

        print("Interpreter Success!!!")

    if op == 0 or op == 2:
        try :
            parser = DataParser(xls_file_path, sheet_name, belong_op)
            parser.Parse()
        except BaseException as e :
            print("Parse Failed!!!")
            print(e)
            sys.exit(-4)

        print("Parse Success!!!")
357 original articles published, 189 praised, 1.07 million visitors+
His message board follow

Posted by Birch on Thu, 23 Jan 2020 06:51:28 -0800