Java reads batch Excel files

Keywords: Java

1. First, basic knowledge:

Original link: https://blog.csdn.net/baidu_39298625/article/details/105842725

1: Introduction

Excel processing is often designed in development, such as exporting excel and importing excel into database. At present, there are two frameworks for operating excel, one is apache poi and the other is Java Excel

Introduction to Apache POI is a free, open source, cross platform Java API written in Java. Apache POI provides Java programs with the ability to read and write files in Microsoft Office (Excel, WORD, PowerPoint, Visio, etc.). POI is the acronym of "Poor Obfuscation Implementation", which means "poor fuzzy implementation".

Official homepage: http://poi.apache.org/index.html

API documentation: http://poi.apache.org/apidocs/index.html

Java Excel is an open source project. Through it, Java developers can read the contents of Excel files, create new Excel files, and update existing Excel files. jxl has gradually replaced POI excel because of its small and easy-to-use characteristics, and has become the first choice for more and more Java developers to generate Excel files.

Since apache poi is often used in projects, this blog only explains apache poi, not jxl

2: Classes commonly used by Apache POI

HSSF - provides the function of reading and writing files in Microsoft Excel XLS format.

XSSF - provides the function of reading and writing Microsoft Excel OOXML XLSX format files.

HWPF - provides the function of reading and writing files in Microsoft Word DOC97 format.

XWPF - provides the function of reading and writing files in Microsoft Word DOC2003 format.

HSLF - provides the ability to read and write files in Microsoft PowerPoint format.

HDGF - provides the ability to read files in Microsoft Visio format.

HPBF - provides the ability to read files in Microsoft Publisher format.

HSMF - provides the ability to read files in Microsoft Outlook format.

In the development, we often use HSSF to operate Excel to process tabular data, but we don't often use it for others.

HSSF is the abbreviation of Horrible SpreadSheet Format. Through HSSF, you can read, write and modify Excel files with pure Java code. HSSF provides two types of API s for read operations: usermodel and eventusermodel, namely "user model" and "event user model".

Common classes and methods

HSSF Workbook: a workbook that represents the entire document of an excel

  • HSSFWorkbook(); // Create a new workbook
  • HSSFWorkbook(InputStream inputStream); // Create a workbook associated with the input stream. You can package an excel file into a workbook
  • HSSFSheet createSheet(String sheetname); Create a new Sheet
  • HSSFSheet getSheet(String sheetName); Get Sheet by name
  • HSSFSheet getSheetAt(int index); // Get the Sheet through the index. The index starts from 0
  • HSSFCellStyle createCellStyle(); Create cell style
  • int getNumberOfSheets(); Get the number of sheet s
  • setActiveSheet(int index); Sets the default selected worksheet
  • write();
  • write(File newFile);
  • write(OutputStream stream);

HSSFSheet: worksheet

  • HSSFRow createRow(int rownum); To create a new line, you need to specify the line number, which starts from 0
  • HSSFRow getRow(int index); Gets the specified row based on the index
  • int addMergedRegion(CellRangeAddress region); merge cell
  • CellRangeAddress(int firstRow, int lastRow, int firstCol, int lastCol); Cell range is used to merge cells. You need to specify the first row, last row, first column and last column to merge.
  • autoSizeColumn(int column); Automatically adjust the width of the column to fit the content
  • getLastRowNum(); Get the index of the last row. If there is no row or only one row, return 0
  • setColumnWidth(int columnIndex, int width); Set the width of a column. Width = number of characters * 256. For example, the width of 20 characters is 20 * 256

HSSFRow: row

  • HSSFCell createCell(int column); Create a new cell
  • HSSFCell setCell(shot index);
  • HSSFCell getCell(shot index);
  • setRowStyle(HSSFCellStyle style); Set row style
  • short getLastCellNum(); Get the last cell number. If the cell has the first start, lastCellNum is the number of columns
  • setHeightInPoints(float height); Sets the height of the row

HSSFCell: cell

  • setCellValue(String value); Sets the value of the cell
  • setCellType(); Set the cell type, such as string, number, Boolean, etc
  • setCellStyle(); Set cell style
  • String getStringCellValue(); Gets the string value in the cell
  • setCellStyle(HSSFCellStyle style); Set cell styles, such as font, bold, formatting
  • setCellFormula(String formula); The calculation formula is set. The calculation result is used as the value of the cell. It also provides commonly used functions, such as sum(A1,C1), date function, string correlation function, CountIf and SumIf functions, random number function, etc

HSSFCellStyle: cell style

  • setFont(Font font); Set font style for cells
  • setAlignment(HorizontalAlignment align); // Set horizontal alignment
  • setVerticalAlignment(VerticalAlignment align); // Set vertical alignment
  • setFillPattern(FillPatternType fp);
  • setFillForegroundColor(short bg); set foreground color
  • setFillBackgroundColor(short bg); Set background color

HSSFFont: font

  • setColor(short color); // Set font color
  • setBold(boolean bold); // Set whether bold
  • setItalic(boolean italic); Set tilt
  • setUnderline(byte underline); Set underline

HSSFName: name

  • HSSFDataFormat: date formatting
  • HSSFHeader: the header of the Sheet
  • Hssffolder: the tail of the Sheet
  • HSSFDateUtil: Date tool
  • HSSFPrintSetup: Print Setup
  • HSSFErrorConstants: error information table

2. Simply implement an Excel tool class, which can be modified and referenced

import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
 
import java.io.File;
import java.io.FileInputStream;
import java.util.*;
 
 
 
public class ExcelUtil {
 
    private static final Logger logger = LoggerFactory.getLogger(ExcelUtil.class);
 
    private static final String XLS = ".xls";
    private static final String XLSX = ".xlsx";
 
    /**
     * Obtain the corresponding Workbook object according to the file suffix
     * @param filePath
     * @param fileType
     * @return
     */
    public static Workbook getWorkbook(String filePath,String fileType){
        Workbook workbook = null;
        FileInputStream fileInputStream = null;
        try{
            File excelFile = new File(filePath);
            if(!excelFile.exists()){
                logger.info(filePath+"file does not exist");
                return null;
            }
            fileInputStream = new FileInputStream(excelFile);
            if(fileType.equalsIgnoreCase(XLS)){
                workbook = new HSSFWorkbook(fileInputStream);
            }else if(fileType.equalsIgnoreCase(XLSX)){
                workbook = new XSSFWorkbook(fileInputStream);
            }
        }catch (Exception e){
            logger.error("Failed to get file",e);
        }finally {
            try {
                if (null != fileInputStream) {
                    fileInputStream.close();
                }
            } catch (Exception e) {
                logger.error("Error closing data stream! Error message:" , e);
                return null;
            }
        }
        return workbook;
    }
 
    public static List<Object> readFolder(String filePath){
        int fileNum = 0;
        File file = new File(filePath);
        List<Object> returnList = new ArrayList<>();
        List<Map<String,String>> resultList = new ArrayList<>();
        if (file.exists()) {
            File[] files = file.listFiles();
            for (File file2 : files) {
                if (file2.isFile()) {
                    resultList = readExcel(file2.getAbsolutePath());
                    returnList.add(resultList);
                    fileNum++;
                }
            } 
        } else {
            logger.info("Folder does not exist");
            return null;
        }
        logger.info("Common documents:"+fileNum);
        return returnList;
    }
 
    /**
     * Read Excel files in batches and return data objects
     * @param filePath
     * @return
     */
    public static List<Map<String,String>> readExcel(String filePath){
        Workbook workbook = null;
        List<Map<String,String>> resultList = new ArrayList<>();
        try{
            String fileType = filePath.substring(filePath.lastIndexOf("."));
            workbook = getWorkbook(filePath,fileType);
            if(workbook == null){
                logger.info("obtain workbook Object failed");
                return null;
            }
            resultList = analysisExcel(workbook);
            return resultList;
        }catch (Exception e){
            logger.error("read Excel File failed"+filePath+"error message",e);
            return null;
        }finally {
            try {
                if (null != workbook) {
                    workbook.close();
                }
            } catch (Exception e) {
                logger.error("Error closing data stream! Error message:" , e);
                return null;
            }
 
        }
    }
 
    /**
     * Parse Excel file and return data object
     * @param workbook
     * @return
     */
    public static List<Map<String,String>> analysisExcel(Workbook workbook){
        List<Map<String,String>> dataList = new ArrayList<>();
        int sheetCount = workbook.getNumberOfSheets();//Or take one Excel in sheet quantity
        for(int i = 0 ; i < sheetCount ; i ++){
            Sheet sheet = workbook.getSheetAt(i);
 
            if(sheet == null){
                continue;
            }
            int firstRowCount = sheet.getFirstRowNum();//Gets the sequence number of the first row
            Row firstRow = sheet.getRow(firstRowCount);
            int cellCount = firstRow.getLastCellNum();//Get the number of columns
 
            List<String> mapKey = new ArrayList<>();
 
            //Get header information and put it in List Medium standby
            if(firstRow == null){
                logger.info("analysis Excel Failed, no data was read in the first row!");
            }else {
                for (int i1 = 0; i1 < cellCount; i1++) {
                    mapKey.add(firstRow.getCell(i1).toString());
                }
            }
 
            //Parse each row of data to form a data object
            int rowStart = firstRowCount + 1;
            int rowEnd = sheet.getPhysicalNumberOfRows();
            for(int j = rowStart ; j < rowEnd ; j ++){
                Row row = sheet.getRow(j);//Get the corresponding row object
 
                if(row == null){
                    continue;
                }
 
                Map<String,String> dataMap = new HashMap<>();
                //Convert each row of data into one Map object
                dataMap = convertRowToData(row,cellCount,mapKey);
                dataList.add(dataMap);
            }
        }
        return dataList;
    }
 
    /**
     * Convert each row of data into a Map object
     * @param row Row object
     * @param cellCount Number of columns
     * @param mapKey  Header Map
     * @return
     */
    public static Map<String,String> convertRowToData(Row row,int cellCount,List<String> mapKey){
        if(mapKey == null){
            logger.info("No header information");
            return null;
        }
        Map<String,String> resultMap = new HashMap<>();
        Cell cell = null;
        for(int i = 0 ; i < cellCount ; i ++){
            cell = row.getCell(i);
            if(cell == null){
                resultMap.put(mapKey.get(i),"");
            }else {
                resultMap.put(mapKey.get(i),getCellVal(cell));
            }
        }
        return resultMap;
    }
 
    /**
     * Gets the value of the cell
     * @param cel
     * @return
     */
    public static String getCellVal(Cell cel) {
        if(cel.getCellType() == Cell.CELL_TYPE_STRING) {
            return cel.getRichStringCellValue().getString();
        }
        if(cel.getCellType() == Cell.CELL_TYPE_NUMERIC) {
            return cel.getNumericCellValue() + "";
        }
        if(cel.getCellType() == Cell.CELL_TYPE_BOOLEAN) {
            return cel.getBooleanCellValue() + "";
        }
        if(cel.getCellType() == Cell.CELL_TYPE_FORMULA) {
            return cel.getCellFormula() + "";
        }
        return cel.toString();
    }
 
 
    public static void main(String[] args) {
        //Read folders, batch parsing Excel file
        System.out.println("--------------------Read folders, batch parsing Excel file-----------------------");
        List<Object> returnList = readFolder("C:\\Users\\Administrator\\Desktop\\ExcelTest");
        for(int i = 0 ; i < returnList.size() ; i ++){
            List<Map<String,String>> maps = (List<Map<String,String>>)returnList.get(i);
            for(int j = 0 ; j < maps.size() ; j ++){
                System.out.println(maps.get(j).toString());
            }
            System.out.println("--------------------Hand fight List Cutting line-----------------------");
        }
 
        //Read a single file
        System.out.println("--------------------Read and parse individual files-----------------------");
        List<Map<String,String>> maps = readExcel("C:\\Users\\Administrator\\Desktop\\ExcelTest\\Student list.xlsx");
        for(int j = 0 ; j < maps.size() ; j ++){
            System.out.println(maps.get(j).toString());
        }
 
}

Posted by j0hn_ on Thu, 28 Oct 2021 17:05:20 -0700