Using PDFBox to convert PDF to images and fix Chinese characters rendered as boxes

Keywords: Apache Java Maven svn

Reference article: Troubleshooting garbled Chinese text in the STSong-Light font when PDFBox is used to convert PDF to images

The PDFBox version is 2.0.
If the log contains a message like this (for example, using fallback XXX for CID keyed font STSong-Light), the STSong-Light font is not installed on the system and PDFBox is rendering with the XXX font instead. If boxes appear in place of the characters, the font is missing and no substitute font was found either; the log contains a corresponding message for that case too.

The usual fix is to install the missing STSong-Light font. However, every font I could find online was plain STSong, and installing it had no effect. My system is Windows 10.
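
Before picking a substitute font, it can help to see which fonts Java itself can find on the machine. The following is a minimal JDK-only sketch (Java 7 compatible). The class FontCheckSketch and its hasFamily helper are made up for illustration, and the family names it looks for are assumptions. It lists AWT font family names, which is only a rough hint: PDFBox scans font files on its own and matches on PostScript names, so a font showing up here does not guarantee that PDFBox will pick it up.

```java
import java.awt.GraphicsEnvironment;
import java.util.Locale;

// Hypothetical helper, not part of PDFBox: checks the font families visible to Java/AWT.
class FontCheckSketch {

    /** Returns true if any name in families equals wanted, ignoring case. */
    static boolean hasFamily(String[] families, String wanted) {
        for (String family : families) {
            if (family.equalsIgnoreCase(wanted)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Request family names localized for English rather than the system locale.
        String[] families = GraphicsEnvironment.getLocalGraphicsEnvironment()
                .getAvailableFontFamilyNames(Locale.ENGLISH);
        // Assumed family names; the exact names depend on the installed font files.
        String[] wanted = {"STSong", "STFangsong"};
        for (String name : wanted) {
            System.out.println(name + " visible to Java: " + hasFamily(families, name));
        }
    }
}
```

If STFangsong is reported as missing here, it is probably worth choosing a different Chinese font that is installed as the substitute.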

Following the article above (it is worth reading), I modified FontMapperImpl to add a substitute mapping STSong-Light -> STFangsong.
Download the PDFBox source code from Apache and modify pdfbox\src\main\java\org\apache\pdfbox\pdmodel\font\FontMapperImpl.java:

final class FontMapperImpl implements FontMapper
{
    private static final FontCache fontCache = new FontCache(); // todo: static cache isn't ideal
    private FontProvider fontProvider;
    private Map<String, FontInfo> fontInfoByName;
    private final TrueTypeFont lastResortFont;

    /** Map of PostScript name substitutes, in priority order. */
    private final Map<String, List<String>> substitutes = new HashMap<String, List<String>>();

    FontMapperImpl()
    {
        // substitutes for standard 14 fonts
        substitutes.put("Courier",
                Arrays.asList("CourierNew", "CourierNewPSMT", "LiberationMono", "NimbusMonL-Regu"));
        substitutes.put("Courier-Bold",
                Arrays.asList("CourierNewPS-BoldMT", "CourierNew-Bold", "LiberationMono-Bold",
                        "NimbusMonL-Bold"));
        substitutes.put("Courier-Oblique",
                Arrays.asList("CourierNewPS-ItalicMT","CourierNew-Italic",
                        "LiberationMono-Italic", "NimbusMonL-ReguObli"));
        substitutes.put("Courier-BoldOblique",
                Arrays.asList("CourierNewPS-BoldItalicMT","CourierNew-BoldItalic",
                        "LiberationMono-BoldItalic", "NimbusMonL-BoldObli"));
        substitutes.put("Helvetica",
                Arrays.asList("ArialMT", "Arial", "LiberationSans", "NimbusSanL-Regu"));
        substitutes.put("Helvetica-Bold",
                Arrays.asList("Arial-BoldMT", "Arial-Bold", "LiberationSans-Bold",
                        "NimbusSanL-Bold"));
        substitutes.put("Helvetica-Oblique",
                Arrays.asList("Arial-ItalicMT", "Arial-Italic", "Helvetica-Italic",
                        "LiberationSans-Italic", "NimbusSanL-ReguItal"));
        substitutes.put("Helvetica-BoldOblique",
                Arrays.asList("Arial-BoldItalicMT", "Helvetica-BoldItalic",
                        "LiberationSans-BoldItalic", "NimbusSanL-BoldItal"));
        substitutes.put("Times-Roman",
                Arrays.asList("TimesNewRomanPSMT", "TimesNewRoman", "TimesNewRomanPS",
                        "LiberationSerif", "NimbusRomNo9L-Regu"));
        substitutes.put("Times-Bold",
                Arrays.asList("TimesNewRomanPS-BoldMT", "TimesNewRomanPS-Bold",
                        "TimesNewRoman-Bold", "LiberationSerif-Bold",
                        "NimbusRomNo9L-Medi"));
        substitutes.put("Times-Italic",
                Arrays.asList("TimesNewRomanPS-ItalicMT", "TimesNewRomanPS-Italic",
                        "TimesNewRoman-Italic", "LiberationSerif-Italic",
                        "NimbusRomNo9L-ReguItal"));
        substitutes.put("Times-BoldItalic",
                Arrays.asList("TimesNewRomanPS-BoldItalicMT", "TimesNewRomanPS-BoldItalic",
                        "TimesNewRoman-BoldItalic", "LiberationSerif-BoldItalic",
                        "NimbusRomNo9L-MediItal"));
        substitutes.put("Symbol", Arrays.asList("Symbol", "SymbolMT", "StandardSymL"));
        substitutes.put("ZapfDingbats", Arrays.asList("ZapfDingbatsITC", "Dingbats", "MS-Gothic"));
        // Added by believelelf: use STFangsong as a substitute for the missing STSong-Light
        substitutes.put("STSong-Light", Arrays.asList("STFangsong"));
        // Acrobat also uses alternative names for Standard 14 fonts, which we map to those above
        // these include names such as "Arial" and "TimesNewRoman"
...
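
The source comment describes the substitutes table as a list of replacement names "in priority order". The sketch below is a simplified, self-contained illustration of that idea and is not PDFBox's actual lookup code: the class SubstituteLookupSketch, its resolve method, and the set of "installed" font names are all hypothetical. It only shows why adding an entry for STSong-Light gives the lookup something to fall back to.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Simplified illustration only; this is not PDFBox's real lookup logic.
// The class name, resolve() and the "installed" set are hypothetical.
class SubstituteLookupSketch {
    private final Map<String, List<String>> substitutes = new HashMap<String, List<String>>();

    SubstituteLookupSketch() {
        substitutes.put("Courier",
                Arrays.asList("CourierNew", "CourierNewPSMT", "LiberationMono", "NimbusMonL-Regu"));
        // The mapping added in this article.
        substitutes.put("STSong-Light", Arrays.asList("STFangsong"));
    }

    /**
     * Returns the requested name if it is installed, otherwise the first installed
     * substitute in priority order, otherwise null.
     */
    String resolve(String requested, Set<String> installed) {
        if (installed.contains(requested)) {
            return requested;
        }
        List<String> candidates = substitutes.get(requested);
        if (candidates != null) {
            for (String candidate : candidates) {
                if (installed.contains(candidate)) {
                    return candidate;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Set<String> installed = new HashSet<String>(Arrays.asList("STFangsong", "LiberationMono"));
        SubstituteLookupSketch sketch = new SubstituteLookupSketch();
        System.out.println(sketch.resolve("STSong-Light", installed)); // STFangsong
        System.out.println(sketch.resolve("Courier", installed));      // LiberationMono
        System.out.println(sketch.resolve("Helvetica", installed));    // null
    }
}
```

In this toy model, removing the STSong-Light entry makes resolve return null for that font, which loosely corresponds to the case where no substitute is found and boxes are drawn.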

I compiled and packaged the source with Maven under JDK 7, following the instructions on the official website.
Using the modified JAR fixed the garbled text.
The modified JAR is shared on Baidu Netdisk
Extraction code: ir0z

Some notes on downloading the source code from the official website:

I checked out the source with SVN from http://svn.apache.org/repos/asf/pdfbox/tags/2.0.0/
Newer versions must be compiled with JDK 1.8. My project runs on 1.7, so I used version 2.0.0.

Maven compilation and packaging notes

Following the instructions, run mvn clean install in the .\pdfbox folder.
This also runs the tests, which failed on my machine, so I skipped them: mvn clean install -DskipTests

I'm a beginner. If you have a better solution or other ideas, please leave a comment.
I hope this article helps. Thank you.

Posted by tejama on Tue, 09 Jun 2020 20:53:59 -0700