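In real-world projects, some requirements involve working with various kinds of document files. Because my recent work included a number of Word and PDF operations, this article gives a brief introduction to the relevant techniques.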

1. PDF document preview

Free solutions for previewing Word, Excel and similar documents are almost all built on OpenOffice. For reference: an open-source document preview service.
(If paid products are an option, have a look at Yozo Office, Office 365 and the like.)

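This section shows how to preview PDFs with pdf.js (pdf.js download link, extraction code: ew60). The pdfjs folder has to be placed inside the project; I put it under webapp/resources/plugin, as shown in the figure below.

Usage (assuming pdfjs is served at $prefix/pdfjs):

Set the src attribute of the iframe on the preview page as follows: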

    "$prefix/pdfjs/web/viewer.html?file=" + encodeURIComponent(url)

The url above is the address from which the file can be accessed (e.g. http://localhost:8080/file/abc.txt).
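
If the iframe src is assembled on the server instead of in the browser, the same URL can be built in Java. The following is only a minimal sketch and not part of the original project: the class name PdfViewerUrl, the prefix value and the sample file URL are hypothetical. URLEncoder produces form encoding, where a space becomes "+", so the sketch rewrites it to "%20" to mirror what encodeURIComponent produces.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class PdfViewerUrl {

    /** Builds the viewer URL; prefix is the (assumed) path under which the pdfjs folder is served. */
    static String build(String prefix, String fileUrl) {
        try {
            // A literal "+" in the input is encoded as %2B, so any "+" left over stands for a space
            String encoded = URLEncoder.encode(fileUrl, "UTF-8").replace("+", "%20");
            return prefix + "/pdfjs/web/viewer.html?file=" + encoded;
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is available on every JVM, so this should never happen
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical prefix and file URL, for illustration only
        System.out.println(build("/app/resources/plugin", "http://localhost:8080/file/my report.pdf"));
    }
}
```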

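PS: when previewing with pdf.js, the viewer shows download and print buttons by default. If the project needs permission control over these two buttons, that has to be handled in JavaScript. (If the buttons are not needed at all, simply find id="download", id="print", id="secondaryPrint" and id="secondaryDownload" in pdfjs/web/viewer.html and add style="display: none;" to each of them.)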

Hiding the download and print buttons dynamically in JavaScript, as shown in the figure below:

2. Document processing: replacing placeholders in a Word document

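Placeholder replacement in Word documents generally covers two cases:
① replacing placeholders in paragraphs (headings, body paragraphs and every other element except tables)
② replacing placeholders in tables
The relevant methods are as follows: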

import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.xwpf.usermodel.*;
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Replaces paragraph and table placeholders in a document
 *
 * @author zhuLong
 * @since 2020/6/5 9:54
 */
public class WordReplaceUtil {

    /**
     * Replaces the placeholders in all paragraphs of a document
     *
     * @param doc    the document to process
     * @param params replacement parameters, key = placeholder, value = actual value
     */
    public static void replaceInPara(XWPFDocument doc, Map<String, Object> params) {
        Iterator<XWPFParagraph> iterator = doc.getParagraphsIterator();
        XWPFParagraph para;
        while (iterator.hasNext()) {
            para = iterator.next();
            // Skip empty paragraphs (plain JDK check, no StringUtils dependency needed)
            String text = para.getParagraphText();
            if (text != null && !text.isEmpty()) {
                replaceInPara(para, params);
            }
        }
    }

    /**
     * Replaces the placeholders in a single paragraph
     *
     * @param para   the paragraph to process
     * @param params replacement parameters, key = placeholder, value = actual value
     */
    public static void replaceInPara(XWPFParagraph para, Map<String, Object> params) {
        // Get the text of the current paragraph
        String sourceText = para.getParagraphText();
        // Tracks whether anything was replaced
        boolean replace = false;
        for (Map.Entry<String, Object> entry : params.entrySet()) {
            String key = entry.getKey();
            if (sourceText.contains(key)) {
                Object value = entry.getValue();
                // Only String values are handled here
                if (value instanceof String) {
                    // Replace the text placeholder
                    sourceText = sourceText.replace(key, value.toString());
                    replace = true;
                }
            }
        }
        if (replace) {
            Integer fontSize = null;
            boolean isBold = false;
            // Get the runs of the paragraph
            List<XWPFRun> runList = para.getRuns();
            for (int i = runList.size() - 1; i >= 0; i--) {
                if (runList.get(i).getFontSize() > 0) {
                    fontSize = runList.get(i).getFontSize();
                }
                if (runList.get(i).isBold()) {
                    isBold = runList.get(i).isBold();
                }

                // Remove the old run
                para.removeRun(i);
            }
            // Create a new run holding the replaced text. Apart from bold and font size,
            // the styles of the old runs are lost this way; to be improved.
            XWPFRun run = para.createRun();
            run.setBold(isBold);
            run.setText(sourceText);
            if (fontSize != null) {
                run.setFontSize(fontSize);
            }
        }
    }

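    /**
     * Replaces the placeholders in tables
     *
     * @param doc    the document to process
     * @param params replacement parameters, key = placeholder, value = actual value
     */
    public static void replaceTable(XWPFDocument doc, Map<String, Object> params) {
        // Get all tables in the document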
        Iterator<XWPFTable> iterator = doc.getTablesIterator();
        XWPFTable table;
        List<XWPFTableRow> rows;
        List<XWPFTableCell> cells;
        List<XWPFParagraph> paras;
        while (iterator.hasNext()) {
            table = iterator.next();
            if (table.getRows().size() > 1) {
                // Decide whether the table needs replacement or row insertion: a ${...} placeholder means replacement
                if (matcher(table.getText()).find()) {
                    rows = table.getRows();
                    for (XWPFTableRow row : rows) {
                        cells = row.getTableCells();
                        for (XWPFTableCell cell : cells) {
                            paras = cell.getParagraphs();
                            for (XWPFParagraph para : paras) {
                                replaceInPara(para, params);
                            }
                        }
                    }
                }
            }
        }
    }

    /**
     * Matches ${...} placeholders with a regular expression
     *
     * @param str the text to search
     * @return a matcher over the text
     */
    private static Matcher matcher(String str) {
        Pattern pattern = Pattern.compile("\\$\\{(.+?)\\}", Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(str);
        return matcher;
    }

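    /**
     * Sample values to substitute
     */
    private static Map<String, Object> createParamsMap() {
        Map<String, Object> map = new HashMap<String, Object>();
        map.put("${name}", "abc");
        map.put("${sex}", "male");
        return map;
    }

    public static void main(String[] args) throws Exception {
        File mainFile = new File("C:\\Users\\zhulong\\Desktop\\abc.docx");
        InputStream in = new FileInputStream(mainFile);
        OPCPackage srcPackage = OPCPackage.open(in);
        XWPFDocument doc = new XWPFDocument(srcPackage);
        WordReplaceUtil.replaceTable(doc, createParamsMap());
        // Write the result out, otherwise the replacements only exist in memory.
        // The output path is an assumed example.
        try (java.io.OutputStream out = new java.io.FileOutputStream("C:\\Users\\zhulong\\Desktop\\abc_out.docx")) {
            doc.write(out);
        }
        doc.close();
    }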
}
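
The ${...} matching used by the class above can be tried out on plain strings, without Apache POI. The following is a minimal sketch under that assumption, not the author's code: the class name PlaceholderDemo and the fill method are hypothetical. It reuses the same regular expression and the same key convention (the map keys include "${" and "}") and leaves unknown placeholders untouched.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PlaceholderDemo {

    // Same pattern as in WordReplaceUtil: "${", at least one character (non-greedy), "}"
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{(.+?)\\}");

    /** Replaces every known placeholder in text; unknown placeholders are kept as they are. */
    static String fill(String text, Map<String, Object> params) {
        Matcher m = PLACEHOLDER.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // m.group() is the whole placeholder, e.g. "${name}", which is also the map key
            Object value = params.get(m.group());
            String replacement = value != null ? value.toString() : m.group();
            // quoteReplacement keeps "$" and "\" in the value from being read as group references
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> params = new HashMap<String, Object>();
        params.put("${name}", "abc");
        System.out.println(fill("Name: ${name}, sex: ${sex}", params));
    }
}
```

Unlike the String.replace loop in replaceInPara, this scans the text once and also accepts non-String values through toString().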

3. Document processing: adding rows of data to a Word table

import cn.hutool.core.util.ArrayUtil;
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.xwpf.usermodel.*;
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.List;

/**
 * Word document operations
 *
 * @author zhuLong
 * @since 2020/6/29 9:16
 */
public class WordUtils {
    /**
     * Inserts a row at the given position of a Word table, copying the style of an existing row
     *
     * @param table        the table to modify
     * @param copyrowIndex index of the row whose style is copied
     * @param newrowIndex  index at which the new row is inserted
     * @param datas        cell texts of the new row
     */
    public static void insertRow(XWPFTable table, int copyrowIndex, int newrowIndex, String[] datas) {
        // Fetch the row to copy first, so that inserting the new row cannot shift its index
        XWPFTableRow copyRow = table.getRow(copyrowIndex);
        // Insert a new row at the given position of the table
        XWPFTableRow targetRow = table.insertNewTableRow(newrowIndex);
        // Copy the row properties
        targetRow.getCtRow().setTrPr(copyRow.getCtRow().getTrPr());
        // Get the cells of the row to copy
        List<XWPFTableCell> copyCells = copyRow.getTableCells();
        // Copy the cells
        XWPFTableCell targetCell = null;
        for (int i = 0; i < copyCells.size(); i++) {
            XWPFTableCell copyCell = copyCells.get(i);
            targetCell = targetRow.addNewTableCell();
            targetCell.getCTTc().setTcPr(copyCell.getCTTc().getTcPr());
            if (copyCell.getParagraphs() != null && copyCell.getParagraphs().size() > 0) {
                targetCell.getParagraphs().get(0).getCTP().setPPr(copyCell.getParagraphs().get(0).getCTP().getPPr());
                if (copyCell.getParagraphs().get(0).getRuns() != null
                        && copyCell.getParagraphs().get(0).getRuns().size() > 0) {
                    XWPFRun cellR = targetCell.getParagraphs().get(0).createRun();
                    cellR.setBold(copyCell.getParagraphs().get(0).getRuns().get(0).isBold());
                    // Fill in the text only when a value exists for this column
                    if (ArrayUtil.isNotEmpty(datas) && i < datas.length) {
                        cellR.setText(datas[i]);
                        cellR.setFontSize(10);
                    }
                }
            }
        }

    }

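    public static void main(String[] args) throws Exception{
        File mainFile = new File("C:\\Users\\zhulong\\Desktop\\abc.docx");
        InputStream in = new FileInputStream(mainFile);
        OPCPackage srcPackage = OPCPackage.open(in);
        XWPFDocument doc = new XWPFDocument(srcPackage);
        // Insert a row dynamically
        List<XWPFTable> tables = doc.getTables(); // all tables in the Word document
        XWPFTable table = tables.get(0); // the first table
        String[] datas = new String[5];
        datas[0] = "a";
        datas[1] = "b";
        datas[2] = "c";
        datas[3] = "d";
        datas[4] = "e";
        WordUtils.insertRow(table, 1, 2, datas);
        // Write the result out, otherwise the new row only exists in memory.
        // The output path is an assumed example.
        try (java.io.OutputStream out = new java.io.FileOutputStream("C:\\Users\\zhulong\\Desktop\\abc_out.docx")) {
            doc.write(out);
        }
        doc.close();
    }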
}

4. Document processing: PDF operations

import com.itextpdf.text.Document;
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.Image;
import com.itextpdf.text.pdf.*;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.rendering.PDFRenderer;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.*;
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Utility class for PDF operations
 *
 * @author zhuLong
 * @since 2020/6/18 17:03
 */
public class PdfUtil {

    private static final Logger logger = LoggerFactory.getLogger(PdfUtil.class);

    /**
     * Merges PDF files.
     *
     * @param files   absolute paths of the files to merge, e.g. { "e:\\1.pdf", "e:\\2.pdf", "e:\\3.pdf" };
     *                they are merged in array order, so 2.pdf comes after 1.pdf
     * @param newfile absolute path of the merged output file, e.g. e:\\temp\\tempNew.pdf
     * @return true if the merge succeeded, false otherwise
     */
    public static boolean mergePdfFiles(String[] files, String newfile) {
        boolean retValue = false;
        Document document = null;
        try {
            // Use the first page of the first file as the page size of the merged document
            PdfReader first = new PdfReader(files[0]);
            document = new Document(first.getPageSize(1));
            first.close();
            PdfCopy copy = new PdfCopy(document, new FileOutputStream(newfile));
            document.open();
            for (int i = 0; i < files.length; i++) {
                PdfReader reader = new PdfReader(files[i]);
                int n = reader.getNumberOfPages();
                for (int j = 1; j <= n; j++) {
                    PdfImportedPage page = copy.getImportedPage(reader, j);
                    copy.addPage(page);
                }
                // Flush the copied pages and release the reader
                copy.freeReader(reader);
                reader.close();
            }
            retValue = true;
        } catch (Exception e) {
            logger.error("Failed to merge PDF files", e);
        } finally {
            logger.info("Merge finished");
            if (document != null) {
                document.close();
            }
        }
        return retValue;
    }

    /**
     * Converts a PDF to PNG. Only the first page is rendered.
     *
     * @param inputStream input stream of the PDF file
     * @return a temporary PNG file, or null if the conversion failed
     * @author zhuLong
     * @since 2020/6/19 14:17
     */
    public static File pdf2png(InputStream inputStream) {
        // try-with-resources closes the document once rendering is done
        try (PDDocument doc = PDDocument.load(inputStream)) {
            if (doc.getNumberOfPages() == 0) {
                return null;
            }
            PDFRenderer renderer = new PDFRenderer(doc);
            // 144 DPI is twice the 72 DPI base resolution of PDF
            BufferedImage image = renderer.renderImageWithDPI(0, 144);
            // Create a temporary file
            File temp = File.createTempFile("myTempFile", ".png");
            ImageIO.write(image, "png", temp);
            return temp;
        } catch (IOException e) {
            logger.error("Failed to convert PDF to PNG", e);
        }
        return null;
    }

    /**
     * Converts a PDF to PNG
     *
     * @param filePath path of the PDF file
     * @author zhuLong
     * @since 2020/6/18 23:00
     */
    public static File pdf2png(String filePath) throws FileNotFoundException {
        // Open the PDF and delegate to the stream-based conversion
        File file = new File(filePath);
        InputStream inputStream = new FileInputStream(file);
        return pdf2png(inputStream);
    }

    /**
     * Converts a PDF to PNG
     *
     * @param file the PDF file
     * @author zhuLong
     * @since 2020/6/18 23:00
     */
    public static File pdf2png(File file) throws FileNotFoundException {
        // Open the PDF and delegate to the stream-based conversion
        InputStream inputStream = new FileInputStream(file);
        return pdf2png(inputStream);
    }

    /**
     * Converts an image to a PDF
     *
     * @param imgPath path of the image
     * @param pdf     the PDF file to generate
     * @author zhuLong
     * @since 2020/6/18 23:00
     */
    public static void image2pdf(String imgPath, File pdf) throws DocumentException, IOException {
        Document document = new Document();
        OutputStream os = new FileOutputStream(pdf);
        PdfWriter.getInstance(document, os);
        document.open();
        // createPdf closes the document when it is done
        createPdf(document, imgPath);
    }

    private static void createPdf(Document document, String imgPath) {
        try {
            Image image = Image.getInstance(imgPath);
            float documentWidth = document.getPageSize().getWidth() - document.leftMargin() - document.rightMargin();
            // Fixed 580:850 width-to-height ratio; images with a different ratio will be stretched
            float documentHeight = documentWidth / 580 * 850;
            logger.debug("Scaled image size: {} x {}", documentWidth, documentHeight);
            // Resize the image to the computed width and height
            image.scaleAbsolute(documentWidth, documentHeight);
            document.add(image);
        } catch (Exception e) {
            logger.error("Failed to add the image to the PDF", e);
        } finally {
            if (document != null) {
                document.close();
            }
        }
    }

    /**
     * Adds watermark text to a PDF
     *
     * @param contentList list of maps holding the x and y coordinates and the text
     * @param file        source PDF file
     * @param fontSize    font size
     * @author zhuLong
     * @since 2020/6/19 9:40
     */
    public static File setWatermark(List<Map<String, Object>> contentList, File file, float fontSize) throws FileNotFoundException {
        InputStream in = new FileInputStream(file);
        return setWatermark(contentList, in, fontSize);
    }

    /**
     * Adds watermark text to a PDF
     *
     * @param contentList list of maps holding the x and y coordinates and the text
     * @param inputStream input stream of the source PDF file
     * @param fontSize    font size
     * @author zhuLong
     * @since 2020/6/19 9:40
     */
    public static File setWatermark(List<Map<String, Object>> contentList, InputStream inputStream, float fontSize) {
        PdfReader reader = null;
        PdfStamper stamper = null;
        try {
            reader = new PdfReader(inputStream);

            // Create a temporary file
            File dest = File.createTempFile("pdf", ".pdf");
            stamper = new PdfStamper(reader, new FileOutputStream(dest));
            // Draw on top of the page content, at positions that do not cover existing text; only the first page is handled
            PdfContentByte content = stamper.getOverContent(1);
            content.saveState();
            content.fill();
            content.restoreState();
            // STSong-Light is a CJK font; it is expected to need the iText Asian font pack (itext-asian) on the classpath
            BaseFont base = BaseFont.createFont("STSong-Light", "UniGB-UCS2-H", BaseFont.NOT_EMBEDDED);
            // Start writing text
            content.beginText();
            // Font and font size
            content.setFontAndSize(base, fontSize);
            if (!contentList.isEmpty()) {
                for (Map<String, Object> map : contentList) {
                    // Set the output position of the text
                    int x = Integer.parseInt(map.get("x").toString());
                    int y = Integer.parseInt(map.get("y").toString());
                    content.setTextMatrix(x, y);
                    content.showText(map.get("contentText").toString());
                }
            }
            content.endText();
            return dest;
        } catch (Exception e) {
            logger.error("Error while processing the PDF file", e);
        } finally {
            try {
                if (stamper != null) {
                    stamper.close();
                }
                if (reader != null) {
                    reader.close();
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        File file = new File("C:\\Users\\zhulong\\Desktop\\1.pdf");
        List<Map<String, Object>> list = new ArrayList<Map<String, Object>>();
        Map<String, Object> map1 = new HashMap<String, Object>();
        map1.put("x", 480);
        map1.put("y", 690);
        map1.put("contentText", "10001");
        list.add(map1);
        Map<String, Object> map2 = new HashMap<String, Object>();
        map2.put("x", 425);
        map2.put("y", 690);
        map2.put("contentText", LocalDateTime.now().getYear());
        list.add(map2);
        File file1 = setWatermark(list, file, 14);
        System.out.println(file1.getAbsolutePath());
    }

}
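
The DPI argument passed to renderImageWithDPI decides how large the rendered image becomes. PDF page sizes are measured in points, with 72 points per inch by default, so the scale factor is DPI / 72. The following is a minimal sketch of that arithmetic only: the class name DpiMath is hypothetical, and rounding down is an assumption, so the renderer's actual output may differ by a pixel.

```java
class DpiMath {

    // Default PDF user space: 72 points per inch
    static final float POINTS_PER_INCH = 72f;

    /** Converts a length in PDF points to pixels at the given DPI (rounded down, an assumption). */
    static int toPixels(float points, float dpi) {
        return (int) Math.floor(points * dpi / POINTS_PER_INCH);
    }

    public static void main(String[] args) {
        // An A4 page is 595 x 842 points
        float widthPt = 595f;
        float heightPt = 842f;
        System.out.println(toPixels(widthPt, 144f) + " x " + toPixels(heightPt, 144f));
    }
}
```

At 144 DPI every point becomes two pixels, so an A4 page comes out at roughly twice its point dimensions; raising the DPI gives sharper images at the cost of larger files.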

5. Cropping and stitching images

import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.File;

/**
 * Utility class for image operations
 *
 * @author zhuLong
 * @since 2020/6/18 23:15
 */
public class ImageUtil {

    /**
     * Crops an image
     *
     * @param imgIn  the image to crop
     * @param imgOut the cropped image
     */
    public static void cutImg(File imgIn, File imgOut, int x, int y, int width, int height) {
        try {
            BufferedImage bufferedImage = ImageIO.read(imgIn);
            BufferedImage back = bufferedImage.getSubimage(x, y, width, height);
            ImageIO.write(back, "png", imgOut);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    /**
     * Stitches several images together vertically
     *
     * @param pics    the image files to stitch
     * @param type    image format of the output, e.g. "png"
     * @param dstFile destination file
     */
    public static void merge(File[] pics, String type, File dstFile) {

        int len = pics.length;
        if (len < 1) {
            System.out.println("pics len < 1");
            return;
        }
        BufferedImage[] images = new BufferedImage[len];
        int[][] imageArrays = new int[len][];
        for (int i = 0; i < len; i++) {
            try {
                images[i] = ImageIO.read(pics[i]);
            } catch (Exception e) {
                e.printStackTrace();
                return;
            }
            int width = images[i].getWidth();
            int height = images[i].getHeight();
            // Read the RGB values of the image
            imageArrays[i] = new int[width * height];
            imageArrays[i] = images[i].getRGB(0, 0, width, height,
                    imageArrays[i], 0, width);
        }

        // The result is as wide as the widest image and as tall as all images combined
        int dstHeight = 0;
        int dstWidth = images[0].getWidth();
        for (int i = 0; i < images.length; i++) {
            dstWidth = Math.max(dstWidth, images[i].getWidth());
            dstHeight += images[i].getHeight();
        }
        if (dstHeight < 1) {
            System.out.println("dst_height < 1");
            return;
        }

        // Build the new image
        try {
            BufferedImage imageNew = new BufferedImage(dstWidth, dstHeight,
                    BufferedImage.TYPE_INT_RGB);
            int offsetY = 0;
            for (int i = 0; i < images.length; i++) {
                // Each pixel array was read with the image's own width, so write it back with that same width
                imageNew.setRGB(0, offsetY, images[i].getWidth(), images[i].getHeight(),
                        imageArrays[i], 0, images[i].getWidth());
                offsetY += images[i].getHeight();
            }
            // Write the image
            ImageIO.write(imageNew, type, dstFile);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
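
The vertical stitching can be tried entirely in memory, without reading or writing files. The following is a minimal sketch of the same idea, not the author's code: the class name VerticalMergeDemo and the stack method are hypothetical. The point it illustrates is that each image's pixels are read and written with that image's own width as the scan size, which keeps images of different widths from being skewed.

```java
import java.awt.image.BufferedImage;

class VerticalMergeDemo {

    /** Stacks the images top to bottom; narrower images leave the remaining width black. */
    static BufferedImage stack(BufferedImage... images) {
        if (images.length == 0) {
            throw new IllegalArgumentException("no images");
        }
        int width = 0;
        int height = 0;
        for (BufferedImage img : images) {
            width = Math.max(width, img.getWidth());
            height += img.getHeight();
        }
        BufferedImage out = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        int offsetY = 0;
        for (BufferedImage img : images) {
            int w = img.getWidth();
            int h = img.getHeight();
            // Read and write with the image's own width as the scan size
            int[] rgb = img.getRGB(0, 0, w, h, null, 0, w);
            out.setRGB(0, offsetY, w, h, rgb, 0, w);
            offsetY += h;
        }
        return out;
    }

    public static void main(String[] args) {
        BufferedImage a = new BufferedImage(2, 1, BufferedImage.TYPE_INT_RGB);
        BufferedImage b = new BufferedImage(3, 2, BufferedImage.TYPE_INT_RGB);
        BufferedImage merged = stack(a, b);
        System.out.println(merged.getWidth() + "x" + merged.getHeight()); // prints "3x3"
    }
}
```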

6. Converting Word documents

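Free Spire.Doc for Java is a free, professional Java component for Word. Developers can use it to integrate creating, reading, editing, converting and printing Word documents into their own Java applications. It is a fully standalone component, so the runtime environment does not need Microsoft Office installed.

Free Spire.Doc for Java can perform a wide range of Word processing tasks, including generating, reading, converting and printing Word documents, inserting images, adding headers and footers, creating tables, adding form fields and mail merge fields, adding bookmarks, adding text and image watermarks, setting background colors and background images, adding footnotes and endnotes, adding hyperlinks, encrypting and decrypting Word documents, adding comments, adding shapes and more.

Note: the free version has size limits. When loading or saving a Word document, the document must not exceed 500 paragraphs and 25 tables. In addition, when converting a Word document to PDF, XPS and similar formats, only the first three pages are converted.

Here we only cover converting Word to PDF with Free Spire.Doc for Java.
1> Add the dependency to the pom file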

<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <url>http://repo.e-iceblue.cn/repository/maven-public/</url>
    </repository>
</repositories>

<dependencies>
        ......
        <!-- Spire for Word operations -->
        <dependency>
            <groupId>e-iceblue</groupId>
            <artifactId>spire.doc.free</artifactId>
            <version>2.7.3</version>
        </dependency>
</dependencies>

2> Calling the method

// Document and FileFormat here are the Spire classes from the com.spire.doc package
public static void main(String[] args) {
        // Load the sample Word document
        Document document = new Document();
        document.loadFromFile("C:\\Users\\zhulong\\Desktop\\test.docx");
        // Save the result file
        document.saveToFile("C:\\Users\\zhulong\\Desktop\\test.pdf", FileFormat.PDF);
}

Excel-to-PDF conversion is supported as well, through Free Spire.XLS for Java. The free version has certain limits. Official site: https://www.e-iceblue.cn/Introduce/Free-Spire-XLS-JAVA.html

Alternatively, OpenOffice can be used for document conversion.

7. Garbled fonts after document conversion

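A common symptom with document conversion: everything works when tested locally (Windows), but on the server (Linux) Chinese text comes out garbled. This is almost always because the required fonts are missing on the Linux server.

Solution:

Step 1: install a Chinese font library on the Linux server.
Installation reference: installing a Chinese font library on Linux
After the installation succeeds, remember to restart your application service and try again. If it still fails, the fonts used in your source document cannot be found on the Linux server; continue with step 2.
Step 2: identify the name of the font used in the garbled part of the document, find the corresponding font file under C:\Windows\Fonts (on Windows), copy it into the /usr/share/fonts folder on the Linux server, and then run the following commands in order from within that folder: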

mkfontscale   # index the scalable fonts (writes fonts.scale)
mkfontdir     # build the font directory index (writes fonts.dir)
fc-cache      # refresh the font cache

Note: after running the commands, the application service still has to be restarted.

Garbled output caused by font problems can generally be resolved this way.
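
Before restarting the service, it can help to confirm that the copied font file really is where the server expects it. The following is a minimal sketch for that check, not part of the original article: the class name FontFileCheck is hypothetical, the default directory is the /usr/share/fonts folder mentioned above, and it only lists files by extension (.ttf, .ttc, .otf) without verifying that a converter can actually load them.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FontFileCheck {

    /** Returns the sorted names of all font files found under dir, including subfolders. */
    static List<String> listFontFiles(Path dir) throws IOException {
        try (Stream<Path> paths = Files.walk(dir)) {
            return paths.filter(Files::isRegularFile)
                    .map(p -> p.getFileName().toString())
                    .filter(FontFileCheck::isFontFile)
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    /** Recognizes font files by extension only, ignoring case. */
    static boolean isFontFile(String name) {
        String lower = name.toLowerCase(Locale.ROOT);
        return lower.endsWith(".ttf") || lower.endsWith(".ttc") || lower.endsWith(".otf");
    }

    public static void main(String[] args) throws IOException {
        // Pass another directory as the first argument to check a different location
        Path dir = Paths.get(args.length > 0 ? args[0] : "/usr/share/fonts");
        if (!Files.isDirectory(dir)) {
            System.out.println("Not a directory: " + dir);
            return;
        }
        for (String name : listFontFiles(dir)) {
            System.out.println(name);
        }
    }
}
```

If the font copied from Windows (for example simsun.ttc for SimSun) does not show up in the list, the copy step is the first thing to recheck.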
