I am converting HTML to a Word document with Apache POI in Java, but cannot add a header and footer to the newly generated Word document

Problem description:

When I convert HTML to Word using POI, I can generate a new .doc with all the styles and formatting used in the HTML, but I cannot add a header and footer to the newly created .doc document. Does POI not support the CSS @page rule? And how can I add a header and footer to the newly generated .doc document?

Code below:

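import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

import org.apache.commons.io.IOUtils; // assumed: IOUtils here is the Commons IO class
import org.apache.poi.poifs.filesystem.DirectoryEntry;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;

// Wraps the raw HTML in an OLE2 container under the stream name "WordDocument".
public void convertHtmltoWord(String html, OutputStream outputStream) throws IOException {
    // try-with-resources closes both the file system and the output stream
    try (OutputStream out = outputStream; POIFSFileSystem poifs = new POIFSFileSystem()) {
        DirectoryEntry directory = poifs.getRoot();
        directory.createDocument("WordDocument", getInputStream(html));
        poifs.writeFilesystem(out);
    }
}

// Encodes the HTML with an explicit charset; UTF-8 is an assumption here,
// the original relied on the platform default.
public static InputStream getInputStream(String inputData) {
    return IOUtils.toInputStream(inputData, StandardCharsets.UTF_8);
}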

Note: The converted .doc contains only the body, not the header and footer. For example, the page number in the footer does not appear.

What your code does is not an HTML to Word conversion.

Your code only creates a POIFSFileSystem whose root DirectoryEntry holds a single document entry containing the HTML. Microsoft Word will interpret that HTML and show it in the document body, but the file is not a real binary *.doc file.
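One way to check this yourself, as a rough sketch: in a genuine binary *.doc, the "WordDocument" stream begins with the File Information Block, whose first two bytes are the identifier 0xA5EC stored little-endian. The class and method names below are made up for illustration, and the sample byte arrays are hand-written stand-ins rather than data read from a real file.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Apache POI.
final class WordStreamCheck {

    // A real "WordDocument" stream starts with the FIB identifier 0xA5EC,
    // stored little-endian, so the first two bytes are 0xEC then 0xA5.
    static boolean looksLikeWordBinary(byte[] streamStart) {
        return streamStart != null
                && streamStart.length >= 2
                && (streamStart[0] & 0xFF) == 0xEC
                && (streamStart[1] & 0xFF) == 0xA5;
    }

    public static void main(String[] args) {
        // What the code in the question stores in the stream: plain HTML text.
        byte[] html = "<html><body>Hello</body></html>".getBytes(StandardCharsets.UTF_8);
        // Illustrative first bytes of a genuine stream (identifier, then a version word).
        byte[] fib = {(byte) 0xEC, (byte) 0xA5, (byte) 0xC1, 0x00};

        System.out.println(looksLikeWordBinary(html)); // prints false
        System.out.println(looksLikeWordBinary(fib));  // prints true
    }
}
```

To apply it to an actual file you would pass in the first bytes of its "WordDocument" stream; for a file produced by the code in the question, those bytes are the start of the HTML text.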

You can see this when you open the file in Word, change something, and save. It will be saved as HTML, and an additional directory [Filename]-Files will be created. That is needed because HTML does not provide embedding by default, so this directory contains all the elements that could not be embedded: pictures, for example, but also additional HTML files containing the header and/or footer text.

So with your approach it is not possible to add a header or footer to the HTML. It can only put HTML into the document body. And it does not even create a real Microsoft Word file, only an HTML file that is foisted on Microsoft Word because of its fake *.doc file name.
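For comparison, here is a minimal sketch of how a header and a page-numbered footer can be written into a real Word file with POI's XWPF API. Note what is swapped in: this produces a *.docx (not a binary *.doc) and it does not convert HTML; the body text is set by hand. The class name, method name, and output file name are made up for illustration. It assumes a POI version that has `createHeader(HeaderFooterType)` (3.16 or later, as far as I know), and depending on the version, the low-level `addNewFldSimple()` call may need the full OOXML schemas jar on the classpath.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

import org.apache.poi.wp.usermodel.HeaderFooterType;
import org.apache.poi.xwpf.usermodel.ParagraphAlignment;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFFooter;
import org.apache.poi.xwpf.usermodel.XWPFHeader;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;

// Hypothetical example class, not taken from the question.
final class HeaderFooterSketch {

    static void write(String headerText, String bodyText, OutputStream out) throws IOException {
        try (XWPFDocument document = new XWPFDocument()) {
            // Document body
            document.createParagraph().createRun().setText(bodyText);

            // Default header: reuse its first paragraph if one exists, else create one
            XWPFHeader header = document.createHeader(HeaderFooterType.DEFAULT);
            XWPFParagraph headerParagraph = header.getParagraphArray(0);
            if (headerParagraph == null) {
                headerParagraph = header.createParagraph();
            }
            headerParagraph.createRun().setText(headerText);

            // Default footer: "Page " followed by a PAGE field that Word evaluates
            XWPFFooter footer = document.createFooter(HeaderFooterType.DEFAULT);
            XWPFParagraph footerParagraph = footer.getParagraphArray(0);
            if (footerParagraph == null) {
                footerParagraph = footer.createParagraph();
            }
            footerParagraph.setAlignment(ParagraphAlignment.CENTER);
            footerParagraph.createRun().setText("Page ");
            footerParagraph.getCTP().addNewFldSimple().setInstr("PAGE \\* MERGEFORMAT");

            document.write(out);
        }
    }

    public static void main(String[] args) throws IOException {
        // "with-header-footer.docx" is an arbitrary output name
        try (OutputStream out = new FileOutputStream("with-header-footer.docx")) {
            write("My header", "Body text", out);
        }
    }
}
```

Getting the HTML content itself into such a document would still require walking the HTML and creating the matching paragraphs and runs, which this sketch does not attempt.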