How to avoid tagging empty <TR>/<TD> cells when converting HTML to PDF with iText 5

Question:

I am using iText 5 to generate a PDF from HTML input. As part of PDF accessibility, I enable tagging with PdfWriter.setTagged().

However, every tag is being tagged, empty and non-empty alike. How can I avoid tagging the empty HTML tags?

Answer:

You can do it directly with pdfHTML (basically the solution for HTML to PDF conversion in iText 7).

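import java.io.FileInputStream;

import com.itextpdf.html2pdf.ConverterProperties;
import com.itextpdf.html2pdf.HtmlConverter;
import com.itextpdf.html2pdf.attach.ITagWorker;
import com.itextpdf.html2pdf.attach.ProcessorContext;
import com.itextpdf.html2pdf.attach.impl.DefaultTagWorkerFactory;
import com.itextpdf.html2pdf.attach.impl.tags.SpanTagWorker;
import com.itextpdf.html2pdf.attach.impl.tags.TdTagWorker;
import com.itextpdf.html2pdf.html.TagConstants;
import com.itextpdf.kernel.pdf.PdfDocument;
import com.itextpdf.kernel.pdf.PdfWriter;
// Older pdfHTML releases keep this interface in com.itextpdf.html2pdf.html.node.
import com.itextpdf.styledxmlparser.node.IElementNode;

// ORIG and DEST are the paths of the input HTML file and the output PDF.
ConverterProperties props = new ConverterProperties();
props.setTagWorkerFactory(new DefaultTagWorkerFactory() {
    @Override
    public ITagWorker getCustomTagWorker(IElementNode tag, ProcessorContext context) {
        if (tag.name().equals(TagConstants.TD)) {
            if (!tag.childNodes().isEmpty()) {
                // Non-empty cell: keep the regular TD worker.
                return new TdTagWorker(tag, context);
            } else {
                // Empty cell: hand it to a span worker instead.
                return new SpanTagWorker(tag, context);
            }
        }
        // Any other tag: fall back to the default worker.
        return null;
    }
});

PdfDocument doc = new PdfDocument(new PdfWriter(DEST));
doc.setTagged(); // produce a tagged (accessible) PDF

HtmlConverter.convertToPdf(new FileInputStream(ORIG), doc, props);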

In the code above, setTagWorkerFactory lets you define custom behavior for your tags, as detailed in the documentation. In this specific case, I'm simply turning empty TD tags into a Span element, which achieves the desired behavior (the superfluous TD tag disappears).

(To be completely honest, this relies on the inability of the TR worker to handle the SPAN tag, so the element is simply dropped. I'll update the answer if I come up with a more elegant solution.)
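
One detail the snippet leaves implicit is the final return null. As far as I understand DefaultTagWorkerFactory, returning null from getCustomTagWorker means "no custom worker for this tag", and the factory then falls back to its built-in mapping. The plain-Java sketch below models that pattern without depending on iText. Node, Factory, EmptyCellFactory and the string worker names are hypothetical stand-ins for illustration, not pdfHTML types.

```java
import java.util.Arrays;
import java.util.List;

public class TagWorkerFallbackSketch {

    /** Hypothetical stand-in for an HTML element node (not an iText type). */
    static final class Node {
        final String name;
        final List<Node> children;

        Node(String name, Node... children) {
            this.name = name;
            this.children = Arrays.asList(children);
        }
    }

    /** Simplified model of a factory with an overridable "custom" hook. */
    static class Factory {
        /** Override point: returning null means "no custom worker here". */
        String customWorker(Node tag) {
            return null;
        }

        /** Stand-in for the built-in tag-to-worker mapping. */
        String defaultWorker(Node tag) {
            return "default:" + tag.name;
        }

        /** Consult the custom hook first, then fall back to the default. */
        final String workerFor(Node tag) {
            String custom = customWorker(tag);
            return custom != null ? custom : defaultWorker(tag);
        }
    }

    /** Mirrors the rule from the answer: empty td cells get a span worker. */
    static class EmptyCellFactory extends Factory {
        @Override
        String customWorker(Node tag) {
            if (tag.name.equals("td")) {
                return tag.children.isEmpty() ? "span" : "td";
            }
            return null; // all other tags keep their default worker
        }
    }

    public static void main(String[] args) {
        Factory factory = new EmptyCellFactory();
        Node emptyCell = new Node("td");
        Node filledCell = new Node("td", new Node("p"));
        Node row = new Node("tr", emptyCell, filledCell);

        System.out.println(factory.workerFor(emptyCell));  // span
        System.out.println(factory.workerFor(filledCell)); // td
        System.out.println(factory.workerFor(row));        // default:tr
    }
}
```

Only the td branch is customized here; every other tag passes through the null return and keeps its default worker, which is why the override in the answer can stay so short.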
