How can I get the target page number of a section in a PDF file using iTextSharp?
I have a PDF file with an index page that lists sections, each linking to a target page. I can get the section names (Section 1.1, Section 5.2), but I cannot get the target page numbers.
For example: http://www.mikesdotnetting.com/Article/84/iTextSharp-Links-and-Bookmarks
Here is my code:
string FileName = AppDomain.CurrentDomain.BaseDirectory + "TestPDF.pdf";
PdfReader pdfreader = new PdfReader(FileName);
PdfDictionary PageDictionary = pdfreader.GetPageN(9);
PdfArray Annots = PageDictionary.GetAsArray(PdfName.ANNOTS);
// Size is the element count; Length is the length of the object's string form
if ((Annots == null) || (Annots.Size == 0))
    return;
foreach (PdfObject oAnnot in Annots.ArrayList)
{
    PdfDictionary AnnotationDictionary = (PdfDictionary)PdfReader.GetPdfObject(oAnnot);
    if (AnnotationDictionary.Keys.Contains(PdfName.A))
    {
        PdfDictionary oALink = AnnotationDictionary.GetAsDict(PdfName.A);
        if (oALink.Get(PdfName.S).Equals(PdfName.GOTO))
        {
            if (oALink.Keys.Contains(PdfName.D))
            {
                PdfObject objs = oALink.Get(PdfName.D);
                if (objs.IsString())
                {
                    string SectionName = objs.ToString(); // here I can see the section name...
                }
            }
        }
    }
}
How can I get the target page number?
I also could not access the section names for some PDFs, for example: http://wwwimages.adobe.com/www.adobe.com/content/dam/Adobe/en/devnet/pdf/pdfs/adobe_supplement_iso32000.pdf
In this PDF, page 9 contains sections whose names I could not read. Please suggest a solution.
A link annotation can specify its target in one of two ways: with an A entry or with a Dest entry. A is the more powerful type but is often overkill. The Dest type just specifies an indirect reference to a page along with some fitting and zooming options.
The Dest value can be a couple of different things, but it is usually (as far as I've ever seen) a named string destination. You can look up named destinations in the document's named destination dictionary. So before your main loop, add this so that it can be referenced later:
//Get all existing named destinations
Dictionary<string, PdfObject> dests = pdfreader.GetNamedDestinationFromStrings();
Once you've got the Dest as a string, you can look it up as a key in the above dictionary.
PdfArray thisDest = (PdfArray)dests[AnnotationDictionary.GetAsString(PdfName.DEST).ToString()];
The first item in the returned array is the indirect reference to the page, which you are already used to working with. (Actually, the first item could instead be an integer representing a page number in a remote document, so you might have to check for that.)
PdfIndirectReference a = (PdfIndirectReference)thisDest[0];
PdfObject thisPage = PdfReader.GetPdfObject(a);
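The snippet above ends with the page object rather than a page number. One way to get the number, sketched here under the assumption that you are on iTextSharp 5.x, is to compare the destination's reference against each page's original reference from GetPageOrigRef. The class and method names (PageLookup, FindPageNumber) are hypothetical and only for illustration.

```csharp
using iTextSharp.text.pdf;

static class PageLookup
{
    // Hypothetical helper, not part of iTextSharp.
    // Returns the 1-based number of the page whose object reference matches
    // pageRef, or -1 if no page in this document matches.
    public static int FindPageNumber(PdfReader reader, PdfIndirectReference pageRef)
    {
        for (int i = 1; i <= reader.NumberOfPages; i++)
        {
            // Reference to the page object as stored in the file
            PRIndirectReference candidate = reader.GetPageOrigRef(i);
            if (candidate.Number == pageRef.Number &&
                candidate.Generation == pageRef.Generation)
            {
                return i;
            }
        }
        return -1;
    }
}
```

With the variables from the snippet above, the call would be PageLookup.FindPageNumber(pdfreader, a). This is a linear scan per lookup; for documents with many links it may be worth building a dictionary from object number to page number once and reusing it.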
Below is code that puts most of the above together, omitting some of the code that you already have. A and Dest are mutually exclusive per the spec, so no annotation should ever specify both.
//Get all existing named destinations
Dictionary<string, PdfObject> dests = pdfreader.GetNamedDestinationFromStrings();
foreach (PdfObject oAnnot in Annots.ArrayList) {
    PdfDictionary AnnotationDictionary = (PdfDictionary)PdfReader.GetPdfObject(oAnnot);
    if (AnnotationDictionary.Get(PdfName.SUBTYPE).Equals(PdfName.LINK)) {
        if (AnnotationDictionary.Contains(PdfName.A)) {
            //...Do normal A stuff here
        } else if (AnnotationDictionary.Contains(PdfName.DEST)) {
            if (AnnotationDictionary.Get(PdfName.DEST).IsString()) {//Name-based destination
                if (dests.ContainsKey(AnnotationDictionary.GetAsString(PdfName.DEST).ToString())) {//See if it exists in the global name dictionary
                    PdfArray thisDest = (PdfArray)dests[AnnotationDictionary.GetAsString(PdfName.DEST).ToString()];//Get the destination
                    PdfIndirectReference a = (PdfIndirectReference)thisDest[0];//TODO, this could actually be an integer for the case of Remote Destinations
                    PdfObject thisPage = PdfReader.GetPdfObject(a);//Get the actual PDF object
                }
            } else if (AnnotationDictionary.Get(PdfName.DEST).IsArray()) {
                //Technically possible, I think the array matches the code directly above but I don't have a sample PDF
            }
        }
    }
}
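The string and array branches above can share one code path, and the same logic may also apply to the D entry of a GoTo action as in the question's code. The following is a rough sketch, assuming iTextSharp 5.x and assuming that an explicit destination array starts with the page reference, as the PDF specification describes. DestinationResolver and ResolvePageNumber are hypothetical names, and destinations given as PDF names (the older Dests dictionary) are not handled here.

```csharp
using System.Collections.Generic;
using iTextSharp.text.pdf;

static class DestinationResolver
{
    // Hypothetical helper, not part of iTextSharp.
    // dest is either an annotation's Dest value or a GoTo action's D value.
    // Returns the 1-based target page number, or -1 if it cannot be resolved.
    public static int ResolvePageNumber(PdfReader reader, PdfObject dest,
                                        Dictionary<string, PdfObject> namedDests)
    {
        PdfObject resolved = PdfReader.GetPdfObject(dest); // follow indirect references
        if (resolved == null) return -1;

        if (resolved.IsString())
        {
            // Named destination: look it up in the name dictionary
            PdfObject named;
            if (!namedDests.TryGetValue(resolved.ToString(), out named)) return -1;
            resolved = PdfReader.GetPdfObject(named);
        }

        if (resolved == null || !resolved.IsArray()) return -1;
        PdfArray array = (PdfArray)resolved;
        if (array.Size == 0) return -1;

        // Null when the first item is not a reference, e.g. an integer
        // page index used by remote destinations
        PdfIndirectReference pageRef = array.GetAsIndirectObject(0);
        if (pageRef == null) return -1;

        for (int i = 1; i <= reader.NumberOfPages; i++)
        {
            PRIndirectReference candidate = reader.GetPageOrigRef(i);
            if (candidate.Number == pageRef.Number &&
                candidate.Generation == pageRef.Generation)
            {
                return i;
            }
        }
        return -1;
    }
}
```

Inside the loop this could be called as DestinationResolver.ResolvePageNumber(pdfreader, AnnotationDictionary.Get(PdfName.DEST), dests) for the Dest case, or with oALink.Get(PdfName.D) for a GoTo action. I have not tested it against the Adobe supplement PDF mentioned in the question.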