Extracting images from a PDF with the /CCITTFaxDecode filter

Question:

I have a PDF that was generated by scanning software. The PDF has one TIFF image per page, and I want to extract the TIFF image from each page.

I am using iTextSharp and I have successfully found the images and can get back the raw bytes from the PdfReader.GetStreamBytesRaw method. The problem is, as many before me have discovered, iTextSharp does not contain a PdfReader.CCITTFaxDecode method.

What else do I know? Even without iTextSharp, I can open the PDF in Notepad and find the streams with /Filter /CCITTFaxDecode, and I know from the /DecodeParms entry that they use CCITT Group 4 encoding.

Does anyone out there know how I can get the CCITTFaxDecode-filtered images out of my PDF?

Cheers, Kahu

Answer:

Actually, vbcrlfuser's answer did help me, but the code was not quite correct for the current version of BitMiracle.LibTiff.NET (the one I was able to download). In the current version, the equivalent code looks like this:

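using System;
using iTextSharp.text.pdf;
using BitMiracle.LibTiff.Classic;

...
      // pd:  the image stream's dictionary (a PdfDictionary)
      // raw: the still-encoded bytes returned by PdfReader.GetStreamBytesRaw
      Tiff tiff = Tiff.Open("C:\\test.tif", "w");
      // Copy the image geometry from the PDF image dictionary.
      tiff.SetField(TiffTag.IMAGEWIDTH, UInt32.Parse(pd.Get(PdfName.WIDTH).ToString()));
      tiff.SetField(TiffTag.IMAGELENGTH, UInt32.Parse(pd.Get(PdfName.HEIGHT).ToString()));
      // The stream is CCITT Group 4, so declare the same compression in the TIFF.
      tiff.SetField(TiffTag.COMPRESSION, Compression.CCITTFAX4);
      tiff.SetField(TiffTag.BITSPERSAMPLE, UInt32.Parse(pd.Get(PdfName.BITSPERCOMPONENT).ToString()));
      tiff.SetField(TiffTag.SAMPLESPERPIXEL, 1);
      // Write the encoded bytes unchanged as a single strip; nothing is decoded here.
      tiff.WriteRawStrip(0, raw, raw.Length);
      tiff.Close();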

Using the above code, I finally got a valid TIFF file at C:\test.tif. Thank you, vbcrlfuser!
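
The snippet above does not show where pd and raw come from. A minimal sketch of one way to collect them with iTextSharp 5.x follows, assuming each scanned page image is a stream whose /Filter is the single name /CCITTFaxDecode. The class name CcittImageFinder and the method Find are hypothetical, made up for this illustration. Streams that use a filter array are skipped, and /DecodeParms is not inspected, so the sketch does not confirm that the data is really Group 4.

```csharp
using System.Collections.Generic;
using iTextSharp.text.pdf;

// Hypothetical helper, not part of iTextSharp.
static class CcittImageFinder
{
    // Returns (image dictionary, raw encoded bytes) pairs for every image
    // stream whose /Filter is the single name /CCITTFaxDecode.
    public static List<KeyValuePair<PdfDictionary, byte[]>> Find(string pdfPath)
    {
        var found = new List<KeyValuePair<PdfDictionary, byte[]>>();
        PdfReader reader = new PdfReader(pdfPath);
        try
        {
            // Walk every indirect object in the cross-reference table.
            for (int i = 0; i < reader.XrefSize; i++)
            {
                PdfObject obj = reader.GetPdfObject(i);
                if (obj == null || !obj.IsStream()) continue;

                PRStream stream = obj as PRStream;
                if (stream == null) continue;

                // Keep only image XObjects (/Subtype /Image).
                if (!PdfName.IMAGE.Equals(stream.GetAsName(PdfName.SUBTYPE))) continue;

                // GetAsName returns null when /Filter is an array,
                // so multi-filter streams are skipped here.
                if (!PdfName.CCITTFAXDECODE.Equals(stream.GetAsName(PdfName.FILTER))) continue;

                // Raw bytes: still CCITT-encoded, no filter applied.
                byte[] raw = PdfReader.GetStreamBytesRaw(stream);
                found.Add(new KeyValuePair<PdfDictionary, byte[]>(stream, raw));
            }
        }
        finally
        {
            reader.Close();
        }
        return found;
    }
}
```

Each returned pair can then be passed through the TIFF-writing code above, using the Key as pd and the Value as raw, with a different output file name per image.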
