Getting images from a document using Apache POI
Question:
I am using Apache POI to read images from a .docx file.
Here is my code:
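import java.awt.Image;
import java.io.ByteArrayInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.List;
import javax.imageio.ImageIO;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFPictureData;

public Image ReadImg(int imageid) throws IOException {
    // Close the stream and the document when done
    try (InputStream in = new FileInputStream("import.docx");
         XWPFDocument doc = new XWPFDocument(in)) {
        List<XWPFPictureData> pic = doc.getAllPictures();
        XWPFPictureData pict = pic.get(imageid);
        byte[] data = pict.getData();
        // Decode the image bytes using javax.imageio
        return ImageIO.read(new ByteArrayInputStream(data));
    }
}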
It reads the images correctly, but not in order.
For example, if the document contains
image1.jpeg image2.jpeg image3.jpeg image4.jpeg image5.jpeg
they are read as
image4 image3 image1 image5 image2
Can you help me solve this?
I want to read the images in order.
Thanks, 西提克
Answer:
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFPictureData;

public static void extractImages(XWPFDocument docx) throws IOException {
    List<XWPFPictureData> piclist = docx.getAllPictures();
    // Traverse the list and write each image to a file
    for (XWPFPictureData pic : piclist) {
        Path target = Paths.get("D:/imagefromword", pic.getFileName());
        // Write the stored bytes unchanged so the content matches the file extension
        Files.write(target, pic.getData());
    }
}
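
Both snippets above rely on getAllPictures(), which, as the question shows, does not necessarily return pictures in the order they appear in the text. One possible way to get document order is to walk the paragraphs and their runs and collect each run's embedded pictures. The following is a minimal sketch under some assumptions: the class name OrderedPictures and the method inOrder are made up for illustration, and only pictures embedded in runs of top-level body paragraphs are visited (pictures inside tables, headers, or footers would need extra traversal).

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFPicture;
import org.apache.poi.xwpf.usermodel.XWPFPictureData;
import org.apache.poi.xwpf.usermodel.XWPFRun;

// Hypothetical helper class; the name is illustrative and not part of Apache POI.
final class OrderedPictures {

    private OrderedPictures() {
    }

    /**
     * Collects picture data by walking body paragraphs and their runs
     * from top to bottom. Assumption: pictures are embedded in runs of
     * top-level paragraphs; tables, headers and footers are not visited.
     */
    static List<XWPFPictureData> inOrder(XWPFDocument doc) {
        List<XWPFPictureData> result = new ArrayList<>();
        for (XWPFParagraph paragraph : doc.getParagraphs()) {
            for (XWPFRun run : paragraph.getRuns()) {
                for (XWPFPicture picture : run.getEmbeddedPictures()) {
                    XWPFPictureData data = picture.getPictureData();
                    if (data != null) { // skip pictures whose data part cannot be resolved
                        result.add(data);
                    }
                }
            }
        }
        return result;
    }
}
```

With this in place, OrderedPictures.inOrder(doc).get(imageid) could stand in for doc.getAllPictures().get(imageid) in the question's method. Note that a picture used twice in the text appears twice in the returned list, once per occurrence.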