Indexing/searching PDF content with Solr

Problem description:

I'm experimenting with Solr and I've encountered this issue:

I've indexed a PDF document, and when I search for "*:*" in the admin console, the PDF is listed. However, when I search for content within the PDF, I get no results.

To index the document, I used copy-and-paste code from: http://wiki.apache.org/solr/ContentStreamUpdateRequestExample

Using this command:

curl "http://localhost:8983/solr/update/extract?stream.file=/home/fstl/apache-solr-3.2.0/example/exampledocs/pup.pdf&stream.contentType=application/pdf&literal.id=esc.doc&commit=true"
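For anyone trying to reproduce this, here is a rough sketch of the steps as a small shell script. It only restates the command above and the two searches described. The variable names, the helper functions, the `RUN_AGAINST_SOLR` switch, and the search term `someword` are hypothetical and stand in for your own values. It assumes the standard `/select` handler is available at the same base URL.

```shell
#!/bin/sh
# Hypothetical settings; adjust to match your own installation.
SOLR_BASE="http://localhost:8983/solr"
PDF_PATH="/home/fstl/apache-solr-3.2.0/example/exampledocs/pup.pdf"
DOC_ID="esc.doc"

# Build the same extract URL as in the command above.
extract_url() {
  printf '%s/update/extract?stream.file=%s&stream.contentType=application/pdf&literal.id=%s&commit=true\n' \
    "$SOLR_BASE" "$PDF_PATH" "$DOC_ID"
}

# Build a search URL; the query string must already be URL-safe.
select_url() {
  printf '%s/select?q=%s\n' "$SOLR_BASE" "$1"
}

# Only contact Solr when explicitly requested (hypothetical switch).
if [ "${RUN_AGAINST_SOLR:-0}" = "1" ]; then
  curl "$(extract_url)"            # index the PDF
  curl "$(select_url '*:*')"       # match-all query: the PDF is listed
  curl "$(select_url 'someword')"  # placeholder for a word from the PDF text
fi
```

Running it with `RUN_AGAINST_SOLR=1` against a local Solr instance should show the same behaviour as described: the match-all query lists the document, while the content query is the one that comes back empty.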