Reading a PDF document in C#.NET using iTextSharp
Problem description:
Hi all,
I am having trouble extracting the text of a PDF document paragraph by paragraph. Please help me out.
My code is:
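private string ReadPdf(string _filePath)
{
    // Requires: using iTextSharp.text.pdf; using iTextSharp.text.pdf.parser;
    PdfReader rd = new PdfReader(_filePath);
    int pageNumber = 1;
    string oContent = "";
    while (pageNumber <= rd.NumberOfPages)
    {
        // Extract the text of the current page and append it.
        oContent += PdfTextExtractor.GetTextFromPage(rd, pageNumber);
        ++pageNumber;
    }
    rd.Close();
    return oContent;
}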
With the above code I am able to read the PDF text, but it is extracted line by line. I want it paragraph by paragraph.
Answer:
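// Requires: using System.IO; using iTextSharp.text.pdf; using iTextSharp.text.pdf.parser;
PdfReader reader = new PdfReader(path);
StringWriter output = new StringWriter();
for (int i = 1; i <= reader.NumberOfPages; i++)
{
    // Extract the text of page i and append it to the output buffer.
    output.WriteLine(PdfTextExtractor.GetTextFromPage(reader, i, new SimpleTextExtractionStrategy()));
}
reader.Close();
string text = output.ToString();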
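Note that the snippet above still returns the text one line at a time, because a PDF stores positioned text and has no notion of a paragraph. One possible way to get paragraphs is to post-process the extracted text. Below is a rough sketch, not a definitive solution. The class ParagraphGrouper, the method GroupIntoParagraphs and the shortLineRatio parameter are hypothetical names made up for this example, and the rule it applies (a paragraph ends at a blank line, or at a line that ends with sentence punctuation and is clearly shorter than the longest line) is an assumption that will not hold for every document.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class ParagraphGrouper
{
    // Hypothetical helper: joins extracted lines into paragraphs.
    // Assumed heuristic: a paragraph ends at a blank line, or at a line that
    // ends with sentence punctuation and is shorter than
    // shortLineRatio * (length of the longest line).
    public static List<string> GroupIntoParagraphs(string pageText, double shortLineRatio = 0.8)
    {
        string[] lines = pageText.Replace("\r\n", "\n").Split('\n');

        // The longest line approximates the full column width.
        int maxLength = 0;
        foreach (string line in lines)
        {
            maxLength = Math.Max(maxLength, line.Trim().Length);
        }

        var paragraphs = new List<string>();
        var current = new StringBuilder();

        foreach (string raw in lines)
        {
            string line = raw.Trim();
            if (line.Length == 0)
            {
                // A blank line always closes the current paragraph.
                Flush(paragraphs, current);
                continue;
            }

            if (current.Length > 0)
            {
                current.Append(' ');
            }
            current.Append(line);

            bool endsSentence = line.EndsWith(".") || line.EndsWith("?")
                || line.EndsWith("!") || line.EndsWith(":");
            bool isShort = line.Length < maxLength * shortLineRatio;
            if (endsSentence && isShort)
            {
                Flush(paragraphs, current);
            }
        }

        Flush(paragraphs, current);
        return paragraphs;
    }

    // Moves the buffered text into the result list and resets the buffer.
    private static void Flush(List<string> paragraphs, StringBuilder current)
    {
        if (current.Length == 0)
        {
            return;
        }
        paragraphs.Add(current.ToString());
        current.Length = 0;
    }
}
```

You could call it once per page, for example ParagraphGrouper.GroupIntoParagraphs(PdfTextExtractor.GetTextFromPage(reader, i, new SimpleTextExtractionStrategy())). If the results are poor, try adjusting shortLineRatio. For documents with columns or tables, a strategy based on the text positions would probably work better than this text-only heuristic.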
Hi,
This will be helpful for you:
protected void Page_Load(object sender, EventArgs e)
{
    // Load the report data into a DataSet.
    SqlServer server = new SqlServer("Data Source=KSHIT6773-G13\\SQLEXPRESS;Initial Catalog=Test;Integrated Security=True");
    string[] sql = { "SELECT E.Name, D.Name FROM Employee E, Department D WHERE D.DepartmentID = E.Department" };
    string[] table = { "EMPDEPT" };
    DataSet ds = server.GetDataSet(sql, table, false);

    // Bind the data to the Crystal Reports report.
    ReportCRtoPDF rptObj = new ReportCRtoPDF();
    rptObj.SetDataSource(ds);

    // Export the report to a PDF file on disk.
    DiskFileDestinationOptions dsk = new DiskFileDestinationOptions();
    dsk.DiskFileName = Request.PhysicalApplicationPath + "files\\CrtoPDF.pdf";
    ExportOptions ex = new ExportOptions();
    ex.ExportDestinationType = ExportDestinationType.DiskFile;
    ex.ExportFormatType = ExportFormatType.PortableDocFormat;
    ex.ExportDestinationOptions = dsk;
    rptObj.Export(ex);
}