Reading a PDF document in C#.NET using iTextSharp

Problem description:

Hi all,

I am having trouble extracting the text of a PDF document paragraph by paragraph. Please help me out.

My code is:

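// Requires: using System.Text; using iTextSharp.text.pdf; using iTextSharp.text.pdf.parser;
private string ReadPdf(string _filePath)
{
    PdfReader rd = new PdfReader(_filePath);
    StringBuilder oContent = new StringBuilder();
    try
    {
        for (int pageNumber = 1; pageNumber <= rd.NumberOfPages; pageNumber++)
        {
            // Extract the text of the current page (pages are 1-based).
            oContent.AppendLine(PdfTextExtractor.GetTextFromPage(rd, pageNumber));
        }
    }
    finally
    {
        rd.Close(); // release the underlying file handle
    }
    return oContent.ToString();
}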



With the above code I am able to read the PDF text, but it is extracted line by line. I want it paragraph by paragraph.

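// Requires: using System.IO; using iTextSharp.text; using iTextSharp.text.pdf; using iTextSharp.text.pdf.parser;
PdfReader reader = new PdfReader(path);
StringWriter output = new StringWriter();
for (int i = 1; i <= reader.NumberOfPages; i++)
{
    // CreateSimpleHtmlParagraph is a helper that is not shown in this post.
    Paragraph o = CreateSimpleHtmlParagraph(output.ToString());
    // Append the text of page i (pages are 1-based).
    output.WriteLine(PdfTextExtractor.GetTextFromPage(reader, i, new SimpleTextExtractionStrategy()));
}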
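
The text returned by PdfTextExtractor.GetTextFromPage still arrives one line at a time, because a PDF stores positioned text rather than paragraphs. One possible way to regroup it is a heuristic pass over the extracted string. The sketch below is only an assumption about what "paragraph" means here: a paragraph ends at a blank line, or at a line that ends a sentence and is clearly shorter than the widest line. The names ParagraphGrouper, GroupIntoParagraphs and shortLineRatio are hypothetical and not part of iTextSharp.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class ParagraphGrouper
{
    // Heuristic (an assumption, not an iTextSharp feature): a paragraph ends
    // at a blank line, or at a line that ends with '.', '!' or '?' and is
    // shorter than shortLineRatio times the widest line in the text.
    public static List<string> GroupIntoParagraphs(string pageText, double shortLineRatio = 0.8)
    {
        string[] lines = pageText.Replace("\r\n", "\n").Split('\n');
        int widest = lines.Max(l => l.Trim().Length);

        var paragraphs = new List<string>();
        var current = new StringBuilder();

        foreach (string raw in lines)
        {
            string line = raw.Trim();
            if (line.Length == 0)
            {
                // A blank line always closes the current paragraph.
                Flush(current, paragraphs);
                continue;
            }

            // Join wrapped lines of the same paragraph with a single space.
            if (current.Length > 0) current.Append(' ');
            current.Append(line);

            bool endsSentence = ".!?".IndexOf(line[line.Length - 1]) >= 0;
            bool isShort = line.Length < widest * shortLineRatio;
            if (endsSentence && isShort) Flush(current, paragraphs);
        }

        Flush(current, paragraphs);
        return paragraphs;
    }

    // Moves the accumulated text into the result list, if there is any.
    private static void Flush(StringBuilder current, List<string> paragraphs)
    {
        if (current.Length == 0) return;
        paragraphs.Add(current.ToString());
        current.Clear();
    }
}
```

To use it, pass the string returned for each page to ParagraphGrouper.GroupIntoParagraphs. The heuristic will mis-split some layouts (headings, lists, multi-column pages), so the ratio may need tuning for a given document.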


Hi,

This will be helpful to you:

protected void Page_Load(object sender, EventArgs e)
{
    // SqlServer is a data-access helper that is not shown in this post.
    SqlServer server = new SqlServer("Data Source=KSHIT6773-G13\\SQLEXPRESS;Initial Catalog=Test;Integrated Security=True");
    string[] sql = { "SELECT E.Name, D.Name FROM Employee E, Department D WHERE D.DepartmentID = E.Department" };
    string[] table = { "EMPDEPT" };
    DataSet ds = server.GetDataSet(sql, table, false);

    // Bind the data to the report (ReportCRtoPDF is the report class used in this post).
    ReportCRtoPDF rptObj = new ReportCRtoPDF();
    rptObj.SetDataSource(ds);

    // Export the report to a PDF file on disk.
    DiskFileDestinationOptions dsk = new DiskFileDestinationOptions();
    dsk.DiskFileName = Request.PhysicalApplicationPath + "files\\CrtoPDF.pdf";
    ExportOptions ex = new ExportOptions();
    ex.ExportDestinationType = ExportDestinationType.DiskFile;
    ex.ExportFormatType = ExportFormatType.PortableDocFormat;
    ex.ExportDestinationOptions = dsk;
    rptObj.Export(ex);
}