Read & Extract Text from PowerPoint Presentation PPTX Files

sherazam khan
Ranch Hand

Joined: Mar 10, 2010
Posts: 300

In this technical tip, we will learn how to extract text from the different slide shapes inside PowerPoint 2007 (PPTX) presentations using Aspose.Slides for Java. We will extract text from shapes such as placeholders, auto shapes, group shapes, and tables.

Below is a code example that traverses every shape on every slide of a PPTX presentation and extracts its text at the portion level. Because the text is extracted at the portion level, its font-related properties are preserved.
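To picture what "portion level" means before reading the library code, here is a minimal stand-alone sketch. The `Portion` and `Paragraph` classes and the `bold` flag are hypothetical stand-ins written for this illustration; they are not Aspose.Slides types. The sketch only shows the idea that a text frame holds paragraphs, each paragraph holds portions, and each portion carries its own text together with its own formatting.

```java
import java.util.ArrayList;
import java.util.List;

public class PortionWalkSketch {

    // Hypothetical stand-in for a portion: a run of text with one formatting flag
    static final class Portion {
        final String text;
        final boolean bold;

        Portion(String text, boolean bold) {
            this.text = text;
            this.bold = bold;
        }
    }

    // Hypothetical stand-in for a paragraph: an ordered list of portions
    static final class Paragraph {
        final List<Portion> portions = new ArrayList<>();

        Paragraph add(String text, boolean bold) {
            portions.add(new Portion(text, bold));
            return this;
        }
    }

    // Walks paragraphs, then portions, and returns one line per portion
    // that keeps the portion's formatting flag next to its text
    static List<String> describePortions(List<Paragraph> paragraphs) {
        List<String> out = new ArrayList<>();
        for (int p = 0; p < paragraphs.size(); p++) {
            for (Portion portion : paragraphs.get(p).portions) {
                String style = portion.bold ? " [bold] " : " [plain] ";
                out.add("paragraph " + p + style + portion.text);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Paragraph> frame = new ArrayList<>();
        frame.add(new Paragraph().add("Quarterly ", false).add("results", true));
        frame.add(new Paragraph().add("Agenda", false));
        for (String line : describePortions(frame)) {
            System.out.println(line);
        }
    }
}
```

Reading at paragraph level instead would merge "Quarterly " and "results" into one string and lose the fact that only the second run is bold.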


Extracting text from a slide


[Java]


import com.aspose.slides.pptx.*;

public class TextExtract {

    // Prints the text of every portion in every paragraph of the given text frame
    public static void readText(TextFrameEx txtFrame) {
        for (int pgCount = 0; pgCount < txtFrame.getParagraphs().size(); pgCount++) {
            ParagraphEx paragraph = txtFrame.getParagraphs().get(pgCount);
            for (int prCount = 0; prCount < paragraph.getPortions().size(); prCount++) {
                String prText = paragraph.getPortions().get(prCount).getText();
                System.out.println(prText + "\n");
            } // End portion loop
        } // End paragraph loop
    }

    // Reads the text frame of an auto shape, if it has one
    private static void readAutoShape(AutoShapeEx aShape) {
        TextFrameEx tfText = aShape.getTextFrame();
        if (tfText != null) {
            readText(tfText);
        }
    }

    public static void main(String[] args) {
        try {
            // Opening presentation
            PresentationEx presentation = new PresentationEx("D:\\ppt\\TestPresentation.pptx");

            // Traversing through all slides
            for (int index = 0; index < presentation.getSlides().size(); index++) {
                // Accessing the slide and all shapes on it
                SlideEx slide = presentation.getSlides().get(index);
                ShapesEx shps = slide.getShapes();

                // Traversing through all shapes
                for (int shpCount = 0; shpCount < shps.size(); shpCount++) {
                    ShapeEx shape = shps.get(shpCount);

                    // Auto shapes, including placeholders that are auto shapes.
                    // Checking the type before casting avoids a ClassCastException
                    // for placeholders of other kinds.
                    if (shape instanceof AutoShapeEx) {
                        readAutoShape((AutoShapeEx) shape);
                    }
                    // If shape is a group shape
                    else if (shape instanceof GroupShapeEx) {
                        GroupShapeEx gShape = (GroupShapeEx) shape;
                        // Traversing through all shapes in the group shape
                        for (int iCount = 0; iCount < gShape.getShapes().size(); iCount++) {
                            ShapeEx child = gShape.getShapes().get(iCount);
                            if (child instanceof AutoShapeEx) {
                                readAutoShape((AutoShapeEx) child);
                            }
                        } // End group member loop
                    }
                    // If shape is a table
                    else if (shape instanceof TableEx) {
                        TableEx tTable = (TableEx) shape;
                        for (int iCol = 0; iCol < tTable.getColumns().size(); iCol++) {
                            for (int iRow = 0; iRow < tTable.getRows().size(); iRow++) {
                                // Each cell exposes its own text frame
                                TextFrameEx tfText = tTable.get(iCol, iRow).getTextFrame();
                                if (tfText != null) {
                                    readText(tfText);
                                }
                            } // End row loop
                        } // End column loop
                    } // End table check
                } // End shape loop
            } // End slide traversal
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
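The listing above reads auto shapes that sit directly inside a group shape. If your presentations contain groups nested inside other groups, a recursive walk is one way to reach every level. The sketch below is a plain-Java illustration of that idea only: `Node`, `text`, `group` and `collectText` are hypothetical names invented here and are not part of Aspose.Slides. It also collects the text into a list instead of printing it, which tends to be easier to reuse.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class NestedGroupSketch {

    // Hypothetical stand-in for a shape: either carries text or contains child shapes
    static final class Node {
        final String text;         // null for a group
        final List<Node> children; // empty for a text shape

        private Node(String text, List<Node> children) {
            this.text = text;
            this.children = children;
        }

        static Node text(String text) {
            return new Node(text, new ArrayList<Node>());
        }

        static Node group(Node... children) {
            return new Node(null, Arrays.asList(children));
        }
    }

    // Depth-first walk: records a shape's own text, then descends into its children,
    // so nested groups are visited at any depth and text comes out in document order
    static void collectText(Node node, List<String> out) {
        if (node.text != null) {
            out.add(node.text);
        }
        for (Node child : node.children) {
            collectText(child, out);
        }
    }

    public static void main(String[] args) {
        Node slide = Node.group(
                Node.text("Title"),
                Node.group(Node.text("Left"), Node.group(Node.text("Deep"))),
                Node.text("Footer"));
        List<String> found = new ArrayList<>();
        collectText(slide, found);
        System.out.println(found); // prints [Title, Left, Deep, Footer]
    }
}
```

Applied to the real API, the same shape would be a method that takes a shape, handles the auto shape and table cases, and calls itself for each member of a group shape.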


More about Aspose.Slides for Java



Contact Information


Suite 119, 272 Victoria Avenue

Chatswood, NSW, 2067

Australia

Aspose - Your File Format Experts

sales@aspose.com


Phone: 888.277.6734

Fax: 866.810.9465


 