This tip shows how to perform a string or regex search on multiple DOCX files in a specific directory. It is based on the Show Word file in WPF article, which explains the DOCX file format and implements the DOCX reader used in this tip, so I would recommend reading it before this one. The accompanying application demonstrates how to read DOCX files, convert them to text, and search that text for a specific string or regex.

We will use the same DocxReader class from the article mentioned above to unzip the DOCX files and to read the DOCX main part (document.xml) with XmlReader. We will also implement a converter (DocxToStringConverter), which converts specific XML elements (or their content) from document.xml to strings.
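To make the overall flow concrete, here is a minimal sketch of the same three steps (unzip, convert document.xml to text, search) written with Python's standard library instead of the .NET DocxReader, XmlReader and DocxToStringConverter classes this tip uses. The function names `docx_to_text` and `search_directory` are hypothetical, and the sketch assumes the main part is stored at `word/document.xml`, which is the usual location but not guaranteed for every package.

```python
import io
import re
import zipfile
import xml.etree.ElementTree as ET
from pathlib import Path

# WordprocessingML namespace used by the elements in document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = "{" + W_NS + "}"


def docx_to_text(source):
    """Return the text of the main document part, one line per paragraph.

    `source` may be a file path or a binary file-like object.
    Assumption: the main part lives at word/document.xml.
    """
    with zipfile.ZipFile(source) as archive:
        xml_bytes = archive.read("word/document.xml")
    root = ET.fromstring(xml_bytes)

    lines = []
    for paragraph in root.iter(W + "p"):
        parts = []
        for node in paragraph.iter():
            if node.tag == W + "t":            # run text
                parts.append(node.text or "")
            elif node.tag == W + "tab":        # tab character
                parts.append("\t")
            elif node.tag in (W + "br", W + "cr"):  # line break
                parts.append("\n")
        lines.append("".join(parts))
    # Note: paragraphs nested inside other paragraphs (for example in
    # text boxes) would be emitted twice by this simple traversal.
    return "\n".join(lines)


def search_directory(directory, pattern, flags=0):
    """Map each .docx file name in `directory` to its list of regex matches."""
    regex = re.compile(pattern, flags)
    results = {}
    for path in sorted(Path(directory).glob("*.docx")):
        matches = [m.group(0) for m in regex.finditer(docx_to_text(path))]
        if matches:
            results[path.name] = matches
    return results
```

A call such as `search_directory("docs", r"invoice\s+\d+", re.IGNORECASE)` (with `docs` standing in for any directory) would return only the files that contain a match. A plain string search works the same way if the search term is first passed through `re.escape`.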