Hi,
I need to import Word document files (.DOC and .DOCX) directly into a MySQL database using OpenXML.
In this code I use a regex to check whether a line starts with whitespace, a letter, a digit, the bullet character (•) or the - character.
But this code imports only the text contained in the Word files.
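To show what the regex accepts, here is a small stand-alone check. The sample strings are made up for illustration and are not taken from my real documents.

```csharp
using System;
using System.Text.RegularExpressions;

// Same pattern as in my code: the first character must be whitespace,
// a letter, a digit, the bullet character or a hyphen.
var reg = new Regex(@"^[\s\p{L}\d•-]");

// Made-up sample lines (assumption, for illustration only).
string[] samples = { "- List 1", "• List 2", "3. List 3", " indented", "List 4", "(a) List 5", "" };

foreach (var line in samples)
{
    // Prints each sample together with whether the regex matched it.
    Console.WriteLine($"\"{line}\" -> {reg.IsMatch(line)}");
}
```

A line such as "List 4" passes because it starts with a letter, so the regex cannot tell me whether the paragraph was a list item in Word.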
If the Word file contains
- List 1
- List 2
- List 3
or
string contents = "";
// Keep paragraphs whose text starts with whitespace, a letter, a digit, • or -
var reg = new Regex(@"^[\s\p{L}\d•-]");
// Open read-only: the document is only being read here
using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(file, false))
{
    Body body = wordDoc.MainDocumentPart.Document.Body;
    foreach (Paragraph co in body.Descendants<Paragraph>()
                                 .Where(p => reg.IsMatch(p.InnerText)))
    {
        // InnerText holds the paragraph text only, not an automatic list marker
        contents += co.InnerText + "<br />";
        // insert contents into database
    }
}
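One direction I am considering, as an unverified sketch: Word's automatic lists seem to keep the marker in the paragraph's numbering properties (w:numPr) rather than in its text. The helper name IsListItem, the in-memory sample body and the "- " prefix are my own assumptions. With a real file the body would come from wordDoc.MainDocumentPart.Document.Body.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper: true when the paragraph carries numbering
// properties, i.e. Word treats it as a bulleted or numbered item.
static bool IsListItem(Paragraph p) =>
    p.ParagraphProperties?.NumberingProperties != null;

// Made-up body built in memory so the sketch needs no .docx file.
var body = new Body(
    new Paragraph(new Run(new Text("Plain text"))),
    new Paragraph(
        new ParagraphProperties(
            new NumberingProperties(
                new NumberingLevelReference { Val = 0 },
                new NumberingId { Val = 1 })),
        new Run(new Text("List 1"))));

var lines = new List<string>();
foreach (Paragraph p in body.Descendants<Paragraph>())
{
    // Assumption: prefix every list item with "- ". Telling a bullet from
    // a number would need a lookup in the numbering definitions part.
    lines.Add(IsListItem(p) ? "- " + p.InnerText : p.InnerText);
}

foreach (var line in lines)
{
    Console.WriteLine(line);
}
```

This only detects that a paragraph is a list item. It does not recover the actual marker (bullet, number, letter) that Word displays.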
then the table in the MySQL database contains only the text, without the list markers:
+----------+
| contents |
+----------+
| List 1 |
| List 2 |
| List 3 |
| List 4 |
| List 5 |
| List 6 |
+----------+
6 rows in set (0.08 sec)
Can someone help me?
Any help would be greatly appreciated.
Thank you.