Efficient Way to Round Double Precision Numbers to a Lower Precision Given in Number of Bits

Extracting Images from Word Documents Using Python docx2txt

https://github.com/ankushshah89/python-docx2txt/blob/c94663234d2882aa75932f9c9973eb5a804df13b/docx2txt/docx2txt.py#L72

it specifies directory, so instead

for i in range(len(files)):
image = docx2txt.process(path+ "/" +files[i], savepath) ## this is the line that overwrites each new image

you could specify a separate save path

for i in range(len(files)):
savepath=savepath+str(i)
image = docx2txt.process(path+ "/" +files[i], savepath) ## this is the line that overwrites each new image

Extracting images from Word document using VBA

Extracting the pictures from a Filtered HTML document that was created from your original source document would be faster. However, you said that was not a good fit for you needs so ... here is example code that will locate pictures in your source document and paste them into a second document.

The speed problem of this type of code is caused by the CopyPicture working from a Selection command, so I recommend using a range instead. Of course the For/Next loop that is required is slower no matter what.

Sub CopyPasteAsPicture()
Dim doc As Word.Document, iShp As Word.InlineShape, shp As Word.Shape
Dim i As Integer, nDoc As Word.Document, rng As Word.Range
Set doc = ActiveDocument

If doc.Shapes.Count > 0 Then
For i = 1 To doc.Shapes.Count
Set shp = doc.Shapes(i)
If shp.Type = msoLinkedPicture Or shp.Type = msoPicture Then
'if you want only pictures extracted then you have
'to specify the type
shp.ConvertToInlineShape
'if you want all extracted pictures to be in the sequence
'they appear in the document then you have to convert
'floating shapes to inline shapes
End If
Next
End If

If doc.Content.InlineShapes.Count > 0 Then
Set nDoc = Word.Documents.Add
Set rng = nDoc.Content
For i = 1 To doc.Content.InlineShapes.Count
doc.Content.InlineShapes(i).Range.CopyAsPicture
rng.Paste
rng.Collapse Word.WdCollapseDirection.wdCollapseEnd
rng.Paragraphs.Add
rng.Collapse Word.WdCollapseDirection.wdCollapseEnd
Next
End If
End Sub

If you want to place all shapes (floating or inline) into a folder as image files, then the best way is to save the source document as a filtered HTML document. Here is the command:

htmDoc.SaveAs2 FileName:=LGPWorking & strFileName, AddToRecentFiles:=False, FileFormat:=Word.WdSaveFormat.wdFormatFilteredHTML

In the above the active document is assigned to the variable htmDoc. I am giving this new document a specific name and location. The output from this is not only the HTML file but also a directory by the same name with an appended "_Files" label. In the "x_Files" directory are all the image files.

If you only want selective images pulled from your original source document, or if you want images pulled from multiple source documents ... then you need to use the above code that I shared for placing only the images you want from one or more source document into a new Word document and then save that new document as an Filtered HTML.

When your routine is done, you can Kill the HTML document and only leave the Files directory.



Related Topics



Leave a reply



Submit