Convert html to plain text in VBA
Set a reference to "Microsoft HTML Object Library" (Tools > References in the VBA editor).
Function HtmlToText(sHTML) As String
    Dim oDoc As HTMLDocument
    Set oDoc = New HTMLDocument
    oDoc.body.innerHTML = sHTML
    HtmlToText = oDoc.body.innerText
End Function
Tim
Decode HTML entities into plain text
You could create an HTMLDocument object, store the HTML in it, and read the text version back out:
Function HtmlDecode(str)
    Dim dom
    Set dom = CreateObject("htmlfile")
    dom.Open
    dom.Write str
    dom.Close
    HtmlDecode = dom.body.innerText
End Function
decoded = HtmlDecode("&plusmn;") ' = "±"
how to convert a column of text with html tags to formatted text in vba in excel
Your code works only for the first row because it reads and writes only the first row:
'get the A1 cell value
.document.body.InnerHTML = Sheets("Sheet1").Range("A1").Value
'set the B1 cell value
ActiveSheet.Paste Destination:=Sheets("Sheet1").Range("B1")
To apply your code to every row, you have to run it inside a loop.
So your code becomes:
Sub Sample()
    Dim Ie As Object
    Dim lastRow As Long
    Dim Row As Long
    'get the last filled row in column A
    lastRow = Sheets("Sheet1").Range("A" & Sheets("Sheet1").Rows.Count).End(xlUp).Row
    'loop to apply the code to every filled row
    For Row = 1 To lastRow
        Set Ie = CreateObject("InternetExplorer.Application")
        With Ie
            .Visible = False
            .Navigate "about:blank"
            'wait until the blank page has loaded (4 = READYSTATE_COMPLETE)
            Do While .Busy Or .ReadyState <> 4
                DoEvents
            Loop
            'load the HTML from the cell you want converted
            .document.body.InnerHTML = Sheets("Sheet1").Range("A" & Row).Value
            'select all contents in the browser
            .ExecWB 17, 0
            'copy them
            .ExecWB 12, 2
            'paste into the cell that should receive the converted HTML
            ActiveSheet.Paste Destination:=Sheets("Sheet1").Range("B" & Row)
            .Quit
        End With
        Set Ie = Nothing
    Next Row
End Sub
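If plain text (rather than formatted text) is enough, the same loop could be sketched without Internet Explorer or the clipboard by reusing the late-bound "htmlfile" approach shown earlier. This is only a minimal sketch, not the method from the answer above. The sub name ConvertColumnToPlainText is made up for illustration, the sheet name and the columns A and B are carried over from the code above, and it assumes the cells hold ordinary text rather than error values.

```vba
Sub ConvertColumnToPlainText()
    Dim ws As Worksheet
    Dim dom As Object
    Dim html As String
    Dim lastRow As Long
    Dim r As Long

    Set ws = Worksheets("Sheet1")
    lastRow = ws.Range("A" & ws.Rows.Count).End(xlUp).Row

    For r = 1 To lastRow
        html = CStr(ws.Range("A" & r).Value)
        'skip empty cells so we never read the body of an empty document
        If Len(html) > 0 Then
            'a fresh document per row keeps rows independent of each other
            Set dom = CreateObject("htmlfile")
            dom.Open
            dom.Write html
            dom.Close
            'innerText drops the tags and decodes entities
            ws.Range("B" & r).Value = dom.body.innerText
        End If
    Next r
End Sub
```

The trade-off is that bold, colours and other formatting are lost, since only the text is written to column B.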
VBA ms word macro: convert an embedded HTML link into plain text
This prints an HTML anchor tag, in the format you specified, for every hyperlink in your document.
Dim hlink As Hyperlink
Dim htmlLink As String
For Each hlink In ThisDocument.Hyperlinks
    With hlink
        htmlLink = "<a target=""_blank"" href=""" & .Address & """>" & _
            .TextToDisplay & "</a>"
        Debug.Print htmlLink
    End With
Next hlink
Of course, you'll want to do something more useful with them than just print them in the Immediate window.
As an aside, I prefer to use DuckDuckGo in my examples due to its much better privacy policy than Google's...
How do you convert Html to plain text?
If you are talking about tag stripping, it is relatively straightforward as long as you don't have to worry about things like <script> tags. If all you need to do is display the text without the tags, you can accomplish that with a regular expression:
<[^>]*>
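In VBA, that pattern could be applied with the VBScript regular expression engine. A minimal sketch, assuming a Windows host where "VBScript.RegExp" is available; the function name StripTags is made up for illustration:

```vba
Function StripTags(ByVal sHTML As String) As String
    Dim re As Object
    'late binding, so no reference to the VBScript regular expressions library is needed
    Set re = CreateObject("VBScript.RegExp")
    re.Global = True          'replace every tag, not just the first one
    re.Pattern = "<[^>]*>"
    StripTags = re.Replace(sHTML, "")
End Function
```

For example, StripTags("<p>Hello <b>world</b></p>") returns "Hello world". Unlike innerText, this leaves entities such as &amp; undecoded.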
If you do have to worry about <script> tags and the like, then you'll need something a bit more powerful than regular expressions, because you need to track state: something more like a Context Free Grammar (CFG). Although you might be able to accomplish it with 'Left To Right' or non-greedy matching.
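One way the non-greedy idea might look in VBA is to remove whole <script> and <style> elements first and then strip the remaining tags. This is only a sketch, assuming a Windows host where "VBScript.RegExp" is available. StripTagsAndScripts is a hypothetical name, and a regular expression can still be fooled, for example by a "</script>" that appears inside a JavaScript string.

```vba
Function StripTagsAndScripts(ByVal sHTML As String) As String
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Global = True
    re.IgnoreCase = True
    'non-greedy [\s\S]*? stops at the first matching closing tag;
    '[\s\S] is used instead of . so that line breaks are matched too
    re.Pattern = "<(script|style)\b[^>]*>[\s\S]*?</\1\s*>"
    sHTML = re.Replace(sHTML, "")
    'then apply the simple tag pattern to whatever tags are left
    re.Pattern = "<[^>]*>"
    StripTagsAndScripts = re.Replace(sHTML, "")
End Function
```

For example, StripTagsAndScripts("<p>Hi</p><script>if (a < b) alert('x');</script>") returns "Hi", whereas the simple pattern on its own would leave the script's code in the output.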
If you can use regular expressions, there are many web pages out there with good info:
- http://weblogs.asp.net/rosherove/archive/2003/05/13/6963.aspx
- http://www.google.com/search?hl=en&q=html+tag+stripping+&btnG=Search
If you need the more complex behaviour of a CFG, I would suggest using a third-party tool. Unfortunately, I don't know of a good one to recommend.