I have an Excel file with hundreds of cells that use the Hyperlink formula =HYPERLINK(
. I need to extract the plain te
I ended up using Python:
Read the xlsx with a code similar to:
from openpyxl import load_workbook
wb = load_workbook(filename = 'cities.xlsx')
print(wb.worksheets)
print(dir(wb))
sheet_ranges = wb['Sheet1']
for c in sheet_ranges['B']:
print(c.hyperlink.target)
Note that the name 'Sheet1' or column names might be different case by case ('B' is the column with hyperlink in my case).
Here is an Excel formula that can extract URL from a hyperlink used into cell.
A1= Excel Cell where you want to extract URL.
=MID(FORMULATEXT(A1),FIND(CHAR(34),FORMULATEXT(A1))+1,FIND(CHAR(34),FORMULATEXT(A1),FIND(CHAR(34),FORMULATEXT(A1))+1)-1-FIND(CHAR(34),FORMULATEXT(A1)))
Excel Formula to get URL from excel hyperlink [Image]
Non-VBA possibility:
Work on copies of the cells with links because the first step is to replace part of their content (specifically =
with I suggest the not sign ¬
). Then, assuming the copy is in A1:
=SUBSTITUTE(LEFT(MID(A1,13,LEN(A1)),FIND("""",MID(A1,13,LEN(A1)))-1),"¬","=")
and replace ¬
with =
where the link contains an equals sign.
Here is a method that will return the hyperlink text whether it has been created by a formula, or by the Insert/Hyperlink method.
If the former, we merely have to parse the formula; if the latter, we need to iterate through the hyperlinks collection on the worksheet.
The formula will return nothing if there is no hyperlink in cell_ref; change to suit.
Option Explicit
Function HyperLinkText(rg As Range)
Dim sFormula As String, S As String
Dim L As Long
Dim H As Hyperlink, HS As Hyperlinks
sFormula = rg.Formula
L = InStr(1, sFormula, "HYPERLINK(""", vbBinaryCompare)
If L > 0 Then
S = Mid(sFormula, L + 11)
S = Left(S, InStr(S, """") - 1)
Else
Set HS = rg.Worksheet.Hyperlinks
For Each H In HS
If H.Range = rg Then
S = H.Address
End If
Next H
End If
HyperLinkText = S
End Function
You could extract it with a regex:
Dim re
Set re = CreateObject("VBScript.RegExp")
re.Pattern = "^=HYPERLINK\(""([^""]+)"""
If re.Test(pRange.Formula) Then
Debug.Print "URL = " & re.Execute(pRange.Formula)(0).SubMatches(0)
Else
Debug.Print "URL not found"
End If
This just checks to see if the formula begins with:
=HYPERLINK("
and, if so, grabs the text from that point until the following "
.
Hm - playing around with it, I couldn't get .Address
to work either.
You say you want to extract the URL only, I was able to do that with this macro:
Function hyperlinkText(pRange As Range) As String
Dim st1 As String, st2 As String
Dim tempSub1 As String, tempSub2 As String
If Left(pRange.Formula, 10) <> "=HYPERLINK" Then
hyperlinkText = "not found"
Exit Function
Else
tempSub1 = WorksheetFunction.Substitute(pRange.Formula, """", "[", 1)
tempSub2 = WorksheetFunction.Substitute(tempSub1, """", "]", 1)
hyperlinkText = Mid(tempSub2, WorksheetFunction.Find("[", tempSub2) + 1, WorksheetFunction.Find("]", tempSub2) - WorksheetFunction.Find("[", tempSub2) - 1)
End If
End Function
Note though, it doesn't get the "Friendly Name" of the Hyperlink()
formula, just the URL.