Link Extraction from a Google Page in Bash

How to extract the URL of the currently open page from a browser (e.g. Google Chrome) using a Linux bash script?

Put the URL into the window title with the extension at https://chrome.google.com/webstore/detail/url-in-title/ignpacbgnbnkaiooknalneoeladjnfgb , then read the title with wmctrl -l.
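Once the URL is in the title, it still has to be picked out of the wmctrl listing. A minimal sketch, assuming the extension leaves the URL in the title as a whitespace-delimited token; the function name urls_from_titles and the sample title are made up for illustration:

```shell
# Print every http(s) URL found in window-list text read from stdin.
# Assumption: the URL appears in the title with no spaces inside it.
urls_from_titles() {
    grep -oE 'https?://[^[:space:]]+'
}

# Sample line in the shape wmctrl -l prints: window id, desktop, host, title.
# The title text here is invented for the demo.
printf '%s\n' '0x03400003  0 myhost Example Domain - https://example.com/page - Google Chrome' |
    urls_from_titles

# On a live X session with wmctrl installed: wmctrl -l | urls_from_titles
```

If several browser windows are open, this prints one URL per window, so you may want to filter the listing by window title first.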

Getting the URLs for the first Google search results in a shell script

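I ended up using curl's --data-urlencode option to encode the query parameter and plain sed to extract the first result. Note that Google has since retired this AJAX Search API, so the endpoint below no longer returns results; the encoding and extraction technique still applies to other JSON endpoints.

curl -s --get --data-urlencode "q=example" "http://ajax.googleapis.com/ajax/services/search/web?v=1.0" | sed 's/"unescapedUrl":"\([^"]*\).*/\1/;s/.*GwebSearch",//'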

How to extract a link from an html file using bash

You can do all of that with your native grep.

These options from grep's man page may be just what you are looking for:

-E, --extended-regexp
Interpret PATTERN as an extended regular expression (ERE, see below). (-E is specified by POSIX.)

-o, --only-matching
Print only the matched (non-empty) parts of a matching line, with each such part on a separate output line.

curl -s "$URL" | grep -o -E "href=[\"'][^\"']*[\"']"

The regular expression is very generic, but you can refine it to suit your needs.
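One possible refinement is to strip the href= wrapper so that only the link itself remains. The sketch below assumes quoted attribute values that contain no quote characters of their own; extract_hrefs is a hypothetical helper name, and the HTML fed to it is sample data:

```shell
# Print the value of every href attribute found on stdin, one per line.
# Assumption: values are wrapped in " or ' and contain neither character.
extract_hrefs() {
    grep -oE "href=[\"'][^\"']*[\"']" |
        sed -E "s/^href=[\"']//; s/[\"']\$//"
}

# Two links on one line, one with each quote style.
printf '%s\n' "<a href=\"http://a.example/x\">A</a> <a href='b.html'>B</a>" | extract_hrefs
```

Because grep -o emits each match on its own line, several links on the same input line are all reported. To keep the result, redirect it, for example: curl -s "$URL" | extract_hrefs > links.txt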

How to strip out all of the links of an HTML file in Bash or grep or batch and store them in a text file

$ sed -n 's/.*href="\([^"]*\).*/\1/p' file
http://www.drawspace.com/lessons/b03/simple-symmetry
http://www.drawspace.com/lessons/b04/faces-and-a-vase
http://www.drawspace.com/lessons/b05/blind-contour-drawing
http://www.drawspace.com/lessons/b06/seeing-values

Extract filename and path from URL in bash script

In bash:

URL='http://login:password@example.com/one/more/dir/file.exe?a=sth&b=sth'
URL_NOPRO=${URL:7}
URL_REL=${URL_NOPRO#*/}
echo "/${URL_REL%%\?*}"

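This works only if the URL starts with http:// or another scheme prefix of the same length (7 characters), because ${URL:7} simply skips the first seven characters.
Otherwise, it is probably easier to use a regex with sed or grep, or to split the string with cut.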
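The fixed offset can also be avoided with pattern-based parameter expansion, which removes everything up to "://" whatever the scheme length. A minimal sketch, assuming the URL contains "://"; url_path and FILE_PATH are names introduced here for illustration:

```shell
# Print the path component of a URL, without query string or fragment.
# Assumption: the URL has the form scheme://authority[/path][?query][#fragment].
url_path() {
    local rest path
    rest=${1#*://}                  # drop the scheme, whatever its length
    case $rest in
        */*) path=/${rest#*/} ;;    # drop user:password@host
        *)   path=/ ;;              # URL had no path at all
    esac
    printf '%s\n' "${path%%[?#]*}"  # drop ?query and #fragment
}

URL='https://login:password@example.com/one/more/dir/file.exe?a=sth&b=sth'
FILE_PATH=$(url_path "$URL")
echo "$FILE_PATH"        # full path
echo "${FILE_PATH%/*}"   # directory part
echo "${FILE_PATH##*/}"  # file name
```

For the sample URL this prints /one/more/dir/file.exe, /one/more/dir and file.exe, covering both the path and the filename asked for in the question.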

Bash: Extract URL from markdown format

The command below gets the expected URL

sed -nre '/:target=/ s/.*[]][(]([^)]+)[)][{]:target=.*/\1/p' test.txt 

Result

https://www.linkhere.net/somepage

Alternative command

sed -nre '/:target=/ s/.*\]\(([^)]+)\)\{:target=.*/\1/p' test.txt
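Both sed commands print at most one URL per input line, since the leading .* is greedy and swallows any earlier links. If a line can hold several links, a grep -o based variant may be more suitable. This is a sketch under assumptions: md_target_urls is a hypothetical helper, and the sample input is invented because the contents of test.txt are not shown:

```shell
# Print the URL of every markdown link that is followed by a {:target=...} attribute.
# Reads markdown on stdin; handles several links on one line.
md_target_urls() {
    grep -oE '\]\([^)]+\)\{:target=' |
        sed -E 's/^\]\(//; s/\)\{:target=$//'
}

# Hypothetical input line with two links.
printf '%s\n' 'See [A](https://a.example/1){:target="_blank"} and [B](https://b.example/2){:target="_blank"}.' |
    md_target_urls
```

On a real file the call would be md_target_urls < test.txt. Links without a {:target=...} attribute are skipped, matching the behaviour of the sed commands.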


