Parsing CSS Background-Image

Parsing CSS background-image

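function split (string) {
  // Group 1: text up to the next delimiter (quoted strings are skipped whole).
  // Group 2: the delimiter itself: "(", ",", ")" or end of input.
  var token = /((?:[^"']|".*?"|'.*?')*?)([(,)]|$)/g;
  return (function recurse () {
    for (var array = [];;) {
      var result = token.exec(string);
      if (result[2] == '(') {
        // Nested function: parse its arguments recursively, then rebuild it.
        array.push(result[1].trim() + '(' + recurse().join(',') + ')');
        // Read the delimiter that follows the closing ")".
        result = token.exec(string);
      } else array.push(result[1].trim());
      // Anything other than "," ends the current argument list.
      if (result[2] != ',') return array;
    }
  })();
}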

split("linear-gradient(top left, red, rgba(255,0,0,0)), url(a), image(url" +
"(b.svg), 'b.png' 150dpi, 'b.gif', rgba(0,0,255,0.5)), none")

returns:

["linear-gradient(top left,red,rgba(255,0,0,0))", "url(a)",
"image(url(b.svg),'b.png' 150dpi,'b.gif',rgba(0,0,255,0.5))", "none"]
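If you would rather avoid the regex altogether, the same top-level split can be sketched with a plain character scan that tracks parenthesis depth and open quotes. This is a different technique from the answer above, not a rewrite of it. The name splitTopLevel and the sample value are assumptions made for illustration, and the sketch does not handle backslash escapes or unquoted url() values that contain commas or parentheses.

```javascript
// Hypothetical helper: split a background-image value on top-level commas only.
function splitTopLevel(value) {
  const parts = [];
  let depth = 0;     // current parenthesis nesting level
  let quote = null;  // the open quote character, or null when outside a string
  let current = "";
  for (const ch of value) {
    if (quote) {
      if (ch === quote) quote = null;   // closing quote
    } else if (ch === '"' || ch === "'") {
      quote = ch;                       // opening quote
    } else if (ch === "(") {
      depth++;
    } else if (ch === ")") {
      depth--;
    } else if (ch === "," && depth === 0) {
      parts.push(current.trim());       // boundary between two layers
      current = "";
      continue;                         // do not keep the separating comma
    }
    current += ch;
  }
  parts.push(current.trim());           // last layer
  return parts;
}

// Assumed sample input; note the comma inside the quoted file name.
console.log(splitTopLevel(
  "linear-gradient(to right, red, rgba(255,0,0,0)), url('a,b.png'), none"
));
// [ "linear-gradient(to right, red, rgba(255,0,0,0))", "url('a,b.png')", "none" ]
```

Unlike split above, this keeps the original spacing inside each layer, because it never takes the nested arguments apart.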

Parsing CSS background-image and selector

(?<selector>\n.*?)[^{}](\{.+?(?<=url\()(?<image>[^\)]*).+?\})

Applied to your <script> block, this regex produces matches with two named groups. For the first match:

regex.groups("selector") = ".foo, #bar"
regex.groups("image") = "image.png"

and for the second match

regex.groups("selector") = "P"
regex.groups("image") = "image2.png"

Alternatively, if you already know the file name, put "image.png" directly in the regex:

\n(?<selector>[^{]*?){(?<t>.+)(?<image>image\.png)

which gives

regex.groups("selector") = ".foo, #bar"
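For JavaScript, the named-group idea might look roughly like the sketch below. The pattern is a simplified one written for this example, not the regex from the answer, and the css sample is an assumed input built around the selectors and file names quoted above. It only handles flat rules (no nested blocks such as @media) and takes the first url() in each rule.

```javascript
// Assumed sample stylesheet; only the selectors and file names come from the answer.
const css = `
.foo, #bar { color: red; background: url(image.png); }
P { background-image: url("image2.png"); }
`;

// selector: everything before "{"; image: the url() argument without its quotes.
const rule = /(?<selector>[^{}]+)\{[^{}]*?url\(\s*["']?(?<image>[^"')]+)["']?\s*\)[^{}]*\}/g;

const found = [...css.matchAll(rule)].map((m) => ({
  selector: m.groups.selector.trim(),
  image: m.groups.image.trim(),
}));

console.log(found);
// [ { selector: ".foo, #bar", image: "image.png" },
//   { selector: "P", image: "image2.png" } ]
```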

Parsing css background url and selector using regex

update

After a closer look, I offer two solutions that mitigate the backtracking issues to some degree.

Before looking at them, I want to point out that CSS syntax uses only a handful of delimiters.

It is defined more by which characters are allowed, and in what order, than by those delimiters.

The cure for backtracking is to restrict the regex engine to fewer allowable characters at strategic positions.

If you look at the CSS specification here -> https://www.w3.org/TR/CSS21/syndata.html

you'll notice that its tokens are defined with regular-expression-style patterns.

That suggests CSS tokenizers can largely be assembled from small pieces of regex.

However, while it would be an interesting exercise to put it all into one all-encompassing regex, I will decline that challenge, because there is nothing in it for me.

Instead, I offer these two regexes, tailored to your request.

First one:

  • Matches only the first url() block within the <style> element

<style[^>]*?>(?:[^{}:]*{[^{}]*?:[^{}()]*?})*?(?:([^{}:]*){[^{}]*?:\s*url\s*\(\s*([^{}()]*?)\s*\)\s*})

see -> https://regex101.com/r/2SNIks/1


Second one:

  • Matches all the url() blocks within the <style> element

(?:<style[^>]*?>|(?!^)\G)(?:(?:(?!</style)[^{}:])*{[^{}]*?:[^{}()]*?})*?(?:([^{}:]*){[^{}]*?:\s*url\s*\(\s*([^{}()]*?)\s*\)\s*})

see -> https://regex101.com/r/d8q6LH/1


For both regexes:

  • The selector is in group 1
  • The url is in group 2
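Note that the second regex relies on \G, which JavaScript's regex engine does not support. If you need this in JavaScript, one possible workaround is a two-step sketch: isolate each <style> element first, then match the rules inside it. This is a swapped-in technique, not a port of the regexes above. The function name styleUrls and the sample markup are assumptions. As in the originals, the selector is group 1, the url is group 2, url() must directly follow the property's colon, and any quotes around the url are kept.

```javascript
// Hypothetical helper: collect { selector, url } pairs from every <style> element.
function styleUrls(html) {
  const results = [];
  // Step 1: capture the text content of each <style> element.
  const styleBlock = /<style[^>]*>([\s\S]*?)<\/style>/gi;
  // Step 2: within that text, group 1 is the selector and group 2 the url.
  const rule = /([^{}]+)\{[^{}]*?:\s*url\s*\(\s*([^{}()]*?)\s*\)[^{}]*\}/g;
  for (const block of html.matchAll(styleBlock)) {
    for (const m of block[1].matchAll(rule)) {
      results.push({ selector: m[1].trim(), url: m[2] });
    }
  }
  return results;
}

// Assumed sample markup.
const html = `
<style type="text/css">
.foo, #bar { color: red; background-image: url(image.png); }
P { margin: 0 }
P.hero { background: url( image2.png ) }
</style>
<p style="background: url(ignored.png)">not inside a style element</p>
`;

console.log(styleUrls(html));
// [ { selector: ".foo, #bar", url: "image.png" },
//   { selector: "P.hero", url: "image2.png" } ]
```

Because every character class excludes the braces, each attempt is confined to a single rule, which is the same backtracking-limiting idea described above.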

Why do I get a parse error when trying to add a background image?

Wrap the URL in quotes, single or double; whitespace in the path is then fine:

background-image: url('../Project Img/camping sky.jpg');
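If the value is built from script rather than written by hand, the same quoting rule can be applied with a small helper. This is only a sketch: cssUrl is a hypothetical name, and it assumes the path contains no line breaks.

```javascript
// Hypothetical helper: wrap a path in a double-quoted CSS url() value.
function cssUrl(path) {
  // Escape backslashes first, then double quotes, so the CSS string stays well-formed.
  const escaped = path.replace(/\\/g, "\\\\").replace(/"/g, '\\"');
  return `url("${escaped}")`;
}

console.log(cssUrl("../Project Img/camping sky.jpg"));
// url("../Project Img/camping sky.jpg")
```

In a browser the result could then be assigned to something like element.style.backgroundImage.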

Parse css file and build array of background images with the class as a key

Try this function:

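function makeArrayOfBackroundImages($css) {
    // Group 1: class name. Group 2: the url, without surrounding quotes or spaces.
    $pattern = '/\.([a-z\d_-]+)\s*\{[^{}]*url\(\s*["\']?([^"\'()]+?)["\']?\s*\)/i';
    if (!preg_match_all($pattern, $css, $arr)) return array();
    // Class names become the keys, urls the values.
    return array_combine($arr[1], $arr[2]);
}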

Note: It extracts only one image url per class, so it is not intended for rules with multiple background images.
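If you do need every image of a rule, one way around that limitation is to match the rule body first and then collect all url() values inside it. The sketch below shows the idea in JavaScript rather than PHP. The function name backgroundImagesByClass and the sample CSS are assumptions, and like the function above it only looks at the class name directly in front of the opening brace.

```javascript
// Hypothetical helper: map each class name to an array of all its url() values.
function backgroundImagesByClass(css) {
  const result = {};
  // Group 1: class name. Group 2: the declarations between the braces.
  const rule = /\.([a-z\d_-]+)\s*\{([^{}]*)\}/gi;
  // Group 1: the url, without surrounding quotes or whitespace.
  const url = /url\(\s*["']?([^"'()]+?)["']?\s*\)/gi;
  for (const [, className, body] of css.matchAll(rule)) {
    const urls = [...body.matchAll(url)].map((m) => m[1]);
    if (urls.length > 0) result[className] = urls;  // skip rules without images
  }
  return result;
}

// Assumed sample stylesheet.
const sampleCss = `
.hero { background-image: url('a.png'), url("b.png"); }
.plain { color: red; }
.logo{background:url(logo.svg) no-repeat}
`;

console.log(backgroundImagesByClass(sampleCss));
// { hero: [ "a.png", "b.png" ], logo: [ "logo.svg" ] }
```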

Extracting background-images from a web page / Parsing HTML+CSS

When you parse the CSS of a web site, any images you are going to get back are going to be related to the user interface (sprites, backgrounds), not the actual content of the page.

I don't think it would be worth your while unless you're just trying to extract logos. In that case I would restrict to matches on class names/ids/paths containing the word "logo".

If you want to extract "representative images" from a page, I would just parse the image tags as you are doing then generate (and crop) a screenshot of the page as per: How do I take screenshots of web pages using ruby and a unix server?

How are you handling images that aren't in the raw HTML source?

In terms of libraries, I'm pretty sure nokogiri is the best thing out there.

Css background-image issue

Replace display: inline; with display: inline-block; and add the background properties:

.linkedin {
  display: inline-block;
  background-image: url('../images/Linkedin(Idle).png');
  background-repeat: no-repeat;
  background-position: 0 0;
  width: 16px;
  height: 16px;
}

