What Are The Best Practices for Writing Maintainable CSS

What is the best practice for writing maintainable web scrapers?

Pages can change so drastically that building a very "smart" scraper may be quite difficult; and even if it were possible, the scraper would be somewhat unpredictable, even with fancy techniques like machine learning. It's hard to make a scraper that is both trustworthy and automatically flexible.

Maintainability is somewhat of an art-form centered around how selectors are defined and used.

In the past I have rolled my own "two stage" selectors:

  1. (find) The first stage is highly inflexible and checks the structure of the page toward a desired element. If the first stage fails, then it throws some kind of "page structure changed" error.

  2. (retrieve) The second stage then is somewhat flexible and extracts the data from the desired element on the page.

This allows the scraper to isolate itself from drastic page changes with some level of auto-detection, while still maintaining a level of trustworthy flexibility.

I have frequently used XPath selectors, and it is really quite surprising, with a little practice, how flexible you can be with a good selector while still being very accurate. I'm sure CSS selectors are similar. This gets easier the more semantic and "flat" the page design is.

A few important questions to answer are:

  1. What do you expect to change on the page?

  2. What do you expect to stay the same on the page?

When answering these questions, the more accurate you can be the better your selectors can become.

In the end, it's your choice how much risk you want to take and how trustworthy your selectors will be. How you craft them makes a big difference, both when finding and when retrieving data on a page. Ideally, it's best to get data from a web API, which hopefully more sources will begin providing.


EDIT: Small example

Using your scenario, where the element you want is at .content > .deal > .tag > .price, the general .content .price selector is very "flexible" regarding page changes; but if, say, a false positive element arises, we may desire to avoid extracting from this new element.

Using two-stage selectors we can specify a less general, more inflexible first stage like .content > .deal, and then a second, more general stage like .price to retrieve the final element using a query relative to the results of the first.

So why not just use a selector like .content > .deal .price?

For my use, I wanted to be able to detect large page changes without running extra regression tests separately. I realized that rather than one big selector, I could write the first stage to include important page-structure elements. This first stage would fail (or report) if the structural elements no longer exist. Then I could write a second stage to more gracefully retrieve data relative to the results of the first stage.

I shouldn't say that it's a "best" practice, but it has worked well.
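Here is a minimal sketch of the two-stage idea, assuming a DOM-like root that supports querySelector (for example document in a browser). The name twoStageSelect and the error message are made up for illustration, not taken from any library.

```javascript
// Hypothetical helper: the name and error text are illustrative only.
function twoStageSelect(root, findSelector, retrieveSelector) {
  // Stage 1 (find): strict structural check. Fail loudly if the layout moved.
  var container = root.querySelector(findSelector);
  if (!container) {
    throw new Error('Page structure changed: "' + findSelector + '" not found');
  }
  // Stage 2 (retrieve): looser query, relative to the stage-1 result.
  var target = container.querySelector(retrieveSelector);
  return target ? target.textContent.trim() : null;
}
```

With the scenario above, a call would look like twoStageSelect(document, '.content > .deal', '.price'). A missing .content > .deal raises the structure error, while a missing .price just returns null, so the caller can tell the two kinds of failure apart.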

Best Practices - CSS Theming

You should look into SASS, specifically their "Mixins" feature. It takes CSS and introduces a very DRY programmatic approach. You can override existing classes with it, making it perfect for what I think you're trying to do.


Are there any CSS standards that I should follow while writing my first stylesheet?

An error that beginners make quite often:

CSS is semantic as well. Try to express concepts, not formats. Contrived example:

Wrong:

div.red
{
color: red;
}

as opposed to:

Good:

div.error
{
color: red;
}

CSS should be the formatting companion for the concepts you use on your web site, so they should be reflected in it. You will be much more flexible this way.

What is the best technique for consistent form, function between all web browsers (including Google Chrome)?

I am in a similar situation, working on a web app that is targeted at IT professionals, and required to support the same set of browsers, minus Opera.

Some general things I've learned so far:

  • Test often, in as many of your target browsers as you can. Make sure you have time for this in your development schedule.
  • Toolkits can get you part of the way to cross-browser support, but will eventually miss something on some browser. Plan some time for debugging and researching fixes for specific browsers.
  • If you need something that's not in a toolkit and can't find a free code snippet, invest some time to write utility functions that encapsulate the browser-dependent behavior.
  • Educate yourself about known browser bugs, so that you can steer your implementation around them.

A few more-specific things I've learned:

  • Use conditional code based on the user-agent only as a last resort, because different generations of the "same" browser may have different features. Instead, test for standards-compliant behavior first — e.g., if(node.addEventListener)..., then common non-standard functions — e.g., if(window.attachEvent)..., and then, if you must, look at the user-agent for a specific browser type & version number.
  • Knowing when the DOM is 'ready' for script access is different in just about every browser. A good toolkit will abstract this for you.
  • Event handlers are different in just about every browser. A good toolkit will abstract this for you.
  • Creating DOM elements, particularly form controls or elements with attributes, can be tricky with document.createElement and element.setAttribute. While not standard (and kinda yucky), using node.innerHTML with strings that contain bits of HTML seems to be more reliable across browser types. I have yet to find a toolkit that will let you use element.setAttribute to add a 'name' to a form element in IE.
  • CSS differences (and bugs) are just as important as JS differences.
  • The 'core' Javascript features (String, Date, RegExp, Array functions) seem to be pretty reliable and consistent across browsers, especially relative to the DOM/CSS/Window functions. There's some small joy in the fact that the language isn't entirely different on every platform. :-)
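The detection order from the first bullet above can be sketched as a small utility function. This is only an illustration: addListener is a hypothetical name, the returned strings exist just to show which branch ran, and the final fallback assigns the on<event> property rather than sniffing the user-agent.

```javascript
// Hypothetical helper illustrating the detection order described above.
function addListener(node, type, handler) {
  if (node.addEventListener) {
    // Standards-compliant path.
    node.addEventListener(type, handler, false);
    return 'standard';
  }
  if (node.attachEvent) {
    // Legacy IE path: event names carry an "on" prefix.
    node.attachEvent('on' + type, handler);
    return 'legacy';
  }
  // Last resort (an assumption of this sketch): set the handler property directly.
  node['on' + type] = handler;
  return 'dom0';
}
```

Callers then write addListener(button, 'click', onClick) everywhere and the browser-dependent part lives in one place.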

I haven't really run into any Chrome-specific JS bugs, but it's always one of the first browsers I test.

HTH

Maintainable CSS for large number of themes

Broadly speaking, I know of two ways to go about changing a site's style while using one CSS source (which may or may not require multiple files).

  1. Define lots of classes (.blueBorder, .redBorder, etc.) and use JavaScript to add and remove those classes on elements as needed.
  2. Or define classes and use JavaScript to change the definition of those classes.

It is possible to use a mixture of both approaches, though I'm not sure why one would want to do that.
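For comparison, the first approach could look something like the sketch below. swapClass is a hypothetical name, and the sketch assumes the element exposes classList, which older browsers such as IE9 and below do not.

```javascript
// Hypothetical helper for approach 1: swap one predefined class for another.
function swapClass(element, oldName, newName) {
  // Removing a class that is not present is a no-op, so no check is needed.
  element.classList.remove(oldName);
  element.classList.add(newName);
}
```

A call such as swapClass(document.getElementById('box1'), 'redBorder', 'blueBorder') changes only that one element, which is the main difference from the second approach: every themed element has to be visited.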

Here's a JSFIDDLE using the 2nd approach.

Rather than using jQuery, which would make the coding somewhat simpler (I guess) due to the power of its selectors, I decided to use a pure JavaScript solution, the meat of which, however, I did not write. The function getCSSRule, by Patrick Hunlock, can be found here. Each line of the function is commented. However, I've removed the comments in the Fiddle only because of wrapping issues.

The function returns a pointer to a CSS rule which then can be easily manipulated. For example:

    // get a class rule (in production code check return value for valid result)
var r = getCSSRule('.primaryColor');
// change its definition
r.style.backgroundColor = "#f00";

All elements which have the class primaryColor assigned to them will have their background color change to red (#f00) at the point the 2 above lines execute. There is nothing else required.

NOTE: the property names on the style object are not exactly the same as the CSS property names (backgroundColor vs. background-color). I know a lot of folks here do not like the w3Schools.com site, but when looking for a style object reference, that's where I found one. You can find it here

And here is the code:

Starting CSS Styles:

    <style type="text/css">

#box1 {width: 50%; height: 200px; margin: 40px auto; padding-top: 20px;}
#box2 {width: 50%; height: 120px; margin: 20px auto 20px; padding: 10px;}
.primaryColor {background-color: #f00;}
.primaryBorder {border: 10px solid #000;}
.secondaryColor {background-color: #ff0;}
.secondaryBorder {border: 5px solid #fff;}
.t {color: #f00;}
</style>

HTML:

<div id="box1" class="primaryColor primaryBorder">
<div id="box2" class="secondaryColor secondaryBorder"><p class="t">Theme Demonstration</p>
</div>
</div>

<form style="margin: 40px auto; width:50%">
<div role="radiogroup" style="text-align:center">
<input type="radio" name="theme" CHECKED value="theme1" onClick="setThemeOne()" >Theme 1
<input type="radio" name="theme" value="theme2" onClick="setThemeTwo()" >Theme 2
<input type="radio" name="theme" value="theme3" onClick="setThemeThree()">Theme 3
</div>
</form>

And the good stuff, JavaScript:

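function getCSSRule(ruleName, deleteFlag) {
  // Selector text is compared case-insensitively.
  ruleName = ruleName.toLowerCase();
  if (document.styleSheets) {
    // Walk every style sheet attached to the document.
    for (var i = 0; i < document.styleSheets.length; i++) {
      var styleSheet = document.styleSheets[i];
      var ii = 0;
      var cssRule = false;
      do {
        // cssRules is the standard collection; rules is the old IE one.
        if (styleSheet.cssRules) {
          cssRule = styleSheet.cssRules[ii];
        } else {
          cssRule = styleSheet.rules[ii];
        }
        if (cssRule) {
          if (cssRule.selectorText.toLowerCase() == ruleName) {
            if (deleteFlag == 'delete') {
              // Remove the matching rule and report success.
              if (styleSheet.cssRules) {
                styleSheet.deleteRule(ii);
              } else {
                styleSheet.removeRule(ii);
              }
              return true;
            } else {
              // Hand back the live rule object so the caller can edit it.
              return cssRule;
            }
          }
        }
        ii++;
      } while (cssRule);
    }
  }
  // No matching rule was found in any style sheet.
  return false;
}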

function setThemeOne() {
var r = getCSSRule('.primaryColor');
r.style.backgroundColor = "#f00";
r = getCSSRule('.primaryBorder');
r.style.border = "10px solid #000";
r = getCSSRule('.secondaryColor');
r.style.backgroundColor = "#ff0";
r = getCSSRule('.secondaryBorder');
r.style.border = "5px solid #fff";
r = getCSSRule('.t');
r.style.color = "#000";
};

function setThemeTwo() {
var r = getCSSRule('.primaryColor');
r.style.backgroundColor = "#ff0";
r = getCSSRule('.primaryBorder');
r.style.border = "10px solid #ccc";
r = getCSSRule('.secondaryColor');
r.style.backgroundColor = "#f00";
r = getCSSRule('.secondaryBorder');
r.style.border = "5px solid #000";
r = getCSSRule('.t');
r.style.color = "#ccc";

};

function setThemeThree() {
var r = getCSSRule('.primaryColor');
r.style.backgroundColor = "#ccc";
r = getCSSRule('.primaryBorder');
r.style.border = "10px solid #000";
r = getCSSRule('.secondaryColor');
r.style.backgroundColor = "#000";
r = getCSSRule('.secondaryBorder');
r.style.border = "5px solid #fff";
r = getCSSRule('.t');
r.style.color = "#fff";

};
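The three theme functions differ only in their values, so one option is to keep the values in a table and apply them in a loop. The following is a sketch of that idea, not part of the original Fiddle: themes and applyTheme are hypothetical names, and the rule lookup is passed in as a parameter so that getCSSRule (or anything with the same return convention) can be supplied. Only two themes are shown; the third follows the same shape.

```javascript
// Hypothetical theme table: selector -> properties as named on the style object.
var themes = {
  theme1: {
    '.primaryColor': { backgroundColor: '#f00' },
    '.primaryBorder': { border: '10px solid #000' },
    '.secondaryColor': { backgroundColor: '#ff0' },
    '.secondaryBorder': { border: '5px solid #fff' },
    '.t': { color: '#000' }
  },
  theme2: {
    '.primaryColor': { backgroundColor: '#ff0' },
    '.primaryBorder': { border: '10px solid #ccc' },
    '.secondaryColor': { backgroundColor: '#f00' },
    '.secondaryBorder': { border: '5px solid #000' },
    '.t': { color: '#ccc' }
  }
};

// Applies one theme; returns the selectors whose rule could not be found.
function applyTheme(theme, lookup) {
  var missing = [];
  for (var selector in theme) {
    var rule = lookup(selector);
    if (!rule) {
      // The lookup returned a falsy value: remember it and move on.
      missing.push(selector);
      continue;
    }
    for (var prop in theme[selector]) {
      rule.style[prop] = theme[selector][prop];
    }
  }
  return missing;
}
```

A radio button handler would then call applyTheme(themes.theme2, getCSSRule) and could log the returned list if it is not empty, which covers the "check return value" advice from earlier.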

Note about compatibility

This specific example I've tested in IE11 and current version of Chrome. However, I've had similar code deployed on a site since about 2011 and at that time the site supported browsers back to IE7 or IE8 (don't recall) and no one ever reported an issue. But I see now that I did patch the getCSSRule function for Chrome. (I did not have to do that for the current version.) Here's the patch:

 if (cssRule){  //If we found a rule...
// [KT] 04/24/2012 - added condition to check for undefined selector for Chrome
if ((cssRule.selectorText != undefined) && cssRule.selectorText.toLowerCase()==ruleName){//match rule Name?

What is the best Approach for CSS framework of an Enterprise Cloud application?

File sizes

My first point would be that when dealing with an enterprise-level application, the total quantity of CSS, even when measured in megabytes, is slightly less important, even for slow internet connections. The pages you load into the empty cache of a potential conversion who has just clicked your pay-per-click ad for the first time should be as tight as you can possibly make them. But for an app that a user is paying for and intends to invest time and effort in, priming the cache on every release, even with a megabyte of CSS, is less of a problem. You could load it all last on the login page so it's all sorted while they put their credentials in.

Furthermore, you'll have the time to investigate some other techniques, such as loading critical 'above the fold' CSS in its own optimised file first, and splitting the CSS files up so that the common stuff is loaded on the first page view but any page-specific stuff is loaded per page, as it's visited (for the record, this can be very good for the aforementioned PPC targets).

CSS-Tricks goes into more detail here and here.

Complexity

One of the bigger considerations of enterprise cloud applications is the maintainability of the css. You're probably going to have a team of developers and a complex user interface. These things can quickly turn into a maintenance nightmare if the wrong decisions are made concerning the approach to css.

It's all very well if your users can load a page 0.1s faster, but if it takes you 30 minutes longer to make every simple CSS edit, then you're in trouble.

My recommendation

You want a combination of both. You should strive for semantic, context free css selectors in order to hit maximum re-usability (and low file size) and maximum maintainability. This allows for effective file size management and effective, scalable development.

For example:


.blue-box

.header-login-box

.contact-form-submit .green-button

bad: not semantic, or too context specific. I'm assuming that .blah pretty much falls into this category, judging by the phrase 'do this for each element'.


.login-box

better: easier to re-use, semantic, but still too contextual


.box--highlighted

.button

.button--standout

even better: really re-usable because of complete decoupling from page context, but still clearly semantic, making it easier to maintain.


With the final examples you break your app UI designs down into modules which are defined and re-used wherever they are needed. It's conceivable that you may use more than one per HTML element, but you won't have ten.

It's also OK to use utility classes, such as .pull-left. In fact, Harry Roberts at CSS Wizardry, a successful consultant who's done this stuff in the wild for real clients, recommends it.

Three further avenues of investigation

There are currently three organisational / naming strategies for scalable CSS architecture that try to tackle the problem. You might want to look at them in more detail:

BEM: docs introductory article

OOCSS: docs introductory article

SMACSS: docs and introduction

All three will help maximise re-usability and minimise file sizes while giving you rules to follow to keep things tight and help with new members of the team.

CSS Best Practices: Classes for all elements or styling by parent class?

I would definitely go ahead with the containment selector in the case you give. Having fewer spurious classes in the markup makes it easier to maintain, and the rule ‘.main p’ says clearly what it does. Use your judgement as to whether more complicated cases are still clear.

Note that ‘.main p’ selects all descendant paragraphs and not just direct children, which may or may not be what you want; accidentally-nested descendant matches are a potential source of bugs (especially for cumulative properties like relative font size). If you want only children to be selected you need ‘.main>p’, which unfortunately does not work in IE6.

This is one reason why many sites go crazy with the classnames: the more involved selectors that could otherwise be used to pick out elements without a classname, tend not to work in IE.


