How to Compare 2 HTML Pages, and Output Only the Different Bits in Ruby or PHP

What is the best Diff library in Ruby?

I looked around and couldn't find an existing gem or library that offered a convenient way to generate diff style output from ruby.

I just released diffy, which does what I want. It's a lightweight wrapper around diff that lets you generate text or HTML diffs from two strings without a lot of fuss. I hope others find it useful. It's in use on wiff.me for anyone who wants to preview the HTML output.
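To make "output only the different bits" concrete, here is a minimal, dependency-free sketch of a line-based diff built on a longest-common-subsequence table. It is not how diffy works (diffy wraps the diff utility, as described above); the method name line_diff and the "-"/"+" prefixes are assumptions chosen for illustration.

```ruby
# Minimal line diff: returns only the lines that differ, prefixed with
# "-" (present only in old_text) or "+" (present only in new_text).
def line_diff(old_text, new_text)
  a = old_text.lines.map(&:chomp)
  b = new_text.lines.map(&:chomp)

  # lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..]
  lcs = Array.new(a.size + 1) { Array.new(b.size + 1, 0) }
  (a.size - 1).downto(0) do |i|
    (b.size - 1).downto(0) do |j|
      lcs[i][j] =
        if a[i] == b[j]
          lcs[i + 1][j + 1] + 1
        else
          [lcs[i + 1][j], lcs[i][j + 1]].max
        end
    end
  end

  # Walk both inputs, skipping common lines and recording the rest.
  changes = []
  i = j = 0
  while i < a.size && j < b.size
    if a[i] == b[j]
      i += 1
      j += 1
    elsif lcs[i + 1][j] >= lcs[i][j + 1]
      changes << "-#{a[i]}"
      i += 1
    else
      changes << "+#{b[j]}"
      j += 1
    end
  end
  changes.concat(a[i..-1].map { |line| "-#{line}" })
  changes.concat(b[j..-1].map { |line| "+#{line}" })
  changes
end

old_page = "<h1>Title</h1>\n<p>old</p>\n<p>same</p>\n"
new_page = "<h1>Title</h1>\n<p>new</p>\n<p>same</p>\n"
puts line_diff(old_page, new_page)
```

The table costs memory proportional to the product of the two line counts, so for large pages a wrapper around the system diff is likely the more practical choice.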

Ruby on Rails / Devise - Split user edit page into 2 pages

You don't need a separate controller, especially since you're already extending the default Devise RegistrationsController, which already works fine for updating user attributes.

Edit: If these aren't just extended user attributes, and profile is its own object with its own logic and behaviour, then consider creating its own controller to manage the CRUD for that object.

If you're using devise's user/edit page as part one, all you need to do is add a profile action in your custom controller, and create a view file to go with it.

# this is all that's in the edit action from Devise
def edit
  render :edit
end

# add this to your custom RegistrationsController
def profile
  render :profile
end

Then you can fiddle with your routes (see this and this) until they route the URLs you want to use to the correct controller:

# you probably have this, which covers your current user/edit route
devise_for :users

# but you can add this to extend these routes
devise_scope :user do
  # will route /profile to the profile action on Users::RegistrationsController
  get :profile, to: 'users/registrations#profile'

  # or if you want more control over the specifics
  get 'users/settings/profile', to: 'users/registrations#profile', as: :user_profile
end

For your second view/form to update user attributes from another, non-devise controller, you can use form_for current_user, { url: user_registration_path }

If you do want to use resource, you'll have to add this to the top of your registrations controller, so that the resource gets defined on your profile action as well:

# Rails 4 and later; on older versions use prepend_before_filter instead
prepend_before_action :authenticate_scope!, only: [:edit, :profile, :update, :destroy]

Take a look at Devise's documentation on strong parameters to see how to make sure any additional attributes you add to your user are permitted by your custom RegistrationsController.

PHP block syntax conventions

The only difference between these is that the second requires the short_open_tag setting to be enabled (it is off by default in recent PHP versions).

<?php regular open tag.

<? Short open tag (disabled by default)

Beyond this, the placement of something like <? include("langsettings.php"); ?> on its own line enclosed in its own pair of <? ?> is really a matter of style specific to the source you found it in. Different projects use very widely different conventions, and PHP books each tend to adopt their own convention.

Unfortunately, PHP doesn't have any widely agreed coding conventions of the kind you might find in languages like Ruby, Java, or Python, which is, in my unsolicited opinion, one of PHP's chief failings as well as one of its greatest flexibilities.

Now, as to whether or not short open tags are good practice for use in a modern PHP application is a separate issue entirely, which has been discussed at great length here.

Anyone have a diff algorithm for rendered HTML?

There's another nice trick you can use to improve the look of a rendered HTML diff. It doesn't fully solve the initial problem, but it makes a significant difference in appearance.

Rendered HTML shown side by side rarely lines up vertically, and vertical alignment is crucial for comparing side-by-side diffs. To improve it, you can insert invisible HTML elements in each version of the diff at "checkpoints" where the two sides should be vertically aligned. Then you can use a bit of client-side JavaScript to add vertical spacing around each checkpoint until the sides line up.

Explained in a little more detail:

If you want to use this technique, run your diff algorithm and insert a bunch of visibility:hidden <span>s or tiny <div>s wherever your side-by-side versions should match up, according to the diff. Then run JavaScript that finds each checkpoint (and its side-by-side neighbor) and adds vertical spacing to the checkpoint that is higher-up (shallower) on the page. Now your rendered HTML diff will be vertically aligned up to that checkpoint, and you can continue repairing vertical alignment down the rest of your side-by-side page.
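The spacing arithmetic behind that alignment pass can be sketched independently of the DOM. The sketch below assumes you have already measured the vertical offset (in pixels) of each checkpoint on both sides and that the two lists pair up one to one; the method name checkpoint_padding and the returned hash layout are hypothetical. In a browser, the same loop would read the offsets from the checkpoint elements and apply the results as margins.

```ruby
# For each pair of matching checkpoints, compute how much vertical space to
# insert above the checkpoint that currently sits higher on the page.
# Earlier insertions push everything below them down, so a running shift
# is tracked for each side.
def checkpoint_padding(left_offsets, right_offsets)
  left_shift = 0
  right_shift = 0
  left_pad = []
  right_pad = []

  left_offsets.zip(right_offsets).each do |left_y, right_y|
    gap = (right_y + right_shift) - (left_y + left_shift)
    if gap > 0
      # Left checkpoint is higher up: push the left side down.
      left_pad << gap
      right_pad << 0
      left_shift += gap
    else
      # Right checkpoint is higher up (or they already match).
      left_pad << 0
      right_pad << -gap
      right_shift -= gap
    end
  end

  { left: left_pad, right: right_pad }
end

p checkpoint_padding([100, 250, 400], [100, 300, 420])
```

With the sample offsets, the second checkpoint needs 50 pixels added on the left, which then leaves the third checkpoint 30 pixels too high on the right.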

Make header and footer files to be included in multiple html pages

You can accomplish this with jQuery.

Place this code in index.html

<html>
<head>
<title></title>
<script
src="https://code.jquery.com/jquery-3.3.1.js"
integrity="sha256-2Kok7MbOyxpgUVvAk/HJ2jigOSYS2auK4Pfzbm7uH60="
crossorigin="anonymous">
</script>
<script>
$(function(){
$("#header").load("header.html");
$("#footer").load("footer.html");
});
</script>
</head>
<body>
<div id="header"></div>
<!--Remaining section-->
<div id="footer"></div>
</body>
</html>

and put this code in header.html and footer.html, at the same location as index.html

<a href="http://www.google.com">click here for google</a>

Now, when you visit index.html, you should see the links from header.html and footer.html and be able to click them.

What is the shortest way of inserting a variable into text with PHP?

You could use short open tags, which require short_open_tag to be enabled in your configuration, but that's not considered good practice, as it only works where the setting is enabled, and it isn't always (maybe not even by default).

Using long tags and echo/print might be longer, yes... But I would recommend using those, and not short tags.


Also note that you might need to escape your data when it comes from an untrusted source and/or might contain HTML you don't want injected into the page, to avoid HTML/JS injection (see htmlspecialchars).
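Since this page covers Ruby as well as PHP, here is the same precaution sketched in Ruby, where ERB::Util.html_escape from the standard library plays a role comparable to htmlspecialchars. The greet method and the sample input are made up for illustration.

```ruby
require "erb"

# Insert a value into text, escaping it first so that any markup in the
# value is displayed as text instead of being interpreted by the browser.
def greet(name)
  "Hello, #{ERB::Util.html_escape(name)}!"
end

puts greet('<script>alert("hi")</script>')
```

String interpolation is the short way to insert a variable into text in Ruby; the escaping step is what keeps untrusted input from turning into live HTML.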


EDIT after the comments, to add couple of things about short_open_tag :

Why are short open tags considered (at least by me ^^ ) bad practice ?

First of all, after some checking, they are not enabled by default :

For PHP 5.3 :

squale@shark:~/temp/php/php-5.3.0
$ grep 'short_open_tag' php.ini-development
; short_open_tag
short_open_tag = Off
squale@shark:~/temp/php/php-5.3.0
$ grep 'short_open_tag' php.ini-production
; short_open_tag
short_open_tag = Off

Disabled by default in either "development" or "production" settings.

For PHP 5.2.10 (most recent version of PHP 5.2) :

squale@shark:~/temp/php/php-5.2.10
$ grep 'short_open_tag' php.ini-dist
short_open_tag = On
squale@shark:~/temp/php/php-5.2.10
$ grep 'short_open_tag' php.ini-recommended
; - short_open_tag = Off [Portability]
short_open_tag = Off

Disabled by default in the "recommended" settings


Considering these default settings are sometimes (often ?) kept by hosting services, it is dangerous to rely on short_open_tag being activated.

(I have myself run into problems with those being disabled... And when you are not admin of the server and don't have the required privileges to modify that, it's not fun ^^ )

If you want some numbers, you can take a look at Quick survery: short_open_tag support on or off by default?

(Not scientific proof -- but it shows it could be dangerous to use those for an application you'd release to the public)


Like you said, when activated, those conflict with the XML declaration, which means you have to use something like this :

<?php echo '<?xml version="1.0" encoding="UTF-8" ?>'; ?>

Considering short open tags exist, and might be activated on the server you'll use, you should probably never write a literal <?xml, though ; too bad :-(


Actually, reading through the php.ini-recommended of PHP 5.2.10 :

; Allow the <? tag.  Otherwise, only <?php and <script> tags are recognized.
; NOTE: Using short tags should be avoided when developing applications or
; libraries that are meant for redistribution, or deployment on PHP
; servers which are not under your control, because short tags may not
; be supported on the target server. For portable, redistributable code,
; be sure not to use short tags.

The one from PHP 6 is even more interesting :

; This directive determines whether or not PHP will recognize code between
; <? and ?> tags as PHP source which should be processed as such. It's been
; recommended for several years that you not use the short tag "short cut" and
; instead to use the full <?php and ?> tag combination. With the wide spread use
; of XML and use of these tags by other languages, the server can become easily
; confused and end up parsing the wrong code in the wrong context. But because
; this short cut has been a feature for such a long time, it's currently still
; supported for backwards compatibility, but we recommend you don't use them.

(Might be the same in PHP 5.3 ; didn't check)


There have been rumors short open tags could be removed from PHP 6 ; considering the portion of php.ini I just posted, it probably won't... but, still...


To give an argument pointing to the other direction (I've gotta be honest, after all) : using short open tags for template files (only) is something that is often done in Zend Framework's examples that use template files :

In our examples and documentation, we
make use of PHP short tags:

That said, many developers prefer to
use full tags for purposes of
validation or portability. For
instance, short_open_tag is disabled
in the php.ini.recommended file, and
if you template XML in view scripts,
short open tags will cause the
templates to fail validation.

(source)

On the contrary, for .php files :

Short tags are never allowed. For
files containing only PHP code, the
closing tag must always be omitted

(source)


I hope this information is useful, and brings some kind of answer to your comment :-)

Text Parser with PHP, like Instapaper

You might try looking at the algorithms behind this bookmarklet, readability. It has a decent success rate for extracting content from among all the web page rubbish.

A friend of mine made it, which is why I'm recommending it: I know it works, and I'm aware of the many techniques he's using to parse the data. You could apply these techniques to what you're asking.

Ruby on Rails externalising Views

What you are looking for are called partials. You can create a partial, such as a sidebar or a footer, then render it into a template.

The official Rails guide contains some information about using partials.

Essentially, you create a file whose name is prefixed by an underscore, such as posts/_form.html.erb, and you render it into the view

<%= render partial: "form" %>

You can also specify an absolute path from the views folder

<%= render partial: "/posts/form" %>

The same naming conventions of the template (e.g. the format suffixes) apply.
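To illustrate the underscore naming convention outside of Rails, here is a rough stand-in built on the standard library's ERB. It is not how Rails implements render; the TEMPLATES hash, partial_key, render_template and render_partial are all hypothetical names, and Rails reads templates from app/views on disk, not from a hash.

```ruby
require "erb"

# Hypothetical in-memory stand-in for the app/views directory.
TEMPLATES = {
  "posts/_form" => '<form><input name="title" value="<%= title %>"></form>',
  "posts/new"   => "<h1>New post</h1>\n<%= render_partial('posts/form', title: 'Hello') %>"
}

# "posts/form" -> "posts/_form", "form" -> "_form"
def partial_key(name)
  dir, _, base = name.rpartition("/")
  dir.empty? ? "_#{base}" : "#{dir}/_#{base}"
end

# Render a template by key, exposing the given hash entries as local variables.
def render_template(key, locals = {})
  ERB.new(TEMPLATES.fetch(key)).result_with_hash(locals)
end

# Look a partial up by its name without the underscore and render it.
def render_partial(name, locals = {})
  render_template(partial_key(name), locals)
end

puts render_template("posts/new")
```

The caller refers to the partial as "posts/form" while the stored template is "posts/_form", which mirrors how a Rails view names a partial without its leading underscore.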

php tidy strange behaviour

The problem is that you are trying to process an HTML fragment.

When you do this, the rest of the document is inferred. If you leave the configuration as default, and output a tidy document with just a piece of text, you will see the DOCTYPE, html, head and body tags that you did not give it. It inferred that these tags had to exist.

The problem here is that the HTML specification regarding objects states that:

The OBJECT element may also appear in the content of the HEAD element.

When the location of your fragment is being inferred, it puts it in the first place that it can occur. This means that tidy will place it in the head tag.

The reason why show-body-only is affecting your output is because your fragment did not get placed in the body.


However when you add some text, it forces your snippet into the body tag. This is because raw text is not allowed in the head tag. So the logically inferred location of your fragment is in the body.

In my opinion, the best option available to you is to inject all of your code fragments into a "template" document, and then parse them out again afterwards. You can probably do this fairly easily with DOMDocument.

A second solution would be to inject a sentinel value that you can strip out again afterwards, when showing only the body.

I.e.

____MY_MAGIC_TOKEN____
<object ...></object>

Then you can strip it out again afterwards.


