PHP to Clean-Up Pasted Microsoft Input

HTML Purifier will produce standards-compliant markup and filter out many possible attacks (such as XSS).

For faster cleanups that don't require XSS filtering, I use the PECL extension Tidy, which is a binding for the Tidy HTML utility.

If those don't help you, I suggest you switch to FCKEditor, which has this feature built in.
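As a rough illustration of the Tidy route mentioned above, here is a minimal sketch. It assumes the tidy extension is installed; the function name `tidyWordHtml` and the particular option set are my own choices for illustration, not something taken from the answer.

```php
<?php
// Hypothetical wrapper around the Tidy extension; the option set is an assumption.
function tidyWordHtml(string $html): string
{
    if (!function_exists('tidy_repair_string')) {
        throw new RuntimeException('The tidy extension is not loaded');
    }

    $config = [
        'word-2000'      => true, // ask Tidy to strip the surplus markup Word adds
        'bare'           => true, // remove more Microsoft-specific HTML
        'show-body-only' => true, // return just the body fragment, not a full page
        'wrap'           => 0,    // do not re-wrap long lines
    ];

    $result = tidy_repair_string($html, $config, 'utf8');
    if ($result === false) {
        throw new RuntimeException('Tidy could not repair the input');
    }

    return $result;
}
```

Note that Tidy only repairs and normalizes markup. For untrusted input, HTML Purifier (or another sanitizer) is still needed for XSS filtering.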

Remove MS Word HTML using PHP

http://htmlpurifier.org/

This will do what you want.

How to clean up garbage text from string using PHP?

Word documents (.doc and .docx) are not plain text files. They are structured file formats in which the text does not simply start at byte 0; that is how they carry formatting and fonts. A .docx file is actually a ZIP archive containing a collection of XML parts and style definitions, while the older .doc format is binary.

Your best bet is to use a text input form, or find code online that allows you to extract just the text. Or, download the doc files to your own computer and use your own copy of MS Word to open them.
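Since a .docx file is a ZIP archive, the text can be pulled out of its main XML part. The sketch below assumes the zip extension is available and that the body text lives in `word/document.xml`, which is the usual location. Both function names are hypothetical, and the tag stripping is deliberately crude: it ignores tabs, manual line breaks, headers, footnotes and so on.

```php
<?php
// Hypothetical helper: turn the XML of word/document.xml into plain text.
function docxXmlToText(string $xml): string
{
    // Every paragraph ends with </w:p>; mark it with a newline before the tags go.
    $xml = str_replace('</w:p>', "\n", $xml);

    // strip_tags() drops the remaining XML elements and keeps the text nodes.
    $text = strip_tags($xml);

    // Decode the XML entities (&amp;, &lt;, ...) back to characters.
    return trim(html_entity_decode($text, ENT_QUOTES | ENT_XML1, 'UTF-8'));
}

// Hypothetical helper: read the main document part out of a .docx archive.
function extractDocxText(string $path): string
{
    $zip = new ZipArchive();
    if ($zip->open($path) !== true) {
        throw new RuntimeException("Cannot open $path as a ZIP archive");
    }

    $xml = $zip->getFromName('word/document.xml');
    $zip->close();

    if ($xml === false) {
        throw new RuntimeException("No word/document.xml found in $path");
    }

    return docxXmlToText($xml);
}
```

This does not work for the older binary .doc format, which needs a dedicated parser or a conversion step.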

formatted PHP code in Microsoft Word

I know one way.

Open this page: http://qbnz.com/highlighter/demo.php

That link is an online PHP syntax highlighter.

(1) Copy and paste your PHP code into the text area labelled 'Input via a text field:'.

(2) Go to the 'Options' select box below that text area and choose 'Line numbers: none'.

(3) Click the 'Highlight!' button at the bottom of the page.

(4) The highlighted PHP code will be shown.

(5) Select and copy the highlighted code, then paste it into Word. You will see the colored code in your Word document.

Those are the detailed steps.

Hope this helps.
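If you would rather not use an online tool, a similar result can probably be had locally with PHP's built-in `highlight_string()`. This is a different highlighter from the one on the page above, so the colors will differ. The sketch below wraps its output in a minimal HTML page; the function name `highlightForWord` is hypothetical.

```php
<?php
// Hypothetical helper: wrap PHP's own highlighter output in a minimal HTML page.
function highlightForWord(string $phpSource): string
{
    // With the second argument set to true, highlight_string() returns
    // the colored markup instead of printing it.
    $highlighted = highlight_string($phpSource, true);

    return "<!DOCTYPE html>\n"
        . "<html><head><meta charset=\"utf-8\"></head><body>\n"
        . $highlighted
        . "\n</body></html>\n";
}
```

Save the returned string to an .html file, open it in a browser, then select, copy and paste into Word as in step (5).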

php regular expression removing mso tag

Code:

$html = "<p style='mso-margin-top-alt: auto; mso-margin-bottom-alt: auto; padding: 4px;' class=MsoNormal>text</P>";

$cleanHtml = preg_replace('(mso-[a-z\-: ]+; )i', '', $html);

echo $cleanHtml;

Output:

<p style='padding: 4px;' class=MsoNormal>text</P>
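The pattern above only matches values made of letters, so a declaration such as `mso-font-size: 12.0pt;` would survive, as would one without a trailing "; ". A slightly more tolerant sketch follows. The function name `stripMsoStyles` is hypothetical, and because it is still a plain regex over the whole string, it would also hit text content that happens to look like an mso declaration.

```php
<?php
// Hypothetical helper: remove every "mso-*: value" declaration from a string.
function stripMsoStyles(string $html): string
{
    // mso-<name> : <anything up to the next ; or quote>, then an optional
    // semicolon and any whitespace that follows it.
    return preg_replace('/mso-[a-z-]+\s*:[^;"\']*;?\s*/i', '', $html);
}

$html = "<p style='mso-margin-top-alt: auto; mso-font-size: 12.0pt; padding: 4px;' class=MsoNormal>text</p>";
echo stripMsoStyles($html);
```

The `class=MsoNormal` attribute is left untouched here; removing it would need a separate replacement or a proper HTML filter.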

Clean Microsoft Word Pasted Text using JavaScript

Here is the function I wound up writing that does the job fairly well (as far as I can tell anyway).

I am certainly open to suggestions for improvement if anyone has any. Thanks.

function cleanWordPaste( in_word_text ) {
    var tmp = document.createElement("DIV");
    tmp.innerHTML = in_word_text;
    // reading the text content drops all tags
    var newString = tmp.textContent || tmp.innerText;
    // this next piece converts double line breaks into break tags
    // and removes the seemingly endless crap code
    newString = newString.replace(/\n\n/g, "<br />").replace(/.*<!--.*-->/g, "");
    // this next piece removes any break tags (up to 10) at the beginning
    for ( var i = 0; i < 10; i++ ) {
        if ( newString.substr(0, 6) == "<br />" ) {
            newString = newString.replace("<br />", "");
        }
    }
    return newString;
}

Hope this is helpful to some of you.

how to separate data pasted from excel to textarea

You should check this question:
Parse form textarea by comma or new line

Using that in your code:

<?php

if (isset($_POST['url'])) {
    $input = $_POST['url'];

    // split on any run of CR/LF characters and skip empty lines
    $data = preg_split("/[\r\n]+/", $input, -1, PREG_SPLIT_NO_EMPTY);
    var_dump($data);
}

?>

The $data array will then hold one entry for each non-empty pasted line.
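Cells copied from Excel normally arrive with tabs between columns, so each line can be split once more to get individual cells. The sketch below assumes that tab-separated layout; the function name `parsePastedCells` is hypothetical. It does not handle cells that themselves contain tabs or line breaks, which Excel wraps in quotes.

```php
<?php
// Hypothetical helper: turn text pasted from a spreadsheet into rows of cells.
function parsePastedCells(string $input): array
{
    // One row per non-empty line, exactly as in the snippet above.
    $rows = preg_split("/[\r\n]+/", $input, -1, PREG_SPLIT_NO_EMPTY);

    // Split every row on tabs and trim surrounding whitespace from each cell.
    return array_map(
        static function (string $row): array {
            return array_map('trim', explode("\t", $row));
        },
        $rows
    );
}
```

In the form handler this would be called as `parsePastedCells($_POST['url'])`, giving an array of rows where each row is an array of cell strings.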


