Making Sure PHP Substr Finishes on a Word Not a Character

This can be done with a regular expression. The following matches up to 260 characters from the start of the string, ending at a word boundary:

$line = $body;
if (preg_match('/^.{1,260}\b/s', $body, $match))
{
    $line = $match[0];
}
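As a quick illustration of what that pattern returns, here is a minimal sketch with a smaller limit. The helper name `truncateAtBoundary` and the sample sentence are made up for this example. Note that `\b` also matches just before a word starts, so the match can end in whitespace; trimming it is an addition here, not part of the answer above.

```php
<?php
// Hypothetical helper: returns at most $limit bytes, ending at a word boundary.
function truncateAtBoundary(string $text, int $limit): string
{
    // Same pattern as above, with the limit interpolated.
    if (preg_match('/^.{1,' . $limit . '}\b/s', $text, $match)) {
        // \b can match right after a space, so strip trailing whitespace.
        return rtrim($match[0]);
    }
    // No word boundary found within the limit: return the input unchanged.
    return $text;
}

echo truncateAtBoundary('Hello wonderful world', 10), "\n"; // Hello
```

The first ten characters are "Hello wond"; the engine backtracks to the boundary just before "wonderful", which leaves "Hello " and, after trimming, "Hello".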

Alternatively, you could use the wordwrap() function to break $body into lines and then take just the first line.

substr that won't cut out word with dynamic content

This is a way to do it without using a regex:

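<?php
$position = 200;
$post = $row_News['teaser'];
if (strlen($post) > $position) {
    // Cut at the limit, then back up to the last space so no word is split.
    $post = substr($post, 0, $position);
    $post = substr($post, 0, strrpos($post, ' '));
}
echo $post . "....";
?>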

Reference: Making sure PHP substr finishes on a word not a character
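The same idea can be wrapped in a small function that leaves short strings untouched and copes with text that has no space at all. This is only a sketch: the name `truncateWords`, the `$suffix` parameter and the sample text are assumptions for illustration, not part of the original answer.

```php
<?php
// Hypothetical helper built on substr() and strrpos().
function truncateWords(string $text, int $limit, string $suffix = '...'): string
{
    if (strlen($text) <= $limit) {
        return $text; // nothing to cut
    }
    // Take one extra byte so a word ending exactly at $limit is kept whole.
    $cut = substr($text, 0, $limit + 1);
    $lastSpace = strrpos($cut, ' ');
    if ($lastSpace === false) {
        // No space at all: fall back to a hard cut.
        return substr($text, 0, $limit) . $suffix;
    }
    return substr($cut, 0, $lastSpace) . $suffix;
}

echo truncateWords('The quick brown fox jumps', 12), "\n"; // The quick...
```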

How to capture complete words using substr() in PHP, limit by word?

If you just count the words, the resulting string could still be very long, as a single "word" might have 30 characters or more. I would suggest instead truncating the text to 100 characters and, if this cuts a word in half, also removing the truncated part of that word. This is covered by this related question:

How to Truncate a string in PHP to the word closest to a certain number of characters?

Using wordwrap

$your_desired_width = 100;
if (strlen($string) > $your_desired_width)
{
    $string = wordwrap($string, $your_desired_width);
    $i = strpos($string, "\n"); // false when there is no line break
    if ($i) {
        $string = substr($string, 0, $i);
    }
}

This is a modified version of the answer here. If the input text could be very long, you can add this line before the call to wordwrap() so that it does not have to process the entire text:

$string = substr($string, 0, $your_desired_width + 1);

Using a regular expression (Source)

$string = preg_replace('/\s+?(\S+)?$/', '', substr($string, 0, 100));
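To make the behaviour of that pattern concrete, here is a small sketch; the sample strings are invented. The pattern removes the last run of non-whitespace together with the whitespace before it, so it also drops a final word that happened to fit completely.

```php
<?php
$pattern = '/\s+?(\S+)?$/';

// The cut lands inside "wonderful", so the fragment "wond" is removed.
echo preg_replace($pattern, '', substr('Hello wonderful world', 0, 10)), "\n"; // Hello

// The cut lands exactly after "world", yet "world" is removed as well.
echo preg_replace($pattern, '', substr('Hello world again', 0, 11)), "\n"; // Hello
```

If losing a complete last word matters, one option is to take one character more than the limit before applying the pattern.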

How to get first x chars from a string, without cutting off the last word?

You can use the wordwrap() function, then explode on newline and take the first part:

$str = wordwrap($str, 28);
$str = explode("\n", $str);
$str = $str[0] . '...';

How can I truncate a string in php without cutting off words?

Before I give you an answer in PHP, have you considered the following CSS solution?

overflow:hidden;
white-space:nowrap;
text-overflow:ellipsis;

This will result in the text being cut off at the most appropriate place and an ellipsis ... marking the cut-off.

If this is not the effect you're looking for, try this PHP:

$words = explode(" ", $input);
// if the first word is itself too long, like hippopotomonstrosesquipedaliophobia
// then just cut that word off at 20 characters
if (strlen($words[0]) > 20) {
    $output = substr($words[0], 0, 20);
} else {
    $output = array_shift($words);
    // stop when the words run out or the next one would exceed 20 characters
    while ($words && strlen($output . " " . $words[0]) <= 20) {
        $output .= " " . array_shift($words);
    }
}

How to Truncate a string in PHP to the word closest to a certain number of characters?

Use the wordwrap function. It splits the text into multiple lines such that the maximum width is the one you specified, breaking at word boundaries. After splitting, you simply take the first line:

substr($string, 0, strpos(wordwrap($string, $your_desired_width), "\n"));

One thing this one-liner doesn't handle is the case where the text itself is shorter than the desired width: strpos() then finds no newline and returns false, and the result is an empty string. To handle this edge case, one should do something like:

if (strlen($string) > $your_desired_width)
{
    $string = wordwrap($string, $your_desired_width);
    $string = substr($string, 0, strpos($string, "\n"));
}

The above solution has the problem of prematurely cutting the text if it contains a newline before the actual cut point. Here is a version that solves this problem:

function tokenTruncate($string, $your_desired_width) {
    // -1 means "no limit"; passing null here is deprecated as of PHP 8.1
    $parts = preg_split('/([\s\n\r]+)/', $string, -1, PREG_SPLIT_DELIM_CAPTURE);
    $parts_count = count($parts);

    $length = 0;
    $last_part = 0;
    for (; $last_part < $parts_count; ++$last_part) {
        $length += strlen($parts[$last_part]);
        if ($length > $your_desired_width) { break; }
    }

    return implode(array_slice($parts, 0, $last_part));
}

Also, here is the PHPUnit test class used to test the implementation:

class TokenTruncateTest extends PHPUnit_Framework_TestCase {
    public function testBasic() {
        $this->assertEquals("1 3 5 7 9 ",
            tokenTruncate("1 3 5 7 9 11 14", 10));
    }

    public function testEmptyString() {
        $this->assertEquals("",
            tokenTruncate("", 10));
    }

    public function testShortString() {
        $this->assertEquals("1 3",
            tokenTruncate("1 3", 10));
    }

    public function testStringTooLong() {
        $this->assertEquals("",
            tokenTruncate("toooooooooooolooooong", 10));
    }

    public function testContainingNewline() {
        $this->assertEquals("1 3\n5 7 9 ",
            tokenTruncate("1 3\n5 7 9 11 14", 10));
    }
}
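The newline problem that motivated tokenTruncate() is easy to reproduce. In this sketch (the helper name `firstWrappedLine` is made up) the second input already contains a line break after three characters, so the wordwrap-based version stops there even though the limit is 10.

```php
<?php
// Hypothetical wrapper around the wordwrap() approach shown earlier.
function firstWrappedLine(string $string, int $width): string
{
    if (strlen($string) > $width) {
        $string = wordwrap($string, $width);
        $pos = strpos($string, "\n");
        if ($pos !== false) {
            $string = substr($string, 0, $pos);
        }
    }
    return $string;
}

var_dump(firstWrappedLine("1 3 5 7 9 11 14", 10));  // string(9) "1 3 5 7 9"
var_dump(firstWrappedLine("1 3\n5 7 9 11 14", 10)); // string(3) "1 3"
```

For the same second input, the testContainingNewline case above expects tokenTruncate() to return "1 3\n5 7 9 ".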

EDIT:

Special UTF-8 characters like 'à' are not handled. Add 'u' at the end of the regex to handle them:

$parts = preg_split('/([\s\n\r]+)/u', $string, -1, PREG_SPLIT_DELIM_CAPTURE);
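Keep in mind that strlen() still counts bytes, so with multibyte text the limit is measured in bytes rather than characters. One possible way to count characters instead, offered here only as a sketch, is to let a pattern with the 'u' modifier do the counting. The function name `truncateUtf8` and the sample string are assumptions, and the input is assumed to be valid UTF-8.

```php
<?php
// Hypothetical helper: keeps at most $limit characters (code points),
// stopping just before whitespace or at the end of the string.
function truncateUtf8(string $text, int $limit): string
{
    $pattern = '/^.{0,' . $limit . '}(?=\s|\z)/su';
    if (preg_match($pattern, $text, $match) === 1) {
        return $match[0];
    }
    // 0: the first word alone exceeds the limit.
    // false: the pattern could not be applied (for example invalid UTF-8).
    return '';
}

var_dump(truncateUtf8('à la carte menu', 10)); // string(11) "à la carte"
```

The result is 10 characters but 11 bytes long, because 'à' takes two bytes in UTF-8. Unlike tokenTruncate(), this sketch does not keep the trailing whitespace.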

Extracting substring ending with word not working in PHP

You're using strpos(), which searches from the start of a string. You want strrpos() (the r stands for reverse), which finds the last occurrence:

$description=substr($userresult->aboutme, 0, strrpos($userresult->aboutme, ' '));

You don't want to rely on the offset argument of strpos(): it might work in this situation, but as soon as the first few words are shorter or longer, it no longer works.

Get first 100 characters from string, respecting full words

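All you need to do is find the first space at or after the offset and cut there (the result can therefore be slightly longer than the offset):

$pos = strlen($content) > 200 ? strpos($content, ' ', 200) : false;
$excerpt = ($pos === false) ? $content : substr($content, 0, $pos);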

PHP function for removing everything before a substring (with including option)

What you're looking for is something like this:

<?php
function remove_before($needle, $haystack, $removeNeedle = false) {

    $pos = strpos($haystack, $needle);

    if ($pos !== false) {
        return substr($haystack, $pos + (strlen($needle) * $removeNeedle));
    }

    return $haystack; // if word not found, return full string.
}

$needle = "cheese";
$haystack = "I like to eat cheese, crackers and ham";

echo remove_before($needle, $haystack, false);
?>

In the expression $pos + (strlen($needle) * $removeNeedle), $removeNeedle is a boolean, which PHP converts to 1 or 0 in arithmetic. Multiplying the length of the needle by 1 leaves it unchanged, and multiplying by 0 gives 0, so the needle's length is added to the offset only when $removeNeedle is true.

The boolean at the end is optional, as its default value is false.

I wrote this code, and anyone is free to use it without restriction.
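For comparison, PHP's built-in strstr() already returns the part of a string starting at the first occurrence of the needle, so the same behaviour can be sketched on top of it. The function name `removeBeforeStrstr` is invented for this example.

```php
<?php
// Hypothetical alternative built on strstr().
function removeBeforeStrstr(string $needle, string $haystack, bool $removeNeedle = false): string
{
    $rest = strstr($haystack, $needle); // false when the needle is absent
    if ($rest === false) {
        return $haystack;
    }
    // Optionally drop the needle itself from the front of the result.
    return $removeNeedle ? substr($rest, strlen($needle)) : $rest;
}

$haystack = 'I like to eat cheese, crackers and ham';
echo removeBeforeStrstr('cheese', $haystack), "\n";       // cheese, crackers and ham
echo removeBeforeStrstr('cheese', $haystack, true), "\n"; // , crackers and ham
```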


