How to Truncate a String in PHP to the Word Closest to a Certain Number of Characters

How to Truncate a string in PHP to the word closest to a certain number of characters?

By using the wordwrap function. It splits the text into multiple lines such that the maximum width is the one you specified, breaking at word boundaries. After splitting, you simply take the first line:

substr($string, 0, strpos(wordwrap($string, $your_desired_width), "\n"));

One thing this one-liner doesn't handle is the case where the text itself is shorter than the desired width: wordwrap then inserts no newline, strpos returns false, and the result is an empty string. To handle this edge case, do something like:

if (strlen($string) > $your_desired_width)
{
    $string = wordwrap($string, $your_desired_width);
    $string = substr($string, 0, strpos($string, "\n"));
}
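The guarded snippet above still yields an empty string when wordwrap inserts no line break at all, for example when the text is one single word longer than the width, because strpos then returns false. A minimal sketch of a helper that checks for this; the name truncateAtWord is hypothetical, and returning the text unchanged in that case is an assumption rather than part of the original answer:

```php
<?php
// Hypothetical helper name; not part of the original answer.
function truncateAtWord(string $string, int $width): string
{
    if (strlen($string) <= $width) {
        return $string; // already short enough
    }
    $wrapped = wordwrap($string, $width);
    $pos = strpos($wrapped, "\n");
    // strpos() returns false when wordwrap() inserted no line break,
    // e.g. when the text is a single word longer than $width.
    // Assumption: in that case the text is returned unchanged.
    return $pos === false ? $string : substr($wrapped, 0, $pos);
}

echo truncateAtWord("The quick brown fox jumps", 10), "\n"; // The quick
```

Because wordwrap does not break long words by default, the result can still exceed the width when the first word alone is longer than it.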

The above solution has the problem of cutting the text prematurely if it contains a newline before the actual cut point. Here is a version which solves this problem:

function tokenTruncate($string, $your_desired_width) {
    // -1 means "no limit"; passing null is deprecated as of PHP 8.1.
    $parts = preg_split('/([\s\n\r]+)/', $string, -1, PREG_SPLIT_DELIM_CAPTURE);
    $parts_count = count($parts);

    $length = 0;
    $last_part = 0;
    // Add up words and the whitespace between them until the width is exceeded.
    for (; $last_part < $parts_count; ++$last_part) {
        $length += strlen($parts[$last_part]);
        if ($length > $your_desired_width) { break; }
    }

    return implode(array_slice($parts, 0, $last_part));
}

Also, here is the PHPUnit test class used to test the implementation:

// PHPUnit 6 and later only provide the namespaced TestCase class.
class TokenTruncateTest extends \PHPUnit\Framework\TestCase {
public function testBasic() {
$this->assertEquals("1 3 5 7 9 ",
tokenTruncate("1 3 5 7 9 11 14", 10));
}

public function testEmptyString() {
$this->assertEquals("",
tokenTruncate("", 10));
}

public function testShortString() {
$this->assertEquals("1 3",
tokenTruncate("1 3", 10));
}

public function testStringTooLong() {
$this->assertEquals("",
tokenTruncate("toooooooooooolooooong", 10));
}

public function testContainingNewline() {
$this->assertEquals("1 3\n5 7 9 ",
tokenTruncate("1 3\n5 7 9 11 14", 10));
}
}

EDIT:

Special UTF-8 characters like 'à' are not handled. Add the 'u' modifier at the end of the regex to handle them:

$parts = preg_split('/([\s\n\r]+)/u', $string, -1, PREG_SPLIT_DELIM_CAPTURE);
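Note that the 'u' modifier only changes how the regex reads the string; strlen still counts bytes, so 'à' counts as two towards the width. A rough sketch of a variant that counts characters instead. The name tokenTruncateMb is hypothetical, and the code assumes the mbstring extension is loaded and the input is valid UTF-8:

```php
<?php
// Hypothetical multibyte variant of tokenTruncate; assumes the mbstring
// extension is available and that $string is valid UTF-8.
function tokenTruncateMb(string $string, int $width): string
{
    $parts = preg_split('/(\s+)/u', $string, -1, PREG_SPLIT_DELIM_CAPTURE);
    $length = 0;
    $kept = 0;
    foreach ($parts as $part) {
        $length += mb_strlen($part, 'UTF-8'); // characters, not bytes
        if ($length > $width) {
            break;
        }
        ++$kept;
    }
    return implode('', array_slice($parts, 0, $kept));
}

echo tokenTruncateMb("déjà vu à Paris", 10), "\n";
```

With byte counting the same input would be cut before 'à', since "déjà vu à" is 12 bytes long but only 9 characters.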

How to truncate a string in PHP to the sentence closest to a certain number of characters?

This is what I came up with. You should check whether the sentence is longer than the length you are looking for, among other things like what g13n said. If the sentence is too short or too long, it might be better to chop it off and append "...". You would also have to check or convert whitespace, since strrpos only looks for exactly the string it is given.

$maxlen = 150;
$file = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Integer malesuada eleifend orci, eget dignissim ligula porttitor cursus. Praesent in blandit enim. Maecenas vitae eleifend est. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Maecenas pulvinar gravida tempor.";
if (strlen($file) > $maxlen) {
    $file = substr($file, 0, strrpos($file, ". ", $maxlen - strlen($file)) + 1);
}
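One case the snippet above does not cover: when no ". " occurs before the limit, strrpos returns false and only the first character is kept. A minimal sketch of a guarded helper; the name truncateAtSentence and the fallback to a hard cut plus "..." are assumptions, and it searches the first $maxlen bytes rather than using a negative offset:

```php
<?php
// Hypothetical helper; the fallback to a hard cut with '...' is an
// assumption, not the behaviour of the original snippet.
function truncateAtSentence(string $text, int $maxlen): string
{
    if (strlen($text) <= $maxlen) {
        return $text;
    }
    // Search only the first $maxlen bytes for the last sentence end.
    $pos = strrpos(substr($text, 0, $maxlen), '. ');
    if ($pos === false) {
        // No complete sentence fits: hard cut and mark the omission.
        return substr($text, 0, $maxlen) . '...';
    }
    return substr($text, 0, $pos + 1); // keep the full stop
}

echo truncateAtSentence("One. Two three four. Five six seven.", 25), "\n";
```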

If you want to use the same function you have, you can try this:

function shortenString($string, $your_desired_width) {
    // -1 means "no limit"; passing null is deprecated as of PHP 8.1.
    $parts = preg_split('/([\s\n\r]+)/', $string, -1, PREG_SPLIT_DELIM_CAPTURE);

    $length = 0;
    $last_part = 0;
    $last_taken = 0;
    foreach ($parts as $part) {
        $length += strlen($part);
        if ($length > $your_desired_width) {
            break;
        }
        ++$last_part;
        // substr() is safe on empty parts, unlike $part[strlen($part) - 1].
        if (substr($part, -1) === '.') {
            $last_taken = $last_part;
        }
    }
    return implode(array_slice($parts, 0, $last_taken));
}

How can I truncate a string to the first 20 words in PHP?

function limit_text($text, $limit) {
    if (str_word_count($text, 0) > $limit) {
        $words = str_word_count($text, 2);
        $pos = array_keys($words);
        $text = substr($text, 0, $pos[$limit]) . '...';
    }
    return $text;
}

echo limit_text('Hello here is a long sentence that will be truncated by the', 5);

Outputs:

Hello here is a long ...
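str_word_count only treats alphabetic characters (plus ' and -) as parts of words, so tokens such as numbers are not counted. If that matters, a whitespace-based split is one possible alternative. This is a sketch under that assumption; the name limitWords is hypothetical, and note that it collapses runs of whitespace into single spaces when it truncates:

```php
<?php
// Hypothetical alternative to limit_text(); splits on whitespace instead
// of relying on str_word_count().
function limitWords(string $text, int $limit): string
{
    $words = preg_split('/\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    if (count($words) <= $limit) {
        return $text; // nothing to cut, return the input untouched
    }
    return implode(' ', array_slice($words, 0, $limit)) . '...';
}

echo limitWords('Hello here is a long sentence that will be truncated by the', 5), "\n";
```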

Truncate a string to first n characters of a string and add three dots if any characters are removed

//The simple version for 10 Characters from the beginning of the string
$string = substr($string,0,10).'...';

Update:

Based on suggestion for checking length (and also ensuring similar lengths on trimmed and untrimmed strings):

$string = (strlen($string) > 13) ? substr($string,0,10).'...' : $string;

So you will get a string of at most 13 characters: either 13 (or fewer) normal characters, or 10 characters followed by '...'.

Update 2:

Or as function:

function truncate($string, $length, $dots = "...") {
return (strlen($string) > $length) ? substr($string, 0, $length - strlen($dots)) . $dots : $string;
}
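The function above measures length in bytes, so it can cut a multibyte UTF-8 character in half. A minimal sketch of a character-based variant; the name truncateMb is hypothetical, and it assumes the mbstring extension and UTF-8 input:

```php
<?php
// Hypothetical multibyte variant of truncate(); assumes the mbstring
// extension is available and the input is UTF-8.
function truncateMb(string $string, int $length, string $dots = '...'): string
{
    if (mb_strlen($string, 'UTF-8') <= $length) {
        return $string;
    }
    // Reserve room for the dots; never pass a negative length to mb_substr().
    $keep = max(0, $length - mb_strlen($dots, 'UTF-8'));
    return mb_substr($string, 0, $keep, 'UTF-8') . $dots;
}

echo truncateMb("Grüße aus München", 10), "\n";
```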

Update 3:

It's been a while since I wrote this answer and I don't actually use this code any more. I prefer this function which prevents breaking the string in the middle of a word using the wordwrap function:

function truncate($string, $length = 100, $append = "…") {
    $string = trim($string);

    if (strlen($string) > $length) {
        $string = wordwrap($string, $length);
        $string = explode("\n", $string, 2);
        $string = $string[0] . $append;
    }

    return $string;
}

How to truncate by word instead of letter count

Trim back to the last space:

$title = substr($title, 0, 69);
$title = substr($title, 0, strrpos($title, " ")) . '...';

http://php.net/manual/en/function.strrpos.php
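The two lines above always append '...', even to short titles, and produce just '...' when the first 69 bytes contain no space (strrpos returns false). A rough sketch that guards both cases; the helper name truncateTitle and the hard-cut fallback are assumptions:

```php
<?php
// Hypothetical helper name; the default limit of 69 comes from the
// snippet above.
function truncateTitle(string $title, int $limit = 69): string
{
    if (strlen($title) <= $limit) {
        return $title; // nothing to trim, so no ellipsis
    }
    $cut = substr($title, 0, $limit);
    $space = strrpos($cut, ' ');
    // Without a space in the first $limit bytes, fall back to a hard cut.
    if ($space !== false) {
        $cut = substr($cut, 0, $space);
    }
    return $cut . '...';
}

echo truncateTitle("hello world foo", 8), "\n"; // hello...
```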

Trimming a block of text to the nearest word when a certain character limit is reached?

See the wordwrap function.

I would probably do something like:

function wrap($string) {
$wstring = explode("\n", wordwrap($string, 27, "\n") );
return $wstring[0];
}

(If your strings already span several lines, use a character or pattern other than "\n" for the split.)

php truncate string if longer than limit and put some omission at the end..similar to ruby

function substr_with_ellipsis($string, $chars = 100)
{
preg_match('/^.{0,' . $chars. '}(?:.*?)\b/iu', $string, $matches);
$new_string = $matches[0];
return ($new_string === $string) ? $string : $new_string . '…';
}

How to shorten a string without slicing through a word while keeping within a character limit in PHP

If the string is too long, you can first use substr to truncate the string and then a regular expression to remove the last full or partial word:

$s = substr($s, 0, (140 - 3));
$s = preg_replace('/ [^ ]*$/', ' ...', $s);

Note that you have to cut the original to fewer than 140 bytes first, because appending the ... could otherwise push the length of the string beyond 140 bytes.
