Ideal method to truncate a string with ellipsis
I like the idea of letting "thin" characters count as half a character. Simple and a good approximation.
The main issue with most ellipsizing approaches, however, is (imho) that they chop off words in the middle. Here is a solution that takes word boundaries into account (but does not dive into pixel math or the Swing API).
private final static String NON_THIN = "[^iIl1\\.,']";

private static int textWidth(String str) {
    return (int) (str.length() - str.replaceAll(NON_THIN, "").length() / 2);
}

public static String ellipsize(String text, int max) {
    if (textWidth(text) <= max)
        return text;

    // Start by chopping off at the word before max
    // This is an over-approximation due to thin-characters...
    int end = text.lastIndexOf(' ', max - 3);

    // Just one long word. Chop it off.
    if (end == -1)
        return text.substring(0, max-3) + "...";

    // Step forward as long as textWidth allows.
    int newEnd = end;
    do {
        end = newEnd;
        newEnd = text.indexOf(' ', end + 1);

        // No more spaces.
        if (newEnd == -1)
            newEnd = text.length();
    } while (textWidth(text.substring(0, newEnd) + "...") < max);

    return text.substring(0, end) + "...";
}
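For readers who want to experiment outside Java, the same idea can be sketched in Python. This is a rough port rather than the author's code: the names text_width and ellipsize simply mirror the Java methods, and the sketch assumes max_width is greater than 3 so that there is room for the three dots.

```python
import re

# Characters counted as "thin" (half width); mirrors the Java NON_THIN pattern.
THIN = re.compile(r"[iIl1.,']")


def text_width(s: str) -> int:
    """Approximate width: every two thin characters count as one."""
    return len(s) - len(THIN.findall(s)) // 2


def ellipsize(text: str, max_width: int) -> str:
    if max_width <= 3:
        raise ValueError("max_width must leave room for the three dots")
    if text_width(text) <= max_width:
        return text

    # Last space at or before index max_width - 3 (rfind's end is exclusive).
    end = text.rfind(" ", 0, max_width - 2)
    if end == -1:
        # Just one long word: cut it hard.
        return text[: max_width - 3] + "..."

    # Step forward word by word while the approximate width still fits.
    new_end = end
    while True:
        end = new_end
        new_end = text.find(" ", end + 1)
        if new_end == -1:
            new_end = len(text)
        if text_width(text[:new_end] + "...") >= max_width:
            break
    return text[:end] + "..."
```

Calling ellipsize("The quick brown fox jumps", 15) cuts after "quick", because adding "brown" would push the approximate width past the limit.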
Smart way to truncate long strings
Essentially, you check the length of the given string. If it's longer than a given length n, clip it to length n (substr or slice) and add the HTML entity &hellip; (…) to the clipped string.
Such a method looks like
function truncate(str, n){
return (str.length > n) ? str.slice(0, n-1) + '…' : str;
};
If by 'more sophisticated' you mean truncating at the last word boundary of a string then you need an extra check.
First you clip the string to the desired length, then you clip the result of that to its last word boundary:
function truncate( str, n, useWordBoundary ){
if (str.length <= n) { return str; }
const subString = str.slice(0, n-1); // the original check
return (useWordBoundary
? subString.slice(0, subString.lastIndexOf(" "))
: subString) + "…";
};
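One edge case in the word-boundary variant is worth spelling out: when the clipped substring contains no space, lastIndexOf returns -1, and slice(0, -1) then drops one extra character instead of leaving the text alone. A minimal Python sketch of the same two-step clipping, with that case handled explicitly, might look like this (the function and parameter names are illustrative, not part of the answer above):

```python
def truncate(text: str, n: int, use_word_boundary: bool = False) -> str:
    """Clip text to n - 1 characters plus an ellipsis, optionally at a word boundary."""
    if len(text) <= n:
        return text
    clipped = text[: max(n - 1, 0)]  # the plain length check
    if use_word_boundary:
        last_space = clipped.rfind(" ")
        # Only cut back to the last space if there is one after position 0;
        # otherwise keep the hard cut so a single long word is not lost.
        if last_space > 0:
            clipped = clipped[:last_space]
    return clipped + "…"
```

With use_word_boundary set, truncate("hello world again", 13, True) stops after "world", while a single long word is still cut at n - 1 characters.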
You can extend the native String prototype with your function. In that case the str parameter should be removed and str within the function should be replaced with this:
String.prototype.truncate = String.prototype.truncate ||
function ( n, useWordBoundary ){
if (this.length <= n) { return this; }
const subString = this.slice(0, n-1); // the original check
return (useWordBoundary
? subString.slice(0, subString.lastIndexOf(" "))
: subString) + "…";
};
More dogmatic developers may chide you strongly for that ("Don't modify objects you don't own"). I wouldn't mind, though.
An approach without extending the String prototype is to create your own helper object, containing the (long) string you provide and the aforementioned method to truncate it. That's what the snippet below does.
const LongstringHelper = str => {
const sliceBoundary = str => str.substr(0, str.lastIndexOf(" "));
const truncate = (n, useWordBoundary) =>
str.length <= n ? str : `${ useWordBoundary
? sliceBoundary(str.slice(0, n - 1))
: str.slice(0, n - 1)}…`;
return { full: str, truncate };
};
const longStr = LongstringHelper(`Lorem ipsum dolor sit amet, consectetur
adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore
magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation
ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute
irure dolor in reprehenderit in voluptate velit esse cillum dolore
eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non
proident, sunt in culpa qui officia deserunt mollit anim id est laborum`);
const plain = document.querySelector("#resultTruncatedPlain");
const lastWord = document.querySelector("#resultTruncatedBoundary");
plain.innerHTML =
longStr.truncate(+plain.dataset.truncateat, !!+plain.dataset.onword);
lastWord.innerHTML =
longStr.truncate(+lastWord.dataset.truncateat, !!+lastWord.dataset.onword);
document.querySelector("#resultFull").innerHTML = longStr.full;
body {
font: normal 12px/15px verdana, arial;
}
p {
width: 450px;
}
#resultTruncatedPlain:before {
content: 'Truncated (plain) n='attr(data-truncateat)': ';
color: green;
}
#resultTruncatedBoundary:before {
content: 'Truncated (last whole word) n='attr(data-truncateat)': ';
color: green;
}
#resultFull:before {
content: 'Full: ';
color: green;
}
<p id="resultTruncatedPlain" data-truncateat="120" data-onword="0"></p>
<p id="resultTruncatedBoundary" data-truncateat="120" data-onword="1"></p>
<p id="resultFull"></p>
How can I truncate my strings with a ... if they are too long?
Here is the logic wrapped up in an extension method:
public static string Truncate(this string value, int maxChars)
{
return value.Length <= maxChars ? value : value.Substring(0, maxChars) + "...";
}
Usage:
var s = "abcdefg";
Console.WriteLine(s.Truncate(3));
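Note that the extension method above appends the dots after maxChars characters, so the result can be up to three characters longer than maxChars ("abcdefg".Truncate(3) yields "abc..."). If the total length has to stay within the limit, one option is to reserve room for the suffix first. The following Python sketch shows that variant; truncate_to and its parameters are hypothetical names chosen for illustration.

```python
def truncate_to(value: str, max_chars: int, suffix: str = "...") -> str:
    """Return value unchanged if it fits, else cut it so that the
    result including the suffix is at most max_chars long."""
    if len(value) <= max_chars:
        return value
    if max_chars <= len(suffix):
        # Not even room for the whole suffix: return as much of it as fits.
        return suffix[:max(max_chars, 0)]
    return value[: max_chars - len(suffix)] + suffix
```

Whether the limit should include the dots is a design choice; the sketch only makes the stricter interpretation explicit.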
How to shorten a string by replacing some characters with an ellipsis
If I understand correctly, n represents the maximum length of the string. If the string is longer, you want to replace a substring by the ellipsis character, such that the resulting string has a length of n. If possible, you also want that ellipsis to be placed before the extension of the filename, leaving one character visible before the final dot.
Some things to consider:
- Use lastIndexOf instead of indexOf, as the last occurrence really determines where the file extension starts.
- Don't use the substr method, as it is considered a legacy feature in ECMAScript. I prefer slice.
- Make the necessary calculations so as to ensure that after the insertion of the ellipsis you arrive at the correct length.
- Deal with boundary cases, where the filename has no extension, or where the extension itself is longer than n.
- Don't use the HTML entity &hellip;, as that will limit the use of this function to HTML output only. Instead put the actual … character, which will work for both HTML and plain text.
Here is the code I would suggest:
function truncate(str, n) {
if (str.length <= n) return str; // Nothing to do
if (n <= 1) return "…"; // Well... not much else we can return here!
let dot = str.lastIndexOf("."); // Where the extension starts
// How many characters from the end should remain:
let after = dot < 0 ? 1 : Math.max(1, Math.min(n - 2, str.length - dot + 2));
// How many characters from the start should remain:
let before = n - after - 1; // Account for the ellipsis
return str.slice(0, before) + "…" + str.slice(-after);
}
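The same before/after arithmetic can be sketched in Python to make the boundary cases easier to check. This is an illustrative variant, not a line-by-line port: the function name truncate_middle and the keep_before_dot parameter are assumptions, with the default of 1 chosen to match the description of leaving one character visible before the final dot.

```python
def truncate_middle(name: str, n: int, keep_before_dot: int = 1) -> str:
    """Shorten name to exactly n characters by replacing a middle part
    with a single ellipsis character, keeping the extension if possible."""
    if len(name) <= n:
        return name  # nothing to do
    if n <= 1:
        return "…"  # no room for anything else
    dot = name.rfind(".")  # where the extension starts, or -1
    if dot < 0:
        after = 1  # no extension: keep just the final character
    else:
        ext_len = len(name) - dot  # extension length, including the dot
        # Keep the extension plus a few characters before it, but always
        # leave at least one character at the start and room for the ellipsis.
        after = max(1, min(n - 2, ext_len + keep_before_dot))
    before = n - after - 1  # account for the ellipsis
    return name[:before] + "…" + name[-after:]
```

For a 20-character name such as "verylongfilename.txt" and n = 12, this keeps six leading characters, the ellipsis, and "e.txt".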
Python truncate a long string
info = (data[:75] + '..') if len(data) > 75 else data
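If the one-liner is needed in more than one place, it can be wrapped in a small function; the name clip and its defaults below are just illustrative. For truncation at word boundaries, the standard library's textwrap.shorten is also worth knowing: it collapses whitespace and drops whole words until the text plus a placeholder fits the given width.

```python
import textwrap


def clip(data: str, limit: int = 75, suffix: str = "..") -> str:
    """Same logic as the one-liner: cut at limit and append the suffix."""
    return data[:limit] + suffix if len(data) > limit else data


# Word-aware alternative from the standard library.
# The default placeholder is " [...]"; here it is replaced by three dots.
short = textwrap.shorten("Hello world", width=10, placeholder="...")
```

Note that clip, like the one-liner, returns a string slightly longer than the limit once the suffix is added, whereas textwrap.shorten counts the placeholder against the width.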
Ellipsis in the middle of a text (Mac style)
In the HTML, put the full value in a custom data-* attribute like
<span data-original="your string here"></span>
Then assign load and resize event listeners to a JavaScript function which will read the original data attribute and place it in the innerHTML of your span tag. Here is an example of the ellipsis function:
function start_and_end(str) {
if (str.length > 35) {
return str.substr(0, 20) + '...' + str.substr(str.length-10, str.length);
}
return str;
}
Adjust the values or, if possible, make them dynamic for different objects. If your users are on different browsers, you can borrow a reference width from text in the same font and size elsewhere in your DOM, then interpolate to an appropriate number of characters to use.
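One way to make the values adjustable is to pass them as parameters instead of hard-coding 35, 20, and 10. The sketch below shows that shape in Python; the parameter names head, tail, and limit are assumptions, and the defaults simply repeat the numbers from the function above.

```python
def start_and_end(text: str, head: int = 20, tail: int = 10, limit: int = 35) -> str:
    """Keep the first head and last tail characters of text,
    joined by three dots, when text is longer than limit."""
    if len(text) <= limit:
        return text
    # Slice from an explicit index so that tail = 0 yields an empty ending.
    return text[:head] + "..." + text[len(text) - tail:]
```

A caller that has measured the available width could then compute head and tail from it, rather than relying on fixed character counts.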
Another tip is to put an abbr tag on the ... or on the whole message, so that the user can get a tooltip with the full string.
<abbr title="simple tool tip">something</abbr>
Trim a string based on the string length
s = s.substring(0, Math.min(s.length(), 10));
Using Math.min like this avoids an exception in the case where the string is already shorter than 10.
Notes:
- The above does simple trimming. If you actually want to replace the last characters with three dots if the string is too long, use Apache Commons StringUtils.abbreviate; see @H6's solution. If you want to use the Unicode horizontal ellipsis character, see @Basil's solution.
- For typical implementations of String, s.substring(0, s.length()) will return s rather than allocating a new String.
- This may behave incorrectly¹ if your String contains Unicode codepoints outside of the BMP; e.g. Emojis. For a (more complicated) solution that works correctly for all Unicode code-points, see @sibnick's solution.
1 - A Unicode codepoint that is not on plane 0 (the BMP) is represented as a "surrogate pair" (i.e. two char values) in the String. By ignoring this, we might trim the string to fewer than 10 code points, or (worse) truncate it in the middle of a surrogate pair. On the other hand, String.length() is not a good measure of Unicode text length, so trimming based on that property may be the wrong thing to do.
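The surrogate-pair issue can be illustrated outside Java as well. Python strings are indexed by code point, so the sketch below counts UTF-16 code units explicitly to imitate what Java's char-based length measures. The helper names utf16_units and trim_utf16 are hypothetical; this is only a demonstration of the counting, not a replacement for the solutions referenced above.

```python
def utf16_units(s: str) -> int:
    """Number of UTF-16 code units, i.e. what Java's String.length() reports."""
    return len(s.encode("utf-16-le")) // 2


def trim_utf16(s: str, max_units: int) -> str:
    """Trim to at most max_units UTF-16 code units without ever
    splitting a surrogate pair."""
    out = []
    used = 0
    for ch in s:
        # Code points above U+FFFF need two code units (a surrogate pair).
        units = 2 if ord(ch) > 0xFFFF else 1
        if used + units > max_units:
            break
        out.append(ch)
        used += units
    return "".join(out)
```

With a string such as "ab" followed by an emoji and "cd", a limit of 3 units stops before the emoji rather than cutting it in half, which is the failure mode the footnote describes.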