How Does Zalgo Text Work

How does Zalgo text work?

The text uses combining characters, also known as combining marks. See section 2.11, Combining Characters, in the Unicode Standard (PDF).

In Unicode, character rendering does not use a simple character cell model where each glyph fits into a box with a given height. Combining marks may be rendered above, below, or inside a base character.

So you can easily construct a character sequence, consisting of a base character and “combining above” marks, of any length, to reach any desired visual height, assuming that the rendering software conforms to the Unicode rendering model. Such a sequence has no meaning, of course, and even a monkey could produce it (e.g., given a keyboard with a suitable driver).

And you can mix “combining above” and “combining below” marks.
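As a minimal sketch of that idea in Java (the class and the stack() helper are made-up names for illustration, not part of the original answer), the following appends a chosen number of "above" and "below" marks to a single base character:

```java
class StackedMarks {

    // Hypothetical helper: base character, then `above` copies of
    // U+0307 COMBINING DOT ABOVE, then `below` copies of
    // U+0323 COMBINING DOT BELOW.
    static String stack(char base, int above, int below) {
        StringBuilder sb = new StringBuilder().append(base);
        for (int i = 0; i < above; i++) {
            sb.append('\u0307');
        }
        for (int i = 0; i < below; i++) {
            sb.append('\u0323');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = stack('H', 10, 5);
        // One visible base letter, but 1 + 10 + 5 UTF-16 code units.
        System.out.println(s.length());
    }
}
```

How tall the result actually looks depends on the font and the renderer; the string itself is just a base character followed by marks.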

The sample text in the question starts with:

  • LATIN CAPITAL LETTER H - H
  • COMBINING LATIN SMALL LETTER T - ͭ
  • COMBINING GREEK KORONIS - ̓
  • COMBINING COMMA ABOVE - ̓
  • COMBINING DOT ABOVE - ̇
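A listing like the one above can be reproduced with the standard Character API. This is only a sketch: the class name and the describe() helper are illustrative, and the input string is the five code points from the list written as escapes.

```java
class InspectMarks {

    // Returns one line per code point: "U+XXXX NAME", with a
    // " (combining)" suffix for nonspacing marks (general category Mn).
    static String describe(String text) {
        StringBuilder sb = new StringBuilder();
        text.codePoints().forEach(cp -> {
            boolean combining = Character.getType(cp) == Character.NON_SPACING_MARK;
            sb.append(String.format("U+%04X %s%s\n", cp, Character.getName(cp),
                    combining ? " (combining)" : ""));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // H, then U+036D, U+0343, U+0313, U+0307
        System.out.print(describe("H\u036D\u0343\u0313\u0307"));
    }
}
```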

Zalgo text in Java?

Edit: here is how to do it in Java.

The result is saved to the text file zalgo.txt, encoded as UTF-8. We write to a file because your IDE's console might not display the Unicode characters properly if you write them to the output stream.

import java.io.BufferedWriter;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UnsupportedEncodingException;
import java.io.Writer;

public class Zalgo {

    // combining marks that go above the base character
    private static final char[] zalgo_up = {
        '\u030d', '\u030e', '\u0304', '\u0305', '\u033f', '\u0311', '\u0306', '\u0310',
        '\u0352', '\u0357', '\u0351', '\u0307', '\u0308', '\u030a', '\u0342', '\u0343',
        '\u0344', '\u034a', '\u034b', '\u034c', '\u0303', '\u0302', '\u030c', '\u0350',
        '\u0300', '\u0301', '\u030b', '\u030f', '\u0312', '\u0313', '\u0314', '\u033d',
        '\u0309', '\u0363', '\u0364', '\u0365', '\u0366', '\u0367', '\u0368', '\u0369',
        '\u036a', '\u036b', '\u036c', '\u036d', '\u036e', '\u036f', '\u033e', '\u035b',
        '\u0346', '\u031a'
    };

    // combining marks that go below the base character
    private static final char[] zalgo_down = {
        '\u0316', '\u0317', '\u0318', '\u0319', '\u031c', '\u031d', '\u031e', '\u031f',
        '\u0320', '\u0324', '\u0325', '\u0326', '\u0329', '\u032a', '\u032b', '\u032c',
        '\u032d', '\u032e', '\u032f', '\u0330', '\u0331', '\u0332', '\u0333', '\u0339',
        '\u033a', '\u033b', '\u033c', '\u0345', '\u0347', '\u0348', '\u0349', '\u034d',
        '\u034e', '\u0353', '\u0354', '\u0355', '\u0356', '\u0359', '\u035a', '\u0323'
    };

    // those always stay in the middle
    private static final char[] zalgo_mid = {
        '\u0315', '\u031b', '\u0340', '\u0341', '\u0358', '\u0321', '\u0322', '\u0327',
        '\u0328', '\u0334', '\u0335', '\u0336', '\u034f', '\u035c', '\u035d', '\u035e',
        '\u035f', '\u0360', '\u0362', '\u0338', '\u0337', '\u0361', '\u0489'
    };


    // random helpers
    // ---------------------------------------------------

    // returns a random int in the range [0, max)
    private static int rand(int max) {
        return (int) Math.floor(Math.random() * max);
    }

    // returns a random char from a zalgo char table
    private static char rand_zalgo(char[] array) {
        return array[rand(array.length)];
    }

    // returns true if c is one of the zalgo chars in the tables above
    private static boolean is_zalgo_char(char c) {
        for (int i = 0; i < zalgo_up.length; i++) {
            if (c == zalgo_up[i]) {
                return true;
            }
        }
        for (int i = 0; i < zalgo_down.length; i++) {
            if (c == zalgo_down[i]) {
                return true;
            }
        }
        for (int i = 0; i < zalgo_mid.length; i++) {
            if (c == zalgo_mid[i]) {
                return true;
            }
        }
        return false;
    }

    public static String goZalgo(String iText, boolean zalgo_opt_mini, boolean zalgo_opt_normal,
            boolean up, boolean down, boolean mid) {
        // StringBuilder avoids re-copying the whole string on every append
        StringBuilder zalgoTxt = new StringBuilder();

        for (int i = 0; i < iText.length(); i++) {
            // skip zalgo chars that are already present in the input
            if (is_zalgo_char(iText.charAt(i))) {
                continue;
            }

            int num_up;
            int num_mid;
            int num_down;

            // add the normal character
            zalgoTxt.append(iText.charAt(i));

            // options: how many marks of each kind to add
            if (zalgo_opt_mini) {
                num_up = rand(8);
                num_mid = rand(2);
                num_down = rand(8);
            } else if (zalgo_opt_normal) {
                num_up = rand(16) / 2 + 1;
                num_mid = rand(6) / 2;
                num_down = rand(16) / 2 + 1;
            } else { // maxi
                num_up = rand(64) / 4 + 3;
                num_mid = rand(16) / 4 + 1;
                num_down = rand(64) / 4 + 3;
            }

            if (up) {
                for (int j = 0; j < num_up; j++) {
                    zalgoTxt.append(rand_zalgo(zalgo_up));
                }
            }
            if (mid) {
                for (int j = 0; j < num_mid; j++) {
                    zalgoTxt.append(rand_zalgo(zalgo_mid));
                }
            }
            if (down) {
                for (int j = 0; j < num_down; j++) {
                    zalgoTxt.append(rand_zalgo(zalgo_down));
                }
            }
        }

        return zalgoTxt.toString();
    }

    public static void main(String[] args) {
        final String zalgoTxt = goZalgo("To invoke the hive-mind representing chaos.\n" +
                "Invoking the feeling of chaos.\n" +
                "With out order.\n" +
                "The Nezperdian hive-mind of chaos. Zalgo. \n" +
                "He who Waits Behind The Wall.\n" +
                "ZALGO!", true, false, true, true, true);

        final File fileDir = new File("zalgo.txt");

        // try-with-resources flushes and closes the writer even if a write fails
        try (Writer out = new BufferedWriter(
                new OutputStreamWriter(new FileOutputStream(fileDir), "UTF-8"))) {
            for (String line : zalgoTxt.split("\n")) {
                out.append(line).append("\r\n");
            }
        } catch (IOException e) {
            // also covers UnsupportedEncodingException, which is an IOException
            System.out.println(e.getMessage());
        }
    }
}

If you look carefully here (another zalgo generator):

http://textozor.com/zalgo-text/

You can see that it uses JavaScript to generate the messed-up text: http://textozor.com/zalgo-text/scriptz.js

Convert that logic into any language you want.

How to avoid Zalgo text bleeding all over the place without totally removing it?

After testing an example case with Firefox and Chrome, I would say the best option is to use the declaration overflow: auto. Using overflow: hidden would make sense only if possible scrollbars are considered worse than losing user content.

With overflow: auto, the browser falls back to scrollbars automatically if the content does not fit, and the content is still clipped to the selected element.

The declaration clip: rect(0,auto,auto,0); is no good because it only works with position: absolute; and without overflow: visible.

See an example without overflow: auto for a comparison.

The above examples are inlined here as snippets:

An example without safeguards against Zalgo text:

body
{
    background: #ccc;
    color: #000;
    font-family: sans-serif;
    padding: 4em 1em;
}
section.userinput
{
    margin-top: 0.5em;
    padding: 0.1em 0.25em;
    background: #fff;
    border: solid #ccc 0.1em;
    border-radius: 0.5em;
    font-size: 120%;
}
.v1 section.userinput
{
    overflow: auto;
}
.v2 section.userinput
{
    position: absolute;
    clip: rect(0em,auto,auto,0em);
}
<body class="">
<section class="userinput">
Some real content here.
</section>
<section class="userinput">
T̠̬̘̯̙̲̪̪͇̜̭̣̘̟̲͇̳̬͕̖̜̯̘͉͈͉͎̱͓̣̰̳̳͉̙̯̙̰͚͇͎͕̘̯̳̞̲̼̱̖̩͙͕̤͂̋͌ͤ̒ͬ̀͒͂͒ͩ̌͗̋̓ͭ́̀̿͒ͣͮ̓̋ͭ͊͒͐ͩ̆̓͂ͫ̐͂̇͛͑̓̚ͅẽ̗̙͚̮̭̮̬̠͈̻̦̭̭̳̹̯̹̦̔̌ͯ͂̎̈̊̍ͣ̿̈̈̿̄ͦͭ̍͑̽̎̅͛͗͐ͬ̂̊̽̌̎̋ͭ͆̈́̓ͦͦ̑͛ͯs̭̠͖̝͙̩̫̫̥̦͚̝̼̣̥̗ͣͦ͑́͐ͭ͊ͧ̽͐̈̔͛ͨ́̎̔ͤ͐͒̓̀̅̈́̊̋͋̀̿̎͒̉̽͂ͮͬt̾͒̂̽̐ͪ̆̾ͮ͌̽͛̌͒̔ͧ͗̿ͩ̄̿̿̌ͪͩ̊̏͑͌̀̋ͩ͆ͣ̑̏́̽̐͐̔ͪ̓̓ͭ͆ ̯̘͙̠̦̩̝͎̭̖̪̗̞̖̟̲͖̥͙͕̟̝̹͎̽o̼̮̭̞ͯͬ̀͐͗̿ͣ͛ͮͭ̎ͨ͒̌̾̐̉̍͗̎̈́̆ͪͦ̌ͧͦ̓ͨ̐ͯ͒͑͛ͯ̽ͅf̝̪̼̠͎͇̹̝̙̰̟̼͎̱͂ͩ̈́̌ͬ̒ͧ̽̅̉ͧ́͒̒͊ͦͭͭͭ͗́̽ͦ ͚̝̝̠̪͍̰̺̳̫̭͎͔̭̟͍͎͇͎͈͔̠̬͇̦͈̟̰̱̹̲̰̭̲̭̺̜͚̰̹̮̣̤̲̪̙̞͇̦͙͆ͪͨ̐ͨ̽̒͛ͩ̐ͤ͌̂́͒̌ͭͩͦ̎́̈ͬ̓̑̔͐̎͒̔̄ͥ̏ͥͯͧ͐ͪͧͥ̂ͬ̒̀̉̓ͭ̚ͅͅͅͅZ̘̥̲͍̠͎̱̺̘͈͍̟̤̠̮͖͉͕̙̩̲̣̠͎̥̣̜͚̜͕̻͔̰̞̫̭̹͕͙̝̠̮̣̰ͤͦ̍̓ͪͥ̒ͩ̋͆̍͋̽̅͗̈ͨ̂ͨ͋̔̔̓ͪͣ̅̇̏͒ͬ͐ͩ̇ͨ̋ͣ̌̔ͨ́ͪ̔̾̄ͤ̅͑̚ͅA̞̜̺̣͓̼̭̭͈̳̞͚̭̭͕̺͉̜̗̼̣̩̪͂ͯ̈́̍̍͛ͮ̂ͯ̽̎ͬͯ͆̋͌̍͐̌͗ͤ̒ͤ͊̐̈́ͧ̓̇ͬͦ̾ͭ̐̆̚ͅͅͅL̪͉̬̦̝̠̲͖̘̮̙̳͓͇͇̪̱͉̱͓̺͙͓̲̇́̍̽̇̎͊̍̐ͩ̔̋́ͬ̍ͮͫͮ͗̍͋ͭͯ̑̉̈́̄̾̂̀͆̅͑̽̃̚G̣̺̼͔̺̖̣̥̝̰͙̖͖̮̻̩͓̞͈̜̗̤̺̥̻̞͇̩͕̲̙̝̲̤̤̜̐͗͐ͦ̉͐͗ͩ̿ͩ̑ͫ̍͛̄ͦ̔̚O͇͎̬̰̦̜̻͔͇̖͇̞̪͉͉͔͕̥͇̬̮̰̠̟̤̰̹͖̗̺̙͍ͮͨ̿ͪͯ̈́ͫ̔̽̃̀ ̺͕̠̰̝͎̰̟̠̲̗͈̬̥͈͎̺̮̗͍̺͚̟̠̙̠̜̘̹͉̖̤͉̫̰̱̭̠̲̲̗͒ͥͯ̎͐ͨ̓̓ͮ͒ͧ̒̾́̍̍ͦͥ̈́͒͊̃̓̈̈́̀ͮ̂ͪ̓̄̏ͫ̄̓̓̿̓̔̋̎ͧͪͩͪ͋ͫt̘̥̳̺̳̟̯̜̱̯̬̣̣͔̬̟͈͖̗̹͉̫̯́̋͒͂̈́̎͐̇́ͫ̒͛ͥͦ̐̿͂͒͗̃ͮ͒ͪ̌͆̏ͯ̏ͯ̊ͣ̾̃͋ͩ̃̿͐e̹̠̻̟̪̪͎̭̭͎͎̮̹̬̮̪̓̑ͨ̐͐̈́̓ͤͦ͂̿̅͋ͭ̑̓ͬ͐͐ͤ͐ͪ̒ͥ̀̈́ͪ̇̆ͤ̏̏̄̾̌͒ͬ̊ͬ͛̄̄̌̍͋ͥͅͅx̪͇̞̫̰̠͓̣̻̯̞̭̙̝̣͉̱̘̤͇̦̘̙̥͚̫̩̲̘̻͈͉̱͙͇͙ͫ̐̌͛̓͛ͨ͒̂t̩̖̮̙̻ͬ͗͛̍̅̌ͧ͒ͫ̓ͮ̈͒̾ͮͣͮͨͪ͆ͥ̐̍ͮ̽̅̈́̿ͫ͐̍̉ͦͮ͆͗̔̎̿̇ͧ̋ͨͮ̐̓͑̽̑ͤ̊̚̚ͅ ̝͔̺̩͔͈̰͈̣̫̤͉͚͇̟̹̘͔͇̥̘̘̝͛͛̒ͭͣͮͥͦ̿̏ͥͦ̀͂̾͆ͯͧͮͤ͌̌́͗ͨ̎̒ͬ̈́ͧ̊ͨ̓͂̾̉͐ͦ̃̃̚ͅẖ̰̠̮͓̣̯̭̥̹̜̟͍͍͇̀ͧ̽͑̄͊̋̐͋ͨ̔ͭͬ́̀̐͌͗ͥ̓̇͗̂̊ͅe͇͙͕̺̖̰̟̠̩̘̪̳̻̳͉͔̺̳̲̦̘̞̬̬̝͓̬̣̟͕̘͓̬̍͗̋ͮ͑ͣ͗̓̓̎̈̃̾̊̃ͧ̊ͪ̃̀͋̋̄͑̈́̂́͒̔̎̎ͥ͛͌̃͒̈́ͤ͛ͬͫͪ̚r͉̮̼̙̩͖͍̗̣̘͚̭̩͙͙̻͓̦̱̣͉̮̲͇̥͉͚̲͕͖̩̦̫̪̬͔̟͔̦̻̼̼̫̫̯̣̮͈̺͓͖̬̂̾͛̉̆̍ͥ̈́̓̆ͫ͑̄̔̅̈̏̅̓ͨ͐̊ͮ̋̈́ͣͮ̋̓̾ͤ͊ͬ̀̑ͣ͊̇͌ͯ̚ͅḙ̲͍͙͕̯̘͓͔͔͈̹͈̗͎͕̬̖̟̖͚̳͎̖ͩ͊̃ͫ̔̓͒͗ͩ͋̂ͩͩͧ̍͛̿͒ͩͅ.̯̗͗̑̍̑͗ͫͦͦͪͪͧ́̾̓̌̉͑̊̌̿̓ͫ̆̑̽̽ͪͦͨ͌ͦͨ̓
</section>
<section class="userinput">
Some more real content here.
</section>
</body>

Compare two DataTables to determine rows in one but not the other

Would I have to iterate through each row on each DataTable to check if they are the same?

Seeing as you've loaded the data from a CSV file, you're not going to have any indexes or anything, so at some point, something is going to have to iterate through every row, whether it be your code, or a library, or whatever.

Anyway, this is an algorithms question, which is not my specialty, but my naive approach would be as follows:

1: Can you exploit any properties of the data? Are all the rows in each table unique, and can you sort them both by the same criteria? If so, you can do this:

  • Sort both tables by their ID (using something like a quicksort). If they're already sorted, then you win big.
  • Step through both tables at once, skipping over any gaps in IDs in either table. Matched IDs mean duplicated records; an ID that turns up in only one table is a row the other table doesn't have.

This allows you to do it in (sort time * 2) + one pass, so if my big-O notation is correct, it'd be (whatever-sort-time) + O(m+n), which is pretty good.
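The sort-then-step idea can be sketched like this. Note the assumptions: the sketch is in Java rather than .NET, plain lists of integer IDs stand in for the DataTable rows, and the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class SortedDiff {

    // Returns the IDs present in `a` but not in `b`.
    // Assumes IDs are unique within each list, as the approach requires.
    static List<Integer> onlyInFirst(List<Integer> a, List<Integer> b) {
        List<Integer> x = new ArrayList<>(a);
        List<Integer> y = new ArrayList<>(b);
        Collections.sort(x);
        Collections.sort(y);

        List<Integer> result = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < x.size()) {
            if (j >= y.size()) {
                result.add(x.get(i++));      // b is exhausted: the rest of a is unmatched
            } else {
                int cmp = x.get(i).compareTo(y.get(j));
                if (cmp == 0) {              // same ID in both tables
                    i++;
                    j++;
                } else if (cmp < 0) {        // ID only in a
                    result.add(x.get(i++));
                } else {                     // ID only in b: skip it
                    j++;
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [1, 9]
        System.out.println(onlyInFirst(Arrays.asList(5, 1, 3, 9), Arrays.asList(3, 4, 5)));
    }
}
```

Swapping the arguments gives the rows that are only in the second table.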

(Revision: this is the approach that ΤΖΩΤΖΙΟΥ describes )

2: An alternative approach, which may be more or less efficient depending on how big your data is:

  • Run through table 1, and for each row, stick its ID (or computed hashcode, or some other unique ID for that row) into a dictionary (or hashtable if you prefer to call it that).
  • Run through table 2, and for each row, see if the ID (or hashcode etc.) is present in the dictionary. You're exploiting the fact that dictionaries have really fast lookup (O(1) on average, I think). This step will be really fast, but you'll have paid the price doing all those dictionary inserts.
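A rough sketch of the dictionary approach, under the same assumptions as before: Java instead of .NET, integer IDs standing in for rows, a HashSet playing the role of the dictionary, and illustrative names throughout.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class HashDiff {

    // Returns the IDs of table 2 that do not appear in table 1.
    static List<Integer> missingFromFirst(List<Integer> table1Ids, List<Integer> table2Ids) {
        // Pass 1: put every ID of table 1 into a hash set.
        Set<Integer> seen = new HashSet<>(table1Ids);

        // Pass 2: look up each ID of table 2 in the set.
        List<Integer> missing = new ArrayList<>();
        for (Integer id : table2Ids) {
            if (!seen.contains(id)) {
                missing.add(id);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // prints [4]
        System.out.println(missingFromFirst(Arrays.asList(1, 3, 5, 9), Arrays.asList(3, 4, 5)));
    }
}
```

No sorting is needed here, at the cost of holding all of table 1's IDs in memory.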

I'd be really interested to see what people with better knowledge of algorithms than myself come up with for this one :-)

How to achieve unusual text effect?

It's called Zalgo text.

You can Google for an online generator and use it:

TH̘Ë͖́̉ ͠P̯͍̭O̚​N̐Y̡ H̸̡̪̯ͨ͊̽̅̾̎Ȩ̬̩̾͛ͪ̈́̀́͘ ̶̧̨̱̹̭̯ͧ̾ͬC̷̙̲̝͖ͭ̏ͥͮ͟Oͮ͏̮̪̝͍M̲̖͊̒ͪͩͬ̚̚͜Ȇ̴̟̟͙̞ͩ͌͝S̨̥̫͎̭ͯ̿̔̀ͅ

As a side note, don't try to parse HTML with RegEx.
