Getting the first character of a string with $str[0]

Yes. Strings can be treated as character arrays, and the way to access a position in an array is the [] operator. Usually there's no problem at all with using $str[0] (and I'm pretty sure it is much faster than the substr() method).

There is only one caveat with both methods: they will get the first byte, rather than the first character. This is important if you're using multibyte encodings (such as UTF-8). If you want to support that, use mb_substr(). Arguably, you should always assume multibyte input these days, so this is the best option, but it will be slightly slower.
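A minimal sketch of the byte-versus-character difference, assuming the mbstring extension is loaded. The sample string is made up for illustration, and the UTF-8 bytes are written as escapes so the result does not depend on how the file is saved.

```php
<?php
// Assumption: ext-mbstring is available. "\xC3\xA9" is 'é' encoded in UTF-8.
$str = "\xC3\xA9clair";

$firstByte = $str[0];                        // first byte only
$firstChar = mb_substr($str, 0, 1, 'UTF-8'); // first full character

var_dump(bin2hex($firstByte));      // string(2) "c3"
var_dump(bin2hex($firstChar));      // string(4) "c3a9"
var_dump(strlen($str));             // int(7), counted in bytes
var_dump(mb_strlen($str, 'UTF-8')); // int(6), counted in characters
```

Here $str[0] returns half of the 'é', which is not valid UTF-8 on its own, while mb_substr() returns the whole character.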

How to get the first character of a string in PHP

For single-byte encoded (binary) strings:

substr($username, 0, 1);
// or
$username[0] ?? null;

For multi-byte encoded strings, such as UTF-8:

mb_substr($username, 0, 1);

Replace first character of a String if first character is 0

Test the first character and if it is zero, replace it.

if (substr($pn, 0, 1) === '0') {
    $pn = '233' . substr($pn, 1);
}

You may also want to check other criteria, such as the length of the original value, to make sure the result is what you expect and not an exceptional value. For example, if the original value is 023333623236, then performing the above transformation may not be what you want.
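One way to add such a guard, as a sketch only: the function name normalizePhone and the expected local length of 10 digits are assumptions made for illustration, not rules taken from the answer above.

```php
<?php
// Hypothetical helper: replaces a leading 0 with the 233 prefix, but only
// when the value looks like a local number of the assumed length.
function normalizePhone(string $pn, int $localLength = 10): string
{
    // Assumed rule: exact length, digits only, leading zero.
    if (strlen($pn) === $localLength && ctype_digit($pn) && $pn[0] === '0') {
        return '233' . substr($pn, 1);
    }

    return $pn; // anything else is left untouched
}

var_dump(normalizePhone('0241234567'));   // string(12) "233241234567"
var_dump(normalizePhone('023333623236')); // string(12) "023333623236"
```

The 12-digit value from the example is returned unchanged because it does not match the assumed length; adjust the rule to whatever your real data requires.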

How do I get the first character of an already initialized string in PHP?

$firstChar = $stringTest[0];

This gets the first character of the string by treating $stringTest as an array of characters, and it is the fastest method.

$firstChar = substr($stringTest, 0, 1);

This is slower. It takes a substring of 1 character (the last argument), starting at offset 0.
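Neither snippet says what happens when the string is empty. A small sketch of one way to handle that case, assuming PHP 7.1 or later (for the ?? operator and the nullable return type); firstByte is a hypothetical helper name:

```php
<?php
// Hypothetical helper: returns the first byte, or null for an empty string.
function firstByte(string $s): ?string
{
    // ?? checks the offset like isset(), so an empty string yields null
    // instead of an "Uninitialized string offset" diagnostic.
    return $s[0] ?? null;
}

var_dump(firstByte('hello')); // string(1) "h"
var_dump(firstByte(''));      // NULL
```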

Print / Find the first character (with the LOWEST occurrence) in a non-empty string and order is important

ANSWER #4 - file/substr()/reduced-array-usage solution

After some back-n-forth with @AKS, and using an ever-larger data set (latest test using a 36 MB file), the awk/array memory issue has cropped up (eg, for the larger dataset the various awk answers - up to this point - require 6-8 GB of RAM).

My first attempt at addressing the memory issue will be to copy all of the input into a new variable; yeah, this means copying 36 MB of data into an awk variable, but that'll still be a lot less than 6-8 GB of RAM.

Using the new (bigger) dataset provided by @AKS:

$ str="upvouapouqweruuUPOUADUFUAUASDFLKHLPlakjfaldsfpuFHAOOJJADFIASYDOYsggdhuafaismxasidfuasudfoasdufoiasudfoiayioOISYDOIQYORIYOIRYOIYQNOIYFAclamscvjlaivniauppruporupourpoupupovupuadouaouuouaudfaodfpadufuudofupuaspfupipoporqwooPOFPUnmcupauvpaupvouapouqweruuUPOUADUFUAUASDFLKHLPlakjfaldsfpuFHAOOJJADFIASYDOYsggdhuafaismxasidfuasudfoasdufoiasudfoiayioOISYDOIQYORIYOIRYOIYQNOIYFAclamscvjlaivniauppruporupourpoupupovupuadouaouuouaudfaodfpadufuudofupuaspfupipoporqwooPOFPUnmcupauvpaupvouapouqweruuUPOUADUFUAUASDFLKHLPlakjfaldsfpuFHAOOJJADFIASYDOYsggdhuafaismxasidfuasudfoasdufoiasudfoiayioOISYDOIQYORIYOIRYOIYQNOIYFAclamscvjlaivniauppruporupourpoupupovupuadouaouuouaudfaodfpadufuudofupuaspfupipoporqwooPOFPUnmcupauvpaupvouapouqweruuUPOUADUFUAUASDFLKHLPlakjfaldsfpuFHAOOJJADFIASYDOYsggdhuafaismxasidfuasudfoasdufoiasudfoiayioOISYDOIQYORIYOIRYOIYQNOIYFAclamscvjlaivniauppruporupourpoupupovupuadouaouuouaudfaodfpadufuudofupuaspfupipoporqwooPOFPUnmcupauvpaupvouapouqweruuUPOUADUFUAUASDFLKHLPlakjfaldsfpuFHAOOJJADFIASYDOYsggdhuafaismxasidfuasudfoasdufoiasudfoiayioOISYDOIQYORIYOIRYOIYQNOIYFAclamscvjlaivniauppruporupourpoupupovupuadouaouuouaudfaodfpadufuudofupuaspfupipoporqwooPOFPUnmcupauvpaupvouapouqweruuUPOUADUFUAUASDFLKHLPlakjfaldsfpuFHAOOJJADFIASYDOYsggdhuafaismxasidfuasudfoasdufoiasudfoiayioOISYDOIQYORIYOIRYOIYQNOIYFAclamscvjlaivniauppruporupourpoupupovupuadouaouuouaudfaodfpadufuudofupuaspfupipoporqwooPOFPUnmcupauvpaupvouapouqweruuUPOUADUFUAUASDFLKHLPlakjfaldsfpuFHAOOJJADFIASYDOYsggdhuafaismxasidfuasudfoasdufoiasudfoiayioOISYDOIQYORIYOIRYOIYQNOIYFAclamscvjlaivniauppruporupourpoupupovupuadouaouuouaudfaodfpadufuudofupuaspfupipoporqwooPOFPUnmcupauvpaupvouapouqweruuUPOUADUFUAUASDFLKHLPlakjfaldsfpuFHAOOJJADFIASYDOYsggdhuafaismxasidfuasudfoasdufoiasudfoiayioOISYDOIQYORIYOIRYOIYQNOIYFAclamscvjlaivniauppruporupourpoupupovupuadouaouuouaudfaodfpadufuudofupuaspfupipoporqwooPOFPUnmcupauvpaupvouapouqweruuUPOUADUFUAUASDFLKHLPlakjfaldsfpuFHAOOJJADFIASYDOYsggdhuafaismxasidfuasudfoasdufoiasudfoiayioOISYDO
IQYORIYOIRYOIYQNOIYFAclamscvjlaivniauppruporupourpoupupovupuadouaouuouaudfaodfpadufuudofupuaspfupipoporqwooPOFPUnmcupauvpaupvouapouqweruuUPOUADUFUAUASDFLKHLPlakjfaldsfpuFHAOOJJADFIASYDOYsggdhuafaismxasidfuasudfoasdufoiasudfoiayioOISYDOIQYORIYOIRYOIYQNOIYFAclamscvjlaivniauppruporupourpoupupovupuadouaouuouaudfaodfpadufuudofupuaspfupipoporqwooPOFPUnmcupauvpaupvouapouqweruuUPOUADUFUAUASDFLKHLPlakjfaldsfpuFHAOOJJADFIASYDOYsggdhuafaismxasidfuasudfoasdufoiasudfoiayioOISYDOIQYORIYOIRYOIYQNOIYFAclamscvjlaivniauppruporupourpoupupovupuadouaouuouaudfaodfpadufuudofupuaspfupipoporqwooPOFPUnmcupauvpaupvouapouqweruuUPOUADUFUAUASDFLKHLPlakjfaldsfpuFHAOOJJADFIASYDOYsggdhuafaismxasidfuasudfoasdufoiasudfoiayioOISYDOIQYORIYOIRYOIYQNOIYFAclamscvjlaivniauppruporupourpoupupovupuadouaouuouaudfaodfpadufuudofupuaspfupipoporqwooPOFPUnmcupauvpaupvouapouqweruuUPOUADUFUAUASDFLKHLPlakjfaldsfpuFHAOOJJADFIASYDOYsggdhuafaismxasidfuasudfoasdufoiasudfoiayioOISYDOIQYORIYOIRYOIYQNOIYFAclamscvjlaivniauppruporupourpoupupovupuadouaouuouaudfaodfpadufuudofupuaspfupipoporqwooPOFPUnmcupauvpaupvouapouqweruuUPOUADUFUAUASDFLKHLPlakjfaldsfpuFHAOOJJADFIASYDOYsggdhuafaismxasidfuasudfoasdufoiasudfoiayioOISYDOIQYORIYOIRYOIYQNOIYFAclamscvjlaivniauppruporupourpoupupovupuadouaouuouaudfaodfpadufuudofupuaspfupipoporqwooPOFPUnmcupauvpaupvouapouqweruuUPOUADUFUAUASDFLKHLPlakjfaldsfpuFHAOOJJADFIASYDOYsggdhuafaismxasidfuasudfoasdufoiasudfoiayioOISYDOIQYORIYOIRYOIYQNOIYFAclamscvjlaivniauppruporupourpoupupovupuadouaouuouaudfaodfpadufuudofupuaspfupipoporqwooPOFPUnmcupauvpaupvouapouqweruuUPOUADUFUAUASDFLKHLPlakjfaldsfpuFHAOOJJADFIASYDOYsggdhuafaismxasidfuasudfoasdufoiasudfoiayioOISYDOIQYORIYOIRYOIYQNOIYFAclamscvjlaivniauppruporupourpoupupovupuadouaouuouaudfaodfpadufuudofupuaspfupipoporqwooPOFPUnmcupauvpaupvouapouqweruuUPOUADUFUAUASDFLKHLPlakjfaldsfpuFHAOOJJADFIASYDOYsggdhuafaismxasidfuasudfoasdufoiasudfoiayioOISYDOIQYORIYOIRYOIYQNOIYFAclamscvjlaivniauppruporupourpoupupovupuadouaouuouaudfaodfpadufuudofupuaspfupipoporqwooPOFPUnmcupauv
paupvouapouqweruuUPOUADUFUAUASDFLKHLPlakjfaldsfpuFHAOOJJADFIASYDOYsggdhuafaismxasidfuasudfoasdufoiasudfoiayioOISYDOIQYORIYOIRYOIYQNOIYFAclamscvjlaivniauppruporupourpoupupovupuadouaouuouaudfaodfpadufuudofupuaspfupipoporqwooPOFPUnmcupauvpaupvouapouqweruuUPOUADUFUAUASDFLKHLPlakjfaldsfpuFHAOOJJADFIASYDOYsggdhuafaismxasidfuasudfoasdufoiasudfoiayioOISYDOIQYORIYOIRYOIYQNOIYFAclamscvjlaivniauppruporupourpoupupovupuadouaouuouaudfaodfpadufuudofupuaspfupipoporqwooPOFPUnmcupauvpaupvouapouqweruuUPOUADUFUAUASDFLKHLP"
$ for i in {1..10}; do str="${str}${str}"; done
$ for i in {1..3}; do str="${str}${str}"; done
$ echo -e "\n\n-- Adding 'z', the only char that occurs once, to the big string in the 'str' variable\n"
$ str="${str}z"
$ echo $str | wc
1 1 36864002
$ echo "${str}" > newgiga.txt
$ ls -lh newgiga.txt
-rw-r--r--+ 1 xxxxx yyyyy 36M Jun 6 16:55 newgiga.txt

NOTE: The way this data has been created, all letters/numbers occur more than once except for the letter z (which appears just once, and at the end of the entire dataset).

And the new/improved awk solution:

$ time awk '
    { copy = copy $0                      # keep a copy of our input for later reparsing
      len  = length($0)

      for ( i=1; i<=len; i++ ) {          # get a count of occurrences of each letter/number
          token = substr($0,i,1)
          count[token]++
      }
    }

END { for ( i in count ) {
          if ( min <= 0 )
              min = count[i]
          else
              min = count[i]<min?count[i]:min   # find the lowest/minimum count
      }

      len = length(copy)                  # length of the whole input, not just the last line
      for ( i=1; i<=len; i++ ) {          # reparse input looking for first letter with count == min
          token = substr(copy,i,1)
          if ( min == count[token] ) {
              print token, min            # print the letter/number and count and
              break                       # break out of our loop
          }
      }
    }
' newgiga.txt

z 1 # as mentioned in the above NOTE => z occurs just once in the dataset

real 0m19.575s # roughly the same rate as the previous answer #3, which took 6 secs for 14 MB of data
user 0m19.406s
sys 0m0.171s

NOTE: This answer used up 160 MB of memory on my machine (much better than the 6-8 GB of earlier answers) while also running at about the same rate as before.


Tried a solution that eliminates the copy variable and instead processes the input file a second time. Results:

  • total memory usage dropped by ~30 MB (to ~130 MB)
  • total run time increased by ~2 seconds

So, the trade-offs aren't really worth the effort.

Get the repeated values of the first character in a string

This should do the trick:

$string = "01110111011101110111101110111110111111";
$offset = 0;
$return_value = array();
$character = substr($string, 0, 1);

while (($offset = strpos($string, $character, $offset)) !== false) {
    $return_value[] = $offset + 1;
    $offset = $offset + strlen($character);
}

var_dump($return_value);

The var_dump() of $return_value prints the 1-based positions of that character:

array(8) { [0]=> int(1) [1]=> int(5) [2]=> int(9) [3]=> int(13) [4]=> int(17) [5]=> int(22) [6]=> int(26) [7]=> int(32)}

get first char of each string in a list

Santiago Squarzon provided the crucial pointer in a comment:

Provide the strings you want the ForEach-Object cmdlet to operate on via the pipeline, which allows you to refer to each via the automatic $_ variable (the following uses an array literal as input for brevity):

PS> 'tom', 'dick', 'harry' | ForEach-Object { $_[0] }
t
d
h

Alternatively, for values already in memory, use the .ForEach array method for better performance:

('tom', 'dick', 'harry').ForEach({ $_[0] })

The foreach statement provides the best performance:

foreach ($str in ('tom', 'dick', 'harry')) { $str[0] }

As for what you tried:

  • ForEach-Object { ... } - without pipeline input - is essentially the same as executing ... directly.

    • Thus, expressed in terms of the sample input above:

      • You executed:

        ForEach-Object { ('tom', 'dick', 'harry')[0][0] }
      • which is the same as:

         ('tom', 'dick', 'harry')[0][0]
      • which therefore extracts the first element from the input array (the first [0]) and then applies the second [0] to that string only, and therefore only yields 't'.

  • In other words: use of ForEach-Object only makes sense with input from the pipeline.


