Why Questionmark Comes in The End of Filename When I Create .Txt File Through Shell Script

why questionmark comes in the end of filename when i create .txt file through shell script?

It sounds like your script uses \r\n line endings, which are typical of DOS/Windows. Unix-like systems use \n. Try converting the line endings, e.g. with your favorite text editor:

vim +'set ff=unix | x' my_script

Or with dos2unix:

dos2unix my_script

Or with GNU sed:

sed -i 's/\r$//' my_script
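For illustration only, the same conversion can be sketched in Python. The function name to_unix_line_endings and the sample script bytes are assumptions made up for this example, not part of the original question:

```python
def to_unix_line_endings(data: bytes) -> bytes:
    """Replace every CRLF pair with a bare LF; lone LFs are left untouched."""
    return data.replace(b"\r\n", b"\n")


# Hypothetical script contents saved with DOS-style line endings.
script = b"#!/bin/sh\r\ntouch report.txt\r\n"

# Before the fix, the shell would see the file name as b"report.txt\r".
fixed = to_unix_line_endings(script)
print(fixed)
```

Working on bytes rather than text avoids any newline translation by the runtime, so the result is exactly what ends up on disk.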

Shell Scripting unwanted '?' character at the end of file name

It sounds like your script file has DOS-style line endings (\r\n) instead of unix-style (just \n) -- when a script in this format is run, the \r gets treated as part of the commands. In this instance, it's getting included in $emplid and therefore in the filename.

Many platforms support the dos2unix command to convert the file to unix-style line endings. And once it's converted, stick to text editors that support unix-style text files.

EDIT: I had assumed the problem line endings were in the shell script, but it looks like they're in the input file ("$i".txt) instead. You can use dos2unix on the input file to clean it and/or add a cleaning step to the sed command in your script. BTW, you can have a single instance of sed apply several edits with the -e option:

emplid=$(grep -a "Student ID" "$i".txt | sed -e 's/(Student ID:  //g' -e 's/)Tj//g' -e $'s/\r$//')

I'd recommend against using sed 's/.$//' -- if the file is in unix format, that'll cut off the last character of the filename.
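To see the difference between the two substitutions, here is a rough Python sketch using the re module. The helper names strip_cr and strip_last_char and the sample ID are hypothetical:

```python
import re


def strip_cr(line: str) -> str:
    """Mirror of sed 's/\\r$//': remove a trailing carriage return, if any."""
    return re.sub(r"\r$", "", line)


def strip_last_char(line: str) -> str:
    """Mirror of sed 's/.$//': remove the last character, whatever it is."""
    return re.sub(r".$", "", line)


dos_value = "12345\r"   # value read from a DOS-format file
unix_value = "12345"    # same value read from a unix-format file

print(repr(strip_cr(dos_value)), repr(strip_cr(unix_value)))
print(repr(strip_last_char(dos_value)), repr(strip_last_char(unix_value)))
```

Only the first helper is safe on both inputs; the second one eats the final digit when there is no carriage return to remove.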

Question marks appear at the end of text file on renaming the text file using perl

Your $link contains a newline character and the question mark is just a placeholder for such a non-printable character.

Try chomp($link);
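As a rough analogue in Python (not Perl), the sketch below shows why a trailing newline can show up as "?" and how stripping it helps. The listing_view helper is a hypothetical, simplified model of how a directory listing might display the name, and rstrip("\r\n") is slightly broader than chomp because it removes every trailing CR or LF:

```python
def listing_view(name: str) -> str:
    """Replace each non-printable character with '?' (a rough model of a listing)."""
    return "".join(ch if ch.isprintable() else "?" for ch in name)


link = "report.txt\n"        # value still carrying its line terminator
print(listing_view(link))    # report.txt?
print(repr(link))            # repr makes the hidden character visible

clean = link.rstrip("\r\n")  # rough counterpart of chomp($link)
print(listing_view(clean))   # report.txt
```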

Trailing question marks in filename that is not showing up with echo

I ran :e ++ff=unix in Vim and saw that every line had a ^M at the end, so I removed them.

Why can't you use a question mark in a batch for loop?

It's because ? will be expanded into a list of filenames one character long. The "naked" for is using that list as a list of filenames.

If you run the following commands, you'll see this in action:

c:\> echo xx >a
c:\> echo xx >b
c:\> for %i in (1, ?) do echo %i
1
a
b
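The expansion can be modelled in a few lines of Python. This is only a simplified sketch of the behaviour described above, not how cmd is actually implemented; expand_for_set and the file names are made up for the example:

```python
from fnmatch import fnmatch


def expand_for_set(items, filenames):
    """Expand wildcard items against filenames; pass other items through as-is."""
    result = []
    for item in items:
        if "?" in item or "*" in item:
            # A wildcard item is replaced by every matching file name.
            result.extend(sorted(n for n in filenames if fnmatch(n, item)))
        else:
            result.append(item)
    return result


# Hypothetical directory contents: two one-character names and a longer one.
print(expand_for_set(["1", "?"], ["a", "b", "notes.txt"]))  # ['1', 'a', 'b']
```

The literal "?" never reaches the loop body, which is why the for command cannot be used to iterate over it as a plain string.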

If you look at Rob van der Woude's excellent scripting pages, you'll see that the for page has options for processing command output, numbers and files - it's not really suited for arbitrary strings.

One way to get around that is to provide your own for-like command as shown in the following example:

    @echo off
    setlocal enableextensions enabledelayedexpansion

    rem Call the callback function for each argument.
    set escapee=/
    call :doFor :processEach 1 2 ? 4 5
    echo.Escapee was %escapee%

    rem Execute simple command for each argument.
    call :doFor echo 6 7 ? 9 10

    endlocal
    goto :eof

:processEach
    set escapee=%escapee%%1/
    goto :eof

:doFor
    setlocal

    rem Get action.
    set cbAction=%1
    shift

:dfloop
    rem Process each argument with callback or command.
    if not "%1" == "" (
        call %cbAction% %1
        shift
        goto :dfloop
    )
    endlocal&&set escapee=%escapee%
    goto :eof

This provides a single function which can handle both callbacks and simple commands. For more complex commands, provide a callback function and it will get called with each argument in turn. The callback function can be arbitrarily complex but keep in mind that, because it's operating within a setlocal, changes to environment variables cannot escape back to the caller.

As a way around this, it allows one variable, escapee, to escape the scope - you could also add more if needed.

For simple commands (like echo) where you just need the argument placed at the end, you do the same thing. It doesn't need a callback function but it's restricted to very simple scenarios.

Also keep in mind that, although this seems like a lot of code, the vast majority of it only needs to exist in one place. To use it, you simply need a one-liner like the sample:

call :doFor echo my hovercraft is full of eels

Also keep in mind that there may be other characters that do not fare well, even with this scheme. It solves the ? issue but others may still cause problems. I suspect that this would be an ideal opportunity to add PowerShell to your CV, for example, a command that's almost bash-like in its elegance and zen-ness:

PShell> foreach ($item in @("1", "?", "3", "4")) { echo $item }
1
?
3
4

How do I filter lines in a text file that start with a capital letter and end with a positive integer with regex on the command line in linux?

You need to use

grep '^[A-Z].*[0-9]$'
grep '^[[:upper:]].*[0-9]$'

See the online demo. The regex matches:

^ - start of string
[A-Z] / [[:upper:]] - an uppercase letter
.* - any zero or more chars ([^0-9]* matches zero or more non-digit chars)
[0-9] - a digit.
$ - end of string.

Also, if you want to make sure there is no - before the number at the end of string, you need to use a negated bracket expression, like

grep -E '^[[:upper:]](.*[^-0-9])?[1-9][0-9]*$'

Here, the POSIX ERE regex (due to -E option) matches

^[[:upper:]] - an uppercase letter at the start and then
(.*[^-0-9])? - an optional occurrence of any text and then any char other than a digit and -
[1-9] - a non-zero digit
[0-9]* - zero or more digits
$ - end of string.
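If you want to try both patterns without grep, here is a small Python sketch. Python's re module does not support POSIX classes such as [[:upper:]], so [A-Z] is used instead, and the sample lines are invented for the example:

```python
import re

# Starts with an uppercase letter and ends with any digit.
ENDS_WITH_DIGIT = re.compile(r"^[A-Z].*[0-9]$")
# Starts with an uppercase letter and ends with a positive integer
# that is not preceded by a hyphen.
POSITIVE_INT = re.compile(r"^[A-Z](.*[^-0-9])?[1-9][0-9]*$")

lines = ["Apple 12", "banana 3", "Cherry -5", "Date 0", "Elder x7", "F9"]

loose = [line for line in lines if ENDS_WITH_DIGIT.search(line)]
strict = [line for line in lines if POSITIVE_INT.search(line)]

print(loose)   # ['Apple 12', 'Cherry -5', 'Date 0', 'Elder x7', 'F9']
print(strict)  # ['Apple 12', 'Elder x7', 'F9']
```

The stricter pattern drops "Cherry -5" because of the hyphen and "Date 0" because the number must start with a non-zero digit.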

FindFirst and question mark

The simplest solution for your specific need is to replace this:

ShowMessage(tsrMessage.Name);

with this

if length(tsrMessage.Name)=8 then ShowMessage(tsrMessage.Name);

this will ensure that the length of the file name is exactly four characters + the period + the extension. Like David says, there's no way to have the API do this kind of filtering, so you'll have to do it yourself, but in your particular case, there's no need to enumerate the entire directory. You may at least let the API do the filtering it can do, and then do your own filtering on top of it.

EDIT: If you need to ensure that the three characters following the "a" are digits, you can do it this way:

if (length(tsrMessage.Name)=8) and tsrMessage.Name[2].IsDigit and tsrMessage.Name[3].IsDigit and tsrMessage.Name[4].IsDigit then ShowMessage(tsrMessage.Name);

provided you are using a modern compiler (you'll need to include the "Character" unit). Also take note that if you are compiling a mobile version, you'll need to use indexes [1], [2] and [3] instead, as string indexes start at 0 there.

If you are using an older version, you can do it like this:

function IsDigit(c : char) : boolean;
  begin
    Result:=(c>='0') and (c<='9')
  end;

if (length(tsrMessage.Name)=8) and IsDigit(tsrMessage.Name[2]) and IsDigit(tsrMessage.Name[3]) and IsDigit(tsrMessage.Name[4]) then ShowMessage(tsrMessage.Name);
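The same post-filtering idea, sketched in Python purely for illustration. The function name and the sample file names are assumptions; note that Python strings are indexed from 0, so the three characters after the first one sit at positions 1 to 3:

```python
def is_letter_plus_three_digits(name: str) -> bool:
    """True if name is exactly 8 characters and characters 2-4 are ASCII digits."""
    return len(name) == 8 and all(ch in "0123456789" for ch in name[1:4])


# Hypothetical names that a wildcard mask with four ? characters might return.
candidates = ["a123.txt", "a12b.txt", "abcd.txt", "a1234.txt"]
print([n for n in candidates if is_letter_plus_three_digits(n)])  # ['a123.txt']
```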

Turn a string into a valid filename?

You can look at the Django framework (but take their licence into account!) for how they create a "slug" from arbitrary text. A slug is URL- and filename-friendly.

The Django text utils define a function, slugify(), that's probably the gold standard for this kind of thing. Essentially, their code is the following.

import unicodedata
import re

def slugify(value, allow_unicode=False):
    """
    Taken from https://github.com/django/django/blob/master/django/utils/text.py
    Convert to ASCII if 'allow_unicode' is False. Convert spaces or repeated
    dashes to single dashes. Remove characters that aren't alphanumerics,
    underscores, or hyphens. Convert to lowercase. Also strip leading and
    trailing whitespace, dashes, and underscores.
    """
    value = str(value)
    if allow_unicode:
        value = unicodedata.normalize('NFKC', value)
    else:
        value = unicodedata.normalize('NFKD', value).encode('ascii', 'ignore').decode('ascii')
    value = re.sub(r'[^\w\s-]', '', value.lower())
    return re.sub(r'[-\s]+', '-', value).strip('-_')

And the older version:

def slugify(value):
    """
    Normalizes string, converts to lowercase, removes non-alpha characters,
    and converts spaces to hyphens.
    """
    import unicodedata
    value = unicodedata.normalize('NFKD', value).encode('ascii', 'ignore')
    value = unicode(re.sub('[^\w\s-]', '', value).strip().lower())
    value = unicode(re.sub('[-\s]+', '-', value))
    # ...
    return value

There's more, but I left it out, since it doesn't address slugification, but escaping.
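As a usage sketch, the newer slugify() can be wrapped in a small helper that builds a file name. The helper to_filename, its "untitled" fallback and the sample titles are hypothetical additions for this example; the slugify body is the non-unicode path of the function quoted above:

```python
import re
import unicodedata


def slugify(value):
    """ASCII-only path of the slugify() shown above (allow_unicode=False)."""
    value = unicodedata.normalize("NFKD", str(value))
    value = value.encode("ascii", "ignore").decode("ascii")
    value = re.sub(r"[^\w\s-]", "", value.lower())
    return re.sub(r"[-\s]+", "-", value).strip("-_")


def to_filename(title, ext="txt", fallback="untitled"):
    """Hypothetical helper: slugify the title and append an extension."""
    # Fall back to a fixed stem when slugify removes every character.
    stem = slugify(title) or fallback
    return f"{stem}.{ext}"


print(to_filename("My Report: Q3/2024?"))  # my-report-q32024.txt
print(to_filename("???"))                  # untitled.txt
```

The fallback matters because a title made only of punctuation slugifies to an empty string, which would otherwise produce a file called ".txt".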
