Piping Text To An External Program Appends A Trailing Newline
tl;dr:
When PowerShell pipes a string to an external program:
- It encodes it using the character encoding stored in the $OutputEncoding preference variable.
- It invariably appends a trailing (platform-appropriate) newline.
Therefore, the key is to avoid PowerShell's pipeline in favor of the native shell's, so as to prevent implicit addition of a trailing newline:
- If you're running your command on a Unix-like platform (using PowerShell Core):
sh -c "printf %s 'string' | openssl dgst -sha256 -hmac authcode"
printf %s is the portable alternative to echo -n. If the string contains ' chars., replace each one with '\'' (sh has no escape sequences inside single quotes) or use `"...`" quoting instead.
- In case you need to do this on Windows via cmd.exe, things get even trickier, because cmd.exe doesn't directly support echoing without a trailing newline:
cmd /c "<NUL set /p =`"string`"| openssl dgst -sha256 -hmac authcode"
Note that there must be no space before | for this to work. For an explanation and the limitations of this solution, see this answer.
Encoding issues would only arise if the string contained non-ASCII characters and you're running in Windows PowerShell; in that event, first set $OutputEncoding to the encoding that the target utility expects, typically UTF-8:
$OutputEncoding = [Text.Utf8Encoding]::new()
PowerShell, as of Windows PowerShell v5.1 / PowerShell (Core) v7.2, invariably appends a trailing newline when you send a string without one via the pipeline to an external utility, which is the reason for the difference you're observing (that trailing newline will be a LF only on Unix platforms, and a CRLF sequence on Windows).
- You can keep track of efforts to address this problem in GitHub issue #5974, opened by the OP.
Additionally, PowerShell's pipeline is invariably text-based when it comes to piping data to external programs; the internally UTF-16LE-based PowerShell (.NET) strings are transcoded based on the encoding stored in the $OutputEncoding preference variable, which defaults to ASCII-only encoding in Windows PowerShell, and to UTF-8 encoding in PowerShell Core (both on Windows and on Unix-like platforms).
- In PowerShell Core, a change is being discussed for piping raw byte streams between external programs.
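The transcoding and the appended newline can be previewed without launching an external program. The following is a minimal sketch, not PowerShell's actual implementation: Get-PipedBytePreview is a hypothetical helper that only emulates the behavior described above by appending [Environment]::NewLine and encoding the result.

```powershell
# Hypothetical helper (not part of PowerShell): emulates the described
# behavior of encoding a piped string and appending a trailing newline.
function Get-PipedBytePreview {
    param(
        [Parameter(Mandatory)] [string] $Text,
        [System.Text.Encoding] $Encoding = $OutputEncoding
    )
    # Append the platform-appropriate newline only if none is present.
    if (-not $Text.EndsWith("`n")) { $Text += [Environment]::NewLine }
    # Return the encoded bytes as space-separated hex pairs.
    ($Encoding.GetBytes($Text) | ForEach-Object { $_.ToString('X2') }) -join ' '
}

# U+00E9 (e with acute accent), built from its code point so that the
# example does not depend on the encoding of the script file itself.
$eAcute = [string][char]0xE9

# ASCII cannot represent the character, so it becomes '?' (3F).
Get-PipedBytePreview $eAcute ([System.Text.Encoding]::ASCII)

# BOM-less UTF-8 encodes it as the two bytes C3 A9.
Get-PipedBytePreview $eAcute ([System.Text.UTF8Encoding]::new($false))
```

In both cases the hex output ends in 0A on Unix-like platforms and in 0D 0A on Windows, which is the newline the target program would see on its stdin.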
The fact that echo -n in PowerShell does not produce a string without a trailing newline is therefore incidental to your problem; for the sake of completeness, here's an explanation:
- echo is an alias for PowerShell's Write-Output cmdlet, which - in the context of piping to external programs - writes text to the standard input of the program in the next pipeline segment (similar to Bash / cmd.exe's echo).
- -n is interpreted as an (unambiguous) abbreviation for Write-Output's -NoEnumerate switch.
- -NoEnumerate only applies when writing multiple objects, so it has no effect here.
- Therefore, in short: in PowerShell, echo -n "string" is the same as Write-Output -NoEnumerate "string", which - because only a single string is output - is the same as Write-Output "string", which, in turn, is the same as just using "string", relying on PowerShell's implicit output behavior.
- Write-Output has no option to suppress a trailing newline, and even if it did, using a pipeline to pipe to an external program would add it back in.
Powershell Pipeline data to external console application
EDIT: As @mklement0 pointed out, this is different in PowerShell Core.
In PowerShell 5.1 (and lower), I think you would have to manually write each pipeline item to the external application's input stream.
Here's an attempt to build a function for that:
function Invoke-Pipeline {
    [CmdletBinding()]
    param (
        [Parameter(Mandatory, Position = 0)]
        [string]$FileName,
        [Parameter(Position = 1)]
        [string[]]$ArgumentList,
        [int]$TimeoutMilliseconds = -1,
        [Parameter(ValueFromPipeline)]
        $InputObject
    )
    begin {
        # Start the external program with stdin and stdout redirected.
        $process = [System.Diagnostics.Process]::Start((New-Object System.Diagnostics.ProcessStartInfo -Property @{
            FileName = $FileName
            Arguments = $ArgumentList -join ' '
            UseShellExecute = $false
            RedirectStandardInput = $true
            RedirectStandardOutput = $true
        }))
        # Collect output lines asynchronously in a thread-safe queue.
        $output = [System.Collections.Concurrent.ConcurrentQueue[string]]::new()
        $subscription = Register-ObjectEvent -InputObject $process -EventName 'OutputDataReceived' -Action {
            # A $null Data value signals end of stream; skip it.
            if ($null -ne $EventArgs.Data) {
                $Event.MessageData.Enqueue($EventArgs.Data)
            }
        } -MessageData $output
        $process.BeginOutputReadLine()
    }
    process {
        # Send the current pipeline item, then wait for at least one output line.
        $process.StandardInput.WriteLine($InputObject)
        [string]$line = ""
        while (-not ($output.TryDequeue([ref]$line))) {
            Start-Sleep -Milliseconds 1
        }
        # Emit every line that has arrived so far.
        do {
            $line
        } while ($output.TryDequeue([ref]$line))
    }
    end {
        # Close stdin so that the program sees end-of-input and can exit.
        $process.StandardInput.Close()
        if ($TimeoutMilliseconds -lt 0) {
            # The parameterless overload returns nothing, so record success explicitly.
            $process.WaitForExit()
            $exited = $true
        }
        else {
            $exited = $process.WaitForExit($TimeoutMilliseconds)
        }
        if ($exited) {
            $process.Close()
        }
        else {
            try {$process.Kill()} catch {}
        }
    }
}
Run-Commands | Invoke-Pipeline netapp.exe "-connect QQQQ -U user -P password"
The problem is that there is no perfect solution, because, by definition, you cannot know when the external program will write something to its output stream, or how much.
Note: This function doesn't redirect the error stream. The approach would be the same though.
Different behaviour and output when piping in CMD and PowerShell
tl;dr:
Up to at least PowerShell 7.2.x, if you need raw byte handling and/or need to prevent PowerShell from situationally adding a trailing newline to your text data, avoid the PowerShell pipeline altogether.
- Future support for passing raw byte data between external programs and to-file redirections is the subject of GitHub issue #1908.
For raw byte handling, shell out to cmd with /c (on Windows; on Unix-like platforms / Unix-like Windows subsystems, use sh or bash with -c):
cmd /c 'type .\test.txt | .\Crypt.exe --encrypt | .\Crypt.exe --decrypt'
Use a similar technique to save raw byte output in a file - do not use PowerShell's > operator:
cmd /c 'someexe > file.bin'
Note that if you want to capture an external program's text output in a PowerShell variable or process it further in a PowerShell pipeline, you need to make sure that [Console]::OutputEncoding matches your program's output character encoding (the active OEM code page, typically), which should be true by default in this case; see the next section for details.
Generally, however, byte manipulation of text data is best avoided.
There are two separate problems, only one of which has a simple solution:
Problem 1: There is indeed a character encoding problem, as you suspected:
PowerShell invisibly inserts itself as an intermediary in pipelines, even when sending data to and receiving data from external programs: It converts data from and to .NET strings (System.String), which are sequences of UTF-16 code units.
- As an aside: Even when using only PowerShell-native commands, this means that reading input from files and saving them again can result in a different character encoding, because the information about the original character encoding is not preserved once (string) data has been read into memory, and on saving it is the cmdlets' default character encoding that is used; while this default encoding is consistently BOM-less UTF-8 in PowerShell (Core) 6+, it varies by cmdlet in Windows PowerShell - see this answer.
In order to send to and receive data from external programs (such as Crypt.exe in your case), you need to match their character encoding; in your case, with a Windows console application that uses raw byte handling, the implied encoding is the system's active OEM code page.
- On sending data, PowerShell uses the encoding of the $OutputEncoding preference variable to encode (what is invariably treated as text) data, which defaults to ASCII(!) in Windows PowerShell, and (BOM-less) UTF-8 in PowerShell (Core).
- The receiving end is covered by default: PowerShell uses [Console]::OutputEncoding (which itself reflects the code page reported by chcp) for decoding data received, and on Windows this by default reflects the active OEM code page, both in Windows PowerShell and PowerShell (Core) [1].
To fix your primary problem, you therefore need to set $OutputEncoding to the active OEM code page:
# Make sure that PowerShell uses the OEM code page when sending
# data to `.\Crypt.exe`
$OutputEncoding = [Console]::OutputEncoding
Problem 2: PowerShell invariably appends a trailing newline to data that doesn't already have one when piping data to external programs:
That is, "foo" | .\Crypt.exe doesn't send (the $OutputEncoding-encoded bytes representing) "foo" to .\Crypt.exe's stdin, it sends "foo`r`n" on Windows; i.e., a (platform-appropriate) newline sequence (CRLF on Windows) is automatically and invariably appended (unless the string already happens to have a trailing newline).
This problematic behavior is discussed in GitHub issue #5974 and also in this answer.
In your specific case, the implicitly appended "`r`n" is also subject to the byte-value-shifting, which means that the 1st Crypt.exe call transforms it to -*, causing another "`r`n" to be appended when the data is sent to the 2nd Crypt.exe call.
The net result is an extra newline that is round-tripped (the intermediate -*), plus an encrypted newline (which results in φΩ).
In short: If your input data had no trailing newline, you'll have to cut off the last 4 characters from the result (representing the round-tripped and the inadvertently encrypted newline sequences):
# Ensure that data sent to .\Crypt.exe is encoded with the OEM code page.
$OutputEncoding = [Console]::OutputEncoding
# Invoke the command and capture its output in variable $result.
# Note the use of the `Get-Content` cmdlet; in PowerShell, `type`
# is simply a built-in *alias* for it.
$result = Get-Content .\test.txt | .\Crypt.exe --decrypt | .\Crypt.exe --encrypt
# Remove the last 4 chars. and print the result.
$result.Substring(0, $result.Length - 4)
Given that calling cmd /c as shown at the top of the answer works too, that hardly seems worth it.
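If shelling out is not an option, another way to avoid the appended newline is to bypass the pipeline and write the encoded bytes to the program's stdin yourself via System.Diagnostics.Process. The following is a minimal sketch under stated assumptions: Invoke-WithExactStdin is a hypothetical helper name, the program's stdout is decoded with .NET's default for redirected output, and the whole input is written before any output is read, so very large inputs to a program that streams its output could block.

```powershell
# Hypothetical helper: sends exactly the encoded bytes of $InputText
# (no trailing newline added) to the stdin of an external program.
function Invoke-WithExactStdin {
    param(
        [Parameter(Mandatory)] [string] $FilePath,
        [Parameter(Mandatory)] [string] $InputText,
        [string] $Arguments = '',
        [System.Text.Encoding] $Encoding = [System.Text.UTF8Encoding]::new($false)
    )
    $startInfo = New-Object System.Diagnostics.ProcessStartInfo
    $startInfo.FileName = $FilePath
    $startInfo.Arguments = $Arguments
    $startInfo.UseShellExecute = $false
    $startInfo.RedirectStandardInput = $true
    $startInfo.RedirectStandardOutput = $true
    $process = [System.Diagnostics.Process]::Start($startInfo)
    try {
        # Write the raw bytes to the underlying stream, then close stdin
        # so that the program sees end-of-input.
        $bytes = $Encoding.GetBytes($InputText)
        $process.StandardInput.BaseStream.Write($bytes, 0, $bytes.Length)
        $process.StandardInput.Close()
        # Emit the program's complete stdout as a single string.
        $process.StandardOutput.ReadToEnd()
        $process.WaitForExit()
    }
    finally {
        $process.Dispose()
    }
}
```

With the openssl command used earlier on this page, a call would look like Invoke-WithExactStdin -FilePath openssl -Arguments 'dgst -sha256 -hmac authcode' -InputText 'string'.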
How PowerShell handles pipeline data with external programs:
Unlike cmd (or POSIX-like shells such as bash):
- PowerShell doesn't support raw byte data in pipelines.[2]
- When talking to external programs, it only knows text (whereas it passes .NET objects when talking to PowerShell's own commands, which is where much of its power comes from).
Specifically, this works as follows:
When you send data to an external program via the pipeline (to its stdin stream):
- It is converted to text (strings) using the character encoding specified in the $OutputEncoding preference variable, which defaults to ASCII(!) in Windows PowerShell, and (BOM-less) UTF-8 in PowerShell (Core).
- Caveat: If you assign an encoding with a BOM to $OutputEncoding, PowerShell (as of v7.0) will emit the BOM as part of the first line of output sent to an external program; therefore, for instance, do not use [System.Text.Encoding]::Utf8 (which emits a BOM) in Windows PowerShell, and use [System.Text.Utf8Encoding]::new($false) (which doesn't) instead.
- If the data is not captured or redirected by PowerShell, encoding problems may not always become apparent, namely if an external program is implemented in a way that uses the Windows Unicode console API to print to the display.
Something that isn't already text (a string) is stringified using PowerShell's default output formatting (the same format you see when you print to the console), with an important caveat:
- If the (last) input object already is a string that doesn't itself have a trailing newline, one is invariably appended (and even an existing trailing newline is replaced with the platform-native one, if different).
- This behavior can cause problems, as discussed in GitHub issue #5974 and also in this answer.
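The BOM caveat mentioned above can be verified by inspecting each encoding's preamble, which is the byte sequence it would emit as a BOM. A small illustrative sketch (the variable names are made up for the example):

```powershell
$withBom    = [System.Text.Encoding]::UTF8
$withoutBom = [System.Text.UTF8Encoding]::new($false)

# The preamble is the BOM an encoding emits: EF BB BF for UTF-8.
$withBom.GetPreamble().Length       # 3
$withoutBom.GetPreamble().Length    # 0

# The encoded text itself is identical; only the BOM differs.
($withBom.GetBytes('foo') -join ' ') -eq ($withoutBom.GetBytes('foo') -join ' ')   # True
```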
When you capture / redirect data from an external program (from its stdout stream), it is invariably decoded as lines of text (strings), based on the encoding specified in [Console]::OutputEncoding, which defaults to the active OEM code page on Windows (surprisingly, in both PowerShell editions, as of v7.0-preview6 [1]).
PowerShell-internally text is represented using the .NET System.String type, which is based on UTF-16 code units (often loosely, but incorrectly called "Unicode" [3]).
The above also applies:
- when piping data between external programs,
- when data is redirected to a file; that is, irrespective of the source of the data and its original character encoding, PowerShell uses its default encoding(s) when sending data to files; in Windows PowerShell, > produces UTF-16LE-encoded files (with BOM), whereas PowerShell (Core) sensibly defaults to BOM-less UTF-8 (consistently, across file-writing cmdlets).
[1] In PowerShell (Core), given that $OutputEncoding commendably already defaults to UTF-8, it would make sense to have [Console]::OutputEncoding be the same - i.e., for the active code page to be effectively 65001 on Windows, as suggested in GitHub issue #7233.
[2] With input from a file, the closest you can get to raw byte handling is to read the file as a .NET System.Byte array with Get-Content -AsByteStream (PowerShell (Core)) / Get-Content -Encoding Byte (Windows PowerShell), but the only way you can further process such an array is to pipe it to a PowerShell command that is designed to handle a byte array, or to pass it to a .NET type's method that expects a byte array. If you tried to send such an array to an external program via the pipeline, each byte would be sent as its decimal string representation on its own line.
[3] Unicode is the name of the abstract standard describing a "global alphabet". In concrete use, it has various standard encodings, UTF-8 and UTF-16 being the most widely used.
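To illustrate footnote [2]: the sketch below emulates, inside PowerShell only, how the elements of a byte array are stringified one by one, which is roughly what an external program would receive line by line. It does not call an external program, and the variable names are made up for the example.

```powershell
# The ASCII bytes for "Hi".
[byte[]] $bytes = 0x48, 0x69

# Each byte is stringified individually as a decimal number,
# yielding the two separate strings "72" and "105".
$lines = $bytes | ForEach-Object { $_.ToString() }
$lines

# By contrast, decoding the array as a whole yields the intended text.
[System.Text.Encoding]::ASCII.GetString($bytes)    # Hi
```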
PowerShell's pipe adds linefeed
Introduction
Here is my Invoke-RawPipeline function (get latest version from this Gist).
Use it to pipe binary data between processes' Standard Output and Standard Input streams. It can read input stream from file/pipeline and save resulting output stream to file.
It requires PsAsync module to be able to launch and pipe data in multiple processes.
In case of issues, use the -Verbose switch to see debug output.
Examples
Redirecting to file
- Batch:
findstr.exe /C:"Warning" /I C:\Windows\WindowsUpdate.log > C:\WU_Warnings.txt
- PowerShell:
Invoke-RawPipeline -Command @{Path = 'findstr.exe' ; Arguments = '/C:"Warning" /I C:\Windows\WindowsUpdate.log'} -OutFile 'C:\WU_Warnings.txt'
Redirecting from file
- Batch:
svnadmin load < C:\RepoDumps\MyRepo.dump
- PowerShell:
Invoke-RawPipeline -InFile 'C:\RepoDumps\MyRepo.dump' -Command @{Path = 'svnadmin.exe' ; Arguments = 'load'}
Piping strings
- Batch:
echo TestString | find /I "test" > C:\SearchResult.log
- PowerShell:
'TestString' | Invoke-RawPipeline -Command @{Path = 'find.exe' ; Arguments = '/I "test"'} -OutFile 'C:\SearchResult.log'
Piping between multiple processes
- Batch:
ipconfig | findstr /C:"IPv4 Address" /I
- PowerShell:
Invoke-RawPipeline -Command @{Path = 'ipconfig'}, @{Path = 'findstr' ; Arguments = '/C:"IPv4 Address" /I'} -RawData
Code:
<#
.Synopsis
Pipe binary data between processes' Standard Output and Standard Input streams.
Can read input stream from file and save resulting output stream to file.
.Description
Pipe binary data between processes' Standard Output and Standard Input streams.
Can read input stream from file/pipeline and save resulting output stream to file.
Requires PsAsync module: http://psasync.codeplex.com
.Notes
Author: beatcracker (https://beatcracker.wordpress.com, https://github.com/beatcracker)
License: Microsoft Public License (http://opensource.org/licenses/MS-PL)
.Component
Requires PsAsync module: http://psasync.codeplex.com
.Parameter Command
An array of hashtables, each containing Command Name, Working Directory and Arguments
.Parameter InFile
This parameter is optional.
A string representing path to file, to read input stream from.
.Parameter OutFile
This parameter is optional.
A string representing path to file, to save resulting output stream to.
.Parameter Append
This parameter is optional. Default is false.
A switch controlling whether to overwrite or append to the output file if it already exists. Default is to overwrite.
.Parameter IoTimeout
This parameter is optional. Default is 0.
A number of seconds to wait if Input/Output streams are blocked. Default is to wait indefinitely.
.Parameter ProcessTimeout
This parameter is optional. Default is 0.
A number of seconds to wait for process to exit after finishing all pipeline operations. Default is to wait indefinitely.
Details: https://msdn.microsoft.com/en-us/library/ty0d8k56.aspx
.Parameter BufferSize
This parameter is optional. Default is 4096.
Size of buffer in bytes for read\write operations. Supports standard Powershell multipliers: KB, MB, GB, TB, and PB.
Total number of buffers is: Command.Count * 2 + InFile + OutFile.
.Parameter ForceGC
This parameter is optional.
A switch, that if specified will force .Net garbage collection.
Use to immediately release memory on function exit, if large buffer size was used.
.Parameter RawData
This parameter is optional.
By default function returns object with StdOut/StdErr streams and process' exit codes.
If this switch is specified, function will return raw Standard Output stream.
.Example
Invoke-RawPipeline -Command @{Path = 'findstr.exe' ; Arguments = '/C:"Warning" /I C:\Windows\WindowsUpdate.log'} -OutFile 'C:\WU_Warnings.txt'
Batch analog: findstr.exe /C:"Warning" /I C:\Windows\WindowsUpdate.log > C:\WU_Warnings.txt
.Example
Invoke-RawPipeline -Command @{Path = 'findstr.exe' ; WorkingDirectory = 'C:\Windows' ; Arguments = '/C:"Warning" /I .\WindowsUpdate.log'} -RawData
Batch analog: cd /D C:\Windows && findstr.exe /C:"Warning" /I .\WindowsUpdate.log
.Example
'TestString' | Invoke-RawPipeline -Command @{Path = 'find.exe' ; Arguments = '/I "test"'} -OutFile 'C:\SearchResult.log'
Batch analog: echo TestString | find /I "test" > C:\SearchResult.log
.Example
Invoke-RawPipeline -Command @{Path = 'ipconfig'}, @{Path = 'findstr' ; Arguments = '/C:"IPv4 Address" /I'} -RawData
Batch analog: ipconfig | findstr /C:"IPv4 Address" /I
.Example
Invoke-RawPipeline -InFile 'C:\RepoDumps\MyRepo.dump' -Command @{Path = 'svnadmin.exe' ; Arguments = 'load'}
Batch analog: svnadmin load < C:\RepoDumps\MyRepo.dump
#>
function Invoke-RawPipeline
{
[CmdletBinding()]
Param
(
[Parameter(ValueFromPipeline = $true)]
[ValidateNotNullOrEmpty()]
[ValidateScript({
if($_.psobject.Methods.Match('ToString'))
{
$true
}
else
{
throw 'Can''t convert pipeline object to string!'
}
})]
$InVariable,
[Parameter(ValueFromPipelineByPropertyName = $true)]
[ValidateScript({
$_ | ForEach-Object {
$Path = $_.Path
$WorkingDirectory = $_.WorkingDirectory
if(!(Get-Command -Name $Path -CommandType Application -ErrorAction SilentlyContinue))
{
throw "Command not found: $Path"
}
if($WorkingDirectory)
{
if(!(Test-Path -LiteralPath $WorkingDirectory -PathType Container -ErrorAction SilentlyContinue))
{
throw "Working directory not found: $WorkingDirectory"
}
}
}
$true
})]
[ValidateNotNullOrEmpty()]
[array]$Command,
[Parameter(ValueFromPipelineByPropertyName = $true)]
[ValidateScript({
if(!(Test-Path -LiteralPath $_))
{
throw "File not found: $_"
}
$true
})]
[ValidateNotNullOrEmpty()]
[string]$InFile,
[Parameter(ValueFromPipelineByPropertyName = $true)]
[ValidateScript({
if(!(Test-Path -LiteralPath (Split-Path $_)))
{
throw "Folder not found: $_"
}
$true
})]
[ValidateNotNullOrEmpty()]
[string]$OutFile,
[Parameter(ValueFromPipelineByPropertyName = $true)]
[switch]$Append,
[Parameter(ValueFromPipelineByPropertyName = $true)]
[ValidateRange(0, 2147483)]
[int]$IoTimeout = 0,
[Parameter(ValueFromPipelineByPropertyName = $true)]
[ValidateRange(0, 2147483)]
[int]$ProcessTimeout = 0,
[Parameter(ValueFromPipelineByPropertyName = $true)]
[long]$BufferSize = 4096,
[Parameter(ValueFromPipelineByPropertyName = $true)]
[switch]$RawData,
[Parameter(ValueFromPipelineByPropertyName = $true)]
[switch]$ForceGC
)
Begin
{
$Modules = @{PsAsync = 'http://psasync.codeplex.com'}
'Loading modules:', ($Modules | Format-Table -HideTableHeaders -AutoSize | Out-String) | Write-Verbose
foreach($module in $Modules.GetEnumerator())
{
if(!(Get-Module -Name $module.Key))
{
Try
{
Import-Module -Name $module.Key -ErrorAction Stop
}
Catch
{
throw "$($module.Key) module not available. Get it here: $($module.Value)"
}
}
}
function New-ConsoleProcess
{
Param
(
[string]$Path,
[string]$Arguments,
[string]$WorkingDirectory,
[switch]$CreateNoWindow = $true,
[switch]$RedirectStdIn = $true,
[switch]$RedirectStdOut = $true,
[switch]$RedirectStdErr = $true
)
if(!$WorkingDirectory)
{
if(!$script:MyInvocation.MyCommand.Path)
{
$WorkingDirectory = [System.AppDomain]::CurrentDomain.BaseDirectory
}
else
{
$WorkingDirectory = Split-Path $script:MyInvocation.MyCommand.Path
}
}
Try
{
$ps = New-Object -TypeName System.Diagnostics.Process -ErrorAction Stop
$ps.StartInfo.Filename = $Path
$ps.StartInfo.Arguments = $Arguments
$ps.StartInfo.UseShellExecute = $false
$ps.StartInfo.RedirectStandardInput = $RedirectStdIn
$ps.StartInfo.RedirectStandardOutput = $RedirectStdOut
$ps.StartInfo.RedirectStandardError = $RedirectStdErr
$ps.StartInfo.CreateNoWindow = $CreateNoWindow
$ps.StartInfo.WorkingDirectory = $WorkingDirectory
}
Catch
{
throw $_
}
return $ps
}
function Invoke-GarbageCollection
{
[gc]::Collect()
[gc]::WaitForPendingFinalizers()
}
$CleanUp = {
$IoWorkers + $StdErrWorkers |
ForEach-Object {
$_.Src, $_.Dst |
ForEach-Object {
if(!($_ -is [System.Diagnostics.Process]))
{
Try
{
$_.Close()
}
Catch
{
Write-Error "Failed to close $_"
}
$_.Dispose()
}
}
}
}
$PumpData = {
Param
(
[hashtable]$Cfg
)
# Fail hard, we don't want stuck threads
$Private:ErrorActionPreference = 'Stop'
$Src = $Cfg.Src
$SrcEndpoint = $Cfg.SrcEndpoint
$Dst = $Cfg.Dst
$DstEndpoint = $Cfg.DstEndpoint
$BufferSize = $Cfg.BufferSize
$SyncHash = $Cfg.SyncHash
$RunspaceId = $Cfg.Id
# Setup Input and Output streams
if($Src -is [System.Diagnostics.Process])
{
switch ($SrcEndpoint)
{
'StdOut' {$InStream = $Src.StandardOutput.BaseStream}
'StdIn' {$InStream = $Src.StandardInput.BaseStream}
'StdErr' {$InStream = $Src.StandardError.BaseStream}
default {throw "Not valid source endpoint: $_"}
}
}
else
{
$InStream = $Src
}
if($Dst -is [System.Diagnostics.Process])
{
switch ($DstEndpoint)
{
'StdOut' {$OutStream = $Dst.StandardOutput.BaseStream}
'StdIn' {$OutStream = $Dst.StandardInput.BaseStream}
'StdErr' {$OutStream = $Dst.StandardError.BaseStream}
default {throw "Not valid destination endpoint: $_"}
}
}
else
{
$OutStream = $Dst
}
$InStream | Out-String | ForEach-Object {$SyncHash.$RunspaceId.Status += "InStream: $_"}
$OutStream | Out-String | ForEach-Object {$SyncHash.$RunspaceId.Status += "OutStream: $_"}
# Main data copy loop
$Buffer = New-Object -TypeName byte[] $BufferSize
$BytesThru = 0
Try
{
Do
{
$SyncHash.$RunspaceId.IoStartTime = [DateTime]::UtcNow.Ticks
$ReadCount = $InStream.Read($Buffer, 0, $Buffer.Length)
$OutStream.Write($Buffer, 0, $ReadCount)
$OutStream.Flush()
$BytesThru += $ReadCount
}
While($readCount -gt 0)
}
Catch
{
$SyncHash.$RunspaceId.Status += $_
}
Finally
{
$OutStream.Close()
$InStream.Close()
}
}
}
Process