The program inv.exe returns some console data based on parameters. It seems like a JSON/dictionary, but it's in text format (printed output). It works when I simply call it without trying to capture the output.
.\inv.exe getter segments
{28: 'Renda Fixa', 29: 'Renda Variável', ...
However, if I try to capture it, it doesn't work:
$segmentsjson = .\inv.exe getter segments
$segmentsjson
{28: 'Renda Fixa', 29: 'Renda Vari�vel'....
$segmentsjson = .\inv.exe getter segments | ConvertFrom-Json
$segmentsjson
{"28": "Renda Fixa", "29": "Renda Variável"...
What I tried:
1. chcp 65001
2. $OutputEncoding = [System.Text.Encoding]::UTF8
3. [Console]::OutputEncoding = [System.Text.Encoding]::UTF8
4. $OutputEncoding = [Console]::InputEncoding = [Console]::OutputEncoding = New-Object System.Text.UTF8Encoding
5. .\inv.exe getter segments > test.txt
$segmentsjson = Get-Content "test.txt" -Encoding UTF8
6. .\inv.exe getter segments | Out-File -FilePath "output_temp.txt" -Encoding UTF-8
7. cmd /c inv.exe getter segments > test.txt
Asked Nov 19, 2024 at 14:10 by LegsNotHand
1 Answer
Character-encoding problems may only surface if external-program output is captured or redirected in PowerShell on Windows, because some CLIs, including high-profile ones such as python.exe and node.exe, use the Unicode version of the WriteConsole WinAPI function when printing to the console, where all characters print as intended.[1]
PowerShell indeed uses [Console]::OutputEncoding when decoding external-program output into .NET strings (System.String, or [string] in PowerShell terms), which internally use a Unicode encoding composed of in-memory UTF-16 code units (System.Char, or [char]).
If [Console]::OutputEncoding = [System.Text.Encoding]::UTF8 doesn't help, the implication is that inv.exe's output isn't UTF-8.
Thus, you must (temporarily) set [Console]::OutputEncoding to match the actual character encoding inv.exe uses, which looks to be the system locale's active legacy ANSI encoding, presumably Windows-1252.
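The mismatch described above can be reproduced without inv.exe. The following is a minimal sketch, assuming the program emits Windows-1252 bytes (code page 1252 is an assumption here, as in the answer); it shows why decoding those bytes as UTF-8 produces the replacement character seen in the question:

```powershell
# Build the sample text without a literal non-ASCII character, so the
# snippet does not depend on the encoding of the script file itself.
$text = 'Renda Vari' + [char]0xE1 + 'vel'   # U+00E1 is 'á'

# Bytes an ANSI (Windows-1252) program would write: 'á' is the single byte 0xE1.
$cp1252 = [Text.Encoding]::GetEncoding(1252)
$bytes  = $cp1252.GetBytes($text)

# Decoding as UTF-8: a lone 0xE1 is not a valid UTF-8 sequence, so the
# decoder substitutes U+FFFD (the replacement character).
$asUtf8 = [Text.Encoding]::UTF8.GetString($bytes)

# Decoding with the encoding that was actually used round-trips the text.
$asAnsi = $cp1252.GetString($bytes)
```

Here $asUtf8 contains U+FFFD in place of the accented letter, while $asAnsi equals the original text, which is the situation [Console]::OutputEncoding has to account for.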
The following code temporarily sets [Console]::OutputEncoding to that of the active ANSI code page, calls .\inv.exe, then restores the original encoding:
$segmentsjson =
  & {
    $prevEnc = [Console]::OutputEncoding
    # Set [Console]::OutputEncoding to that of the system's active ANSI code page.
    [Console]::OutputEncoding =
      if ($IsCoreCLR) { [Text.Encoding]::GetEncoding([int] (Get-ItemPropertyValue registry::HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Nls\CodePage ACP)) }
      else { [Text.Encoding]::Default }
    .\inv.exe getter segments
    [Console]::OutputEncoding = $prevEnc
  }
Note:
Obtaining the encoding for the active ANSI code page is simpler in Windows PowerShell (the legacy, ships-with-Windows, Windows-only edition of PowerShell whose latest and last version is 5.1), via [Text.Encoding]::Default, than in PowerShell (Core) 7 (detected via $IsCoreCLR being $true), where [Text.Encoding]::Default reports UTF-8 and the registry must be consulted for the ANSI code page number.
For additional information and helper functions, see this answer; with the Invoke-WithEncoding helper function from the linked answer, the solution would be as simple as:
$segmentsJson = Invoke-WithEncoding -Encoding ANSI { .\inv.exe getter segments }
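For readers who do not have the linked helper at hand, the save/set/restore pattern can be wrapped in a small function. The sketch below is not the Invoke-WithEncoding function from the linked answer; Invoke-WithConsoleEncoding is a hypothetical name, and the try/finally block is an addition so that the original encoding is restored even if the script block throws:

```powershell
# Hypothetical helper (not from the linked answer): runs a script block
# with [Console]::OutputEncoding temporarily set to the given encoding.
function Invoke-WithConsoleEncoding {
    param(
        [Parameter(Mandatory)] [Text.Encoding] $Encoding,
        [Parameter(Mandatory)] [scriptblock]   $ScriptBlock
    )
    $prevEnc = [Console]::OutputEncoding
    try {
        [Console]::OutputEncoding = $Encoding
        & $ScriptBlock          # output flows through to the caller
    }
    finally {
        # Runs even when the script block fails, so the session is not
        # left with a changed console encoding.
        [Console]::OutputEncoding = $prevEnc
    }
}

# Usage (assumes inv.exe emits Windows-1252; adjust the code page as needed):
#   $ansi = [Text.Encoding]::GetEncoding(1252)
#   $segmentsText = Invoke-WithConsoleEncoding -Encoding $ansi { .\inv.exe getter segments }
```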
Alternatively, see if your CLI offers a parameter that would allow you to specify the desired output character encoding, such as UTF-8 (which is preferable, because it doesn't limit the set of characters that can be output).
CLIs that use Python behind the scenes may switch to UTF-8 if $env:PYTHONUTF8 = 1 is in effect.
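If inv.exe does turn out to be Python-based (an assumption; the question does not say), the environment variable and the decoding side have to be changed together: the program must emit UTF-8 and PowerShell must decode UTF-8. A minimal sketch of that combination, with both settings restored afterwards:

```powershell
$prevEnc    = [Console]::OutputEncoding
$prevPyUtf8 = $env:PYTHONUTF8          # $null if the variable was not set
try {
    # Ask a Python-based CLI to use UTF-8 for its standard streams.
    $env:PYTHONUTF8 = '1'
    # Make PowerShell decode the captured output as UTF-8 (no BOM).
    [Console]::OutputEncoding = [Text.UTF8Encoding]::new($false)

    # Guard so the sketch is harmless where inv.exe is not present.
    if (Get-Command '.\inv.exe' -ErrorAction SilentlyContinue) {
        $segmentsText = .\inv.exe getter segments
    }
}
finally {
    # Assigning $null removes the variable again if it did not exist before.
    $env:PYTHONUTF8 = $prevPyUtf8
    [Console]::OutputEncoding = $prevEnc
}
```

If the captured text still shows replacement characters with this in place, the program is most likely not honoring PYTHONUTF8, and the ANSI approach above remains the fallback.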
[1] Depending on the selected font, not all Unicode characters may render properly, but the console buffer does store them correctly, so you can copy and paste them without loss of information.
- If you have the chance to test with pwsh 7, you probably wouldn't have this issue. A few things you can test: 1. change the encoding of your ps1 script file; 2. try UTF-8 with a BOM: $OutputEncoding = [Console]::InputEncoding = [Console]::OutputEncoding = [System.Text.UTF8Encoding]::new($true) – Santiago Squarzon, commented Nov 19, 2024 at 14:30
- Some native programs behave differently when redirected. When you redirect to a file using the cmd shell (inv.exe getter segments >file.txt), is the file actually encoded in UTF-8? You can inspect the encoding using Notepad++, for instance. – zett42, commented Nov 19, 2024 at 16:39
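The check suggested in the comment above can also be done from PowerShell instead of an editor. The following is a rough sketch; Test-Utf8Bytes is a hypothetical helper name, and it relies on a strict UTF8Encoding instance (throwOnInvalidBytes set to $true) that raises an error on byte sequences that are not valid UTF-8:

```powershell
# Hypothetical helper: returns $true if the bytes decode as strict UTF-8.
function Test-Utf8Bytes {
    param([Parameter(Mandatory)] [byte[]] $Bytes)

    # Arguments: no BOM emission, throw on invalid bytes.
    $strictUtf8 = [Text.UTF8Encoding]::new($false, $true)
    try {
        $null = $strictUtf8.GetString($Bytes)
        $true
    }
    catch {
        $false
    }
}

# Usage, after running this from cmd.exe:  inv.exe getter segments >file.txt
#   Test-Utf8Bytes ([IO.File]::ReadAllBytes((Convert-Path .\file.txt)))
```

A result of $false means the file is not UTF-8, which points to an ANSI or OEM code page. Note that pure-ASCII content also passes this test, so it is only informative when the output contains accented characters.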
Source: printing - How to capture the output string of a UTF-8 program using PowerShell? - Stack Overflow