Question
I am creating a script that will copy a file, rename it, and then look inside it to remove certain special characters. One of these special characters is some sort of ASCII apostrophe that I cannot type with my keyboard. I can copy and paste it, but the replace function doesn't work on it.
The script opens the file, searches for the strange apostrophe ’ and replaces it with nothing. I'd like it to replace it with a normal apostrophe, but I don't know how this is done, and at the moment the biggest problem is that I can't get it to "see" this strange apostrophe that winds up in the autogenerated file I'm modifying. Any help much appreciated. Thanks :)
Apostrophe in file: ’
Normal Apostrophe: '
This is a chunk of the batch that I've isolated to test with.
@echo off
set YYMMDD=%DATE:~-2,2%%DATE:~-7,2%%DATE:~-10,2%
set DDMMYYYY=%DATE:~-10,2%%DATE:~-7,2%%DATE:~-4,4%
set YYYY-MM-DD=%DATE:~-4,4%-%DATE:~-7,2%-%DATE:~-10,2%
powershell -Command "(gc 'C:\LOCATION\Client_List_%DDMMYYYY%.csv') -replace '’', '' | Out-File 'C:\LOCATION\Client_List_%DDMMYYYY%.csv'"
Echo Done
Answer 1:
set "fileIn=C:\LOCATION\Client_List_%DDMMYYYY%.csv"
set "fileOu=C:\LOCATION\Client_List_%DDMMYYYY%.csv"
powershell -c "(gc '%fileIn%').Replace('‘‘','').Replace('’’','')|Out-File '%fileOu%'"
That strange apostrophe ’ is U+2019 Right Single Quotation Mark, which is meant to be a closing quote. It could be paired with a different opening quote. In the above example, ‘ is U+2018 Left Single Quotation Mark.
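The question also asks how to substitute a normal apostrophe rather than delete the character. A minimal sketch of that, assuming the text is already in a PowerShell string: the quotes are built from their code points, so the snippet does not depend on how the script file is saved. The variable names and the sample text are made up for illustration.

```powershell
# Build the typographic quotes from code points (no literal curly quotes needed).
$left  = [string][char]0x2018   # U+2018 Left Single Quotation Mark
$right = [string][char]0x2019   # U+2019 Right Single Quotation Mark

# Hypothetical sample text containing both characters.
$sample = 'It' + $right + 's ' + $left + 'quoted' + $right

# Replace both typographic quotes with the plain apostrophe U+0027.
$fixed = $sample.Replace($left, "'").Replace($right, "'")
$fixed
```

Both arguments to Replace are strings here, so the (string, string) overload is used.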
Get-Help 'about_Quoting_Rules' says
Quotation marks are used to specify a literal string. You can enclose a string in single quotation marks (') or double quotation marks (").
In fact, PowerShell accepts two different sets of quotes:
- double quotation marks "“”„
- single quotation marks '‘’‚‛
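The behaviour described above can be checked with a small sketch. It is only an illustration: the source text is assembled from code points and parsed with [scriptblock]::Create, so the test itself stays pure ASCII, and the variable names are made up.

```powershell
# U+2019 Right Single Quotation Mark, built from its code point.
$rsq = [string][char]0x2019

# Source text: <U+2019>abc<U+2019>, i.e. U+2019 used as the string delimiter.
$delimited = & ([scriptblock]::Create($rsq + 'abc' + $rsq))

# Source text: '<U+2019><U+2019>', i.e. a doubled U+2019 inside an ordinary
# single-quoted string, which is how the Replace call above is written.
$doubled = & ([scriptblock]::Create("'" + $rsq + $rsq + "'"))

$delimited                            # the parser saw a string literal: abc
$doubled.Length                       # the doubled quote collapses to one character
'U+{0:X4}' -f [int][char]$doubled     # code point of that character
```

This is why '’’' in the Replace call stands for a single ’: inside a single-quoted string, any of the accepted single quotes has to be doubled to be taken literally.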
AFAIK, all of those quotation marks are present in most Windows ANSI code pages (1252, 1250, 1257, 1253, 1251, 1254, 1255, 1256, 1258), so they may be used literally in a .bat script saved as ANSI. The exception is the last one, ‛ U+201B Single High-Reversed-9 Quotation Mark. For that character, use $([char]0x201B) instead of '‛‛', as follows:
rem cast [char] to `[string]` in the first argument of Replace
powershell -c "(gc '%fileIn%').Replace( [string]$([char]0x201B) , '')"
or as follows:
rem [char] can't be empty, so specify `[string]` for the second argument of Replace
powershell -c "(gc '%fileIn%').Replace( $([char]0x201B) , [string]'')"
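The one-liners above can also be expanded into a standalone script. This is a rough sketch, not the original answer's method: it swaps in the -replace operator with a regex character class covering U+2018 through U+201B, assumes the file is UTF-8, and uses a temporary file as a hypothetical stand-in for the real CSV.

```powershell
# Hypothetical temporary file standing in for the real CSV.
$path = [System.IO.Path]::GetTempFileName()

# Sample line containing U+2019, U+2018 and U+201B, built from code points.
$line = 'O' + [char]0x2019 + 'Brien,' + [char]0x2018 + 'x' + [char]0x201B
Set-Content -LiteralPath $path -Value $line -Encoding UTF8

# The parentheses make Get-Content read the whole file before it is overwritten.
# '[\u2018-\u201B]' matches the four typographic single quotes; the encoding is an assumption.
$fixedLines = (Get-Content -LiteralPath $path -Encoding UTF8) -replace '[\u2018-\u201B]', "'"
Set-Content -LiteralPath $path -Value $fixedLines -Encoding UTF8

$result = Get-Content -LiteralPath $path -Encoding UTF8
$result
Remove-Item -LiteralPath $path
```

Stating the encoding on both the read and the write keeps the file from being silently re-encoded, which Out-File may do with its default encoding.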
Analysis and explanation
The following PowerShell snippet shows an excerpt from the Unicode database (character names ending with Quotation Mark or containing Apostrophe):
PS D:> 0x22,0x27,0x00AB,0x00BB,0x2018,0x2019,0x201A,0x201B,0x201C,0x201D,0x201E,0x201F,
0x2039,0x203A,0x2E42,0x301D,0x301E,0x301F,0x055A | Get-CharInfo | Format-Table -AutoSize
Char CodePoint Category Description
---- --------- -------- -----------
" U+0022 OtherPunctuation Quotation Mark
' U+0027 OtherPunctuation Apostrophe
« U+00AB InitialQuotePunctuation Left-Pointing Double Angle Quotation Mark
» U+00BB FinalQuotePunctuation Right-Pointing Double Angle Quotation Mark
‘ U+2018 InitialQuotePunctuation Left Single Quotation Mark
’ U+2019 FinalQuotePunctuation Right Single Quotation Mark
‚ U+201A OpenPunctuation Single Low-9 Quotation Mark
‛ U+201B InitialQuotePunctuation Single High-Reversed-9 Quotation Mark
“ U+201C InitialQuotePunctuation Left Double Quotation Mark
” U+201D FinalQuotePunctuation Right Double Quotation Mark
„ U+201E OpenPunctuation Double Low-9 Quotation Mark
‟ U+201F InitialQuotePunctuation Double High-Reversed-9 Quotation Mark
‹ U+2039 InitialQuotePunctuation Single Left-Pointing Angle Quotation Mark
› U+203A FinalQuotePunctuation Single Right-Pointing Angle Quotation Mark
⹂ U+2E42 OtherNotAssigned Undefined
〝 U+301D OpenPunctuation Reversed Double Prime Quotation Mark
〞 U+301E ClosePunctuation Double Prime Quotation Mark
〟 U+301F ClosePunctuation Low Double Prime Quotation Mark
՚ U+055A OtherPunctuation Armenian Apostrophe
(Output from a modified Get-CharInfo cmdlet.) The original Get-CharInfo module can be downloaded from http://poshcode.org/5234.
The following PowerShell script complements the above results by showing some valid (and, in my locale, invalid) combinations of quotes:
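If the Get-CharInfo module is not available, the CodePoint and Category columns can be approximated with built-in .NET calls. This is only a sketch: it is not the Get-CharInfo implementation, and it omits the Description column, because .NET does not expose Unicode character names.

```powershell
# A few of the code points from the table above.
$rows = 0x22, 0x27, 0x2018, 0x2019, 0x201A, 0x201B | ForEach-Object {
    $ch = [char]$_
    [pscustomobject]@{
        Char      = $ch
        CodePoint = 'U+{0:X4}' -f $_
        # The general category as reported by the .NET runtime in use.
        Category  = [System.Globalization.CharUnicodeInfo]::GetUnicodeCategory($ch)
    }
}
$rows | Format-Table -AutoSize
```

The categories come from the Unicode data bundled with the runtime, so recently added characters may show up as unassigned on older versions.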
$arrSingleQuotes =
''' U+0027 Apostrophe ''' ,
‘‘‘ U+2018 Left Single Quotation Mark ‘‘‘ ,
’’’ U+2019 Right Single Quotation Mark ’’’ ,
‚‚‚ U+201A Single Low-9 Quotation Mark ‚‚‚ ,
‛‛‛ U+201B Single High-Reversed-9 Quotation Mark ‛‛‛ ,
‘‘‘ U+2018 (Left/Right) Single Quotation Mark U+2019 ’’’ ,
’’’ U+2019 (Right/Left) Single Quotation Mark U+2018 ‘‘‘
'$arrSingleQuotes (any combination)'
$arrSingleQuotes
$arrDoubleQoutes =
""" U+0022 Quotation Mark """ ,
“““ U+201C Left Double Quotation Mark “““ ,
””” U+201D Right Double Quotation Mark ””” ,
„„„ U+201E Double Low-9 Quotation Mark „„„ ,
“““ U+201C (Left/Right) Double Quotation Mark U+201D ””” ,
””” U+201D (Right/Left) Double Quotation Mark U+201C “““
'$arrDoubleQoutes (any combination)'
$arrDoubleQoutes
$noQuotes = @"
« U+00AB Left-Pointing Double Angle Quotation Mark
» U+00BB Right-Pointing Double Angle Quotation Mark
‟ U+201F Double High-Reversed-9 Quotation Mark
⹂ U+2E42 DOUBLE LOW-REVERSED-9 QUOTATION MARK
‹ U+2039 Single Left-Pointing Angle Quotation Mark
› U+203A Single Right-Pointing Angle Quotation Mark
〝 U+301D Reversed Double Prime Quotation Mark
〞U+301E Double Prime Quotation Mark
〟U+301F Low Double Prime Quotation Mark
՚ U+055A Armenian Apostrophe
"@
'$noQuotes'
$noQuotes
Output:
PS D:> D:\PShell\SO\41488245_quotes.ps1
$arrSingleQuotes (any combination)
' U+0027 Apostrophe '
‘ U+2018 Left Single Quotation Mark ‘
’ U+2019 Right Single Quotation Mark ’
‚ U+201A Single Low-9 Quotation Mark ‚
‛ U+201B Single High-Reversed-9 Quotation Mark ‛
‘ U+2018 (Left/Right) Single Quotation Mark U+2019 ’
’ U+2019 (Right/Left) Single Quotation Mark U+2018 ‘
$arrDoubleQoutes (any combination)
" U+0022 Quotation Mark "
“ U+201C Left Double Quotation Mark “
” U+201D Right Double Quotation Mark ”
„ U+201E Double Low-9 Quotation Mark „
“ U+201C (Left/Right) Double Quotation Mark U+201D ”
” U+201D (Right/Left) Double Quotation Mark U+201C “
$noQuotes
« U+00AB Left-Pointing Double Angle Quotation Mark
» U+00BB Right-Pointing Double Angle Quotation Mark
‟ U+201F Double High-Reversed-9 Quotation Mark
⹂ U+2E42 DOUBLE LOW-REVERSED-9 QUOTATION MARK
‹ U+2039 Single Left-Pointing Angle Quotation Mark
› U+203A Single Right-Pointing Angle Quotation Mark
〝 U+301D Reversed Double Prime Quotation Mark
〞U+301E Double Prime Quotation Mark
〟U+301F Low Double Prime Quotation Mark
՚ U+055A Armenian Apostrophe
Note that ⹂ U+2E42 DOUBLE LOW-REVERSED-9 QUOTATION MARK is present in the Unicode database and is rendered properly in PowerShell ISE, even though Get-CharInfo reports it above as unassigned.
Addendum: I found more quotation mark candidates (only the result obtained from the Excerpt_From_UnicodeDataTxt.ps1 script is shown):
PS > $x = .\tests\Excerpt_From_UnicodeDataTxt.ps1 -SearchString "Quotation|Apostrophe" |
Where-Object {$_.Category -match 'Punctuation'}
PS > $x.Count
23
PS > $x
Char CodePoint Category Description
---- --------- -------- -----------
" U+0022 Po-OtherPunctuation Quotation Mark
' U+0027 Po-OtherPunctuation Apostrophe
« U+00AB Pi-InitialQuotePunctuation Left-Pointing Double Angle Quotation Mark
» U+00BB Pf-FinalQuotePunctuation Right-Pointing Double Angle Quotation Mark
՚ U+055A Po-OtherPunctuation Armenian Apostrophe
‘ U+2018 Pi-InitialQuotePunctuation Left Single Quotation Mark
’ U+2019 Pf-FinalQuotePunctuation Right Single Quotation Mark
‚ U+201A Ps-OpenPunctuation Single Low-9 Quotation Mark
‛ U+201B Pi-InitialQuotePunctuation Single High-Reversed-9 Quotation Mark
“ U+201C Pi-InitialQuotePunctuation Left Double Quotation Mark
” U+201D Pf-FinalQuotePunctuation Right Double Quotation Mark
„ U+201E Ps-OpenPunctuation Double Low-9 Quotation Mark
‟ U+201F Pi-InitialQuotePunctuation Double High-Reversed-9 Quotation Mark
‹ U+2039 Pi-InitialQuotePunctuation Single Left-Pointing Angle Quotation Mark
› U+203A Pf-FinalQuotePunctuation Single Right-Pointing Angle Quotation Mark
❮ U+276E Ps-OpenPunctuation Heavy Left-Pointing Angle Quotation Mark Ornament
❯ U+276F Pe-ClosePunctuation Heavy Right-Pointing Angle Quotation Mark Ornament
⹂ U+2E42 Ps-OpenPunctuation Undefined
〝 U+301D Ps-OpenPunctuation Reversed Double Prime Quotation Mark
〞 U+301E Pe-ClosePunctuation Double Prime Quotation Mark
〟 U+301F Pe-ClosePunctuation Low Double Prime Quotation Mark
＂ U+FF02 Po-OtherPunctuation Fullwidth Quotation Mark
＇ U+FF07 Po-OtherPunctuation Fullwidth Apostrophe
Answer 2:
I think it's a weird backtick character. At least that's what it's acting like.
If I do this:
$text = "Weird ’ Normal ' Backtick ` Weird ’ "
$text.Replace("’","")
It gives me this:
Weird Normal ' Backtick Weird
So does this work?
powershell -Command "(gc 'C:\LOCATION\Client_List_%DDMMYYYY%.csv').replace('’’', '') | Out-File 'C:\LOCATION\Client_List_%DDMMYYYY%.csv'"
Doubling a normal backtick makes the script take the character literally. Doubling the weird apostrophe seems to do the same thing; at least, that works in my testing.
Source: https://stackoverflow.com/questions/41488245/cmd-to-powershell-replace-special-character