Split a String and then assign the splits

Submitted by 孤人 on 2020-02-21 03:13:22

Question


I have a text file that contains two names, exactly like this.

Tom Hardy

Brad Pitt

I use this to read the names from the file and split them.

$Names = gc C:\Temp\Name.txt

ForEach-Object {-Split $Names}

How do I then assign each first name to $FirstName and each last name to $LastName?

The idea behind this is that further down the line, for each $FirstName I will be creating a specific individual item with each name.

I understand that after I run the above, each section of the name is assigned to $_, so I can do the same thing with each section, i.e.:

$Names = gc C:\Temp\Name.txt

$SplitNames = ForEach-Object {-Split $Names}

ForEach ($_ in $SplitNames) {Write-Host 'Name is' $_}
Name is Tom
Name is Hardy
Name is Brad
Name is Pitt

Hope this makes sense, please let me know if more clarification is needed.


Answer 1:


Same as @Paxz but with some explanation and suggestions:

$Names = @(
    'Brad Pitt',
    'Tom Hardy',
    'Daniel Craig Junior'
)

# the .ForEach() method is used as it's faster than piping data to ForEach-Object
$result = $Names.ForEach({
    # we use the second argument of -Split to indicate 
    # we're only interested in two values
    $tmpSplit = $_ -split ' ', 2

    # we then create an object that allows us to 
    # name things properly so we can play with it later without hassle
    [PSCustomObject]@{
        Input = $_
        FirstName = $tmpSplit[0]
        LastName = $tmpSplit[1]
    }
})

# here we show the result of all our objects created
$result

# enable verbose text to be displayed
$VerbosePreference = 'Continue'

$result.ForEach({
    # here we can easily address the object by its property names
    Write-Verbose "Input '$($_.Input)' FirstName '$($_.FirstName)' LastName '$($_.LastName)'"
})

# disable verbose messages, because we don't need this in production
$VerbosePreference = 'SilentlyContinue'
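
To make the effect of the limit argument concrete, here is a small sketch using one of the sample names above (the variable names $unlimited and $limited are only illustrative):

```powershell
# Without a limit, every space produces a new token.
$unlimited = 'Daniel Craig Junior' -split ' '

# With a limit of 2, everything after the first space stays in the second token.
$limited = 'Daniel Craig Junior' -split ' ', 2

$unlimited.Count   # 3
$limited[0]        # Daniel
$limited[1]        # Craig Junior
```

This is why LastName ends up as 'Craig Junior' rather than just 'Craig' for the three-part name.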



Answer 2:


# Read the input file line by line with Get-Content and send each line
# to the ForEach-Object cmdlet, which sees each line as automatic variable
# $_
Get-Content C:\Temp\Name.txt | ForEach-Object {
  # Split the line into tokens by whitespace.
  # * $firstName receives the 1st token,
  # * $lastName the 2nd one (if there were more, $lastName would become an *array*)
  $firstName, $lastName = -split $_
  # Work with $firstName and $lastName
}

If you want to collect the name pairs for later use, consider wrapping them in custom objects, as in DarkLite1's answer.
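
A minimal sketch of that combination might look as follows. To keep it runnable without the file, $lines is a hypothetical in-memory stand-in for Get-Content C:\Temp\Name.txt, and $people is just an illustrative name:

```powershell
# Hypothetical stand-in for: Get-Content C:\Temp\Name.txt
$lines = 'Tom Hardy', 'Brad Pitt'

$people = $lines | ForEach-Object {
    # Destructure the whitespace-separated tokens into two variables.
    $firstName, $lastName = -split $_
    # Emit one object per line; the objects are collected in $people.
    [PSCustomObject]@{
        FirstName = $firstName
        LastName  = $lastName
    }
}

# Later on, each pair can be addressed by property name.
foreach ($person in $people) {
    "First: $($person.FirstName), Last: $($person.LastName)"
}
```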


As for what you tried:

ForEach-Object { -Split $Names }

ForEach ($_ in $SplitNames) {Write-Host 'Name is' $_}

If you call ForEach-Object without providing pipeline input to it, the script block is executed once, so that ForEach-Object { -Split $Names } is effectively the same as just calling -Split $Names.
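
A quick way to check this, with sample data standing in for the file content (the variable names $viaCmdlet and $direct are only illustrative):

```powershell
# Sample data standing in for the lines read from the file.
$Names = 'Tom Hardy', 'Brad Pitt'

# No pipeline input: the script block runs once and splits the whole array.
$viaCmdlet = ForEach-Object { -split $Names }

# The plain expression yields the same tokens.
$direct = -split $Names

$viaCmdlet -join ','   # Tom,Hardy,Brad,Pitt
$direct -join ','      # Tom,Hardy,Brad,Pitt
```

In both cases the tokens form one flat list, so the pairing of first and last names is lost.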

Generally, these statements suggest that there's confusion around the distinction between PowerShell's various enumeration constructs.


PowerShell's various enumeration constructs:

  • The ForEach-Object cmdlet:

    • is designed to receive input via the pipeline (|)
    • reflects each input object in automatic variable $_
    • e.g., 1, 2, 3 | ForEach-Object { "number: $_ " }
    • Note: Sending $null as input does result in an invocation - unlike in a foreach loop.
  • The foreach loop statement:

    • is designed to enumerate a specified in-memory collection
    • via a self-chosen iteration variable (better not to choose $_, to avoid confusion)
    • e.g., foreach ($num in 1, 2, 3) { "number: $num" }
    • Note: $null as the input collection does not result in entering the loop body - unlike with ForEach-Object.
  • PSv4+ also offers the .ForEach() array method:

    • Similar to the foreach loop, it is designed to enumerate an in-memory collection, but you invoke it as a method on the collection itself.
    • Similar to the ForEach-Object cmdlet, it is automatic variable $_ that reflects the current iteration's object.
    • It offers additional functionality, such as enumerating properties by name, performing type conversions, calling methods.
    • e.g., (1, 2, 3).ForEach({ "number: $_" })
    • Note: $null as the input collection does not result in an invocation of the script block - unlike with ForEach-Object.
  • Perhaps surprisingly, PowerShell's switch statement too performs enumeration on inputs that happen to be collections.

    • The switch equivalent of foreach ($num in 1, 2, 3) { "number: $num" } is (note the use of automatic variable $_ as the implicit iterator variable):
      switch (1, 2, 3) { default { "number: $_"; continue } }
    • switch is similar to the foreach loop statement in terms of memory efficiency, performance and output timing, so it won't be discussed separately below. Its advantage is the ability to use sophisticated conditionals as well as being able to enumerate the lines of a file directly, with the -File option.
    • Note: $null as the input collection does result in an evaluation of the branches - unlike with foreach loops.
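
The $null notes above can be checked with a short sketch; it covers the cmdlet, the loop statement and switch, and leaves the .ForEach() case out. The counter variable names are only illustrative:

```powershell
# ForEach-Object: $null is sent through the pipeline, so the block runs once.
$cmdletRuns = @($null | ForEach-Object { 'ran' }).Count

# foreach statement: $null is treated like an empty collection; the body is skipped.
$loopRuns = 0
foreach ($item in $null) { $loopRuns++ }

# switch statement: $null is evaluated against the branches once.
$switchRuns = @(switch ($null) { default { 'ran' } }).Count

"ForEach-Object: $cmdletRuns, foreach: $loopRuns, switch: $switchRuns"
# ForEach-Object: 1, foreach: 0, switch: 1
```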

Somewhat confusingly, foreach is also a built-in alias for ForEach-Object. If you use foreach, it is the parsing context that determines which construct is used: in a pipeline (command context, argument mode), foreach refers to the ForEach-Object cmdlet, otherwise it refers to the foreach loop (expression mode) - see this answer for details.
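
As a small illustration of the two meanings (the variable names are only illustrative):

```powershell
# In a pipeline, foreach is resolved as the alias of ForEach-Object,
# so the current object is available as $_.
$fromAlias = 1, 2, 3 | foreach { "number: $_" }

# At the start of a statement, foreach is the loop statement,
# which needs an explicit iteration variable.
$fromLoop = foreach ($num in 1, 2, 3) { "number: $num" }

$fromAlias -join '; '   # number: 1; number: 2; number: 3
$fromLoop -join '; '    # number: 1; number: 2; number: 3
```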

As for what construct to use when: there are tradeoffs between:

  • performance (execution speed)
    • the foreach loop is generally fastest, followed by the .ForEach() method, with the ForEach-Object cmdlet being the slowest (the pipeline is slow in general)[1]
  • memory efficiency
    • only the ForEach-Object cmdlet (the pipeline) offers streaming processing, where each object is processed as it is being produced; unless the overall result is collected in memory (as opposed to outputting to a file, for instance), this keeps memory use constant, irrespective of how many objects are ultimately processed.
    • foreach and .ForEach() require the input collection to be present in memory in full.
  • output timing
    • the ForEach-Object cmdlet (and the pipeline in general) passes objects on as they're being processed / produced, so you usually start to see output right away.
    • foreach and .ForEach(), when given the output of a command, must collect that command's output in full, up front, before enumeration can start.
  • convenience
    • the .ForEach() method can be used as-is as part of an expression, whereas use of ForEach-Object requires enclosing (...) and use of a foreach loop requires enclosing $(...)
    • the .ForEach() method offers additional functionality that allows for concise expressions (emulating the functionality with foreach and ForEach-Object is possible, but more verbose)
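
The convenience points can be sketched like this; the values and variable names are made up for illustration:

```powershell
# The same doubling, embedded in a -join expression, with each construct.
$viaMethod = (1, 2, 3).ForEach({ $_ * 2 }) -join ','
$viaCmdlet = (1, 2, 3 | ForEach-Object { $_ * 2 }) -join ','        # needs (...)
$viaLoop   = $(foreach ($n in 1, 2, 3) { $n * 2 }) -join ','        # needs $(...)

# Additional .ForEach() functionality:
$lengths = ('abc', 'de').ForEach('Length')      # property access by name
$ints    = ('1', '2').ForEach([int])            # type conversion
$upper   = ('tom', 'brad').ForEach('ToUpper')   # method call by name

$viaMethod         # 2,4,6
$lengths -join ',' # 3,2
$upper -join ','   # TOM,BRAD
```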

[1] Running the following performance-comparison commands, which employ a trivial loop body, shows foreach to be faster than .ForEach() in my tests, with ForEach-Object slowest, as expected. You can download the Time-Command.ps1 script from this Gist:

# 100 items, average of 10,000 runs
$a=1..100; ./Time-Command { foreach($e in $a) { ++$e } }, { $a.ForEach({ ++$_ }) }, { $a | ForEach-Object { ++$_ } } 10000
# 1 million items, average of 10 runs
$a=1..1e6; ./Time-Command { foreach($e in $a) { ++$e } }, { $a.ForEach({ ++$_ }) }, { $a | ForEach-Object { ++$_ } } 10

Results from a single-core Windows 10 VM running on my laptop, with Windows PowerShell v5.1 / PowerShell Core 6.1.0-preview.4; note that the absolute numbers are not important and will vary based on many variables, but the ratio (column Factor) should give you a sense.
Do tell us if you get different rankings.

Windows PowerShell v5.1:

100 items, average of 10,000 runs:

Command                        FriendlySecs (10000-run avg.) TimeSpan         Factor
-------                        ----------------------------- --------         ------
 foreach($e in $a) { ++$e }    0.000                         00:00:00.0001415 1.00
 $a.ForEach({ ++$_ })          0.000                         00:00:00.0002937 2.08
 $a | ForEach-Object { ++$_ }  0.001                         00:00:00.0006257 4.42

1 million items, average of 10 runs:

Command                        FriendlySecs (10-run avg.) TimeSpan         Factor
-------                        -------------------------- --------         ------
 foreach($e in $a) { ++$e }    1.297                      00:00:01.2969321 1.00
 $a.ForEach({ ++$_ })          2.548                      00:00:02.5480889 1.96
 $a | ForEach-Object { ++$_ }  5.992                      00:00:05.9917937 4.62

PowerShell Core 6.1.0-preview.4:

100 items, average of 10,000 runs:

Command                        FriendlySecs (10000-run avg.) TimeSpan         Factor
-------                        ----------------------------- --------         ------
 foreach($e in $a) { ++$e }    0.000                         00:00:00.0001981 1.00
 $a.ForEach({ ++$_ })          0.000                         00:00:00.0004406 2.22
 $a | ForEach-Object { ++$_ }  0.001                         00:00:00.0008829 4.46

1 million items, average of 10 runs:

Command                        FriendlySecs (10-run avg.) TimeSpan         Factor
-------                        -------------------------- --------         ------
 foreach($e in $a) { ++$e }    1.741                      00:00:01.7412378 1.00
 $a.ForEach({ ++$_ })          4.099                      00:00:04.0987548 2.35
 $a | ForEach-Object { ++$_ }  8.119                      00:00:08.1188042 4.66

General observations:

  • The relative performance didn't vary much between the 100-item and the 1-million-item tests: foreach is about 2x as fast as .ForEach() and 4-5x as fast as ForEach-Object.

  • Windows PowerShell was noticeably faster than PS Core in these tests, and PS Core on macOS and Linux seems to be even slower.
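
If you don't have the Time-Command script at hand, a rough single-run comparison can be sketched with the built-in Measure-Command cmdlet. This is not the method used for the tables above: it does no averaging, the item count of 1000 is arbitrary, and $timings is just an illustrative name, so treat the numbers as a coarse indication only.

```powershell
$a = 1..1000

# One timed run per construct, in milliseconds.
$timings = [ordered]@{
    'foreach statement' = (Measure-Command { foreach ($e in $a) { ++$e } }).TotalMilliseconds
    '.ForEach() method' = (Measure-Command { $a.ForEach({ ++$_ }) }).TotalMilliseconds
    'ForEach-Object'    = (Measure-Command { $a | ForEach-Object { ++$_ } }).TotalMilliseconds
}

$timings
```

Single runs are noisy, so repeating the measurement several times before comparing is advisable.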




Answer 3:


-split gives you an array containing the pieces between the positions where it splits. You can then address each entry in the array by index for further use:

e.g.: ("Tom Hardy" -split " ")[0] returns Tom

$Names = gc C:\Temp\Name.txt

foreach ($name in $Names)
{
  $cutted = $name.Split()
  $firstname = $cutted[0]
  $lastname = $cutted[1]
  #Do whatever you need to do with the names here
}

As mentioned by @iRon, you can actually skip one step and directly save it from the split to the two variables:

$Names = gc C:\Temp\Name.txt

foreach ($name in $Names)
{
  $firstname, $lastname = $name -split " "
  #Do whatever you need to do with the names here
}
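
One caveat worth checking against your own data: -split " " splits at every single space, so a line with two spaces between the names yields an empty extra token, whereas the unary form of -split treats runs of whitespace as one separator. A small sketch, with a made-up input line:

```powershell
$name = 'Tom  Hardy'   # note the two spaces between the names

$binary = $name -split ' '   # splits at every single space
$unary  = -split $name       # splits at runs of whitespace

$binary.Count   # 3 (the middle element is an empty string)
$unary.Count    # 2
```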

Or as a one-liner:

Get-Content -Path "C:\Temp\Name.txt" | % { $firstname, $lastname = $_ -split " "; <# Do something with the variables #> }


Source: https://stackoverflow.com/questions/51709291/split-a-string-and-then-assign-the-splits
