问题
I have numerous uniquely named .CSV files that I need to remove the first 17 lines from. Some of these files exceed 65534 rows so my MORE/MOVE Batch script is not working. Looking for alternative solutions.
@echo off
for %%a in (*.csv) do (
more +17 "%%a" >"%%a.new"
move /y "%%a.new" "%%a" >nul
)
Regardless of number of rows input I am looking to have the 17 header rows removed and new file with all remaining rows built.
回答1:
Here's a powershell option; this one uses a stream to cater for your large files:
$csvs = Get-ChildItem -Path "P:\ath to\your csvs" -Filter *.csv
foreach ( $csv in $csvs ) {
$fin = New-Object System.IO.StreamReader( $csv.FullName )
$fout = New-Object System.IO.StreamWriter( $csv.FullName+".new" )
try {
for( $s = 1; $s -le 17 -and !$fin.EndOfStream; $s++ ) {
$fin.ReadLine()
}
while( !$fin.EndOfStream ) {
$fout.WriteLine( $fin.ReadLine() )
}
}
finally {
$fout.Close()
$fin.Close()
}
}
Just change the path to your .csv
s on the first line, before testing it.
I have purposely left out the deletion of the original files, simply appending .new
to the new filenames to allow you time to check the results, test the speed etc. I will leave it to you to include a Rename/Delete or Move should you feel the need to extend the functionality.
回答2:
Here's a one-line solution
for %%a in (*.txt) do powershell -Com "sc -Path '%%a' -Value (gc '%%a' | select -Skip 17)"
where gc
and sc
are default aliases for Get-Content and Set-Content respectively. See also
- Powershell select-object skip multiple lines?
- Powershell skip first 2 lines of txt file when importing it
If your files are huge then it'll be better to read in lines or blocks which can also be implemented easily using file functions, [IO.File]::OpenText
or the -ReadCount
option of Get-Content
in PowerShell
- Reading large text files with Powershell
- Reading very BIG text files using PowerShell
- How to process a file in PowerShell line-by-line as a stream
- How can I make this PowerShell script parse large files faster?
As Squashman mentioned, for /f
also has an option to skip lines at the beginning of the file
for %%a in (*.csv) do (
for /f "usebackq skip=17 delims=" %%l in ("%%f") do @echo(%%l>>"%%a.new"
move /y "%%a.new" "%%a" >nul
)
But that won't work if your file contains lines with special characters like &
or |
. For more information about it run for /?
回答3:
Make your own cut
command. This is VBScript ported to VB.NET.
Cut
cut {t|b} {i|x} NumOfLines
Cuts the number of lines from the top or bottom of file.
t - top of the file
b - bottom of the file
i - include n lines
x - exclude n lines
Example
cut t i 5 < "%systemroot%\win.ini"
Cut.bat
REM Cut.bat
REM This file compiles Cut.vb to Cut.exe
REM Cut.exe Removes specified from top or bottom of lines from StdIn and writes to StdOut
REM To use
REM cut {t|b} {i|x} NumOfLines
Rem Cuts the number of lines from the top or bottom of file.
Rem t - top of the file
Rem b - bottom of the file
Rem i - include n lines
Rem x - exclude n lines
Rem
Rem Example - Includes first 5 lines Win.ini
Rem
Rem cut t i 5 < "%systemroot%\win.ini"
"C:\Windows\Microsoft.NET\Framework\v4.0.30319\vbc.exe" /target:exe /out:"%~dp0\Cut.exe" "%~dp0\Cut.vb" /verbose
pause
Cut.vb
'DeDup.vb
Imports System
Imports System.IO
Imports System.Runtime.InteropServices
Imports Microsoft.Win32
Public Module DeDup
Sub Main
Dim Arg() As Object
Dim RS as Object
Dim LineCount as Object
Dim Line as Object
Arg = Split(Command(), " ")
rs = CreateObject("ADODB.Recordset")
With rs
.Fields.Append("LineNumber", 4)
.Fields.Append("Txt", 201, 5000)
.Open
LineCount = 0
Line=Console.readline
Do Until Line = Nothing
LineCount = LineCount + 1
.AddNew
.Fields("LineNumber").value = LineCount
.Fields("Txt").value = Console.readline
.UpDate
Line = Console.ReadLine
Loop
.Sort = "LineNumber ASC"
If LCase(Arg(0)) = "t" then
If LCase(Arg(1)) = "i" then
.filter = "LineNumber < " & LCase(Arg(2)) + 1
ElseIf LCase(Arg(1)) = "x" then
.filter = "LineNumber > " & LCase(Arg(2))
End If
ElseIf LCase(Arg(0)) = "b" then
If LCase(Arg(1)) = "i" then
.filter = "LineNumber > " & LineCount - LCase(Arg(2))
ElseIf LCase(Arg(1)) = "x" then
.filter = "LineNumber < " & LineCount - LCase(Arg(2)) + 1
End If
End If
Do While not .EOF
Console.writeline(.Fields("Txt").Value)
.MoveNext
Loop
End With
End Sub
End Module
来源:https://stackoverflow.com/questions/55620008/looking-for-more-move-solutions-that-can-handle-files-with-more-than-65534-rows