Is it possible to extract / read part of a file within a zip using powershell?

Submitted by 梦想与她 on 2021-01-29 13:09:17

Question


I have a PowerShell 4.0 script that does various things to organise some large ZIP files across an internal network. This is all working fine, but I am looking to make a few improvements. One thing I want to do is extract some details from an XML file inside each ZIP file.

I tested this on some small ZIP files by extracting just the XML, which worked fine. I target the specific file because the ZIP can contain thousands of files, some of them pretty large. When I expanded the testing, though, I realised this wasn't optimal, because the XML files I am reading can get pretty large themselves (one was ~5GB, and they could potentially be larger). Adding a file extraction step to the chain therefore creates an unacceptable delay, and I need to find an alternative.

Ideally, I would be able to read the 3-5 values from the XML file inside the ZIP without extracting it. The values are always relatively early in the file, so perhaps it's possible to extract just the first ~100KB of the file, treat that extract as a text file, and find the required values in it?

Is this possible / more performant than just extracting the entire file?

If I can't speed things up I'll have to look at another way. I do have limited control over the file content, so could potentially look at splitting out those details into a smaller separate file at ZIP creation. This would be a last resort though.


Answer 1:


You should be able to do this with the System.IO.Compression.ZipFile class:

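# import the containing assembly (needs .NET Framework 4.5 or later)
Add-Type -AssemblyName System.IO.Compression.FileSystem

$readLength = 0
$stream = $null
$zipFile = $null

try{
  # open the zip file with ZipFile
  $zipFileItem = Get-Item .\Path\To\File.zip
  $zipFile = [System.IO.Compression.ZipFile]::OpenRead($zipFileItem.FullName)

  # find the desired file entry (first match, in case the name exists in several folders)
  $compressedFileEntry = $zipFile.Entries |Where-Object Name -eq 'MyAwesomeButHugeFile.xml' |Select-Object -First 1

  if($compressedFileEntry){
    # read up to the first 100KB of the file stream; the entry is decompressed
    # on demand, so the rest of the file is never inflated.
    # New-Object is used because [byte[]]::new() requires PowerShell 5.0
    $buffer = New-Object byte[] (100KB)
    $stream = $compressedFileEntry.Open()

    # a single Read() may return fewer bytes than requested, so keep
    # reading until the buffer is full or the stream ends
    do{
      $bytesRead = $stream.Read($buffer, $readLength, $buffer.Length - $readLength)
      $readLength += $bytesRead
    } while($bytesRead -gt 0 -and $readLength -lt $buffer.Length)
  }
}
finally{
  # clean up
  if($stream){ $stream.Dispose() }
  if($zipFile){ $zipFile.Dispose() }
}

if($readLength){
  # note: the cut-off can split a multi-byte character at the very end of the string
  $xmlString = [System.Text.Encoding]::UTF8.GetString($buffer, 0, $readLength)
  # do what you must with `$xmlString` here :)
}
else{
  Write-Warning "Failed to extract partial xml string"
}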
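
If the values are plain-text elements, a possible variation is to hand the entry stream to System.Xml.XmlReader and stop as soon as the wanted elements have been seen, instead of picking a fixed byte count. The following is only a sketch under assumptions: the function name Get-ZipXmlValue and the element names in the usage comment are hypothetical, the wanted elements are assumed to contain text only (no child elements), and the XML is assumed to have no DTD, which XmlReader rejects with its default settings.

```powershell
# Sketch: stream an XML entry out of a zip and stop after the wanted elements.
Add-Type -AssemblyName System.IO.Compression.FileSystem

function Get-ZipXmlValue {   # hypothetical helper name
  param(
    [string]$ZipPath,        # path to the zip archive
    [string]$EntryName,      # file name of the XML entry inside the archive
    [string[]]$ElementName   # element names to collect (assumed to hold text only)
  )

  $result = @{}
  $zipFile = $null; $stream = $null; $reader = $null
  try{
    $zipFile = [System.IO.Compression.ZipFile]::OpenRead((Get-Item -LiteralPath $ZipPath).FullName)
    $entry = $zipFile.Entries |Where-Object Name -eq $EntryName |Select-Object -First 1
    if(-not $entry){ throw "Entry '$EntryName' not found in '$ZipPath'" }

    # the entry stream decompresses on demand as the reader pulls from it
    $stream = $entry.Open()
    $reader = [System.Xml.XmlReader]::Create($stream)

    # walk forward until every requested element was found or the document ends
    while(-not $reader.EOF -and $result.Count -lt $ElementName.Count){
      $isWanted = $reader.NodeType -eq [System.Xml.XmlNodeType]::Element -and
                  $ElementName -contains $reader.LocalName -and
                  -not $result.ContainsKey($reader.LocalName)
      if($isWanted){
        $name = $reader.LocalName
        # this call already moves past the end tag, so no extra Read() here
        $result[$name] = $reader.ReadElementContentAsString()
      }
      else{
        [void]$reader.Read()
      }
    }
  }
  finally{
    # clean up
    if($reader){ $reader.Dispose() }
    if($stream){ $stream.Dispose() }
    if($zipFile){ $zipFile.Dispose() }
  }
  $result
}

# usage (hypothetical element names):
# $values = Get-ZipXmlValue -ZipPath .\Path\To\File.zip -EntryName 'MyAwesomeButHugeFile.xml' -ElementName 'Version','Created'
# $values['Version']
```

Compared with the fixed 100KB buffer, this takes the encoding from the XML declaration and never cuts a value in half, but it does require the file to be well-formed up to the last value that is read.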


Source: https://stackoverflow.com/questions/58008164/is-it-possible-to-extract-read-part-of-a-file-within-a-zip-using-powershell
