Question
I am trying to do the following in AIR:
- browse to a text file
- read the text file and store it in a string (ultimately in an array)
- split the string by the delimiter \n and put the resulting strings in an array
- manipulate that data before sending it to a website (mysql database)
The text files I am dealing with will be anywhere from 100 to 500 MB in size. So far, I've been able to complete steps 1 and 2. Here is my code:
<mx:Script>
<![CDATA[
import mx.collections.ArrayCollection;
import flash.filesystem.*;
import flash.events.*;
import mx.controls.*;
private var fileOpened:File = File.desktopDirectory;
private var fileContents:String = ""; // start empty so += does not prepend "null"
private var stream:FileStream;
private function selectFile(root:File):void {
var filter:FileFilter = new FileFilter("Text", "*.txt");
root.browseForOpen("Open", [filter]);
root.addEventListener(Event.SELECT, fileSelected);
}
private function fileSelected(e:Event):void {
var path:String = fileOpened.nativePath;
filePath.text = path;
stream = new FileStream();
stream.addEventListener(ProgressEvent.PROGRESS, fileProgress);
stream.addEventListener(Event.COMPLETE, fileComplete);
stream.openAsync(fileOpened, FileMode.READ);
}
private function fileProgress(p_evt:ProgressEvent):void {
fileContents += stream.readMultiByte(stream.bytesAvailable, File.systemCharset);
readProgress.text = ((p_evt.bytesLoaded/1048576).toFixed(2)) + "MB out of " + ((p_evt.bytesTotal/1048576).toFixed(2)) + "MB read";
}
private function fileComplete(p_evt:Event):void {
stream.close();
//fileText.text = fileContents;
}
private function process(c:String):void {
if(!c || c.length == 0) {
Alert.show("File contents empty!", "Error");
}
//var array:Array = c.split(/\n/);
}
]]>
</mx:Script>
Here is the MXML
<mx:Text x="10" y="10" id="filePath" text="Select a file..." width="678" height="22" color="#FFFFFF" fontWeight="bold"/>
<mx:Button x="10" y="40" label="Browse" click="selectFile(fileOpened)" color="#FFFFFF" fontWeight="bold" fillAlphas="[1.0, 1.0]" fillColors="[#E2E2E2, #484848]"/>
<mx:Button x="86" y="40" label="Process" click="process(fileContents)" color="#FFFFFF" fontWeight="bold" fillAlphas="[1.0, 1.0]" fillColors="[#E2E2E2, #484848]"/>
<mx:TextArea x="10" y="70" id="fileText" width="678" height="333" editable="false"/>
<mx:Label x="10" y="411" id="readProgress" text="" width="678" height="19" color="#FFFFFF"/>
Step 3 is where I am having trouble. Two lines in my code are commented out; both cause the program to freeze:
fileText.text = fileContents; attempts to put the contents of the string in a TextArea
var array:Array = c.split(/\n/); attempts to split the string on the newline delimiter
I could use some input at this point. Am I even going about this the right way? Can Flex/AIR handle files this large? (I'd assume so.) This is my first attempt at any sort of Flex work, so if you see other things I've done wrong or that could be done better, I'd appreciate the heads-up!
Thanks!
Answer 1:
Doing a split on a 500MB file might not be a good idea. You can write your own parser to work on the file but it may not be very fast either:
private function fileComplete(p_evt:Event):void
{
var array:Array = [];
var char:String;
var line:String = "";
while(stream.position < stream.bytesAvailable)
{
char = stream.readUTFBytes(1);
if(char == "\n")
{
array.push(line);
line = "";
}
else
{
line += char;
}
}
// catch the last line if the file isn't terminated by a \n
if(line != "")
{
array.push(line);
}
stream.close();
}
I haven't tested it but it should just step through the file character by character. If the character is a new line then push the old line into the array otherwise add it to the current line.
If you don't want it to block your UI while it runs, you'll need to restructure it around a timer:
// pseudo code
private function fileComplete(p_evt:Event):void
{
var array:Array = [];
processFileChunk();
}
private function processFileChunk(event:TimerEvent=null):void
{
var MAX_PER_FRAME:int = 1024;
var bytesThisFrame:int = 0;
var char:String;
var line:String = "";
while( (stream.position < stream.bytesAvailable)
&& (bytesThisFrame < MAX_PER_FRAME))
{
char = stream.readUTFBytes(1);
if(char == "\n")
{
array.push(line);
line = "";
}
else
{
line += char;
}
bytesThisFrame++;
}
// if we aren't done
if(stream.position < stream.bytesAvailable)
{
// declare this in the class
timer = new Timer(100, 1);
timer.addEventListener(TimerEvent.TIMER_COMPLETE, processFileChunk);
timer.start();
}
// we're done
else
{
// catch the last line if the file isn't terminated by a \n
if(line != "")
{
array.push(line);
}
stream.close();
// maybe dispatchEvent(new Event(Event.COMPLETE)); here
// or call an internal function to deal with the complete array
}
}
Basically you choose how much of the file to process each frame (MAX_PER_FRAME) and then process that many bytes. If there are still bytes left after that, create a timer to call the process function again shortly, and it should continue where it left off. You can dispatch an event or call another function once you are sure you are done.
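For what it's worth, here is a rough, untested sketch of the timer idea with the parsing state kept in class members, so that a half-built line survives between ticks. It assumes the stream has already been opened and fully buffered (for example after Event.COMPLETE), and the names lines, lineBytes, flushLine, startProcessing and MAX_PER_TICK are made up for illustration. It collects raw bytes rather than one-character strings and decodes each finished line as UTF-8, which is an assumption about the file's encoding.

```actionscript
import flash.events.TimerEvent;
import flash.filesystem.FileStream;
import flash.utils.ByteArray;
import flash.utils.Timer;

// Bytes handled per timer tick; an arbitrary starting value to tune.
private static const MAX_PER_TICK:int = 65536;

private var stream:FileStream;                      // assumed open and fully buffered
private var lines:Array = [];                       // finished lines
private var lineBytes:ByteArray = new ByteArray();  // bytes of the line being built
private var timer:Timer;

private function startProcessing():void {
    timer = new Timer(10);                          // repeats until stopped
    timer.addEventListener(TimerEvent.TIMER, processFileChunk);
    timer.start();
}

private function processFileChunk(event:TimerEvent):void {
    var bytesThisTick:int = 0;
    // bytesAvailable shrinks as we read, so compare it to zero.
    while (stream.bytesAvailable > 0 && bytesThisTick < MAX_PER_TICK) {
        var b:int = stream.readByte();
        if (b == 10) {                              // 10 is the byte value of "\n"
            flushLine();
        } else {
            lineBytes.writeByte(b);
        }
        bytesThisTick++;
    }
    if (stream.bytesAvailable == 0) {
        timer.stop();
        timer.removeEventListener(TimerEvent.TIMER, processFileChunk);
        if (lineBytes.length > 0) {
            flushLine();                            // last line without a trailing "\n"
        }
        stream.close();
        // lines is complete at this point
    }
}

private function flushLine():void {
    lineBytes.position = 0;
    lines.push(lineBytes.readUTFBytes(lineBytes.length)); // assumes UTF-8 text
    lineBytes = new ByteArray();
}
```

Keeping every line of a 500 MB file in an Array is still going to be heavy on memory, so you may prefer to hand each line to your processing code in flushLine instead of storing it.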
Answer 2:
I agree.
Try to split the text into chunks while you're reading it from the stream.
This way you don't have to store the whole text in your fileContents String, which reduces memory usage by roughly 50%.
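A minimal, untested sketch of that suggestion, assuming the stream is opened with openAsync as in the question: each progress event splits whatever has arrived into lines and carries the unfinished tail over to the next event. The names remainder, lineCount, consumeAvailable and handleLine are hypothetical, and handleLine is only a placeholder for the real per-line work.

```actionscript
import flash.events.Event;
import flash.events.ProgressEvent;
import flash.filesystem.File;
import flash.filesystem.FileStream;

private var stream:FileStream;        // opened with openAsync, as in the question
private var remainder:String = "";    // unfinished line carried over between chunks
private var lineCount:int = 0;

private function fileProgress(e:ProgressEvent):void {
    consumeAvailable();
}

private function fileComplete(e:Event):void {
    consumeAvailable();               // pick up anything that has not been read yet
    if (remainder.length > 0) {       // last line without a trailing "\n"
        handleLine(remainder);
        remainder = "";
    }
    stream.close();
}

private function consumeAvailable():void {
    if (stream.bytesAvailable == 0) {
        return;
    }
    var chunk:String = remainder
        + stream.readMultiByte(stream.bytesAvailable, File.systemCharset);
    var parts:Array = chunk.split("\n");
    // The final element is "" if the chunk ended on a newline,
    // otherwise it is a partial line to finish on the next event.
    remainder = String(parts.pop());
    for (var i:int = 0; i < parts.length; i++) {
        handleLine(String(parts[i]));
    }
}

private function handleLine(line:String):void {
    lineCount++;                      // placeholder: process or upload the line here
}
```

One caveat: with a multi-byte charset, a chunk boundary can fall in the middle of a character, so this is only safe as written for single-byte encodings.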
Answer 3:
Try to process it in parts.
Answer 4:
With regard to James's homespun parser, there is a problem if the text files contain any multi-byte UTF characters (I was trying to parse UTF files in a similar manner when I came across this thread). Converting each byte to an individual string will break multi-byte characters apart, so I made some modifications.
In order to make this parser multi-byte friendly, you can store the growing lines in a ByteArray rather than a string. Then when you hit the end of a line (or a chunk, or the file), you can parse it as a UTF string (if necessary) without any problems:
var
out :ByteArray,
line_out :String,
line_end :Number,
char :int,
line:ByteArray;
out = new ByteArray();
line = new ByteArray();
while( file_stream.bytesAvailable > 0 )
{
char = file_stream.readByte();
if( (String.fromCharCode( char ) == "\n") )
{
// Do some processing on a line-by-line basis
line_out = ProcessLine( line );
line_out += "\n";
out.writeUTFBytes( line_out );
line = new ByteArray();
}
else
{
line.writeByte( char );
}
}
//Get the last line in there
out.writeBytes( line );
Answer 5:
stream.position < stream.bytesAvailable: wouldn't this condition become false once the position reaches the middle of the file? If the file is 10 bytes, then after you have read 5 bytes, bytesAvailable will be 5. I stored the initial value in another variable and used that in the condition. Other than that, I think it is pretty good.
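To illustrate, here are two ways the loop condition could be written, as an untested sketch that assumes the whole file is already buffered (opened synchronously with open, or after Event.COMPLETE on an async stream). The helper names countBytes and countBytesWithTotal are invented for the example; they just count bytes so the loop shape is easy to see.

```actionscript
import flash.filesystem.FileStream;

// Variant 1: bytesAvailable counts down to zero as bytes are read.
private function countBytes(stream:FileStream):uint {
    var count:uint = 0;
    while (stream.bytesAvailable > 0) {
        stream.readByte();
        count++;
    }
    return count;
}

// Variant 2: remember the total once, then compare position against it.
private function countBytesWithTotal(stream:FileStream):uint {
    // Assumes everything is buffered, so this sum is the end of the data.
    var total:Number = stream.position + stream.bytesAvailable;
    var count:uint = 0;
    while (stream.position < total) {
        stream.readByte();
        count++;
    }
    return count;
}
```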
Source: https://stackoverflow.com/questions/1371409/parsing-large-text-files-with-adobe-air