Getting the file name from a text file after string matching - PHP

Submitted by 爷,独闯天下 on 2019-12-12 05:49:44

Question


I have a log file (log.txt) in the form:

=========================================
March 01 2050 13:05:00 log v.2.6 
General Option: [default] log_options.xml
========================================= 
Loaded options from xml file: '/the/path/of/log_options.xml'
printPDF started
PDF export
PDF file created:'/path/of/file.1.pdf'
postProcessingDocument started
INDD file removed:'/path/of/file.1.indd'
Error opening document: '/path/of/some/filesomething.indd':Error: file doesnt exist or no permissions 
=========================================
March 01 2050 14:15:00 log v.2.6 
General Option: [default] log_options.xml
========================================= 
Loaded options from xml file: '/the/path/of/log_options.xml'
extendedprintPDF started
extendedprintPDF: Error: Unsaved documents have no full name: line xyz

Note: Each file name has the format: three letters, a date, some name, then _LO.pdf or _LO.indd. Example: MNM011112ThisFile_LO.pdf. Also, the entry for a given day and time may contain only errors, only the message about the created file, or both, as shown here.

The file continues this way. I also have a database table in the form:

id  itemName status
1   file     NULL

And so on...

Now, I am expected to go through the log file and, for each file that is created or that has an error, update the last column of the table with the appropriate message: "File created" or "Error". I thought of searching for the string "PDF file created" or "Error" and then grabbing the file name.

I have tried various things like pathinfo() and strpos(), but I can't work out how to get it done.

Can someone please give me some input on how I can solve this? The text file and the database are both very large.

NOTE: I provided the 2nd entry of the log file to be clear that the format in which errors appear IS NOT consistent. I would like to know if I can still achieve what I am supposed to with an inconsistent format for errors. Can somebody please help after reading the whole question again? There have been plenty of changes from the first time I posted this.


Answer 1:


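You can use PHP's explode() function to break each line of your file into words. If the fields in your text file are tab separated, use explode("\t", $line); note the argument order (separator first) and the double quotes, since '\t' in single quotes is a literal backslash followed by t, not a tab. If the fields are space separated, explode on a space instead.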

Then a simple substr($word, $start_index, $length) on each word can give you the name of the file (here $start_index should be 0).
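As a rough illustration of the explode()/substr() idea, here is a minimal sketch. The sample line follows the log format in the question, with the example file name from the question's note substituted in; the variable names and the six-digit date length are assumptions based on that single example.

```php
<?php
// Sample line in the question's log format (file name taken from the question's note).
$line = "PDF file created:'/path/of/MNM011112ThisFile_LO.pdf'";

// Split on the first colon only: the label goes left, the quoted path goes right.
$parts = explode(':', $line, 2);
$label = $parts[0];
$path  = trim($parts[1], " '");   // strip surrounding spaces and single quotes

// basename() drops the directory part of the path.
$fileName = basename($path);

// The note says names start with three letters followed by a date.
$prefix = substr($fileName, 0, 3);
$date   = substr($fileName, 3, 6);  // assumes a six-digit date, as in the example

// Tab-separated fields need double quotes: "\t" is a tab character.
$fields = explode("\t", "1\tfile\tNULL");
```

Splitting with a limit of 2 keeps any later colons (for example in an error message) inside the second piece instead of scattering them across the array.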

mysql_connect() will connect you to a MySQL database, but the mysql_* extension was deprecated in PHP 5.5 and removed in PHP 7.0. A better choice is PDO (PHP Data Objects), which supports prepared statements and makes your code more reliable and flexible.
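A minimal sketch of the PDO approach follows. To stay self-contained it uses an in-memory SQLite database (this assumes the pdo_sqlite extension is enabled); for MySQL you would swap in your own DSN and credentials. The table name items is hypothetical; the column names come from the question.

```php
<?php
// In-memory SQLite keeps the sketch self-contained; replace the DSN for MySQL.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical table mirroring the columns shown in the question.
$pdo->exec('CREATE TABLE items (id INTEGER PRIMARY KEY, itemName TEXT, status TEXT)');
$pdo->exec("INSERT INTO items (itemName, status) VALUES ('file.1.pdf', NULL)");

// Prepare once; execute once per file found in the log.
$update = $pdo->prepare('UPDATE items SET status = :status WHERE itemName = :name');
$update->execute(array(':status' => 'File created', ':name' => 'file.1.pdf'));
$updatedRows = $update->rowCount();

// Read the value back to confirm the update.
$status = $pdo->query("SELECT status FROM items WHERE itemName = 'file.1.pdf'")->fetchColumn();
```

Because the values are bound as parameters, quotes or other special characters in the log text cannot break the query.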

Another option is to use preg_match() with a regular expression that matches your error message and captures the file name.
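The preg_match() idea could look something like the sketch below. The helper name classifyLogLine() is hypothetical, and the patterns assume that file paths are wrapped in single quotes and that error lines contain the word "Error", as in the sample log; error lines that do not name a file yield null for the file name.

```php
<?php
// Hypothetical helper: returns array(status, fileName) for a log line,
// or null when the line mentions neither a created PDF nor an error.
function classifyLogLine($line)
{
    // "PDF file created:'<path>'": the path is wrapped in single quotes.
    if (preg_match('~^PDF file created:\s*\'([^\']+)\'~', $line, $m)) {
        return array('File created', basename($m[1]));
    }

    // Error lines vary in shape; look for the word "Error" anywhere in the line.
    if (preg_match('~\bError\b~', $line)) {
        // Take the first single-quoted path if the line has one.
        $file = preg_match('~\'([^\']+)\'~', $line, $m) ? basename($m[1]) : null;
        return array('Error', $file);
    }

    return null;
}
```

With the sample log, the "Error opening document" line yields the .indd file name, while the "Unsaved documents" line yields an error with no file name, so that case still needs a separate rule for deciding which row to update.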

You can refer to php.net manual for help any time.




Answer 2:


Are all of the files PDFs? If so you can do a regex search on files with the .pdf extension. However, if the filename is also contained in the error string, you will need to exclude that somehow.

// Assume filenames contain only upper/lowercase letters, 0-9, underscores, periods, dashes, and forward slashes.
// "~" is used as the delimiter so that "/" can appear inside the character class.
preg_match_all('~([a-zA-Z0-9_./-]+\.pdf)~', $log_file_contents, $matches);
// $matches[1] should be an array containing each filename.
// You can do array_unique($matches[1]) to exclude duplicates.

Edit: Keep in mind, $matches will be a multi-dimensional array, as described at http://php.net/manual/en/function.preg-match-all.php and http://php.net/manual/en/function.preg-match.php

To test a regular expression, you can use http://regexpal.com/
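To make the shape of $matches concrete, here is a small sketch. The log text is made up from the question's "PDF file created" line, and the pattern uses "~" as its delimiter so that "/" can sit inside the character class.

```php
<?php
// Three sample lines in the question's format; the first path appears twice.
$log = "PDF file created:'/path/of/file.1.pdf'\n"
     . "PDF file created:'/path/of/file.2.pdf'\n"
     . "PDF file created:'/path/of/file.1.pdf'\n";

preg_match_all('~([a-zA-Z0-9_./-]+\.pdf)~', $log, $matches);

// $matches[0] holds the full matches, $matches[1] the first capture group.
// array_unique() keeps the original keys, so array_values() reindexes from 0.
$paths = array_values(array_unique($matches[1]));

// Reduce each path to its file name.
$names = array_map('basename', $paths);
```

Here $matches[0] and $matches[1] happen to be identical because the capture group spans the whole pattern; they differ as soon as the pattern includes text outside the parentheses.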




Answer 3:


Okay, so the main issue here is that either you don't have a consistent delimiter for "entries" or you are not providing enough info. Based on what you have provided, here is my suggestion. The main caveat is that without a solid delimiter for "entries", there is no way to know for sure whether an error belongs to a given file name. The only way to fix this is to format your file better. You will also have to fill in some blanks, such as your db info and how you actually perform the query.

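$files = array();
$errors = array();

$handle = fopen("log.txt", "rb");
// fgets() reads one line at a time, so the patterns below see a single log line
while (($row = fgets($handle)) !== false) {
  // drop the trailing newline and spaces
  $row = rtrim($row);

  // get file names (the log wraps the path in single quotes)
  if ( preg_match('~^PDF file created:\s*\'(.*?)\'$~', $row, $match) ) {
    $files[] = $match[1];
  }

  // get errors ("Error:" is not always at the start of the line)
  if ( preg_match('~Error:\s*(.*)$~', $row, $match) ) {
    $errors[] = $match[1];
  }
}
fclose($handle);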

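// connect to db (fill in your own connection; $pdo below is assumed to be a PDO instance)

foreach ($files as $k => $file) {
  // assumes your table just has basename of file
  $file = basename($file);

  // caveat: pairing by index only works if errors and files line up one-to-one
  $error = ( isset($errors[$k]) ) ? $errors[$k] : null;

  // placeholders keep quotes in the log text from breaking the query
  $sql = "update tablename set status = ? where itemName = ?";

  // execute query, e.g. $pdo->prepare($sql)->execute(array($error, $file));
}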

EDIT: Going back to your post, it looks like you want to update existing rows rather than insert new ones, so the query is now an UPDATE. You may need to process $file further inside the foreach for your WHERE clause, depending on how file names are stored in your db (for example, if you store only the base name, use $file = basename($file); in the foreach). The code has been updated to reflect this.

So hopefully this will point you in the right direction.
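If the header block in the sample (a row of "=" characters, a timestamp line, an options line, another row of "=") really does open every entry, the timestamp line can serve as the entry delimiter. The sketch below rests on that assumption; the helper name groupLogEntries() is hypothetical, and entries that share a timestamp would overwrite each other.

```php
<?php
// Hypothetical helper: groups log lines into entries keyed by their timestamp.
function groupLogEntries(array $lines)
{
    $entries = array();
    $current = null;

    foreach ($lines as $line) {
        $line = rtrim($line);

        // Assumed header line, e.g. "March 01 2050 13:05:00 log v.2.6".
        if (preg_match('~^(\w+ \d{2} \d{4} \d{2}:\d{2}:\d{2}) log~', $line, $m)) {
            $current = $m[1];
            $entries[$current] = array('created' => array(), 'errors' => array());
            continue;
        }
        if ($current === null) {
            continue; // skip anything before the first header
        }

        if (preg_match('~^PDF file created:\s*\'([^\']+)\'~', $line, $m)) {
            $entries[$current]['created'][] = basename($m[1]);
        } elseif (preg_match('~\bError\b~', $line)) {
            $entries[$current]['errors'][] = $line; // keep the whole error line
        }
    }

    return $entries;
}
```

For a large log you could call this with file('log.txt') if memory allows, or adapt the loop to read with fgets() so that only one line is held at a time.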



Source: https://stackoverflow.com/questions/13706775/getting-the-file-name-from-a-text-file-after-string-matching-php
