Extract text from doc and docx

死守一世寂寞 2020-11-27 16:24

I would like to know how I can read the contents of a doc or docx file. I'm using a Linux VPS and PHP, but if there is a simpler solution using another language, please let me know.

9 answers
  •  一整个雨季
    2020-11-27 16:53

    I made a few small improvements to the doc-to-txt converter function:

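    private function read_doc() {
        // Open in binary mode; give up if the file cannot be opened.
        $fileHandle = fopen( $this->filename, "rb" );
        if ( $fileHandle === false ) {
            return false;
        }
        $line = @fread( $fileHandle, filesize( $this->filename ) );
        fclose( $fileHandle );
        // Split the raw bytes on carriage returns (0x0D).
        $lines      = explode( chr( 0x0D ), $line );
        $line_array = array();
        foreach ( $lines as $thisline ) {
            // Skip chunks containing NUL bytes: these are most likely binary data, not text.
            if ( strpos( $thisline, chr( 0x00 ) ) !== false ) {
                continue;
            }
            // Keep only a conservative set of ASCII characters (non-ASCII text is dropped).
            $line_array[] = preg_replace( "/[^a-zA-Z0-9\s\,\.\-\n\r\t@\/\_\(\)]/", "", $thisline );
        }
        // Join with newlines so empty rows are preserved in the output.
        return implode( "\n", $line_array );
    }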
    

    Now it keeps empty rows, so the resulting txt file preserves the row-by-row layout of the document.
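
    The function above only handles the legacy binary doc format. For docx, which is a ZIP archive with the main body stored in word/document.xml, something along these lines might work. This is only a rough sketch and not part of the original converter: the function name read_docx is hypothetical, it assumes PHP's zip extension (ZipArchive) is installed, and it ignores headers, footers, footnotes and tables layout.

```php
<?php
// Hypothetical helper (not from the original class): plain text from a .docx file.
// Assumes the zip extension is available.
function read_docx( $filename ) {
    $zip = new ZipArchive();
    if ( $zip->open( $filename ) !== true ) {
        return false; // not a readable ZIP archive
    }
    // The main document body lives in word/document.xml.
    $xml = $zip->getFromName( 'word/document.xml' );
    $zip->close();
    if ( $xml === false ) {
        return false; // ZIP file, but not a Word document
    }
    // Turn paragraph ends into newlines before stripping the markup.
    $xml = str_replace( '</w:p>', "\n", $xml );
    // Remove all tags, then decode XML entities such as &amp;.
    return html_entity_decode( strip_tags( $xml ), ENT_QUOTES | ENT_XML1, 'UTF-8' );
}
```

    Unlike read_doc(), this keeps non-ASCII characters, because the XML is UTF-8 and nothing is filtered out. Calling read_docx( 'report.docx' ) (file name made up) returns the text as one string, or false if the file cannot be read.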
