We have a web app that allows users to upload documents, create their own documents, and so on. Uploaded files are stored on Amazon S3, created information is stored in a My
Lucene is very good. Although it was originally written in Java, there is a PHP implementation: http://framework.zend.com/manual/en/zend.search.lucene.html
There is a Ruby port of Lucene called "Ferret". In addition to the Ruby API, you can get at the underlying C implementation, called "cFerret".
There is also Xapian, which is fast and quite customizable.
It has support for custom indexers, allowing one to index data that is not stored in a database, which might be useful for your documents stored on S3.
I imagine that Google will have a solution that meets your needs. Start here: Google Enterprise
Sphinx may be worth your consideration, as it works well with several common RDBMSs (notably MySQL).
Take a look at Solr. It's based on Lucene, so it's very fast, and it's really easy to use from any platform.
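To illustrate the "any platform" point: Solr is queried over plain HTTP, so a client only needs to build a URL and parse the response. Here is a minimal sketch in Python, assuming a Solr instance on the default port 8983 and a core named `documents` (the core name, the helper function, and the search text are all made up for the example):

```python
from urllib.parse import urlencode


def build_solr_query_url(base_url, core, text, rows=10):
    """Build the URL for a basic query against Solr's /select handler.

    base_url and core are assumptions about your deployment;
    q, rows and wt are standard Solr query parameters.
    """
    params = urlencode({
        "q": text,     # the search query
        "rows": rows,  # maximum number of results to return
        "wt": "json",  # ask for a JSON response
    })
    return f"{base_url.rstrip('/')}/solr/{core}/select?{params}"


# Hypothetical deployment: local Solr, core named "documents".
url = build_solr_query_url("http://localhost:8983", "documents", "annual report", rows=5)
print(url)
```

Fetching that URL with any HTTP client (for example `urllib.request.urlopen`) should return the matching documents as JSON, provided the core exists and has been indexed.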