I'm building a Portlet application deployed to a WebSphere Portal Server running on Linux. Every Portlet WAR uses Log4j for logging with a configuration like this, having ever
I'd try moving the log file location somewhere other than the temporary file system.
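One possible reason for this advice (my assumption, it isn't stated above): if a cleanup job removes files from the temporary file system while the application still holds them open, Linux lets the writes carry on without any error, and the output goes to a file that no longer has a name. A minimal sketch using only the JDK, with a hypothetical class name and a throwaway temp file:

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class DeletedLogFileDemo {

    /**
     * Returns true if writing through an already open stream still succeeds
     * after the file was deleted and the file has not reappeared on disk.
     */
    static boolean writeSucceedsAfterDelete() {
        try {
            Path log = Files.createTempFile("demo", ".log");
            try (FileOutputStream out = new FileOutputStream(log.toFile(), true)) {
                out.write("before delete\n".getBytes(StandardCharsets.UTF_8));
                // Simulates a cleanup job removing the file while it is still open.
                Files.delete(log);
                // On Linux this write targets the unlinked file and raises no error.
                out.write("after delete\n".getBytes(StandardCharsets.UTF_8));
                out.flush();
                return !Files.exists(log);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeSucceedsAfterDelete());
    }
}
```

On Linux I would expect this to report that the write succeeded even though the file is gone, which would look exactly like "logging silently stopped". On Windows the delete itself normally fails while the stream is open, so the behaviour differs there.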
I think the problem is that you have multiple WARs writing to the same log file. In our experience, log4j cannot do this reliably, particularly with the rolling appenders. When one of them rolls the file, the others get confused and either cannot log any further or continue to log to the old file.
I suspect you're going to have to have each WAR log to a different file.
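As a rough sketch of what "a different file per WAR" could look like, the helper below builds a log4j 1.x style configuration whose file name contains the WAR name. The class name, the WAR names, the directory and the pattern are all made up for illustration; only the appender name InfoAppender comes from this thread. It uses just java.util.Properties, so it runs without log4j on the classpath.

```java
import java.util.Properties;

/** Builds a log4j 1.x style configuration in which every WAR gets its own log file. */
final class PerWarLogConfig {

    /**
     * @param warName hypothetical identifier of the WAR, e.g. its context root
     * @param logDir  hypothetical directory that all WARs log into
     */
    static Properties forWar(String warName, String logDir) {
        // Keep only characters that are safe in a file name.
        String safeName = warName.replaceAll("[^A-Za-z0-9._-]", "_");

        Properties p = new Properties();
        p.setProperty("log4j.rootLogger", "INFO, InfoAppender");
        p.setProperty("log4j.appender.InfoAppender", "org.apache.log4j.RollingFileAppender");
        // The file name contains the WAR name, so no two WARs share a file.
        p.setProperty("log4j.appender.InfoAppender.File", logDir + "/" + safeName + "-info.log");
        p.setProperty("log4j.appender.InfoAppender.MaxFileSize", "10MB");
        p.setProperty("log4j.appender.InfoAppender.MaxBackupIndex", "5");
        p.setProperty("log4j.appender.InfoAppender.layout", "org.apache.log4j.PatternLayout");
        p.setProperty("log4j.appender.InfoAppender.layout.ConversionPattern", "%d %-5p [%c] %m%n");
        return p;
    }

    public static void main(String[] args) {
        Properties a = forWar("orders-portlet", "/var/log/portal");
        Properties b = forWar("billing-portlet", "/var/log/portal");
        System.out.println(a.getProperty("log4j.appender.InfoAppender.File"));
        System.out.println(b.getProperty("log4j.appender.InfoAppender.File"));
    }
}
```

Inside each WAR you would then hand the result to PropertyConfigurator.configure(Properties) during startup, or simply write the same keys into that WAR's own log4j.properties with a distinct File value.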
You've provided only some basic information, so I can only sketch some candidate causes and their likelihood:
1. Problem with file locks/handles/IO stream
Triggered by log rolling?
Negative in your case. Your two separate log files (info and debug) stop at the same time for any given WAR.
Each file rolls at the default maximum size (10MB). It's very unlikely that both logs would always roll at the same time, so the error cannot be triggered by log rolling. You can confirm this by configuring log4j.appender.InfoAppender.MaxFileSize=200MB
Triggered by users manipulating Linux files?
Negative in your case. It's possible that a user/sysadmin manipulating files could create locks or stale file handles. Linux should never have problems with a user tail-ing a file (but Windows does). Linux can have problems with users zipping or editing files. But your problem seems very repeatable, making this unlikely unless you have automated scripts manipulating log files.
Triggered by "competing" config settings in WebSphere or Spring, with duplicate use of the same log files by server/framework?
Seems unlikely in your case. It seems you haven't been setting the WebSphere commons logging configuration. Commons logging is automatically included in the WebSphere server parent ClassLoader and can be configured to "wrap" to Log4J by configuring:
File commons-logging.properties
# Set application classloader mode as PARENT_LAST when deploying in WAS as .ear
priority=1
org.apache.commons.logging.LogFactory=org.apache.commons.logging.impl.LogFactoryImpl
Triggered by hardware problems/disk failure?
??? Seems strange that such a problem would be very repeatable.
2. Problem with java threads?
massive thread processing/contention in "other" code, so that code with logging is not run
From your description, I assume that the application is still running and working fine with normal performance and functionality, but the logs are not written. Can you confirm? If so, then it's not a thread problem with the webapp threads.
Also I can confirm that it isn't a thread problem within the Log4J logic, because the only time it creates/uses its own thread is when one of AsyncAppender/ExternallyRolledFileAppender/SocketAppender/TelnetAppender is used OR when the PropertyConfigurator.configureAndWatch or DOMConfigurator.configureAndWatch method is called.
i.e. Negative.
3. Changes to Log4J classes in ClassLoaders, with use of different configuration?
Parent ClassLoader clashes with Webapp ClassLoader
E.g. your webapps initially start with their own configured classes from the WEB-INF directory and all is good, but some time later a different app (or one of the portal server admin tools) causes a clashing class to be loaded into the parent ClassLoader, and your app "picks up" this new illegal version of the class and fails.
Quite possibly a problem - thousands of users on Google have struggled with WebSphere class loaders.
Suggested Action:
ensure all your web apps use PARENT_LAST ClassLoading - go to the Admin console and ensure that they have PARENT_LAST set within ALL WebApp configurations
ensure you are getting Log4J internal error messages written to the console
E.g. deliberately test by forcibly deleting the error log as admin while the app is running, creating a stale handle. If "log4j:" error messages do not appear in the console, then this is a serious problem.
Next time the problem occurs, trap any such console messages and report them. Also, you can set "-Dlog4j.debug" on JVM/WebSphere startup, to find out precisely what Log4J was doing before/during the problem - messages will go to the console.
do you really need to set the logging level to DEBUG for all of your packages & classes? Better to set to INFO or WARN and only selectively set on when you are debugging specific problems?
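To check the class loader theory from point 3, it can help to print where a class was actually loaded from, once right after startup and again after logging has stopped. This is only a sketch with a hypothetical class name (ClassOrigin); it uses plain JDK calls. In your portlet you would pass org.apache.log4j.Logger.class, here the default is java.lang.String so that it runs without log4j.

```java
import java.security.CodeSource;

final class ClassOrigin {

    /** Describes which class loader and which location a class was loaded from. */
    static String describe(Class<?> type) {
        ClassLoader loader = type.getClassLoader();
        CodeSource source = type.getProtectionDomain().getCodeSource();
        // Classes from the JDK itself have no class loader object and no code source.
        String loaderName = (loader == null) ? "bootstrap" : loader.getClass().getName();
        String location = (source == null || source.getLocation() == null)
                ? "unknown"
                : source.getLocation().toString();
        return type.getName() + " loaded by " + loaderName + " from " + location;
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass a class name as the first argument, e.g. org.apache.log4j.Logger
        String name = args.length > 0 ? args[0] : "java.lang.String";
        System.out.println(describe(Class.forName(name)));
    }
}
```

If the location reported for the Log4J classes is not the WAR's own WEB-INF/lib, or it differs between WARs that you expected to be isolated, that would support the PARENT_LAST suspicion.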
That's a lot of text.......... B^)
I'm not sure why log4j is stopping in your application. But you could (should) upgrade to log4j 2.0. Switching shouldn't be much effort. You will need to rewrite your log4j.properties file as an XML file, because log4j 2.0 does not support properties files (a properties format, with a syntax different from 1.x, was only added in 2.4).
An article in Java Magazin stated that log4j 2.0 behaves more robustly in multithreaded environments, so there is a chance it will fix your issue. If it doesn't, you still have the benefits of the new version.
It brings some nice features and enhancements (copied from the log4j site):
API Separation
The API for Log4j is separate from the implementation making it clear for application developers which classes and methods they can use while ensuring forward compatibility. This allows the Log4j team to improve the implementation safely and in a compatible manner.
Improved Performance
Log4j 2 performs faster than Log4j 1.x in critical areas and similarly to Logback under most circumstances. See Performance for more information.
Support for multiple APIs
While the Log4j 2 API will provide the best performance, Log4j 2 provides support for the SLF4J and Commons Logging APIs.
Automatic Reloading of Configurations
Like Logback, Log4j 2 can automatically reload its configuration upon modification. Unlike Logback, it will do so without losing log events while reconfiguration is taking place.
Advanced Filtering
Like Logback, Log4j 2 supports filtering based on context data, markers, regular expressions, and other components in the Log event. Filtering can be specified to apply to all events before being passed to Loggers or as they pass through Appenders. In addition, filters can also be associated with Loggers. Unlike Logback, you can use a common Filter class in any of these circumstances.
Plugin Architecture
Log4j uses the plugin pattern to configure components. As such, you do not need to write code to create and configure an Appender, Layout, Pattern Converter, and so on. Log4j automatically recognizes plugins and uses them when a configuration references them.
Property Support
You can reference properties in a configuration, Log4j will directly replace them, or Log4j will pass them to an underlying component that will dynamically resolve them. Properties come from values defined in the configuration file, system properties, environment variables, the ThreadContext Map, and data present in the event. Users can further customize the property providers by adding their own Lookup Plugin.
For over 5 years, Log4j has had hardly any bugs fixed: it's effectively a dead project. If acceptable, consider replacing it with Logback, which implements SLF4J directly.
Logback and SLF4J were written by the same guy who wrote Log4J (Ceki), have an even more liberal license and have a good community. Logback is the successor to Log4J 1 in every way possible (except for its name).