
Saturday, November 10, 2012

SECJ0305I Error, Node Agent not Running

This morning I got to work and learned that one of our node agents was not running. What made this unusual was that the node agent in question runs on a virtual machine that had recently been moved from one data center to another across town, which changed its IP address. After checking the node agent's SystemOut.log, I determined that the node agent had started successfully the last time the server was restarted.

WSVR0001I: Server nodeagent open for e-business

Puzzled at this point, I pinged all three servers from each server and got a reply every time. I then restarted the Windows service that runs the node agent, and it failed to start. Looking into the log, I discovered a new error stack that started with

CWWIM4512E The password match failed.

and ended with

SECJ0305I: The role-based authorization check failed for admin-authz operation

I did some Googling and found this tech note from IBM that recommends "restart(ing) the deployment manager, node agents, and servers." This resolved the original issue where the node agent showed as not running. However, when I restarted the node agent service again to show a peer what I had done, I got the same error, yet the node agent status in the deployment manager never changed from running. :( I will post a follow-up on this issue once it is worked out.
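When an error stack like this is long, it can help to pull out just the message IDs so the first and last codes stand out. The following is a minimal sketch, not an IBM tool; the ID pattern (four or five uppercase letters, four digits, one severity letter) is an assumption based on the codes quoted in this post.

```python
import re
from collections import Counter

# Assumed shape of a WebSphere message ID, based on the codes seen above:
# 4-5 uppercase letters, 4 digits, then a severity letter (I, W, E, ...).
MESSAGE_ID = re.compile(r"\b([A-Z]{4,5}\d{4}[A-Z])\b")


def message_ids(lines):
    """Count message IDs, keeping them in order of first appearance."""
    counts = Counter()
    for line in lines:
        counts.update(MESSAGE_ID.findall(line))
    return counts


# Sample lines copied from the log excerpts in this post.
sample = [
    "WSVR0001I: Server nodeagent open for e-business",
    "CWWIM4512E The password match failed.",
    "SECJ0305I: The role-based authorization check failed for admin-authz operation",
]

for msg_id, count in message_ids(sample).items():
    print(msg_id, count)
```

In practice the lines would come from the node agent's log file rather than a hard-coded list.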


Saturday, October 27, 2012

Growing access.logs for IBM HTTP Server

While working on some performance issues in production, we discovered that our web server access.log files were in excess of 2 GB and still growing. After some research we decided it was time to implement piped logs. We first tried to edit the httpd.conf file for the web server through the WebSphere admin console. We replaced the following

CustomLog logs/access.log common

with

CustomLog "|bin/rotatelogs logs/access.%Y.%m.%d 86400" common 

We stopped the web server and tried to start it; after a 30-second pause we got a message saying that the web server could not be started. Further research into the log files for the deployment manager and web server did not shed any light on why. After some digging on IBM's support site I found this technote explaining that the full path must be specified. I replaced the line above with the one below (adjusting the file name format to better fit our needs), and the server started without issue. Our log files now roll on a 24-hour basis.

CustomLog "|C:/IBM/HTTPServer/bin/rotatelogs.exe -l C:/IBM/HTTPServer/logs/access-%a-%m-%d-%Y.log 86400" common 
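To preview what file names that pattern produces, the same strftime codes can be expanded in a few lines of Python. This is only an illustration of the naming scheme, assuming the default C locale for the day abbreviation; it does not model how rotatelogs picks its rotation boundaries, and the start date is simply the date of this post.

```python
from datetime import datetime, timedelta

# File name pattern and interval taken from the CustomLog line above.
PATTERN = "access-%a-%m-%d-%Y.log"
ROTATION_SECONDS = 86400


def log_names(first_day, days):
    """Expand the pattern once per rotation interval, starting at first_day."""
    step = timedelta(seconds=ROTATION_SECONDS)
    return [(first_day + i * step).strftime(PATTERN) for i in range(days)]


# Example start date (the date of this post), three rotations.
for name in log_names(datetime(2012, 10, 27), 3):
    print(name)
```

Because the year comes last in the pattern, the files will not sort chronologically by name; a pattern that starts with %Y would.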

Saturday, October 13, 2012

Maximo 7.5.0.2 and WebSphere 8

We are in the process of setting up Maximo 7.5.0.2, and recently my lead ran across this link from IBM that outlines how to do just that with WebSphere 8 as the middleware. Currently we have Maximo 7.5 set up with WebSphere 7 but are evaluating an upgrade to WebSphere 8. This tech journal from IBM outlines some interesting points on why this might be a good idea. The one I found most interesting was the change to the addNode command, which adds an asExistingNode option.

Node recovery
A new optional argument for the addNode command, asExistingNode, makes it easier to move or recover nodes. When this command argument is used, the node is added using the configuration from the Deployment Manager for that node. As a result, a node can be easily moved to a new server and, if need be, there are provisions for changing the host name during this process. In the case of a hardware failure, the node can be easily recovered back to the last configuration reflected in the cell configuration maintained by the Deployment Manager. These two scenarios are depicted in Figure 5.
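For reference, here is a rough sketch of what the recovery invocation might look like when built from a script. The host name is hypothetical, and port 8879 is assumed to be the deployment manager's SOAP port (the usual default); the sketch only assembles the command line and does not run it.

```python
def add_node_command(dmgr_host, dmgr_port=8879, as_existing=True):
    """Build the argument list for addNode.bat on Windows.

    dmgr_host is a placeholder; dmgr_port defaults to the assumed
    deployment manager SOAP port.
    """
    cmd = ["addNode.bat", dmgr_host, str(dmgr_port)]
    if as_existing:
        # Reuse the node configuration already held by the deployment manager.
        cmd.append("-asExistingNode")
    return cmd


# Hypothetical deployment manager host, for illustration only.
print(" ".join(add_node_command("dmgr.example.com")))
```

The resulting list could be handed to subprocess.run from the node profile's bin directory once the values have been checked against the real cell.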


Saturday, September 29, 2012

When in Doubt, Encrypt It!


Recently I ran into trouble getting two Maximo environments up, one 7.1 and the other 7.5. We got several different errors, and none pointed us in a single direction. After working with IBM we identified one important step that we were missing: running encryptproperties.bat. This is required after any change to the Maximo properties file on all 7.x versions of Maximo.


Here are some of the errors we got:
  • BMXAA6539E - Failed to initialize the MAXIMOStartupServlet 
  • WSVR0209E: Unable to start EJB jar mboejb.jar
    • java.lang.NoClassDefFoundError: psdi.iface.jms.JMSListenerBean
  • WSVR0100W: An error occurred initializing, MAXIMO
  • WAE0008E An error occurred reading mbojava.jar
Another anomaly that will be researched further had to do with the sequence required to get past the errors above.
  1. Deploy and start the app on the App Server / Cluster - expect a class-loading, WSVR-like error
  2. Stop App Server / Cluster
  3. Start App Server / Cluster - the app should load past the error in step 1.

Saturday, September 8, 2012

The Pain of Windows and WebSphere 6.1

This is a repost of a blog entry I did on the websphereusergroup.org site. Since this blog is about me getting organized and keeping track of things I run into, I am reposting it here.

Last Friday we started to experience slowness in the response from our production web servers. CPU and memory usage on each server was not above normal. The application servers were responding as expected; only traffic directed to or through the web servers slowed. At first the reports seemed to be only a nuisance, but as time went on the web servers went from a 30-second response time to four or five minutes.
In the web server logs I found the following message, sometimes several times a second:

[Fri Apr 06 07:54:07 2012] [warn] (OS 64)The specified network name is no longer available. : winnt_accept: Asynchronous AcceptEx failed.

The frequency of the messages increased until each server was logging as many as 6 or 7 a second. The slowness got progressively worse until our entire collection of web servers had crashed. After the first web server crashed, I tried to restart it using the console, but it would not start. Once the other two web servers crashed, production was no longer available, so we rebooted the OS on the first web server that had crashed; it was already broken, so what more could happen? After the reboot the web server started normally and the error messages were no longer being generated.
We then logged a PMR with IBM for more information about this issue. Within an hour I received a response indicating that this is a known issue in Windows environments where "other vendor's software may be installed which does not correctly implement AcceptEx or other Winsock functions" http://publib.boulder.ibm.com/httpserv/ihsdiag/errorlog.html#LSP
We had read online that other vendors' software could include “anti-virus, firewall, virtualization, or vpn” http://rob.brooks-bilson.com/index.cfm/2008/1/4/Intermittent-Apache-Problems-and-winntaccept-Asynchronous-AcceptEx-failed After the outage we went back to each server and verified in Add/Remove Programs that no updates or new software had been installed in the last day. Antivirus updates had run, but several hours before the first record of the error in the log file. IBM did let us know that a fix exists for this error; however, the version of Apache that WebSphere 6.1 runs does not support it.
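Looking back, it would have been useful to watch how often the AcceptEx warning was firing instead of eyeballing the log. The following is a small sketch, assuming the error log timestamp format shown in the message above and the default C locale for day and month names; the second sample line is a made-up unrelated entry added only to show that other lines are ignored.

```python
from collections import Counter
from datetime import datetime

MARKER = "winnt_accept: Asynchronous AcceptEx failed"
# Timestamp layout assumed from the sample: [Fri Apr 06 07:54:07 2012]
STAMP_FORMAT = "%a %b %d %H:%M:%S %Y"


def acceptex_per_second(lines):
    """Count AcceptEx warnings per one-second timestamp."""
    counts = Counter()
    for line in lines:
        if MARKER not in line or not line.startswith("["):
            continue
        stamp, closing, _ = line[1:].partition("]")
        if not closing:
            continue  # no closing bracket, so not a timestamped line
        counts[datetime.strptime(stamp, STAMP_FORMAT)] += 1
    return counts


warning = ("[Fri Apr 06 07:54:07 2012] [warn] (OS 64)The specified network name "
           "is no longer available. : winnt_accept: Asynchronous AcceptEx failed.")
# The second entry is an invented, unrelated line for illustration.
sample = [warning, "[Fri Apr 06 07:54:07 2012] [notice] unrelated entry", warning]

for second, count in sorted(acceptex_per_second(sample).items()):
    print(second, count)
```

A count that keeps climbing from one second to the next would have been an early signal that the servers were heading toward the crash described above.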

Saturday, September 1, 2012

MXServerRemoteImpl (Incompatible magic value 169877536)

Recently several of our servers have been complaining about an “Incompatible magic value” when we try to connect with the TRM Rules Manager IDE, Eclipse. We narrowed it down to an application issue, specifically the way the ear file is built. Everything works as expected when we do not use application server security, but when we enable application server security we get the incompatible magic value error.
After contacting TRM, it turns out that this is related to the use of the FORM auth-method instead of the BASIC auth-method. The difference between FORM and BASIC is that with BASIC, instead of getting the clean-looking IBM Maximo login page, you get a generic-looking browser prompt to enter your user name and password, but Eclipse can connect.
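A quick way to confirm which auth-method a given build ended up with is to read it out of the web module's web.xml. The following is a minimal sketch; the sample descriptor is a trimmed, hypothetical web.xml rather than the one Maximo ships, and element namespaces are ignored so it works across descriptor versions.

```python
import xml.etree.ElementTree as ET


def auth_method(web_xml_text):
    """Return the text of the first auth-method element, or None if absent."""
    root = ET.fromstring(web_xml_text)
    for elem in root.iter():
        # Drop any "{namespace}" prefix so the lookup is version-agnostic.
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name == "auth-method":
            return (elem.text or "").strip()
    return None


# Hypothetical, trimmed descriptor for illustration only.
SAMPLE_WEB_XML = """<web-app xmlns="http://java.sun.com/xml/ns/javaee" version="2.5">
  <login-config>
    <auth-method>FORM</auth-method>
  </login-config>
</web-app>"""

print(auth_method(SAMPLE_WEB_XML))
```

To use it on a real build, the web.xml would first need to be extracted from the war inside the ear file.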

Troubleshooting this issue required me to grow my ANT knowledge so I wanted to include some of the links I found along the way.
This link is a hack of sorts in that it is a directory rather than an HTML page, but it lists all of the ANT commands.