Saturday, November 10, 2012

SECJ0305I Error, Node Agent not Running

This morning I got to work and learned that one of our node agents was not running. What made this unique was that the node agent in question was running on a virtual machine that had recently been moved from one data center to another across town, which changed its IP address. After checking the node agent's SystemOut.log, I determined that the node agent had started successfully the last time the server had been restarted.

WSVR0001I: Server nodeagent open for e-business

Puzzled at this point, I pinged all three servers from each server and got a reply every time. I then restarted the Windows service that was running the node agent, and it failed to restart. Looking into the log, I discovered a new error stack that started with

CWWIM4512E The password match failed.

and ended with

SECJ0305I: The role-based authorization check failed for admin-authz operation

I did some Googling and found this tech note from IBM that recommends "restart(ing) the deployment manager, node agents, and servers." This resolved the original issue where the node agent showed as not running. However, when I tried to restart the node agent service to show a peer what I had done, I got the same error, yet the node agent status never changed from running in the Deployment Manager :(  I will post a follow-up to this issue at a later date (when it is worked out).

Saturday, October 27, 2012

Growing access.logs for IBM HTTP Server

While working on some performance issues in production we discovered that our web server access.log files were in excess of 2 GB and still growing strong. After some research we decided it was time to implement piped logs. We first tried to edit the httpd.conf file for the web server through the WebSphere admin console. We replaced the following line

CustomLog logs/access.log common

with

CustomLog "|bin/rotatelogs logs/access.%Y.%m.%d 86400" common

We stopped the web server and tried to start it, and after a 30-second pause we got a message back saying that the web server could not be started. Further research into the log files for the deployment manager and web server did not shed any light on why. After some digging on IBM's support site I found this technote explaining that the full path must be specified. I replaced the line above with the one below (adjusting the file name format to better fit our needs), and the server started without issue. Our log files now roll over every 24 hours.

CustomLog "|C:/IBM/HTTPServer/bin/rotatelogs.exe -l C:/IBM/HTTPServer/logs/access-%a-%m-%d-%Y.log 86400" common 
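To get a feel for the file names that the strftime-style pattern above produces, here is a small Python sketch. It only mimics the name expansion and is not how rotatelogs itself is implemented; the helper name `rotated_name` is made up for illustration. Python's `strftime` understands the same `%a`, `%m`, `%d` and `%Y` codes.

```python
from datetime import datetime, timedelta

# File name part of the pattern from the CustomLog line above.
PATTERN = "access-%a-%m-%d-%Y.log"

def rotated_name(when, pattern=PATTERN):
    """Hypothetical helper: expand the strftime codes for a given moment."""
    return when.strftime(pattern)

if __name__ == "__main__":
    start = datetime(2012, 10, 27)
    # 86400 seconds is one day, so each day gets its own file name.
    for day in range(3):
        print(rotated_name(start + timedelta(seconds=86400 * day)))
```

Because the name includes the full date, a new file is created on each rotation and old ones are never overwritten, so something else still has to clean them up.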

Saturday, October 13, 2012

Maximo and WebSphere 8

We are in the process of setting up Maximo, and recently my lead ran across this link from IBM that outlines how to do just that but with WebSphere 8 as the middleware. Currently we have Maximo 7.5 set up with WebSphere 7 but are evaluating upgrading WebSphere to 8. This tech journal from IBM outlines some interesting points on why this might be a good idea. The one that I found the most interesting was the change to the addNode command that adds an asExistingNode option.

Node recovery
A new optional argument for the addNode command, asExistingNode, makes it easier to move or recover nodes. When this command argument is used, the node is added using the configuration from the Deployment Manager for that node. As a result, a node can be easily moved to a new server and, if need be, there are provisions for changing the host name during this process. In the case of a hardware failure, the node can be easily recovered back to the last configuration reflected in the cell configuration maintained by the Deployment Manager. These two scenarios are depicted in Figure 5.

Saturday, September 29, 2012

When in Doubt, Encrypt It!

Recently I ran into trouble getting two Maximo environments up, one 7.1 and the other 7.5. We got several different errors, and none pointed us in a single direction. After working with IBM we identified one important step that we were missing: running encryptproperties.bat. This is required after any change to the Maximo properties file on all 7.x versions of Maximo.

Here are some of the errors we got
  • BMXAA6539E - Failed to initialize the MAXIMOStartupServlet 
  • WSVR0209E: Unable to start EJB jar mboejb.jar
    • java.lang.NoClassDefFoundError: psdi.iface.jms.JMSListenerBean
  • WSVR0100W: An error occurred initializing, MAXIMO
  • WAE0008E An error occurred reading mbojava.jar
Another anomaly that will be researched further had to do with the sequence required to get past the errors above.
  1. Deploy the app to the App Server / Cluster and start it - expect a class-loading, WSVR-like error
  2. Stop App Server / Cluster
  3. Start App Server / Cluster - the app should load past the error in step 1.

Saturday, September 8, 2012

The Pain of Windows and WebSphere 6.1

This is a repost of a blog entry I did on the site. Since this blog is about me getting organized and keeping track of things I run into, I am reposting it here.

Last Friday we started to experience slowness in the response from our production web servers. The CPU and memory usage on each server was not above normal. The application servers were responding as expected; only traffic directed to or through the web servers slowed. At first the report seemed to be only a nuisance and nothing more, but as time went on the web servers went from a 30-second response time to four or five minutes.
In the web server logs I found the following message, sometimes several times a second:

[Fri Apr 06 07:54:07 2012] [warn] (OS 64)The specified network name is no longer available. : winnt_accept: Asynchronous AcceptEx failed.

The frequency of the messages increased, and on each server we were getting as many as 6 or 7 a second. The slowness of the servers got progressively worse until our entire collection of web servers had crashed. After the crash of the first web server I tried to restart it using the console, but the web server would not start. After the other two web servers crashed, production was no longer available, so we restarted the OS on the first web server that crashed; it was already broken, so what more could happen? After the reboot the web server started back up normally and the error messages were no longer being generated.
We then logged a PMR with IBM for more information about this issue. Within an hour I received a response indicating that this is a known issue in a Windows environment where "other vendor's software may be installed which does not correctly implement AcceptEx or other Winsock functions".
We had read online that other vendors' software could include “anti-virus, firewall, virtualization, or vpn”. Post outage we returned to each server and verified in Add/Remove Programs that no updates or new software had been installed in the last day. Antivirus updates had run, but several hours before the first record of the error in the log file. IBM did let us know that a fix does exist for this error; however, the version of Apache that WebSphere 6.1 runs does not support the fix.
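Counting how often the AcceptEx warning shows up per second is easy to script. The following is a minimal Python sketch, assuming the error log lines look like the one quoted above; the function name `acceptex_per_second` is made up for illustration.

```python
import re
from collections import Counter

# Captures the timestamp of lines such as
# "[Fri Apr 06 07:54:07 2012] [warn] ... Asynchronous AcceptEx failed."
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\] \[warn\] .*AcceptEx failed")

def acceptex_per_second(lines):
    """Count 'AcceptEx failed' warnings for each one-second timestamp."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            counts[match.group("ts")] += 1
    return counts

if __name__ == "__main__":
    sample = [
        "[Fri Apr 06 07:54:07 2012] [warn] (OS 64)The specified network name "
        "is no longer available. : winnt_accept: Asynchronous AcceptEx failed.",
    ] * 3
    # Print the busiest seconds first.
    for ts, n in acceptex_per_second(sample).most_common(5):
        print(n, ts)
```

The function accepts any iterable of lines, so an open log file can be passed in directly instead of the sample list.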

Saturday, September 1, 2012

MXServerRemoteImpl (Incompatible magic value 169877536)

Recently several of our servers have been complaining about an “Incompatible magic value” when we try to connect with the TRM Rules Manager IDE, Eclipse. We narrowed it down to an application issue, specifically the way that the ear file is built. Everything works as expected when we do not use application server security, but when we enable application server security we get the incompatible magic value error.
After contacting TRM it turns out that this is related to the use of the FORM auth-method instead of the BASIC auth-method. The difference between FORM and BASIC is that with BASIC, instead of getting the clean-looking IBM Maximo login page, you get a generic-looking browser prompt to enter your user name and password, but Eclipse can connect.

Troubleshooting this issue required me to grow my Ant knowledge, so I wanted to include some of the links I found along the way.
This link is a hack of sorts in that it is a directory and not an HTML page that represents all of the Ant commands.

Saturday, August 25, 2012

WebSphere host name changes

Recently we went from physical to virtual on all of the nodes within our WebSphere environment. Recent networking issues made me a tad bit paranoid about this, so I opened a PMR with IBM to find out how this could affect us. What I learned was that this would not be an issue because WebSphere only stores the host name and IP address, and does so in the serverindex.xml.
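For anyone who wants to see which host values a serverindex.xml actually holds, here is a rough Python sketch. The attribute names `hostName` and `host` and the sample XML are assumptions from memory, heavily simplified, so check them against a real file before trusting the output.

```python
import xml.etree.ElementTree as ET

def host_values(xml_text):
    """Return the distinct values of every 'hostName' or 'host' attribute."""
    found = set()
    for element in ET.fromstring(xml_text).iter():
        for name, value in element.attrib.items():
            # Drop any "{namespace}" prefix ElementTree adds to the name.
            if name.rsplit("}", 1)[-1] in ("hostName", "host"):
                found.add(value)
    return sorted(found)

# Made-up, simplified stand-in for a serverindex.xml (assumed layout).
SAMPLE = """<ServerIndex hostName="was-node01.example.com">
  <serverEntries serverName="nodeagent">
    <specialEndpoints endPointName="SOAP_CONNECTOR_ADDRESS">
      <endPoint host="was-node01.example.com" port="8878"/>
    </specialEndpoints>
  </serverEntries>
</ServerIndex>"""

if __name__ == "__main__":
    print(host_values(SAMPLE))
```

Running something like this against each node's file before and after a move would show whether an old host name or IP address is still recorded anywhere.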

Saturday, August 18, 2012

Performance Monitoring Request Metrics (PMRM)

Notes from the following link by Ken Gottry and the IBM support page about request metrics.

  • PMRM records are transaction based, unlike Performance Monitoring Infrastructure (PMI), which provides average system resource usage statistics with no correlation across WebSphere components.
  • Records are written to the SystemOut.log of the app server that the request is made on.
  • The web server has its own log for writing transactions (http_plugin.log).
  • The last two records capture the servlet and the response time; the web server log also records the size of the request and the size of the response.

  • Why use request metrics?
    • Request metrics allow you to track individual transactions and the time spent in each WebSphere component.

  • Request Metrics Filters
    • Filters exist for
      • EJB
      • URI
      • Source IP
      • Web Services 
      • JMS Filters
    • Allow you to focus on a specific area. 

Dynacache in WebSphere 7

Notes on Dynacache post and WebSphere doc on setting

Caching too much data can cause performance issues, which makes sizing the cache very important. Selecting the correct cache size can be difficult, and because of this the cache can end up underutilized. This problem is not easily solved because Java does not have a sizeof operator that will tell us the size of an object on the heap.

Dynacache lets the administrator control the cache by setting high and low watermarks on the heap size.

  • The Dynamic cache service settings can be found under Servers - Server Types - WebSphere application servers - <server name> - Container services - Dynamic cache service
  • The service starts when caching is enabled in the Web Container panel.
  • Cache size 
    • Positive integer  represents the maximum number of entries the cache can hold.
  • Default priority 
    • How long an entry stays in a full cache   
  • Limit memory cache size
    • Sets the size of the memory cache. Allows you to control the size of cache in terms of the JVM heap. 
    • The least recently used algorithm is used to remove items from cache. 
  • Memory Cache size
    • Allows you to set the cache in MB. 
    • High threshold and low threshold represent the high and low watermarks. 
      • expressed in terms of percentage of the memory cache

  • Enable disk offload

    • Allows items removed from memory to be moved to disk if needed later 
      • You CANNOT specify the number of items moved to disk
      • You CANNOT specify the amount of disk space to use.  
  • Offload location 
    • Location on disk to save entries
      • Default ${WAS_TEMP_DIR}/node/server name/_dynacache/cache JNDI name  
        • ${WAS_TEMP_DIR} is the install/temp dir
      • If a location is specified, then the node, server name and cache instance name are appended
    • If you use the default dir and the disk on the server fills up, WebSphere could stall
    • Depending on OS you may see disk full messages in the console. 
  • Flush to disk 
    • Indicates if in memory cache should be written to disk in the event that the app server is shutdown. 
  • Limit Disk size in GB
    • leaving blank indicates unlimited.
  • Limit disk size in entries
    • leaving blank indicates unlimited. 
  • Disk Cache Performance Settings (how memory resources should be used on background activity such as cache cleanup, expiration, garbage collection, and so on)
      • High - all metadata kept in memory
      • Balanced - some metadata kept, balance of performance and memory usage found
      • Low - limited metadata is kept
      • Custom - the admin explicitly configures the memory settings
        • Set with the DiskCacheCustomPerformanceSettings object
    • Disk Cache cleanup frequency 
      • Set in minutes; if set to 0, cleanup only happens at midnight.
      • Only applies when the performance setting is Low, Balanced or Custom
        • High does not require disk cleanup
    • Maximum buffer for cached identifiers per metaentry
      • Sets the maximum number of cache identifiers that are stored for an individual dependency 
      • If the limit is exceeded, data is offloaded to disk 
      • Only applies to the Custom performance setting 
    • Maximum buffer for dependency identifiers 
      • Sets the number of dependency identifier buckets in the disk cache metadata in memory.
      • Only applies to the Custom performance setting 
    • Maximum buffer for templates 
      • Sets the maximum number of template buckets
      • Only applies to the Custom performance setting
  • Disk Cache eviction algorithm 
    • Only applies if disk offload is specified
      • None - once the disk cache reaches the disk size limit, the service stops writing to disk
      • Random - 
      • Size - Largest are removed first
    • High Threshold 
      • Sets when the eviction policy runs
      • Percent of disk space
    • Low Threshold
      • Sets when the eviction policy ends
      • Percent of disk space
    • Enable Cache replication 
      • Uses cache replication to have cache entries copied to the members of a replication domain
    • Full group replication domain
      • sets the replication domain
    • Replication type (Direct from IBM doc)
      • Specifies the global sharing policy for this application server.
      • The following settings are available:
        • Both push and pull sends the cache ID of newly updated content to other servers in the replication domain. Then, if one of the other servers requests the content, and that server has the ID of the cache entry for the previously updated content, it will retrieve the content from the publishing server. On the other hand, if a request is made for an ID which has not been previously published, the server assumes it does not exist in the cluster and creates a new entry.
        • Push only sends the cache ID and cache content of new content to all other servers in the replication domain.
        • When you use the Not Shared setting, as cache entries are created, neither the cache content nor the cache IDs are propagated to other servants or servers in the replication domain. However, invalidations are propagated to other servants or servers. You can set the sharing policy at different levels. A global sharing policy, which is the default policy for all caches, is defined when you configure the dynamic cache service. You can overwrite this sharing policy by modifying the cachespec.xml file. For more information on the cachespec.xml file, see the cachespec.xml file topic. Additionally, you can overwrite the sharing policy at the application programming interface (API) level when cache entries are being created.
        • The default is Not Shared.
    • Push frequency 
      • Time in seconds before new or modified cache entries are pushed to other servers
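To make the high and low threshold idea from the notes above concrete, here is a toy Python sketch. It is a generic illustration of watermark-based least-recently-used eviction, not Dynacache's actual implementation; the class name, the sizes and the exact eviction rule (evict once above the high watermark, stop at the low one) are all assumptions.

```python
from collections import OrderedDict

def watermarks_mb(cache_mb, high_pct, low_pct):
    """Convert percentage thresholds into megabytes of the memory cache."""
    return cache_mb * high_pct / 100, cache_mb * low_pct / 100

class WatermarkLRU:
    """Toy LRU cache: evict from the high watermark down to the low one."""

    def __init__(self, cache_mb, high_pct, low_pct):
        self.high, self.low = watermarks_mb(cache_mb, high_pct, low_pct)
        self.entries = OrderedDict()  # key -> size in MB, oldest first
        self.used = 0.0

    def put(self, key, size_mb):
        if key in self.entries:
            self.used -= self.entries.pop(key)
        self.entries[key] = size_mb
        self.used += size_mb
        if self.used > self.high:
            # Drop least recently used entries until at or below the low mark.
            while self.entries and self.used > self.low:
                _, freed = self.entries.popitem(last=False)
                self.used -= freed

    def get(self, key):
        if key in self.entries:
            self.entries.move_to_end(key)  # mark as most recently used
            return True
        return False
```

With a 100 MB cache and thresholds of 95 and 80 percent, eviction in this sketch starts once more than 95 MB is in use and stops when usage is back at 80 MB or less.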

Notes on WebSphere Data Replication Services (DRS)

My Notes from the IBM Education Assistant  on DRS

DRS is an internal component of WebSphere that is used to move data among application server processes. Examples of this include

  • HTTP Session replication 
  • Dynamic Cache Replication 
  • EJB state replication (New at WebSphere 6)
To ensure that data and requests wind up in the same place, DRS coordinates with workload management.

DRS provides services in two scenarios
  • Fail over
    • Ensures HTTP sessions and EJBs can be moved to another server transparently to the user.
  • Caching 
    • When a servlet or JSP has been configured to have its output cached, repeated requests are handled faster. This cached output is what DRS syncs.
DRS in V5 WebSphere requires the administrator to configure the replicators, replication domain and partition.
  • Replicators
    • Producer and Consumer responsible for moving data
    • Data moves as JMS messages
  • Replication Domain
    • A set of one or more replicators
  • Partition
    • a group of replicators configured to communicate with each other.   
DRS in V6 WebSphere requires the administrator to configure only a Replication Domain, as configuring replicators is no longer necessary and partitions are masked from the user.
  • Replication Domain
    • Consists of servers or cluster members that have the capability of sharing HTTP session or caching data within the domain
  • New in V6 is the coordination with Workload Management (WLM) to determine which members serve as backups for other members.
    • Ideally, session failover data and stateful session bean data should end up in the same place, and somewhere other than where the data originated.
  • Rewritten in V6 using an IBM proprietary mechanism to transport data.
  • The only configuration option now is the number of replicas (default is 1)

377 Question area 

  • Creating a replication domain can be done at cluster creation. 
  • Or can be done manually
  • Cache replication can be configured under Server - Container Services  - Dynamic cache Service

Best Practices 

  • Create one domain for HTTP session and EJB data and a separate one for caching
  • Put EJB data and HTTP data in the same domain.
  • Use the smallest number of replicas possible; 1, 2, or 3 should work in most cases
  • Congestion messages can be resolved by increasing the transport buffer size to 50 MB; 10 MB is the default. (App Server - <server name> - Core Group Service)

Mobile What!

I have been sitting on the fence about a tablet just for me for a while now. My wife and I got an iPad a while back and it has proven to be very popular with both of us. Looking for a better personal computing experience, I am thinking about going all out for a new Nexus 7 or adding an Ubuntu OS to my Droid X2. Too bad I am so cheap, otherwise I would just get both!

Saturday, August 11, 2012

Is WebSphere MQ for you ?

We are in the process of increasing our use of JMS messaging, and at this point we can process about 1000 records an hour using the out-of-the-box configurations from IBM for the Maximo product. Our ramp-up led me to do a little research on JMS scaling in WebSphere, and I came across this article comparing the SIB to MQ.

Saturday, August 4, 2012

WebSphere Proxy Servers

Cool older education assistant video on the use of proxy servers from IBM. Some interesting points were
  • They replace web servers
  • Are the direction for performance
  • Can be configured across cells
  • Require a unique profile

Saturday, July 28, 2012

dd_in_ear_load_EXC_ Exception when deploying Maximo ear file in WebSphere

I was trying to configure App security in Maximo so I updated the:
·         applications\maximo\maximouiweb\webmodule\WEB-INF\web.xml
·         applications\maximo\mboweb\webmodule\WEB-INF\web.xml
and when I tried to redeploy the ear file I got the following error after I browsed to the ear file.
The exception dd_in_ear_load_EXC_ ocurred. Check log for details.
The issue was that I had uncommented both FORM and BASIC login in the applications\maximo\maximouiweb\webmodule\WEB-INF\web.xml.  If you get this error go back and check the applications\maximo\maximouiweb\webmodule\WEB-INF\web.xml and make sure you only have one <login-config> uncommented.
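A quick way to catch this before redeploying is to count the active `<login-config>` elements in the web.xml. The following is a minimal Python sketch; the function name and the two sample documents are made up for illustration. It relies on the fact that Python's standard XML parser skips comments, so a commented-out block is not counted.

```python
import xml.etree.ElementTree as ET

def count_login_configs(web_xml_text):
    """Count active <login-config> elements; commented-out ones are skipped."""
    root = ET.fromstring(web_xml_text)
    # web.xml may declare a default namespace, so compare local names only.
    return sum(1 for el in root.iter()
               if el.tag.rsplit("}", 1)[-1] == "login-config")

# Hypothetical, heavily trimmed web.xml samples.
BROKEN = """<web-app>
  <login-config><auth-method>BASIC</auth-method></login-config>
  <login-config><auth-method>FORM</auth-method></login-config>
</web-app>"""

FIXED = """<web-app>
  <login-config><auth-method>BASIC</auth-method></login-config>
  <!-- <login-config><auth-method>FORM</auth-method></login-config> -->
</web-app>"""

if __name__ == "__main__":
    for name, text in (("broken", BROKEN), ("fixed", FIXED)):
        print(name, count_login_configs(text))
```

Anything other than a count of one is a sign that both blocks are uncommented (or that both are still commented out).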

Saturday, July 21, 2012

Monday, July 16, 2012

Getting Started with MQTT

This is the first in what I hope turns out to be a series of posts about one geek's journey into MQTT and let me tell you it was big doings tonight !!

I was able to publish and subscribe to an MQTT message for the first time! It was not as tough as I thought it would be, but only because of the IBM redbook Building Smarter Planet Solutions with MQTT and IBM WebSphere MQ Telemetry. The first step was to launch a terminal client from my Ubuntu box that has mosquitto installed.

I then typed in the command to publish: "mosquitto_pub -t samples/topic01 -l"
 (Well, I first copied the exact command from the redbook, which is mosquittopub, and I got the error No command 'mosquittosub' found, did you mean: mosquitto_pub. I am very new to Linux and do not know why the redbook said one thing and another worked, so I will have to add this to the research list down the road.)

I then opened a second Terminal and entered the command for a subscription
mosquitto_sub -t samples/topic01
I then entered a message in the publish screen and pressed enter,
and lo and behold it appeared on the second screen! Page 46 of the redbook says that "The examples assumes that the client and the server are on the same machine". I will be interested to see what makes this assumption work but will have to save that for another night.
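One likely reason the same-machine assumption works: neither command above names a broker, and as far as I know the mosquitto clients fall back to localhost on port 1883 when the `-h` and `-p` options are left out. The Python sketch below just builds the two command lines with the host spelled out; the helper name `mosquitto_cmd` and the host `broker.example.com` are placeholders, not something from the redbook.

```python
def mosquitto_cmd(tool, topic, host="localhost", port=1883):
    """Build an argv list for mosquitto_pub or mosquitto_sub."""
    if tool not in ("mosquitto_pub", "mosquitto_sub"):
        raise ValueError("unknown tool: " + tool)
    cmd = [tool, "-h", host, "-p", str(port), "-t", topic]
    if tool == "mosquitto_pub":
        cmd.append("-l")  # read messages from stdin, one per line
    return cmd

if __name__ == "__main__":
    # Same machine: the defaults match what the bare commands are assumed to do.
    print(" ".join(mosquitto_cmd("mosquitto_sub", "samples/topic01")))
    # Different machine: broker.example.com is a placeholder host name.
    print(" ".join(mosquitto_cmd("mosquitto_pub", "samples/topic01",
                                 host="broker.example.com")))
```

The returned list can be handed to `subprocess.run` on a box that has the mosquitto clients installed.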

Saturday, July 14, 2012

WebSphere Messaging Engine, Locking and Troubleshooting

I have run into a few issues with WebSphere Messaging Engines this year, and this has led me to a few different resources on the subject. The best by far is Ty Shrake’s webcast replay of “WebSphere Application Server - Service Integration Bus Messaging Engine Data Store Connectivity Problems and Solutions”. Ty does a really great job of describing messaging engines, data stores, locks and how to troubleshoot them.

Wednesday, July 4, 2012

Understanding Maximo LDAPSYNC

Great technote from IBM on how the LDAPSYNC process syncs with Microsoft Active Directory Global Catalog. When I first started looking into this I was a little confused by the XML in the LDAPSYNC cron task. I was not sure what to add and where to add it but this cleared up much of my confusion.  Interesting points I found were
  • LDAPSYNC connects to the Microsoft Global Catalog on port 3268
  • LDAPSYNC can be configured to access the LDAP server directly; however, this does not follow Microsoft recommendations
  • Maximo administrators can map additional information from the LDAP server to Maximo using the SYNC task.

Friday, June 29, 2012

OSI Model Overview

Great video that provides an overview of the OSI model for someone like me that has very little networking background. 

Saturday, June 23, 2012

IBM WebSphere Java Health Center

Cool link that talks about setting up and using the Java Health Center. I have used this a couple times trying to diagnose random JVM crashes. It did not help me resolve the issue because we could not recreate the random event that crashed the JVM but it was fun watching the JVM work !

Wednesday, June 20, 2012

Friday, June 15, 2012

Microsoft Network Monitoring

We had a thread lock on one of our messaging engines that kept the entire cluster from coming up. The thread lock was waiting for a response from the database. After a routing table update the issue was resolved, so I never got a chance to dig deeper into the Microsoft network monitor, but I hope to get a chance again someday because it looks really cool!

More on Windows Performance and Reliability 

Thursday, June 14, 2012

Website Monitoring with Powershell

Very cool post on creating a website monitoring script in PowerShell. I learned that the webClient.DownloadString method will throw an error when the site is down, so I did not bother to use the string parsing part of the script.
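The same "treat an exception as down" idea can be sketched in Python. This is not the PowerShell script from the post, just an equivalent using the standard library's `urllib`; the function name `site_is_up` and the injectable `fetch` parameter are my own choices to keep it easy to try out.

```python
import urllib.error
import urllib.request

def site_is_up(url, fetch=urllib.request.urlopen, timeout=10):
    """Return True if the page downloads without raising, else False."""
    try:
        with fetch(url, timeout=timeout) as response:
            response.read()
        return True
    except (urllib.error.URLError, OSError):
        # Covers DNS failures, refused connections, timeouts and HTTP errors.
        return False

if __name__ == "__main__":
    # example.invalid is a reserved name that should never resolve.
    print(site_is_up("http://example.invalid/", timeout=3))
```

As in the PowerShell version, there is no need to parse the page body: if the download raises, the site is treated as down.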

Sunday, June 10, 2012

Other Causes of Maximo Session Timeout

Our Maximo users were experiencing a timeout where they would be prompted to log back into Maximo but would return to their last location in the application and not the start center. This is different from the Maximo timeout, where the user is sent back to the start center. After much research and some help from IBM support we were able to resolve this issue by adjusting the LTPA timeout setting in WebSphere.  Here is a tech note on the subject from IBM.