Troubleshooting Guide

A number of common questions and answers are gathered here. Please watch for updates as this is likely to grow as time/development moves on.

Review Logs

CAS server logs are the best resource for determining the root cause of a problem, provided you have configured the appropriate log levels. Specifically, make sure DEBUG (or the more verbose TRACE level used here) is turned on for the org.apereo.cas package in the log configuration:

<Logger name="org.apereo.cas" level="trace" additivity="false" includeLocation="true">
    <AppenderRef ref="casConsole"/>
    <AppenderRef ref="casFile"/>
</Logger>

When changes are applied, restart the server environment and observe the log files to get a better understanding of CAS behavior. For more info, please review this guide on how to configure logs with CAS.

Note that the above configuration block only addresses the logging behavior of CAS components, not that of the libraries upon which CAS depends. Consult the log4j configuration and turn on DEBUG logs for each relevant component; those are usually your best data source for diagnostics and troubleshooting.

If your container of choice is Apache Tomcat, you may also want to look into your catalina.out and localhost-X-Y-Z.log log files to learn more about the source of issues.

Deployment Problem; Configuration Issue X. Can You Help?

Yes. Study this.

How do I tune/extend MongoDb, MySQL, Hazelcast, Docker, etc?

If you have a question about tuning and configuration of external components utilized by CAS and you have a need to achieve more advanced use cases other than what the CAS defaults offer, your question is best addressed by the community in charge of that component’s development and support. As a general rule, you should always pick a technology with which you are most familiar, or otherwise, shoot a question to the Spring Webflow, MongoDb, Hazelcast, Redis, etc forums to have experts review and recommend ideas.

Typical questions in this category that are best answered elsewhere are:

  • How do I change TLS version from A to B?
  • Why does Azure/AWS/GCP work this way?
  • How do I configure SSL for Apache Tomcat, Jetty, Active Directory, etc?
  • How do I pass variables from one flow to the next in Spring webflow?
  • How do I tune up a hazelcast cluster?
  • Can you explain the steps needed to configure Redis Sentinel?
  • What is the recommended strategy for making MongoDb highly available?

Try Latest Patch Release

It is quite possible that the problem you are trying to resolve has already been solved by a later patch release. A patch release is a conservative incremental improvement that includes bug fixes and small enhancements and is absolutely backward compatible with previous PATCH releases of the same MINOR release. For example, if you are currently on CAS version 6.5.1 and have run into a possible issue, you should consider upgrading to 6.5.2, 6.5.3 and so on to investigate further, assuming of course that those releases are available and published.

The project release schedule is available here, and you can always have a look at published releases.

Using SNAPSHOT Versions

There may be cases where you learn that a fix is available for the defect or behavior relevant for your CAS deployments and you may be advised to upgrade to the current available SNAPSHOT release. Depending on your choice of installation, you will need to find the setting in your deployment configuration and build scripts that describes your current CAS version and bump that to the next SNAPSHOT. The build scripts should also have additional instructions on how to obtain and build SNAPSHOT releases in README files and such.

To find out what SNAPSHOT version applies to your deployment, you can either look at the release schedule or the appropriate branch of the CAS codebase. For instance, if you have deployed CAS 2.0.4 and the release schedule shows the next release is targeted for a 2.0.5, then the available SNAPSHOT release would be 2.0.5-SNAPSHOT. You can also take a look at the milestone setting assigned to the issue/pull request and determine the SNAPSHOT release. SNAPSHOT releases are always postfixed with -SNAPSHOT. If the assigned milestone to an issue is for instance 1.2.5-RC1, then the SNAPSHOT release would be 1.2.5-RC1-SNAPSHOT.
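The naming rule above can be sketched as a pair of small shell helpers. The function names next_patch and snapshot_for are hypothetical and not part of any CAS tooling, and the sketch assumes the next release simply increments the PATCH number; the release schedule remains the authoritative source.

```shell
# Hypothetical helpers; not part of CAS or its build scripts.

# Print the version that follows a MAJOR.MINOR.PATCH version,
# assuming the next release only increments the PATCH number.
next_patch() {
  version="$1"
  prefix="${version%.*}"   # e.g. "2.0"
  patch="${version##*.}"   # e.g. "4"
  printf '%s.%s\n' "$prefix" "$((patch + 1))"
}

# SNAPSHOT releases are always postfixed with -SNAPSHOT.
snapshot_for() {
  printf '%s-SNAPSHOT\n' "$1"
}

snapshot_for "$(next_patch 2.0.4)"   # prints 2.0.5-SNAPSHOT
snapshot_for 1.2.5-RC1               # prints 1.2.5-RC1-SNAPSHOT
```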

Configuring SSL Behind Load Balancer/Proxy

You might be running CAS inside a servlet container such as Apache Tomcat behind some sort of proxy such as haproxy, Apache httpd, etc where the proxy is handling the SSL termination. The connections to the user are secured via https, yet those between the proxy and CAS service are just http.

With this setup, the CAS login screen may still warn you about a non-secure connection. There is no setting in CAS that would allow you to control/adjust this, as this is entirely controlled by the container itself. All CAS cares about is whether the incoming connection request identifies itself as a secure connection. So to remove the warning, you will need to look into your container’s configuration and docs to see how the connection may be secured between the proxy and CAS.

For Apache Tomcat, you may be able to adjust the connector that talks to the proxy with a secure=true attribute.

Application X “redirected you too many times”

“Too many redirects” errors are usually caused by service ticket validation failures, which in turn generally stem from application misconfiguration. Ticket validation may fail because of expired or unrecognized tickets, SSL-related issues and the like. Examine your CAS logs to find the cause.

Not Receiving Attributes

If your client application is not receiving attributes, you will need to make sure:

  1. The client is using a version of CAS protocol that is able to release attributes.
  2. The client, predicated on #1, is hitting the appropriate endpoint for service ticket validation (i.e. /p3/serviceValidate).
  3. The CAS server itself is resolving and retrieving attributes correctly.
  4. The CAS server is authorized to release attributes to that particular client application inside its service registry.
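To check items 1 and 2 by hand, a validation request can be replayed against the CAS protocol 3.0 endpoint. The sketch below is an illustration only: the function name validate_ticket, the CURL override variable and the host names are assumptions, while service and ticket are the request parameters defined by the CAS protocol. A service ticket can normally be validated only once and expires quickly, so use a freshly issued one.

```shell
# Hypothetical helper: validate a service ticket via /p3/serviceValidate.
# CURL may be overridden (for example with a stub); it defaults to curl.
validate_ticket() {
  cas_url="$1"   # CAS server prefix, e.g. https://cas.example.org/cas (placeholder)
  service="$2"   # service URL exactly as it was used at login
  ticket="$3"    # freshly issued service ticket (ST-...)
  # --get sends the URL-encoded data as query parameters of a GET request.
  "${CURL:-curl}" -sS --get \
    --data-urlencode "service=$service" \
    --data-urlencode "ticket=$ticket" \
    "$cas_url/p3/serviceValidate"
}

# Example (placeholder values):
# validate_ticket https://cas.example.org/cas https://app.example.org/ ST-1-abc
```

If attributes are being released, a successful response should carry a cas:attributes element inside cas:authenticationSuccess.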

Please review this guide to better understand the CAS service registry.

Application Not Authorized

You may encounter this error when the requesting application/service url cannot be found in your CAS service registry. When an authentication request is submitted to the CAS login endpoint, the destination application is indicated as a url parameter, which is checked against the CAS service registry to determine if the application is allowed to use CAS. If the url is not found, this message is displayed back. Since service definitions in the registry can be defined by a url pattern, it is entirely possible that the pattern for the service definition is misconfigured and does not produce a successful match for the requested application url.

Please review this guide to better understand the CAS service registry.

Invalid/Expired CAS Tickets

You may experience INVALID_TICKET related errors when attempting to use a CAS ticket whose expiration policy dictates that the ticket has expired. The CAS log should explain in more detail whether the ticket is considered expired, but for diagnostic purposes, you may want to adjust the ticket expiration policy configuration to remove and troubleshoot this error.

Furthermore, if the ticket itself cannot be located in the CAS ticket registry the ticket is also considered invalid. You will need to observe the ticket used and compare it with the value that exists in the ticket registry to ensure that the ticket id provided is valid.

Out of Heap Memory Error

java.lang.OutOfMemoryError: GC overhead limit exceeded
        at java.util.Arrays.copyOfRange(Arrays.java:3658)
        at java.lang.StringBuffer.toString(StringBuffer.java:671)
        at 

You may encounter this error when, in all likelihood, a cache-based ticket registry such as Hazelcast is used and its eviction policy is not correctly configured. Objects and tickets cached inside the registry storage back-end linger longer than they should, or the eviction policy does not do a good enough job of cleaning up unused tickets that CAS may already have marked as expired.

To troubleshoot, you can configure the JVM to perform a heap dump prior to exiting. Set this up right away so that you have additional information if and when the error happens again. The following JVM options should do the trick:

-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath="/path/to/jvm-dump.hprof" 

Also ensure that your container is configured to have enough memory available. For Apache Tomcat, the following setting as an environment variable may be configured:

CATALINA_OPTS="-Xms1000m -Xmx2000m"
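As a rough sketch, the memory and heap-dump options above could be combined in Tomcat's bin/setenv.sh file. The dump path and heap sizes are the placeholder values from the examples above and should be adjusted to your environment.

```shell
# Sketch of $CATALINA_BASE/bin/setenv.sh; all values are placeholders.
HEAP_DUMP_PATH="/path/to/jvm-dump.hprof"

# Initial and maximum heap size (keeps any options that are already set).
CATALINA_OPTS="${CATALINA_OPTS:-} -Xms1000m -Xmx2000m"

# Write a heap dump when the JVM runs out of memory.
CATALINA_OPTS="$CATALINA_OPTS -XX:+HeapDumpOnOutOfMemoryError"
CATALINA_OPTS="$CATALINA_OPTS -XX:HeapDumpPath=$HEAP_DUMP_PATH"

export CATALINA_OPTS
```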

You will want to profile your server with something like JVisualVM. This will help you see what is actually going on with your memory.

You might also consider taking periodic heap dumps using the jmap tool or the YourKit Java profiler and analyzing them offline with a heap analysis tool.

Finally, review the eviction policy of your ticket registry and ensure the values that determine object lifetime are appropriate for your environment.

SSL & Certificates

PKIX Path Building Failed

Sep 28, 2009 4:13:26 PM org.apereo.cas.client.validation.AbstractCasProtocolUrlBasedTicketValidator retrieveResponseFromServer
SEVERE: javax.net.ssl.SSLHandshakeException:
sun.security.validator.ValidatorException: PKIX path building failed:
sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
javax.net.ssl.SSLHandshakeException:
sun.security.validator.ValidatorException: PKIX path building failed:
sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
      at com.sun.net.ssl.internal.ssl.Alerts.getSSLException(Unknown Source)
      at com.sun.net.ssl.internal.ssl.SSLSocketImpl.fatal(Unknown Source)
      at com.sun.net.ssl.internal.ssl.Handshaker.fatalSE(Unknown Source)
      at com.sun.net.ssl.internal.ssl.Handshaker.fatalSE(Unknown Source)
      at com.sun.net.ssl.internal.ssl.ClientHandshaker.serverCertificate(Unknown Source)

PKIX path building errors are the most common SSL errors. The problem here is that the CAS client does not trust the certificate presented by the CAS server; most often this occurs because of using a self-signed certificate on the CAS server. To resolve this error, import the CAS server certificate into the system truststore of the CAS client. If the certificate is issued by your own PKI, it is better to import the root certificate of your PKI into the CAS client truststore.

By default, the Java system truststore is at $JAVA_HOME/jre/lib/security/cacerts (on Java 9 and later, $JAVA_HOME/lib/security/cacerts). The certificate to be imported MUST be a DER-encoded file. If the contents of the certificate file are binary, it is likely DER-encoded; if the file begins with the text -----BEGIN CERTIFICATE-----, it is PEM-encoded and needs to be converted to DER encoding.

keytool -import -keystore $JAVA_HOME/jre/lib/security/cacerts -file tmp/cert.der -alias certName
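Before running the import, the encoding check and conversion described above can be sketched in shell. The function names and file paths are hypothetical, and the conversion assumes the openssl command-line tool is installed.

```shell
# Hypothetical helper: report whether a certificate file looks PEM-encoded.
# Anything that does not start with the PEM header is assumed to be DER.
cert_encoding() {
  if head -n 1 "$1" | grep -q '^-----BEGIN CERTIFICATE-----'; then
    echo PEM
  else
    echo DER
  fi
}

# Hypothetical helper: convert a PEM certificate ($1) to a DER file ($2).
pem_to_der() {
  openssl x509 -inform PEM -in "$1" -outform DER -out "$2"
}

# Example (placeholder paths):
# [ "$(cert_encoding tmp/cert.pem)" = PEM ] && pem_to_der tmp/cert.pem tmp/cert.der
```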

If you have multiple Java installations on your machine, make sure that the app/web server is pointing to the correct JDK/JRE, that is, the one into whose truststore the certificate was imported. One common mistake when working with self-signed certificates is that the JAVA_HOME used for the import differs from the one used by the server.

No subject alternative names

javax.net.ssl.SSLHandshakeException: java.security.cert.CertificateException: No subject alternative names present

This is a hostname/SSL certificate CN mismatch. This commonly happens when a self-signed certificate issued to localhost is placed on a machine that is accessed by IP address. It should be noted that generating a certificate with an IP address for a common name, e.g. CN=192.168.1.1,OU=Middleware,dc=vt,dc=edu, will not work in most cases where the client making the connection is Java.
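One common way around this is to list the IP address (and any DNS names) as subject alternative names when generating the certificate. The sketch below uses keytool's -ext option; the helper names, alias, keystore path, password and host values are placeholders rather than values from a real deployment.

```shell
# Hypothetical helper: build the value passed to keytool's -ext option.
san_ext() {
  dns_name="$1"
  ip_addr="$2"
  printf 'SAN=dns:%s,ip:%s\n' "$dns_name" "$ip_addr"
}

# Hypothetical helper: generate a self-signed key pair whose certificate
# carries both a DNS name ($1) and an IP address ($2) as alternative names.
generate_self_signed() {
  keytool -genkeypair -alias cas -keyalg RSA -keysize 2048 -validity 365 \
    -dname "CN=$1" -ext "$(san_ext "$1" "$2")" \
    -keystore /path/to/thekeystore -storepass changeit
}

# Example (placeholder values):
# generate_self_signed cas.example.org 192.168.1.1
```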

HTTPS hostname wrong

java.lang.RuntimeException: java.io.IOException: HTTPS hostname wrong:  should be <eiger.iad.vt.edu>
    org.apereo.cas.client.validation.Saml11TicketValidator.retrieveResponseFromServer(Saml11TicketValidator.java:203)
    org.apereo.cas.client.validation.AbstractUrlBasedTicketValidator.validate(AbstractUrlBasedTicketValidator.java:185)
    org.apereo.cas.client.validation.AbstractTicketValidationFilter.doFilter

The above error occurs most commonly when the CAS client ticket validator attempts to contact the CAS server and is presented a certificate whose CN does not match the fully-qualified host name of the CAS server. There are a few common root causes of this mismatch:

  • CAS client misconfiguration
  • Complex multi-tier server environment (e.g. clustered CAS server)
  • Host name too broad for scope of wildcard certificate

It is also worth checking that the certificate your CAS server is using for SSL encryption matches the one the client is checking against.

No name matching X found

Caused by: java.security.cert.CertificateException: No name matching cas.server found
    at sun.security.util.HostnameChecker.matchDNS(Unknown Source) ~[?:1.8.0_77]
    at sun.security.util.HostnameChecker

Same as above.

Wildcard Certificates

Java support for wildcard certificates is limited to hosts strictly in the same domain as the wildcard. For example, a certificate with CN=*.vt.edu matches hosts a.vt.edu and b.vt.edu, but not a.b.vt.edu.
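The matching rule can be illustrated with a small shell function. This is a simplified sketch of single-label wildcard matching, not Java's actual hostname verifier; the function name wildcard_matches is hypothetical and the comparison is case-sensitive for brevity.

```shell
# Hypothetical helper: succeed if host ($2) matches the certificate name ($1),
# where a leading "*." may stand for exactly one host label.
wildcard_matches() {
  pattern="$1"
  host="$2"
  case "$pattern" in
    '*.'*)
      suffix="${pattern#?}"                  # "*.vt.edu" becomes ".vt.edu"
      label="${host%"$suffix"}"              # strip the suffix if present
      [ "$label" != "$host" ] || return 1    # host does not end with suffix
      [ -n "$label" ] || return 1            # nothing left of the wildcard
      case "$label" in
        *.*) return 1 ;;                     # more than one label: no match
      esac
      return 0
      ;;
    *)
      [ "$pattern" = "$host" ]               # no wildcard: exact match only
      ;;
  esac
}

wildcard_matches '*.vt.edu' a.vt.edu   && echo "a.vt.edu matches"
wildcard_matches '*.vt.edu' a.b.vt.edu || echo "a.b.vt.edu does not match"
```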

Unrecognized Name Error

javax.net.ssl.SSLProtocolException: handshake alert: unrecognized_name

The above error occurs mainly in Oracle JDK CAS Server installations. In JDK, SNI (Server Name Indication) is enabled by default. When the HTTPD Server does not send the correct Server Name back, the JDK HTTP Connection refuses to connect and the exception stated above is thrown.

You must ensure your HTTPD Server is sending back the correct hostname. E.g. in Apache HTTPD, you must set the ServerAlias in the SSL vhost:

ServerName your.ssl-server.name
ServerAlias your.ssl-server.name

Alternatively, you can disable the SNI detection in JDK, by adding this flag to the Java options of your CAS Servers’ application server configuration:

-Djsse.enableSNIExtension=false

When All Else Fails

If you have read, understood, and tried all the troubleshooting tips on this page and continue to have problems, please perform an SSL trace and attach it to a posting to the CAS mailing lists. An SSL trace is written to STDOUT when the system property javax.net.debug=ssl is set. An example of how to do this in the Tomcat servlet container follows.

Sample setenv.sh Tomcat Script follows:

# Uncomment the next 4 lines for custom SSL keystore
# used by all deployed applications
# KEYSTORE="$HOME/path/to/custom.keystore"
# CATALINA_OPTS=$CATALINA_OPTS" -Djavax.net.ssl.keyStore=$KEYSTORE"
# CATALINA_OPTS=$CATALINA_OPTS" -Djavax.net.ssl.keyStoreType=BKS"
# CATALINA_OPTS=$CATALINA_OPTS" -Djavax.net.ssl.keyStorePassword=changeit"
 
# Uncomment the next 4 lines to allow custom SSL trust store
# used by all deployed applications
# TRUSTSTORE="$HOME/path/to/custom.truststore"
# CATALINA_OPTS=$CATALINA_OPTS" -Djavax.net.ssl.trustStore=$TRUSTSTORE"
# CATALINA_OPTS=$CATALINA_OPTS" -Djavax.net.ssl.trustStoreType=BKS"
# CATALINA_OPTS=$CATALINA_OPTS" -Djavax.net.ssl.trustStorePassword=changeit"
 
# Uncomment the next line to print SSL debug trace in catalina.out
# CATALINA_OPTS=$CATALINA_OPTS" -Djavax.net.debug=ssl"
 
export CATALINA_OPTS