Logging into the NSX-T Manager gives error “Some of appliance components is not functioning”

Recently, while working in my lab environment, I hit an issue with my NSX-T deployment: it would not let me log in. The error message did not help me much in understanding the root cause, so I am documenting it here in case anyone else hits the same issue.

Symptoms :
  • Once you click Log In, the error “Some of appliance components is not functioning” appears
  • cbm.log in /var/log/cbm on the NSX-T Manager shows the following errors
2019-09-20T09:05:38.498Z INFO CorfuRuntime-0 CorfuRuntime - fetchLayout: Invoking the systemDownHandler.
2019-09-20T09:05:38.543Z INFO netty-16 NettyClientRouter - connectAsync[tcp://]: Channel connection failed, reconnecting...
2019-09-20T09:05:39.545Z INFO netty-17 NettyClientRouter - connectAsync[tcp://]: Channel connection failed, reconnecting...
2019-09-20T09:05:39.999Z WARN CorfuRuntime-0 CorfuRuntime - Tried to get layout from but failed by timeout
2019-09-20T09:05:39.999Z WARN CorfuRuntime-0 CorfuRuntime - Couldn't connect to any up-to-date layout servers, retrying in PT1S, Retried 7312 times, systemDownHandlerTriggerLimit = 90
2019-09-20T09:05:39.999Z INFO CorfuRuntime-0 CorfuRuntime - fetchLayout: Invoking the systemDownHandler.
  • corfu.9000.log in /var/log/corfu on the NSX-T Manager shows the following errors
2019-09-20T06:03:43.097Z | ESC[34mINFO ESC[0;39m | CorfuServer | o.c.i.ServerContext | getClientGroup: Type NioEventLoopGroup with 24 threads
2019-09-20T06:03:43.152Z | ESC[34mINFO ESC[0;39m | CorfuServer | o.c.i.ServerContext | getWorkerGroup: Type NioEventLoopGroup with 24 threads
2019-09-20T06:03:43.154Z | ESC[34mINFO ESC[0;39m | CorfuServer | o.c.i.ServerContext | getBossGroup: Type NioEventLoopGroup
2019-09-20T06:03:43.167Z | ESC[34mINFO ESC[0;39m | CorfuServer | o.e.sizeof.impl.AgentLoader | Unavailable or unrecognised attach API : java.lang.ClassNotFoundException: com.sun.tools.attach.VirtualMachine
2019-09-20T06:03:43.180Z | ESC[34mINFO ESC[0;39m | CorfuServer | o.e.sizeof.impl.JvmInformation | Detected JVM data model settings of: 64-Bit HotSpot JVM with Compressed OOPs
2019-09-20T06:03:43.185Z | ESC[34mINFO ESC[0;39m | CorfuServer | org.corfudb.util.MetricsUtils | Corfu metrics collection and all reporting types are disabled
2019-09-20T06:03:43.836Z | ESC[39mDEBUGESC[0;39m | CorfuServer | i.n.util.ResourceLeakDetector | -Dio.netty.leakDetection.level: simple
2019-09-20T06:03:43.837Z | ESC[39mDEBUGESC[0;39m | CorfuServer | i.n.util.ResourceLeakDetector | -Dio.netty.leakDetection.targetRecords: 4
2019-09-20T06:03:43.845Z | ESC[39mDEBUGESC[0;39m | CorfuServer | i.netty.buffer.AbstractByteBuf | -Dio.netty.buffer.bytebuf.checkAccessible: true
2019-09-20T06:03:43.846Z | ESC[39mDEBUGESC[0;39m | CorfuServer | .u.ResourceLeakDetectorFactory | Loaded default ResourceLeakDetector: io.netty.util.ResourceLeakDetector@5efe6e77
2019-09-20T06:03:47.434Z | ESC[1;31mERRORESC[0;39m | CorfuServer | o.c.i.log.StreamLogFiles | Checksum mismatch detected while trying to read file sun.nio.ch.FileChannelImpl@82dd757
2019-09-20T06:03:47.671Z | ESC[1;31mERRORESC[0;39m | CorfuServer | o.c.infrastructure.CorfuServer | CorfuServer: Server exiting due to unrecoverable error:
org.corfudb.runtime.exceptions.DataCorruptionException: null
at org.corfudb.infrastructure.log.StreamLogFiles.parseEntry(StreamLogFiles.java:525)
at org.corfudb.infrastructure.log.StreamLogFiles.readAddressSpace(StreamLogFiles.java:562)
at org.corfudb.infrastructure.log.StreamLogFiles.lambda$getSegmentHandleForAddress$1(StreamLogFiles.java:677)
at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1688)
at org.corfudb.infrastructure.log.StreamLogFiles.getSegmentHandleForAddress(StreamLogFiles.java:660)
at org.corfudb.infrastructure.log.StreamLogFiles.initializeLogMetadata(StreamLogFiles.java:136)
at org.corfudb.infrastructure.log.StreamLogFiles.<init>(StreamLogFiles.java:112)
at org.corfudb.infrastructure.LogUnitServer.<init>(LogUnitServer.java:114)
at org.corfudb.infrastructure.CorfuServer.startServer(CorfuServer.java:304)
at org.corfudb.infrastructure.CorfuServer.lambda$main$0(CorfuServer.java:270)
at java.lang.Thread.run(Thread.java:748)
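The cbm.log warning includes a retry counter and a trigger limit ("Retried 7312 times, systemDownHandlerTriggerLimit = 90"). Here is a minimal Python sketch that pulls those two numbers out of a log line, so you can see at a glance how far past the limit the client is. The function names are my own, and my reading that passing the limit is what leads to the "Invoking the systemDownHandler" lines is an assumption based on the log text, not on NSX-T documentation.

```python
import re

# Pattern built from the warning text in the cbm.log excerpt above.
RETRY_RE = re.compile(
    r"Retried (?P<retries>\d+) times, "
    r"systemDownHandlerTriggerLimit = (?P<limit>\d+)"
)


def parse_retry_status(line):
    """Return (retries, limit) from a retry warning, or None if absent."""
    match = RETRY_RE.search(line)
    if match is None:
        return None
    return int(match.group("retries")), int(match.group("limit"))


def latest_retry_status(lines):
    """Return the (retries, limit) pair from the last matching line."""
    latest = None
    for line in lines:
        status = parse_retry_status(line)
        if status is not None:
            latest = status
    return latest
```

Feeding it the lines of a copy of cbm.log returns the most recent counter, which for the excerpt above is far beyond the limit.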
Cause :

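From the above logs, it’s clear that the Corfu database is not available due to data corruption: the Corfu server detects a checksum mismatch while reading its log files at startup and exits with a DataCorruptionException, which is why cbm.log keeps reporting failed connection attempts.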
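If you want to check a copy of corfu.9000.log for the same signature without scrolling through it, a small script can do the search. The following is a minimal Python sketch, not a VMware tool: the function names are hypothetical, the two marker strings are copied from the excerpt above, and it assumes the colour codes appear either as raw escape bytes or as the literal "ESC" text shown in this post.

```python
import re

# ANSI colour codes, either as the raw escape byte or as the literal
# "ESC" text that some pagers display in its place.
ANSI_RE = re.compile(r"(?:\x1b|ESC)\[[0-9;]*m")

# Marker strings taken from the corfu.9000.log excerpt in this post.
CORRUPTION_MARKERS = (
    "Checksum mismatch detected",
    "DataCorruptionException",
)


def find_corruption_lines(lines):
    """Return (line_number, cleaned_line) pairs that contain a marker."""
    hits = []
    for number, raw in enumerate(lines, start=1):
        cleaned = ANSI_RE.sub("", raw).rstrip("\n")
        if any(marker in cleaned for marker in CORRUPTION_MARKERS):
            hits.append((number, cleaned))
    return hits


def scan_file(path):
    """Scan a log file on disk; undecodable bytes are replaced, not fatal."""
    with open(path, errors="replace") as handle:
        return find_corruption_lines(handle)
```

Calling scan_file on a copy of the log returns an empty list when neither marker is present. An empty result only means these two strings were not found, not that the database is healthy.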

Resolution :

Unfortunately, the only way to recover from this state is to restore from a backup or deploy a new NSX-T Manager.

Note : If you face this issue, I highly recommend reaching out to the VMware NSX-T technical support team for assistance before making any further changes.

When I first saw the error message, it gave me little idea of what was causing the issue or whether it was recoverable. After reviewing the logs, the next steps were clear. I hope this blog post helps if you face this issue in your environment.

3 thoughts on “Logging into the NSX-T Manager gives error “Some of appliance components is not functioning””

  1. Had the same issue. Since the NSX-T Manager runs on top of Ubuntu, you can use the “last” command to see whether one of the users’ sessions ended with a crash, which is most likely an indicator of a power outage:

    root@nsx-t-node-2:/var/log/corfu# last
    admin pts/0 Thu Apr 30 16:26 still logged in
    reboot system boot 4.14.74-nn3-serv Thu Apr 30 15:54 still running
    admin pts/0 Thu Apr 30 12:26 - crash (03:28) ===> here!!
    reboot system boot 4.14.74-nn3-serv Thu Apr 30 10:02 still running
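    The check above can also be scripted. This is a minimal Python sketch with hypothetical function names; it assumes the output format shown in the listing, where the word "crash" appears in place of a logout time, and that the last command is on the PATH of the machine where you run it.

```python
import subprocess


def find_crashed_sessions(last_output):
    """Return the lines of `last` output whose session ended in a crash."""
    crashed = []
    for line in last_output.splitlines():
        # `last` prints the word "crash" where the logout time would be.
        if "crash" in line.split():
            crashed.append(line.strip())
    return crashed


def read_last_output():
    """Run `last` and return its text output."""
    result = subprocess.run(
        ["last"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

    Passing the text returned by read_last_output to find_crashed_sessions lists only the crashed sessions. A crash entry points to an unclean shutdown, but on its own it does not prove that a power outage was the cause.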

