Database replication plays a critical role in a Call Manager cluster environment. It must function properly to provide uninterrupted services to the users.
How to check the DB Replication status of your Call Manager Cluster?
First, run the following command:
Check the output and verify whether the Cluster Replication State contains old synchronization information. If BROADCAST SYNC does not show a recent date and time, run the following command to check all the tables and replication. Any error or mismatch is shown in the output, and the REPLICATION SETUP (RTMT) state changes accordingly.
After running the above command, all the tables and replication are checked again for consistency and the exact status is displayed.
Run the command to check the progress. Wait for the process to complete, then run the command again to check the real-time status of the database replication.
| Value | Meaning | Description |
|---|---|---|
| 0 | Initialization State | Replication is currently being set up. A setup failure may have happened if replication stays in this state for over 60 minutes. (Either no subscribers exist, or the Database Layer Monitor service has not been running since the subscriber was installed.) |
| 1 | The Number of Replicates is Incorrect | Replication has been created, but the count is incorrect. |
| 2 | Replication is Good | Logical connections are established, and the tables match those on the other servers in the cluster. |
| 3 | Mismatched Tables | Logical connections are established, but there can be a mismatch in the tables. This can occur because the other servers are unsure whether there is an update to a User Facing Feature that has not been passed from the subscriber to the other devices in the cluster. |
| 4 | Setup Failed/Dropped | The server no longer has an active logical connection to receive any database table across the network. Replication does not occur in this state. |
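If you collect the RTMT state values from several nodes, a small lookup helper can make them easier to read. The following is a minimal Python sketch, not part of CUCM itself: the state values and names come from the table above, while `describe_state`, `cluster_is_healthy`, and the node names are hypothetical helpers introduced only for illustration.

```python
from typing import Dict, NamedTuple


class ReplicationState(NamedTuple):
    name: str
    healthy: bool


# Values and names taken from the RTMT state table above.
RTMT_STATES = {
    0: ReplicationState("Initialization State", False),
    1: ReplicationState("The Number of Replicates is Incorrect", False),
    2: ReplicationState("Replication is Good", True),
    3: ReplicationState("Mismatched Tables", False),
    4: ReplicationState("Setup Failed/Dropped", False),
}

# The table flags a possible setup failure after 60 minutes in state 0.
INIT_TIMEOUT_MINUTES = 60


def describe_state(value: int, minutes_in_state: float = 0) -> str:
    """Return a readable label for an RTMT replication state value."""
    state = RTMT_STATES.get(value)
    if state is None:
        return f"({value}) Unknown state"
    note = ""
    if value == 0 and minutes_in_state > INIT_TIMEOUT_MINUTES:
        note = " - possible setup failure (over 60 minutes)"
    return f"({value}) {state.name}{note}"


def cluster_is_healthy(states_by_node: Dict[str, int]) -> bool:
    """True only if at least one node is listed and every node reports state 2."""
    if not states_by_node:
        return False
    return all(
        value in RTMT_STATES and RTMT_STATES[value].healthy
        for value in states_by_node.values()
    )


if __name__ == "__main__":
    # Hypothetical node names and values, typed in by hand from the CLI output.
    observed = {"publisher": 2, "subscriber1": 3}
    for node, value in observed.items():
        print(node, describe_state(value))
    print("Cluster healthy:", cluster_is_healthy(observed))
```

The helper only interprets values you have already read from the command output; it does not query the cluster.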
Troubleshooting methodology to diagnose and resolve the broken Database Replication.
1. Ensure the Network Reachability of all the Nodes.
Check whether all the nodes are up and accessible by pinging each of them from a machine in the management VLAN (from where all nodes should be accessible).
If all the nodes are accessible, move to the next step. Otherwise, check the power status of the unreachable nodes or the network issues that make them unreachable.
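When the cluster has many nodes, the ping check from the management machine can be scripted. Below is a minimal Python sketch that assumes the system `ping` utility is available on that machine; the function names and the IP addresses are placeholders we made up, not values from any real cluster.

```python
import platform
import subprocess
from typing import Callable, Dict, Iterable, List


def ping_once(host: str, timeout_s: int = 5) -> bool:
    """Send one echo request with the system ping utility; True on a reply."""
    # Windows uses -n for the packet count, Unix-like systems use -c.
    count_flag = "-n" if platform.system().lower() == "windows" else "-c"
    try:
        result = subprocess.run(
            ["ping", count_flag, "1", host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def check_nodes(
    hosts: Iterable[str], ping: Callable[[str], bool] = ping_once
) -> Dict[str, bool]:
    """Ping every host and return a host -> reachable mapping."""
    return {host: ping(host) for host in hosts}


def unreachable(results: Dict[str, bool]) -> List[str]:
    """List the hosts that did not answer, sorted for stable output."""
    return sorted(host for host, ok in results.items() if not ok)


if __name__ == "__main__":
    # Placeholder addresses; replace with the nodes of your own cluster.
    nodes = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
    down = unreachable(check_nodes(nodes))
    print("All nodes reachable" if not down else f"Unreachable: {down}")
```

The `ping` parameter is injectable so the logic can be tested without sending real traffic.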
2. Ensure the Network Reachability Between all the CUCM Nodes.
Check the ping reachability using the command mentioned below from the Publisher node to all the subscriber nodes and vice versa.
If connectivity between the nodes is proper, move to the next step. Otherwise, troubleshoot the network issues and ensure that all the nodes can reach each other.
3. Ensure DNS and Reverse DNS Record Availability of all the Nodes.
If the DNS does not function correctly and the servers are defined using the hostnames, it can cause database replication issues.
Check DNS config using the command below:
If DNS is not configured properly, configure it using the following command and verify the connectivity. If it is already configured properly, move to the next step.
Once DNS is properly configured, verify the A and PTR records using the command mentioned below.
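The A and PTR records can also be cross-checked from a management machine that uses the same DNS servers as the cluster. The following Python sketch uses only the standard `socket` module; the helper names and the example hostname are hypothetical, and the comparison assumes you pass the fully qualified name that the PTR record is expected to return.

```python
import socket
from typing import Callable, Dict, Optional


def forward_lookup(hostname: str) -> Optional[str]:
    """Resolve a hostname to an IPv4 address (A record), or None on failure."""
    try:
        return socket.gethostbyname(hostname)
    except OSError:
        return None


def reverse_lookup(ip: str) -> Optional[str]:
    """Resolve an IP address back to a hostname (PTR record), or None."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def check_dns(
    hostname: str,
    forward: Callable[[str], Optional[str]] = forward_lookup,
    reverse: Callable[[str], Optional[str]] = reverse_lookup,
) -> Dict[str, object]:
    """Check that the A record and the PTR record of a node agree."""
    ip = forward(hostname)
    ptr = reverse(ip) if ip else None
    # Compare case-insensitively and ignore a trailing dot.
    consistent = (
        ptr is not None
        and ptr.rstrip(".").lower() == hostname.rstrip(".").lower()
    )
    return {
        "hostname": hostname,
        "a_record": ip,
        "ptr_record": ptr,
        "consistent": consistent,
    }


if __name__ == "__main__":
    # Placeholder FQDN; replace with the nodes of your own cluster.
    print(check_dns("cucm-pub.example.com"))
```

A result with `consistent` set to `False` points to a missing or mismatched record that should be fixed on the DNS server before continuing.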
4. Check NTP Reachability
A fully functional NTP setup is extremely important in order to avoid database replication issues.
Stratum also plays a role here: the stratum of the NTP server must be less than 3 for the proper functioning of database replication.
Check the NTP status using the following command:
Run the command below and check the following output:
If the utils ntp status command returns an error, check that the configured NTP server is correct and reachable over the network.
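If you save the output of the utils ntp status command to a text file, the stratum check can be automated. The sketch below is only an illustration: it assumes the output contains the word "stratum" followed by a number, and the sample lines are hypothetical, so compare them with the real output of your version before relying on the pattern. The threshold follows the rule stated above.

```python
import re
from typing import Optional

# From the text above: the stratum must be less than 3.
MAX_STRATUM_EXCLUSIVE = 3


def extract_stratum(text: str) -> Optional[int]:
    """Return the first number that follows the word 'stratum', if any."""
    match = re.search(r"stratum\s+(\d+)", text, re.IGNORECASE)
    return int(match.group(1)) if match else None


def stratum_ok(text: str) -> bool:
    """True if a stratum value was found and it is below the threshold."""
    stratum = extract_stratum(text)
    return stratum is not None and stratum < MAX_STRATUM_EXCLUSIVE


if __name__ == "__main__":
    # Hypothetical sample lines; the real command output may look different.
    samples = [
        "synchronised to NTP server (10.0.0.10) at stratum 2",
        "synchronised to NTP server (10.0.0.10) at stratum 4",
        "unsynchronised",
    ]
    for line in samples:
        print(repr(line), "->", extract_stratum(line), stratum_ok(line))
```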
In case you get the following output:
Cisco bug CSCvs38748 (https://bst.cloudapps.cisco.com/bugsearch/bug/CSCvs38748) can be the reason for this issue. The workaround for this bug requires root access, so contact Cisco TAC to resolve the issue. Before contacting TAC, collect the output of the following command and share it when opening the case; this will save you time.
5. Connectivity Status Check of all the Nodes.
If all the above checks pass, run the following command on all the nodes in the cluster to check whether the database connectivity is successful.
If you get an error message, check whether the required TCP/UDP ports are open in the network.
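A quick way to see whether a firewall blocks a TCP port between two network segments is a plain connection attempt from a test machine. The Python sketch below is generic: the host and the port numbers in the usage example are placeholders, so take the actual port list from the Cisco port usage guide for your CUCM version. UDP ports cannot be verified this way.

```python
import socket
from typing import Dict, Iterable


def tcp_port_open(host: str, port: int, timeout_s: float = 2.0) -> bool:
    """Try a TCP connection; True if the handshake completes in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def probe_ports(host: str, ports: Iterable[int]) -> Dict[int, bool]:
    """Probe several TCP ports on one host and return port -> open."""
    return {port: tcp_port_open(host, port) for port in ports}


if __name__ == "__main__":
    # Placeholder host and ports, for illustration only.
    for port, is_open in probe_ports("10.0.0.1", [443, 8443]).items():
        print(f"TCP {port}: {'open' if is_open else 'closed or filtered'}")
```

A closed result only shows that the test machine cannot connect; the path between the CUCM nodes themselves may differ, so treat it as a hint for the network team.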
Now run the command to check the authentication of all the nodes.
If the status of any node is shown as unauthenticated, check that:
- Network connectivity is there.
- Security Password is the same on all the nodes.
Refer to "How to reset passwords on CUCM" and "CUCM Operating System Administrator Password Recovery" to change or recover your password.
6. Check Required Services are Running.
Check whether the A Cisco DB service is running on all the nodes using the command mentioned below:
If the A Cisco DB service is down on any node, run the command to start the service. If the service fails to start, contact Cisco TAC (restarting the node can resolve the issue, but this is not recommended).
7. Repairing Specific/all the Tables from Database Replication.
If all the above steps have been followed and everything except the database replication is functioning properly, the next step is to repair the database tables.
To repair the database replication, use one of the following commands as per the requirement of your environment and the issue you are getting.
Run the command to check the status of the replication again after repair. If the status does not change, move to the next step.
8. Reset Database Replication.
If the above step does not resolve your issue, reset the database replication from scratch. To reset the database replication properly, follow these steps in order:
- On the Subscriber node, run the command and wait for it to complete before starting the next step.
- On the Publisher node, run the command and wait for it to complete before starting the next step.
- On the Publisher and the Subscriber, run the command. Ensure that both servers are RPC reachable.
- On the Publisher node, run the command and wait for it to complete before starting the next step.
- On the Subscriber node, run the command and wait for it to complete before starting the next step.
- On the Publisher node, run the command and wait for it to complete before starting the next step.
- Restart the Subscriber and wait for it to come up with all the services running.
- On both the Publisher and Subscriber nodes, periodically run the command and monitor the RTMT state. If replication sets up properly, the states progress and end up at (2), i.e., Replication is Good. The process can take some time, so wait for it.
- If the RTMT states do not become (2) after a sufficient amount of time, run the following command on all the nodes and collect the information:
Now open a TAC case and share the collected information with the TAC.
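The monitoring part of the reset procedure (check the RTMT state periodically until it reaches 2, or give up and collect data for TAC) can be sketched as a simple polling loop. This Python example is an illustration only: `get_state` is a hypothetical callable that returns the current RTMT state as an integer, however you choose to obtain it, and the default timeout and interval are assumptions, since the procedure above does not give exact figures.

```python
import time
from typing import Callable

GOOD_STATE = 2  # "Replication is Good" in the RTMT state table


def wait_for_good_state(
    get_state: Callable[[], int],
    timeout_s: float = 3600.0,   # assumed default, adjust to your cluster
    interval_s: float = 60.0,    # assumed default polling interval
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll get_state until it returns 2; False if the timeout expires."""
    deadline = clock() + timeout_s
    while True:
        if get_state() == GOOD_STATE:
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


if __name__ == "__main__":
    # Simulated sequence of states, as if read from the CLI over time.
    simulated = iter([0, 0, 2])
    ok = wait_for_good_state(
        lambda: next(simulated), timeout_s=5, interval_s=0.1
    )
    print("Replication is Good" if ok else "Timed out, collect data for TAC")
```

The `sleep` and `clock` parameters exist so the loop can be tested without waiting in real time.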
We hope this article helps you understand how to check the database replication status and how to diagnose and resolve DB replication issues, if any.
Are you looking for consulting, advisory and professional services to deploy a Collaboration Environment for your organization? Zindagi can help.
Zindagi Technologies Pvt. Ltd. is an IT consultancy and professional services organization based out of New Delhi, India. We have expertise in planning, designing, and deployment of collaboration environments, large-scale data centers, Private/Public/Hybrid cloud solutions. We believe in “Customer First” and provide quality services to our clients always.
Network Consulting Engineer