I encountered an Availability Group sync issue with a couple of error messages that were new to me. Since I didn’t find much helpful information when searching for the error codes online, I wanted to make sure to document the “fix” for others and my future self.
Note: Apologies for the vague blog post title. I pondered ways to note the different error messages mentioned but didn’t want the title to be a mile long.
Health Check
The availability group in question was unhealthy, and none of the added databases were syncing. By the time I started investigating, the SQL service on the secondary had been restarted. There were also no recent errors in Failover Cluster Manager.
The SQL Server Error Log held some clues. It was filled with “Always On: DebugTraceVarArgs” errors for each database, all including the message:
“Seeding encountered a transient failure ‘108’, retrying…”
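If the log is long, a quick tally of the failure codes can confirm whether every entry carries the same code. Here is a minimal Python sketch, assuming the error log has been exported to plain text. Only the quoted message fragment is matched; the rest of the log line format is not assumed, and the function name is made up for illustration.

```python
import re

# Matches only the message fragment quoted above. Straight and curly quotes
# are both accepted around the numeric code, since exports can differ.
SEEDING_FAILURE = re.compile(
    r"Seeding encountered a transient failure\s+['‘’](\d+)['‘’]"
)


def seeding_failure_codes(log_text):
    """Count how often each transient seeding failure code appears."""
    counts = {}
    for match in SEEDING_FAILURE.finditer(log_text):
        code = int(match.group(1))
        counts[code] = counts.get(code, 0) + 1
    return counts
```

Calling `seeding_failure_codes` on the exported log text returns a dictionary keyed by failure code, so a single key suggests one underlying cause rather than several.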
I also checked sys.dm_hadr_availability_replica_states. This DMV showed the secondary with a connected_state_desc value of DISCONNECTED and a last_connect_error_description of:
“An error occurred while receiving data: ‘10054(An existing connection was forcibly closed by the remote host.)’.”
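For reference, the DMV check could look something like the sketch below. The query joins to sys.availability_replicas only to get a readable server name. The Python helper assumes the result set has already been fetched as a list of dictionaries keyed by column name; how the query is run (driver, connection) is left out, and the helper name is hypothetical.

```python
# T-SQL to run on the primary, where the DMV reports on every replica.
REPLICA_STATE_QUERY = """
SELECT ar.replica_server_name,
       rs.role_desc,
       rs.connected_state_desc,
       rs.synchronization_health_desc,
       rs.last_connect_error_number,
       rs.last_connect_error_description,
       rs.last_connect_error_timestamp
FROM sys.dm_hadr_availability_replica_states AS rs
JOIN sys.availability_replicas AS ar
    ON ar.replica_id = rs.replica_id;
"""


def disconnected_replicas(rows):
    """Return (server, last connect error) pairs for DISCONNECTED replicas.

    `rows` is assumed to be a list of dicts keyed by the column names above.
    """
    return [
        (row["replica_server_name"], row["last_connect_error_description"])
        for row in rows
        if row["connected_state_desc"] == "DISCONNECTED"
    ]
```

An empty list means every replica reported as connected at the time of the query.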
I followed the steps here to rule out some network issues and issues with the endpoint. Everything looked good. I did some more research and was coming up short.
Remove and Re-Add
I decided to take a noncritical database out of the availability group and add it back in, thinking I might get a different error if it failed to add. Nothing strange happened when removing the database from the availability group, and likewise nothing strange happened when adding it back. Both steps completed successfully.
In fact, as soon as the database was re-added, the availability group switched to a healthy state and all databases synced. I’m still not sure what caused the issue in the first place, but if I see the same group of error messages again, I’ll start by removing and re-adding a noncritical database to see if that kickstarts the other databases.
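The remove and re-add can be done through SSMS or with T-SQL; this post doesn’t depend on which. As a rough sketch of the T-SQL route, the Python helper below builds the two statements to run on the primary. The availability group and database names are placeholders, and the function names are made up for illustration. With manual seeding, the secondary copy would also need to be prepared and joined separately, which is not shown.

```python
def quote_name(name):
    """Bracket-quote a SQL Server identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def remove_and_readd_statements(ag_name, db_name):
    """Build the T-SQL to remove a database from an AG and add it back.

    Both statements are meant to be run on the primary replica, in order.
    """
    ag, db = quote_name(ag_name), quote_name(db_name)
    return [
        f"ALTER AVAILABILITY GROUP {ag} REMOVE DATABASE {db};",
        f"ALTER AVAILABILITY GROUP {ag} ADD DATABASE {db};",
    ]
```

Generating the statements rather than running them directly leaves room to review them before touching the availability group.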
Thanks for reading!