Skype for Business Edge Server replication troubleshooter

The most common issue I come across in Skype for Business/Lync deployments is that the Edge server(s) fail to replicate.
We see this when we run the following command in the Skype for Business Management Shell:
Get-CsManagementStoreReplicationStatus
The Edge servers show UpToDate as False, and no LastStatusReport or ProductVersion is shown.
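To pick out only the failing replicas rather than reading the whole table, the output can be filtered. This is a minimal sketch: the helper name Select-StaleReplica is my own invention, and it assumes the status objects expose the UpToDate, ReplicaFqdn, LastStatusReport and ProductVersion properties.

```powershell
# Hypothetical helper (not part of the product): passes through only the
# replicas that are NOT up to date.
function Select-StaleReplica {
    param(
        [Parameter(ValueFromPipeline = $true)]
        $Status
    )
    process {
        if (-not $Status.UpToDate) {
            $Status | Select-Object ReplicaFqdn, UpToDate, LastStatusReport, ProductVersion
        }
    }
}

# Usage, in the Skype for Business Management Shell:
# Get-CsManagementStoreReplicationStatus | Select-StaleReplica
```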



A number of things can cause this, and I am going to list them here. Work through them in order, as they run from the most common to the least common issues I have found.
- Routing
- Firewall rules
- DNS suffix
- DNS (Local and LAN)
- Registry entries for SCHANNEL
- Certificates

A quick check to see whether replication is working is to browse to the following URL from the Front End Server which hosts the CMS: https://edgeinternalfqdn.domain.ie:4443/Replicationwebservice
You should get the following screen:

Without this response there is an issue. Even with this response there can still be issues, so please go the whole way through the process below.

 

Routing

In Skype for Business Edge Server deployments we use two interfaces on the Edge Servers. One is for the External DMZ, which is directly connected via NAT to the Internet side and is non-routable to the LAN. The other is for the Internal DMZ, which is on a unique subnet and is routable by ALL LAN clients and Skype for Business/Lync servers.
Although the server has two NICs, only ONE of them can have a Default Gateway assigned. This default gateway MUST be assigned to the EXTERNAL DMZ interface.
The Default Gateway on the INTERNAL DMZ interface is left blank.
So how do we route traffic back to our LAN? We use Static Routes!
To add a route, we need to first know what the default gateway IP address is for the INTERNAL DMZ interface (Usually the Firewall interface assigned to that subnet).
In a Command Prompt on the Edge Server we can type the following:
Route Add <destination network> MASK <subnet mask> <gateway IP> -p
The -p at the end is very important as it makes the route Persistent, so it will not be removed upon reboot.
For example (this is needed for each of the internal subnets used):
Route Add 192.168.62.0 MASK 255.255.255.0 10.1.20.1 -p
Once this is in, we can type Route Print to show the local server routing table. Pay attention to the Persistent Routes section.
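The same route can also be added from PowerShell with the NetTCPIP cmdlets on Windows Server 2012 and later. This is only a sketch: the wrapper name Add-InternalLanRoute and the interface alias 'Internal' are assumptions, so substitute the name of your own INTERNAL DMZ NIC. The subnet and gateway in the usage lines are the example values from above.

```powershell
# Hypothetical wrapper around New-NetRoute for the INTERNAL DMZ NIC.
function Add-InternalLanRoute {
    param(
        [string]$DestinationPrefix,            # e.g. '192.168.62.0/24'
        [string]$NextHop,                      # e.g. '10.1.20.1'
        [string]$InterfaceAlias = 'Internal'   # assumption: rename to match your NIC
    )
    # With no -PolicyStore given, New-NetRoute adds the route to both the
    # active and the persistent store, so it survives a reboot.
    New-NetRoute -DestinationPrefix $DestinationPrefix -NextHop $NextHop -InterfaceAlias $InterfaceAlias
}

# Usage on the Edge Server:
# Add-InternalLanRoute -DestinationPrefix '192.168.62.0/24' -NextHop '10.1.20.1'
# Get-NetRoute -PolicyStore PersistentStore   # list the persistent routes
```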


You may or may not (probably not in most secure environments) be able to ping the gateway address to confirm the connection.
NOTE: It is vitally important that the firewall has a route to the Internal LAN and, vice versa, that the LAN default gateway can route back to the INTERNAL DMZ.
If this checks out, move on to the next section. If not, please check with the networks team.

Firewall Rules

There are a number of firewall rules which must be in place for replication to work to the edge. The important thing here is that the rules are in place from the Skype for Business or Lync CMS server. This may still be on a Lync 2013 Front End server in the case of a migration scenario.
You can easily find which Front End server hosts the CMS by using the command:
Get-CsManagementStoreReplicationStatus -CentralManagementStoreStatus

We must have TCP ports 443 and 4443 (most importantly) open from the CMS to the Internal DMZ interface on the Edge Server.

We can test this from the Front End server which hosts the CMS by doing the following test:
From a command prompt run Telnet <Edge internal FQDN> 4443
If successful, you will get a blank window with a flashing cursor. You can safely close the window once this is checked.
This will prove your Routing is correct and the Firewall port is open.
If this checks out, move on to the next section. If not, please check with the firewall team.
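If the Telnet client is not installed on the Front End, the same port test can be done from PowerShell with Test-NetConnection (available from Windows Server 2012 R2). A minimal sketch follows; the function name Test-EdgeReplicationPort is hypothetical, and the FQDN in the usage line is the placeholder used earlier in this post.

```powershell
# Hypothetical helper: tests TCP ports from the CMS Front End to the
# Edge Server's internal interface and reports one row per port.
function Test-EdgeReplicationPort {
    param(
        [string]$EdgeFqdn,
        [int[]]$Port = @(443, 4443)
    )
    foreach ($p in $Port) {
        $result = Test-NetConnection -ComputerName $EdgeFqdn -Port $p -WarningAction SilentlyContinue
        [pscustomobject]@{
            EdgeFqdn = $EdgeFqdn
            Port     = $p
            Open     = $result.TcpTestSucceeded
        }
    }
}

# Usage, from the Front End Server which hosts the CMS:
# Test-EdgeReplicationPort -EdgeFqdn 'edgeinternalfqdn.domain.ie'
```

Any row showing Open as False points back at routing or the firewall rule for that port.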

DNS Suffix

On the Edge Server, remember that the server itself MUST be in a WORKGROUP and not a domain. We must, however, apply the domain suffix to the server name.
To do this, open the properties of My Computer and click Change Settings > Change > More, then enter the internal LAN domain name. Click OK the whole way back and then reboot the server at the prompt.
If your domain name is already there, you don’t need to redo this or reboot.
If this checks out, move on to the next section. If not, please check with the server team if you are unable to change it yourself.
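The suffix can also be read without opening the GUI. The sketch below is read-only and assumes that the .NET DomainName property reflects the primary DNS suffix on a workgroup machine, which is what I would expect on Windows; the function name Get-PrimaryDnsSuffix is my own.

```powershell
# Hypothetical helper: returns the DNS suffix of the local machine as
# reported by .NET (an empty string if none is set).
function Get-PrimaryDnsSuffix {
    [System.Net.NetworkInformation.IPGlobalProperties]::GetIPGlobalProperties().DomainName
}

# Usage on the Edge Server:
# Get-PrimaryDnsSuffix                                   # should show the internal LAN domain name
# (Get-CimInstance Win32_ComputerSystem).PartOfDomain    # should be False for a WORKGROUP server
```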

DNS

DNS plays a huge role in Skype for Business, and this includes the Edge Server naming. Adding the Edge Server(s) DNS name to the local DNS server is often forgotten. All clients and Skype for Business servers need to be able to resolve the Edge Server(s) by DNS name. These records must be added manually, because the Edge Servers are not part of the domain and so do not auto-register in DNS.
On the Edge Server(s) themselves, we need to either allow access to the LAN DNS servers from the INTERNAL DMZ interface or use a HOSTS file on the Edge Server(s) and populate it with all the Skype for Business server names.
Ensure that the other Edge Servers' names are also added to the HOSTS file, along with Front End Servers, pool names and CA servers if required.
You should be able to resolve the Front End Server(s) and pool addresses through ping, even though they may not respond to a ping.

If this checks out, move on to the next section. If not, please check with the server team or networks/DNS team, if you are unable to change yourself.
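Rather than pinging each name in turn, resolution can be checked in one pass. This sketch uses the .NET resolver, which should consult the HOSTS file as well as DNS; the function name Test-NameResolution and the server names in the usage line are placeholders of mine.

```powershell
# Hypothetical helper: reports whether each name resolves on this machine.
function Test-NameResolution {
    param([string[]]$Name)
    foreach ($n in $Name) {
        try {
            $addresses = [System.Net.Dns]::GetHostAddresses($n)
            [pscustomobject]@{ Name = $n; Resolved = $true; Address = ($addresses -join ', ') }
        }
        catch {
            # Resolution failed: no HOSTS entry and no DNS answer.
            [pscustomobject]@{ Name = $n; Resolved = $false; Address = $null }
        }
    }
}

# Usage on the Edge Server (placeholder names):
# Test-NameResolution -Name 'fe01.domain.ie', 'pool01.domain.ie'
```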

Registry Entries

Due to some certificate handling issues in Windows Server 2012 R2, the following Registry entries are often required.
The location is: [HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL]

Copy the below into a .reg file and run:

****
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL]
"ClientAuthTrustMode"=dword:00000002
"SendTrustedIssuerList"=dword:00000000
"EnableSessionTicket"=dword:00000002
****
NOTE: It is important that the Edge Server is rebooted once these entries are created. It will not update without a reboot.
This is a common fix and resolves many of the replication issues.
If this checks out, move on to the next section. If not, please check with the server team if you are unable to change it yourself.
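To confirm the three values are actually in place, they can be read back and compared. A minimal sketch: Compare-SchannelSetting is a hypothetical helper name, and the expected values are simply those from the .reg file above.

```powershell
# Expected values, taken from the .reg file above.
$expected = @{ ClientAuthTrustMode = 2; SendTrustedIssuerList = 0; EnableSessionTicket = 2 }

# Hypothetical helper: compares the values read from the registry with
# the expected ones and reports one row per value.
function Compare-SchannelSetting {
    param($Actual, [hashtable]$Expected)
    foreach ($name in $Expected.Keys) {
        [pscustomobject]@{
            Name     = $name
            Expected = $Expected[$name]
            Actual   = $Actual.$name
            Match    = ($Actual.$name -eq $Expected[$name])   # a missing value never matches
        }
    }
}

# Usage on the Edge Server:
# $actual = Get-ItemProperty 'HKLM:\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL'
# Compare-SchannelSetting -Actual $actual -Expected $expected
```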

Certificates

Certificates are a MAJOR part of the replication process in Skype for Business. We need to ensure a few things when it comes to certificates.
The certificate on the Edge Server(s) must be applied correctly in the Deployment Wizard.
The Edge Server must have a copy of the Internal Root CA and Intermediate CA certificates installed manually. This should include ALL CA certs used in Skype for Business or Lync, as different Certificate Authorities may be in use. We can check the certificates by opening the Local Computer certificate store on the server. We can quickly get to this console by running certlm.msc from the Start/Run line.
We also need to check the certificate which is assigned to the server role in the Deployment Wizard, to ensure it has not expired and the names are correct. The Subject Name must be the Pool Name in a pool deployment or the server name in a single server deployment.
Check the Valid To date and that the private key exists on the cert.
We also check the Certification Path to ensure any Intermediate Certificate Authority and Root Certificate Authority are shown correctly.
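The expiry date and private key checks can also be done from PowerShell. The sketch below is my own shorthand, not a product cmdlet: Get-CertHealth is a hypothetical name, and it relies only on the standard NotAfter and HasPrivateKey properties of the certificate objects.

```powershell
# Hypothetical helper: summarises expiry and private key state for each
# certificate piped in.
function Get-CertHealth {
    param(
        [Parameter(ValueFromPipeline = $true)]
        $Certificate
    )
    process {
        [pscustomobject]@{
            Subject       = $Certificate.Subject
            NotAfter      = $Certificate.NotAfter
            Expired       = ($Certificate.NotAfter -lt (Get-Date))
            HasPrivateKey = $Certificate.HasPrivateKey
            Thumbprint    = $Certificate.Thumbprint
        }
    }
}

# Usage on the Edge Server:
# Get-ChildItem Cert:\LocalMachine\My | Get-CertHealth | fl
```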



Following this, we need to check the Certificate Stores for any inconsistencies, as Windows Server 2012 R2 is very sensitive in how it handles certificates. A cert in the wrong store or with a duplicate Friendly Name can stop everything in Skype for Business, even if the cert is not related to the service!!
Run the following checks, all from Windows PowerShell. Our examples show errors for reference:

Check #1 – Misplaced certificates in Trusted Root CA

First we check for certs in the Root CA store which are not Root Certs (Intermediate or personal).

Get-Childitem cert:\LocalMachine\root -Recurse | Where-Object {$_.Issuer -ne $_.Subject} | Select Issuer, Subject, Thumbprint | fl

To solve this we need to move the certificate to the proper Store. In this case, we should move it to the Intermediate Certification Authority.
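The move can be done in the certlm.msc console by drag and drop, or from PowerShell as sketched here. This assumes the certificate provider's Move-Item support between stores in the same location, which as far as I know does not carry private keys across (not a concern for an intermediate CA cert). The function name is hypothetical and the thumbprint is whatever the check above returned.

```powershell
# Hypothetical helper: moves a misplaced certificate from Trusted Root
# to the Intermediate Certification Authorities store.
function Move-CertToIntermediateStore {
    [CmdletBinding(SupportsShouldProcess = $true)]
    param([string]$Thumbprint)

    $source = "Cert:\LocalMachine\Root\$Thumbprint"
    if ($PSCmdlet.ShouldProcess($source, 'Move to Cert:\LocalMachine\CA')) {
        Move-Item -Path $source -Destination 'Cert:\LocalMachine\CA'
    }
}

# Usage (run with -WhatIf first to see what would be moved):
# Move-CertToIntermediateStore -Thumbprint '<thumbprint from the check>' -WhatIf
```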

Check #2 – Duplicates in Trusted Root CA

Next we check for any duplicate certs in the Root CA store.
Get-Childitem cert:\LocalMachine\root | Group-Object -Property Thumbprint | Where-Object {$_.Count -gt 1} | Select-Object -ExpandProperty Group | Select FriendlyName, Issuer, Subject, Thumbprint | fl

Check #3 – More than 100 certificates in Trusted Root CA

This is important, as it may cause sign-in issues for users. Most of the time we have fewer than 50 certificates. The limit here is 100.
Get-Childitem cert:\LocalMachine\root | Measure

To solve this we have to keep just the certificates that we need. On a Front End this is an easy task, but on an Edge Server we need to be more careful, since federation with other Skype for Business/Lync environments might break if we delete the wrong certificate. From a quick look we should be able to spot any foreign certs which are not required. Bring the number below 100 and reboot to speed up the refresh.

Check #4 – Root CA certificates in Personal Store

We should also remove any Root CAs from the personal store. These should not be here.
Get-Childitem cert:\LocalMachine\my -Recurse | Where-Object {$_.Issuer -eq $_.Subject} | Select FriendlyName, Issuer, Subject, Thumbprint | fl
Simply remove it or, better still, move it to the correct store (usually Root or Intermediate) to ensure it is there.

Check #5 – Duplicated Friendly Name

Usually, we add different Friendly Names so it gets easier to assign the certificate. In this case, however, it actually gets to be a requirement:
Note: Each certificate Friendly Name must be unique in the computer store.
Get-Childitem cert:\LocalMachine\my | Group-Object -Property FriendlyName | Where-Object {$_.Count -gt 1} | Select-Object -ExpandProperty Group | Select FriendlyName, Issuer, Subject, Thumbprint | fl
Rename the Friendly Name on the cert if required. Open the properties of the cert, change the name and click OK.

Check #6 – Misplaced Root CA certificates in Intermediate CA store

Finally, check the Intermediate store for Root Certificates. The example shown below will come up, but this is an inbuilt certificate from Microsoft and you DO NOT need to remove it. Any others, though, you should remove.
Get-ChildItem Cert:\localmachine\CA | Where-Object {$_.Issuer -eq $_.Subject} | Select Issuer, Subject, Thumbprint | fl

Recheck

By now this should have resolved your replication issues. To recheck, simply run the command below in the Skype for Business Management Shell:
Get-CsManagementStoreReplicationStatus
The Edge servers should now show UpToDate as True, with LastStatusReport and ProductVersion populated.
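Replication can take a little while to catch up, so it may help to force it and then poll the status. A minimal sketch, assuming the Invoke-CsManagementStoreReplication cmdlet and the -ReplicaFqdn parameter on both cmdlets; the function name Wait-ReplicaUpToDate and the retry values are my own.

```powershell
# Hypothetical helper: forces replication to one replica, then polls
# its status until it reports UpToDate or the attempts run out.
function Wait-ReplicaUpToDate {
    param(
        [string]$ReplicaFqdn,
        [int]$Attempts = 10,
        [int]$DelaySeconds = 30
    )
    Invoke-CsManagementStoreReplication -ReplicaFqdn $ReplicaFqdn
    for ($i = 1; $i -le $Attempts; $i++) {
        $status = Get-CsManagementStoreReplicationStatus -ReplicaFqdn $ReplicaFqdn
        if ($status.UpToDate) { return $status }
        Start-Sleep -Seconds $DelaySeconds
    }
    Write-Warning "$ReplicaFqdn is still not up to date after $Attempts checks."
    $status
}

# Usage, in the Skype for Business Management Shell:
# Wait-ReplicaUpToDate -ReplicaFqdn 'edgeinternalfqdn.domain.ie'
```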

Hopefully this has resolved your replication issues!
Martin



