I have been working on this issue for a while now and am still waiting for a fix :) Since I have not found any threads or information about it on the web, I thought it could be helpful for others if I shared what I know.
Quick information about the Exchange environment:
We have two sites with Exchange 2010. Each site has two Exchange servers and TMG. One server is a DAG member and the other runs CAS/HT. Both sites are internet-facing, and TMG publishes Exchange in both sites. The primary site publishes e.g. mail.domain.com and the DR site publishes mail-dr.domain.com. Site A is the primary Exchange site, Site B is for DR. All clients connect to the primary site, except autodiscover, which points to the DR site.
So what is the issue:
When the DAG fails over, or if you switch over manually, some random ActiveSync devices do not get redirected to the new active site. If you manually enter the correct URL on the device, the sync starts. OWA and Outlook work as expected. When you do a new failover, the same device may be redirected correctly, but then some other device may have the problem.
So what is going on:
I first saw the issue after we applied Exchange 2010 SP3. Before this the issue did not exist (not that I am aware of). When I first saw the problem, I started to collect information. First of all, I checked the event log on the Exchange servers. On the site that it was failing over from, I found this error in the Application log on the CAS server:
The Client Access server doesn’t have the InternalURL value set for the Microsoft-Server-ActiveSync virtual directory. This prevents Exchange ServiceDiscovery from finding the MobileSyncService information for user "MBX home server" At least one Client Access server in the user’s mailbox Active Directory site must have the InternalURL value set. The format for the InternalURL value is https://hostname/Microsoft-Server-ActiveSync"
I found out that this event was logged every time one of the devices that was unable to redirect tried to sync. There were no other events related to this.
I then checked the IIS logs on the site where the DAG was not active (default location C:\inetpub\logs\LogFiles\W3SVC1) and found this:
2013-05-15 02:16:29 IP_CAS_Server OPTIONS /Microsoft-Server-ActiveSync/default.eas &Log=RdirTo:https%3a%2f%2fPrimarysiteURL%2fMicrosoft-Server-ActiveSync_V0_LdapC1_LdapL109_Cpo19890_Fet19999_S130_Error:DiscoveryInfoMissing_Mbx:FQDN_MBXServer_Budget:(D)Conn%3a1%2cHangingConn%3a0%2cAD%3a%24null%2f%24null%2f1%25%2cCAS%3a%24null%2f%24null%2f0%25%2cAB%3a%24null%2f%24null%2f0%25%2cRPC%3a%24null%2f%24null%2f0%25%2cFC%3a1000%2f0%2cPolicy%3aDefaultThrottlingPolicy%5F221b878b-f351-48f1-b524-2ae00cda8947%2cNorm_ 443 domainusername IP-address from TMG publishing Apple-iPhone4C1/1002.329 403 0 0 19999
So what does this mean:
1. We can see that the ActiveSync virtual directory is trying to redirect to the primary site (RdirTo: https://PrimarysiteURL/Microsoft-Server-ActiveSync)
2. The mailbox server in the primary site is FQDN_MBXServer (the Mbx: value)
3. The redirect failed: the error information is ‘DiscoveryInfoMissing’ and the error code is ‘403’ instead of the 451 redirect it should have got.
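To check many log lines for the same pattern, the three points above can be pulled out of a log line with a small script. This is only a rough sketch in Python, not anything provided by Exchange or Microsoft: the function names parse_eas_line and is_failed_redirect are made up for illustration, the sample line is the one from my log with the Budget details shortened, and the regular expressions assume the RdirTo/Error/Mbx layout seen in that line.

```python
import re
from urllib.parse import unquote

# Sample line based on the IIS log entry above; the Budget section is
# shortened and the placeholder values (PrimarysiteURL, FQDN_MBXServer) are kept.
SAMPLE = (
    "2013-05-15 02:16:29 IP_CAS_Server OPTIONS "
    "/Microsoft-Server-ActiveSync/default.eas "
    "&Log=RdirTo:https%3a%2f%2fPrimarysiteURL%2fMicrosoft-Server-ActiveSync"
    "_V0_LdapC1_LdapL109_Cpo19890_Fet19999_S130"
    "_Error:DiscoveryInfoMissing_Mbx:FQDN_MBXServer_Budget:(D)Conn%3a1_ "
    "443 domainusername IP-address Apple-iPhone4C1/1002.329 403 0 0 19999"
)


def parse_eas_line(line):
    """Extract redirect target, error, mailbox server and HTTP status.

    Hypothetical helper: assumes the field layout of the sample line.
    """
    redirect = re.search(r"RdirTo:(.+?)_V\d+_", line)
    error = re.search(r"Error:([A-Za-z]+)", line)
    mailbox = re.search(r"Mbx:(.+?)_Budget:", line)
    # The line ends with: sc-status sc-substatus sc-win32-status time-taken
    status = re.search(r"\s(\d{3})\s+\d+\s+\d+\s+\d+\s*$", line)
    return {
        # The redirect URL is URL-encoded in the log (%3a = ":", %2f = "/")
        "redirect_to": unquote(redirect.group(1)) if redirect else None,
        "error": error.group(1) if error else None,
        "mailbox_server": mailbox.group(1) if mailbox else None,
        "status": int(status.group(1)) if status else None,
    }


def is_failed_redirect(fields):
    """True when a redirect was attempted but the client got 403, not 451."""
    return fields["redirect_to"] is not None and fields["status"] == 403


if __name__ == "__main__":
    fields = parse_eas_line(SAMPLE)
    print(fields)
    print("failed redirect:", is_failed_redirect(fields))
```

Running something like this over every line of the W3SVC1 logs and keeping only the lines where is_failed_redirect returns True would give a list of the affected devices and users for each failover.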
I have not found any more information on this, so I opened a case with Microsoft. This was the answer I got:
Then I do some researches on our situation, after applying the SP3, when we manually switch the DAG over back to primary site again, the ActiveSync device cannot automatically redirect to the primary site, and get a 451 redirect, instead getting a 403 error. And it is a known issue, we have reported this problem to our product group. It will be fixed in the next generation SP3 RU2. Then let’s wait the update patiently. Hope you can understand.
I will continue to work with support to hopefully get a fix for this before RU2.