I recently ran into a bit of an issue at a company I was working with regarding their Exchange 2007 installation. At the company, we had installed two Edge Transport servers in their main site in NY, and one Edge Transport server at their secondary location in Boston.
One of their users complained that when they sent email to a specific domain, say xzy.com, they were getting delivery delayed messages, and eventually full-blown NDR messages:
As we can see by the NDR, the message expired. This happens when Exchange cannot connect to the destination mail server (for example, because the receiving company’s server is down), or when there is no path to the destination. The first is usually the fault of the receiving organization; the second is usually a configuration problem on your end. Needless to say, we set out to troubleshoot.
Some basic stuff out of the way first. All three Edge servers were configured identically, and all experienced exactly the same behavior. All use Send Connectors configured to deliver email using DNS, and thus connect directly to the destination domain.
The first thing I did was the old telnet-to-the-email-server trick, to see if I could send email manually through telnet. First, you need to figure out the receiving email server for the domain in question. You can do this using the nslookup command. In a command prompt, enter the command nslookup and hit return. This will cause the machine you’re working on to connect to its default DNS server in interactive NSLOOKUP mode.
Now, how do we find the server or servers that are responsible for receiving email on behalf of the organization? Why, by searching for MX records of course! In NSLOOKUP, tell the DNS server that you only want to know the MX records for a domain. You do this by setting the query to MX only with the set q=mx command:
Next, enter the domain you are having trouble with, such as espn.com:
All the entries listed are servers that receive email on behalf of espn.com. Next, we want to see if we can send email to those servers using telnet. Which server should we pick? Honestly, since they all have the same priority, to be really thorough you can try them all. If one had a lower preference number (which means a higher priority), start with that one, and work your way up through the highest number.
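As a rough sketch of the ordering rule above, the Python below sorts MX records by preference so the server that should be tried first comes out on top. The `mx*.example.com` host names and preference values are hypothetical placeholders, not the real records for any domain, and `mx_lookup_command` simply builds the one-shot form of the `set q=mx` lookup (it assumes `nslookup` is on the PATH if you actually run it).

```python
import subprocess


def mx_lookup_command(domain):
    """Build the one-shot equivalent of 'set q=mx' followed by the domain."""
    return ["nslookup", "-type=mx", domain]


def run_mx_lookup(domain):
    """Run nslookup and return its raw text output (needs nslookup on PATH)."""
    result = subprocess.run(
        mx_lookup_command(domain), capture_output=True, text=True, timeout=30
    )
    return result.stdout


def order_mx_hosts(records):
    """Return hosts sorted by MX preference, lowest number first."""
    return [host for _preference, host in sorted(records)]


# Hypothetical records, written as (preference, host) pairs.
sample_records = [
    (20, "mx2.example.com"),
    (10, "mx1.example.com"),
    (20, "mx3.example.com"),
]

for host in order_mx_hosts(sample_records):
    print(host)
```

Hosts that share a preference value are equally valid targets, which is why testing each of them is the thorough option.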
Open up another command prompt, and run the command telnet nameofserver.domain.com 25 and hit enter:
This is your greeting from the remote server. Now, to send email, you must issue several commands. First, introduce yourself using the helo yourserver.yourdomain.com command. Replace yourserver.yourdomain.com with the external DNS name of the server you’re connecting from. If you don’t, the remote server may tell you that you forged your name, and it may reject your email:
The whole process will end up looking like this:
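For reference, a manual session follows the standard SMTP command order: HELO, MAIL FROM, RCPT TO, DATA, the message itself, a single dot on its own line, then QUIT. The sketch below only prints the lines you would type into the telnet window; it does not connect anywhere, and every name and address in it is a hypothetical placeholder.

```python
def smtp_dialogue(helo_name, sender, recipient, subject, body):
    """Return the lines to type, in order, for a minimal manual SMTP test."""
    return [
        f"HELO {helo_name}",         # introduce yourself with your external DNS name
        f"MAIL FROM:<{sender}>",     # envelope sender
        f"RCPT TO:<{recipient}>",    # envelope recipient at the problem domain
        "DATA",                      # server replies 354, then expects the message
        f"Subject: {subject}",
        "",                          # blank line separates headers from the body
        body,
        ".",                         # a lone dot ends the message
        "QUIT",
    ]


# Hypothetical values for illustration only.
for line in smtp_dialogue(
    "yourserver.yourdomain.com",
    "you@yourdomain.com",
    "someone@example.com",
    "Telnet test",
    "This is a manual SMTP test.",
):
    print(line)
```

The remote server answers each command with a numeric reply code, so a rejection shows up right after the command that caused it.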
Okay, well, this means email was successfully sent from my Edge Transport server to their email servers. So that means that there wasn’t any connectivity issue between the servers. Hmmm. Next, on to Connectivity Logging!
Connectivity logging is not turned on by default, but can be configured by right-clicking on the server in the console, and going to Properties->Log Settings:
Check the box to “Enable Connectivity Logging” and then select browse for the location.
When I checked the log, this is what I found for the domain in question:
The 72.21….. IP address is actually my DNS server for this particular Edge Transport server, and as you can see, it’s returning a DNS failure for that domain. Well, that’s odd, given that with NSLOOKUP we had already determined that we could resolve names for that domain on this particular server.
As a test, I changed the DNS servers to ones from a completely different company and... same problem.
Well, now we know why the emails are expiring: the server cannot figure out how to get to the domain in question, so the messages sit in the queue until they expire and the server generates an NDR.
So we know it’s a problem with name resolution, but have no idea why it would be failing. The next step I took was to use a network analyzer like Wireshark to see what was happening. Here is a sample capture for the domain in question that was giving me problems:
Well, well, well, here we have a DNS failure of Standard Query AAAA. So, what’s going on here? Well, it seems that the server is performing ONLY an IPv6 DNS lookup, and not doing an IPv4 lookup. Since the domain doesn’t have an IPv6 record, the query is failing.
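To see the difference between the two lookup types outside of Exchange, here is a minimal sketch using Python's standard `socket.getaddrinfo`, asking for IPv4 (A) and IPv6 (AAAA) results separately. This only illustrates the behavior; it is not how Exchange resolves names internally, and the host name shown is a hypothetical placeholder to replace with one of the MX hosts you found earlier.

```python
import socket


def resolve_family(host, family):
    """Return the addresses found for one address family, or [] on failure."""
    try:
        infos = socket.getaddrinfo(host, None, family, socket.SOCK_STREAM)
    except socket.gaierror:
        # The resolver could not return an address of this family.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def compare_lookups(host):
    """Resolve a host over IPv4 and IPv6 separately and return both results."""
    return {
        "A (IPv4)": resolve_family(host, socket.AF_INET),
        "AAAA (IPv6)": resolve_family(host, socket.AF_INET6),
    }


if __name__ == "__main__":
    # Hypothetical host name; substitute a mail server from your MX lookup.
    for record_type, addresses in compare_lookups("mail.example.com").items():
        print(record_type, addresses or "no result")
```

A host that returns addresses for the IPv4 lookup but nothing for the IPv6 one matches the situation in the capture: a sender that asks only for AAAA records never learns where to deliver.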
I reached out to Microsoft, and it seems that this is a fairly common issue with Exchange 2007 servers running on Windows Server 2008 machines, as they come with both IPv4 and IPv6 installed. Microsoft does not have an official fix for it, but does have three potential solutions for the problem.
- Place a regular SMTP server ahead of the Edge Transport server, and have the Edge Server forward all outgoing mail to this server as a smart host. (this is in my opinion the worst solution)
- Create a new Send Connector with that particular domain as its address space. Set all email to forward through this connector as a smart host, and add all MX records as potential smart hosts for this connector (this works very well, with the only potential problem being that if the issue is spread across a lot of domains, there is a lot of work for the administrator)
- On the Send Connector, check the “Use the External DNS Lookup settings on the transport server” box
Even if you do not configure specific External DNS servers on the Edge or Hub Transport servers, this will cause the server to only perform a standard IPv4 lookup, which will resolve the issue.
Hopefully this article can help you when troubleshooting email delivery issues, giving you some of the steps you can take to track down the specific issue your users are having.