
Testing Exchange 2007 - Part 2


Areas to Test and to Consider

As outlined so far, countless areas need to be evaluated and tested with Exchange 2007.  Below I try to prioritize some of the key areas, but these will vary for each organization.

Applications

A single business-critical application that doesn't work with E2k7 can halt your transition\migration before it even starts.  Therefore, it is essential that every application that interfaces with Exchange, besides those using standard SMTP, is tested.  Below is a prioritized list of the types of applications that should be tested.

1.      Applications that use any of the discontinued features must be identified and replaced.

2.      Any application that runs on the Exchange server itself, since the x64 OS may break it.

3.      Applications that use the de-emphasized features should be identified and tested.

4.      MAPI or Outlook Object Model based applications; if they were built around Outlook 2003, they will need to be tested for Outlook 2007 support.
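
The prioritized list above can be turned into a simple triage of an application inventory.  The following is a minimal sketch only; the `App` record, its flag names, and the sample application names are hypothetical and would have to be filled in from your own inventory.

```python
from dataclasses import dataclass


@dataclass
class App:
    """One entry in a (hypothetical) inventory of Exchange-integrated apps."""
    name: str
    uses_discontinued_feature: bool = False
    runs_on_exchange_server: bool = False
    uses_deemphasized_feature: bool = False
    uses_mapi_or_oom: bool = False


def priority_of(app: App) -> int:
    """Return 1 (test first) to 4, following the order of the list above.

    Returns 5 when none of the listed risk factors apply.
    """
    if app.uses_discontinued_feature:
        return 1
    if app.runs_on_exchange_server:
        return 2
    if app.uses_deemphasized_feature:
        return 3
    if app.uses_mapi_or_oom:
        return 4
    return 5


def triage(apps):
    """Sort the inventory so the highest-risk applications come first."""
    return sorted(apps, key=lambda a: (priority_of(a), a.name))


inventory = [
    App("crm-plugin", uses_mapi_or_oom=True),
    App("fax-gateway", runs_on_exchange_server=True),
    App("workflow", uses_discontinued_feature=True),
]
for app in triage(inventory):
    print(priority_of(app), app.name)
```

An application that matches several categories is ranked by the most severe one, which matches the intent of testing the riskiest items first.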

Message Routing

For large organizations, link state routing sometimes required a bit of black magic to foresee how messages would be routed, or to work out how they should have been routed.

In E2k7 all routing is now based on AD sites.  This should make routing more predictable, but below are some key factors to consider.  For more information see this TechNet reference.

1)      AD site link costs are critical for message routing, so they must be re-evaluated.

2)      Messages are now transmitted directly from the HT server in the source AD site to an HT server in the destination AD site (the sites that contain the sending and receiving mailboxes).

a)      Delayed fan-out of message bifurcation causes a message to be delivered to the HT server in the AD site that will produce the fewest messages to other HT servers.

b)      If no HT server in the destination site can be contacted, messages are delivered to the next closest AD site that has an available HT server.  This behavior is commonly called queue at point of failure.
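
To see why site link costs matter, it can help to model the AD sites as a graph and compute the cheapest path between two sites.  The sketch below uses a standard Dijkstra shortest-path search; it is an illustration of least-cost routing in general, not a reproduction of Exchange's routing logic (tie-breaking rules, hub sites, and Exchange-specific cost overrides are not modeled), and the site names and costs are made up.

```python
import heapq


def least_cost_route(links, source, target):
    """Return (total_cost, [sites...]) for the cheapest path, or None.

    links: dict mapping (site_a, site_b) -> cost.  Site links are
    treated as bidirectional.
    """
    graph = {}
    for (a, b), cost in links.items():
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))

    heap = [(0, source, [source])]
    visited = set()
    while heap:
        cost, site, path = heapq.heappop(heap)
        if site == target:
            return cost, path
        if site in visited:
            continue
        visited.add(site)
        for neighbor, link_cost in graph.get(site, []):
            if neighbor not in visited:
                heapq.heappush(heap, (cost + link_cost, neighbor, path + [neighbor]))
    return None


# Hypothetical topology: the direct HQ-Branch link is expensive,
# so the cheapest route runs through the Hub site.
site_links = {
    ("HQ", "Branch"): 100,
    ("HQ", "Hub"): 10,
    ("Hub", "Branch"): 20,
}
print(least_cost_route(site_links, "HQ", "Branch"))
```

Running a model like this against the real site link costs before deployment is one way to spot routes that would surprise you, for example a slow WAN link that AD replication tolerates but that you would not want mail to cross.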

Disaster Recovery

There are many changes that impact DR planning in E2k7.  Microsoft has made some major improvements that should allow an organization to recover more quickly and easily from the loss of a database, a server, or even an entire site.  Data replication is now included with Exchange, something that multiple third-party vendors previously provided for Exchange 2003.  The out-of-the-box replication methods include Local Continuous Replication (LCR), Cluster Continuous Replication (CCR), and Standby Continuous Replication (SCR) [coming in SP1].  LCR, CCR, and SCR all use log shipping to replicate changes from a source database to a target database.

With LCR, any changes stored in transaction logs are copied to another location, where they are then committed to the second copy of the database.  When LCR or CCR is enabled (not sure about SCR at this point since it was not in Beta 1 of SP1) a storage group can only contain one database\store.  This should not be an issue since E2k7 now supports up to 50 storage groups and databases (50 max databases across all storage groups).  To reduce the chance of data loss and to address other factors, Microsoft reduced the transaction log file size from 5MB to 1MB in E2k7.

CCR works in a similar fashion, but the logs are replicated from the primary node to the secondary node.  CCR requires Windows Clustering, and the secondary node can be in the same physical site or a different one (Note: There are major limitations with spanning sites with E2k7 and W2k3; Windows 2008 will resolve most of those).

SCR provides similar support but doesn't require clustering; it replicates from one E2k7 server to another.  SCR supports one-one, one-many, many-many, and many-one relationships between storage groups and servers.  For example, you could have five servers with five storage groups each and use SCR to replicate the data in those twenty-five storage groups to a single server with twenty-five storage groups.  SCR looks like it is going to provide the critical support needed for most organizations to implement site-level disaster recovery without the need for third-party products.
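
The many-to-one SCR example above is simple arithmetic, but it is worth checking against the storage group limit when planning a consolidated standby server.  The sketch below assumes, as the example in the text implies, that every replicated storage group consumes one storage group on the target server; the server names are hypothetical.

```python
MAX_STORAGE_GROUPS = 50  # per-server limit quoted in the text above


def scr_target_storage_groups(source_servers):
    """Return (total_needed, fits) for a single SCR target server.

    source_servers: dict mapping source server name -> number of storage
    groups replicated to the target.  Assumes one target storage group
    per replicated source storage group.
    """
    total = sum(source_servers.values())
    return total, total <= MAX_STORAGE_GROUPS


# Five source servers with five storage groups each, as in the text.
sources = {"MBX%d" % n: 5 for n in range(1, 6)}
print(scr_target_storage_groups(sources))
```

With the five-by-five layout the target needs twenty-five storage groups, which leaves headroom under the limit; doubling the number of source servers would use it up entirely.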

In addition to the data replication support in E2k7, Microsoft has also updated Exchange to support true database portability.  Previously, if a server died or a database needed to be recovered on another physical server, the recovery server had to have the same name, domain membership, and administrative group name, and had to be set up with the /DisasterRecovery switch.  This required most organizations to keep a server standing by, adding no value.  With E2k7, databases can be copied or restored to another server, which was possible in E2k3 only in a limited fashion with Recovery Storage Groups.  The big difference is that clients can now connect to these "restored" databases after the users' settings have been updated in Active Directory and a few other steps (which can all be scripted with the EMS) have been carried out.  So instead of spending a couple of hours building a server and then using the Mailbox Recovery Wizard or ExMerge to copy data, a database can simply be mounted on an existing server and a script run to allow users to connect to it.

So in addition to the new OS (x64) and the many other changes in E2k7 that might impact existing DR plans, the new features may drastically change those plans.

Clustering and HA

As mentioned above, a new type of clustering called Cluster Continuous Replication (CCR) has been added to E2k7.  CCR uses the Majority Node Set clustering model, unlike previous versions of Exchange, which used a shared data model now called Single Copy Clusters (SCC).  With SCC the OS and storage system had to allow multiple nodes to connect to the drives or LUNs holding the Exchange and quorum data.  An SCC config could be created with a basic SCSI storage system, special cables, and SCSI controllers to support a two-node cluster.  Supporting 3+ nodes required a SAN, iSCSI, or other storage system that supports multiple connections from servers and device reservations or locking.  Both of these were fairly complex to set up and in many environments DECREASED uptime due to this complexity.  For these reasons Microsoft and experienced Exchange consultants would only recommend clustering in special circumstances.

With SCC you can't address the #2 cause of downtime (#1 being human error), which is database issues.  The most "common" problems with Exchange 2003 and earlier versions, which were still not very common, were database corruption and storage system failures.  In both cases all users on the cluster who are in an affected database may be taken offline, depending on the level of corruption or failure.  With CCR, and similarly with LCR, data can be replicated across two completely different storage systems.  CCR does this by requiring two nodes (which is also the maximum), each of which accesses its own copy of the data directly.  This data can be stored locally, on iSCSI, or on a SAN, but both nodes should NOT store data on the same iSCSI or SAN storage system; otherwise you still have a single point of failure in your storage system.

LCR provides higher availability by allowing data to be stored in two places.  Each location should be on a different storage system, with a dedicated RAID\iSCSI\HBA controller, dedicated external storage cabinet\SAN, and dedicated network\fabric for each.  This way there is no single point of failure in the storage system.  If a failure were to occur, the Exchange admin would need to change the database paths in EMC\EMS to point to the secondary location, or copy\move the secondary files to the primary location, and then remount the failed storage groups.

CCR takes this one step further by providing server redundancy and automatic failover support.  Unlike CCR, LCR does require manual or scripted intervention in the case of a database or storage system failure.  With CCR an entire server can be lost and the standby server will start servicing users after the several minutes it takes the standby node to take ownership of the cluster.  One area CCR doesn't address is individual database failure.  Unfortunately, if a single database fails in a CCR cluster, the entire cluster must be failed over to the standby node, and during the failover all users will be disconnected until the standby node has taken full ownership and mounted all stores.  Therefore, a business decision must be made in such a case: keep the users on the failed database offline while the problem is troubleshot, or take all users down for a brief period while the cluster fails over.

Similar to LCR, SCR replicates data to another location, but this location must be on a different server.  Because the data is on a different server, the databases cannot simply be mounted and accessed by users.  So, in the case of a failure, SCR requires additional manual\scripted steps to enable users to connect to the new server that is now hosting their data.  I plan on writing an entire article on this process at some point, but basically Active Directory and DNS need to be updated so Outlook clients know where to find the user's mailbox.

CCR, LCR, and SCR are major new additions and should significantly affect existing DR plans.  They are also among the key features that should help justify the deployment of Exchange 2007.

Scalability

Everyone by now should know that Exchange 2007 requires an x64 OS (Windows 2003 SP2 or Longhorn) [Note: W2k3 SP2 is required for E2k7 SP1; only W2k3 SP1 is required for E2k7 RTM], so I'm not going to go into detail on this.  The key thing this affects is caching on Exchange, which directly affects the I/O operations per second (IOPS) generated by end users.  With E2k3 only 700MB of RAM could be used for caching, but E2k7 can use GBs; the current sweet spot is about 24GB, but this might be improved with SP1.  Past 32GB of RAM the cost of 4GB DIMMs becomes prohibitive and the additional memory doesn't provide linear scalability.

So what does all this extra memory and caching allow?  Well, with E2k3 you could deploy about 3-4K medium-use Outlook users on a single server.  I have deployed over 6K on a single server, but this was for a 24x7x365 operation where only 40% of the users would ever be connected at once.  As mentioned above, the major limiting factor was the amount of memory available for caching.  Since the most commonly accessed user data could not be cached for everyone, each user would generate 0.5 - 1 IOPS.  Therefore, the storage system became the bottleneck.  With E2k7 this changes due to 64-bit memory addressing and other database changes.  The IOPS profile can be reduced by 50-70% with E2k7 and enough memory.  The decision that now must be made is how much money should be spent on memory versus the storage system.
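
The IOPS figures above lend themselves to a quick back-of-the-envelope calculation.  The sketch below simply multiplies the user count by the per-user IOPS and applies the 50-70% reduction range quoted in the text; the function and parameter names are illustrative, and a real sizing exercise would use measured user profiles rather than these rough numbers.

```python
def e2k7_iops_range(users, e2k3_iops_per_user,
                    low_reduction=0.5, high_reduction=0.7):
    """Estimate storage IOPS before and after the E2k7 reduction.

    Returns (baseline, best_case, worst_case), where baseline is the
    E2k3-style load and the two cases apply the high and low ends of
    the reduction range quoted in the text.
    """
    baseline = users * e2k3_iops_per_user
    best_case = baseline * (1 - high_reduction)
    worst_case = baseline * (1 - low_reduction)
    return baseline, best_case, worst_case


baseline, best, worst = e2k7_iops_range(4000, 1.0)
print("E2k3-style load: %.0f IOPS" % baseline)
print("E2k7 estimate:   %.0f to %.0f IOPS" % (best, worst))
```

For 4,000 users at 1 IOPS each, the storage system would have to sustain about 4,000 IOPS under the E2k3 profile, and somewhere between roughly 1,200 and 2,000 IOPS if the full reduction is achieved.  The gap between those numbers is what you are trading against the cost of the extra memory.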

What does this have to do with testing, you might ask?  Well, the obvious thing is that the lab environment must now be able to simulate, say, 5,000 users where before it only needed to simulate 2,000.  In addition, the DR and backup plans will have to be modified to support the profile of a server with this many more users, if the business decision can be made to put this many users on a single server.  LCR, CCR, and SCR should help justify putting this many "eggs in one basket" for most organizations.


Disclaimer: Your use of the information contained in these pages is at your sole risk. All information on these pages is provided "as is", without any warranty, whether express or implied, of its accuracy, completeness, fitness for a particular purpose, title or non-infringement, and none of the third-party products or information mentioned in the work are authored, recommended, supported or guaranteed by Pro Exchange. OutlookExchange.Com and Pro Exchange shall not be liable for any damages you may sustain by using this information, whether direct, indirect, special, incidental or consequential, even if it has been advised of the possibility of such damages.

© Copyright Pro Exchange, Inc., 2006