Rick Mitchell Solutions - RMSBlog

With Rick Mitchell Solutions, you get the experience of over 10 years dealing with these very same problems you face every day. Large businesses that are in the Fortune 500 down to the small business with aspirations to become global can rely on us to understand and design solutions that fit your needs and your budget.

Tuesday, May 18, 2010

Dell Equallogic limitations

If you have been reading my blog, you will have noticed that I am a fan of Equallogic SANs and have been using them successfully for several months now. However, there are some things that you must know before you make your purchase to decide if it is the right storage environment for your needs. These are talking points that Equallogic's competitors should be focusing on, but Equallogic does a good job of hiding its weaknesses.

The concept of an Equallogic chassis is that it is virtualized storage, meaning that the entire chassis must contain one and only one storage pool. If you have worked with an EMC SAN before, the concept of RAID Groups does not carry over. For example, if you buy a 48 drive Equallogic SAN, then that entire chassis must be dedicated to a single RAID policy. There is no concept of dedicating certain disks to a certain volume or making some disks in the chassis RAID 10 and others RAID 5. Equallogic wants you to buy another chassis for high performance or dedicated applications, which can obviously get expensive.

I found this out the hard way over the last couple of weeks when I was troubleshooting performance issues with a Linux application. I wanted to create a RAID group like I can with an EMC array but was unable to, and was then told to just buy another chassis to dedicate spindles to a particular application. I even tried to install and configure some QLogic iSCSI host bus adapters, and they actually produced worse results than standard multipathing of NICs on this Linux server.

Equallogic is really good at bulk storage and simplicity of administration, but if you are interested in doing more with your SAN then it may be best to look elsewhere. Otherwise, you will need to buy several chassis in order for it to work like an EMC array.

Thursday, April 22, 2010

Reverse DNS for a non Class C range

One of the many spam detection techniques that ISPs and mail servers use today is checking for a valid reverse DNS entry for the sending mail server. The theory is that if the mail server has a reverse DNS entry tied to a legitimate domain, then someone actually meant to set it up as an email server and it is not some random spam-bot. This is a good theory, but legitimate mail servers sometimes have trouble getting it set up right.

Chances are that you are a small to medium size company and that you do not have an entire Class C subnet of public IPs assigned to you. You probably are running your own DNS servers across different physical locations and want to provide maximum reliability and uptime for your network. In this scenario, you have two options:

1. Your ISP is handling reverse DNS entries for your subnet and you need to provide them with what DNS names you want assigned to a particular IP address.
2. You have been delegated authority for your subnet by your ISP and you are responsible for proper reverse DNS entries on your DNS server

A lot of folks do not realize that they have option 2 and usually rely on the ISP to manage their reverse DNS entries. This can cause some support and management headaches, so I tend to try to host DNS completely in-house.

Setting up a reverse DNS zone is easy if you have a large subnet of IPs, but what if all you had was a mail server at 12.233.182.230 inside this small /27 network:

12.233.182.224/27

How would you create this zone?

It is a bit tricky, but you first need a reverse DNS zone of:

224/27.182.233.12.in-addr.arpa

This signifies that the zone is a reverse DNS zone for the 224/27 network which is not a class C. You then create an entry for 230 which would map to:

mail.somecompany.com

If you then run an nslookup on that IP address, it will be mapped to the proper host name.
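The naming pattern above follows the classless delegation convention from RFC 2317. As a rough sketch (not tied to any particular DNS server), the zone name and the owner name of the PTR record can be derived with Python's standard `ipaddress` module. The function names here are made up for illustration, and the exact label your ISP uses in its delegation may differ, so check with them before relying on it.

```python
import ipaddress


def classless_reverse_zone(network: str) -> str:
    """Build the reverse zone name for an IPv4 subnet smaller than a /24.

    Follows the "<first address>/<prefix length>" label style used in the post.
    """
    net = ipaddress.ip_network(network, strict=True)
    if net.version != 4 or net.prefixlen <= 24:
        raise ValueError("expected an IPv4 network with a prefix longer than /24")
    first, second, third, fourth = str(net.network_address).split(".")
    return f"{fourth}/{net.prefixlen}.{third}.{second}.{first}.in-addr.arpa"


def ptr_owner_name(address: str, network: str) -> str:
    """Return the full owner name of the PTR record for one host in the subnet."""
    addr = ipaddress.ip_address(address)
    if addr not in ipaddress.ip_network(network, strict=True):
        raise ValueError(f"{address} is not inside {network}")
    last_octet = str(addr).split(".")[-1]
    return f"{last_octet}.{classless_reverse_zone(network)}"


if __name__ == "__main__":
    print(classless_reverse_zone("12.233.182.224/27"))
    print(ptr_owner_name("12.233.182.230", "12.233.182.224/27"))
```

The PTR record stored at that owner name would then point to mail.somecompany.com.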

Clear as mud? Microsoft has a good article here:

Opening firewall ports in VMWare ESX

One of the things that you can do with a Dell PowerEdge server is install the OpenManage utilities on your VMWare ESX platform, which is quite helpful for managing the physical hardware that hosts your VMs. Unfortunately, the installer does not open up the ports inside of ESX automatically, so you need to know how to open these ports yourself. From a shell or via the console as root, use the following command:

esxcfg-firewall -o 1311,tcp,in,OpenManageRequest

This command will open TCP port 1311 for inbound requests on the ESX firewall. You can then go to:

https://yourvmserver:1311

And successfully connect to the OpenManage interface.
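If the page does not load, it can help to confirm from your workstation that the port really is reachable before blaming OpenManage itself. A minimal sketch using only Python's standard library follows; the function name is made up for illustration, and `yourvmserver` is the same placeholder host name used above.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, `port_is_open("yourvmserver", 1311)` should return True once the firewall rule is in place and the OpenManage web service is running.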

McAfee Antivirus fallout from yesterday

Yesterday was one of those moments in IT when you are called into action with little to no idea as to what is going on. There is no magic book or class you can take to prepare you for a debacle like what we experienced yesterday. The morning was going fine until I received a report from a user that their machine was going to shut down within 60 seconds because of a DCOM server error on their PC. I didn't think much of it until I got a second report from a different user. I checked BigFix to make sure that nothing was being pushed to the organization that might be causing these issues and found that nothing was getting rolled out. I then went to physically investigate a single PC and immediately saw a McAfee memory error. I shut down our ePolicy Orchestrator server and started trying to deal with the fallout.

The machines that were affected had no services started except for maybe 5, no internet or network access, and no access to the start menu or other GUI functions.

After some quick investigating, I found that the file c:\windows\system32\svchost.exe was 0 bytes and was modified that morning. I concluded that McAfee must have been wiping this file so I did the following:

dir /s svchost.exe

to locate another copy of the svchost.exe file on the machine. I found two located in:

c:\windows\system32\dllcache
c:\windows\servicepackfiles\i386

I copied the file over the top of the 0 byte file, started the Windows Installer service, and then uninstalled McAfee. Once rebooted, the machine worked okay. I made some notes and sent them out to our IT team to get them going. This was in place by 10:30am EST after our initial report at 9:57am EST.
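For anyone who wants to script the same search, here is a rough Python equivalent of the `dir /s` step. It is only a sketch of the logic, not what we actually ran that morning, and the function names are made up for illustration. It lists every copy of the file under a root folder and separates the wiped 0 byte copies from the ones that could be used as a replacement.

```python
import os


def find_copies(root: str, filename: str = "svchost.exe"):
    """Return (path, size_in_bytes) for every file named `filename` under root."""
    wanted = filename.lower()
    results = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower() == wanted:  # Windows file names are case-insensitive
                full_path = os.path.join(dirpath, name)
                results.append((full_path, os.path.getsize(full_path)))
    return results


def split_damaged(copies):
    """Split the copies into wiped (0 byte) files and usable (non-empty) files."""
    damaged = [path for path, size in copies if size == 0]
    usable = [path for path, size in copies if size > 0]
    return damaged, usable
```

Calling `split_damaged(find_copies(r"c:\windows"))` would show the damaged file alongside candidates such as the dllcache copy. The copy step itself is left manual on purpose.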

Compared to others, we were very fortunate. The speed at which we disabled the updates prevented a complete meltdown and limited the damage to approximately 240 machines in total. Using faxes, emails, phone calls, and some contractors, we were able to get 30+ field sites back up and running, including two corporate headquarters. No servers were affected, and the folks whose machines the update never reached were able to continue to work.

Looking back on the events of yesterday, I am pleased at how well everyone worked together and kept their cool. You don't often see that in a time of crisis but the IT department here did a fantastic job.

Obviously we will not be using McAfee anymore after yesterday and I will never recommend anyone using them in the future. How this passed any type of quality assurance testing is beyond me no matter what BS they may come up with in a PR release to try to offload some of the responsibility for this disaster. I have some meetings today to look at other AntiVirus products and am looking forward to not having to go through this again.

Sunday, March 28, 2010

Migrate DHCP service from Windows 2003 to Windows 2008

http://blogs.technet.com/networking/archive/2008/06/27/steps-to-move-a-dhcp-database-from-a-windows-server-2003-or-2008-to-another-windows-server-2008-machine.aspx

Just a quick note that this also works when going to 2008 R2. I did this recently and it worked flawlessly.

Cisco 2800 series router password recovery

http://www.ciscosystemsverified.biz/en/US/products/hw/routers/ps259/products_password_recovery09186a0080094675.shtml

If you get a router off of eBay, or your favorite local reseller lets you borrow one until you are able to purchase a new one as part of a VOIP project, then you will have to blow away the current config in order to get it into a state that you can actually use. The above article will walk you through how to do this. One thing to pay particularly close attention to: once you have set up your config, do not forget to set the config register back so that the router will reboot into the config you have set up:

config-register 0x2102

Once that is done, then you can reboot the router normally and it will boot your new config.
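The reason this step matters comes down to one bit in the register. Cisco's recovery procedure normally has you boot with the register set to 0x2142 (this post only names 0x2102, so treat the other value as background), and the difference between the two is bit 6, which tells the router to ignore the saved startup config. A tiny sketch of that arithmetic, with a made-up function name:

```python
IGNORE_STARTUP_CONFIG_BIT = 0x0040  # bit 6 of the configuration register


def ignores_startup_config(register: int) -> bool:
    """Return True if this register value makes the router skip its saved config."""
    return bool(register & IGNORE_STARTUP_CONFIG_BIT)


if __name__ == "__main__":
    for value in (0x2142, 0x2102):
        print(hex(value), ignores_startup_config(value))
```

If the register is left with bit 6 set, every reboot comes up with a blank config even though your new one is saved.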

VMWare ESX and changing IP address of iSCSI SAN

During a recent subnet change project, I ran into a very unexpected problem. I was moving to a new IP range and subnet because of some expansion going on at a data center, and had my Dell Equallogic PS6500 connected to two ESX servers to host my VMWare disk images as well as iSCSI connections inside of those VMs. I shut down the ESX servers, made the change on the SAN to the new IP address, and then started to boot the ESX boxes, at which point I noticed it was taking an extremely long time. Further investigation revealed that the box was looking for the iSCSI LUNs and could not find them, so it was stuck in the middle of a boot. This was obviously not a good thing.

I was able to get the box to boot into single user mode and log in. Once there, I attempted to remove the iSCSI connections to the old host with the following commands, but neither worked:

# esxcfg-swiscsi -d
# esxcfg-swiscsi -k

I then found the following file:

/etc/vmware/vmkiscsid/vmkiscsid.db

This file contains a list of all of the iSCSI connections that the box is using. I simply renamed this file to vmkiscsid.db.bak and then was able to start my iSCSI config from scratch. Obviously the LUNs were okay as they were on the SAN, and all I had to do was point my iSCSI connection back to the LUNs on the new IP range.

Once this was done, everything was back to normal and I was able to get the connections back to the SAN.

In hindsight, I should have disabled the iSCSI connections BEFORE I rebooted the ESX hosts so I ended up making it more difficult than it should have been.

VMWare and the CLI

If you want to access your VMWare hosts from the CLI, you are not allowed to login directly as the root account from SSH. You need to create a local user that has shell access and then SSH with that account before you do a su - to access the root account.

You must create this account by logging in via the VSphere client to the ESX hosts directly and not through VCenter first.

If you are on Windows, I would recommend PuTTY as an SSH client, and if you are on a Mac, well, just use your terminal application.

VMWare ESX hosts show their network cable unplugged

I ran into a very annoying "feature" of ESX the other day which will also affect your ESXi deployments. During a physical server migration, I did a P2V (physical to virtual) conversion of many hosts at once, and one of the servers acted like its network cable was unplugged. I had 12 NICs set up as part of my VSwitch for these machines, so it didn't make any sense to me. I ran across the following KB article from VMWare which described exactly what I was experiencing:

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1004883

It turns out that you need to add more ports to your virtual switch which is by default set at 24. No big deal, right? I then ran across this little tidbit:

"Reboot the ESX host for changes to take effect."

Do what?! I really love ESX but that is terrible. I had to reboot the entire ESX host in order for this to take effect. I really hope VMWare fixes this in the future to give you a bit more flexibility in your environment when it comes to adding network ports to your VSwitch.

Configuring SNMP on a Juniper SSG firewall

Another rant about Juniper firewalls after my experience trying to configure something as simple as SNMP on the device. I am far from a Juniper expert and I tend to use the web-interface GUI to do the majority of the configuration. Unfortunately, there is no SNMP configuration available in the GUI. I have no idea why this is the case, but trying to navigate that web interface is an exercise in frustration. In any event, here is how you can configure SNMP through the CLI:

set snmp community "readonlystring" Read-Only Trap-off version v2
set snmp host "readonlystring" 192.168.1.1 255.255.255.255 src-interface ethernet0/0
set snmp location "Company - HQ"
set snmp name "ssg5-v92.domain.us"
set snmp port listen 161
set snmp port trap 162

Cisco ASA site to site tunnel error message

One of the things that I think Cisco does a very poor job with is explaining error messages on the ASA platform when you are trying to connect a site-to-site tunnel. I am not sure why it has to be so difficult to explain simple misconfigurations. For example, here is an error message from a recent site-to-site tunnel I was building between a Cisco ASA 5520 and a Juniper firewall on the other end:

Received non-routine Notify message: No proposal chosen (14)

Obviously there is something wrong with the IPSEC proposal, but what? Would it be too difficult to say exactly what did not match?

It turned out that this message indicated a problem with perfect forward secrecy being enabled on one side of the tunnel but not the other. It took some googling and head scratching to figure out something that should have been quite simple. I did not have access to the other device to double check settings, so I had to guess at the problem. Not exactly what I would call the "self-healing" network.
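To illustrate the kind of message I would have liked to see, here is a small sketch that compares the settings from each side of a tunnel and names exactly what differs. This is purely illustrative: the dictionary keys and values are made up, and neither the ASA nor the Juniper exposes its proposals in this form.

```python
def proposal_mismatches(local: dict, remote: dict) -> dict:
    """Return {setting: (local_value, remote_value)} for every setting that differs."""
    all_settings = sorted(set(local) | set(remote))
    return {
        name: (local.get(name), remote.get(name))
        for name in all_settings
        if local.get(name) != remote.get(name)
    }


if __name__ == "__main__":
    # Hypothetical settings: PFS is enabled on one side only.
    asa_side = {"encryption": "aes-256", "hash": "sha1", "pfs_group": "group2"}
    juniper_side = {"encryption": "aes-256", "hash": "sha1", "pfs_group": None}
    for name, (ours, theirs) in proposal_mismatches(asa_side, juniper_side).items():
        print(f"{name}: local={ours!r} remote={theirs!r}")
```

Writing down both sides in a form like this and diffing them is a reasonable habit when you do have access to both ends.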

Thursday, March 11, 2010

VMWare ESX clustering

Now that I have a couple of ESX 4.0 U1 boxes deployed I started messing around with VMotion and the HA/DRS clustering configuration. VMotion is fairly slick and works without much headache IF you have the same type of processor in each of your hosts. I ran into this first hand because the second box in my cluster had a different type of processor (Intel Xeon X5570 vs Intel Xeon L5410).

Fortunately, there is a work-around for this problem located at:


This involves shutting down each of the VMs that you want to use VMotion on in the future and then changing the CPUID to be hidden from the host. This will bypass the CPU check in the future, and as long as it is supported (in my case it was) then VMotion will work. I tested it out on a few machines and it worked flawlessly across the shared storage I had set up on my SAN. It sucks that you have to shut down your VMs first to make this change, but at least it is a one time problem.

As for the cluster itself, my new ESX 4.0 box only has 4 NICs compared to the 12 I have in my first box, so the host profiles do not match. I have ordered some additional NICs in order to get them equal so that I can do more testing around DRS/HA, but for now that is going to have to wait.

Thursday, February 18, 2010

Cisco ASA and AD integration to block specific users from VPN access

Most administrators realize the need for a single, centralized point of authentication for the network, since you don't want separate credentials for every application. Active Directory provides a simple way to reuse the credentials your users already have for many purposes, including VPN access. Cisco ASAs provide a simple way to integrate with AD or any other LDAP provider, but unfortunately that integration alone gives you no way to keep out certain users or groups, and you probably do not want to give every user in your company access to your VPN. There is no simple click to turn this feature on, so you must visit the CLI. You need to map the Allow or Deny dial-in access setting in Active Directory to something that the ASA can understand. Once this is set up, you have a simple way to block VPN access to your network. Here is some sample config:

!--- The LDAP attribute map. msNPAllowDialin is mapped to cVPN3000-IETF-Radius-Class
!--- A value of FALSE is mapped to a value of NOACCESS
!--- A value of TRUE is mapped to a value of ALLOWACCESS

ldap attribute-map CISCOMAP
 map-name msNPAllowDialin cVPN3000-IETF-Radius-Class
 map-value msNPAllowDialin FALSE NOACCESS
 map-value msNPAllowDialin TRUE ALLOWACCESS


!--- AAA server configuration

aaa-server LDAPGROUP protocol ldap
aaa-server LDAPGROUP host 172.18.254.49
 ldap-base-dn dc=rtpsecurity, dc=cisco, dc=com
 ldap-scope subtree
 ldap-naming-attribute sAMAccountName
 ldap-login-password *
 ldap-login-dn CN=Administrator,CN=Users,DC=rtpsecurity,DC=cisco,DC=com
 server-type microsoft
 ldap-attribute-map CISCOMAP



!--- The NOACCESS group policy.
!--- vpn-simultaneous-logins is 0 to prevent access

group-policy NOACCESS internal
group-policy NOACCESS attributes
 vpn-simultaneous-logins 0
 vpn-tunnel-protocol IPSec webvpn
 webvpn
  svc required



!--- The ALLOWACCESS group policy

group-policy ALLOWACCESS internal
group-policy ALLOWACCESS attributes
 banner value This is the ALLOWACCESS Policy
 vpn-tunnel-protocol IPSec webvpn
 webvpn
  svc required

!--- The tunnel group that users connect to

tunnel-group TESTWEBVPN type webvpn
tunnel-group TESTWEBVPN general-attributes
 address-pool CISCOPOOL
 authentication-server-group LDAPGROUP
tunnel-group TESTWEBVPN webvpn-attributes
 group-alias TestWebVPN enable
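To make the chain of lookups in that config easier to follow, here is a small Python model of the decision it encodes. This is only a sketch of the logic, not how the ASA is implemented: the dictionary layout and function names are made up, and what happens to a user whose dial-in attribute is not set depends on the default policy on your own ASA, so the model simply returns None for that case.

```python
# Mirrors the "ldap attribute-map CISCOMAP" map-value lines.
ATTRIBUTE_MAP = {"FALSE": "NOACCESS", "TRUE": "ALLOWACCESS"}

# Mirrors the group policies; only the setting that blocks logins is modeled.
GROUP_POLICIES = {
    "NOACCESS": {"vpn_simultaneous_logins": 0},
    "ALLOWACCESS": {},  # no login limit is set in the sample config
}


def mapped_group_policy(ms_np_allow_dialin):
    """Return the group policy name the AD dial-in value maps to, or None."""
    return ATTRIBUTE_MAP.get(ms_np_allow_dialin)


def vpn_login_permitted(ms_np_allow_dialin):
    """Return True or False when the map decides, None when it does not apply."""
    policy = mapped_group_policy(ms_np_allow_dialin)
    if policy is None:
        return None  # outcome depends on the ASA's default policy (not modeled)
    limit = GROUP_POLICIES[policy].get("vpn_simultaneous_logins")
    return limit is None or limit > 0


if __name__ == "__main__":
    for value in ("TRUE", "FALSE", None):
        print(value, mapped_group_policy(value), vpn_login_permitted(value))
```

The key point is that the ASA never "denies" the user directly. It lands the user in a group policy that allows zero simultaneous logins, which has the same effect.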

Advanced Disk Based Option of Symantec BackupExec - a waste

You may have read some marketing material about the Advanced Disk Based Option (ADBO) for BackupExec that allows you to take a snapshot of a LUN and then transfer the LUN to your backup server in order for the backup to take place directly to your backup server. This is a great idea and would dramatically reduce backup times for key applications since snapshots of a LUN are done in a few seconds. Unfortunately, the implementation of this product is not practical.

The major gripe I have is that in order for this to work, every operating system that you are snapshotting from must be the exact same version as your backup server. If you are running a data center, you know that getting every server to the exact same version is next to impossible. Apparently the reasoning behind this is that it relies on VSS (volume shadow copy) as the engine to perform the actual snapshot of the data and NOT the SAN provider itself. Since the VSS engine differs between operating systems (Windows 2003 vs 2003 64 bit vs 2003 R2 vs etc. etc. etc.), it will not work across them.

I can't stress how ridiculous this is and how much of a hindrance to implementation this must be for folks. Dell has even marketed this integration in their marketing guides for Equallogic and how there is a partner relationship between Symantec and Dell. Do not let these guides fool you because this one is a show stopper that is hard to overcome.

This is true not only of version 12.5 of BackupExec but also of the latest 2010 version.

Monday, February 8, 2010

Move your BackupExec Database files location

http://seer.entsupport.symantec.com/docs/281824.htm

One of the things that BackupExec has managed to make more difficult than it needs to be is changing the physical location of its database files. You would think that changing the location of the database files inside of SQL Server would be enough, but you would be mistaken. The above mentioned article describes a registry key that you need to set to tell the BackupExec service where these files are located. I am still not quite sure why you have to do this, but if you fail to do it the files will automatically be put back in their original location.

I ran into this during a recent SAN migration where I attempted to move my BackupExec database from a local C drive over to the SAN. After I moved the database files and started up the database to confirm the move had taken place, BackupExec happily undid everything I had just done. After some head scratching I found the above article, which mentioned the magical registry key.

Prometric testing - support is a joke

I have used Prometric testing centers several times throughout my career to take IT certification tests and only had a problem one time, when the particular testing center I was using wasn't open the day I was scheduled to take the test. No big deal, and I was able to get it rescheduled without a hassle. Unfortunately, I have now run into a problem because of a medical emergency involving my mother-in-law, and I was unable to cancel or reschedule the test before the 24 hour window was up. I had to run out of town to be with my family during this time and never even thought about the test until the next morning. I attempted to call Prometric and was told that according to policy I was going to have to forfeit my testing fee of $200 for the Apple certification test I had already paid for. The lady said I had to fax a written request for reimbursement and I would have an answer within 48 hours. Frustrated, I decided to attempt that and faxed a written request stating my hardship in order to try to get the test rescheduled.

48 hours passed and I never heard from Prometric, so I decided to give them a call. Magically, they said they never got the fax and that I would now also need a doctor's excuse in order to be excused from the test. Keep in mind that all I wanted to do was simply reschedule a test I had already paid for - not cancel completely or get my money back.

I finally was able to get the doctor's excuse and faxed it today to see if I can get my money back. I went ahead and rescheduled the test and paid for it again because I was tired of waiting for them. Unfortunately they are the only game in town as far as taking tests goes, but this support has been horrible.

Friday, February 5, 2010

Dell EqualLogic SAN HeadQuarters 2.0

Dell EqualLogic SAN HeadQuarters 2.0: Providing In-Depth Information for Enhanced SAN Management

One of the knocks in my previous article about our new PS6500 SANs centered around performance monitoring. Lucky for me, a user commented on my post about SAN Headquarters 2.0, which is something I had not heard of before. I quickly downloaded it and took it for a spin - this was EXACTLY what I was looking for but could not find via the web interface to the SAN itself. Great performance data and easy access to all of your SANs across your enterprise.

I will be messing with this tool over the next few days but you can safely strike that complaint from my list. I just wish my sales rep would have told me about this tool to begin with!

Dell Equallogic PS6500 SAN's - impressions

I have had the pleasure of working with two Dell Equallogic PS6500 SANs over the past month or so and I thought it would be a good idea to put my impressions out there.

I am going to start with my complaints with the product because overall I am very happy with our purchase. However, with any product there is always room for improvement.

My biggest gripe is the performance monitoring aspect of the SAN itself. This is obviously a big deal to data administrators and probably more so in the iSCSI world where bandwidth is everything. The performance monitoring is basically watered down to the point of being too simplistic. I would like to see more raw data and fewer Java induced graphs. I realize that the target market for these SANs is businesses that do not have SAN experience on staff, but there should be some better tools for looking deeper into performance.

I have spoken about the firmware update process in the past, but I still feel this needs attention. I am not sure why the SAN itself cannot go out and grab the new firmware - then alert the administrator that new firmware is available if you want to update. It feels cumbersome to go through the manual steps of getting the firmware updated for the box.

The Auto Snapshot Manager software, which is part of Dell's Host Integration Toolkit, is a nice idea but the software feels a bit flaky to me. There are two editions that I have used - one is for Windows Applications and one is for VMWare. The Windows edition will make "application aware" snapshots of SQL Server/Exchange databases so you can actually restore from that snapshot without the worry of data corruption. The software works as expected, but after a reboot the manager sometimes cannot find the vss-control volume (the volume that the software uses to induce the volume shadow copy aware snapshot), so you have to go back into the iSCSI initiator and connect to the volume before it will work. The VMWare piece is for some reason a web-based piece of software that looks like an afterthought in appearance but does actually work. I don't like that the VMWare package cannot send email alerts for failed snapshots, but I hope that is something that will be fixed soon.

I have not tried the replication piece yet as I am waiting for the 100 megabit point to point Cogent circuit to be installed to my second data center but I am anxious to see how it works in the real world.

Overall, I am very happy with the SANs and still recommend them, but there is room for some minor improvement along the way.

VMware KB: USB devices not supported in ESX host virtual machines

I am going through a virtual server migration at one of my data centers. The idea of moving away from older, non-standard hardware and going to a virtual platform is exciting for any IT nerd, but there are some pitfalls along the way that you must take into account. One of these pitfalls is around USB devices that your servers may use today. One of the applications that we use has an old USB key that is used for license verification. Unfortunately, ESX/ESXi does not support adding USB devices to individual virtual machines. Apparently this support is in the works, but for now you have to buy a USB over IP device in order to make it work properly. Who knew?

It goes to show that when you plan on doing a large scale conversion, you need to think about everything that the server does and make sure it is supported on a virtual platform before you dig in. Support is probably a bad word since there are still many vendors out there that will not officially support their software on a virtual machine (Hello Landmark!). Of course their software will run just fine on a VM, but when you call them, do not under any circumstances tell them it is running under a VM or they will stop talking to you immediately.