Tuesday, November 11, 2014

First experience using AWS Virtual Private Cloud

For some time now, Amazon Web Services (AWS), the leading public cloud provider, has been advocating the demise of the on-premises private cloud. Its nearest alternative offering is the Virtual Private Cloud (VPC), which you can build anytime, anywhere. Using its free usage tier, I've built a base VPC with two EC2 instances (or VMs) as depicted below.

To learn AWS, we have to understand its terminology:

  1. What's an EC2 Instance? It's a Virtual Machine.
  2. What's a VPC? A virtual network in the cloud on which you can create multiple IP subnets. EC2 instances may be hosted on a VPC.
  3. What's a Security Group? Think of it as a stateful firewall in front of each instance, where you configure the network access rules, e.g. only allow HTTPS from the Internet to the public Web instance. It is associated with one or more instances.
  4. What's a Subnet? The usual IP subnet that we know. A VPC is made up of one or more subnets. You can configure which subnets are public facing and which are not. In my example, 10.10.1.0/24 is public facing and 10.10.4.0/24 hosts my internal instances.
  5. What's a Network ACL? Think of it like the usual network ACL applied to router interfaces. The ACL is stateless, so you have to define both inbound and outbound rules for a particular traffic flow. It can be used to complement the Security Group. For example, allow inbound TCP 443 to the subnet that hosts the above Web instance.
  6. What's an Elastic IP (EIP)? A public IP assigned to a public facing instance, although only a private IP is actually configured on its NIC. Think of it as a NAT address on the invisible Internet gateway.
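
To make the subnet layout concrete, here is a minimal sketch using Python's standard `ipaddress` module. The two /24 subnets come from my example; the VPC range 10.10.0.0/16 and the function names are assumptions for illustration only.

```python
import ipaddress

# Assumption: the VPC CIDR is not stated in the post; 10.10.0.0/16 is illustrative.
VPC_CIDR = ipaddress.ip_network("10.10.0.0/16")

# The two subnets from the example above.
SUBNETS = {
    "public": ipaddress.ip_network("10.10.1.0/24"),
    "private": ipaddress.ip_network("10.10.4.0/24"),
}


def validate_subnets(vpc, subnets):
    """Return True if every subnet sits inside the VPC range and none overlap."""
    nets = list(subnets.values())
    inside = all(net.subnet_of(vpc) for net in nets)
    disjoint = all(
        not a.overlaps(b)
        for i, a in enumerate(nets)
        for b in nets[i + 1:]
    )
    return inside and disjoint


def subnet_for(address, subnets):
    """Return the name of the subnet containing the address, or None."""
    ip = ipaddress.ip_address(address)
    for name, net in subnets.items():
        if ip in net:
            return name
    return None
```

For instance, `subnet_for("10.10.4.25", SUBNETS)` identifies an internal instance address as belonging to the private subnet.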

To begin with the free trial:

  1. Of course, create an AWS account using your credit card. Don't worry, AWS won't charge anything to your card as long as you stay within the free usage tier. You can enable billing alerts if you're concerned that you might exceed the free tier limit. As for me, the ultimate backstop is to make friends with the extremely friendly AWS account managers.
  2. Start with the AWS Quick Start guides, especially the RD Gateway guide.
  3. Create a VPC with 2 subnets: one public facing (e.g. for the RD Gateway used for remote admin) and one private subnet to host the internal instances.
  4. Launch new instances, choosing from a wide range of Amazon Machine Image (AMI) templates, including various Windows Server and Linux OSes.
  5. Configure the Security Group to allow inbound RDP (TCP 3389) for the initial setup of the RD Gateway instance.
  6. After the RD Gateway is successfully set up, you can tighten network security by allowing only HTTPS traffic.
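
Steps 5 and 6 amount to swapping one allow-list for another. The sketch below models that in plain Python; it does not call any AWS API, and the rule class, the admin address 203.0.113.10 and the 0.0.0.0/0 source are all hypothetical values chosen for illustration.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class InboundRule:
    protocol: str
    port: int
    source_cidr: str


def is_allowed(rules, protocol, port, source_ip):
    """A Security Group is an allow-list: traffic passes only if some rule matches."""
    ip = ipaddress.ip_address(source_ip)
    return any(
        rule.protocol == protocol
        and rule.port == port
        and ip in ipaddress.ip_network(rule.source_cidr)
        for rule in rules
    )


# Step 5: RDP open only to a (hypothetical) admin workstation during setup.
SETUP_RULES = [InboundRule("tcp", 3389, "203.0.113.10/32")]

# Step 6: once RD Gateway works, only HTTPS is allowed in.
HARDENED_RULES = [InboundRule("tcp", 443, "0.0.0.0/0")]
```

After hardening, `is_allowed(HARDENED_RULES, "tcp", 3389, "203.0.113.10")` is false, which is the whole point of step 6: RDP is then tunnelled through the gateway over HTTPS.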

So far, the usage experience on AWS is good, as though I’m working on my own private cloud. The free SDN feature provided by AWS is also almost as agile and flexible as the VMware NSX that I've recently experimented with. I’m also impressed by the AWS PowerShell support embedded in the Windows template. Most importantly, all the AWS features are well documented. The only ‘complaint’ so far is the relatively slow loading of the HTML AWS documentation (probably not hosted/cached in Singapore?)

But can AWS really replace all on-premises private cloud networks? It definitely holds promise thanks to its elasticity and flexibility. The next challenge is how fast the metered charges climb, whereas in the private cloud world metering is rarely looked at (let alone used). It's much like the debate over whether hiring a taxi daily is more economical than owning a car, the cost of which can be astronomical in Singapore.



Saturday, November 1, 2014

My latest DIY Computer

My latest DIY computer: i7 CPU, 16GB RAM, all-SSD storage, Nvidia GTX 650 GPU, Gold-class PSU and a brand new LED monitor. Windows and most apps fire up in the blink of an eye. All in for just SGD 1,600. I've realised my dream of having a home "Data Center in a Box" by enabling Hyper-V for both entertainment and R&D purposes. I bet this monster can run faster than all the 5-figure and 6-figure servers at my workplace.

As for Cisco router simulation, I'll need VMware ESXi for the CSR1000V. I'll work on an ESXi USB stick for alternate boot, using my old laptop as the vSphere client.

Friday, October 24, 2014

Trying out VMware NSX Hands-On-Lab (HOL)

Just did my first VMware NSX Hands-On-Lab on network virtualization. The task is to create a logical L2 network between 2 VMs, even though they could be separated by an underlying L3 physical network and even reside on different clusters. Finally, the lab bridges the VXLAN logical switch to VLAN 100 on the production network.

Let's understand the key NSX components:
  1. NSX Manager is the centralized network management component of NSX, and is installed as a virtual appliance. It provides an aggregated system view.
  2. NSX controller is the central control point for all logical switches within a network and maintains information of all virtual machines, hosts, logical switches, and VXLANs. The controller is running as a VM. For redundancy, a second controller can also serve as a standby VM. The controller supports two new logical switch control plane modes, Unicast and Hybrid. These modes decouple NSX from the physical network. VXLANs no longer require the physical network to support multicast.
  3. NSX Edge provides network edge security and gateway services to isolate a virtualized network and to bridge or route to physical network. You can install NSX edge either as a logical router or a services gateway.
  4. NSX vSwitch replaces the default vSphere Distributed Switch (VDS) in the hypervisor kernel on each host.

I won't go through the detailed step-by-step. Rather, I'll highlight the high-level steps to give a better understanding of the deployment scenario.

Step 0: Prepare the network and clusters. There are 3 clusters of hosts. Compute A and Compute B are the 2 clusters meant for hosting VMs. NSX vSwitch resides on all hypervisors. The other NSX components mentioned above reside on the "Management and Edge Cluster".


Step 1: Enable VXLAN Tunnel End Points (VTEPs) and VXLAN using vSphere client.

Step 2: Create a VXLAN Transport Zone spanning the 3 clusters.

Step 3: Create a logical switch and attach it to an NSX Edge. The Edge gateway has an interface of 192.168.100.1 connecting to the transport zone. Note the new L2 logical network (172.16.40.0/24) created in green.

Step 4: Add two Web VMs and their vNICs to the new logical network as shown below. Both static IP and DHCP should work fine on the VMs. Test connectivity between both VMs.

Step 5: Bridge the logical switch to the physical network via the NSX Logical Router. In this case, the VXLAN is bridged to VLAN 100 on the production network.


Saturday, October 4, 2014

Windows Azure AD with your Active Directory

I've just watched a Microsoft jump-start video on how to integrate Windows Azure AD (AAD) with your on-premises AD infrastructure. By doing so, your users get seamless authentication between public Windows Azure services (e.g. Office 365, SharePoint Online etc.) and the on-premises network. Here is the link: AD to Windows Azure AD.

In summary, there are 3 possible options:
1) No integration. Users log on to Azure and on-premises AD separately with different sets of credentials.
2) Directory sync (DirSync) only: On-premises AD user accounts and password hashes are synced to Azure. Users log on to both using the same set of credentials. There is no Single Sign-On (SSO) between AD and AAD. In other words, users have to authenticate twice, even though they use the same user IDs and passwords.
3) AD Federation (ADFS with DirSync): AD user objects (but no password hashes) are synced to Azure. A one-way federated trust is established (i.e. Azure trusts your AD). This option supports SSO and even smart card authentication.

Wednesday, September 3, 2014

Active directory or sysvol is not accessible on this domain controller or an object is missing

I saw this error message on Group Policy Management when I did a status check on the AD replication. All domain controllers were stuck with replication in progress with their respective Sysvol "inaccessible" against the PDC emulator. I couldn't find any error events on "DFS Replication" at all - the replication just got stuck in progress.

When this happens, follow the steps on How to perform an authoritative synchronization of DFSR-replicated SYSVOL.

Thursday, August 28, 2014

Trying out Lync 2013 Deployment

I did my first test deployment for Lync 2013 - a Skype-like application for Intranet. For quick step-by-step installation, I've followed this guide: How to Install Lync Server 2013 Std. Edition on Windows Server 2012

As installing Lync server requires modifying the AD forest, I've decided to make it cross-forest, i.e. Lync on a resource forest. The concept is similar to a Linked Mailbox in Exchange: a disabled user account on the resource forest maps to the actual user SID in the user forest. To do so, I've followed this guide: User Enabling in Resource Forest

If you do not have an Exchange server on the resource forest, you can simply do the following (on the resource forest):

  1. Create a new disabled user account with the same email address as the user.
  2. Copy the objectSID attribute from the user forest account to the msRTCSIP-OriginatorSID attribute of the disabled account. You can do so using the "AD Users and Computers" console by enabling "Advanced Features" on the "View" menu.

Thursday, May 15, 2014

Virtualised Domain Controllers Replication Issues

I noticed virtualised domain controllers often have issues replicating new settings in Group Policy Objects. This warning message was also observed:

Error: 9036 (Paused for backup or restore)

After reading this Technet article on backing up virtual domain controllers, I realised the cause was the snapshot backup at the Hyper-V host level. The only supported backup method is running the backup job at the guest VM level. Since then, I've stopped backing up domain controllers at the Hyper-V host level and disabled the backup integration service in the VM configuration.

Monday, May 12, 2014

WS2012 Domain Controllers stop replication after Power Outage

We had a power outage and noticed that newer Group Policy Objects (GPOs) weren't replicated across the AD. After running the dcdiag /a diagnostic command, we noticed DFSR event errors on some WS2012 Domain Controllers. After doing some research, we realised that WS2012, by default, does not resume replication automatically after an unexpected shutdown.

To re-enable it, configure this setting in the registry and restart the affected DCs.

  1. Set the HKLM\System\CurrentControlSet\Services\DFSR\Parameters\StopReplicationOnAutoRecovery registry value to a DWORD value of 0.
  2. On an elevated command prompt, run wmic /namespace:\\root\microsoftdfs path dfsrmachineconfig set StopReplicationOnAutoRecovery = FALSE
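
If there are many DCs to fix, step 1 can be scripted. The following is only a sketch: it builds the argument list for the built-in `reg add` command from the key and value named above, and the function name and the `stop_replication` parameter are my own for illustration. Mapping `True` to a DWORD of 1 is an assumption based on the usual boolean convention.

```python
# Registry location and value name as given in step 1.
DFSR_PARAMETERS_KEY = r"HKLM\System\CurrentControlSet\Services\DFSR\Parameters"
VALUE_NAME = "StopReplicationOnAutoRecovery"


def build_reg_add_command(stop_replication=False):
    """Build the argument list for `reg add` that sets the DFSR value.

    stop_replication=False writes a DWORD of 0, the setting used in step 1.
    """
    data = "1" if stop_replication else "0"
    return [
        "reg", "add", DFSR_PARAMETERS_KEY,
        "/v", VALUE_NAME,   # value name
        "/t", "REG_DWORD",  # value type
        "/d", data,         # value data
        "/f",               # overwrite without prompting
    ]


if __name__ == "__main__":
    # Print the command for review instead of executing it.
    print(" ".join(build_reg_add_command()))
```

On a DC, the list could be handed to `subprocess.run` from an elevated session; printing it first makes it easy to review before touching the registry.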

Thursday, May 8, 2014

Verify Domain Controller Certificate for Smartcard Logon

To enable user smartcard logon, all domain controllers must be enrolled with KDC-enabled certificates. The correct cert template to deploy is Domain Controller Authentication. If you enrolled the domain controllers with the wrong certs, you might encounter this error event on the domain controllers:
This event indicates an attempt was made to use smartcard logon, but the KDC is unable to use the PKINIT protocol because it is missing a suitable certificate.
To resolve it, you'll have to delete the invalid cert and request a new valid cert. To verify after enrolling domain controller certificates, run this command:
certutil -dcinfo verify
Reference: Event ID 19 — KDC Certificate Availability

Wednesday, April 9, 2014

Rebuilding WID Database for WSUS in Windows Server 2012

If you're using Windows Internal Database (WID) for WSUS in WS2012 and you think you've screwed up the configuration, you can force WSUS to rebuild its contents and database.

Steps:
  1. Remove WSUS and WID roles from server manager. Reboot server.
  2. Go to C:\Windows\WID\Data
  3. Move both "SUSDB.mdf" and "SUSDB_log.ldf" to another temp folder
  4. Re-install the WSUS server role
Found a comprehensive guide on http://prajwaldesai.com/troubleshooting-wsus-3-0-sp2-on-windows-server/
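
Step 3 is easy to script if you rebuild WSUS more than once. This is a minimal sketch, assuming the WSUS and WID roles have already been removed so the files are not locked; the function name and the backup folder are hypothetical.

```python
import shutil
from pathlib import Path

# Folder and file names as given in steps 2 and 3.
WID_DATA_DIR = Path(r"C:\Windows\WID\Data")
SUSDB_FILES = ("SUSDB.mdf", "SUSDB_log.ldf")


def move_susdb_files(data_dir, backup_dir):
    """Move the SUSDB data and log files into backup_dir.

    Returns the names of the files that were actually moved.
    """
    data_dir = Path(data_dir)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for name in SUSDB_FILES:
        source = data_dir / name
        if source.exists():
            shutil.move(str(source), str(backup_dir / name))
            moved.append(name)
    return moved
```

On the server it would be called as `move_susdb_files(WID_DATA_DIR, r"D:\Temp\SUSDB-backup")`, where the target path is just an example.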

Files and Folders Copy with NTFS ACL Preservation

To bulk copy files and folders from one place to another and to preserve ACL permissions and folder structure, an easy way is to use Robocopy.exe like this:
> ROBOCOPY [source] [target] /MIR /SEC /SECFIX 
For example, to copy from local drive source to file share destination, the command should be
> ROBOCOPY D:\Shares \\UNC\Shares /MIR /SEC /SECFIX
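
When the copy is run from a script, it helps to know that Robocopy exit codes below 8 indicate success, so a plain "non-zero means failure" check gives false alarms. The sketch below builds the same command line as above and interprets the exit code; the function names are my own.

```python
def build_robocopy_command(source, target):
    """Build the Robocopy argument list used in this post."""
    return ["ROBOCOPY", source, target, "/MIR", "/SEC", "/SECFIX"]


def robocopy_succeeded(exit_code):
    """Robocopy exit codes 0-7 mean success (possibly with files copied
    or extras detected); 8 and above mean at least one failure."""
    return 0 <= exit_code < 8


if __name__ == "__main__":
    # Print the command for review; run it via subprocess on Windows.
    print(" ".join(build_robocopy_command(r"D:\Shares", r"\\UNC\Shares")))
```

Keep in mind that /MIR mirrors the source tree, so files that exist only on the target are deleted.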

Monday, March 17, 2014

How to clear old RMS Templates on FCI

If you're using FCI to perform automatic RMS encryption and you've set up a new RMS server, you'll find both old and new RMS templates appearing in the File Management Task like this:

How to remove and clear away old RMS templates? Clear all files under
C:\ProgramData\Microsoft\DRM\Server\Templates\S-1-5-18

Wednesday, March 12, 2014

Co-existence: Pre-production and Production AD RMS

We have developers wishing to develop AD RMS applications based on AD RMS SDK 2.1. Any application developed with this SDK is considered pre-production until its application manifest is signed with certs from Microsoft (a.k.a. moving from pre-production to production).

However, pre-production applications won't work with a production AD RMS server and vice versa. Otherwise, you'll see this error: "Cannot use test manifests against production servers"

Hence, you'll have to follow this guide "How to install and configure an RMS Server" for pre-production. If there is already an existing RMS server in your AD, you have to set up this server again for pre-production. That would effectively remove the production RMS server, and Office RMS would stop working as a consequence. So, how can we make both RMS servers (one production server for Office RMS users and one pre-production server for developers) co-exist?

Our strategy is to set up a separate pre-production RMS server for the developers. Remember that RMS clients always refer to their registry settings before checking the AD SCP. Have the development PCs manually configured with the pre-production server, while the rest of the Office clients refer to the SCP in Active Directory for the production RMS server.

Assuming that you already have a production RMS server, this is the outline plan:

  1. Prepare a new Windows server for AD RMS
  2. Prepare the registry settings on the new server for pre-production setup.
  3. Unregister existing SCP using RMS administrative toolkit
  4. Install the AD RMS role on the new pre-production server
  5. On the production RMS server, change the SCP back to its original URL

Thursday, January 16, 2014

Hotfix patch needed for existing Windows 7 clients when installing new AD RMS server (WS2012) in Crypto Mode 2

The initial default crypto key length for WS2K8 R2 and Win7 is only RSA 1024. After I set up a new WS2012R2 AD RMS server in crypto mode 2 to replace the old WS2008 RMS server in crypto mode 1, the crypto key length increased from RSA 1024/SHA-1 to RSA 2048/SHA-256. I had to install this hotfix patch on my Win7 RMS clients to support the longer key length. There is also another update for Office 2010 clients.

If need be, clear the existing AD RMS client caches as well.

More details on AD RMS Cryptographic Modes.

Tuesday, January 7, 2014

Installing OpsManager Database on AlwaysOn SQL cluster

According to this Technet link, an AlwaysOn database instance is supported for System Center Operations Manager 2012/R2. You'll just need to supply the AlwaysOn Group Listener name and port number to the installation wizard. The first management server will use the Group Listener to get the primary SQL instance, and will install the databases on that instance. Subsequently, you can manually add it to an Availability Group.

In practice, this method doesn't work. After a long wait, the wizard returns an error asking you to ensure sufficient permissions. A closer look at the installation wizard log (%LOCALAPPDATA%\SCOM\LOGS\OpsMgrSetupWizard.txt) reveals that the wizard was unable to connect to the hidden drive share of the active SQL host:
[13:06:59]: Info: :Info:Creating db path: \\SQL_LIS\D$\MSCMDB\MSSQL11.MSCMDB\MSSQL\DATA\
[13:22:21]: Error: :Could not create valid path: \\SQL_LIS\D$\MSCMDB\MSSQL11.MSCMDB\MSSQL\DATA\: Threw Exception.Type: System.IO.IOException, Exception Error Code: 0x80070043, Exception.Message: The network name cannot be found.
[13:22:21]: Error: :StackTrace:   at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)
   at System.IO.Directory.InternalCreateDirectory(String fullPath, String path, Object dirSecurityObj, Boolean checkHost)
   at System.IO.Directory.InternalCreateDirectoryHelper(String path, Boolean checkHost)
   at Microsoft.EnterpriseManagement.OperationsManager.SetupCommon.SetupUtils.CreateDirectoryForDatabase(String physicalSqlServerInstance, String localPath, Boolean& createdDirectory)
[13:22:21]: Error: :Error:Could not create the directories for the specified DB Path
[13:22:21]: Always: :Database creation permission check failed for CMDB_AG_LIS\MSCMDB instance
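
To pull the relevant lines out of a long setup log, a small parser helps. This is only a sketch: the regular expression is written against the line format visible in the excerpt above, and the function names are hypothetical. It also decodes the HRESULT: 0x80070043 wraps Win32 error 67, which is the "network name cannot be found" condition reported in the message.

```python
import re

# Matches lines such as "[13:22:21]: Error: :Could not create ...".
LOG_LINE = re.compile(
    r"^\[(?P<time>\d{2}:\d{2}:\d{2})\]:\s*(?P<level>\w+):\s*:(?P<message>.*)$"
)
HRESULT = re.compile(r"0x[0-9A-Fa-f]{8}")

FACILITY_WIN32 = 7


def parse_setup_log(lines):
    """Return one dict per timestamped log line; stack-trace lines are skipped."""
    entries = []
    for line in lines:
        match = LOG_LINE.match(line.rstrip("\r\n"))
        if match:
            entries.append(match.groupdict())
    return entries


def win32_code_from_hresult(hresult_text):
    """Return the Win32 error code wrapped in an HRESULT, or None if the
    HRESULT does not belong to the Win32 facility."""
    value = int(hresult_text, 16)
    if (value >> 16) & 0x1FFF != FACILITY_WIN32:
        return None
    return value & 0xFFFF
```

Filtering the parsed entries on `level == "Error"` and running `HRESULT.search` over each message is enough to surface the failing path and its error code.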

If you try to access the hidden SMB share using the listener name, it won't connect. You'll have to supply the actual active host name to the wizard. Hence, the workable approach should be:

  1. Supply active host name to the installation wizard. Complete the installation.
  2. Ensure that the Operations Manager console can log in successfully.
  3. Rename the database server to the Group Listener name using the same procedure as "How to move Operations Manager database"
  4. Restart the OM service.
  5. Stop the primary SQL service to force a SQL cluster service move.
  6. Start the OM console to check the connectivity.