Closure Report
Background
The operating system currently used by both the Kx Application Webserver and the Database server, Windows 2008 R2, is scheduled to go out of support on 14th January 2020. There is therefore a requirement to upgrade the environments (Test and Live) to run on a fully supported platform, Windows Server 2016. This upgrade will ensure compliance with the regular security updates whilst also providing the opportunity to take advantage of newly developed functionality.
Scope
The scope of this project covers two items only: firstly, the upgrade of the operating system to Windows 2016 on both the Test and Live environments and, secondly, an evaluation of the feasibility of improving the resilience of the Application Webserver environment through load balancing. It is the intention to retain the current version of SQL Server 2014 SP2 to minimise change and the associated risk of multiple changes, especially as SQL Server 2014 has extended support to 9th July 2024 (ref. https://support.microsoft.com/en-gb/lifecycle/search/1044).
Any additional technical or software changes are deemed out of scope.
Project Summary
This project has delivered
- A new resilient and supported Database infrastructure running under the operating system Windows 2016 utilising Always on Availability Groups (AOAG)
- A new and supported Application infrastructure running under the operating system Windows 2016. In addition, the resilience of the majority of the websites has been increased by utilising two active servers via the load balancer. Of the ten web-based applications:
- Eight are being fully load balanced across two active Application Webservers, namely:
- KxBnB External
- KxBnB Internal
- Kx Calendar
- Kx Catering
- Kx Parcels
- Kx Proposals
- EIT Web Service
- Kx WebStudent
- One is being run through the load balancer as a single site due to issues communicating with WPM
- Kx Registration
- One is being run through the main Application Webserver as the application caches data on the actual Application Webserver
- Kx Inspections
- A new cluster for services has been configured to enable the services to run across the load balanced Application Webservers. This also provides greater resilience with automatic failover if one of the Application Webservers fails. The Windows-based services are:
- Kx Channel Manager
- Kx Western Union
- Kx Messages
- Kx Parcels
- Kx ReportSchedulerService
- Kx TigerService
- Kx WaitListService
- A new Technical Architecture Document (TAD) including contributions from all technical stakeholders in the project team, namely; Development Technology, Applications Management, Technology Management, KSL Technical Lead and ACE Technical team
- An updated Operations Level Agreement (OLA)
- A detailed implementation and roll-back plan for both the Application Webservers and the Database Servers
This project has also seen a more collaborative approach, with open communication between the ACE Technical Team, Supplier Technical Lead, Applications and ITI throughout the entire project (something that has not always been evident in past projects). This has greatly enhanced the multiple deliveries achieved through this project. The collaborative approach was evident in how the project overcame a number of challenges, as follows:
- A new technical lead was assigned by the supplier, replacing an experienced member of staff who was very cognizant of the UoE configuration.
- During the implementation of the WebServers
- Due to a lack of IP addresses, there was a requirement to use Server Name Indication (SNI), which enables the server to host certificates for multiple sites under a single IP address. However, this resulted in a number of unexpected application issues which had to be worked through. Additionally, as a consequence of this change,
- a code change was required to enable the MyEd related channels to render
- further IP changes were required to communicate with the EIT server
- With the introduction of the load balancers and multiple Application Servers
- there was the requirement to have health check pages created for each of the websites
- further application issues were encountered which had to be worked through resulting in the testing period being extended
- it was discovered that a number of the application-related Windows services could not run simultaneously on multiple servers and so
- due to timescales it was agreed to go live with the Windows services active on only the Primary Application Webserver
- these Windows services (as noted above) were subsequently incorporated into the new cluster for services, which was implemented after the go-live
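The health check pages mentioned above can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not the implementation used for the Kx websites; the `/health` path, the `dependencies_ok` check and the `probe` helper are all hypothetical names. The idea is that each site exposes a page that answers 200 only when the site is usable, and the load balancer polls that page to decide whether the server stays in the pool.

```python
import http.client
from http.server import BaseHTTPRequestHandler, HTTPServer


def dependencies_ok() -> bool:
    """Hypothetical placeholder: a real page might test database access here."""
    return True


class HealthHandler(BaseHTTPRequestHandler):
    """Serves a single health check page at the (assumed) path /health."""

    def do_GET(self):
        if self.path != "/health":
            status, body = 404, b"NOT FOUND"
        elif dependencies_ok():
            status, body = 200, b"OK"
        else:
            status, body = 503, b"UNAVAILABLE"
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet; a real server would log requests


def probe(host: str, port: int, path: str = "/health", timeout: float = 2.0) -> bool:
    """Return True only if the page answers 200, as a load balancer probe would."""
    try:
        conn = http.client.HTTPConnection(host, port, timeout=timeout)
        conn.request("GET", path)
        healthy = conn.getresponse().status == 200
        conn.close()
        return healthy
    except (OSError, http.client.HTTPException):
        return False  # unreachable or misbehaving servers count as unhealthy
```

A server that fails the probe would be taken out of rotation until the page answers 200 again, which is the behaviour that makes two active Application Webservers more resilient than one.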
The outcome of this project is the delivery of a fully supported and greatly enhanced resilient technical infrastructure which will support the business in the coming years.
The project manager would like to acknowledge the excellent technical input and support of the project team throughout the project with regards to, and in no particular order;
- Alister Webb - Technical Lead
- John Chan - Applications Management Lead
- Mark McGowan - Technology Management Lead
- David Morse - ITI Technical Lead
- Andrew Glass - ACE Business Lead
- Andrew Taylor - ACE Technical Lead
- Thomas Bourne - KSL Technical Lead
- Stefan Kaempf - IS Senior Supplier
In addition, thanks should also be expressed to:
- the ACE business team, who completed many iterations of testing throughout the project
- the Applications Resource Manager, Paul McNulty, for his assistance in resourcing the project from an IS perspective
Objectives
Objective | Achieved |
O1. Deliver a fully supported technical environment for the Kx Application | Yes |
O2. Improve operational resilience on the Kx Web Application across multiple servers | Mostly - The project was able to implement eight out of the ten websites (as noted above) through the load balancer. However, due to application requirements, it was not possible to implement the other two |
O3. Provide technical documentation to support the updated environments | Yes |
Deliverables
Deliverable | Priority | Achieved |
D1.1 Provide replacement Application Webservers running Windows 2016 for both the Test and Live environments | Must | Yes |
D1.2 Provide a replacement Database server running Windows 2016 for both the Test and Live environments | Must | Yes |
D1.3 Ensure that system performance mirrors the performance on the current infrastructures | Must | Yes |
D2.1 Establish a technical solution on how operational resilience on the Kx Web Application can be implemented across multiple servers | Must | Mostly - refer to Objective 2 |
D2.2 Implement operational resilience on the Kx Web Application across multiple servers on both Test and Live environments | Should | Partly |
D3.1 Update to Technical Architecture Document (TAD) | Must | Yes |
D3.2 Update the Operations Level Agreement (OLA) | Must | Yes |
Analysis of Resource Usage:
Staff Usage Estimate: 100 days
Staff Usage Actual: 99.5 days
Staff Usage Variance: -0.5%
Other Resource Estimate: £0
Other Resource Actual: £0
Other Resource Variance: £0
Breakdown by Team
Team | Estimate (days) | Actual (days) | Difference (days) | Reason for Difference |
Project Management | 32 | 34 | +2 | |
Project Governance | 3 | 1 | -2 | |
Development Technology | 45 | 45 | 0 | |
Applications Management | 10 | 8 | -2 | |
Technology Management | 10 | 11.5 | +1.5 | |
Total | 100 | 99.5 | -0.5 | |
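The totals in the breakdown can be cross-checked with a short calculation. The sketch below is illustrative only: the figures are copied from the table, and the variance formula (actual minus estimate, as a percentage of the estimate) is an assumption about how the variance is defined rather than something stated in this report.

```python
# Estimate and actual effort in days per team, copied from the breakdown table.
breakdown = {
    "Project Management": (32, 34),
    "Project Governance": (3, 1),
    "Development Technology": (45, 45),
    "Applications Management": (10, 8),
    "Technology Management": (10, 11.5),
}


def summarise(rows):
    """Return total estimate, total actual, difference and variance in percent."""
    estimate = sum(est for est, _ in rows.values())
    actual = sum(act for _, act in rows.values())
    difference = actual - estimate
    # Assumed definition: variance relative to the original estimate.
    variance_pct = 100 * difference / estimate
    return estimate, actual, difference, variance_pct


estimate, actual, difference, variance_pct = summarise(breakdown)
print(f"Estimate {estimate} days, actual {actual} days, "
      f"difference {difference:+} days ({variance_pct:+.1f}%)")
```

Under that assumption the team figures sum to an estimate of 100 days and an actual of 99.5 days, a difference of -0.5 days.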
Explanation for variance
There is little difference between the estimated and actual project resource usage. As there were a number of uncertainties regarding the build of the new application servers and the incorporation of the load balancers, the initial estimates contained a fairly high level of contingency, which was utilised mainly in the areas of Project Management and Development Technology. There was, however, a need to extend the length of the project by around three months:
- The WebServers deployment was originally scheduled for 11/01/19 but took place on 13/03/19, due to
- Initial planning phase was delayed by four weeks to incorporate formal sign-off by the Supplier Senior user
- The decision to refresh the database on Test, to ensure application data was up to date at the project outset, took longer than anticipated and impacted on initial system QA testing
- System requirements were not fully defined prior to handing over the environment to the supplier resulting in additional time to build the environments
- Unexpected technical issues associated with the build of the WebServers (as noted above)
- After the introduction of the load balancers, further application issues were encountered which had to be worked through resulting in the testing period being extended
- The Database servers deployment was originally scheduled for 22/02/19 but took place on 25/04/19, due to
- The Application Webserver activity taking longer than expected
- Investigation of post go-live issues with the Application WebServers associated with the application Kx Registrations communicating with WPM - leading to card payments not being properly recorded. Due to the severity of the problem, the decision was made to return the application to the old Application Webserver. A solution was finally determined and subsequently deployed to the live system
- During the initial testing, issues were encountered regarding database access for the back-end applications due to different connection string naming requirements
- Closure was scheduled for 15/03/19, but the project will be closed on 28/06/19 due to the above, plus:
- Resolution of a few post go-live issues associated with the scheduled overnight EIT transfer, the Kx Conferencing report and the Kx Inspections SQL Agent
- Resolution of the Kx Registrations payment issues to allow migration to the new Application WebServers on 13th June
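The slippage on each milestone can be derived from the dates quoted above. The sketch below is illustrative: it assumes the dates are in dd/mm/yy format, and the `slippage_days` helper is a hypothetical name introduced here.

```python
from datetime import date

# (milestone, originally scheduled, actual or planned), taken from the list above.
milestones = [
    ("WebServers deployment", date(2019, 1, 11), date(2019, 3, 13)),
    ("Database servers deployment", date(2019, 2, 22), date(2019, 4, 25)),
    ("Closure", date(2019, 3, 15), date(2019, 6, 28)),
]


def slippage_days(scheduled: date, actual: date) -> int:
    """Number of calendar days between the scheduled and the actual date."""
    return (actual - scheduled).days


for name, scheduled, actual in milestones:
    print(f"{name}: {slippage_days(scheduled, actual)} days later than scheduled")
```

On these dates the two deployments each slipped by roughly two months (61 and 62 days) and closure by 105 days.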
Key Learning Points
- Providing direct access to JIRA for both the ACE Technical Team and the Supplier Technical Lead greatly improved how issues were logged, investigated and subsequently resolved
- Weekly project team conference calls enabled the project team to fully contribute to all aspects of the project
- The formation of a Project Governance Team provided the platform to escalate project-related issues if required. Whilst this was never required, which I believe is a testament to the collaborative team working, the Governance Team was updated via the weekly project team meeting minutes. The Governance Team consisted of representatives from the business (Project Sponsor), from Applications (Senior Supplier) and from the business (a Senior Director)
- Whilst a change freeze to the KSL Application was initiated at the start of the project, there was a requirement to lift this temporarily to enable important functional changes (student related) to be incorporated. In such circumstances, changes must be properly managed to ensure no adverse impact
- The impact of changes to the server environment especially regarding the compatibility of Kx Web Applications with load balancing was not fully realised at the start of the project. A more detailed impact analysis conducted by KSL and UoE would have been beneficial
Recommendations
As part of the work undertaken within the project, the following observations were made, and it is recommended that they be reviewed and acted upon:
- The scope of EdLAN IP ranges, Direct Access, VPN is very wide and consideration should be given to restricting access points. In addition, it was noted that security of the application could be improved, if the application client did not directly connect to the database, but via a terminal server - logged as Unidesk call I190626-0209
- Due to the way user groups are configured on the database server, it would appear that all users of the system (250+) have, by default, full access to the backend database servers - logged under Unidesk call I190626-1187
- It was agreed that two UniDesk calls would be created with recommendations to be followed up with KSL
Outstanding Issues
- The decommissioning of the old Application and database servers - these will be completed as part of KSR