EBO Idea Exchange
True OS-level application redundancy for StruxureWare Building Operation Enterprise Server.
This is a revisit of the Redundancy of Enterprise Server idea created by MARCOS OLIVEIRA FELICIO in 2014.
Problem statement:
In mission-critical environments where customers demand 99.99%+ uptime, such as Healthcare and Correctional Services, Enterprise Server application redundancy may be necessary. Schneider Electric does not currently have an in-house application redundancy solution for StruxureWare Building Operation. Instead, we achieve redundancy by relying on third-party solutions such as Stratus everRun. These third-party applications require two instances of an Enterprise Server to be virtualized and managed from an external application running on each host machine. In the everRun solution, each ES runs as a virtual machine in a Citrix XenServer environment. everRun MX runs in conjunction with XenServer and handles server application failure by failing over to a synchronized backup server.
Alternatively, physical-only redundancy can be achieved by technologies such as VMware vSphere Fault Tolerance and High Availability.
Apart from being a third party vendor technology, each of these solutions has disadvantages, to name a few:
Disadvantages of everRun MX:
Disadvantages of VMware vSphere Fault Tolerance (FT) and High Availability (HA):
Proposed solution:
Enterprise Server OS-level application redundancy.
I have been playing with some ideas to demonstrate proof of concept, based on some old projects of mine. I thought I had better post the idea and get feedback before devoting any more spare time to this.
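There are five challenges to be addressed in such a solution: network connectivity, failover control, data synchronization, configuration and management from SBO, and software licensing.
See the diagram below; the redundancy application is referred to as gemini, for the sake of giving the project a name: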
Network connectivity
Basically, the gemini application must have control of the network interfaces. A virtual IP address is the obvious solution for ensuring all devices only connect to the primary ES. This can be emulated by creating a virtual TAP adapter and bridging it through the physical network adapter on each ES. The TAP adapter will have a permanent static IP address, used only for communication between the primary and standby servers:
In this demonstration, the virtual TAP adapter can be created and managed using the tap-windows6 package by OpenVPN. The gemini application running on both the primary and standby ES monitors the Building Operation License Server and Building Operation Enterprise Server Windows services, and the two instances send heartbeat packets to one another over the network via the TAP adapter. The primary ES physical NIC has the external ES IP address as seen by the rest of the BMS network, whereas the standby ES physical NIC has a link-local IP address. In the event of a primary server failure or a manual failover command, the primary server physical NIC drops its IP address and adopts a link-local address, and the standby ES (now primary) adopts the external ES IP address. As its database is synchronized with the former primary, all connections, events, alarms and logging resume on the new primary ES.
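The heartbeat exchange described above could be sketched roughly as follows. This is only an illustration, not the actual gemini code: the UDP transport, the `gemini-heartbeat:` message prefix, the timeout value and all class and function names are assumptions made for the example, and it is written in Python 3 syntax.

```python
import socket
import time

HEARTBEAT_TIMEOUT = 5.0             # assumed: seconds of silence before the peer is presumed down
HEARTBEAT_PREFIX = b"gemini-heartbeat:"  # hypothetical message format


class HeartbeatMonitor:
    """Remembers when the peer was last heard from."""

    def __init__(self, timeout=HEARTBEAT_TIMEOUT, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock
        self.last_seen = clock()

    def record_heartbeat(self):
        self.last_seen = self.clock()

    def peer_is_alive(self):
        # The peer counts as alive until a full timeout has elapsed in silence.
        return (self.clock() - self.last_seen) < self.timeout


def send_heartbeat(sock, peer_addr, role):
    """Send one heartbeat datagram to the peer's TAP address."""
    sock.sendto(HEARTBEAT_PREFIX + role.encode("ascii"), peer_addr)


def poll_heartbeat(sock, monitor):
    """Read one datagram; return True if it was a heartbeat from the peer."""
    try:
        data, _addr = sock.recvfrom(1024)
    except socket.timeout:
        return False
    if data.startswith(HEARTBEAT_PREFIX):
        monitor.record_heartbeat()
        return True
    return False
```

In use, each ES would bind a UDP socket to its static TAP address, call `sock.settimeout(...)`, and loop over `send_heartbeat` and `poll_heartbeat`; the standby would begin the IP takeover once `peer_is_alive()` returns `False`.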
Failover control
Gemini is just a bunch of tools and Python scripts running on each ES. The scripts run on Python 2.7 (same as tap-windows6). The application can run as a Windows service. Failover is triggered either by the loss of heartbeat packets or by the manual failover command mentioned above. When an ES is primary, the Building Operation License Server and Building Operation Enterprise Server Windows services are started. When an ES is standby, these two services are stopped.
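As a rough sketch of the role-based service control, the two services could be started or stopped with the Windows `net start` / `net stop` commands. The service names below are the display names given in the post; the start/stop ordering, the function names and the injectable `runner` are assumptions for illustration only (Python 3 syntax).

```python
import subprocess

# Display names as given in the post. The ordering (License Server first)
# is an assumption, not something confirmed for the product.
MANAGED_SERVICES = [
    "Building Operation License Server",
    "Building Operation Enterprise Server",
]


def service_commands(role):
    """Return the command lines needed to put this ES into the given role."""
    if role == "primary":
        return [["net", "start", name] for name in MANAGED_SERVICES]
    if role == "standby":
        # Stop in reverse order so the Enterprise Server goes down first.
        return [["net", "stop", name] for name in reversed(MANAGED_SERVICES)]
    raise ValueError("role must be 'primary' or 'standby', got %r" % (role,))


def apply_role(role, runner=subprocess.call):
    """Run the commands for the role; returns the list of exit codes."""
    return [runner(cmd) for cmd in service_commands(role)]
```

Passing a different `runner` makes it possible to dry-run the transition (for example `apply_role("standby", runner=print)`) before letting it touch the real services.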
Data synchronization
There are many ways to synchronize data between two remote hosts, including SMB, robocopy, BitTorrent Sync and rsync. Fortunately, with StruxureWare Enterprise Server there is no SQL Server, so we just need to worry about the contents of the db directory. License information may also need to sync, but for this demonstration we are using a demo license and won't worry about it yet. rsync is the suggested choice as it is powerful and fast, and its delta-transfer algorithm sends only the parts of files that have changed. It can also run as a service. Obviously, after a failover the file replication direction is reversed, as directed by the gemini application. To encrypt the file transfer, we can run rsync over SSH by installing OpenSSH on each ES. Why not transfer using Windows SMB? SMB is slow, vulnerable and requires the db directory to be configured as a network share.
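A minimal sketch of how gemini might build the rsync-over-SSH command, with the push always running from whichever ES is currently primary (which is what reverses the direction after a failover). The paths, host name, user name and function names are hypothetical, and the exact path style depends on which rsync port is installed on Windows (Python 3 syntax).

```python
def rsync_command(local_db_dir, peer_host, peer_db_dir, user="gemini"):
    """Build an rsync command that mirrors the local db directory to the peer over SSH."""
    source = local_db_dir.rstrip("/") + "/"      # trailing slash: copy the contents
    dest = "%s@%s:%s/" % (user, peer_host, peer_db_dir.rstrip("/"))
    return [
        "rsync",
        "--archive",     # recurse and preserve timestamps/permissions
        "--compress",    # compress data in transit
        "--delete",      # remove files on the peer that no longer exist locally
        "-e", "ssh",     # tunnel the transfer through SSH
        source,
        dest,
    ]


def sync_plan(role, local_db_dir, peer_host, peer_db_dir):
    """Only the primary pushes; the standby just receives."""
    if role == "primary":
        return rsync_command(local_db_dir, peer_host, peer_db_dir)
    return None
```

The returned list can be handed to `subprocess.call` on a schedule; because only the primary produces a command, swapping roles is enough to swap the replication direction.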
Configuration and management from SBO
As gemini is running in Python, a CherryPy web server can be configured to listen on localhost only and serve the redundancy status to SBO via an XML Web Service or the SendWebRequest script method, to be displayed on a graphic. Failover can be controlled from SBO by sending a RESTful HTTP POST request using the SendWebRequest script method, again from a pushbutton on a graphic.
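To give a feel for the status and failover endpoints, here is a small sketch that uses Python's standard-library `http.server` as a stand-in for CherryPy, so that it runs without extra packages. The `/status` and `/failover` paths, the JSON payloads, the port and the `STATE` dictionary are all made up for the example (Python 3 syntax).

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical shared state that the rest of gemini would keep up to date.
STATE = {"role": "primary", "peer_alive": True, "failover_requested": False}


class GeminiHandler(BaseHTTPRequestHandler):
    def _send_json(self, code, payload):
        body = json.dumps(payload).encode("utf-8")
        self.send_response(code)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        # Redundancy status, polled from SBO for display on a graphic.
        if self.path == "/status":
            self._send_json(200, STATE)
        else:
            self._send_json(404, {"error": "not found"})

    def do_POST(self):
        # Drain any request body before answering.
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)
        if self.path == "/failover":
            STATE["failover_requested"] = True   # picked up by the failover loop
            self._send_json(202, {"accepted": True})
        else:
            self._send_json(404, {"error": "not found"})

    def log_message(self, fmt, *args):
        pass  # keep the service quiet


def make_server(port=8080):
    # Bound to 127.0.0.1 so only the local ES can reach it.
    return HTTPServer(("127.0.0.1", port), GeminiHandler)
```

Calling `make_server().serve_forever()` would start the listener; a script in SBO could then poll `http://127.0.0.1:8080/status` and POST to `/failover` from a pushbutton.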
Software licensing
The redundant servers must each be able to operate with a local License Server. Without testing, we aren't certain if the License Server trusted storage will prevent the replicated license config from being readable by the standby ES. If all else fails, we can use a demo license in the meantime. If application redundancy ever became an officially supported SBO solution, presumably the licensing system would be updated to accommodate this.
Keen to hear your thoughts, negative and positive, as well as alternate solutions.
Originally posted 2016-04-22 19:15:00.000 UTC