
Network resiliency and self-healing networks: Case Study

Added on: 2023-02-02 06:32:29


Until recently, network resilience focused on service restoration. However, with the introduction of technologies such as VoIP and video broadcasting, a connected link alone is no longer enough. Network managers also face challenges such as minimizing downtime and keeping the network fault free. This report aims to investigate the current challenges that affect network resilience and to describe a structured approach to building a resilient, self-healing network.


The rapid growth of IP networks and their technologies has made them a critical asset for many organizations and businesses. A well-known example is the internet, which has changed the way services are delivered; over time, more and more businesses have become oriented towards using internet services. In fact, the existence of many organizations may rely solely on the availability of the internet, which has created the urge to explore network resiliency as a means of achieving high availability.

Clearly, technological advancement in hardware and software has shifted the challenge from whether IP networks can be built to how to maintain them and keep them highly available. Erdene-Ochir et al. (2012) define network resiliency as ‘the ability of a network to endure and overcome the presence of unreliable and/or compromised components’.

Another definition is given by Whitson and Ramirez-Marquez (2009): ‘Network resilience is the ability of a network to maintain acceptable level of service in the face of challenge to its normal operation, such as malicious attacks, natural disasters or human errors’.

The next generation of IP applications and systems, such as voice and video over IP, IP storage and e-commerce applications, is already in production environments, and these demand a higher level of network resiliency than ever. VoIP and IP storage applications demand resiliency that only the transport layer provides. Further, in mission-critical systems a resilient network is not an option but a must (Nagarajan & Ooghe 2008).

The objective of this report is to discuss IP network resiliency and self-healing networks. It also discusses the current need to build resilient networks, the challenges affecting network resiliency, and the requirements, features and strategies for building a resilient network.

1. IP network resiliency challenges

Understanding the causes of network failures is a critical practice in the process of building a resilient IP network. IP networks are built on open standards, which means equipment from multiple vendors is put together to operate a network; this interoperability is an issue in itself and can raise network problems to another level of complexity (Piliouras 2004). The following sections discuss traditional and non-traditional challenges.

1.1 Traditional issues

External or internal factors can affect network resiliency, and no matter how well a network is designed it remains prone to these factors; the challenge is to keep their impact to a minimum.

The first factor is hardware failure, that is, failure at the physical level such as faulty physical components, power supply failures and cut communication links (Prowse 2012). The second is software and operating system crashes, which can result from a hardware failure or a software bug (Marcus & Stern 2003). The third is network failures that occur beyond the administrative domain of the network engineers, which are generally uncontrollable. Another challenge is human error, which is usually unpredictable and can lead to any of the previously mentioned faults (Saran 2004).

A further challenge is disasters, for example floods, earthquakes, wars or extensive power outages. Natural disasters are typically beyond control (Prowse 2012).

1.2 IP, TCP and UDP characteristics that affect network resiliency

A detailed understanding of how the IP, TCP and UDP protocols function helps network engineers to manage these protocols and resolve the challenges they face. Consequently, it helps in designing resilient network infrastructures.

1.2.1 Internet Protocol (IP)

Some characteristics of IP can affect the resiliency of networks. For example, regardless of the condition of the transmission media, the IP layer will continue to transmit packets, which can result in a high error rate (Lee 2005). Therefore, careful consideration of which technology to implement is critical when designing the network. As an illustration, in certain implementations SONET interfaces are preferred over Gigabit Ethernet because they are less prone to signal attenuation (Lee 2005).

Another issue is the fragmentation and reassembly of packets. IP transmits packets over different technologies such as Ethernet and SONET, which have different maximum transmission units (MTU); thus, large packets are broken into smaller ones for transmission. However, this process loads the routers’ CPUs, which eventually affects network performance (Kent & Mogul 1995).
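As a rough illustration of the fragmentation described above, the following Python sketch computes how an IPv4 payload would be split for a given MTU. The function name `fragment_sizes` and the 20-byte header (an IPv4 header without options) are assumptions made for this example; they do not come from the cited sources.

```python
def fragment_sizes(payload_len, mtu, header_len=20):
    """Return the number of payload bytes carried by each IPv4 fragment.

    payload_len: bytes of data that follow the IP header.
    mtu: maximum transmission unit of the outgoing link, in bytes.
    header_len: IPv4 header size; 20 bytes assumes no IP options.
    """
    if payload_len <= 0:
        raise ValueError("payload_len must be positive")
    max_data = mtu - header_len
    if max_data < 8:
        raise ValueError("MTU too small to carry any fragment data")
    if payload_len <= max_data:
        return [payload_len]  # fits in one packet, no fragmentation

    # The fragment offset field counts 8-byte units, so every fragment
    # except the last must carry a multiple of 8 bytes.
    per_fragment = (max_data // 8) * 8
    sizes = []
    remaining = payload_len
    while remaining > max_data:
        sizes.append(per_fragment)
        remaining -= per_fragment
    sizes.append(remaining)  # last fragment carries whatever is left
    return sizes


if __name__ == "__main__":
    # 3980 bytes of data crossing a link with a 1500-byte MTU
    print(fragment_sizes(3980, 1500))  # [1480, 1480, 1020]
```

Every extra fragment is one more packet that a router has to build and forward, which is the processing cost the paragraph above refers to.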


1.2.2 Transmission Control Protocol (TCP)

The first implication of the nature of TCP for network resiliency is the three-way handshake. In this process, which TCP uses to establish connections, the initiating side sends a SYN request to the host; normally the host responds with a SYN-ACK, and the initiator then sends an ACK for the peer’s SYN. Attackers can exploit this process by launching a SYN flood attack, in which the host’s resources are exhausted while it maintains the pending SYN requests (Prowse 2012).
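The resource exhaustion behind a SYN flood can be sketched with a toy model of a listener's queue of half-open connections. The class `SynBacklog`, its capacity and its timeout are hypothetical values chosen for illustration; real TCP stacks are considerably more elaborate.

```python
from collections import OrderedDict


class SynBacklog:
    """Toy model of the half-open connection queue of a TCP listener."""

    def __init__(self, capacity, timeout):
        self.capacity = capacity          # maximum half-open connections
        self.timeout = timeout            # seconds before an entry is dropped
        self.half_open = OrderedDict()    # client -> time its SYN arrived

    def _expire(self, now):
        # Entries are stored in arrival order, so the oldest is first.
        while self.half_open:
            client, arrived = next(iter(self.half_open.items()))
            if now - arrived < self.timeout:
                break
            self.half_open.popitem(last=False)

    def on_syn(self, client, now):
        """Return True if the SYN is accepted (a SYN-ACK would be sent)."""
        self._expire(now)
        if client in self.half_open:
            return True                   # retransmitted SYN, already queued
        if len(self.half_open) >= self.capacity:
            return False                  # queue full: the SYN is dropped
        self.half_open[client] = now
        return True

    def on_ack(self, client):
        """Return True if the final ACK completes a pending handshake."""
        return self.half_open.pop(client, None) is not None


if __name__ == "__main__":
    queue = SynBacklog(capacity=4, timeout=30.0)
    # The attacker sends SYNs from spoofed sources and never replies.
    for i in range(4):
        queue.on_syn(f"spoofed-{i}", now=float(i))
    print(queue.on_syn("legitimate-client", now=5.0))   # False: queue is full
    print(queue.on_syn("legitimate-client", now=40.0))  # True: entries expired
```

In this model the legitimate client is refused only because the queue is occupied by handshakes that will never complete, which is the effect the attack relies on.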

Another implication is the sliding window, a flow control mechanism that TCP uses to control the rate of packet transmission. The rate is governed by a value called the window size, which is the amount of data that can be transmitted while waiting for an acknowledgment; further, the window size is adjusted dynamically, which means it changes during the life of the conversation. Attackers could exploit this feature to slow down applications, thus affecting network resiliency (Lee 2005).
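The effect of the window size on speed can be approximated with simple arithmetic: if at most one window of data can be outstanding per round trip, throughput is bounded by the window size divided by the round-trip time. The sketch below assumes a window measured in bytes and uses illustrative window and delay values; the function names are hypothetical.

```python
def max_throughput_bps(window_bytes, rtt_ms):
    """Upper bound on throughput when one window is sent per round trip."""
    if window_bytes <= 0 or rtt_ms <= 0:
        raise ValueError("window_bytes and rtt_ms must be positive")
    return window_bytes * 8 * 1000 / rtt_ms


def window_needed_bytes(link_bps, rtt_ms):
    """Window required to keep a link busy (the bandwidth-delay product)."""
    return link_bps * rtt_ms / 1000 / 8


if __name__ == "__main__":
    # Shrinking the window lowers the achievable rate on a 100 ms path.
    for window in (65_535, 8_192, 1_024):
        mbps = max_throughput_bps(window, rtt_ms=100) / 1e6
        print(f"window={window:>6} bytes -> at most {mbps:.3f} Mbit/s")
```

This is why a peer that keeps the window artificially small can slow an application down even when the link itself has spare capacity.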


1.2.3 User Datagram Protocol (UDP)

According to Lee (2005), UDP is a connectionless protocol, which means it does not provide the recovery mechanisms of TCP; these mechanisms are left to the applications. Hence, from a network resiliency perspective, UDP traffic is hard to control. For instance, in a congested network the applications cannot be slowed down, although policing could be applied. However, dropping too many packets could affect service quality.
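One common way to police traffic is a token bucket, sketched below as an assumption about how the policing mentioned above might be implemented; the source does not name a specific algorithm. The class name, rate and burst values are illustrative only.

```python
class TokenBucketPolicer:
    """Minimal token-bucket policer: packets without enough tokens are dropped."""

    def __init__(self, rate_bytes_per_s, burst_bytes):
        self.rate = rate_bytes_per_s      # long-term permitted rate
        self.burst = burst_bytes          # bucket depth (largest burst)
        self.tokens = float(burst_bytes)  # the bucket starts full
        self.last = 0.0                   # time of the previous packet

    def allow(self, packet_len, now):
        """Return True to forward the packet, False to drop it."""
        elapsed = now - self.last
        self.last = now
        # Refill in proportion to elapsed time, capped at the bucket depth.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if packet_len <= self.tokens:
            self.tokens -= packet_len
            return True
        return False


if __name__ == "__main__":
    policer = TokenBucketPolicer(rate_bytes_per_s=1000, burst_bytes=1500)
    # Four 500-byte datagrams arrive at once: the burst covers only three.
    print([policer.allow(500, now=0.0) for _ in range(4)])
    # [True, True, True, False]
```

Because a UDP sender does not react to the drops, the policer protects the network but the application simply loses data, which is the service quality risk noted above.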

2. Building a resilient network

The key to building a resilient network is not just having redundant devices or links. The design strategy should include all components of the network and follow structured approaches in all building phases (Oppenheimer 2010). For example, the design should start with the logical design of the network, taking into consideration the customers’ requirements and growth factors; the logical design should then be reflected in the physical design of the network, which specifies the hardware requirements and corresponding specifications. Further, the strategy must remain consistent; otherwise the whole network could be compromised.

2.1 Redundancy strategy

2.1.1 Logical redundancy

Logical redundancy in networks usually comprises network path redundancy and functional entity redundancy. A network path is the route that packets traverse from a source to a destination; routing protocols are usually responsible for path selection by performing calculations using routing algorithms, and the results are stored in the various routers within the network (Hutton, Schofield & Teare 2008). Therefore, the objective is to ensure that an alternative path exists to each important network resource in case of failure.
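Whether an alternative path exists can be checked on a model of the topology by failing each link in turn and testing whether the destination is still reachable. The sketch below uses a breadth-first search over a small hypothetical topology; the node names and function names are invented for the example.

```python
from collections import deque


def reachable(links, src, dst, failed=None):
    """Breadth-first search from src to dst, ignoring one failed link."""
    adjacency = {}
    for a, b in links:
        if failed is not None and {a, b} == set(failed):
            continue  # simulate the failure of this link
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen = {src}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for neighbour in adjacency.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return False


def single_points_of_failure(links, src, dst):
    """Links whose individual failure disconnects src from dst."""
    return [link for link in links
            if not reachable(links, src, dst, failed=link)]


if __name__ == "__main__":
    # Hypothetical topology: two parallel paths between R1 and R4,
    # but a single link at each edge.
    links = [("A", "R1"), ("R1", "R2"), ("R1", "R3"),
             ("R2", "R4"), ("R3", "R4"), ("R4", "B")]
    print(single_points_of_failure(links, "A", "B"))
    # [('A', 'R1'), ('R4', 'B')]
```

Any link reported by such a check has no alternative path and is a candidate for added redundancy.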

The other part of logical redundancy concerns the logical functions performed by routers. A router acts as a default gateway connecting hosts to the outside world, and if only a single gateway exists it becomes a single point of failure: when it fails, hosts can exchange traffic only with other hosts that reside in the same subnet (Hutton, Schofield & Teare 2008). Thus, it is important to include a logical gateway redundancy strategy to overcome this failure.

2.1.2 Physical redundancy

Generally, the physical components of networks are devices, communication links and hosting sites. Device redundancy is achieved by having connected redundant devices that support the same service (Hutton, Schofield & Teare 2008). Redundant communication links are very important and must correspond to the logical design. For instance, having redundant Ethernet links between core routers and switches is recommended to achieve physical redundancy (Oppenheimer 2010). A limitation of this strategy, especially in WAN link implementations, is the financial cost, which could be an obstacle to achieving the desired redundancy. However, Bisti et al. (2011) state that deploying Multiprotocol Label Switching (MPLS) technology could be the answer, as the cost of implementing it is much lower than that of traditional technologies such as leased lines and Frame Relay; further, MPLS supports fast recovery of the network in the event of failure through the Fast Reroute (FRR) function, which enables links to be recovered within milliseconds.
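The benefit of redundant devices or links can be estimated with the standard formula for components in parallel: the combined service is unavailable only when every component is down at the same time. The sketch below assumes that failures are independent, which shared power or shared ducts can violate in practice, and the availability figures are hypothetical.

```python
def parallel_availability(availabilities):
    """Availability of redundant components, assuming independent failures."""
    all_down = 1.0
    for availability in availabilities:
        all_down *= (1.0 - availability)  # probability that this one is down
    return 1.0 - all_down


def downtime_minutes_per_year(availability):
    """Expected yearly downtime for a given availability (365-day year)."""
    return (1.0 - availability) * 365 * 24 * 60


if __name__ == "__main__":
    single = 0.99                                   # one link, hypothetical
    pair = parallel_availability([single, single])  # two such links
    print(f"single link: {downtime_minutes_per_year(single):.0f} min/year")
    print(f"two links:   {downtime_minutes_per_year(pair):.1f} min/year")
```

Under these assumptions a second link of the same quality reduces expected downtime by roughly two orders of magnitude, which is the figure to weigh against its financial cost.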

2.2 Scaling strategy

Provisioning for expansion is one of the most critical activities when designing networks. When an upgrade or a change is needed, only limited parts of the network should be affected, not the entire network (Pasricha 2004).

2.2.1 Logical scalability

Logical scaling refers to the IP addressing scheme, subnet sizes, the number of networks within each subnet, the routing design and the deployment of virtualisation technology (Lee 2005). For example, when designing the IP addressing scheme, the number of required hosts and subnets must be taken into consideration, along with any future expansion plans; furthermore, the IP addressing scheme must be aligned with the routing design.
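Sizing subnets for current hosts plus expected growth can be sketched with Python's standard `ipaddress` module. The address block, the host counts and the growth factor of 2 below are hypothetical planning inputs, and `plan_subnets` is an illustrative helper rather than an established method.

```python
import ipaddress
import math


def prefix_for_hosts(hosts_now, growth_factor=2.0):
    """Smallest IPv4 prefix length that fits the hosts after growth."""
    # Add 2 for the network and broadcast addresses of the subnet.
    needed = math.ceil(hosts_now * growth_factor) + 2
    host_bits = max(2, (needed - 1).bit_length())
    return 32 - host_bits


def plan_subnets(block, host_counts, growth_factor=2.0):
    """Carve one subnet per segment out of block, largest segment first."""
    network = ipaddress.ip_network(block)
    cursor = int(network.network_address)
    plan = {}
    ordered = sorted(host_counts.items(), key=lambda kv: kv[1], reverse=True)
    for name, hosts in ordered:
        prefix = prefix_for_hosts(hosts, growth_factor)
        size = 2 ** (32 - prefix)
        cursor = (cursor + size - 1) // size * size  # align to subnet size
        subnet = ipaddress.ip_network((cursor, prefix))
        if not subnet.subnet_of(network):
            raise ValueError(f"{block} is too small for segment {name!r}")
        plan[name] = subnet
        cursor += size
    return plan


if __name__ == "__main__":
    segments = {"staff": 100, "servers": 20, "voice": 50}
    for name, subnet in plan_subnets("10.10.0.0/22", segments).items():
        print(f"{name:8} {subnet}  ({subnet.num_addresses - 2} usable hosts)")
```

Allocating contiguous, aligned subnets from one block in this way also keeps the plan easy to summarise in the routing design.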

2.2.2 Physical scalability

Physically, the devices chosen must support scaling capabilities such as extra interface port slots. In addition, the equipment chosen must have the right specifications to satisfy current and future network needs (Oppenheimer 2010). Lee (2005) describes another aspect of physical scalability, which is upgrading link speeds. For instance, when an upgrade is planned to increase the speed of a 1 Gb link in the backbone network, a decision must be made whether to add another 1 Gb link or to upgrade the existing link to an OC-48 link. The first choice is called horizontal scaling and the second is called vertical scaling; in both cases the decision depends on the resources available to perform the upgrade.

2.3 Modularity strategy

Breaking down the network into smaller manageable components is more advantageous than having one big, complex network (Teare & Paquet 2005). These components are:

  • Core: the backbone of the network, which connects the other modules together (Teare & Paquet 2005).
  • Access: the part of the network where the end users are connected (Teare & Paquet 2005).
  • Internet: the part of the network that connects the entire network to the internet (Bruno & Jordan 2011).
  • WAN: the part of the network that connects remote sites together (Bruno & Jordan 2011).
  • Datacenter: the module that uses the network to enhance server, storage and application servers (Bruno & Jordan 2011).

Because each component has its own role within the network, each module has its own characteristics and functionality. For example, IP routing functions are performed within the core module, and switching is mainly performed in the access module. This segregation of functionality helps network managers to install the right equipment for each module, rather than ending up with overpowered equipment at every level that does little (Oppenheimer 2010). Additionally, this design eases troubleshooting and isolating problems in case of failures.

2.4 Security strategy

Resilient security implementation in networks is as important as any other aspect of network resiliency discussed, because constant uptime cannot be achieved without protecting the network from security breaches (Sonderegger et al. 2009). For today’s businesses and organizations, implementing security is a compulsory part of the network design process because of the great risks involved when data availability and integrity are compromised. Network security is achieved by blocking external attackers from accessing the internal network, providing access to authorised users only, and complying with the standards and legislation that govern security, such as PCI-DSS (Bruno & Jordan 2011).

2.5 Network management

To build a fully resilient network, a network management system must be implemented to inform network managers in case of failures (Clark 2000). For example, when a communication link is disconnected the network managers or administrators must be notified; this type of proactive management is vital to sustaining network resiliency. A set of tools exists to support proactive network management, such as network baselining, capacity planning, service level agreements, and the collection of various reports and event reports from the network. According to Clark (2000), the aim of proactive network management is to sustain service level agreements and make sure that their conditions are met. To satisfy this objective, network managers must identify current network bottlenecks and congestion, and must collect statistical data to monitor and control the entire network.
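Network baselining can be illustrated with a minimal sketch: historical utilisation samples define what is normal, and new readings well above that level trigger a notification. The threshold of three standard deviations and the sample values are assumptions made for the example; production monitoring systems use richer models.

```python
import statistics


def build_baseline(samples):
    """Return (mean, standard deviation) of historical utilisation samples."""
    return statistics.mean(samples), statistics.stdev(samples)


def exceeds_baseline(value, baseline, k=3.0):
    """True if value is more than k standard deviations above the mean."""
    mean, stdev = baseline
    return value > mean + k * stdev


if __name__ == "__main__":
    # Hypothetical link utilisation (%) collected during normal operation.
    history = [40, 42, 38, 41, 39, 40, 43, 37]
    baseline = build_baseline(history)
    for reading in (45, 85):
        if exceeds_baseline(reading, baseline):
            print(f"ALERT: utilisation {reading}% is above the baseline")
        else:
            print(f"ok: utilisation {reading}% is within the baseline")
```

Comparing readings against a baseline rather than a fixed limit lets managers be notified of developing congestion before a service level agreement is breached.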


In conclusion, building a resilient IP network is no longer an easy task, owing to the emergence of new technologies and the rapid increase in business needs. These needs are not limited to maintaining communication channels for business data; current networks must also be prepared for growth and must be able to run different types of applications that require a high level of quality, availability and security.

Network managers are required to have an adequate understanding of the challenges they face and to develop structured approaches to overcome them. Network managers must also be service oriented and armed with knowledge, technology and resources to be

  • Uploaded By : Katthy Wills
  • Posted on : February 02nd, 2023