March 28, 2024

Energy Industry Must Take Three Coordinated Steps to Minimize Occurrence of Blackouts

by Damir Novosel, President and General Manager, T&D Consulting at KEMA
The North American blackout of August 14, 2003 has compelled the utility industry, government regulators and general public to ask the same question: “How did this happen?” While finding the root cause of a specific event like this offers some benefit in preventing a recurrence, the electric industry should take this opportunity to step back and look at the big picture of how power system design flaws, regulatory lapses and market changes created the current environment where wide-area blackouts can strike quickly and easily.

Research into the 2003 incident revealed that a series of cascading events over the course of several hours, rather than a single instantaneous problem, initiated the major disturbance phenomenon that toppled multiple power grids. The blackout itself was preceded by line tripping caused by overgrown trees in the First Energy right-of-way accompanied by failures of EMS/SCADA alarm systems, which prevented operators from diagnosing problems. A lack of communication and coordination between First Energy and Midwest ISO (MISO), and MISO and PJM Interconnection also played a role.

As the grid became overloaded and additional lines tripped due to actions of protective equipment, reactive power was consumed and voltages dropped, spreading outages throughout the interconnected power system. Sequential outages of power system equipment due to overloads, power swings, and voltage fluctuations led to a blackout of large proportions that operators simply could not respond to fast enough.

But the overall blame for August 14 can just as correctly be pinned on a succession of occurrences that took place in the industry over a period of years. First, networks were interconnected to enable support during stressed conditions. However, next came deregulation, which required massive transfers of power from remote locations across the networks in ways the grids were not designed to handle.

Deregulation also tightened operator margins, delaying investment in building new generation capacity, replacing aging equipment and maintaining rights-of-way. Utilities began cutting back on programs as basic as tree trimming and computerized system modeling. As population centers grew, power networks became congested with inadequate local reactive support. Each of these elements dramatically increased the possibility that an otherwise benign blip in the network might snowball into a major incident.

With such a precarious situation already in place, the energy industry faces a simple choice – continue applying a patchwork of individual measures to avoid future blackouts, likely with minimal success, or take a balanced approach to fixing the system as a whole by equally weighing the costs, performance impacts and risks associated with each new investment in the system.

To accomplish this, utilities should begin asking themselves how they want the power system to operate 10, 20 and 30 years from now. Within the context of this forward-looking system overhaul, they can also address specific solutions to reduce the likelihood of outages – because once the overall causes of wide-area disturbances are minimized, the smaller contributing factors are easier to handle, further diminishing the incidence of failures.

Although there is no silver bullet to completely prevent blackouts, there are three steps that can be taken by electric utilities, industry regulators and local government legislators to deploy a well-defined and coordinated strategy that will defend the power grid network from disruptions.

Audits and Analysis
The energy industry must rediscover the benefits of conducting regular analysis and audits of their networks and components to detect equipment malfunctions and system design flaws that can result in a failure. At the regulatory level, the North American Electric Reliability Council (NERC) has taken the lead in scrutinizing the 2003 blackout and simulating the chain of events to isolate the contributing factors so they can be corrected.

In the course of this study, NERC released a 14-point plan, based largely on analysis and auditing activities, to prevent incidents similar to the 2003 situation. (See Sidebar for the complete list of recommendations.) A number of audits are currently under way across the country. NERC clearly believes that utilities must make more rigorous and frequent use of widely available software tools to model the management, planning and operations of power grids.

Utilities have models, both for planning and embedded in energy management systems, to study load flows, voltage stability, angular stability and protection coordination. It is questionable, however, whether these studies are run regularly as network conditions change, whether adequate models are used, and whether the overall process is properly coordinated with neighboring systems.

Operating these models takes time and money, but they can reveal the small system problems that can trigger a network disturbance. Moreover, NERC believes the 2003 blackout uncovered the fact that many models are inaccurate. The council is calling on utilities to validate the accuracy of their models and ensure that the input parameters are up to date. Official certification of both models and data by outside contractors could be mandated in the near future.

NERC has also announced plans to implement routine audits of control area and reliability coordinators. Proposals have been made to require training of certain key control room personnel to assist them in identifying and responding to a nascent problem more quickly before it spreads. Related to this concept is a recommendation to enhance alarm filtering and improve communication between neighboring control centers so that one grid is not blind-sided by a problem growing in the adjacent network.

Although NERC’s auditing and analysis approach may be viewed as a piecemeal solution to preventing short-term problems, it dovetails with the next two steps that bring a broader and longer-term focus to the overall strategy of blackout prevention.

Corrective and Preventive Actions
Properly implemented auditing and analysis will identify a spectrum of preventive and corrective actions that individual utilities should undertake to begin fixing many of the network weaknesses that have become endemic due to years of non-existent or reactionary investment. These activities will include many, if not all, of the following:

Improve maintenance and assess condition of aging infrastructure – Numerous blackouts have been traced to lines sagging into trees in the right-of-way. Utilities must implement regular schedules to clear these areas of vegetation and other objects that can interfere with transmission or equipment access. Aging infrastructure must be serviced or replaced on a routine basis. Industry vendors have introduced equipment monitoring and diagnostic tools to identify components not performing within established parameters. Detailed modeling algorithms have also been developed to determine when a piece of equipment should be upgraded, replaced or merely repaired to extend its life. These models include complex financial analysis programs to prioritize maintenance investments based on impact upon the entire power system.

Study protection coordination – Protection designs must be reviewed regularly across regions as system conditions change. In areas particularly vulnerable to blackouts, designers must ensure that protection devices are both secure and dependable. For example, they must be designed to avoid tripping generators too early (e.g. avoid lack of coordination of volts/Hertz relays with voltage regulators and excitation limiters).

Implement special protection schemes – It is unfair to expect operators to act fast enough during a sequence of outages, and it is very difficult for an operator to decide to shed load, even if conditions are deteriorating. Implementation of wide-area special protection schemes can improve power system security and reliability. A wide-area special protection system detects abnormal network conditions and takes pre-planned corrective action to restore acceptable performance. For example, it is quite possible that disturbance propagation during the August 14 events could have been prevented by implementing under-voltage load shedding schemes. Wide-area special protection schemes should be based on preplanned, automatic corrective actions established as a result of system performance studies.
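
The under-voltage load shedding idea can be pictured as a staged rule: if voltage stays below a threshold for longer than a set delay, a block of load is dropped automatically. The Python sketch below is only an illustration of that logic, not a scheme taken from the blackout investigation. The stage thresholds, delays and shed fractions are hypothetical values chosen for the example; real settings would come from system performance studies. The sketch also ignores the voltage recovery that shedding itself would produce.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    threshold_pu: float   # voltage below this value starts the stage timer (per unit)
    delay_s: float        # voltage must stay below the threshold this long
    shed_fraction: float  # share of area load dropped when the stage operates


# Hypothetical settings, for illustration only.
STAGES = [
    Stage(threshold_pu=0.92, delay_s=5.0, shed_fraction=0.05),
    Stage(threshold_pu=0.90, delay_s=3.0, shed_fraction=0.05),
    Stage(threshold_pu=0.88, delay_s=1.0, shed_fraction=0.10),
]


def total_shed(samples, stages=STAGES):
    """Return the total load fraction shed for time-ordered (time_s, voltage_pu) samples.

    Each stage operates at most once, when the voltage has been continuously
    below its threshold for at least its delay.
    """
    below_since = [None] * len(stages)
    operated = [False] * len(stages)
    for t, v in samples:
        for i, stage in enumerate(stages):
            if operated[i]:
                continue
            if v < stage.threshold_pu:
                if below_since[i] is None:
                    below_since[i] = t
                if t - below_since[i] >= stage.delay_s:
                    operated[i] = True
            else:
                below_since[i] = None  # voltage recovered, so reset the timer
    return sum(s.shed_fraction for s, done in zip(stages, operated) if done)


# A sustained sag to 0.91 per unit, sampled once per second for six seconds.
sag = [(float(t), 0.91) for t in range(7)]
print(f"Load shed: {total_shed(sag):.0%}")
```

The time delays are what keep such a scheme secure: a brief dip that recovers before the delay expires resets the timer and sheds nothing.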

Implement adaptive protection – Adaptive protection offers multiple setting groups that adjust automatically to changes in the system. Microprocessor relays now support the use of adaptive protection schemes to provide better response of the system under stressed conditions.

Test protection applications and relays – Not only individual relays but also protection applications should be tested to prevent malfunctions and identify design flaws. Special protection schemes must be tested along with those in neighboring areas to ensure coordinated performance.

Study voltage and transient stability with appropriate tools and models – Comprehensive studies of these and other system disturbance conditions should occur regularly, especially to detect problems that evolve over time. Models are available but they should be certified as the right ones for the job. For example, voltage stability should be studied using time domain simulation tools rather than steady-state programs.

Improve monitoring, diagnostics and control center performance – Telecommunications and data handling capabilities make it possible to improve SCADA and EMS functionality so that they can filter, display and analyze only critical information. And the availability of these critical functionalities must be boosted to 99.99 percent. Alarms must also be enhanced to feed only the crucial failures to operators. In addition, faster and more accurate state estimators must be developed using modern fast transducers to provide time-synchronized measurements from across the grid.
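
To put the 99.99 percent figure in perspective, the short Python calculation below converts an availability target into allowable downtime per year. It assumes a 365-day year and treats all downtime as equivalent, which is a simplification.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def allowed_downtime_minutes(availability_percent):
    """Minutes per year a function may be unavailable at the given availability."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * HOURS_PER_YEAR * 60


for target in (99.0, 99.9, 99.99):
    print(f"{target}% -> {allowed_downtime_minutes(target):.1f} minutes per year")
```

At 99.99 percent, a critical function may be out of service for only about 53 minutes in a whole year, compared with nearly nine hours at 99.9 percent.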

Include advanced algorithms and calculation programs in SCADA – Faster-than-real-time simulations could assist the operator in calculating power transfer margins based on a variety of contingencies.
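
One simple way to picture a contingency-based transfer margin: for each studied contingency, a simulation yields the maximum transfer the system could sustain, and the margin is the gap between the present transfer and the most restrictive of those limits. The Python sketch below assumes the per-contingency limits have already been computed by a separate study tool. The contingency names and MW values are hypothetical.

```python
def transfer_margin(current_transfer_mw, limits_by_contingency):
    """Return (margin_mw, binding_contingency).

    limits_by_contingency maps a contingency name to the maximum transfer (MW)
    the system can sustain if that contingency occurs.
    """
    if not limits_by_contingency:
        raise ValueError("at least one contingency limit is required")
    binding = min(limits_by_contingency, key=limits_by_contingency.get)
    return limits_by_contingency[binding] - current_transfer_mw, binding


# Hypothetical study results.
limits = {
    "base case": 2400.0,
    "loss of line A": 2100.0,
    "loss of unit B": 1950.0,
}
margin, binding = transfer_margin(1800.0, limits)
print(f"Margin: {margin:.0f} MW, limited by {binding}")
```

A negative margin would mean the present transfer already exceeds what the system could sustain under the binding contingency.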

Require certification training – Control center operators must undergo more rigorous and routine training that includes coordinated simulations involving multiple centers in interconnected systems.

Introduce cyber security to control systems – Even in the absence of regulatory pressure or a major utility network incursion by a worm, virus or perpetrator, the cyber security of control systems must be strengthened. This means that security must be built into the development of new control systems. Tighter security policies pertaining to control room personnel must also be enacted.

Establish real-time operating limits on a daily basis – More exact line overloading limits must be determined by using monitoring and protection equipment based on dynamic line ratings. These should be calculated depending on ambient temperatures, wind, pre-contingency loading and other factors.
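
The dependence of a line rating on weather can be illustrated with a simplified steady-state heat balance: the heat produced by current plus solar gain must equal the heat the conductor loses to the surrounding air at its maximum allowed temperature. The Python sketch below is a rough illustration only. The cooling coefficients, conductor resistance, solar gain and temperature limit are hypothetical placeholder values, not the formulas of any rating standard, and the sketch ignores pre-contingency loading and the conductor's thermal inertia.

```python
import math


def line_rating_amps(ambient_c, wind_m_s, solar_w_per_m=15.0,
                     max_conductor_c=100.0, resistance_ohm_per_m=8.0e-5):
    """Approximate ampacity from a simplified steady-state heat balance.

    Heat in (I^2 * R plus solar gain) equals heat out (cooling to ambient air).
    The cooling term is a made-up linear fit used only for illustration.
    """
    cooling_w_per_m_per_c = 0.5 + 0.4 * wind_m_s  # hypothetical coefficients
    heat_out = cooling_w_per_m_per_c * (max_conductor_c - ambient_c)
    net = heat_out - solar_w_per_m
    if net <= 0:
        return 0.0  # no thermal headroom under these assumptions
    return math.sqrt(net / resistance_ohm_per_m)


hot_still_day = line_rating_amps(ambient_c=40.0, wind_m_s=0.6)
cool_windy_day = line_rating_amps(ambient_c=10.0, wind_m_s=3.0)
print(f"Hot, still day:  {hot_still_day:.0f} A")
print(f"Cool, windy day: {cool_windy_day:.0f} A")
```

Even with placeholder numbers, the shape of the result matches the point of the text: the same line can safely carry considerably more current on a cool, windy day than on a hot, still one, so a single static limit is either too conservative or too optimistic much of the time.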

Long-Term Policy Changes and Investments
Fallout from the 2003 blackout has renewed calls for tighter regulatory policies and controls, which will have potentially long-term impacts on the industry when enacted. The first order of business is for state and federal regulators to resolve lingering uncertainties over accountability and jurisdiction. For example, a compromise must be reached regarding eminent domain authority over the siting of transmission lines and FERC jurisdiction over publicly owned transmission. Currently, ISOs are accountable for reliability and security, while transmission asset owners maintain the physical system.

Regulatory actions must be aimed toward ensuring compliance with existing standards and coordinating blackout prevention and response among control areas. These policy activities should also enable efficient system planning, permitting and market operations. Conservation should be encouraged wherever appropriate, and policies should facilitate establishing new sources of generation closer to the loads.

Complementing these regulatory and policy changes must be focused investment on the part of energy utilities. The following investments will improve overall system performance long into the future and minimize blackout factors in the process:

Strengthen the power network – The transmission grid must be expanded and upgraded to handle increased power flows. This can be accomplished by installing extra transmission lines and cables and properly applying distributed generation for situations when remote resources are rendered ineffective by system conditions. Reactive power requirements must also be satisfied by implementing additional shunt capacitor banks and SVCs. Finally, the planned installation of reactive resources in distribution networks can perform conservation voltage reduction, which can shed soft loads in emergencies.

Improve transmission power flow control – Utilities should install high-voltage power electronics devices, FACTS and HVDC links to allow for faster and more precise switching. This will enhance overall system control and increase the level of power transfer that can be accommodated by the existing grid.

Design more robust power systems – Perhaps led by government research, utilities should examine new energy storage and power delivery technologies, including superconductivity and micro-grids, to build power systems that are less susceptible to blackouts.

Implement wide-area monitoring, control and protection – The ability of computer relays to communicate remotely with the control center and each other to monitor quickly developing disturbances can change the philosophy of system-wide protection and control. Adaptive system-wide protection schemes are improvements on event- or alarm-based special protection schemes. The input data to the decision-making logic is taken from continuously monitored data stored in a database. A low-speed communication interface for SCADA communication and operator interface should also be available as an enhancement for the SCADA state estimator. Actions ordered from SCADA/EMS functions, such as optimal power flow, emergency load control, etc., could be activated via the system protection terminal. The power system operator should also have access to the terminal, for supervision, maintenance, update, parameter setting, change of setting groups, disturbance recorder data collection, etc.

System protection terminals can utilize GPS-stamped synchronized phasor measurements for protection applications and contain decision-making logic to derive appropriate output control signals, such as circuit-breaker trips, AVR boosting and tap-changer action. Groups of these terminals can be integrated into local protection centers, and system protection centers (SPCs) can coordinate several local protection centers to achieve a multi-layered protection system that prevents disturbance cascading.
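
As a small illustration of decision logic built on synchronized phasor measurements, the Python sketch below compares voltage phasors taken at the same GPS timestamp at two buses and flags when their angle separation exceeds a limit. The 30-degree limit and the bus values are hypothetical, and an actual system protection terminal would weigh many more inputs before issuing a control signal.

```python
import cmath
import math


def angle_difference_deg(phasor_a, phasor_b):
    """Angle of phasor_a relative to phasor_b, in degrees between -180 and 180."""
    # Multiplying by the conjugate subtracts the angles and handles wrap-around.
    return math.degrees(cmath.phase(phasor_a * phasor_b.conjugate()))


def angle_alarm(phasor_a, phasor_b, limit_deg=30.0):
    """True when the two buses are separated by more than limit_deg."""
    return abs(angle_difference_deg(phasor_a, phasor_b)) > limit_deg


# Hypothetical per-unit voltage phasors sharing one GPS timestamp.
bus_a = cmath.rect(1.02, math.radians(10.0))
bus_b = cmath.rect(0.97, math.radians(-25.0))

separation = angle_difference_deg(bus_a, bus_b)
print(f"Angle separation: {separation:.1f} degrees, alarm: {angle_alarm(bus_a, bus_b)}")
```

The comparison is meaningful only because both measurements carry the same time stamp. Without synchronization, the angle between two remote buses cannot be measured directly.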

No Silver Bullet
Each of the above-described prevention techniques, both short- and long-term, plays an incremental role in diminishing the current environment that fosters and propagates disturbance events. When cost, performance and risk are balanced properly, these steps can have a dramatic cumulative impact on reducing the likelihood of blackouts.

Third-party vendors have developed software tools that can assist utilities in balancing these objectives and devising a comprehensive set of priorities to guide the investment strategies. Independent audit programs can also assist utilities in determining where their greatest vulnerabilities reside, which further helps power companies decide where their investment dollars are best spent.

However, for every unlikely outage contingency that is protected against, there will be others, even less probable and possibly more devastating. Blackouts will occur, which means the energy industry must put as much effort into preparing for them as preventing them. Therefore, interconnected grid operators must develop coordinated restoration procedures to bring networks back on line as soon as possible.

Reliable and efficient restoration software within the EMS/SCADA system can assist operators in executing optimal procedures quickly. And regular staff training and simulation enable personnel to take appropriate steps immediately following a disruption. Although automated power restoration technology has not been widely deployed, recent advancements in communications and measurement techniques make this a more viable option. This solution is worth considering so that loads can be shed automatically to avoid exacerbating the event, thus returning power to customers more quickly.

About the Author
Damir Novosel is the President and General Manager of T&D Consulting at KEMA, with North American headquarters in Burlington, Mass.
He may be reached at DNovosel@kema.com