Tech Briefs: Could you tell me how this project began?
Aaron Wilson: I was not the originator of the Grid Event Signature Library. It was fairly new when I joined the lab back in 2019, but it didn't really get off the ground until around 2021. I took over in spring of 2022.
We received funding from DOE to create a library of event signatures from the power grid. It would be a go-to resource for people to look at how some of these waveforms behave on the system so that they would know what to look for on their own systems. As we've gotten more data from different partners and have been continuing to promote this over the past couple of years, it's really started to gain traction.
Tech Briefs: Let me try to get a better picture of the system. Do you have a standardized way of classifying the waveforms and labeling the associated events?
Wilson: One of the challenges we had a couple of years ago was assigning a uniform labeling scheme to all the different data types we were getting from providers. We noticed that everybody had their own way of assigning a label to something that happened. It was usually given to us in the form of a textual description of what the operator or engineer noted when they made a record of the event. So, after we combed through everything we had at the time, we decided to put together a hierarchical system that encompasses everything we have, both from that set of data and what we have from general domain knowledge about the power system. We then came up with a taxonomy that's fairly uniform across our entire set of data. So, whenever we receive new data, we have enough breadth in that taxonomy to be able to give it an appropriate slot.
Tech Briefs: Can you give me some idea of what the taxonomy is?
Wilson: It has three layers, which we call groups, classes, and subclasses. Groups are the very high-level categories, things like which phases are affected on the power system (you typically have three phases). There's a group called conditions, where you would find things such as natural disasters and weather-related events. Underneath those, in classes and subclasses, is where you start to drill down into the more specific categories that we would assign to those different events.
Tech Briefs: Could you give me an example of what kinds of categories you might have?
Wilson: For example, we might have a category called “conditions” as the group, and then we might have “weather” as the class, and then the subclass might be “lightning storm.” Or we might have a group that's just called “events,” and we might have a class underneath labelled “power quality,” and underneath that we may have a “current surge” or a “transient.”
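As a rough illustration of the three-layer scheme Wilson describes, the sketch below stores groups, classes, and subclasses as a nested mapping. It uses only the labels quoted in the interview; the names `TAXONOMY`, `is_valid_label`, and `find_paths` are hypothetical and are not taken from the library itself, which presumably holds many more categories.

```python
# Hypothetical three-level taxonomy: group -> class -> set of subclasses.
# Only labels mentioned in the interview appear here; the real scheme is larger.
TAXONOMY = {
    "conditions": {
        "weather": {"lightning storm"},
    },
    "events": {
        "power quality": {"current surge", "transient"},
    },
}


def is_valid_label(group, cls, subclass):
    """Return True if (group, class, subclass) is a slot in the taxonomy."""
    return subclass in TAXONOMY.get(group, {}).get(cls, set())


def find_paths(subclass):
    """Return every (group, class) pair under which a subclass appears."""
    return [
        (group, cls)
        for group, classes in TAXONOMY.items()
        for cls, subclasses in classes.items()
        if subclass in subclasses
    ]
```

A new record described as a lightning storm would then be checked with `is_valid_label("conditions", "weather", "lightning storm")` before being filed under that slot.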
Tech Briefs: How do you relate current surges or transients to events?
Wilson: A transient is an event in the way that we're defining it. The way that we define an event is something that occurs that is an anomaly — an abnormal behavior that you were not expecting.
So, think of how voltage and current behave on the grid. Since we have an AC system, they behave sinusoidally, and the system is designed such that those phenomena, those electromagnetic fields, are traversing as cleanly as possible, meaning there's no noise or anomalous events. An example of an anomaly could be when something happens that either causes a circuit breaker to trip or causes a section of line to disconnect. Those types of things are what we're calling events here. A transient might be something like a capacitor bank switching on or off, which causes a short surge of current that rushes into the system. If your system is not designed to handle that level of current, it can cause damage to your equipment.
Tech Briefs: What can a user do with this information?
Wilson: The user is able to create an account on the site, and then they can access what we call our dashboard. From there you can use different query criteria to find events in our library. Inside the dashboard, you can see information related to the data, such as date and time; the textual description that was provided to us; the sampling rate at which it was reported; the type of sensor that was used to record it. And there is a button to download it. There's also a feature next to each event in the records that will allow you to see a plot of it on the website prior to downloading.
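To make the idea of querying by metadata concrete, here is a minimal sketch of how records carrying the fields Wilson lists (date and time, textual description, sampling rate, sensor type) could be filtered. Everything in it is an assumption for illustration: the `EventRecord` class, the `query` function, and its parameters are hypothetical, the records are invented placeholders, and none of it reflects the dashboard's actual interface.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class EventRecord:
    """Hypothetical record holding the metadata fields named in the interview."""
    timestamp: datetime
    description: str
    sampling_rate_hz: float
    sensor_type: str


# Invented placeholder records, not real library entries.
RECORDS = [
    EventRecord(datetime(2000, 1, 1), "placeholder: arcing on feeder", 20000.0, "sensor A"),
    EventRecord(datetime(2000, 1, 2), "placeholder: capacitor bank switching", 1000.0, "sensor B"),
    EventRecord(datetime(2000, 1, 3), "placeholder: arcing near insulator", 1000.0, "sensor A"),
]


def query(records, keyword=None, min_rate_hz=None, sensor_type=None):
    """Return the records that satisfy every criterion that was supplied."""
    matches = []
    for record in records:
        if keyword is not None and keyword.lower() not in record.description.lower():
            continue
        if min_rate_hz is not None and record.sampling_rate_hz < min_rate_hz:
            continue
        if sensor_type is not None and record.sensor_type != sensor_type:
            continue
        matches.append(record)
    return matches
```

Calling `query(RECORDS, keyword="arcing", min_rate_hz=5000)` narrows the placeholder list to arcing descriptions recorded at or above 5 kHz.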
Tech Briefs: So, if I'm a utility, do I record an event that's happening on my grid and then compare it to your data bank?
Wilson: That's definitely one way. One thing that we've been talking about among our group of three labs is developing a tool we'll integrate into the library that would allow somebody to perform what could be called a reverse image search. You would receive a piece of data from the utility side, you don't know what caused it, and you want to find something in the database that matches as closely as possible. That might help you get some idea as to what could have possibly happened. That's an ongoing project.
The other thing that it could be used for is general education: to learn what some of these events look like, to train the eye of an engineer who doesn't know how some of these things work. You could go in and search by, say, the name of the type of event, like arcing: I want to see what arcing looks like to compare it to what I have.
Suppose I'm an operator, and I see this weird thing that happened in my data and I'm trying to figure out what it is or what it represents because something bad happened as a result. I can say, “I think it's arcing.” I can then go in and search “arcing” and compare what my waveform looks like to what the arcing data looks like in the database and say: “Are these similar? Oh, that looks like it's pretty related, that might be it.” Or it might help to rule that out.
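The "reverse image search" idea and the manual comparison described above can be sketched as a ranking problem: score an unknown waveform against each stored one and sort by score. The example below uses peak normalized cross-correlation as the similarity measure. That choice is an assumption made for illustration, not the method the labs are developing, and the function names, labels, and synthetic waveforms are all hypothetical.

```python
import math


def normalize(samples):
    """Remove the mean and scale to unit energy so offset and gain do not matter."""
    mean = sum(samples) / len(samples)
    centered = [s - mean for s in samples]
    energy = math.sqrt(sum(s * s for s in centered))
    return [s / energy for s in centered] if energy > 0 else centered


def similarity(query, candidate):
    """Peak normalized cross-correlation over all lags (at most 1 for a perfect match)."""
    q, c = normalize(query), normalize(candidate)
    best = -1.0
    for lag in range(-(len(q) - 1), len(c)):
        total = sum(q[i] * c[i + lag] for i in range(len(q)) if 0 <= i + lag < len(c))
        best = max(best, total)
    return best


def rank_matches(query, library):
    """Return (label, score) pairs sorted from most to least similar."""
    scores = [(label, similarity(query, wave)) for label, wave in library.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


def sine(n=64, period=32):
    """Synthetic steady-state AC waveform."""
    return [math.sin(2 * math.pi * k / period) for k in range(n)]


def sine_with_burst(n=64, start=20):
    """Synthetic sine with a decaying oscillation added, loosely mimicking a transient."""
    wave = sine(n)
    for k in range(start, n):
        m = k - start
        wave[k] += 3.0 * math.exp(-m / 4.0) * math.sin(2 * math.pi * m / 4.0)
    return wave


# Synthetic stand-ins for stored signatures; not data from the library.
LIBRARY = {
    "steady sine (synthetic)": sine(),
    "switching-like burst (synthetic)": sine_with_burst(),
}

# An "unknown" recording: the burst waveform with a different gain and DC offset.
QUERY = [2.0 * v + 0.5 for v in sine_with_burst()]
RANKED = rank_matches(QUERY, LIBRARY)
```

Normalizing before correlating means a recording with a different gain or offset can still match its stored counterpart. It does not account for the distance and circuit effects discussed later in the interview, so a high score is a hint to investigate rather than a diagnosis.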
Tech Briefs: If I'm the grid operator, once I know the kind of event it is, how do I use that information?
Wilson: Well, it depends on what the type of event is. If you think you see what we would call an incipient failure, if you've seen a pattern that might indicate future arcing, you can match it against something in the database and you're able to stop it before it actually breaks down an insulator or damages equipment. That would be quite valuable — arcing is a hard anomaly to detect.
Tech Briefs: But I still have to localize it somehow.
Wilson: Of course, but this is just a database, it's not necessarily a system that would tell you where something might occur, because every circuit is different.
Tech Briefs: When you're collecting the waveforms for your database, wouldn’t the waveform look different depending on, say, the distance between the sensor and the event?
Wilson: Absolutely. There are many factors that come into play when organizing a database such as this. If you wanted full observability into your system, you'd have to install sensors at every X number of feet. That's just not feasible economically. So, you have to make educated judgments when you're using something like this.
In the arcing example, the measurements would likely be very close to where that particular event is happening, just based on the physics of how frequencies of arcing events travel along the conductor. So, you could reasonably say that it was happening close by — I couldn't give you a number of how many feet or miles, that information is not necessarily contained here in the library. That's a pretty big research problem, actually.
Tech Briefs: Do you want to add anything?
Wilson: This is something we have been working on for a couple of years now. There is a strong sense of building out this database for use in the academic and research worlds, not just as a tool for utilities. There's a lot of AI development going on now, especially for the power grid. DOE's putting out a lot of calls for different AI and data-based applications on the grid. This is also meant to help support some of those efforts.
Tech Briefs: Are you working on upgrading the database quantity or quality?
Wilson: Yes, and yes. Those are ongoing efforts. We're always hungry for data everywhere.
Tech Briefs: Who are you hoping will provide more data?
Wilson: Well, we always want data from utilities because they typically have things that are more representative of the real world. We could simulate as much data as we want and put it in there, but at the end of the day they don't cover all of the corner cases that you might see in the real world. And so, everything we have in here right now is from real sensors that recorded data in the field, it's not simulated from a program. I don't mean to say that there's no value in simulated data, but we can't capture all anomalous behaviors in simulation without really knowing the physics of every circuit out there in the country and the world.