Factor | Description |
---|---|
Life | Of people, paying attention to those who need medical devices and critical services. |
Living | Situational, major social factors such as elections or concerts. |
Nature | Vegetation around the area, particularly the dry plants and their fuel level. |
Energy | Power grid data, such as the location and type of a conductor. |
Service | Last time SDG&E serviced an area for maintenance. |
Season | Environmental data such as wind gust and rainfall. |

Dataset | Features |
---|---|
Weather | Forecasted & observed data from SDG&E website. |
Vegetation | Tree density, VRI (Vegetation Resource Inventory). |
Geographic | Elevation, HFTD (High Fire Threat District). |
Conductor | Material type, age, wire risk, historical maintenance. |
Living | Population density and medically vulnerable customers. |

We implemented several types of machine learning models in Python, with the help of data science packages such as pandas, NumPy, and SciPy.
We have four models in total, two fewer than the six factors proposed in ILLNESS, because some of the factors are similar enough to be combined into one model. Each model uses a different machine learning algorithm, chosen to suit its data type. Each model takes its corresponding dataset as input and learns its parameters according to that algorithm. The trained model then predicts wildfire risk on a test dataset and outputs a score, which we collect for the final composite ILLNESS model.
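The train, predict, and collect workflow described above can be illustrated with a minimal sketch. It assumes a single made-up feature and a hand-written least-squares fit, and rescales the predictions to a 1-10 range; the data, the helper names `fit_line` and `rescale`, and the single-feature setup are all hypothetical and are not the project's actual code.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def rescale(values, lo=1.0, hi=10.0):
    """Min-max rescale raw predictions onto the interval [lo, hi]."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # All predictions identical: place them at the midpoint.
        return [(lo + hi) / 2 for _ in values]
    return [lo + (v - v_min) * (hi - lo) / (v_max - v_min) for v in values]


# Made-up data: one vegetation feature and a risk label (illustration only).
train_x = [0.0, 1.0, 2.0, 3.0]
train_y = [1.0, 3.0, 5.0, 7.0]
test_x = [0.5, 1.5, 2.5]

# 1) Learn parameters from the training split.
slope, intercept = fit_line(train_x, train_y)
# 2) Predict on the held-out test split.
raw_scores = [slope * x + intercept for x in test_x]
# 3) Rescale to an index and hand it to the composite model.
nature_index = rescale(raw_scores)
```

Each of the four models would follow the same three steps with its own algorithm and features; only the score it hands over changes.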
Model | Features | Output |
---|---|---|
Weather Model - MLP | Temperature, dryness, windspeed | A weather risk score, reflecting how weather conditions contribute to potential wildfire ignition or spread. |
Nature Model - Linear Regression | Latitude, longitude, VRI, strike trees, elevation. | A weighted Nature Index (scaled 1-10). |
Energy/Service Model - Random Forest | Upstream HFTD, Days since work order (upstream/downstream), miles. | R²: 0.537, showing moderate predictive power. |
Life Model - Custom Weighted Function | Population density, number of customers served, presence of critical facilities. | A custom score out of 100 that takes into account critical customers such as essential service and customers on life support for each region of SDG&E's territory. |
The "Features" column lists the most important features used in our intermediate models. They show what each model looks at when assessing wildfire and/or PSPS risk.
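To make the idea of a custom weighted function more concrete, here is a minimal sketch of how a Life score out of 100 could be computed from the kinds of features listed in the table. The weights, the saturation caps, the `Region` fields, and the example regions are all assumptions made for illustration; the project's actual function is described in the report.

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    population_density: float    # people per square mile
    customers_served: int
    critical_facilities: int     # e.g. hospitals, fire stations
    life_support_customers: int


# Hypothetical weights (they sum to 1) and caps; NOT the project's values.
WEIGHTS = {
    "population_density": 0.2,
    "customers_served": 0.2,
    "critical_facilities": 0.3,
    "life_support_customers": 0.3,
}
CAPS = {
    "population_density": 10_000,
    "customers_served": 100_000,
    "critical_facilities": 20,
    "life_support_customers": 400,
}


def life_score(region):
    """Weighted sum of capped, normalized features, scaled to 0-100."""
    total = 0.0
    for feature, weight in WEIGHTS.items():
        value = getattr(region, feature)
        normalized = min(value / CAPS[feature], 1.0)  # clip at the cap
        total += weight * normalized
    return 100 * total


# Made-up regions for illustration.
urban = Region("urban (made-up)", 8_000, 50_000, 10, 200)
rural = Region("rural (made-up)", 50, 500, 0, 2)
print(f"{urban.name}: {life_score(urban):.1f}")
print(f"{rural.name}: {life_score(rural):.1f}")
```

Giving critical facilities and life-support customers the larger weights in this sketch mirrors the emphasis on critical customers, but the exact balance here is a guess.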
We created a mathematical model to assess wildfire risk and the impact of power shutoffs. It includes all the aforementioned factors: weather, nature, and infrastructure wildfire risk within an area, plus the impact on customers if power is shut off. Each variable is weighted according to its importance, and together they produce a final score that indicates the PSPS risk.
The sign of the overall composite score gives the recommendation, and its magnitude (distance from zero) gives the strength, with larger absolute values indicating stronger recommendations. A negative score suggests that the area should not undergo a PSPS, meaning power should be maintained to minimize impact. Conversely, a positive score indicates that a PSPS should be issued for the area, prioritizing wildfire risk mitigation over potential disruptions.
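The sign convention can be sketched as follows. This is only an assumed form, in which the weighted wildfire-risk terms are added and the weighted customer impact is subtracted, with every input normalized to the range 0 to 1. The weights, the function names, and the example inputs are hypothetical; the actual composite function is the one given in the report.

```python
# Hypothetical weights; NOT the values used in the project.
DEFAULT_WEIGHTS = {"weather": 0.4, "nature": 0.3, "energy": 0.3, "life": 1.0}


def composite_score(weather, nature, energy, life, weights=None):
    """Weighted wildfire risk minus weighted customer impact.

    All four inputs are assumed to be normalized to [0, 1].
    """
    w = weights or DEFAULT_WEIGHTS
    wildfire_risk = (
        w["weather"] * weather + w["nature"] * nature + w["energy"] * energy
    )
    customer_impact = w["life"] * life
    return wildfire_risk - customer_impact


def recommendation(score):
    """Map the sign of the composite score to a recommendation."""
    if score > 0:
        return "PSPS recommended"
    if score < 0:
        return "maintain power"
    return "no recommendation"


# Made-up inputs: a high-risk, sparsely populated area and a
# low-risk, densely populated one.
high_risk_area = composite_score(weather=0.9, nature=0.8, energy=0.7, life=0.2)
dense_area = composite_score(weather=0.3, nature=0.2, energy=0.3, life=0.9)
print(recommendation(high_risk_area))  # positive score
print(recommendation(dense_area))      # negative score
```

Under this assumed form, the distance of the score from zero grows as wildfire risk and customer impact diverge, which is what makes larger absolute values read as stronger recommendations.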
Please check our report (linked at the bottom of the page) if you would like to see more details about the composite function.
As we can see from the heatmaps, the east side of San Diego County has higher wildfire risk but lower PSPS risk. Geographically, the east side has more valleys that channel strong, dry Santa Ana winds into the area, which makes wildfires more likely. Those districts are also less populated, so the impact of a PSPS is predicted to be low (as illustrated by the green color), and a PSPS can be issued safely without too much concern from residents.
On the flip side, the west side of the county has lower wildfire risk but higher PSPS risk. This is because most of the population lives in or close to downtown San Diego, which is on the coast. Shutting off power to this area could lead to many problems, from general inconvenience to larger safety issues on the road or in public areas. There are special cases like Ramona, where the heatmap colors change from red to green. Even though these districts are populated and the life factor should be high, the risk of wildfire still outweighs the impact on the community, and a PSPS is strongly advised to mitigate damage to homes and harm to people.
This project could be used internally by SDG&E as a more data-informed decision-making tool for issuing a PSPS. The datasets can be expanded to include more and newer data, such as the meteorology data collected from SDG&E's weather stations every 10 minutes. We could also include data from different sources, such as public weather data, satellite images of vegetation, and public census data for the life factor.
Our project can also benefit the general public, as the ILLNESS model provides a single numerical value that is easy to interpret, even for those who may not understand the PSPS decision process entirely. Although our current geographical heatmaps are static, we could improve them by creating a live dashboard that updates the score periodically. It would again include the tooltip details with score breakdowns, as we want to provide transparency to customers who consume energy from SDG&E, so they are aware of possible PSPS events and the reasons behind them.