Original article

Spatiotemporal associations between air pollution and emergency room visits for cardiovascular and cerebrovascular diseases in Korea using a multivariate graph autoencoder modeling approach: an ecological study

Ewha Med J 2025;48(3):e43. Published online: July 23, 2025

1Department of Environmental Medicine, Ewha Womans University College of Medicine, Seoul, Korea

2Graduate Program in System Health Science and Engineering, Ewha Womans University College of Medicine, Seoul, Korea

3Convergence Medical Research Institute, Ewha Womans University Mokdong Hospital, Seoul, Korea

4Institute of Ewha-SCL for Environmental Health (IESEH), Ewha Womans University College of Medicine, Seoul, Korea

*Corresponding email: 01637@eumc.ac.kr (SJ), eunheeha@ewha.ac.kr (EH)

These authors contributed equally to this work as corresponding authors.

• Received: July 3, 2025   • Revised: July 16, 2025   • Accepted: July 17, 2025

© 2025 Ewha Womans University College of Medicine and Ewha Medical Research Institute

This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

  • Purpose
    This study aimed to assess the spatiotemporal associations between air pollution and emergency room visits for cardiovascular and cerebrovascular diseases in South Korea using a graph autoencoder (GAE). A multivariate graph-based approach was used to uncover seasonal and regional variations in pollutant–disease relationships.
  • Methods
    We collected monthly data from 2022 to 2023, including concentrations of 6 air pollutants (SO2, NO2, O3, CO, PM10, and PM2.5) and emergency room visits for 4 disease types: cardiac arrest, myocardial infarction, ischemic stroke, and hemorrhagic stroke. Pearson correlation coefficients were used to construct adjacency matrices, which, along with normalized feature matrices, were used as inputs to the GAE. The model was trained separately for each month and region to estimate the strength of pollutant–disease associations.
  • Results
    The pollutant–disease network structures exhibited clear seasonal variations. In winter, strong associations were observed between O3, NO2, and all disease outcomes. In spring, PM2.5 and PM10 were strongly linked to cardiac and stroke-related visits. These connections weakened during summer but became more pronounced in autumn, especially for NO2 and cardiac arrest. Urban areas displayed denser and stronger associations than non-urban areas.
  • Conclusion
    Our findings underscore the necessity for season- and region-specific air quality management strategies. In winter, focused control of O3 and NO2 is needed in urban areas, while in spring, PM mitigation is required in urban and selected rural regions. Autumn NO2 control may be especially beneficial in non-urban areas. Spatiotemporally tailored interventions could reduce the burden of air pollution-related emergency room visits.
Background
Cardiovascular and cerebrovascular diseases are among the leading causes of morbidity and mortality worldwide. In South Korea, their burden is steadily increasing due to rapid urbanization, population aging, and lifestyle changes [1]. Traditionally, risk factors such as hypertension, diabetes, dyslipidemia, smoking, and physical inactivity have been the main focus of disease prevention and management efforts. However, in recent years, environmental influences, particularly air pollution, have attracted increasing attention as significant and modifiable contributors to disease onset and progression [2].
Numerous epidemiological studies have reported that both short- and long-term exposure to ambient pollutants—including fine particulate matter (PM2.5 and PM10), nitrogen dioxide (NO2), and sulfur dioxide (SO2)—is associated with elevated risks of myocardial infarction (MI), ischemic stroke (IS), and cardiac arrest (CA) [2-5]. For instance, time-series analyses and cohort studies conducted in Korea and globally have demonstrated significant associations between pollutant concentration spikes and increased hospital admissions or mortality due to cardiovascular events [6-8]. Nevertheless, these studies often employ traditional statistical methods, which primarily focus on single-variable relationships or use linear models to estimate risk ratios or odds ratios [6-8]. While such approaches provide valuable insights into individual pollutant effects, they have limitations in addressing the complex, nonlinear, and multivariate nature of real-world environmental exposures and disease dynamics [9].
Moreover, both air pollution and health outcomes show pronounced spatiotemporal variation. Pollutant concentrations fluctuate seasonally due to meteorological and atmospheric factors, and regionally owing to differences in industrial activity, traffic density, and urban infrastructure. At the same time, the health impacts of pollution may vary depending on regional healthcare access, population vulnerability, and baseline disease prevalence [10-12]. These multifaceted interactions prompt important questions: How do pollutant–disease relationships vary across seasons and geographic regions? Which pollutants are most strongly associated with cardiovascular or cerebrovascular diseases in specific contexts?
To address these questions, more advanced analytical tools are needed. Recent developments in artificial intelligence and deep learning, especially within the field of graph neural networks, offer new possibilities for modeling complex relationships between interconnected variables. Among these, the graph autoencoder (GAE) has shown promise in learning hidden structures and association strengths from multivariate graph data [13].
Objectives
In this study, we applied a GAE framework to nationwide data from Korea spanning 2022 to 2023, integrating monthly air pollution measurements with emergency room visit data for 4 major cardiovascular and cerebrovascular disease outcomes: CA, MI, IS, and hemorrhagic stroke (HS). By constructing graphs based on correlation structures and training GAE models across different time points and regions, we aimed to uncover latent patterns in the pollutant–disease network. The objective of this study is to provide empirical evidence that can inform season- and region-specific public health strategies, enhance risk prediction, and support environmental health policy initiatives to reduce acute disease events triggered by air pollution exposure.
Ethics statement
This study used publicly available, de-identified secondary data obtained from national databases; therefore, it did not involve direct human participation or identifiable personal information. As a result, institutional review board approval and informed consent were not required under local regulations.
Study design
This research is a nationwide ecological analysis. The study is described in accordance with the STROBE (Strengthening the Reporting of Observational Studies in Epidemiology) statement, available at https://www.strobe-statement.org/.
Setting
This study explored the association between air pollutant concentrations and emergency room visits for cardiovascular and cerebrovascular diseases in Korea from January 2022 to December 2023 (Supplement 1). To address the complex, nonlinear, and multivariate nature of environmental health data, a graph-based machine learning model (GAE) was applied to identify and visualize seasonal and regional association patterns.
Participants
A total of 378,951 air pollution records and 396,713 emergency room visit cases were initially collected. After excluding 5,693 cases with missing or ambiguous regional information, the final analytic sample included 391,020 cases (Fig. 1, Supplement 1).
Variables
Exposure variables included 6 pollutants: SO2, NO2, ozone (O3), carbon monoxide (CO), particulate matter ≤10 μm (PM10), and particulate matter ≤2.5 μm (PM2.5). Outcome variables comprised 4 major disease categories: CA, MI, IS, and HS.
Data sources
Two national datasets were utilized to examine the relationship between air pollution and emergency room visits for cardiovascular and cerebrovascular diseases. Air pollution data were obtained from the National Air Pollution Monitoring Network (AirKorea; https://www.airkorea.or.kr/web/pastSearch?pMENU_NO=123) and included hourly measurements of pollutants (Dataset 1). Disease data were sourced from the National Emergency Department Information System (NEDIS; https://www.e-gen.or.kr/nemc/statistics_annual_report.do?brdclscd=02), managed by the Ministry of Health and Welfare (Dataset 2). These datasets provided monthly counts of emergency room visits for the 4 major disease categories, recorded across 17 administrative regions.
Data preprocessing
Several preprocessing steps were conducted to enable integrated spatiotemporal analysis. First, hourly air pollution data were aggregated to monthly average concentrations for each of the 17 administrative regions. Monthly aggregation was adopted because emergency room visit counts from NEDIS were only available at a monthly frequency and at the provincial (si-do) level; pollutant measurements from individual monitoring stations were likewise aggregated to ensure consistency. Next, pollutant and disease datasets were merged by matching month and region, resulting in 2 structured input types: (1) time-based datasets for each 12-month period, used to explore seasonal variation, and (2) region-based datasets covering 17 provinces, used to assess spatial heterogeneity. Finally, all variables were normalized to a 0–1 scale using Min-Max scaling to standardize input values and facilitate stable model training (Fig. 1, Supplement 2).
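As a rough illustration of the aggregation, merging, and scaling steps described above, the following Python sketch uses pandas. The column names (`region`, `timestamp`, `month`), the helper names, the "YYYY-MM" string month key, and the choice to scale over the whole merged panel are assumptions made for illustration; they are not taken from the authors' code.

```python
import pandas as pd

POLLUTANTS = ["SO2", "NO2", "O3", "CO", "PM10", "PM2.5"]
DISEASES = ["CA", "MI", "IS", "HS"]


def monthly_means(hourly):
    """Average hourly readings to one row per (region, month)."""
    out = hourly.copy()
    # Hypothetical month key: a "YYYY-MM" string derived from the timestamp.
    out["month"] = out["timestamp"].dt.strftime("%Y-%m")
    return out.groupby(["region", "month"], as_index=False)[POLLUTANTS].mean()


def min_max_scale(df, cols):
    """Rescale each listed column to [0, 1]; constant columns become 0."""
    out = df.copy()
    for c in cols:
        lo, hi = out[c].min(), out[c].max()
        out[c] = 0.0 if hi == lo else (out[c] - lo) / (hi - lo)
    return out


def build_panel(hourly, visits):
    """Merge monthly pollutant means with monthly visit counts, then scale."""
    merged = monthly_means(hourly).merge(visits, on=["region", "month"], how="inner")
    return min_max_scale(merged, POLLUTANTS + DISEASES)
```

Here `visits` is expected to hold one row per region and month with one count column per disease category. Filtering the resulting panel by month or by region would give the time-based and region-based inputs, respectively.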
Measurement (graph construction)
The integrated dataset was transformed into graph structures for input into the GAE model (Supplements 2, 3). Each of the 10 key variables—comprising the 6 pollutants and 4 disease categories—was represented as a node in the graph. A Pearson correlation matrix was calculated to quantify the relationships between nodes. An edge was created between 2 nodes only if the absolute value of the Pearson correlation coefficient (|ρᵢⱼ|) was greater than or equal to 0.4, resulting in a sparse adjacency matrix (A). The corresponding node feature matrix (X) was defined such that n=10 (number of nodes), and d represented the number of observations: 12 for time-series (monthly) analyses and 17 for regional (spatial) analyses.
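The thresholding rule described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name is hypothetical, and removing self-loops from the adjacency matrix is an assumption, since the text does not say how the diagonal was handled.

```python
import numpy as np


def correlation_graph(X, threshold=0.4):
    """Build a thresholded correlation graph.

    X has shape (n_nodes, d): one row per variable, one column per observation.
    Returns the Pearson correlation matrix and a binary adjacency matrix
    with an edge wherever |rho_ij| >= threshold.
    """
    rho = np.corrcoef(X)  # rows are treated as variables
    A = (np.abs(rho) >= threshold).astype(float)
    np.fill_diagonal(A, 0.0)  # assumption: no self-loops in A
    return rho, A
```

With the 10 nodes used in the study, `X` would have shape (10, d) and `A` shape (10, 10). A variable that is constant across observations has an undefined correlation and would receive no edges under this rule.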
Bias
As an ecological study, this analysis is subject to the risk of ecological fallacy, whereby associations observed at the population level may not apply to individuals. Potential confounding factors, such as meteorological conditions, socioeconomic status, and comorbidities, were not available in the dataset and could influence the observed associations.
Study size
All eligible records from national administrative databases for the defined period were analyzed; therefore, no sample size calculation was performed.
Statistical methods
The GAE model was implemented using the PyTorch Geometric library. The model architecture included an encoder with 2 graph convolutional layers (GCNConv) to compute node embeddings (Z), and a decoder that estimated edge probabilities between nodes by calculating the inner product of the embeddings, followed by a sigmoid activation function (σ). The loss function was defined as the binary cross-entropy between the original adjacency matrix (A) and the reconstructed matrix (Â). Model training was conducted for 100 epochs using the Adam optimizer with a learning rate of 0.01 (Supplements 2, 3). The resulting embedding vectors (Z) were visualized in 2-dimensional space to facilitate interpretation. Node positions were fixed based on the learned embeddings, and edge thickness was adjusted according to the predicted edge strength, visually reflecting the relative magnitude of pollutant–disease associations.
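To make the encoder, decoder, and loss concrete, the sketch below reproduces the forward computation in plain NumPy rather than PyTorch Geometric. It is a simplified stand-in under several assumptions: the symmetric normalization with added self-loops mirrors the standard graph convolution, the ReLU between the two layers and the weight shapes are illustrative choices not stated in the text, bias terms are omitted, and the loss is averaged over all entries of A. Training with Adam is not shown.

```python
import numpy as np


def normalize_adjacency(A):
    """Symmetric GCN normalization: D^-1/2 (A + I) D^-1/2."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gae_forward(X, A, W1, W2):
    """Two graph-convolution layers, then an inner-product decoder."""
    S = normalize_adjacency(A)
    H = np.maximum(S @ X @ W1, 0.0)  # layer 1 with ReLU (assumed activation)
    Z = S @ H @ W2                   # layer 2: node embeddings Z
    A_rec = sigmoid(Z @ Z.T)         # reconstructed edge probabilities
    return Z, A_rec


def reconstruction_loss(A, A_rec, eps=1e-9):
    """Mean binary cross-entropy between A and its reconstruction."""
    bce = A * np.log(A_rec + eps) + (1.0 - A) * np.log(1.0 - A_rec + eps)
    return float(-np.mean(bce))
```

Choosing a 2-column `W2` yields 2-dimensional embeddings that can be plotted directly, which matches the visualization step described above; whether the authors used a 2-dimensional latent space or a later projection is not stated.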
Two complementary analytical strategies were employed to explore the spatiotemporal dynamics of the pollutant–disease relationships. First, for the seasonal analysis, GAE models were trained individually for each month from January to December in both 2022 and 2023, allowing assessment of temporal variability. Second, for the regional analysis, the dataset was subdivided according to the 17 administrative regions, and separate GAE models were developed to examine spatial heterogeneity.
Additionally, a lag analysis was performed using pollutant concentrations from the previous month (t–1) to predict emergency room visits in the current month (t). The resulting graphs were exported as PNG images for visual inspection. In addition, the predicted edge strengths generated by the GAE and the corresponding Pearson correlation coefficients were compiled into Excel files (Microsoft Corp.) to enable comparative and supplementary quantitative analysis (Supplement 3).
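The 1-month lag described above amounts to shifting each region's pollutant series by one row before pairing it with the current month's visits. The following pandas sketch shows one way to do this; the function name, the `region` and `month` column names, and the assumption that every region has a complete, gap-free monthly series with a sortable "YYYY-MM" key are illustrative and not drawn from the authors' code.

```python
import pandas as pd


def add_lagged_pollutants(panel, pollutants, lag=1):
    """Append <pollutant>_lag<k> columns holding values from `lag` months earlier.

    Shifting is done within each region, so the first `lag` months of every
    region have no lagged value and are dropped.
    """
    out = panel.sort_values(["region", "month"]).copy()
    lagged = out.groupby("region")[pollutants].shift(lag)
    lagged.columns = [f"{p}_lag{lag}" for p in pollutants]
    out = pd.concat([out, lagged], axis=1)
    return out.dropna(subset=list(lagged.columns)).reset_index(drop=True)
```

The lagged columns would then replace the contemporaneous pollutant columns when the correlation matrix and graph are rebuilt.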
GAE models were applied to spatiotemporal datasets of air pollutants and emergency room visits for cardiovascular and cerebrovascular diseases across 17 administrative regions in Korea, spanning January 2022 to December 2023. The analysis revealed pronounced seasonal patterns in pollutant–disease associations, as well as distinct structural differences between urban and non-urban regions.
Seasonal variations
Using monthly GAE models, we constructed pollutant–disease networks to track how associations changed across seasons in 2022 (Fig. 2A–D, Supplements 4, 5). Each network visualizes the structural complexity and strength of connections between 6 major air pollutants (SO2, NO2, O3, CO, PM10, and PM2.5) and 4 emergency room diagnoses (MI, IS, HS, and CA).
In January, the network displayed the densest connections of any month, with a high mean edge strength of 0.960 (Fig. 2A). Most nodes were highly interconnected, and NO2 and O3 emerged as central hubs, robustly linked not only to each disease outcome but also to the other pollutants, including SO2, PM10, PM2.5, and CO. These widespread linkages suggest that exposure to multiple pollutants during winter imposes a cumulative burden on cardiovascular and cerebrovascular health. Notably, MI, IS, and CA all showed concurrent associations with nearly every pollutant, highlighting the particular vulnerability of these acute conditions to wintertime pollution.
By April, the network remained complex, with the highest mean edge strength of the year at 0.967, but the dominant pattern shifted (Fig. 2B). A highly integrated subnetwork formed among PM2.5, PM10, NO2, and the 4 disease nodes, indicating that particulate matter and NO2 were the primary contributors to emergency visits in spring. In contrast to the winter network, where O3 was central, ozone became peripheral in spring, and both CO and SO2 disconnected from the disease cluster. This reorganization points to seasonal specificity in pollutant effects. In particular, the edges between PM10 and CA and between PM2.5 and MI were notably strengthened.
In August, the network showed a distinct bifurcation: pollutant nodes (NO2, PM10, PM2.5, SO2, CO, O3) formed a tightly interconnected group, while disease nodes (MI, HS, IS, CA) clustered separately, with few or no direct links to pollutants (Fig. 2C). This configuration yielded the lowest mean edge strength of the year, at 0.852. The reduced connectivity likely reflects decreased pollutant concentrations and enhanced atmospheric dispersion in summer, contributing to a temporary attenuation of acute health risks. Nevertheless, the strong internal cohesion among pollutants suggests that environmental exposure remained a concern, even in the absence of immediate disease associations.
By October, network connectivity rebounded, with a mean edge value of 0.905. NO2 and O3 reemerged as central nodes and re-established links with all 4 disease outcomes (Fig. 2D). In contrast to spring, CO and SO2 were once again integrated into the disease–pollutant cluster, suggesting a broader pollutant profile influencing emergency room visits as temperatures cooled. A particularly dense connection between NO2, MI, and CA was observed, closely resembling the winter pattern and indicating a resurgence of cardiovascular vulnerability. Notably, PM2.5 and PM10 were less prominent in autumn than in spring, suggesting that gaseous pollutants once again played a leading role in disease associations during the colder transition period.
Regional differences
Regional-level analysis was performed using individual GAE models for each of Korea’s 17 administrative regions, as shown in Fig. 2E–H and Supplements 6 and 7. This approach enabled detailed comparisons of network structures between urban and non-urban areas, revealing clear differences in the complexity and strength of pollutant–disease associations.
Urban regions such as Busan and Seoul exhibited highly connected and densely structured networks. In Busan, the network had the highest regional correlation coefficient (0.967), with CA centrally positioned and strongly linked to several pollutants, including NO2, SO2, PM10, and PM2.5. MI also showed robust connections, particularly with fine particulate matter, suggesting that cardiovascular risk in this metropolitan area is influenced by a broad spectrum of air pollutants (Fig. 2E). In Seoul, a similarly complex subnetwork formed among gaseous pollutants (NO2, CO, O3, SO2), PMs, and CA. IS, MI, and HS were separated from this main cluster, while the remaining nodes interlinked to form an overall dense structure, with a correlation coefficient of 0.924 (Fig. 2F). Non-urban areas presented a contrasting profile. In Chungbuk, the network was moderately sparse (0.882), with pollutant nodes densely interconnected but showing fewer links to disease nodes. HS had limited connections, while MI and IS remained more peripheral (Fig. 2G). In Gangwon, the structure was slightly denser (0.893) than in Chungbuk but still showed some disconnection between certain diseases and pollutants (Fig. 2H). MI emerged as a bridge between the pollutant cluster—especially SO2 and NO2—and other health outcomes, while O3 and HS were more isolated. These findings indicate that non-metropolitan regions, while not devoid of pollutant–disease interactions, experience weaker or less complex exposures compared to urban centers, likely reflecting differences in emission sources, population density, and environmental conditions.
Lag analysis
Supplements 8–11 present lagged GAE networks in which pollutant concentrations from the previous month (t–1) are used to predict emergency room visits in the current month (t). The lagged networks exhibit seasonal patterns similar to those found in the contemporaneous analysis: in winter, strong linkages between NO2, O3, and the disease nodes persist, while in summer, overall network connectivity remains attenuated. These parallel findings indicate that the monthly co-occurrence patterns identified in the primary analysis are robust to a 1-month temporal shift.
Key results
This study applied a GAE model to evaluate the spatiotemporal associations between air pollutants and emergency room visits for cardiovascular and cerebrovascular diseases across Korea during 2022–2023. The findings revealed pronounced seasonal and regional variations (Fig. 3A–D). Strong associations were observed in winter between NO2/O3 and IS, in spring between particulate matter (PM10 and PM2.5) and MI or CA, and in autumn between NO2 and CA. Urban regions consistently displayed more complex pollutant–disease networks than non-urban areas, highlighting differential exposure and vulnerability [14].
Interpretation
These seasonal shifts in network structure reflect both bulk concentration changes and mechanistic compositional differences. In winter, elevated NO2 and O3 levels—driven by increased residential heating emissions and atmospheric stagnation—contrast with higher PM mass in spring, attributable to dust storms and intensified outdoor activity [15,16]. Source-apportionment studies further indicate that secondary inorganic aerosols (sulfate, nitrate, ammonium) make up approximately 64% of PM2.5 in winter (compared to a 49% annual average), with a sulfate-to-nitrate mass ratio of around 1.5:1. In spring (March–May), crustal dust contributions increase from 8% to 12%, and the combined sulfate plus nitrate fraction rises from 39% to 45% [17,18]. The attenuated associations in summer could be due to improved pollutant dispersion and increased indoor activities, such as staying in air-conditioned environments, which lower personal exposure; however, persistent risks in regions like Gangwon suggest localized vulnerabilities [15,19-22].
The regional differences, particularly the denser connections in metropolitan centers, underscore health disparities resulting from urbanization [23,24]. These results fulfill the study’s objectives by identifying critical periods and locations for targeted air quality and health interventions. The GAE approach provided a robust framework for detecting nonlinear, multivariate relationships that conventional statistical models may miss [25].
Given these spatiotemporal dynamics, our findings support the need for season-specific and region-specific public health strategies. In winter (Fig. 3D), strict management of NO2 and O3—along with focused health monitoring of at-risk populations in major cities such as Busan, Daegu, and the Seoul Metropolitan Area—is essential [26,27]. In spring, effective control of particulate matter (PM10 and PM2.5) and improvements in indoor air quality in urban centers like Seoul, Daejeon, and Ulsan may help reduce emergency cardiovascular incidents [28] (Fig. 3A). In autumn, targeted NO2 reduction strategies should be prioritized in non-metropolitan areas, such as Gyeongbuk and Chungbuk, to mitigate risks of stroke and CA [29] (Fig. 3C). During summer, inland regions such as Gangwon may benefit from region-specific surveillance systems and risk assessment models that incorporate meteorological factors, which could explain sustained vulnerability despite improved average air quality [30] (Fig. 3B).
Comparison with previous studies
Our findings are consistent with previous research documenting stronger PM–stroke associations under extreme winter conditions in Seoul [31] and springtime soil-dust peaks in Incheon, as well as autumn industrial/coal impacts in Daegu [32]. These studies document seasonal and spatial heterogeneity in pollution–health relationships. However, prior research has generally focused on single pollutants or individual cities, whereas our GAE-based network analysis integrates multi-pollutant, multi-region time-series data to reveal how and when pollutant–disease clusters form across Korea [33]. This network approach uncovers interconnected risk modules, such as winter NO2/O3–stroke and spring PM–MI clusters, and tracks their spatiotemporal evolution, offering dynamic insights and identifying periods and subregional vulnerabilities that traditional models may not capture [34]. Notably, our network-level findings are consistent with recent policy evaluations in Seoul showing that long-term air quality improvements correspond with reductions in cardiovascular morbidity, reinforcing the real-world value of targeted interventions [35].
Limitations
Several limitations should be acknowledged. First, the analysis was restricted to a 2-year period, which may not capture longer-term trends or lagged effects of pollution exposure. Second, health data were limited to emergency room visits, which may underestimate the overall disease burden. Third, the study used only monthly aggregated NEDIS emergency room visit data, which precluded direct assessment of daily or weekly acute exposure–effect relationships. Although monthly aggregation reduces noise and computational burden, it may mask short-lag effects. Fourth, the analysis was conducted at the provincial (si-do) level (17 regions), potentially missing intra-regional heterogeneity (e.g., variation in pollution and emergency room visit patterns within Seoul). Fifth, although the GAE model captured key pollutant–disease associations, it did not adjust for meteorological variables (such as temperature or humidity), individual-level risk factors, or socioeconomic status, all of which could confound or modify the observed associations. The exclusion of these covariates may have led to attenuation or inflation of some edge strengths, depending on region and season.
Generalizability
Despite these constraints, the study’s national scope and use of standardized administrative data enhance its generalizability within Korea. The methodological framework—particularly the GAE modeling—can be adapted to other countries with similar air pollution and health data systems. The visual and analytical insights derived from this model are valuable for healthcare policymakers, urban planners, and environmental health professionals aiming to develop locally tailored public health interventions.
Suggestions for further studies
Future studies should expand the observation period and include a broader array of health outcomes, such as hospitalizations and mortality. Incorporating meteorological, behavioral, and socioeconomic data would improve model precision and policy relevance. Further, investigating lagged effects of pollutant exposure and applying explainable AI methods to graph models may offer deeper insight into causal mechanisms. Ultimately, the integration of network science and public health surveillance offers a powerful approach for disease prevention and environmental risk management in increasingly urbanized societies.
High-resolution si/gun/gu-level graph analyses (corresponding to cities, counties, and districts), integrating inter-district movement matrices with moving-average pollutant exposure metrics, could trace subregional pollution–disease diffusion in a time-series framework, supporting improved causal inference and more targeted interventions.
This study investigated the spatiotemporal dynamics of emergency room visits associated with air pollution exposure in Korea by applying a GAE model to national datasets from 2022 to 2023. The analysis identified distinct seasonal and regional patterns in the associations between 6 major air pollutants and 4 categories of cardiovascular and cerebrovascular diseases.
We found that the strength and structure of pollutant–disease relationships varied significantly by season and urbanization level. Notably, NO2 and O3 played dominant roles in disease associations during winter and autumn, while particulate matter (PM10 and PM2.5) showed strong links to cardiac events in spring. In contrast, summer was characterized by overall weaker associations, though inland areas such as Gangwon continued to exhibit localized vulnerability. Urban regions consistently exhibited denser and more complex pollutant–disease networks compared to non-urban areas. These findings support the initial research aim of revealing the complex, multivariate relationships between environmental exposures and acute health outcomes across time and space. The use of GAE modeling enabled us to capture and visualize these nonlinear associations more effectively than conventional statistical methods.
From a medical and public health perspective, this study highlights the importance of season-specific and region-specific strategies for air pollution management and disease prevention. By pinpointing when and where pollutant–disease associations are most pronounced, our results can inform targeted interventions, risk communication, and policy planning to reduce preventable emergency health events and improve cardiovascular and cerebrovascular health nationwide.

Authors’ contribution

Conceptualization: EH. Data curation: SJ. Methodology/formal analysis/validation: SW, SJ. Project administration: EH. Funding acquisition: none. Writing–original draft: SW. Writing–review & editing: SW, SJ, EH.

Conflict of interest

No potential conflict of interest relevant to this article was reported.

Funding

None.

Data availability

The datasets analyzed in this study are publicly available from the following sources:

Dataset 1. Hourly air pollution measurement data (SO2, NO2, O3, CO, PM10, PM2.5) from 17 administrative regions in Korea (2022–2023) can be accessed via the AirKorea Final Confirmation Measurement Data Portal: (https://www.airkorea.or.kr/web/pastSearch?pMENU_NO=123).

Dataset 2. Emergency room visit records related to cardiovascular and cerebrovascular diseases (2022–2023) are available from the NEDIS Emergency Medical Portal Statistical Yearbook: (https://www.e-gen.or.kr/nemc/statistics_annual_report.do?brdclscd=02).

All data used in this study are publicly accessible, and no proprietary or restricted datasets were used. The code used for analysis is available on GitHub: (https://github.com/seungpil720/sohee).

Acknowledgments

None.

Supplementary files are available from Harvard Dataverse: https://doi.org/10.7910/DVN/ZY7MVM
Supplement 1. Temporal trends in daily air pollutant concentrations and monthly disease incidence (2022–2023).
emj-2025-00640-Supplementary-1.pdf
Supplement 2. Overall workflow of data preprocessing and graph autoencoder (GAE) modeling.
emj-2025-00640-Supplementary-2.pdf
Supplement 3. Mathematical notation of input representation, edge construction, and loss function in the graph autoencoder.
emj-2025-00640-Supplementary-3.pdf
Supplement 4. Monthly graph autoencoder (GAE) networks depicting pollutant–disease associations in 2022.
emj-2025-00640-Supplementary-4.pdf
Supplement 5. Monthly graph autoencoder (GAE) networks depicting pollutant–disease associations in 2023.
emj-2025-00640-Supplementary-5.pdf
Supplement 6. Regional graph autoencoder (GAE) networks depicting pollutant–disease associations in 2022.
emj-2025-00640-Supplementary-6.pdf
Supplement 7. Regional graph autoencoder (GAE) networks depicting pollutant–disease associations in 2023.
emj-2025-00640-Supplementary-7.pdf
Supplement 8. Monthly graph autoencoder (GAE) networks depicting lagged pollutant–disease associations in 2022.
emj-2025-00640-Supplementary-8.pdf
Supplement 9. Monthly graph autoencoder (GAE) networks depicting lagged pollutant–disease associations in 2023.
emj-2025-00640-Supplementary-9.pdf
Supplement 10. Regional graph autoencoder (GAE) networks depicting lagged pollutant–disease associations in 2022.
emj-2025-00640-Supplementary-10.pdf
Supplement 11. Monthly graph autoencoder (GAE) networks depicting lagged pollutant–disease associations in 2023.
emj-2025-00640-Supplementary-11.pdf
Fig. 1.
Overview of data sources, preprocessing, and analysis pipeline. This flowchart illustrates the data integration and preprocessing steps used in the study. Emergency room visit records (n=396,713) for 4 cardiovascular and cerebrovascular conditions—myocardial infarction (MI), hemorrhagic stroke (HS), ischemic stroke (IS), and cardiac arrest (CA)—were obtained from the National Emergency Department Information System (NEDIS) for 2022–2023. After excluding records with missing regional information (n=5,693), 391,020 cases were included in the final analysis. Simultaneously, hourly air pollutant data (n=378,951) from AirKorea were aggregated into monthly averages for 6 pollutants (SO2, NO2, O3, CO, PM10, and PM2.5) across 17 regions. All variables were normalized using Min-Max scaling. The datasets were merged by region and month, resulting in 12 seasonal and 17 regional datasets. Correlation matrices were computed for each dataset to construct adjacency matrices, which were used as input for graph autoencoder modeling.
emj-2025-00640f1.jpg
Fig. 2.
Visualization of pollutant–disease networks by season and region (2022). This figure presents representative graph autoencoder (GAE) network outputs, showing seasonal (A–D) and regional (E–H) variations in the relationships between air pollutants and emergency room visit diseases in Korea during 2022. Circular nodes represent air pollutants (SO2, NO2, O3, CO, PM10, and PM2.5), while square nodes represent disease categories: cardiac arrest (CA), myocardial infarction (MI), ischemic stroke (IS), and hemorrhagic stroke (HS). Edges indicate learned associations, with thicker lines representing stronger predicted relationships. Red numbers denote the average predicted edge strength (mean ≥0.5) for each network. Panels (A–D) show seasonal networks for January, April, August, and October 2022, respectively. Panels (E–H) display regional networks for Busan, Seoul, Chungbuk, and Gangwon in 2022. The visualizations highlight pronounced seasonal changes in network density and structure: winter (A) and spring (B) show high connectivity, while summer (C) is much sparser. Regional differences are also clear: urban areas like Busan (E) and Seoul (F) display more complex networks, whereas Chungbuk (G) and Gangwon (H) have less integrated structures.
Fig. 3.
Spatiotemporal hotspots of strong pollutant–disease associations by season. This figure highlights region- and season-specific pollutant–disease pairs with relatively strong associations based on graph autoencoder predictions. Each panel presents a map of South Korea with red-shaded areas indicating regions with significantly elevated edge strengths. Accompanying tables summarize the mean edge values across all seasons, within the highlighted season, and for specific regions, along with their relative significance. (A) Spring: PM10 is strongly associated with myocardial infarction (MI) in Ulsan and with cardiac arrest (CA) in Seoul and Daejeon. These associations are most prominent in spring, showing more than 3-fold increases in relative significance in the hotspot regions. (B) Summer: Gangwon is identified as a hotspot for PM2.5–ischemic stroke (IS) associations, with a relative increase in edge strength during summer. (C) Autumn: NO2 exhibits a strong seasonal link with cardiac arrest in Gyeongbuk, indicating increased vulnerability in this region during autumn. (D) Winter: NO2–IS and O3–IS associations are especially elevated in Daegu and Busan, respectively, with winter showing the highest relative significance for these edges (up to 2.11-fold).
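The fold values quoted in this caption can be read as ratios of mean edge strengths. The sketch below assumes that relative significance is the mean edge strength in a region and season divided by the mean across all seasons; the caption does not define the measure, so this definition, the `min_fold` cutoff, the function names, and the numbers are hypothetical.

```python
def fold_change(hotspot_mean, baseline_mean):
    """Ratio of a region-season mean edge strength to a baseline mean."""
    if baseline_mean <= 0:
        raise ValueError("baseline mean must be positive")
    return hotspot_mean / baseline_mean


def flag_hotspots(region_means, baseline_mean, min_fold=2.0):
    """Return regions whose fold change meets the (assumed) cutoff."""
    flagged = {}
    for region, mean in region_means.items():
        fold = fold_change(mean, baseline_mean)
        if fold >= min_fold:
            flagged[region] = round(fold, 2)
    return flagged


# Made-up mean edge strengths for one pollutant-disease pair.
all_season_mean = 0.2
spring_by_region = {"Region A": 0.62, "Region B": 0.25, "Region C": 0.5}
flagged = flag_hotspots(spring_by_region, all_season_mean)
```

With these made-up inputs, Region A and Region C are flagged and Region B is not, which mirrors how a map panel would shade only the regions with elevated edge strengths.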
