AI4G6146
Safeguarding Sustainable Cities: Unsupervised Video Anomaly Detection through Diffusion-based Latent Pattern Learning
Menghao Zhang, Jingyu Wang, Qi Qi, Pengfei Ren, Haifeng Sun, Zirui Zhuang, Lei Zhang, Jianxin Liao
10 min. talk | – at – | Session: –
Sustainable cities require high-quality community management and surveillance analytics, which are supported by video anomaly detection techniques. However, mainstream video anomaly detection techniques still require manually labeled data and cannot be applied to massive real-world video collections. Without labeling, unsupervised video anomaly detection (UVAD) is challenged by noisy pseudo-labels and the open-ended nature of anomaly detection. In response, a diffusion-based latent pattern learning UVAD framework, called DiffVAD, is proposed. The method learns latent patterns by generating different patterns of the same event through diffusion models. Anomalies are detected by evaluating the pattern distribution: the different patterns of normal events are diverse but correlated, while the different patterns of abnormal events are more diffuse. This manner of detection is equally effective for normal events unseen in the training set. In addition, we design a refinement strategy for pseudo-labels to mitigate the effects of label noise. Extensive experiments on six benchmark datasets demonstrate the design’s promising generalization ability and high efficiency. Specifically, DiffVAD obtains an AUC score of 81.9% on the ShanghaiTech dataset.
AI4G6702
Effective High-order Graph Representation Learning for Credit Card Fraud Detection
Yao Zou, Dawei Cheng
10 min. talk | – at – | Session: –
Credit card fraud imposes significant costs on both cardholders and issuing banks. Fraudsters often disguise their crimes, for example by routing legitimate-looking transactions through several benign users to bypass anti-fraud detection. Existing graph neural network (GNN) models struggle to learn features of camouflaged, indirect multi-hop transactions due to their inherent over-smoothing issues in deep multi-layer aggregation, presenting a major challenge in detecting disguised relationships. Therefore, in this paper, we propose a novel High-order Graph Representation Learning model (HOGRL) to avoid incorporating excessive noise during the multi-layer aggregation process. In particular, HOGRL learns different orders of pure representations directly from high-order transaction graphs. We realize this goal by first constructing high-order transaction graphs and then learning the pure representations of each order, so that the model can identify fraudsters’ multi-hop indirect transactions via multi-layer pure feature learning. In addition, we introduce a mixture-of-expert attention mechanism to automatically determine the importance of different orders for jointly optimizing fraud detection performance. We conduct extensive experiments on both open-source and real-world datasets, and the results demonstrate significant improvements of our proposed HOGRL over state-of-the-art fraud detection baselines. HOGRL’s superior performance also demonstrates its effectiveness in detecting criminals who use high-order fraud camouflage.
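To make the idea of a high-order transaction graph concrete, the following is a minimal sketch under a stated assumption: the k-th order graph links each node to the nodes whose shortest-path distance from it is exactly k. The abstract does not specify how HOGRL builds its graphs, so this definition, the function name k_hop_graph, and the toy account names are illustrative only.

```python
from collections import deque


def k_hop_graph(adj, k):
    """Return a k-th order neighbor map for an undirected graph.

    adj maps each node to an iterable of its direct neighbors.
    Assumption (not from the abstract): two nodes are k-th order
    neighbors when their shortest-path distance is exactly k.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    result = {}
    for src in adj:
        # Breadth-first search from src, stopping expansion at depth k.
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            if dist[u] == k:
                continue
            for v in adj.get(u, ()):
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        result[src] = sorted(n for n, d in dist.items() if d == k)
    return result


# Toy chain of accounts: a - b - c - d (hypothetical data).
chain = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
second_order = k_hop_graph(chain, 2)
third_order = k_hop_graph(chain, 3)
```

In the toy chain, accounts a and c never transact directly but become neighbors in the second-order graph, which is the kind of indirect multi-hop link the abstract describes.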
AI4G7088
Wearable Sensor-Based Few-Shot Continual Learning on Hand Gestures for Motor-Impaired Individuals via Latent Embedding Exploitation
Riyad Bin Rafiq, Weishi Shi, Mark V. Albert
10 min. talk | – at – | Session: –
Hand gestures can provide a natural means of human-computer interaction and enable people who cannot speak to communicate efficiently. Existing hand gesture recognition methods depend heavily on pre-defined gestures; however, motor-impaired individuals require new gestures tailored to each individual’s gesture motion and style. Gesture samples collected from different persons have distribution shifts due to their health conditions, the severity of the disability, motion patterns of the arms, etc. In this paper, we introduce the Latent Embedding Exploitation (LEE) mechanism in our replay-based Few-Shot Continual Learning (FSCL) framework, which significantly improves the performance of fine-tuning a model for out-of-distribution data. Our method produces a diversified latent feature space by leveraging a preserved latent embedding known as gesture prior knowledge, along with intra-gesture divergence derived from two additional embeddings. Thus, the model can capture latent statistical structure in highly variable gestures with limited samples. We conduct an experimental evaluation using the SmartWatch Gesture and the Motion Gesture datasets. The proposed method achieves an average test accuracy of 57.0%, 64.6%, and 69.3% using one, three, and five samples, respectively, for six different gestures. Our method helps motor-impaired persons leverage wearable devices, and their unique styles of movement can be learned and applied in human-computer interaction and social communication. Code is available at: https://github.com/riyadRafiq/wearable-latent-embedding-exploitation.
AI4G7218
For the Misgendered Chinese in Gender Bias Research: Multi-Task Learning with Knowledge Distillation for Pinyin Name Gender Prediction
Xiaocong Du, Haipeng Zhang
10 min. talk | – at – | Session: –
Achieving gender equality is a pivotal factor in realizing the UN’s Global Goals for Sustainable Development. Gender bias studies work towards this and rely on name-based gender inference tools to assign individual gender labels when gender information is unavailable. However, these tools often inaccurately predict gender for Chinese Pinyin names, leading to potential bias in such studies. With the growing participation of Chinese people in international activities, this situation is becoming more severe. Specifically, current tools focus on pronunciation (Pinyin) information, neglecting the fact that the latent connections between Pinyin and the underlying Chinese characters (Hanzi) convey critical information. As a first effort, we formulate the Pinyin name-gender guessing problem and design a Multi-Task Learning Network assisted by Knowledge Distillation that enables the Pinyin representations in the model to possess semantic features of Chinese characters and to learn gender information from Chinese character names. Our open-sourced method surpasses commercial name-gender guessing tools by 9.70% to 20.08% in relative terms, and also outperforms the state-of-the-art algorithms.
AI4G7442
Self-Supervised Vision for Climate Downscaling
Karandeep Singh, Chaeyoon Jeong, Naufal Shidqi, Sungwon Park, Arjun Nellikkattil, Elke Zeller, Meeyoung Cha
10 min. talk | – at – | Session: –
Climate change is one of the most critical challenges that our planet is facing today. Rising global temperatures are already affecting Earth’s weather and climate patterns with an increased frequency of unpredictable and extreme events. Future projections for climate change research are based on computer models like Earth System Models (ESMs). Climate simulations typically run on a coarser grid due to the high computational resources required, and then undergo a lighter downscaling process to obtain data on a finer grid. This work presents a self-supervised deep learning model that does not require high resolution ground truth data for downscaling. This is realized by leveraging salient distribution patterns and the hidden dependencies between weather variables for an individual data point at runtime. We propose three climate-specific components that well represent the patterns of underlying weather variables and learn intricate inter-variable dependencies. Extensive evaluation with 2x, 3x, and 4x scaling factors demonstrates that our model obtains 8% to 47% performance gain over existing baselines while greatly reducing the overall runtime. The improved performance and no dependence on high resolution ground truth data make our method a valuable tool for future climate research.
AI4G7874
Fuel-Saving Route Planning with Data-Driven and Learning-Based Approaches – A Systematic Solution for Harbor Tugs
Shengming Wang, Xiaocai Zhang, Jing Li, Xiaoyang Wei, Hoong Chuin Lau, Bing Tian Dai, Binbin Huang, Zhe Xiao, Xiuju Fu, Zheng Qin
10 min. talk | – at – | Session: –
In recent years, there has been a trend toward cleaner port environments, enforced through legislation. Transit optimisation of fuel-based port service boats such as harbour tugs has emerged as a critical task for reducing fuel consumption and carbon emissions. In this paper, an innovative learning-based method, comprising a Reinforcement Learning (RL) model together with a fuel consumption prediction model, is proposed to formulate fuel-saving transit routes. Firstly, an ensemble model is established by combining a Long Short-Term Memory (LSTM) model with a Multilayer Perceptron (MLP) model, predicting fuel use based on tugboat movement and environmental factors. Subsequently, an innovative RL model based on the Deep Deterministic Policy Gradient (DDPG) framework is developed, taking into account the characteristics and obstructions of the waterways in Singapore as well as environmental factors, to learn the optimal transit strategy that minimizes fuel consumption. We also demonstrate the efficacy of the solution in generating routes from origin to destination terminals, exhibiting significantly reduced fuel consumption in comparison to real-world transit scenarios.
AI4G7886
Automated Essay Scoring Using Discourse External Knowledge
Nisrine Ait Khayi, Vasile Rus
10 min. talk | – at – | Session: –
The Automated Essay Scoring (AES) task is an important NLP research problem given its significance for the education ecosystem. Recently, researchers started to apply a hybrid approach to this task. This hybrid approach incorporates into a deep learning model expert features that assess a particular dimension of the essay. Motivated by these successes, we propose to automatically assess essays using a hybrid approach that relies on external discourse knowledge. Our proposed model consists of using transformer-based embeddings to generate semantic representations of essays. Then, we incorporate several discourse features into these representations. Finally, we apply a linear classifier to generate the final score. To evaluate the effectiveness of this approach, we have conducted extensive experiments using the Automated Student Assessment Prize dataset (ASAP). The performance of the proposed model has been evaluated using the Quadratic Weighted Kappa (QWK) metric. The experimental results demonstrate the effectiveness of this approach in comparison with several existing solutions in literature.
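For readers unfamiliar with the Quadratic Weighted Kappa metric mentioned above, here is a minimal pure-Python sketch of its standard definition. The function name, argument names, and the choice to return 1.0 when the expected disagreement is zero are assumptions made for illustration; this is not the authors' evaluation code.

```python
def quadratic_weighted_kappa(rater_a, rater_b, min_rating, max_rating):
    """Quadratic Weighted Kappa between two lists of integer scores.

    Scores must lie in [min_rating, max_rating]. Returns 1.0 for
    perfect agreement and values near 0 for chance-level agreement.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("score lists must be non-empty and equally long")
    n = max_rating - min_rating + 1
    if n < 2:
        raise ValueError("need at least two distinct rating levels")

    # Observed confusion matrix of score pairs.
    observed = [[0] * n for _ in range(n)]
    for a, b in zip(rater_a, rater_b):
        observed[a - min_rating][b - min_rating] += 1

    total = len(rater_a)
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(n)) for j in range(n)]

    numerator = 0.0
    denominator = 0.0
    for i in range(n):
        for j in range(n):
            # Quadratic penalty grows with the squared score distance.
            weight = ((i - j) ** 2) / ((n - 1) ** 2)
            expected = hist_a[i] * hist_b[j] / total
            numerator += weight * observed[i][j]
            denominator += weight * expected

    if denominator == 0:
        # Assumed convention: both raters gave one identical constant score.
        return 1.0
    return 1.0 - numerator / denominator


human = [1, 2, 3]
model = [1, 2, 3]
kappa = quadratic_weighted_kappa(human, model, 1, 3)
```

Because the penalty is quadratic in the score difference, a prediction that misses by two points costs four times as much as one that misses by one point.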
AI4G7979
Safeguarding Fraud Detection from Attacks: A Robust Graph Learning Approach
Jiasheng Wu, Xin Liu, Dawei Cheng, Yi Ouyang, Xian Wu, Yefeng Zheng
10 min. talk | – at – | Session: –
Financial fraud is one of the most significant social issues and has caused tremendous property losses. Graph neural networks (GNNs) have been applied to anti-fraud practices and achieved decent results. However, recent research has discovered flaws in the robustness of GNN-based fraud-detection models, enabling fraudsters to mislead them through attacks such as data poisoning. In addition, most existing attack-defense models tend to focus on idealized settings and lose information during truncation or filtering, which lowers their performance in complicated financial fraud cases. Therefore, in this paper, we propose a novel robust anti-fraud GNN model. In particular, we first design an attack algorithm that tampers with both the features and the structure of graph data to simulate fraudsters’ attacking behaviors in complex real-life fraud scenarios. Then we apply singular value decomposition to the graph and learn the decomposed matrices in a GNN model with specifically designed joint losses. This enables our model to learn the graph patterns in low-rank subspaces without losing too much detailed information, and to fit the graph structure to characteristics including class-homophily and sparsity to guarantee robustness. The proposed approach is evaluated on real-world fraud datasets, demonstrating its advantages in fraud detection and robustness compared with state-of-the-art baselines.
AI4G8009
Transfer Learning Using Inaccurate Physics Rule for Streamflow Prediction
Tianshu Bao, Taylor Thomas Johnson, Xiaowei Jia
10 min. talk | – at – | Session: –
Accurate streamflow prediction is critical for ensuring water supply and detecting floods, while also providing essential hydrological inputs for other scientific models in fields such as climate and agriculture. Recently, deep learning models have been shown to achieve state-of-the-art regionalization performance by building a global hydrologic model. These models predict streamflow given catchment physical characteristics and weather forcing data. However, these models focus only on gauged basins and cannot adapt to ungauged basins, i.e., basins without training data. Prediction in Ungauged Basins (PUB) is considered one of the most important challenges in hydrology, as most basins in the United States and around the world have no observations. In this work, we propose a meta-transfer learning approach that enhances imperfect physics equations to facilitate model adaptation. Intuitively, physical equations can often be used to regularize deep learning models to achieve robust regionalization performance under gauged scenarios, but they can be inaccurate due to the simplified representation of physics. We correct such uncertainty in the physical equations by residual approximation and let the corrected equations guide the model training process. We evaluate the proposed method for predicting daily streamflow on the Catchment Attributes and Meteorology for Large-sample Studies (CAMELS) dataset. The experimental results on hydrological data spanning 19 years demonstrate the effectiveness of the proposed method in ungauged scenarios.
AI4G8060
Understanding Public Perception Towards Weather Disasters Through the Lens of Metaphor
Rui Mao, Qika Lin, Qiawen Liu, Gianmarco Mengaldo, Erik Cambria
10 min. talk | – at – | Session: –
Extreme weather can lead to weather-induced disasters. These have a profound impact on communities worldwide, causing loss of life, damage to properties and infrastructure, and disruption of daily activities. In alignment with the United Nations Sustainable Development Goals, addressing the increasing frequency and severity of these events, exacerbated by climate change, is imperative. Exploring public perception of and responses to weather disasters becomes crucial for policymakers to formulate effective strategies that not only mitigate the impacts but also contribute to the goal of ensuring sustainable and resilient communities. Social media, as a pervasive and real-time communication platform, has gathered a large amount of public opinion. In this work, we analyze public perception towards weather disasters based on tweets and metaphors. Metaphor, as a linguistic device, plays a pivotal role in unraveling cognitive processes and understanding how individuals perceive and make sense of concepts. We focus on tweets related to four distinct types of weather disasters, i.e., floods, hurricanes, tornadoes, and wildfires, aiming to extract nuanced insights regarding public perceptions, concerns, and attitudes towards these specific events. We also deliver constructive recommendations based on these insights.
AI4G8217
Revealing Hierarchical Structure of Leaf Venations in Plant Science via Label-Efficient Segmentation: Dataset and Method
Weizhen Liu, Ao Li, Ze Wu, Yue Li, Baobin Ge, Guangyu Lan, Shilin Chen, Minghe Li, Yunfei Liu, Xiaohui Yuan, Nanqing Dong
10 min. talk | – at – | Session: –
Hierarchical leaf vein segmentation is a crucial but under-explored task in agricultural sciences, where analysis of the hierarchical structure of plant leaf venation can contribute to plant breeding. While current segmentation techniques rely on data-driven models, there is no publicly available dataset specifically designed for hierarchical leaf vein segmentation. To address this gap, we introduce the HierArchical Leaf Vein Segmentation (HALVS) dataset, the first public hierarchical leaf vein segmentation dataset. HALVS comprises 5,057 real-scanned high-resolution leaf images collected from three plant species: soybean, sweet cherry, and London planetree. It also includes human-annotated ground truth for three orders of leaf veins, with a total labeling effort of 83.8 person-days. Based on HALVS, we further develop a label-efficient learning paradigm that leverages partial label information, i.e., missing annotations for tertiary veins. Empirical studies are performed on HALVS, revealing new observations, challenges, and research directions on leaf vein segmentation. Our dataset and code are available at https://github.com/WeizhenLiuBioinform/HALVS-Hierarchical-Vein-Segment.
AI4G8227
Using Causal Inference to Investigate Contraceptive Discontinuation in Sub-Saharan Africa
Victor Akinwande, Megan MacGregor, Celia Cintas, Ehud Karavani, Dennis Wei, Kush R. Varshney, Pablo Nepomnaschy
10 min. talk | – at – | Session: –
Discontinuation rates vary by family planning method and across socio-economic contexts. Understanding these variations and their causes is paramount for developing and implementing policies aimed at curbing discontinuation rates. Randomized controlled trials (RCTs) are ideal for obtaining this information, but this design can be extremely expensive and logistically complex. The ongoing collection of comprehensive data sets, such as Demographic and Health Surveys (DHS data), when combined with machine learning methods, presents an alternative and relatively cost-effective means of evidence gathering for policy development. Here, we use causal inference to estimate the effect of injectable contraceptive use on discontinuation over the 12-month period that follows its adoption. To that end, we use retrospective observational data from seven sub-Saharan African countries captured by the DHS’ Contraceptive Calendar. We use machine learning methods to characterize data regions that share common covariate support. We find that the use of injectables increased the risk of discontinuation in four of the seven countries analyzed. Consistent with existing literature, we find that concerns about the side effects of injectables appear to be the most frequent reason for discontinuation. However, these risks decreased after adjusting for socio-economic factors. As risk estimates may not apply uniformly within populations, we characterized the sub-populations for robust estimations by their geographical region, level of unmet needs, marital status, level of education, and age at first sex.
AI4G8242
ReBandit: Random Effects Based Online RL Algorithm for Reducing Cannabis Use
Susobhan Ghosh, Yongyi Guo, Pei-Yao Hung, Lara Coughlin, Erin Bonar, Inbal Nahum-Shani, Maureen Walton, Susan Murphy
10 min. talk | – at – | Session: –
The escalating prevalence of cannabis use, and associated cannabis-use disorder (CUD), poses a significant public health challenge globally. With a notably wide treatment gap, especially among emerging adults (EAs; ages 18-25), addressing cannabis use and CUD remains a pivotal objective within the 2030 United Nations Agenda for Sustainable Development Goals (SDG). In this work, we develop an online reinforcement learning (RL) algorithm called reBandit, which will be utilized in a mobile health study to deliver personalized mobile health interventions aimed at reducing cannabis use among EAs. reBandit utilizes random effects and informative Bayesian priors to learn quickly and efficiently in noisy mobile health environments. Moreover, reBandit employs Empirical Bayes and optimization techniques to autonomously update its hyper-parameters online. To evaluate the performance of our algorithm, we construct a simulation testbed using data from a prior study, and compare against commonly used algorithms in mobile health studies. We show that reBandit performs as well as or better than all the baseline algorithms, and that the performance gap widens as population heterogeneity increases in the simulation environment, demonstrating its ability to adapt to a diverse population of study participants.
AI4G8259
Detecting AI-Generated Sentences in Human-AI Collaborative Hybrid Texts: Challenges, Strategies, and Insights
Zijie Zeng, Shiqi Liu, Lele Sha, Zhuang Li, Kaixun Yang, Sannyuya Liu, Dragan Gasevic, Guangliang Chen
10 min. talk | – at – | Session: –
This study explores the challenge of sentence-level AI-generated text detection within human-AI collaborative hybrid texts (abbreviated as hybrid texts). Existing studies of AI-generated text detection for hybrid texts often rely on synthetic datasets. These typically involve hybrid texts with a limited number of boundaries, e.g., single-boundary hybrid texts that begin with human-written content and end with machine-generated continuations. We contend that studies of detecting AI-generated content within hybrid texts should cover different types of hybrid texts generated in realistic settings to better inform real-world applications. Therefore, our study utilizes the CoAuthor dataset, which includes diverse, realistic hybrid texts generated through the collaboration between human writers and an intelligent writing system in multi-turn interactions. We adopt a two-step, segmentation-based pipeline: (i) detect segments within a given hybrid text where each segment contains sentences of consistent authorship, and (ii) classify the authorship of each identified segment. Our empirical findings highlight (1) detecting AI-generated sentences in hybrid texts is overall a challenging task because (1.1) human writers’ selecting and even editing AI-generated sentences based on personal preferences adds difficulty in identifying the authorship of segments; (1.2) the frequent change of authorship between neighboring sentences within the hybrid text creates difficulties for segment detectors in identifying authorship-consistent segments; (1.3) the short length of text segments within hybrid texts provides limited stylistic cues for reliable authorship determination; (2) before embarking on the detection process, it is beneficial to assess the average length of segments within the hybrid text. 
This assessment aids in deciding whether (2.1) to employ a text segmentation-based strategy for hybrid texts with longer segments, or (2.2) to adopt a direct sentence-by-sentence classification strategy for those with shorter segments.
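The strategy choice described in finding (2) can be sketched as follows. This assumes sentence-level authorship labels are available for a representative sample of the hybrid texts, and the threshold of 3.0 sentences is a hypothetical value chosen for illustration; the abstract does not state a cutoff.

```python
def average_segment_length(labels):
    """Mean number of consecutive sentences sharing one authorship label.

    labels is a per-sentence sequence such as ["human", "ai", ...].
    """
    if not labels:
        raise ValueError("labels must not be empty")
    segments = 1
    for previous, current in zip(labels, labels[1:]):
        # Each change of authorship starts a new segment.
        if current != previous:
            segments += 1
    return len(labels) / segments


def choose_strategy(labels, threshold=3.0):
    """Pick a detection strategy from the average segment length.

    threshold is a hypothetical cutoff, not a value from the paper.
    """
    if average_segment_length(labels) >= threshold:
        return "segmentation-based"
    return "sentence-by-sentence"


long_segments = ["human"] * 3 + ["ai"] * 3
short_segments = ["human", "ai", "human", "ai"]
strategy_long = choose_strategy(long_segments)
strategy_short = choose_strategy(short_segments)
```

With these toy inputs, the first text averages three sentences per segment and is routed to the segmentation-based pipeline, while the second alternates authorship every sentence and is routed to direct sentence-level classification.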
AI4G8303
A Teacher Classroom Dress Assessment Method Based on a New Assessment Dataset
Ming Fang, Qi Liu, Yunpeng Zhou, Xinning Du, Qiwen Liang, Shuhua Liu
10 min. talk | – at – | Session: –
Proper attire is a professional requirement for teachers, and teachers’ dress influences students’ perceptions of teacher quality. Therefore, evaluating teacher attire can help regulate and improve teachers’ dress. However, the lack of a dataset on teacher attire hinders the development of this field. For this purpose, this paper constructs a Teachers’ Classroom Dress Assessment (TCDA) dataset. To our knowledge, it is the first dataset focused on teacher attire. This dataset is drawn entirely from the classroom environment, covering 25 teacher attributes, with a total of 11879 teacher dress samples and sufficient positive and negative examples. The TCDA dataset is therefore a challenging evaluation dataset with characteristics such as data diversity. In order to verify the effectiveness of the dataset, this paper systematically explores a new perspective on human attribute information and proposes, for the first time, a Teachers’ Dress Assessment Method (TDAM), which uses predicted teacher attributes to score the overall attire of each teacher, thereby promoting the development of the teacher’s classroom teaching field. The experimental results demonstrate the rationality of the TCDA dataset and the effectiveness of the TDAM method. The dataset and code can be openly obtained at https://github.com/MingZier/TCDA-dataset.
AI4G8313
Down the Toxicity Rabbit Hole: A Framework to Bias Audit Large Language Models with Key Emphasis on Racism, Antisemitism, and Misogyny
Arka Dutta, Adel Khorramrouz, Sujan Dutta, Ashiqur R. KhudaBukhsh
10 min. talk | – at – | Session: –
This paper makes three contributions. First, it presents a generalizable, novel framework dubbed toxicity rabbit hole that iteratively elicits toxic content from a wide suite of large language models. Spanning a set of 1,266 identity groups, we first conduct a bias audit of PaLM 2 guardrails presenting key insights. Next, we report generalizability across several other models. Through the elicited toxic content, we present a broad analysis with a key emphasis on racism, antisemitism, misogyny, Islamophobia, homophobia, and transphobia. We release a massive dataset of machine-generated toxic content with a view toward safety for all. Finally, driven by concrete examples, we discuss potential ramifications.
AI4G8321
Ensuring Fairness Stability for Disentangling Social Inequality in Access to Education: the FAiRDAS General Method
Eleonora Misino, Roberta Calegari, Michele Lombardi, Michela Milano
10 min. talk | – at – | Session: –
Recent advancements in Artificial Intelligence in Education (AIEd) have revolutionized educational practices using machine learning to extract insights from students’ activities and behaviours. Performance prediction, a key domain within AIEd, aims to enhance student achievement levels and address sustainable development goals related to education, health, gender equality, and economic growth. However, the potential of AIEd to contribute to these goals is hindered by the lack of attention to fairness in prediction algorithms, leading to educational inequality. To address this gap, we introduce FAiRDAS, a general framework that models long-term fairness as an abstract dynamic system. Our approach, illustrated through a case study in AIEd with real data, offers a customizable solution to promote long-term fairness while ensuring the stability of mitigation actions over time.
AI4G8332
Predicting Housing Transaction with Common Covariance GNNs
Jinjin Li, Bin Liu, Chengyan Liu, Hongli Zhang
10 min. talk | – at – | Session: –
Urban migration is a significant aspect of a city’s economy. The exploration of the underlying determinants of housing purchases among current residents contributes to the study of future trends in urban migration, enabling governments to formulate appropriate policies to guide future economic growth. This article employs a factor model to analyze data on residents’ rentals, first-time home purchases, and subsequent housing upgrades. We decompose the factors influencing housing purchases into common drivers and specific drivers. Our hypothesis is that common drivers reflect universal social patterns, while specific drivers represent stochastic elements. We construct a correlation matrix capturing the inter-resident relationships based on the common drivers of housing purchases. We then propose a graph neural network based on the correlation matrix to model housing predictions as a node classification problem. Our model addresses two critical questions. Firstly, we aim to identify which renting residents will make a first-time home purchase in the future. Secondly, we seek to determine which residents, having already rented and made a first-time home purchase, will opt for a second home purchase. The results of our testing on real-world datasets demonstrate that, based solely on rental and home purchase records, we can achieve a sensitivity for housing predictions exceeding 80%.
AI4G8334
VulnerabilityMap: An Open Framework for Mapping Vulnerability among Urban Disadvantaged Populations in the United States
Lin Chen, Yong Li, Pan Hui
10 min. talk | – at – | Session: –
Cities are crucibles of numerous opportunities, but also hotbeds of inequality. The plight of disadvantaged populations who are “left behind” within urban environments has been an increasingly pressing concern, which poses substantial threats to the realization of the UN SDG agenda. However, a comprehensive framework for studying this urban dilemma is currently absent, preventing researchers from developing AI models for social good prediction and intervention. To fill this gap, we construct VulnerabilityMap, a framework to meticulously dissect the challenges faced by urban disadvantaged populations, unraveling their vulnerability to a spectrum of shocks and stresses that are categorized through the prism of Maslow’s hierarchy of needs. Specifically, we systematically collect large-scale multi-sourced census and web-based data covering more than 328 million people in the United States regarding demographic features, neighborhood environments, offline mobility behaviors, and online social connections. These features are further related to vulnerability outcomes from short-term shocks such as COVID-19 and long-term physiological, social, and self-actualization stresses. Leveraging our framework, we construct machine learning models that exhibit strong performance in predicting vulnerability outcomes from various disadvantage features, which shows the promising utility of our framework to support targeted AI models. Moreover, we provide model-based explainability analysis to interpret the reasons underlying model predictions, shedding light on intricate social factors that trap certain populations inside vulnerable situations. Our constructed dataset is publicly available at https://github.com/LinChen-65/VulnerabilityMap/.
AI4G8348
Domain Adaptation with Joint Loss for Consistent Regression and Ordinal Classification in the Proxy Means Test for Poverty Targeting
Siti Mariyah, Wayne Wobcke
10 min. talk | – at – | Session: –
Previous domain adaptation methods are designed to work for a single task, either classification or regression. In this paper, the task of the learner is to produce both an estimation and an ordinal classification of instances that are consistent in that the classification of instances into quantiles is derived from the estimated values. We propose an extension of the boosting for transfer method (TrAdaBoost), Joint Quantile Loss Boosting Domain Adaptation (TrAdaBoost.JQL) for regression transfer learning, that aims to jointly minimize regression and ordinal classification errors. Motivated by the real-world problem of poverty targeting using the Proxy Means Test, we empirically show that TrAdaBoost.JQL can consistently reduce RMSE and inclusion and exclusion errors for estimating per capita household expenditure, across a wide variety of districts in Indonesia, compared to other reweighting-based and invariant feature representation-based domain adaptation methods. We design TrAdaBoost.JQL to be flexible as to the chosen eligibility (poor) threshold used in poverty targeting practice and as to whether estimation or ordinal classification accuracy is prioritized.
AI4G8352
Time-Evolving Data Science and Artificial Intelligence for Advanced Open Environmental Science (TAIAO) Programme
Yun Sing Koh, Albert Bifet, Karin Bryan, Guilherme Cassales, Olivier Graffeuille, Nick Lim, Phil Mourot, Ding Ning, Bernhard Pfahringer, Varvara Vetrova, Heitor Murilo Gomes
10 min. talk | – at – | Session: –
New Zealand’s unique ecosystems face increasing threats from climate change, impacting biodiversity and posing challenges to safety, livelihoods, and well-being. To tackle these complex issues, advanced data science and artificial intelligence techniques can provide unique solutions. Currently, in its fourth year of a seven-year program, TAIAO focuses on methods for analyzing environmental datasets. Recognizing this urgency, the open-source TAIAO platform was developed. This platform enables new artificial intelligence research for environmental data and offers an open-access repository to enhance reproducibility in the field. This paper will showcase four environmental case studies, artificial intelligence research, platform implementation details, and future development plans.
AI4G8366
Enhancing Sustainable Urban Mobility Prediction with Telecom Data: A Spatio-Temporal Framework Approach
ChungYi Lin, Shen-Lung Tung, Hung-Ting Su, Winston H. Hsu
10 min. talk | – at – | Session: –
Traditional traffic prediction, limited by the scope of sensor data, falls short in comprehensive traffic management. Mobile networks offer a promising alternative using network activity counts, but these lack crucial directionality. Thus, we present the TeltoMob dataset, featuring undirected telecom counts and corresponding directional flows, to predict directional mobility flows on roadways. To address this, we propose a two-stage spatio-temporal graph neural network (STGNN) framework. The first stage uses a pre-trained STGNN to process telecom data, while the second stage integrates directional and geographic insights for accurate prediction. Our experiments demonstrate the framework’s compatibility with various STGNN models and confirm its effectiveness. We also show how to incorporate the framework into real-world transportation systems, enhancing sustainable urban mobility.
AI4G8394
Remote Sensing for Water Quality: A Multi-Task, Metadata-Driven Hypernetwork Approach
Olivier Graffeuille, Yun Sing Koh, Jörg Wicker, Moritz Lehmann
10 min. talk | – at – | Session: –
Inland water quality monitoring is vital for clean water access and aquatic ecosystem management. Remote sensing machine learning models enable large-scale observations, but are difficult to train due to data scarcity and variability across many lakes. Multi-task learning approaches enable learning of lake differences by learning multiple lake functions simultaneously. However, they suffer from a trade-off between parameter efficiency and the ability to model task differences flexibly, and struggle to model many diverse lakes with few samples per task. We propose Multi-Task Hypernetworks, a novel multi-task learning architecture which circumvents this trade-off using a shared hypernetwork to generate different network weights for each task from small task-specific embeddings. Our approach stands out from existing works by providing the added capacity to leverage task-level metadata, such as lake depth and temperature, explicitly. We show empirically that Multi-Task Hypernetworks outperform existing multi-task learning architectures for water quality remote sensing and other tabular data problems, and leverage metadata more effectively than existing methods.
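The idea of a shared hypernetwork that generates per-task weights from small task embeddings can be illustrated with a toy sketch. This is not the authors' architecture: the linear hypernetwork, the linear target model, the matrix `H`, and the two lake embeddings are all hypothetical values chosen only for illustration.

```python
def hypernetwork(embedding, H):
    """Shared linear hypernetwork: maps a task embedding to target-model weights.

    H has one row per generated weight; each row is as long as the embedding.
    """
    return [sum(h * e for h, e in zip(row, embedding)) for row in H]


def predict(weights, x):
    """Linear target model whose weights (and trailing bias) come from the hypernetwork."""
    *w, b = weights
    return sum(wi * xi for wi, xi in zip(w, x)) + b


# Hypothetical shared parameters: 3 generated weights (2 coefficients + bias),
# produced from 2-dimensional task embeddings.
H = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]

# Hypothetical per-lake embeddings: the only task-specific parameters.
lake_a = [1.0, 0.0]
lake_b = [0.0, 2.0]

x = [3.0, 4.0]  # one input sample (for example, two spectral features)
pred_a = predict(hypernetwork(lake_a, H), x)
pred_b = predict(hypernetwork(lake_b, H), x)
```

In this sketch every lake shares `H`, so adding a lake adds only a small embedding rather than a full set of model weights. Task-level metadata could plausibly be concatenated to the embedding, but that is an assumption rather than something stated in the abstract.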
AI4G8415
DeepLight: Reconstructing High-Resolution Observations of Nighttime Light With Multi-Modal Remote Sensing Data
Lixian Zhang, Runmin Dong, Shuai Yuan, Jinxiao Zhang, Mengxuan Chen, Juepeng Zheng, Haohuan Fu
10 min. talk | – at – | Session: –
Nighttime light (NTL) remote sensing observation serves as a unique proxy for quantitatively assessing progress toward meeting a series of Sustainable Development Goals (SDGs), such as poverty estimation, urban sustainable development, and carbon emission. However, existing NTL observations often suffer from pervasive degradation and inconsistency, limiting their utility for computing the indicators defined by the SDGs. In this study, we propose a novel approach to reconstruct high-resolution NTL images using multimodal remote sensing data. To support this research endeavor, we introduce DeepLightMD, a comprehensive dataset comprising data from five heterogeneous sensors, offering fine spatial resolution and rich spectral information at a national scale. Additionally, we present DeepLightSR, a calibration-aware method for multi-modality super-resolution. DeepLightSR integrates calibration-aware alignment, an auxiliary-to-main multi-modality fusion, and an auxiliary-embedded refinement to effectively address spatial heterogeneity, fuse diversely representative features, and enhance performance in ×8 super-resolution tasks. Extensive experiments demonstrate the superiority of DeepLightSR over 8 competing methods, as evidenced by improvements in PSNR (2.01 dB ~ 13.25 dB) and PIQE (0.49 ~ 9.32). Our findings underscore the practical significance of our proposed dataset and model in reconstructing high-resolution NTL data, supporting efficient and quantitative assessment of SDG progress. The code and data will be available at https://github.com/xian1234/DeepLight.
AI4G8431
FairReFuse: Referee-Guided Fusion for Multi-Modal Causal Fairness in Depression Detection
Jiaee Cheong, Sinan Kalkan, Hatice Gunes
10 min. talk | – at – | Session: –
Machine learning (ML) bias in mental health detection and analysis is becoming an increasingly pertinent challenge. Despite promising efforts indicating that multimodal methods work better than unimodal methods, there is minimal work on multimodal fairness for depression detection. We propose a causal multimodal framework which consists of two modules. Module 1 performs causal interventional debiasing via backdoor adjustment for each modality to achieve group fairness. Module 2 adaptively fuses the different modalities using a referee-based individual fairness guided fusion mechanism to address individual fairness. We conduct experiments and ablation studies on three depression datasets, D-Vlog, DAIC-WOZ and E-DAIC, and show that our framework improves classification performance as well as group and individual fairness compared to existing approaches.
AI4G8442
A Novel GAN Approach to Augment Limited Tabular Data for Short-Term Substance Use Prediction
Nguyen Thach, Patrick Habecker, Bergen Johnston, Lillianna Cervantes, Anika Eisenbraun, Alex Mason, Kimberly Tyler, Bilal Khan, Hau Chan
10 min. talk | – at – | Session: –
Substance use is a global issue that negatively impacts millions of persons who use drugs (PWUDs). In practice, identifying vulnerable PWUDs for efficient allocation of appropriate resources is challenging due to their complex use patterns (e.g., their tendency to change usage within months) and the high acquisition costs for collecting PWUD-focused substance use data. Thus, there has been a paucity of machine learning models for accurately predicting short-term substance use behaviors of PWUDs. In this paper, using longitudinal survey data of 258 PWUDs in the U.S. Great Plains collected by our team, we design a novel GAN that deals with high-dimensional low-sample-size tabular data and survey skip logic to augment existing data to improve classification models’ prediction on (A) whether the PWUDs would increase usage and (B) at which ordinal frequency they would use a particular drug within the next 12 months. Our evaluation results show that, when trained on augmented data from our proposed GAN, the classification models improve their predictive performance (AUROC) by up to 13.4% in Problem (A) and 15.8% in Problem (B) for usage of marijuana, meth, amphetamines, and cocaine, outperforming state-of-the-art generative models.
AI4G8443
Energy-Efficient Missing Data Imputation in Wearable Health Applications: A Classifier-aware Statistical Approach
Dina Hussein, Taha Belkhouja, Ganapati Bhat, Janardhan Rao Doppa
10 min. talk | – at – | Session: –
Wearable devices are being increasingly used in high-impact health applications including vital sign monitoring, rehabilitation, and movement disorders. Wearable health monitoring can aid in the United Nations social development goal of healthy lives by enabling early warning, risk reduction, and management of health risks. Health tasks on wearable devices employ multiple sensors to collect relevant parameters of a user’s health and make decisions using machine learning (ML) algorithms. The ML algorithms assume that data from all sensors are available for the health monitoring tasks. However, the applications may encounter missing or incomplete data due to user error, energy limitations, or sensor malfunction. Missing data results in significant loss of accuracy and quality of service. This paper presents a novel Classifier-Aware iMputation (CAM) approach to impute missing data such that classifier accuracy for health tasks is not affected. Specifically, CAM employs unsupervised clustering followed by a principled search algorithm to uncover imputation patterns that maintain high accuracy. Evaluations on seven diverse health tasks show that CAM achieves accuracy within 5% of the baseline with no missing data when one sensor is missing. CAM also achieves significantly higher accuracy compared to generative approaches with negligible energy overhead, making it suitable for a wide range of wearable applications.
AI4G8447
An Embarrassingly Simple Approach to Enhance Transformer Performance in Genomic Selection for Crop Breeding
Renqi Chen, Wenwei Han, Haohao Zhang, Haoyang Su, Zhefan Wang, Xiaolei Liu, Hao Jiang, Wanli Ouyang, Nanqing Dong
10 min. talk | – at – | Session: –
Genomic selection (GS), as a critical crop breeding strategy, plays a key role in enhancing food production and addressing the global hunger crisis. The predominant approaches in GS currently revolve around employing statistical methods for prediction. However, statistical methods often come with two main limitations: strong statistical priors and linear assumptions. A recent trend is to capture the non-linear relationships between markers by deep learning. However, as crop datasets are commonly long sequences with limited samples, the robustness of deep learning models, especially Transformers, remains a challenge. In this work, to unleash the unexplored potential of attention mechanism for the task of interest, we propose a simple yet effective Transformer-based framework that enables end-to-end training of the whole sequence. Via experiments on rice3k and wheat3k datasets, we show that, with simple tricks such as k-mer tokenization and random masking, Transformer can achieve overall superior performance against seminal methods on GS tasks of interest.
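The two "simple tricks" named in the abstract, k-mer tokenization and random masking, can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the non-overlapping split, the `[MASK]` token, the masking rate, and the toy nucleotide string are all assumptions.

```python
import random


def kmer_tokenize(sequence, k=3):
    """Split a sequence into non-overlapping k-mers; the last token may be shorter."""
    return [sequence[i:i + k] for i in range(0, len(sequence), k)]


def random_mask(tokens, mask_rate=0.15, mask_token="[MASK]", seed=None):
    """Independently replace each token with mask_token with probability mask_rate."""
    rng = random.Random(seed)
    return [mask_token if rng.random() < mask_rate else tok for tok in tokens]


# Toy marker sequence (hypothetical data).
tokens = kmer_tokenize("ACGTACGTAC", k=3)
masked = random_mask(tokens, mask_rate=0.5, seed=0)
```

Grouping k markers into one token shortens the input by roughly a factor of k, which is one plausible reason such tokenization helps with long sequences and few samples.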
AI4G8458
RisQNet: Rescuing SMEs from Financial Shocks with a Novel Networked-Loan Risk Assessment
Zhaoyuan Lu, Taijun Li, Jingzhen Zhang, Moyang Liu, Xiang Li, Linyi Cui, Junqi Chen, Zhibin Niu
10 min. talk | – at – | Session: –
In the face of economic downturns, Small and Medium-sized Enterprises (SMEs) within interconnected networked-loans are vulnerable to cascading debt crises, exacerbated by factors like social media-induced financial shocks. Traditional risk assessment models, which mainly rely on financial data, inadequately predict such crises, as evidenced by the collapse of Silicon Valley Bank in 2023. To address this issue, we developed RisQNet, a model that uses temporal graph networks to incorporate diverse risks, including real-time media influences. This approach not only advances risk prediction through news feature extraction and large language models but also enhances risk management strategies with intuitive visualization tools. Validated on a dataset with a total loan volume of USD 3 trillion, RisQNet outperforms the state-of-the-art baseline and achieves an AUC of 87.1%. Our collaborative effort with financial regulators and the SME community underpins the model’s development, aligning with the UN SDG 8. RisQNet represents a significant step forward in leveraging AI for financial stability, offering a promising approach to combat the propagation of debt crises in financial networks.
AI4G8459
Streamflow Prediction with Uncertainty Quantification for Water Management: A Constrained Reasoning and Learning Approach
Mohammed Amine Gharsallaoui, Bhupinderjeet Singh, Supriya Savalkar, Aryan Deshwal, Ananth Kalyanaraman, Kirti Rajagopalan, Janardhan Rao Doppa
10 min. talk | – at – | Session: –
Predicting the spatiotemporal variation in streamflow along with uncertainty quantification enables decision-making for sustainable management of scarce water resources. Process-based hydrological models (aka physics-based models) are based on physical laws, but use simplifying assumptions which can lead to poor accuracy. Data-driven approaches offer a powerful alternative, but they require large amounts of training data and tend to produce predictions that are inconsistent with physical laws. This paper studies a constrained reasoning and learning (CRL) approach where physical laws represented as logical constraints are integrated as a layer in the deep neural network. To address the small-data setting, we develop a theoretically grounded training approach to improve the generalization accuracy of deep models. For uncertainty quantification, we combine the synergistic strengths of Gaussian processes (GPs) and deep temporal models by passing the learned latent representation as input to a standard distance-based kernel. Experiments on multiple real-world datasets demonstrate the effectiveness of both CRL and GP with deep kernel approaches over strong baseline methods.
AI4G8477
Benchmarking Fish Dataset and Evaluation Metric in Keypoint Detection – Towards Precise Fish Morphological Assessment in Aquaculture Breeding
Weizhen Liu, Jiayu Tan, Guangyu Lan, Ao Li, Dongye Li, Le Zhao, Xiaohui Yuan, Nanqing Dong
10 min. talk | – at – | Session: –
Accurate phenotypic analysis in aquaculture breeding necessitates the quantification of subtle morphological phenotypes. Existing datasets suffer from limitations such as small scale, limited species coverage, and inadequate annotation of keypoints for measuring refined and complex morphological phenotypes of fish body parts. To address this gap, we introduce FishPhenoKey, a comprehensive dataset comprising 23,331 high-resolution images spanning six fish species. Notably, FishPhenoKey includes 22 phenotype-oriented annotations, enabling the capture of intricate morphological phenotypes. Motivated by the nuanced evaluation of these subtle morphologies, we also propose a new evaluation metric, Percentage of Measured Phenotypes (PMP). It is designed to assess the accuracy of individual keypoint positions and is highly sensitive to the phenotype measured using the corresponding keypoints. To enhance keypoint detection accuracy, we further propose a novel loss, Anatomically-Calibrated Regularization (ACR), that can be integrated into keypoint detection models, leveraging biological insights to refine keypoint localization. Our contributions set a new benchmark in fish phenotype analysis, addressing the challenges of precise morphological quantification and opening new avenues for research in sustainable aquaculture and genetic studies. Our dataset and code are available at https://github.com/WeizhenLiuBioinform/FishPhenotype-Detect.
AI4G8482
Predicting Carpark Availability in Singapore with Cross-Domain Data: A New Dataset and A Data-Driven Approach
Huaiwu Zhang, Yutong Xia, Siru Zhong, Kun Wang, Zekun Tong, Qingsong Wen, Roger Zimmermann, Yuxuan Liang
10 min. talk | – at – | Session: –
The increasing number of vehicles highlights the need for efficient parking space management. Predicting real-time Parking Availability (PA) can help mitigate traffic congestion and the corresponding social problems, which is a pressing issue in densely populated cities like Singapore. In this study, we aim to collectively predict future PA across Singapore with complex factors from various domains. The contributions in this paper are listed as follows: (1) A New Dataset: We introduce the SINPA dataset, containing a year’s worth of PA data from 1,687 parking lots in Singapore, enriched with various spatial and temporal factors. (2) A Data-Driven Approach: We present DeepPA, a novel deep-learning framework, to collectively and efficiently predict future PA across thousands of parking lots. (3) Extensive Experiments and Deployment: DeepPA demonstrates a 9.2% reduction in prediction error for up to 3-hour forecasts compared to existing advanced models. Furthermore, we implement DeepPA in a practical web-based platform to provide real-time PA predictions to aid drivers and inform urban planning for the governors in Singapore. We release the dataset and source code at https://github.com/yoshall/SINPA.
AI4G8485
Spatio-Temporal Field Neural Networks for Air Quality Inference
Yutong Feng, Qiongyan Wang, Yutong Xia, Junlin Huang, Siru Zhong, Yuxuan Liang
10 min. talk | – at – | Session: –
The air quality inference problem aims to utilize historical data from a limited number of observation sites to infer the air quality index at an unknown location. Considering the sparsity of data due to the high maintenance cost of the stations, good inference algorithms can effectively save the cost and refine the data granularity. While spatio-temporal graph neural networks have made excellent progress on this problem, their non-Euclidean and discrete data structure modeling of reality limits their potential. In this work, we make the first attempt to combine two different spatio-temporal perspectives, fields and graphs, by proposing a new model, Spatio-Temporal Field Neural Network, and its corresponding new framework, Pyramidal Inference. Extensive experiments validate that our model achieves state-of-the-art performance in nationwide air quality inference in the Chinese Mainland, demonstrating the superiority of our proposed model and framework.
AI4G8489
Fitness Activity Recognition Using a Novel Pressure Sensing Mat and Machine Learning for the Future of Accessible Training
Katia Bourahmoune, Karlos Ishac, Marc Carmichael
10 min. talk | – at – | Session: –
Physical inactivity is still a major problem contributing to a growing public health crisis despite a fast-expanding body of technological solutions and wellness research around fitness training. The inaccessibility of professional fitness training remains a leading cause of this gap for reasons encompassing socioeconomic factors, cultural and demographic barriers, and more recently the threat of global pandemics that disrupt traditional modes of staying physically active. Previous lines of work have explored using AI for fitness activity recognition from various sensing modalities such as computer vision, wearable sensors, and force and pressure sensors. However, these works are limited by their feasibility, deployability, and accessibility in real-world scenarios, in addition to the technical challenges faced by each modality for accurate and reliable activity recognition. In this paper, we propose an accessible system for gym activity recognition and correction focusing on foundational fitness activities using ML and a novel pressure sensing mat, and validate its deployability in a real-world use case in a natural gym setting. We present the detailed and previously under-investigated Centre of Pressure (COP) profile of four main gym activities in terms of several COP-related metrics specifically as targets for ML-based recognition tasks. Based on this, we identify COP displacement and COP balance measures as important features for ML-based recognition of these fitness activities for future research in this area. Furthermore, we compare the performance of several ML models in the activity recognition task, achieving 98.5% recognition accuracy using ML models suitable for real-time deployment. Finally, we demonstrate the feasibility of our system in a live real-world use case in a natural gym environment.
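For readers unfamiliar with the Centre of Pressure, the sketch below computes it as the pressure-weighted mean position on a sensor grid, and computes COP displacement as the distance between the COP of two frames. The grid layout, cell-index coordinates, and frame values are hypothetical; the abstract does not specify the mat's geometry or the exact COP metrics used.

```python
import math


def centre_of_pressure(grid):
    """Pressure-weighted mean (x, y) position on a grid of sensor readings.

    Coordinates are column and row indices; returns None for an unloaded mat.
    """
    total = sum(sum(row) for row in grid)
    if total == 0:
        return None
    x = sum(col * p for row in grid for col, p in enumerate(row)) / total
    y = sum(r * p for r, row in enumerate(grid) for p in row) / total
    return x, y


def cop_displacement(cop_a, cop_b):
    """Euclidean distance between the COP of two frames, in grid cells."""
    return math.hypot(cop_b[0] - cop_a[0], cop_b[1] - cop_a[1])


# Two hypothetical 2x2 frames: evenly loaded, then all load on one corner.
frame_even = [[1, 1], [1, 1]]
frame_corner = [[0, 0], [0, 4]]

cop_even = centre_of_pressure(frame_even)
cop_corner = centre_of_pressure(frame_corner)
shift = cop_displacement(cop_even, cop_corner)
```

Tracking such displacement over consecutive frames gives a time series that a classifier could use as a feature, in the spirit of the COP displacement measures the abstract mentions.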
AI4G8514
Functional Graph Convolutional Networks: A Unified Multi-task and Multi-modal Learning Framework to Facilitate Health and Social-Care Insights
Tobia Boschi, Francesca Bonin, Rodrigo Ordonez-Hurtado, Cécile Rousseau, Alessandra Pascale, John Dinsmore
10 min. talk | – at – | Session: –
This paper introduces a novel Functional Graph Convolutional Network (funGCN) framework that combines Functional Data Analysis and Graph Convolutional Networks to address the complexities of multi-task and multi-modal learning in digital health and longitudinal studies. With the growing importance of health solutions to improve health care and social support, ensure healthy lives, and promote well-being at all ages, funGCN offers a unified approach to handle multivariate longitudinal data for multiple entities and ensures interpretability even with small sample sizes. Key innovations include task-specific embedding components that manage different data types, the ability to perform classification, regression, and forecasting, and the creation of a knowledge graph for insightful data interpretation. The efficacy of funGCN is validated through simulation experiments and a real-data application. funGCN source code is publicly available at https://github.com/IBM/funGCN.
AI4G8526
Drug Overdose Vital-Signs Evaluator Using Machine Learning
Anush Niranjan Lingamoorthy, Abhishek Kumar Mishra, Suman Kumar, David Gordon, Jacob Brenner, Nagarajan Kandasamy, Amanda Watson
10 min. talk | – at – | Session: –
Opioid overdose is an escalating global epidemic, affecting 16 million individuals. Lack of overdose detection and slower response times are the leading causes of overdose deaths. During a fatal opioid overdose, the user exhibits motionlessness, lack of breathing, and hypoxemia (oxygen saturation drops). In this paper, we discuss the development of a shoulder-based wearable overdose detection device that monitors hypoxemia, motion, and respiration. The device’s design considers the underserved socio-economic population and their psychological contexts. However, conventional approaches to detecting an overdose typically focus on a single biomarker. To address this, we have developed a robust capsule-network-based machine learning (ML) model, OxyCaps, that integrates oxygen saturation, respiration rate, and motion to classify different levels of hypoxemia. This also helps improve patient adherence by decreasing the chances of false positive alerts. To determine a hypoxemic state, the model considers various features like skin tone, body physiology, motion, and photoplethysmography (PPG) signals. In the absence of real-world opioid overdose data, our research leverages data collected by our device from 19 patients experiencing sleep apnea, exploiting the parallels between overdose and apnea biomarkers. Our dataset provides a novel compilation of raw PPG and motion signals detected from the shoulder. Our model classifies 3 stages of hypoxemia with an average accuracy of 92%, specifically achieving a high recall of 0.98 for the critical hypoxemic state that is crucial in determining an overdose.
AI4G8530
Empathy and AI: Achieving Equitable Microtransit for Underserved Communities
Eleni Bardaka, Pascal Van Hentenryck, Crystal Chen Lee, Christopher B. Mayhorn, Kai Monast, Samitha Samaranayake, Munindar P. Singh
10 min. talk | – at – | Session: –
This paper describes a newly launched project that will produce a new approach to public microtransit for underserved communities. Public microtransit cannot rely on pricing signals to manage demand, and current approaches face the challenges of simultaneously being underutilized and overextended. This project conceives of the setting as a sociotechnical system. Its main idea is to engage users through AI agents in conjunction with platform constraints to find solutions that purely technical conceptions cannot find. The project was specified over an intense series of discussions with key stakeholders (riders, city government, and nongovernmental agencies) and brings together expertise in the disciplines of AI, Operations Research, Urban Planning, Psychology, and Community Development. The project will culminate in a pilot study, results from which will facilitate the transfer of its technology to additional communities.
AI4G8531
SUKHSANDESH: An Avatar Therapeutic Question Answering Platform for Sexual Education in Rural India
Salam Michael Singh, Shubhmoy Kumar Garg, Amitesh Misra, Aaditeshwar Seth, Tanmoy Chakraborty
10 min. talk | – at – | Session: –
Sexual education aims to foster a healthy lifestyle in terms of emotional, mental and social well-being. In countries like India, where adolescents form the largest demographic group, they face significant vulnerabilities concerning sexual health. Unfortunately, sexual education is often stigmatized, creating barriers to providing essential counseling and information to this at-risk population. Consequently, issues such as early pregnancy, unsafe abortions, sexually transmitted infections, and sexual violence become prevalent. Our current proposal aims to provide a safe and trustworthy platform for sexual education to the vulnerable rural Indian population, thereby fostering the healthy and overall growth of the nation. In this regard, we strive towards designing SUKHSANDESH, a multi-staged AI-based Question Answering platform for sexual education tailored to rural India, adhering to safety guardrails and regional language support. By utilizing information retrieval techniques and large language models, SUKHSANDESH will deliver effective responses to user queries. We also propose to anonymise the dataset to mitigate safety risks and set AI guardrails against any harmful or unwanted response generation. Moreover, an innovative feature of our proposal involves integrating "avatar therapy" with SUKHSANDESH. This feature will convert AI-generated responses into real-time audio delivered by an animated avatar speaking regional Indian languages. This approach aims to foster empathy and connection, which is particularly beneficial for individuals with limited literacy skills. Partnering with Gram Vanni, an industry leader, we will deploy SUKHSANDESH to address sexual education needs in rural India.
AI4G8540
CGAP: Urban Region Representation Learning with Coarsened Graph Attention Pooling
Zhuo Xu, Xiao Zhou
10 min. talk | – at – | Session: –
The explosion of massive urban data recently has provided us with a valuable opportunity to gain deeper insights into urban regions and the daily lives of residents. Urban region representation learning emerges as a crucial realm for fulfilling this task. Among deep learning approaches, graph neural networks (GNNs) have shown promise, given that city elements can be naturally represented as nodes with various connections between them as edges. However, many existing GNN approaches encounter challenges such as over-smoothing and limitations in capturing information from nodes in other regions, resulting in the loss of crucial urban information and a decline in region representation performance. To address these challenges, we leverage urban graph structure information and introduce a hierarchical graph pooling process called Coarsened Graph Attention Pooling (CGAP). CGAP features local attention units to create coarsened intermediate graphs and global features. Additionally, by incorporating urban region graphs and global features into a global attention layer, we harness relational information to enhance representation effectiveness. Furthermore, CGAP integrates region attributes such as Points of Interest (POIs) and inter-regional contexts like human mobility, enabling the exploitation of multi-modal urban data for more comprehensive representation learning. Experiments on three downstream tasks related to the UN Sustainable Development Goals validate the effectiveness of region representations learned by our approach. Experimental results and analyses demonstrate that CGAP excels in various socioeconomic prediction tasks compared to competitive baselines.
AI4G8550
Unmasking Societal Biases in Respiratory Support for ICU Patients through Social Determinants of Health
Mira Moukheiber, Lama Moukheiber, Dana Moukheiber, Hyung-Chul Lee
10 min. talk | – at – | Session: –
In critical care settings, where precise and timely interventions are crucial for health outcomes, evaluating disparities in patient outcomes is important. Current approaches often fall short in comprehensively understanding and evaluating the impact of respiratory support interventions on individuals affected by social determinants of health. Attributes such as gender, race, and age are commonly assessed and essential, but provide only a partial view of the complexities faced by diverse populations. In this study, we focus on two clinically motivated tasks: prolonged mechanical ventilation and successful weaning. We also perform fairness audits on the models’ predictions across demographic groups and social determinants of health to better understand the health inequities in respiratory interventions in the intensive care unit. We also release a temporal benchmark dataset, verified by clinical experts, to enable benchmarking of clinical respiratory intervention tasks.
AI4G8559
A Survival Guide for Iranian Women Prescribed by Iranian Women: Participatory AI to Investigate Intimate Partner Physical Violence in Iran
Adel Khorramrouz, Mahbeigom Fayyazi, Ashiqur R. KhudaBukhsh
10 min. talk | – at – | Session: –
Intimate Partner Violence (IPV) is a global problem affecting more than 2 billion women worldwide. Our paper makes two key contributions. First, via a substantial corpus of 53,220 comments to 1,563 Intimate Partner Physical Violence (IPPV) posts gleaned from more than 10 million comments posted on 523,232 posts on a popular parental health website in Iran, we present the first-ever computational analysis of user comments on accounts of IPPV in Iran. We harness large language models and participatory AI and tackle extreme class imbalance and other linguistic challenges that arise from tackling low-resource languages to shed light on the gender struggles of a country with documented stark gender inequality. With active input from a woman with a history of advocacy for social rights and grounded in Iranian culture, we characterize comments on IPPV into three broad categories: empathy, confront, and conform, and analyze their distribution. Second, we release an important dataset of 3,400 comments on IPPV posts.
AI4G8562
Deploying Mobility-On-Demand for All by Optimizing Paratransit Services
Sophie Pavia, David Rogers, Amutheezan Sivagnanam, Michael Wilbur, Danushka Edirimanna, Youngseo Kim, Philip Pugliese, Samitha Samaranayake, Aron Laszka, Ayan Mukhopadhyay, Abhishek Dubey
10 min. talk | – at – | Session: –
While on-demand ride-sharing services have become popular in recent years, traditional on-demand transit services cannot be used by everyone, e.g., people who use wheelchairs. Paratransit services, operated by public transit agencies, are a critical infrastructure that offers door-to-door transportation assistance for individuals who face challenges in using standard transit routes. However, with declining ridership and mounting financial pressure, public transit agencies in the USA struggle to operate existing services. We collaborate with a public transit agency from the southern USA, highlight the specific nuances of paratransit optimization, and present a vehicle routing problem formulation for optimizing paratransit. We validate our approach using real-world data from the transit agency, present results from an actual pilot deployment of the proposed approach in the city, and show how the proposed approach comprehensively outperforms existing approaches used by the transit agency. To the best of our knowledge, this work presents one of the first examples of using open-source algorithmic approaches for paratransit optimization.
AI4G8571
From Pink and Blue to a Rainbow Hue! Defying Gender Bias through Gender Neutralizing Text Transformations
Gopendra Singh, Soumitra Ghosh, Neil Dcruze, Asif Ekbal
10 min. talk | – at – | Session: –
In an era where language biases contribute to societal inequalities, this research focuses on gender bias in textual data, with profound implications for promoting inclusivity and equity, aligning with United Nations Sustainable Development Goals (SDGs) and upholding the principle of Leave No One Behind (LNOB). Leveraging advances in artificial intelligence, the study introduces the GEnder-NEutralizing Text Transformation (GENETT) framework, addressing gender bias in text through auto-encoders, vector quantization, and Neutrality-Infused Stylization. Furthermore, we present the first-of-its-kind corpus of GEnder Neutralized REvisions (GENRE) crafted from gender-stereotyped versions. This corpus serves a multifaceted utility, offering a resource for diverse downstream tasks in gender-bias analysis. Extensive experimentation on GENRE highlights the superiority of the proposed model over established baselines and state-of-the-art methods. Access the code and dataset at 1. https://www.iitp.ac.in/~ai-nlp-ml/resources.html#GNR, 2. https://github.com/Soumitra816/GNR. Note: Our research focuses on understanding cyber harassment conversations, especially in under-researched areas, with the exclusion of non-binary cases due to existing dataset limitations, not lack of sensitivity. We strive for inclusivity and plan to address this in future research with suitable datasets.
AI4G8574
Long-term Detection and Monitory of Chinese Urban Village Using Satellite Imagery
Yuming Lin, Xin Zhang, Yu Liu, Zhenyu Han, Qingmin Liao, Yong Li
10 min. talk | – at – | Session: –
Urban villages are areas filled with rural-like improvised structures in Chinese cities, usually housing the most vulnerable groups. Under the guidance of the Sustainable Development Goals (SDGs), the Chinese government has initiated renewal and redevelopment projects, which underscore the need for meticulous mapping and segmentation of urban villages. Satellite imagery is an advanced and efficient means of identifying urban villages and monitoring changes, but traditional methods neglect the morphological diversity in season, shape, size, spacing, and layout of urban villages, which makes them unsatisfactory for long-term, wide-range data. Here, we design a targeted approach based on Tobler’s First Law of Geography, using curriculum labeling to address morphological diversity and semi-automatically generate segmentations of urban village boundaries. Specifically, we use manually labeled data as seeds for pre-trained SegFormer models and incrementally fine-tune the model based on geographical proximity. Rigorous experiments across five diverse cities substantiate the efficacy of our methodology: the IoU metric shows an improvement of over 119% over the baseline. Our final results cover 265,050 urban villages across 433 cities in China over the past 10 years, and the analysis reveals uneven redevelopment by geography and city scale. We further examine the within-city distribution and verify the urban scaling law associated with several socio-economic factors. Our method can be used nationwide to decide redevelopment priority and resource allocation, contributing to SDG 11.1 on affordable housing and upgrading slums. The code and dataset are available at https://github.com/tsinghua-fib-lab/LtCUV.
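For readers unfamiliar with the IoU (intersection over union) metric cited in this abstract, the following is a minimal sketch of how it is commonly computed for binary segmentation masks. The function names `iou` and `relative_improvement` and the toy masks are illustrative assumptions, not taken from the paper's code, and the abstract does not state exactly how its improvement figure was derived.

```python
def iou(pred, target):
    """Intersection over union of two same-shaped binary masks.

    Masks are nested lists of 0/1 values (hypothetical toy format).
    Returns 1.0 when both masks are empty, by convention.
    """
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    return intersection / union if union else 1.0


def relative_improvement(new, baseline):
    """Relative improvement of `new` over `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


# Toy example: 2 pixels overlap, 4 pixels are covered by either mask.
pred_mask = [[1, 1, 0],
             [0, 1, 0]]
true_mask = [[1, 0, 0],
             [0, 1, 1]]
score = iou(pred_mask, true_mask)
```

Under this reading, an IoU that rises from 0.25 to 0.5 would be a 100% relative improvement; whether the abstract's figure is relative in this sense is an assumption.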
AI4G8576
Guiding Clinical Reasoning with Large Language Models via Knowledge Seeds
Jiageng Wu, Xian Wu, Jie Yang
10 min. talk | – at – | Session: –
Clinical reasoning refers to the cognitive process that physicians employ in evaluating and managing patients. This process typically involves suggesting necessary examinations, diagnosing patients’ diseases, and selecting appropriate therapies. Accurate clinical reasoning requires extensive medical knowledge and rich clinical experience, setting a high bar for physicians. This is particularly challenging in developing countries due to the overwhelming number of patients and limited physician resources, contributing significantly to global health inequity and necessitating automated clinical reasoning approaches. Recently, large language models (LLMs) such as ChatGPT and GPT-4 have demonstrated their potential in clinical reasoning. However, these LLMs are prone to hallucination, and their reasoning process may not align with the clinical decision pathways of physicians. In this study, we introduce a novel framework, In-Context Padding (ICP), to enhance LLM reasoning with medical knowledge. Specifically, we infer critical clinical reasoning elements (referred to as knowledge seeds) and use these as anchors to guide the generation process of LLMs. Experiments on two clinical question datasets validate that ICP significantly improves the clinical reasoning ability of LLMs.
AI4G8582
From Pixels to Progress: Generating Road Network from Satellite Imagery for Socioeconomic Insights in Impoverished Areas
Yanxin Xi, Yu Liu, Zhicheng Liu, Sasu Tarkoma, Pan Hui, Yong Li
10 min. talk | – at – | Session: –
The Sustainable Development Goals (SDGs) aim to resolve societal challenges, such as eradicating poverty and improving the lives of vulnerable populations in impoverished areas. Those areas rely on road infrastructure construction to promote accessibility and economic development. Although public data sources like OpenStreetMap can be used to monitor road status, data completeness in impoverished areas is limited. Meanwhile, the development of deep learning techniques and satellite imagery shows excellent potential for earth monitoring. To tackle the challenge of road network assessment in impoverished areas, we develop a systematic road extraction framework combining an encoder-decoder architecture and morphological operations on satellite imagery, offering an integrated workflow for interdisciplinary researchers. Extensive experiments on road network extraction with real-world data from impoverished regions achieve a 42.7% improvement in F1-score over the baseline methods and reconstruct about 80% of the actual roads. We also propose a comprehensive road network dataset covering an area of approximately 794,178 km² and 17.048 million people in 382 impoverished counties in China. The generated dataset is further used to conduct socioeconomic analysis in impoverished counties, showing that road network construction positively impacts regional economic development. The technical appendix, code, and generated dataset can be found at https://github.com/tsinghua-fib-lab/Road_network_extraction_impoverished_counties.
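As a reminder of what the F1-score in this abstract measures, here is a minimal sketch that derives it from true-positive, false-positive, and false-negative counts. The function name `f1_score` and the example counts are hypothetical; the abstract does not say whether the authors score roads per pixel, per segment, or with a tolerance buffer.

```python
def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall from raw counts.

    tp: predicted road units that match real roads
    fp: predicted road units with no real counterpart
    fn: real road units the model missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: precision and recall both equal 0.8 here.
example_f1 = f1_score(tp=8, fp=2, fn=2)
```

Because F1 balances precision against recall, a model that recovers most real roads but also predicts many spurious ones still receives a low score.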
AI4G8584
LEEC for Judicial Fairness: A Legal Element Extraction Dataset with Extensive Extra-Legal Labels
Zongyue Xue, Huanghai Liu, Yiran Hu, Yuliang Qian, Yajing Wang, Kangle Kong, Chenlu Wang, Yun Liu, Weixing Shen
10 min. talk | – at – | Session: –
An extensive label system is pivotal to facilitating judicial fairness and social justice. Prior empirical research and our interviews with legal professionals underscore the importance of extra-legal factors in criminal trials. To help identify sentencing biases and facilitate downstream applications, we introduce the Legal Element ExtraCtion (LEEC) dataset, comprising 15,919 judicial documents and 155 labels. This dataset was constructed in two main steps: first, legal experts designed the label system based on prior empirical research that identified the critical factors driving, and the processes generating, sentencing outcomes in criminal cases; second, annotators with legal knowledge labeled the judicial documents according to the label system and annotation guideline. LEEC represents the most extensive and domain-specific legal element extraction dataset for the Chinese legal system. Our experiments reveal that, despite certain capabilities, both Document Event Extraction (DEE) models and Large Language Models (LLMs) face significant limitations in legal element extraction tasks. Finally, our empirical analysis based on LEEC provides evidence of judicial unfairness in Chinese criminal sentencing and confirms the applicability of LEEC to future empirical research and other downstream applications. LEEC and related resources are available at https://github.com/THUlawtech/LEEC.
AI4G8601
MuseCL: Predicting Urban Socioeconomic Indicators via Multi-Semantic Contrastive Learning
Xixian Yong, Xiao Zhou
10 min. talk | – at – | Session: –
Predicting socioeconomic indicators within urban regions is crucial for fostering inclusivity, resilience, and sustainability in cities and human settlements. While pioneering studies have attempted to leverage multi-modal data for socioeconomic prediction, jointly exploring their underlying semantics remains a significant challenge. To address this gap, this paper introduces a Multi-Semantic Contrastive Learning (MuseCL) framework for fine-grained urban region profiling and socioeconomic prediction. Within this framework, we first construct contrastive sample pairs for street view and remote sensing images, capitalizing on the similarities in human mobility and Point of Interest (POI) distribution to derive semantic features from the visual modality. Additionally, we extract semantic insights from POI texts embedded within these regions, employing a pre-trained text encoder. To merge the acquired visual and textual features, we devise a cross-modality-based attentional fusion module, which leverages a contrastive mechanism for integration. Experimental results across multiple cities and indicators consistently highlight the superiority of MuseCL, demonstrating an average improvement of 10% in R² compared to various competitive baseline models. The code of this work is publicly available at https://github.com/XixianYong/MuseCL.
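The coefficient of determination used as the evaluation metric above can be sketched as follows. This is the standard textbook definition, not code from the MuseCL repository; the function name `r_squared` and the sample values are assumptions for illustration only.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    y_true: observed indicator values, one per region
    y_pred: model predictions for the same regions
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


# Hypothetical indicator values for four regions.
observed = [1.0, 2.0, 3.0, 4.0]
perfect = r_squared(observed, [1.0, 2.0, 3.0, 4.0])    # exact predictions
mean_only = r_squared(observed, [2.5, 2.5, 2.5, 2.5])  # always predict the mean
```

A model that always predicts the mean scores 0 and a perfect model scores 1, which is why gains in this metric are read as the model explaining more of the variation between regions.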
AI4G8829
CDSTraj: Characterized Diffusion and Spatial-Temporal Interaction Network for Trajectory Prediction in Autonomous Driving
Haicheng Liao, Xuelin Li, Yongkang Li, Hanlin Kong, Chengyue Wang, Bonan Wang, Yanchen Guan, KaHou Tam, Zhenning Li
10 min. talk | – at – | Session: –
Trajectory prediction is a cornerstone in autonomous driving (AD), playing a critical role in enabling vehicles to navigate safely and efficiently in dynamic environments. To address this task, this paper presents a novel trajectory prediction model tailored for accuracy in the face of heterogeneous and uncertain traffic scenarios. At the heart of this model lies the Characterized Diffusion Module, an innovative module designed to simulate traffic scenarios with inherent uncertainty. This module enriches the predictive process by infusing it with detailed semantic information, thereby enhancing trajectory prediction accuracy. Complementing this, our Spatio-Temporal (ST) Interaction Module captures the nuanced effects of traffic scenarios on vehicle dynamics across both spatial and temporal dimensions with remarkable effectiveness. Demonstrated through exhaustive evaluations, our model sets a new standard in trajectory prediction, achieving state-of-the-art (SOTA) results on the Next Generation Simulation (NGSIM), Highway Drone (HighD), and Macao Connected Autonomous Driving (MoCAD) datasets across both short and extended temporal spans. This performance underscores the model’s unparalleled adaptability and efficacy in navigating complex traffic scenarios, including highways, urban streets, and intersections.