Associate Professor, Zhejiang University
Email: yangya {at} zju [dot] edu [dot] cn
Office: Room 415, CGB Building, Yuquan Campus
I am an associate professor of Computer Science and Technology at Zhejiang University, serving as dean of Artificial Intelligence. I am also a scientific advisor at FinVolution Group. My research interests include artificial intelligence in networks, deep learning for large-scale dynamic time series, and computational social science. I obtained my Ph.D. degree from Tsinghua University in 2016, where I was fortunate to be advised by Jie Tang and Juanzi Li. During my Ph.D. studies, I visited Cornell University (working with John Hopcroft) in 2012 and the University of Leuven (working with Marie-Francine Moens) in 2013. I also have Yizhou Sun from UCLA as my research advisor. Here is my CV.
I am looking for highly motivated students to work with me. If interested, please drop me a message by email.
Time series modeling has attracted extensive research efforts; however, achieving both reliable efficiency and interpretability within a unified model remains a challenging problem.
Our recent work proposes to model time series from the perspective of graphs. More specifically, we aim to capture the intrinsic factors behind a time series and the transitions between them, and to describe how these factors affect the evolution of the time series. To achieve this, we propose a shapelet-based method (Time2Graph, Cheng et al., AAAI'20; Time2Graph+, Cheng et al., TKDE'21) and a model based on dynamic graph neural networks (EvoNet, Hu et al., WSDM'21). Our proposed methods not only achieve clear improvements over state-of-the-art baselines in many tasks, but also provide valuable insights for explaining the prediction results.
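The graph view described above can be illustrated with a toy sketch. It is not the Time2Graph or EvoNet implementation: the fixed-length segmentation, the hand-written `patterns` list (standing in for learned shapelets or states), and all function names are assumptions made purely for illustration. Each segment is matched to its nearest pattern, and transitions between the patterns of adjacent segments become weighted directed edges.

```python
from collections import Counter


def segment(series, length):
    """Split a series into consecutive, non-overlapping segments of equal length."""
    return [series[i:i + length] for i in range(0, len(series) - length + 1, length)]


def nearest_pattern(seg, patterns):
    """Return the index of the pattern closest to seg (squared Euclidean distance)."""
    def dist(p):
        return sum((a - b) ** 2 for a, b in zip(seg, p))
    return min(range(len(patterns)), key=lambda k: dist(patterns[k]))


def transition_graph(series, patterns, length):
    """Map each segment to a pattern index, then count transitions between neighbors."""
    states = [nearest_pattern(s, patterns) for s in segment(series, length)]
    # Each (source, target) pair of adjacent states is a directed edge; the count is its weight.
    edges = dict(Counter(zip(states, states[1:])))
    return states, edges


# Hypothetical patterns: flat-low, spike, flat-high.
patterns = [[0, 0, 0], [0, 5, 0], [5, 5, 5]]
series = [0, 0, 0, 0, 5, 0, 5, 5, 5, 5, 5, 5, 0, 0, 0]

states, edges = transition_graph(series, patterns, 3)
print(states)  # [0, 1, 2, 2, 0]
print(edges)   # {(0, 1): 1, (1, 2): 1, (2, 2): 1, (2, 0): 1}
```

The edge weights give a readable summary of how the series moves between patterns, which is the kind of interpretability the graph perspective is meant to provide.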
Our work has been applied in real-world scenarios, such as network traffic anomaly monitoring, offered as a common service of Alicloud, and electricity-theft behavior detection (Hu et al., WWW'20), in collaboration with Alibaba and State Grid Corporation of China.
Related papers: (Cheng et al., TKDE'21), (Cheng et al., AAAI'20), (Hu et al., WSDM'21), (Hu et al., WWW'20)
Related codes: [Time2Graph], [Time2Graph+], [EvoNet]
Real-world network data tends to be error-prone due to incomplete sampling or imperfect measurements. This in turn leads to inaccurate results when network analysis or modeling tasks, such as node classification and link prediction, are performed on these flawed networks.
Our research aims to reconstruct a reliable network from a flawed one, a process referred to as network enhancement. More specifically, network enhancement aims to detect the noisy links that are observed in the network but should not exist in the real world, as well as to complement the missing links that do exist in the real world yet remain unobserved.
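The two kinds of errors defined above can be written down as set differences. The sketch below assumes an undirected network and, unrealistically, access to the true edge set; in practice the true network is unknown, which is exactly what makes the problem hard, so this form is only useful for stating the goal or for evaluating a method on synthetic data. All names are hypothetical.

```python
def normalize(edges):
    """Store undirected edges as sorted tuples so that (a, b) and (b, a) coincide."""
    return {tuple(sorted(e)) for e in edges}


def enhancement_targets(observed, true):
    """Return (noisy, missing) links relative to a known ground-truth network."""
    observed, true = normalize(observed), normalize(true)
    noisy = observed - true      # observed, but should not exist
    missing = true - observed    # exist in reality, but unobserved
    return noisy, missing


def enhance(observed, noisy, missing):
    """Remove the noisy links and add the missing ones."""
    return (normalize(observed) - noisy) | missing


observed = [(1, 2), (2, 3), (1, 4)]
true = [(2, 1), (2, 3), (3, 4)]

noisy, missing = enhancement_targets(observed, true)
print(noisy)    # {(1, 4)}
print(missing)  # {(3, 4)}
print(enhance(observed, noisy, missing) == normalize(true))  # True
```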
From one perspective, we turn the network enhancement problem into edge sequence generation and employ a deep reinforcement learning framework to solve it, which uses a downstream task to guide the network denoising process (NetRL, Xu et al., TKDE'21). From another perspective, we construct a self-supervised learning framework that identifies missing links and noisy links simultaneously by leveraging their mutual influence (E-Net, Xu et al., TKDE'20). Moreover, we study model robustness against adversarial attacks. Our work shows that even without any information about the target model, one can still perform effective attacks (Xu et al., AAAI'22a). To handle such perturbations, we further propose an unsupervised defense technique to robustify pre-trained deep graph models (Xu et al., AAAI'22b).
Related papers: (Xu et al., TKDE'20), (Xu et al., TKDE'21), (Xu et al., AAAI'22a), (Xu et al., AAAI'22b).
Related codes & benchmark: [NetRL], [E-Net], [Graph Robustness Benchmark]
The goal is to understand and detect abnormal vertices (e.g., users with anomalous behaviors) in large-scale social and information networks. Our work has been applied in scenarios such as telecommunications and finance.
In the telecommunications field, we propose to spot telemarketing frauds, with an emphasis on unveiling the "precise fraud" phenomenon and the strategies that fraudsters use to precisely select targets (Yang et al., TKDE'19). Our study is conducted on a one-month complete dataset of telecommunication metadata in Shanghai with 54 million anonymous users and 698 million call logs.
In the financial field, we unearth the correlation between users' anomalous behaviors and their communication network structure in an online lending platform. Moreover, we propose a novel problem: how to identify multi-type fraudsters (Yang et al., CIKM'19)? Our proposed framework can uniformly identify two types of fraudsters: default borrowers, who default on a loan to the platform, and cheating agents, who recruit and teach borrowers to cheat by providing false information and faking application materials.
Related papers: (Yang et al., TKDE'19), (Yang et al., CIKM'19)
Graph embedding, also known as network representation learning, aims to learn low-dimensional representations of the vertices in a network while preserving the structure and inherent properties of the graph.
Our research mainly focuses on learning representations for social networks. Compared with other networks, social networks have unique properties. For example, social networks are dynamic and evolve over time as a result of user interactions and unstable user relations. We study how to preserve both the structural and the temporal information of a given social network by modeling the triadic closure process (Zhou et al., AAAI'18). In particular, the general idea is to build on the triad, a group of three vertices that is one of the basic units of networks. We model how a closed triad, in which all three vertices are connected to each other, develops from an open triad, in which two of the three vertices are not connected to each other. This triadic closure process is a fundamental mechanism in the formation and evolution of networks, and it enables our model to capture the network dynamics and to learn a representation vector for each vertex at different time steps.
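The notions of open and closed triads can be made concrete with a small sketch. It only enumerates triads in two snapshots of an undirected network and reports which open triads closed in between; it does not learn any embedding and is not the DynamicTriad code. The function names and the example snapshots are assumptions made for illustration.

```python
from itertools import combinations


def adjacency(edges):
    """Build an undirected adjacency map from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def triads(edges):
    """Return (open_triads, closed_triads), each a set of frozensets of three vertices."""
    adj = adjacency(edges)
    open_t, closed_t = set(), set()
    for center, nbrs in adj.items():
        # Every pair of neighbors of `center` forms a triad with it.
        for a, b in combinations(sorted(nbrs), 2):
            triple = frozenset((center, a, b))
            if b in adj[a]:
                closed_t.add(triple)   # all three edges are present
            else:
                open_t.add(triple)     # the edge between a and b is absent
    return open_t, closed_t


snapshot_t = [(1, 2), (2, 3), (3, 4)]
snapshot_t1 = snapshot_t + [(1, 3)]   # a new link appears at the next time step

open_t, _ = triads(snapshot_t)
_, closed_t1 = triads(snapshot_t1)

# Triads that were open at time t and are closed at time t + 1.
newly_closed = open_t & closed_t1
print(newly_closed)  # {frozenset({1, 2, 3})}
```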
Besides, social networks are scale-free: the vertex degrees of a social network follow a heavy-tailed distribution. Is it possible to reconstruct a scale-free network from the learned vertex embeddings? We first theoretically analyze the difficulty of embedding and reconstructing a scale-free network in Euclidean space by converting our problem to the sphere packing problem. Then, we propose the "degree penalty" principle for designing network embedding algorithms that preserve the scale-free property: penalizing the proximity between high-degree vertices. We introduce two implementations of this principle, one based on spectral techniques and one based on a skip-gram model (Feng et al., AAAI'18).
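One simple way to express the degree penalty idea in code is to down-weight each link by the degrees of its endpoints. The weighting below, 1 / (d_u * d_v)^beta with a hypothetical exponent `beta`, is an assumption chosen for illustration and is not necessarily the formulation used in DP-Spectral or the skip-gram variant; it only shows how proximity between high-degree vertices can be penalized.

```python
from collections import Counter


def penalized_proximity(edges, beta=1.0):
    """Weight each undirected edge by 1 / (deg(u) * deg(v)) ** beta.

    beta is a hypothetical knob: beta = 0 gives every edge weight 1,
    larger values penalize links between high-degree vertices more strongly.
    """
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return {(u, v): 1.0 / (deg[u] * deg[v]) ** beta for u, v in edges}


# Vertex 0 is a hub (degree 3); vertex 3 is a leaf (degree 1).
edges = [(0, 1), (0, 2), (0, 3), (1, 2)]
weights = penalized_proximity(edges)

for edge, w in weights.items():
    print(edge, round(w, 3))
```

In this example the hub-to-leaf link (0, 3) keeps a larger weight than the links from the hub to the degree-2 vertices, so an embedding trained on these weights would pull high-degree vertices together less strongly.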
Related papers: (Zhou et al., AAAI'18), (Feng et al., AAAI'18), (Gu et al., WWW'18)
Related codes and data: [DynamicTriad] [DP-Spectral]