Understanding temporal and spatial travel patterns of individual passengers by mining smart card data
Question 1: What is the temporal access pattern?
Question 2: What is the spatial access pattern?
Question 3: Is there any relationship between the temporal and spatial patterns?
Question 4: Are this passenger's patterns normal or special?
(How can the temporal and spatial aspects be represented together? For every trip a cardholder makes, time and space cannot be separated: time alone is not enough, and neither is space alone, so the two have to be represented at the same time.)
Benefits:
- policy evaluation
- anomaly detection (e.g. beggars: special passengers)
- social networking (scalable processing: connecting passengers with similar public transportation patterns)
Contributions:
- a systematic approach: extract temporal and spatial features, then use spatio-temporal analysis to perform anomaly detection
- an in-depth analysis and explanations for different groups
Three papers by Morency are similar to this one (already downloaded).
Dataset: one month (21 weekdays) of metro or bus transactions
Data preprocessing:
- find all trips that belong to one passenger
- filter out the passengers who rarely take the metro. Plot the distribution of the number of passengers by number of active days: 80% of passengers are active on fewer than 7 weekdays, and the most active 20% account for 68% of the transactions. Studying people who rarely travel is not meaningful, so passengers with fewer than 6 active weekdays are removed.
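
The active-day filter can be sketched as below. This is only an illustration: the `(card_id, day)` record layout and the function names are assumptions, not the paper's actual data format.

```python
from collections import defaultdict
from datetime import date


def active_days(transactions):
    """Map each card ID to the set of distinct dates on which it was used."""
    days = defaultdict(set)
    for card_id, day in transactions:
        days[card_id].add(day)
    return days


def filter_frequent(transactions, min_days=6):
    """Keep only the transactions of cards active on at least min_days days."""
    days = active_days(transactions)
    keep = {card for card, d in days.items() if len(d) >= min_days}
    return [t for t in transactions if t[0] in keep]


# Hypothetical sample: card "A" is active on 6 days, card "B" on 1 day.
sample = [("A", date(2024, 1, d)) for d in range(1, 7)]
sample += [("B", date(2024, 1, 1))] * 3
frequent = filter_frequent(sample)  # only the six "A" records remain
```

The threshold of 6 follows the note above; it is a parameter so that other cut-offs can be tried against the active-day distribution.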
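Temporal feature extraction: describe the temporal attributes with an n-dimensional vector
- n should be neither too large nor too small
- the central idea of temporal feature extraction is to divide time into sequential, overlapping slots
- reasons for this choice: first, non-overlapping slots make some trips hard to represent; second, very few trips last more than three hours, so the slot length is set to 3 hours (8:00-10:59, 9:00-11:59, etc.)
- the spatio-temporal features are extracted in three steps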
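
A minimal sketch of the overlapping-slot idea, assuming slots start on every hour and a trip counts towards a slot only when it lies entirely inside it. The three extraction steps are not spelled out in these notes, so the containment rule, the slot range, and the function names are illustrative assumptions.

```python
def make_slots(first_hour, last_hour, length_hours=3):
    """Return (start_min, end_min) slots starting on every hour; end is exclusive."""
    return [(h * 60, (h + length_hours) * 60)
            for h in range(first_hour, last_hour + 1)]


def temporal_features(trips, slots):
    """trips: list of (depart_min, arrive_min) in minutes since midnight.

    Returns, for each slot, the fraction of trips lying entirely inside it.
    """
    if not trips:
        return [0.0] * len(slots)
    counts = [0] * len(slots)
    for depart, arrive in trips:
        for i, (start, end) in enumerate(slots):
            if start <= depart and arrive < end:
                counts[i] += 1
    return [c / len(trips) for c in counts]


# Two slots: 8:00-10:59 and 9:00-11:59.
slots = make_slots(8, 9)
# One trip 8:30-9:10 and one trip 10:30-11:15 (hypothetical values).
features = temporal_features([(510, 550), (630, 675)], slots)
```

The second trip (10:30-11:15) would straddle the boundary of non-overlapping 8:00-10:59 and 11:00-13:59 slots, but it fits inside the overlapping 9:00-11:59 slot, which is the motivation given above.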
Spatial feature extraction:
- OD matrix: sort the OD pairs by descending frequency; the number of spatial features is set to 4
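
One way to read the spatial feature is as the trip shares of a passenger's 4 most frequent OD pairs, in descending order. The sketch below assumes that reading, treats OD pairs as directed, and pads with zeros when a passenger has fewer than 4 distinct pairs; none of these details are confirmed by the notes.

```python
from collections import Counter


def spatial_features(trips, n=4):
    """trips: list of (origin, destination) pairs for one passenger.

    Returns the shares of the n most frequent OD pairs, in descending
    order, zero-padded to length n.
    """
    if not trips:
        return [0.0] * n
    counts = Counter(trips)
    shares = [c / len(trips) for _, c in counts.most_common(n)]
    return shares + [0.0] * (n - len(shares))


# Hypothetical passenger: 5 trips A->B, 4 trips B->A, 1 trip A->C.
od_trips = [("A", "B")] * 5 + [("B", "A")] * 4 + [("A", "C")]
od_features = spatial_features(od_trips)
```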
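Anomaly feature extraction:
- trips that take longer than other trips with the same OD pair: probability W; trips whose origin and destination are the same: probability P
- need to identify the passengers for whom these two anomalies occur frequently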
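
A rough sketch of computing W and P for one passenger. The notes do not say how much longer a trip must be to count as abnormal, so the rule used here (more than `factor` times the median duration over all trips with the same OD pair) and the default `factor=2.0` are assumptions made purely for illustration.

```python
from collections import defaultdict
from statistics import median


def typical_durations(all_trips):
    """Median duration per OD pair over all (origin, destination, minutes) trips."""
    by_od = defaultdict(list)
    for origin, dest, minutes in all_trips:
        by_od[(origin, dest)].append(minutes)
    return {od: median(ts) for od, ts in by_od.items()}


def anomaly_features(trips, typical, factor=2.0):
    """Return (W, P) for one passenger's (origin, destination, minutes) trips.

    W: share of trips slower than factor * typical duration (assumed rule).
    P: share of trips whose origin equals their destination.
    """
    if not trips:
        return 0.0, 0.0
    slow = sum(1 for o, d, t in trips
               if (o, d) in typical and t > factor * typical[(o, d)])
    same = sum(1 for o, d, _ in trips if o == d)
    return slow / len(trips), same / len(trips)


# Hypothetical data: the typical A->B duration is the median of 20, 21, 22, 90.
everyone = [("A", "B", 20), ("A", "B", 22), ("A", "B", 21),
            ("A", "B", 90), ("A", "A", 60)]
typical = typical_durations(everyone)
w, p = anomaly_features(
    [("A", "B", 90), ("A", "B", 20), ("A", "A", 60), ("A", "A", 60)], typical)
```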
Temporal analysis:
- Clustering: k-means divides the passengers into four groups by their temporal features:
- TGrp1: one dominant travel slot
- TGrp2: two dominant travel slots
- TGrp3: one relatively dominant travel slot and one general travel slot
- TGrp4: no significant difference
- after analysing these, cluster the bus passengers in the same way: BTGrp1-4
- combine TGrp and BTGrp to analyse passenger behaviour
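
For reference, a small self-contained k-means (Lloyd's algorithm) over feature vectors. The farthest-point initialisation is a choice made here so the example is deterministic; the notes do not say which initialisation or library was used, and the tiny example uses k=2 rather than the four groups above.

```python
def kmeans(points, k, iters=100):
    """Cluster points (tuples of floats) into k groups with Lloyd's algorithm.

    Initial centroids: the first point, then repeatedly the point farthest
    from all centroids chosen so far (an assumption, for determinism).
    Returns (labels, centroids).
    """
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(dist2(p, c) for c in centroids))
        centroids.append(list(far))

    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda j: dist2(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical 2-slot temporal features: two passengers with one dominant
# slot, two passengers whose trips are split across both slots.
vectors = [(1.0, 0.0), (0.9, 0.1), (0.5, 0.5), (0.45, 0.55)]
labels, centroids = kmeans(vectors, k=2)
```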
Spatial analysis:
- k-means clustering divides the passengers into four groups:
- SGrp1: only one frequently accessed OD pair
- SGrp2: two frequently accessed OD pairs
- SGrp3: one relatively frequently accessed OD pair and one generally accessed OD pair
- SGrp4: no remarkably frequently accessed OD pair
- relationship between SGrp and TGrp: measured with conditional probabilities, which turn out to be high
- explain why some people choose the metro for one trip and the bus for another, instead of taking the metro for the round trip
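
The conditional-probability comparison between temporal and spatial groups can be sketched as below, assuming each passenger has already been given one TGrp label and one SGrp label. The function name and the sample labels are hypothetical.

```python
from collections import Counter


def conditional_probs(pairs):
    """pairs: list of (tgrp, sgrp) labels, one per passenger.

    Returns {(tgrp, sgrp): P(sgrp | tgrp)} for every combination observed.
    """
    joint = Counter(pairs)
    marginal = Counter(t for t, _ in pairs)
    return {(t, s): c / marginal[t] for (t, s), c in joint.items()}


# Hypothetical labels for six passengers.
group_pairs = ([("TGrp1", "SGrp1")] * 3 + [("TGrp1", "SGrp2")]
               + [("TGrp2", "SGrp2")] * 2)
probs = conditional_probs(group_pairs)
```

A high value such as P(SGrp1 | TGrp1) would indicate that passengers with one dominant travel slot also tend to have one dominant OD pair.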
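Anomaly analysis:
- W: the ratio of trips with abnormal travel time
- P: the ratio of abnormal OD pairs for a passenger
- remove the passengers whose W and P are below 40%, plot the rest as a two-dimensional scatter plot of W against P, and obtain several classes of anomalies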
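
A sketch of the 40% filter, assuming a passenger is dropped only when both W and P fall below the threshold. The three tags are a hypothetical labelling of regions of the W-P plane, not the anomaly classes found in the paper.

```python
def anomaly_candidates(ratios, threshold=0.4):
    """ratios: dict mapping passenger ID to (W, P).

    Drops passengers with both ratios below threshold and tags the rest
    by which ratio is high (illustrative tags only).
    """
    result = {}
    for pid, (w, p) in ratios.items():
        if w < threshold and p < threshold:
            continue  # neither anomaly is frequent for this passenger
        if w >= threshold and p >= threshold:
            result[pid] = "both"
        elif w >= threshold:
            result[pid] = "long travel time"
        else:
            result[pid] = "same origin and destination"
    return result


# Hypothetical (W, P) values for four passengers.
candidates = anomaly_candidates({
    "p1": (0.1, 0.2),
    "p2": (0.6, 0.1),
    "p3": (0.0, 0.9),
    "p4": (0.5, 0.5),
})
```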