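Cite this article: ZHANG Ming-xi, ZHANG Lei-hong, LYU Wei, SUN Liu-jie. Similarity Search over Print-on-demand Platform[J]. Packaging Engineering, 2015, 36(23): 135-139.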
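Research on Similarity Search in Print-on-Demand Platforms
ZHANG Ming-xi, ZHANG Lei-hong, LYU Wei, SUN Liu-jie
Shanghai University of Science and Technology, Shanghai 200093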
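Abstract:
Objective: To study the efficiency of similarity search in print-on-demand (POD) platforms. Methods: A "user-product" relation is built from the "purchase" relationships between users and products, and POD-Rank, an efficient similarity search method based on P-Rank, is proposed for finding similar products in the "user-product" relation. POD-Rank computes user similarity offline from the "user-product" relation and then uses user similarity to compute product similarity online. An optimized online query processing algorithm is further proposed to reduce the time cost of query processing. Results: The computation time cost and storage cost of POD-Rank are significantly lower than those of P-Rank, and POD-Rank responds to query requests quickly. Conclusion: The similarity computation cost of POD-Rank is 0.03% of that of P-Rank and its storage cost is 0.06% of that of P-Rank, while its effectiveness is close to that of P-Rank, so it can meet the needs of large-scale product data processing in POD platforms.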
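Key words:  print-on-demand  P-Rank  similarity search  "user-product" relation graph
DOI:
Classification number: TS801.8
Funding: Scientific Research Innovation Project of the Shanghai Municipal Education Commission (15ZZ074); Shanghai Funding Program for the Training of Young University Teachers (ZZSLG14021); Bid-invited Project of the Shanghai Publishing and Media Research Institute (SAYB1410); Doctoral Start-up Fund of Shanghai University of Science and Technology (1D-14-309-001)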
Similarity Search over Print-on-demand Platform
ZHANG Ming-xi, ZHANG Lei-hong, LYU Wei, SUN Liu-jie
Shanghai University of Science and Technology, Shanghai 200093, China
Abstract:
The aim of this work was to study the efficiency of similarity search over Print-On-Demand (POD) platforms. A "user-product" relation graph was built from the purchasing relationships between users and products, and the similarity between products was measured according to the structure of this graph. To improve efficiency, we proposed a similarity search method, POD-Rank, which divides the computation into two steps. In the first step, the similarity between users is computed offline. In the second step, the similarity between the query and each candidate product is computed online, based on the user similarity. To further reduce the response time of online query processing, we proposed an optimized online query processing algorithm that skips unnecessary accumulation operations on zero values. The space cost and pre-computation time cost of POD-Rank were markedly lower than those of P-Rank, with little loss of effectiveness and a short online query time. With the two-step similarity computation, the computation time cost was only 0.03% of that of P-Rank and the size of the similarity matrix was only 0.06% of that of P-Rank, while the effectiveness was close to that of P-Rank. The method can therefore be applied efficiently to the processing of large datasets on POD platforms.
Key words:  Print-On-Demand  P-Rank  similarity search  "user-product" relation graph
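
The two-step scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' POD-Rank implementation: the abstract does not give the scoring formula, so the sketch assumes a SimRank-style decayed average of user similarities over pairs of buyers. The function names (`build_buyers`, `product_similarity`, `search`), the decay value 0.8, and the toy user-similarity values are all hypothetical. The offline step is represented only by a precomputed sparse table `user_sim` that stores non-zero similarities; the online step walks those stored entries, so zero-valued pairs never enter the accumulation.

```python
from collections import defaultdict


def build_buyers(purchases):
    """Map each product to the set of users who purchased it."""
    buyers = defaultdict(set)
    for user, product in purchases:
        buyers[product].add(user)
    return buyers


def product_similarity(query, candidate, buyers, user_sim, decay=0.8):
    """Online step (assumed formula): decayed average of user similarities
    over all (buyer of query, buyer of candidate) pairs."""
    if query == candidate:
        return 1.0
    bq = buyers.get(query, set())
    bc = buyers.get(candidate, set())
    if not bq or not bc:
        return 0.0
    total = 0.0
    for u in bq:
        if u in bc:
            total += 1.0  # a user is fully similar to itself
        # user_sim holds only non-zero off-diagonal entries, so pairs with
        # zero similarity are skipped instead of being accumulated.
        for v, s in user_sim.get(u, {}).items():
            if v in bc:
                total += s
    return decay * total / (len(bq) * len(bc))


def search(query, buyers, user_sim, top_k=5):
    """Rank all other products by their similarity to the query product."""
    scores = [
        (p, product_similarity(query, p, buyers, user_sim))
        for p in buyers
        if p != query
    ]
    scores.sort(key=lambda item: item[1], reverse=True)
    return scores[:top_k]


# Toy "user-product" purchase relation (hypothetical data).
purchases = [
    ("u1", "mug"), ("u2", "mug"),
    ("u2", "tshirt"), ("u3", "tshirt"),
    ("u3", "poster"),
]
buyers = build_buyers(purchases)

# Stand-in for the offline step: sparse user similarities (hypothetical values).
user_sim = {
    "u1": {"u2": 0.4},
    "u2": {"u1": 0.4, "u3": 0.4},
    "u3": {"u2": 0.4},
}

results = search("mug", buyers, user_sim)
for product, score in results:
    print(f"{product}: {score:.2f}")
```

In this toy example the query "mug" ranks "tshirt" ahead of "poster", because the two products share the buyer u2. Storing only non-zero user similarities also keeps the offline table small when most user pairs are unrelated, which matches the motivation the abstract gives for skipping zero values.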
