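YANG Shiwang, ZHONG Yuxing, YANG Chunchen, SUN Mingduo, SHI Huanjian. Study on massive data incremental processing method based on distributed parallel computing framework[J]. Power Demand Side Management, 2019, 21(S1): 04-06 |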
Study on massive data incremental processing method based on distributed parallel computing framework |
Received: 2019-07-24 Revised: 2019-08-15 |
DOI: 10.3969/j.issn.1009-1831.2019.S1.002 |
Keywords: data warehouse; distributed parallel computing framework; incremental data processing |
Funding: |
|
Abstract: |
With the arrival of the big data era, enterprises have been building data warehouses one after another, and improving the efficiency of massive data processing has become an important problem in data warehouse applications. Based on an analysis of the current state of an electric power enterprise's data sharing application platform, this paper proposes an incremental data processing method built on a distributed parallel computing framework and applies the method to massive incremental data processing in practice. The verification results show that the method improves data processing efficiency and enhances the accuracy of data processing in the enterprise data warehouse. |