madlib | 易学教程

madlib

GPDB: Features in Practice


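date: 2020-01-11 15:51:39 A while ago my advisor asked me to get familiar with the Greenplum database. I installed it and tried it out, but it did not feel any different from other databases, so I let the matter drop. Now I am going through GPDB's features again and trying to actually use them. In fact, the features listed on the official promotional page are exactly what I need to understand. Features: First of all, what are GPDB's features, and where can they be found? The product's promotional page is sure to list them. Biggest feature: GPDB's biggest feature is MPP, that is, Massively Parallel Processing. On the home page, the two most prominent terms are "Massively Parallel" and "Analytics". Scrolling down the page, two obvious features are: Power at scale: High performance on petabyte-scale data volumes. True Flexibility: Deploy anywhere. Main features: Scrolling further down, the home page spells out GPDB's main features: MPP Architecture. Petabyte-Scale Loading: load speed increases with each additional node, exceeding 10 Tb/h per rack (about 347.22 MB/s if "Tb" means terabits, or about 2.78 GB/s if it means terabytes; not 347.22 GB/s). Innovative Query Optimization: the industry's first big data workload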
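As a sanity check on the per-rack load rate quoted in the excerpt above, here is a minimal sketch of the unit conversion. It assumes decimal units (10^12) and tries both readings of "Tb", since the excerpt does not say whether terabits or terabytes are meant. The function and constant names are illustrative only.

```python
# Convert a load rate quoted per hour into a per-second rate.
SECONDS_PER_HOUR = 3600

def per_hour_to_bytes_per_second(amount, unit_bytes):
    """amount: quantity loaded per hour; unit_bytes: size of one unit in bytes."""
    return amount * unit_bytes / SECONDS_PER_HOUR

# Assumption: decimal (SI) units, 1 tera = 10**12.
TERABIT_IN_BYTES = 10**12 / 8   # one terabit expressed in bytes
TERABYTE_IN_BYTES = 10**12      # one terabyte in bytes

as_terabits = per_hour_to_bytes_per_second(10, TERABIT_IN_BYTES)
as_terabytes = per_hour_to_bytes_per_second(10, TERABYTE_IN_BYTES)

print(f"10 terabits/hour  = {as_terabits / 10**6:.2f} MB/s")
print(f"10 terabytes/hour = {as_terabytes / 10**9:.2f} GB/s")
```

Under either reading the result is well below hundreds of gigabytes per second, so the digits 347.22 only fit a rate in MB/s with "Tb" read as terabits.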

How to unnest a 2d array into a 1d array quickly in PostgreSQL?


Question: I have a really large array that I computed with Apache MADlib, and I would like to apply an operation to each 1D array in that 2D array. I found code in a related answer that can help me unnest it. However, that code is miserably slow on this really large 2D array (150,000+ 1D float arrays). While unnest() takes only a few seconds to run, the code had still not completed after several minutes of waiting. Surely, there must be a faster way to unnest the large 2D array into
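To make the distinction in the question concrete: PostgreSQL's unnest() applied to a multidimensional array yields the individual scalar elements, not the inner 1D arrays. The sketch below models that difference in plain Python. It is not PostgreSQL or MADlib code and not the approach from the related answer; `flatten` and `split_rows` are hypothetical helper names, and it assumes every inner array has the same length.

```python
def flatten(matrix):
    """Collapse a 2D list into one flat list of scalars (models full unnesting)."""
    return [value for row in matrix for value in row]

def split_rows(flat, row_length):
    """Regroup a flat list into consecutive rows of row_length items each."""
    if row_length <= 0 or len(flat) % row_length != 0:
        raise ValueError("flat length must be a multiple of a positive row_length")
    # Slice the flat list at fixed strides; each slice is one original row.
    return [flat[i:i + row_length] for i in range(0, len(flat), row_length)]

matrix = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
flat = flatten(matrix)       # scalars only, row boundaries are lost
rows = split_rows(flat, 3)   # row boundaries recovered from the known row length

print(flat)
print(rows)
```

The point of the model is that a flat element stream plus a known row length is enough to recover the 1D arrays in a single pass over the data.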
