|
|
Project Overview
In recent years, there has been tremendous growth in the data
warehousing market. Despite the sophistication and maturity of
conventional database technologies, the ever-increasing size of
corporate databases, coupled with the emergence of the new global
Internet "database", suggests that new computing models may soon
be required to fully support many crucial data management tasks.
In particular, the exploitation of parallel algorithms and
architectures holds considerable promise, given their inherent
capacity for both concurrent computation and data access.
We are interested in the design of parallel data mining and OLAP
algorithms and their implementation on coarse grained parallel
multicomputers and PC-based clusters. To date we have been focusing on
parallelization of data cube methods. Data cube queries represent an
important class of On-Line Analytical Processing (OLAP) queries in
decision support systems. The precomputation of the different group-bys of
a data cube (i.e., the forming of aggregates for every combination of
GROUP BY attributes) is critical to improving the response time of the
queries [Gray et al. 1997]. The resulting data structures can then be
used to dramatically accelerate visualization and query tasks associated
with large information sets.
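As a rough illustration of what "forming aggregates for every combination of GROUP BY attributes" means, the following is a minimal serial sketch in Python. It is not the parallel method developed in this project; the function name `data_cube`, the SUM aggregate, and the toy `sales` table are assumptions made purely for the example. For d dimensions it computes all 2^d group-bys by a naive scan of the input for each one.

```python
from collections import defaultdict
from itertools import combinations


def data_cube(rows, dims, measure):
    """Naively compute SUM(measure) for every subset of the dimensions.

    Returns a dict mapping each group-by (a tuple of dimension names)
    to a dict from key tuples to aggregated totals.
    """
    cube = {}
    for k in range(len(dims) + 1):
        for group in combinations(dims, k):
            totals = defaultdict(int)
            for row in rows:
                # The key is the row's values on the chosen dimensions.
                key = tuple(row[d] for d in group)
                totals[key] += row[measure]
            cube[group] = dict(totals)
    return cube


# Hypothetical toy fact table with three dimensions and one measure.
sales = [
    {"store": "A", "product": "x", "year": 2001, "units": 3},
    {"store": "A", "product": "y", "year": 2001, "units": 5},
    {"store": "B", "product": "x", "year": 2002, "units": 2},
    {"store": "B", "product": "x", "year": 2001, "units": 4},
]

cube = data_cube(sales, ["store", "product", "year"], "units")

# The empty group-by holds the grand total; the others hold finer views.
print(len(cube))                  # number of group-bys (2^3)
print(cube[()])                   # grand total
print(cube[("store",)])           # units per store
print(cube[("store", "product")])
```

Each group-by here rescans the raw data, which is the cost that precomputation strategies try to avoid, for example by deriving coarser group-bys from finer ones or by distributing the work across processors.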
|
| |
Project Team
This project represents ongoing joint work with a number of researchers
including Frank Dehne (Carleton, Canada) and Susanne Hambrusch (Purdue,
USA). Todd Eavis, a Ph.D. student at Dalhousie, has also played a major
role. The following undergraduate and Master's students
in Computer Science have been involved in the implementation effort: Zimin
Chen, Steven Blimkie, Khoi Manh Nguyen, Thomas Pehle, and Suganthan
Sivagnanasundaram. |
|
|
Publications
|
Papers in Refereed Journals |
|
F. Dehne, T. Eavis, and A. Rau-Chaplin,
"The cgmCUBE Project: Optimizing Parallel Data Cube Generation For ROLAP"
, Distributed and Parallel Databases, Sep 2004. |
|
Y. Chen, F. Dehne, T. Eavis, and A. Rau-Chaplin,
"Improved Data Partitioning For Building Large ROLAP Data Cubes in Parallel"
, International Journal of Data Warehousing and Mining, Volume 2, Number 1, Aug 2004, pages 1-26. |
|
Y. Chen, F. Dehne, T. Eavis, A. Rau-Chaplin,
"Parallel ROLAP Data Cube Construction On Shared-Nothing Multiprocessors"
, Distributed and Parallel Databases, Volume 15, Number 3, May 2004, pages 219-236. |
|
F. Dehne, T. Eavis, S. Hambrusch and A. Rau-Chaplin,
"Parallelizing The Data Cube"
, Distributed and Parallel Databases (Special Issue on Parallel and Distributed Data Mining), Volume 11, Number 2, Sep 2001, pages 181-201. |
|
Papers in Refereed Conference Proceedings |
|
M. Lawrence and A. Rau-Chaplin,
"Dynamic View Selection for OLAP"
in Proceedings of the 8th International Conference on Data Warehousing and Knowledge Discovery (DaWaK 2006), Krakow, Poland, Sep 2006. |
|
M. Lawrence and A. Rau-Chaplin,
"The OLAP-Enabled Grid: Model and Query Processing Algorithms"
in Proceedings of the 20th International Symposium on High Performance Computing Systems and Applications (HPCS'06), IEEE, Eds. R. Deupree, St. John's, Canada, May 2006. |
|
Y. Chen, F. Dehne, T. Eavis, and A. Rau-Chaplin,
"cgmOLAP: Efficient Parallel Generation and Querying of Terabyte Size ROLAP Data Cubes"
in Proceedings of the 22nd International Conference on Data Engineering, IEEE, Atlanta, USA, Apr 2006. |
|
Y. Chen, F. Dehne, T. Eavis, A. Rau-Chaplin,
"Building Large ROLAP Data Cubes in Parallel"
in Proceedings of the 8th International Database Engineering and Applications Symposium (IDEAS '04), IEEE, pages 367-377, Coimbra, Portugal, Jul 2004. |
|
Y. Chen, F. Dehne, T. Eavis, and A. Rau-Chaplin,
"PnP: Parallel And External Memory Iceberg Cube Computation"
in Proceedings of the 21st International Conference on Data Engineering (ICDE 2005) (Short paper), IEEE, Tokyo, Japan, Jun 2004. |
|
F. Dehne, T. Eavis, and A. Rau-Chaplin,
"Computing Partial Data Cubes"
in Data Warehousing and Business Intelligence Minitrack of the Thirty-Seventh Hawaii International Conference on System Sciences (HICSS-37), Jan 2004. |
|
F. Dehne, T. Eavis, and A. Rau-Chaplin,
"Parallel Multi-Dimensional ROLAP Indexing"
in Proceedings of the 3rd IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGrid2003), pages 86-93, Tokyo, Japan, Oct 2002. |
|
Y. Chen, F. Dehne, T. Eavis, and A. Rau-Chaplin,
"Parallel ROLAP Data Cube Construction On Shared-Nothing Multiprocessors"
in International Parallel and Distributed Processing Symposium (IPDPS2003), Nice, France, Oct 2002. |
|
F. Dehne, T. Eavis and A. Rau-Chaplin,
"Computing Partial Data Cubes for Parallel Data Warehousing Applications"
in Proceedings of PVM-MPI 01, Volume 2131, Lecture Notes in Computer Science, Springer Verlag, pages 319-326, Santorini, Greece, Sep 2001. |
|
F. Dehne, T. Eavis, and A. Rau-Chaplin,
"Coarse Grained Parallel On-Line Analytical Processing (OLAP) For Data Mining"
in Proceedings of the 2001 International Conference on Computational Science (ICCS 2001), San Francisco, USA, May 2001. |
|
F. Dehne, T. Eavis, and A. Rau-Chaplin,
"A Cluster Architecture for Parallel Data Warehousing"
in Proceedings of the 2001 IEEE International Symposium on Cluster Computing and the Grid (CCGrid'01), May 2001. |
|
F. Dehne, S. Hambrusch, T. Eavis, and A. Rau-Chaplin,
"Parallelizing The Data Cube"
in Proceedings of the 8th International Conference on Database Theory (ICDT'01), London, UK, Jan 2001. |
|