OLAP of Documents

Omar Khrouf, Kaïs Khrouf, Jamel Feki

Laboratoire MIR@CL, Université de Sfax, Route de l’Aérodrome Km 4, B.P. 1088, 3018 Sfax, Tunisie

Faculty of Computing and IT, University of Jeddah, Jeddah, Saudi Arabia

Corresponding Author Email: Omar.Khrouf@yahoo.fr, Khrouf.Kais@isecs.rnu.tn, Jamel.Feki@gmail.com
28 February 2016

Documents capture much of the knowledge held within organizations’ Information Systems. Managing the contents of these documents has therefore been, for several years, a crucial need: it allows organizations to improve their decision-making processes, to enhance the success of their activities and thus to ensure their sustainability. For decision-makers, analyzing the contents of documents remains a real challenge. In this context, we propose a new, generic multidimensional model called CobWeb; it is based on the galaxy model and dedicated to the OLAP (On-Line Analytical Processing) of XML (eXtensible Markup Language) documents. The proposed model combines different standard facets extracted from XML documents in order to give decision-makers more ways to express analytical queries and an appropriate view of the data.


OLAP, XML document, standard facet, multidimensional model, OROLAP

1. Introduction
2. State of the art
3. The notion of document facet
4. Conceptual modeling: cobweb multidimensional model
5. Logical modeling
6. Experiments
7. Conclusion
