Benefit from domain ontologies and rule mining to improve truth discovery

Valentina Beretta, Sylvie Ranwez, Sébastien Harispe, Isabelle Mougenot

LGI2P, IMT Mines Ales, Univ Montpellier, Ales, France 6, avenue de Clavières, F-30 319 Alès, France

UMR 228 Espace-Dev, Université de Montpellier 500, rue JF. Breton, F-34 093 Montpellier cedex 5, France

Corresponding Author Email:

prenom.nom@mines-ales.fr; isabelle.mougenot@umontpellier.fr

ria32_3_07_beretta.pdf

OPEN ACCESS

https://ria.revuesonline.com/accueil.jsp

Abstract:

Data veracity is one of the main issues with web data. Given the proliferation of fake news and the dangers of disinformation, Truth Discovery models can be used to assess veracity: they estimate value confidence and source trustworthiness by analyzing the claims that different sources make about the same real-world entities. This step is crucial in an automated knowledge extraction process, particularly when the resulting knowledge bases (KBs) are intended for use in decision processes. Many studies have been conducted in the Truth Discovery domain; however, to our knowledge, none of them takes into account the a priori knowledge that may exist about a domain (e.g., domain ontologies). This article proposes two ways to reinforce certain value confidences, and thus the computation of source trustworthiness, during this process: the first considers the concept hierarchy, and the second exploits patterns extracted from a KB using association rule learning techniques. Both approaches are validated and tested on benchmarks that are freely available, as is the source code.
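To make the two estimated quantities concrete, the following is a minimal sketch of an iterative, hub-and-authority-style truth discovery loop in which a claim on a specific value also counts as implicit support for its more general values. It is an illustration under stated assumptions, not the authors' implementation: the function name `truth_discovery`, the `ancestors` mapping, the max-normalization, and the toy claims are all hypothetical, and the rule-mining reinforcement is not shown.

```python
from collections import defaultdict


def truth_discovery(claims, ancestors=None, iterations=20):
    """Estimate source trustworthiness and value confidence.

    claims    -- iterable of (source, entity, value) triples
    ancestors -- hypothetical dict: value -> set of more general values
                 (stands in for a domain ontology's concept hierarchy)
    """
    claims = list(claims)
    ancestors = ancestors or {}

    # A source supports the value it claims and, implicitly,
    # every more general value in the hierarchy.
    supporters = defaultdict(set)
    for source, entity, value in claims:
        supporters[(entity, value)].add(source)
        for general in ancestors.get(value, ()):
            supporters[(entity, general)].add(source)

    sources = {source for source, _, _ in claims}
    trust = {source: 1.0 for source in sources}
    confidence = {}

    for _ in range(iterations):
        # Value confidence: sum of the trust of supporting sources.
        confidence = {
            fact: sum(trust[s] for s in backing)
            for fact, backing in supporters.items()
        }
        top = max(confidence.values())
        confidence = {fact: c / top for fact, c in confidence.items()}

        # Source trust: sum of the confidence of the values it supports.
        trust = {source: 0.0 for source in sources}
        for fact, backing in supporters.items():
            for s in backing:
                trust[s] += confidence[fact]
        top = max(trust.values())
        trust = {source: t / top for source, t in trust.items()}

    return trust, confidence


# Toy example (made-up claims about one entity's birthplace).
claims = [
    ("s1", "Picasso", "Malaga"),
    ("s2", "Picasso", "Spain"),
    ("s3", "Picasso", "Madrid"),
    ("s4", "Picasso", "Malaga"),
]
hierarchy = {"Malaga": {"Spain"}, "Madrid": {"Spain"}}
trust, confidence = truth_discovery(claims, hierarchy)
```

In this sketch the most general value always collects the most support, so a separate step that trades confidence against specificity would still be needed to select the value to return; that step is left out here.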

Keywords:

truth discovery, ontologies, semantic web, value confidence, source trustworthiness, association rule learning, reasoning

1. Introduction

2. State of the art and positioning

3. Problem formalization and description of the proposed approach

4. Evaluation of the method

5. Results

6. Conclusion and perspectives
