Publications

2024

  1. A Data Quality Glossary
    Sedir Mohammed, Lou Therese Brandner, Felicia Burtscher, Sebastian Hallensleben, Hazar Harmouch, Andreas Hauschke, Jessica Heesen, Stefanie Hildebrandt, Simon David Hirsbrunner, Julia Keselj, Philipp Mahlow, Marie Massow, Felix Naumann, Frauke Rostalski, Anna Wilken, and Annika Wölke
    Jan 2024
  2. The Five Facets of Data Quality Assessment
    Sedir Mohammed, Lisa Ehrlinger, Hazar Harmouch, Felix Naumann, and Divesh Srivastava
    SIGMOD Record, Jan 2024
  3. Step-by-Step Data Cleaning Recommendations to Improve ML Prediction Accuracy
    Sedir Mohammed, Felix Naumann, and Hazar Harmouch
    EDBT 2025, Jan 2024

2023

  1. How Data Quality Determines AI Fairness: The Case of Automated Interviewing
    Lou T. Brandner, Philipp Mahlow, Anna Wilken, Annika Wölke, Hazar Harmouch, and Simon D. Hirsbrunner
    European Workshop on Algorithmic Fairness (EWAF2023), Jan 2023
  2. Ein Glossar zur Datenqualität (German)
    Sedir Mohammed, Lou Brandner, Sebastian Hallensleben, Hazar Harmouch, Andreas Hauschke, Jessica Heesen, Stefanie Hildebrandt, Simon David Hirsbrunner, Julia Keselj, Philipp Mahlow, Felix Naumann, Frauke Rostalski, Anna Wilken, and Annika Wölke
    Mar 2023
    The research for this article was funded by the German Federal Ministry of Labour and Social Affairs (BMAS).
  3. Joint Proceedings of Workshops at the 49th International Conference on Very Large Data Bases (VLDB 2023), Vancouver, Canada, August 28 - September 1, 2023
    Mar 2023

2022

  1. The Effects of Data Quality on Machine Learning Performance
    Lukas Budach, Moritz Feuerpfeil, Nina Ihde, Andrea Nathansen, Nele Sina Noack, Hendrik Patzlaff, Felix Naumann, and Hazar Harmouch
    arXiv preprint arXiv:2207.14529, Mar 2022

2021

  1. Relational Header Discovery using Similarity Search in a Table Corpus
    Hazar Harmouch, Thorsten Papenbrock, and Felix Naumann
    Proceedings of the International Conference on Data Engineering (ICDE), Mar 2021

2020

  1. Single-column data profiling
    Hazar Harmouch
    University of Potsdam, Germany, Mar 2020

2019

  1. Inclusion Dependency Discovery: An Experimental Evaluation of Thirteen Algorithms
    Falco Dürsch, Axel Stebner, Fabian Windheuser, Maxi Fischer, Tim Friedrich, Nils Strelow, Tobias Bleifuß, Hazar Harmouch, Lan Jiang, Thorsten Papenbrock, and Felix Naumann
    Proceedings of the International Conference on Information and Knowledge Management (CIKM), Mar 2019

2018

  1. Discovery of genuine functional dependencies from relational data with missing values
    Laure Berti-Equille, Hazar Harmouch, Felix Naumann, Noël Novelli, and Saravanan Thirumuruganathan
    PVLDB, Mar 2018

2017

  1. Cardinality estimation: an experimental survey
    Hazar Harmouch and Felix Naumann
    PVLDB, Mar 2017

2016

  1. Data Anamnesis: Admitting Raw Data into an Organization
    Sebastian Kruse, Thorsten Papenbrock, Hazar Harmouch, and Felix Naumann
    IEEE Data Engineering Bulletin, Mar 2016

2015

  1. Evaluating four of the most popular open source and free data mining tools
    Ahmad Al-Khoder and Hazar Harmouch
    IJASR International Journal of Academic Scientific Research, Mar 2015