FWF — Austrian Science Fund
On High Dimensional Data Analysis in Music Information Retrieval


Arthur Flexer (ORCID: 0000-0002-1691-737X)
  • Grant DOI 10.55776/P27082
  • Funding program Principal Investigator Projects
  • Status ended
  • Start October 1, 2014
  • End June 30, 2018
  • Funding amount € 275,499

Disciplines

Computer Sciences (85%); Arts (15%)

Keywords

    Music Information Retrieval, Artificial Intelligence, Machine Learning, Multimedia, Hubness, High Dimensional Data Analysis

Abstract

Learning in high dimensional spaces poses a number of challenges that are collectively referred to as the curse of dimensionality. Music Information Retrieval (MIR), the interdisciplinary science of retrieving information from music, very often relies on high dimensional feature representations and models. A new aspect of the curse of dimensionality, so-called hubness, was first documented and established in MIR as a problem of computing music similarity. According to the music similarity function, hub songs are similar to very many other songs; as a consequence, they appear in very many recommendation lists and prevent other songs from being recommended at all. The hubness phenomenon has since been identified as a general problem of machine learning in high dimensional spaces. It is due to distance concentration, which causes all points in a high dimensional data space to be at almost the same distance from each other. Our own previous research has focused on the impact of distance concentration and hubness on nearest-neighbor-based music recommendation and genre classification. As a result, we have developed a general unsupervised method to pre-process and rescale distance spaces, which decisively diminishes hubness and its adverse effects in music databases as well as in general machine learning datasets. Research by our own and other groups has also made it clear that concentration and hubness affect many more distance-based algorithms used in high dimensional data analysis. The proposed project will explore existing approaches and develop new ones to deal with these problems by studying their effects on a wide range of methods in MIR, as well as in multimedia and machine learning.
In particular, we plan to (i) study and unify rescaling methods to avoid distance concentration and (ii) explore the role of hubness in unsupervised learning (clustering, visualization) and supervised learning (classification) in high dimensional spaces. The main focus of this project is on MIR, since this is where the majority of results on hubness and concentration exist. Evaluating our results in the broader fields of multimedia and machine learning will ensure that our research has the potential to solve an important problem in MIR and, at the same time, a general problem of learning in high dimensional spaces.
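
The two phenomena named in the abstract can be illustrated with a small sketch. It is not the project's code: the synthetic Gaussian data, the helper names (`pairwise_euclidean`, `relative_contrast`, `k_occurrence`, `skewness`), and the choices of 500 points and k = 5 are all assumptions made for illustration. Distance concentration is measured here as the relative contrast between the largest and smallest pairwise distance, and hubness as the skewness of the k-occurrence counts, a measure commonly used in the hubness literature.

```python
import numpy as np


def pairwise_euclidean(X):
    """Euclidean distance matrix between all rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.maximum(d2, 0.0, out=d2)  # clip small negative values caused by rounding
    np.fill_diagonal(d2, 0.0)
    return np.sqrt(d2)


def relative_contrast(D):
    """(max - min) / min over all off-diagonal distances.

    Values close to zero indicate distance concentration.
    """
    off_diagonal = D[~np.eye(D.shape[0], dtype=bool)]
    return (off_diagonal.max() - off_diagonal.min()) / off_diagonal.min()


def k_occurrence(D, k=5):
    """Number of k-nearest-neighbor lists in which each point appears."""
    D = D.copy()
    np.fill_diagonal(D, np.inf)  # a point is never its own neighbor
    neighbors = np.argsort(D, axis=1)[:, :k]
    return np.bincount(neighbors.ravel(), minlength=D.shape[0])


def skewness(x):
    """Third standardized moment; large positive values indicate hubs."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    return np.mean(centered ** 3) / np.mean(centered ** 2) ** 1.5


if __name__ == "__main__":
    rng = np.random.default_rng(0)  # arbitrary seed, assumption for this demo
    for dim in (3, 200):
        X = rng.standard_normal((500, dim))  # synthetic stand-in for song features
        D = pairwise_euclidean(X)
        counts = k_occurrence(D, k=5)
        print(dim, relative_contrast(D), skewness(counts), counts.max())
```

On data of this kind, the relative contrast should shrink and the k-occurrence skewness should grow as the dimensionality increases. In a recommender, the points with the largest counts would correspond to the hub songs described above.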

Final report

Learning in high dimensional spaces poses a number of challenges that are collectively referred to as the curse of dimensionality. Music Information Retrieval (MIR), the interdisciplinary science of retrieving information from music, very often relies on high dimensional feature representations and models. A new aspect of the curse of dimensionality, so-called hubness, was first documented and established in MIR as a problem of computing music similarity. According to the music similarity function, hub songs are similar to very many other songs; as a consequence, they appear in very many recommendation lists and prevent other songs from being recommended at all. The hubness phenomenon has since been identified as a general problem of machine learning in high dimensional spaces. It is due to distance concentration, which causes all points in a high dimensional data space to be at almost the same distance from each other. In this project we developed, studied, and unified methods that reduce hubness by rescaling distance spaces, centering the data, or using alternative distance norms. We conducted a large-scale empirical evaluation of all twelve available versions of hubness reduction methods on fifty data sets. We also developed a hubness analysis workflow which, based on a few simple criteria, helps practitioners decide which hubness reduction method to use for the problem at hand. In addition, we explored the negative impact of hubness on unsupervised machine learning (clustering, visualization, outlier detection) and supervised machine learning (classification) in high dimensional spaces. All of these distance-based machine learning algorithms suffer from a range of hubness-related problems, which can be alleviated via hubness reduction.
In summary, over the course of the project we developed new methods of hubness reduction, clarified which hubness reduction methods work best under which conditions, and documented the influence of hubness and its reduction on the full breadth of machine learning. This allowed us to solve an important problem in MIR and, at the same time, a general problem of learning in high dimensional spaces.
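
As an illustration of what rescaling a distance space can look like, the sketch below implements an empirical form of mutual proximity, one published rescaling approach from the hubness literature. It is a simplified sketch under stated assumptions and not necessarily the exact variant evaluated in the project: the function names, the synthetic Gaussian data, and the parameter choices (300 points, 100 dimensions, k = 5) are all hypothetical. The rescaled distance between two points is one minus the fraction of remaining points that lie farther away from both of them than they lie from each other.

```python
import numpy as np


def pairwise_euclidean(X):
    """Symmetric Euclidean distance matrix between all rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.maximum(d2, 0.0, out=d2)  # clip small negative values caused by rounding
    D = np.sqrt(d2)
    D = (D + D.T) / 2.0  # enforce exact symmetry
    np.fill_diagonal(D, 0.0)
    return D


def mutual_proximity_distance(D):
    """Rescale a symmetric distance matrix with empirical mutual proximity.

    For a pair (i, j), count the points k that are farther than D[i, j]
    from both i and j. A high count means i and j are close neighbors
    from both points of view, so the rescaled distance is low.
    """
    n = D.shape[0]
    D_mp = np.empty((n, n), dtype=float)
    for i in range(n):
        d_ij = D[i][:, None]                   # row j holds D[i, j]
        farther_from_i = D[i][None, :] > d_ij  # [j, k]: D[i, k] > D[i, j]
        farther_from_j = D > d_ij              # [j, k]: D[j, k] > D[i, j]
        both = np.sum(farther_from_i & farther_from_j, axis=1)
        D_mp[i] = 1.0 - both / (n - 2)         # n - 2 other points per pair
    np.fill_diagonal(D_mp, 0.0)
    return D_mp


def k_occurrence(D, k=5):
    """Number of k-nearest-neighbor lists in which each point appears."""
    D = D.copy()
    np.fill_diagonal(D, np.inf)  # a point is never its own neighbor
    neighbors = np.argsort(D, axis=1)[:, :k]
    return np.bincount(neighbors.ravel(), minlength=D.shape[0])


def skewness(x):
    """Third standardized moment of the k-occurrence counts."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    return np.mean(centered ** 3) / np.mean(centered ** 2) ** 1.5


if __name__ == "__main__":
    rng = np.random.default_rng(1)  # arbitrary seed, assumption for this demo
    X = rng.standard_normal((300, 100))  # synthetic high dimensional data
    D = pairwise_euclidean(X)
    D_mp = mutual_proximity_distance(D)
    print("skewness before:", skewness(k_occurrence(D)))
    print("skewness after: ", skewness(k_occurrence(D_mp)))
```

The loop needs on the order of n³ comparisons, so this form only suits small data sets; the project's publication on fast approximate hubness reduction addresses larger ones. Comparing the k-occurrence skewness before and after rescaling is a simple way to check whether a reduction method helps on a given data set.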

Research institution(s)
  • ÖFAI - Österreichisches Forschungsinstitut für Artificial Intelligence - 100%
International project participants
  • Emmanuel Vincent, INRIA Rennes - France
  • Nenad Tomasev, Jozef Stefan Institute - Slovenia

Research Output

  • 146 Citations
  • 9 Publications
Publications
  • 2018
    Title Hubness as a case of technical algorithmic bias in music recommendation
    DOI 10.1109/icdmw.2018.00154
    Type Conference Proceeding Abstract
    Author Flexer A
    Pages 1062-1069
  • 2018
    Title A comprehensive empirical comparison of hubness reduction in high-dimensional spaces
    DOI 10.1007/s10115-018-1205-y
    Type Journal Article
    Author Feldbauer R
    Journal Knowledge and Information Systems
    Pages 137-166
  • 2015
    Title Choosing lp norms in high-dimensional spaces based on hub analysis
    DOI 10.1016/j.neucom.2014.11.084
    Type Journal Article
    Author Flexer A
    Journal Neurocomputing
    Pages 281-287
  • 2017
    Title Mutual proximity graphs for improved reachability in music recommendation
    DOI 10.1080/09298215.2017.1354891
    Type Journal Article
    Author Flexer A
    Journal Journal of New Music Research
    Pages 17-28
  • 2018
    Title Fast Approximate Hubness Reduction for Large High-Dimensional Data
    DOI 10.1109/icbk.2018.00055
    Type Conference Proceeding Abstract
    Author Feldbauer R
    Pages 358-367
  • 2016
    Title The Problem of Limited Inter-rater Agreement in Modelling Music Similarity
    DOI 10.1080/09298215.2016.1200631
    Type Journal Article
    Author Flexer A
    Journal Journal of New Music Research
    Pages 239-251
  • 2016
    Title An Empirical Analysis of Hubness in Unsupervised Distance-Based Outlier Detection
    DOI 10.1109/icdmw.2016.0106
    Type Conference Proceeding Abstract
    Author Flexer A
    Pages 716-723
  • 2016
    Title Centering Versus Scaling for Hubness Reduction
    DOI 10.1007/978-3-319-44778-0_21
    Type Book Chapter
    Author Feldbauer R
    Publisher Springer Nature
    Pages 175-183
  • 2015
    Title The Unbalancing Effect of Hubs on K-Medoids Clustering in High-Dimensional Spaces
    DOI 10.1109/ijcnn.2015.7280303
    Type Conference Proceeding Abstract
    Author Schnitzer D
    Pages 1-8
