{"id":8444,"date":"2024-11-12T17:55:41","date_gmt":"2024-11-12T16:55:41","guid":{"rendered":"https:\/\/projecteaina.cat\/tech\/?post_type=publicacions&#038;p=8444"},"modified":"2024-11-21T18:53:22","modified_gmt":"2024-11-21T17:53:22","slug":"on-the-use-of-audio-to-improve-dialogue-policies","status":"publish","type":"publicacions","link":"https:\/\/projecteaina.cat\/tech\/publicacions\/on-the-use-of-audio-to-improve-dialogue-policies\/","title":{"rendered":"On the Use of Audio to Improve Dialogue Policies"},"excerpt":{"rendered":"<p>With the significant progress of speech technologies, spoken goal-oriented dialogue systems are becoming increasingly popular. One of the main modules of a dialogue system is typically the dialogue policy, which is responsible for determining system actions. This component usually relies only on audio transcriptions, being strongly dependent on their quality and ignoring very important extralinguistic information embedded in the user\u2019s speech. In this paper, we propose new architectures to add audio information by combining speech and text embeddings using a Double Multi-Head Attention component. Our experiments show that audio embedding-aware dialogue policies outperform text-based ones, particularly in noisy transcription scenarios, and that how text and audio embeddings are combined is crucial to improve performance. 
We obtained a 9.8% relative improvement in the User Request Score compared to an only-text-based dialogue system on the DSTC2 dataset.<\/p>\n","protected":false,"featured_media":0,"template":"","meta":{"_acf_changed":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0},"class_list":["post-8444","publicacions","type-publicacions","status-publish","hentry"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.6 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>On the Use of Audio to Improve Dialogue Policies - Projecte Aina Tech<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/projecteaina.cat\/tech\/publicacions\/on-the-use-of-audio-to-improve-dialogue-policies\/\" \/>\n<meta property=\"og:locale\" content=\"ca_ES\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"On the Use of Audio to Improve Dialogue Policies - Projecte Aina Tech\" \/>\n<meta property=\"og:description\" content=\"With the significant progress of speech technologies, spoken goal-oriented dialogue systems are becoming increasingly popular. One of the main modules of a dialogue system is typically the dialogue policy, which is responsible for determining system actions. This component usually relies only on audio transcriptions, being strongly dependent on their quality and ignoring very important extralinguistic information embedded in the user\u2019s speech. In this paper, we propose new architectures to add audio information by combining speech and text embeddings using a Double Multi-Head Attention component. 
Our experiments show that audio embedding-aware dialogue policies outperform text-based ones, particularly in noisy transcription scenarios, and that how text and audio embeddings are combined is crucial to improve performance. We obtained a 9.8% relative improvement in the User Request Score compared to an only-text-based dialogue system on the DSTC2 dataset.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/projecteaina.cat\/tech\/publicacions\/on-the-use-of-audio-to-improve-dialogue-policies\/\" \/>\n<meta property=\"og:site_name\" content=\"Projecte Aina Tech\" \/>\n<meta property=\"article:modified_time\" content=\"2024-11-21T17:53:22+00:00\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:site\" content=\"@projecte_aina\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/projecteaina.cat\\\/tech\\\/publicacions\\\/on-the-use-of-audio-to-improve-dialogue-policies\\\/\",\"url\":\"https:\\\/\\\/projecteaina.cat\\\/tech\\\/publicacions\\\/on-the-use-of-audio-to-improve-dialogue-policies\\\/\",\"name\":\"On the Use of Audio to Improve Dialogue Policies - Projecte Aina 
Tech\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/projecteaina.cat\\\/tech\\\/#website\"},\"datePublished\":\"2024-11-12T16:55:41+00:00\",\"dateModified\":\"2024-11-21T17:53:22+00:00\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/projecteaina.cat\\\/tech\\\/publicacions\\\/on-the-use-of-audio-to-improve-dialogue-policies\\\/#breadcrumb\"},\"inLanguage\":\"ca\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/projecteaina.cat\\\/tech\\\/publicacions\\\/on-the-use-of-audio-to-improve-dialogue-policies\\\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/projecteaina.cat\\\/tech\\\/publicacions\\\/on-the-use-of-audio-to-improve-dialogue-policies\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Inici\",\"item\":\"https:\\\/\\\/projecteaina.cat\\\/tech\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"On the Use of Audio to Improve Dialogue Policies\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/projecteaina.cat\\\/tech\\\/#website\",\"url\":\"https:\\\/\\\/projecteaina.cat\\\/tech\\\/\",\"name\":\"Projecte Aina Tech\",\"description\":\"Impulsant l&#039;\u00fas del catal\u00e0 en l&#039;era digital\",\"publisher\":{\"@id\":\"https:\\\/\\\/projecteaina.cat\\\/tech\\\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/projecteaina.cat\\\/tech\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"ca\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/projecteaina.cat\\\/tech\\\/#organization\",\"name\":\"Projecte Aina 
Tech\",\"url\":\"https:\\\/\\\/projecteaina.cat\\\/tech\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"ca\",\"@id\":\"https:\\\/\\\/projecteaina.cat\\\/tech\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/projecteaina.cat\\\/tech\\\/wp-content\\\/uploads\\\/2023\\\/11\\\/cropped-aina-home-logo.jpg\",\"contentUrl\":\"https:\\\/\\\/projecteaina.cat\\\/tech\\\/wp-content\\\/uploads\\\/2023\\\/11\\\/cropped-aina-home-logo.jpg\",\"width\":512,\"height\":512,\"caption\":\"Projecte Aina Tech\"},\"image\":{\"@id\":\"https:\\\/\\\/projecteaina.cat\\\/tech\\\/#\\\/schema\\\/logo\\\/image\\\/\"},\"sameAs\":[\"https:\\\/\\\/x.com\\\/projecte_aina\",\"https:\\\/\\\/www.linkedin.com\\\/company\\\/projecte-aina\\\/\"]}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"On the Use of Audio to Improve Dialogue Policies - Projecte Aina Tech","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/projecteaina.cat\/tech\/publicacions\/on-the-use-of-audio-to-improve-dialogue-policies\/","og_locale":"ca_ES","og_type":"article","og_title":"On the Use of Audio to Improve Dialogue Policies - Projecte Aina Tech","og_description":"With the significant progress of speech technologies, spoken goal-oriented dialogue systems are becoming increasingly popular. One of the main modules of a dialogue system is typically the dialogue policy, which is responsible for determining system actions. This component usually relies only on audio transcriptions, being strongly dependent on their quality and ignoring very important extralinguistic information embedded in the user\u2019s speech. In this paper, we propose new architectures to add audio information by combining speech and text embeddings using a Double Multi-Head Attention component. 
Our experiments show that audio embedding-aware dialogue policies outperform text-based ones, particularly in noisy transcription scenarios, and that how text and audio embeddings are combined is crucial to improve performance. We obtained a 9.8% relative improvement in the User Request Score compared to an only-text-based dialogue system on the DSTC2 dataset.","og_url":"https:\/\/projecteaina.cat\/tech\/publicacions\/on-the-use-of-audio-to-improve-dialogue-policies\/","og_site_name":"Projecte Aina Tech","article_modified_time":"2024-11-21T17:53:22+00:00","twitter_card":"summary_large_image","twitter_site":"@projecte_aina","schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/projecteaina.cat\/tech\/publicacions\/on-the-use-of-audio-to-improve-dialogue-policies\/","url":"https:\/\/projecteaina.cat\/tech\/publicacions\/on-the-use-of-audio-to-improve-dialogue-policies\/","name":"On the Use of Audio to Improve Dialogue Policies - Projecte Aina Tech","isPartOf":{"@id":"https:\/\/projecteaina.cat\/tech\/#website"},"datePublished":"2024-11-12T16:55:41+00:00","dateModified":"2024-11-21T17:53:22+00:00","breadcrumb":{"@id":"https:\/\/projecteaina.cat\/tech\/publicacions\/on-the-use-of-audio-to-improve-dialogue-policies\/#breadcrumb"},"inLanguage":"ca","potentialAction":[{"@type":"ReadAction","target":["https:\/\/projecteaina.cat\/tech\/publicacions\/on-the-use-of-audio-to-improve-dialogue-policies\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/projecteaina.cat\/tech\/publicacions\/on-the-use-of-audio-to-improve-dialogue-policies\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Inici","item":"https:\/\/projecteaina.cat\/tech\/"},{"@type":"ListItem","position":2,"name":"On the Use of Audio to Improve Dialogue Policies"}]},{"@type":"WebSite","@id":"https:\/\/projecteaina.cat\/tech\/#website","url":"https:\/\/projecteaina.cat\/tech\/","name":"Projecte Aina Tech","description":"Impulsant l&#039;\u00fas del 
catal\u00e0 en l&#039;era digital","publisher":{"@id":"https:\/\/projecteaina.cat\/tech\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/projecteaina.cat\/tech\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"ca"},{"@type":"Organization","@id":"https:\/\/projecteaina.cat\/tech\/#organization","name":"Projecte Aina Tech","url":"https:\/\/projecteaina.cat\/tech\/","logo":{"@type":"ImageObject","inLanguage":"ca","@id":"https:\/\/projecteaina.cat\/tech\/#\/schema\/logo\/image\/","url":"https:\/\/projecteaina.cat\/tech\/wp-content\/uploads\/2023\/11\/cropped-aina-home-logo.jpg","contentUrl":"https:\/\/projecteaina.cat\/tech\/wp-content\/uploads\/2023\/11\/cropped-aina-home-logo.jpg","width":512,"height":512,"caption":"Projecte Aina Tech"},"image":{"@id":"https:\/\/projecteaina.cat\/tech\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/x.com\/projecte_aina","https:\/\/www.linkedin.com\/company\/projecte-aina\/"]}]}},"_links":{"self":[{"href":"https:\/\/projecteaina.cat\/tech\/wp-json\/wp\/v2\/publicacions\/8444","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/projecteaina.cat\/tech\/wp-json\/wp\/v2\/publicacions"}],"about":[{"href":"https:\/\/projecteaina.cat\/tech\/wp-json\/wp\/v2\/types\/publicacions"}],"wp:attachment":[{"href":"https:\/\/projecteaina.cat\/tech\/wp-json\/wp\/v2\/media?parent=8444"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}