No sound from media_player on ESP32 N16R8 S3

Bob · March 11, 2024, 6:21

Hello,

I'm opening a new topic so as not to clutter the others.

I'm still working on my ESP32 N16R8 S3 from this topic:

For reference, I already have a voice assistant (ESP32 + INMP441 + Max98357A) that works well.

I've managed to control my devices with this new ESP. It does generate the mp3 file in the config/tts folder, but the file is never played.

The ESPHome YAML:

captive_portal:

psram:
  mode: octal
  speed: 80MHz

# voice assistant
i2s_audio:
  - id: i2s_in
    i2s_lrclk_pin: GPIO3   #WS / LRC
    i2s_bclk_pin: GPIO5    #SCK /BCLK

microphone:
  - platform: i2s_audio
    adc_type: external
    pdm: false
    id: mic_i2s
    channel: left
    bits_per_sample: 32bit
    i2s_audio_id: i2s_in
    i2s_din_pin: GPIO4    #SD 
    
media_player:
  - platform: i2s_audio
    id: media_sat3
    name: "media_sat3"
    i2s_dout_pin: GPIO8
    dac_type: external
    mode: mono

voice_assistant:
  microphone: mic_i2s
  id: brunoassist
  noise_suppression_level: 2
  auto_gain: 31dBFS
  volume_multiplier: 2.0
  use_wake_word: true

  on_wake_word_detected:
    - switch.turn_on:
        id: reveil

  on_error:
    - if:
        condition:
          switch.is_on: use_wake_word
        then:
          - switch.turn_off: use_wake_word
          - switch.turn_on: use_wake_word

  on_client_connected:
    - if:
        condition:
          switch.is_on: use_wake_word
        then:
          - voice_assistant.start_continuous:

  on_client_disconnected:
    - if:
        condition:
          switch.is_on: use_wake_word
        then:
          - voice_assistant.stop:

  on_end: 
    - switch.turn_off:
        id: reveil

  # Send the response to the media player
  on_tts_end:
    - homeassistant.service:
        service: media_player.play_media
        data:
          entity_id: media_player.esp32_psram16_r8_voice_3_media_sat3  # denon_avc_x3700h mibox3 freebox_player_2
          media_content_id: !lambda 'return x;'
          media_content_type: music
          announce: "true"
    - delay: 10s
    - homeassistant.service:
        service: tts.clear_cache

binary_sensor:
  - platform: status
    name: API Connection
    id: api_connection
    filters:
      - delayed_on: 1s
    on_press:
      - if:
          condition:
            switch.is_on: use_wake_word
          then:
            - voice_assistant.start_continuous:
    on_release:
      - if:
          condition:
            switch.is_on: use_wake_word
          then:
            - voice_assistant.stop:
              
switch:
  - platform: template
    name: Use wake word
    id: use_wake_word
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    entity_category: config
    on_turn_on:
      - lambda: id(brunoassist).set_use_wake_word(true);
      - if:
          condition:
            not:
              - voice_assistant.is_running
          then:
            - voice_assistant.start_continuous
    
    on_turn_off:
      - voice_assistant.stop
      - lambda: id(brunoassist).set_use_wake_word(false);

  - platform: gpio
    name: "LedIO13"
    id: reveil
    pin: GPIO13
          
light:
  - platform: esp32_rmt_led_strip
    id: my_light  
    rgb_order: GRB
    pin: GPIO48
    num_leds: 1
    rmt_channel: 0
    chipset: ws2812
    name: "My Light"
    effects:
      - pulse:
          transition_length: 550ms
          update_interval: 550ms

web_server:
  port: 80

Here are the ESP logs:

[18:35:36][D][voice_assistant:422]: Desired state set to STREAMING_MICROPHONE
[18:35:36][D][voice_assistant:523]: Event Type: 1
[18:35:36][D][voice_assistant:526]: Assist Pipeline running
[18:35:36][D][voice_assistant:523]: Event Type: 9
[18:35:39][D][voice_assistant:523]: Event Type: 10
[18:35:39][D][voice_assistant:532]: Wake word detected
[18:35:39][D][switch:012]: 'LedIO13' Turning ON.
[18:35:39][D][switch:055]: 'LedIO13': Sending state ON
[18:35:39][D][voice_assistant:523]: Event Type: 3
[18:35:39][D][voice_assistant:537]: STT started
[18:35:40][D][voice_assistant:523]: Event Type: 11
[18:35:40][D][voice_assistant:677]: Starting STT by VAD
[18:35:42][D][voice_assistant:523]: Event Type: 12
[18:35:42][D][voice_assistant:681]: STT by VAD end
[18:35:42][D][voice_assistant:416]: State changed from STREAMING_MICROPHONE to STOP_MICROPHONE
[18:35:42][D][voice_assistant:422]: Desired state set to AWAITING_RESPONSE
[18:35:42][D][voice_assistant:416]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE
[18:35:42][D][voice_assistant:416]: State changed from STOPPING_MICROPHONE to AWAITING_RESPONSE
[18:35:43][D][voice_assistant:523]: Event Type: 4
[18:35:43][D][voice_assistant:551]: Speech recognised as: "allumer la lumière marine"
[18:35:43][D][voice_assistant:523]: Event Type: 5
[18:35:43][D][voice_assistant:556]: Intent started
[18:35:43][D][voice_assistant:523]: Event Type: 6
[18:35:43][D][voice_assistant:523]: Event Type: 7
[18:35:43][D][voice_assistant:579]: Response: "Allumé"
[18:35:43][D][voice_assistant:523]: Event Type: 8
[18:35:43][D][voice_assistant:599]: Response URL: "http://192.168.1.32:8123/api/tts_proxy/a54a861193bf0fc5d5dc1b9f543d744a47d3ba31_fr-fr_6c2e43c6c1_google_translate.mp3"
[18:35:43][D][voice_assistant:416]: State changed from AWAITING_RESPONSE to IDLE
[18:35:43][D][voice_assistant:422]: Desired state set to IDLE
[18:35:43][D][voice_assistant:416]: State changed from IDLE to START_PIPELINE
[18:35:43][D][voice_assistant:422]: Desired state set to START_MICROPHONE
[18:35:43][D][voice_assistant:523]: Event Type: 2
[18:35:43][D][voice_assistant:613]: Assist Pipeline ended
[18:35:43][D][voice_assistant:118]: microphone not running
[18:35:43][D][voice_assistant:202]: Requesting start...
[18:35:43][D][voice_assistant:416]: State changed from START_PIPELINE to STARTING_PIPELINE
[18:35:43][D][switch:016]: 'LedIO13' Turning OFF.
[18:35:43][D][switch:055]: 'LedIO13': Sending state OFF
[18:35:43][D][media_player:059]: 'media_sat3' - Setting
[18:35:43][D][media_player:066]:   Media URL: http://192.168.1.32:8123/api/tts_proxy/a54a861193bf0fc5d5dc1b9f543d744a47d3ba31_fr-fr_6c2e43c6c1_google_translate.mp3
[18:35:43][D][voice_assistant:118]: microphone not running
[18:35:44][W][component:214]: Component i2s_audio.media_player took a long time for an operation (0.52 s).
[18:35:44][W][component:215]: Components should block for at most 20-30ms.
[18:35:44][D][voice_assistant:437]: Client started, streaming microphone

I also tested, without success, with the Max98357A that works on my other build.

I'm a bit stumped here!

Thanks to all of you.
Bob

Krull56 · March 12, 2024, 8:50

Hello neighbor,

There have been quite a few changes.
You can now specify a media player as the speaker directly in the voice_assistant part of the ESP YAML. No more need for the earlier "workaround".
As an example, I suggest you look at this code:

firmware/media-player/onju-voice.yaml at main · esphome/firmware (github.com)
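
A minimal sketch of what that looks like, reusing the IDs from the first post (`brunoassist`, `mic_i2s`, `media_sat3`). As far as I understand the `media_player` option of `voice_assistant`, the TTS response URL is then played on that media player without a separate `on_tts_end` action. Treat that as an assumption to check against the ESPHome docs for your version.

```yaml
# Sketch only: IDs are the ones used in the first post of this thread.
voice_assistant:
  id: brunoassist
  microphone: mic_i2s
  # Assumption: with this option set, the response URL is handed to the
  # media player automatically when the TTS is ready.
  media_player: media_sat3
  use_wake_word: true
```

If an `on_tts_end` action that calls `media_player.play_media` is kept alongside this option, the URL will probably be sent to the player twice.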

See you

Bob · March 12, 2024, 9:12

Hello @Krull56,

OK, thanks. I'll take a look at lunchtime and update the post with the result.

Have a good day, neighbor
Bob

Krull56 · March 12, 2024, 9:33

And nothing stops you from first testing a media-player-only config to see whether you get any sound at the output.
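
A sketch of such a media-player-only test config, under stated assumptions: the device name, the `board` value and the `!secret` keys are placeholders, the pins are the ones from the first post (adjust them to the actual wiring), and the Arduino framework is assumed because the `i2s_audio` media player depended on it at the time, as far as I know.

```yaml
# Minimal media-player-only test (sketch, not a verified config).
esphome:
  name: s3-audio-test          # hypothetical device name

esp32:
  board: esp32-s3-devkitc-1    # assumed board definition for an ESP32-S3 N16R8
  framework:
    type: arduino              # assumption: required by the i2s_audio media_player

logger:
api:
wifi:
  ssid: !secret wifi_ssid      # placeholder secret names
  password: !secret wifi_password

i2s_audio:
  - id: i2s_out
    i2s_lrclk_pin: GPIO3       # MAX98357A LRC
    i2s_bclk_pin: GPIO5        # MAX98357A BCLK

media_player:
  - platform: i2s_audio
    id: media_test
    name: "Media test"
    i2s_audio_id: i2s_out
    dac_type: external
    i2s_dout_pin: GPIO8        # MAX98357A DIN
    mode: mono
```

With nothing else on the I2S bus, playing a local mp3 from Home Assistant on this entity shows whether the amp and wiring produce sound at all.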

Bob · March 12, 2024, 12:49

I didn't have much time, but no better:

[13:39:30][D][voice_assistant:551]: Speech recognised as: "allumer la lumière marine"
[13:39:30][D][voice_assistant:523]: Event Type: 5
[13:39:30][D][voice_assistant:556]: Intent started
[13:39:30][D][voice_assistant:523]: Event Type: 6
[13:39:30][D][voice_assistant:523]: Event Type: 7
[13:39:30][D][voice_assistant:579]: Response: "Allumé"
[13:39:30][D][voice_assistant:523]: Event Type: 8
[13:39:30][D][voice_assistant:599]: Response URL: "http://192.168.1.32:8123/api/tts_proxy/a54a861193bf0fc5d5dc1b9f543d744a47d3ba31_fr-fr_6c2e43c6c1_google_translate.mp3"
[13:39:30][D][voice_assistant:416]: State changed from AWAITING_RESPONSE to STREAMING_RESPONSE
[13:39:30][D][voice_assistant:422]: Desired state set to STREAMING_RESPONSE
[13:39:30][D][media_player:059]: 'Media sat3' - Setting
[13:39:30][D][media_player:066]:   Media URL: http://192.168.1.32:8123/api/tts_proxy/a54a861193bf0fc5d5dc1b9f543d744a47d3ba31_fr-fr_6c2e43c6c1_google_translate.mp3
[13:39:30][D][media_player:059]: 'Media sat3' - Setting
[13:39:30][D][media_player:066]:   Media URL: http://192.168.1.32:8123/api/tts_proxy/a54a861193bf0fc5d5dc1b9f543d744a47d3ba31_fr-fr_6c2e43c6c1_google_translate.mp3
[13:39:30][D][voice_assistant:523]: Event Type: 2
[13:39:30][D][voice_assistant:613]: Assist Pipeline ended
[13:39:31][W][component:214]: Component i2s_audio.media_player took a long time for an operation (0.54 s).
[13:39:31][W][component:215]: Components should block for at most 20-30ms.
[13:39:31][D][switch:016]: 'LedIO13' Turning OFF.
[13:39:31][D][switch:055]: 'LedIO13': Sending state OFF
[13:39:31][D][light:036]: 'My Light' Setting:
[13:39:31][D][light:047]:   State: OFF

media_player:
  - platform: i2s_audio
    id: media_sat3
    name: Media sat3
    i2s_dout_pin: GPIO8
    dac_type: external
    mode: mono

  # excerpt from the voice_assistant: section
  on_tts_end:
    - media_player.play_media: !lambda return x;
    - delay: 10s
    - homeassistant.service:
        service: tts.clear_cache

Bob

Krull56 · March 12, 2024, 12:55

And with the media player alone, without the voice assistant part? Have you tested that?

Bob · March 12, 2024, 1:13

No time yet @Krull56, back to work…
I'll test this evening, but it's still surprising that it works on the other ESP and not on this one.
I can see the media_player in the devices list, but I can't get it to read out a sentence from there either.
Bob

Bob · March 12, 2024, 3:30

Unlike the pinout of my other ESPs, there's no sign of a DAC pin on the N16R8!
Could that be related to my problem?

Bob

Bob · March 12, 2024, 7:10

Good evening,
No better without the voice assistant part!
No errors, but no sound.
Bob

Krull56 · March 12, 2024, 7:18

I don't have this model on hand, but as part of the contest I saw some information go by on the English-language forum.
It uses not media_player but the "classic" speaker in the YAML, which seems to work.

A presentable voice assistant satellite - Voice Assistant Contest - Home Assistant Community (home-assistant.io)

See you

Bob · March 13, 2024, 12:37

Hello,

I followed your link @Krull56,

ESPHome YAML config:

i2s_audio:
  - id: i2s_in
    i2s_lrclk_pin: GPIO3   # INMP441 WS
    i2s_bclk_pin: GPIO5    # INMP441 SCK
    # L/R pin on the INMP441 is connected to ground
  - id: i2s_out
    i2s_lrclk_pin: GPIO6   # Max98357 LRC
    i2s_bclk_pin: GPIO20   # Max98357 BCLK

microphone:
  - platform: i2s_audio
    adc_type: external
    pdm: false
    id: mic_i2s
    channel: left
    bits_per_sample: 32bit
    i2s_audio_id: i2s_in
    i2s_din_pin: GPIO4    #SD 33

speaker:
  - platform: i2s_audio
    id: my_speaker
    dac_type: external
    i2s_audio_id: i2s_out
    i2s_dout_pin: GPIO8   #DIN 
    mode: mono

voice_assistant:
  id: brunoassist
  microphone: mic_i2s
  noise_suppression_level: 2
  auto_gain: 31dBFS
  volume_multiplier: 2.0
  use_wake_word: true
  speaker: my_speaker

The logs (the generated file is now a .wav):

[13:20:42][D][voice_assistant:532]: Wake word detected
[13:20:42][D][switch:012]: 'LedIO13' Turning ON.
[13:20:42][D][switch:055]: 'LedIO13': Sending state ON
[13:20:42][D][light:036]: 'My Light' Setting:
[13:20:42][D][light:047]:   State: ON
[13:20:42][D][light:051]:   Brightness: 60%
[13:20:42][D][light:059]:   Red: 0%, Green: 100%, Blue: 0%
[13:20:42][D][light:109]:   Effect: 'Pulse'
[13:20:42][D][voice_assistant:523]: Event Type: 3
[13:20:42][D][voice_assistant:537]: STT started
[13:20:43][D][voice_assistant:523]: Event Type: 11
[13:20:43][D][voice_assistant:677]: Starting STT by VAD
[13:20:44][D][voice_assistant:523]: Event Type: 12
[13:20:44][D][voice_assistant:681]: STT by VAD end
[13:20:44][D][voice_assistant:416]: State changed from STREAMING_MICROPHONE to STOP_MICROPHONE
[13:20:44][D][voice_assistant:422]: Desired state set to AWAITING_RESPONSE
[13:20:44][D][voice_assistant:416]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE
[13:20:44][D][voice_assistant:416]: State changed from STOPPING_MICROPHONE to AWAITING_RESPONSE
[13:20:46][D][voice_assistant:523]: Event Type: 4
[13:20:46][D][voice_assistant:551]: Speech recognised as: "éteindre la lumière marine"
[13:20:46][D][voice_assistant:523]: Event Type: 5
[13:20:46][D][voice_assistant:556]: Intent started
[13:20:46][D][voice_assistant:523]: Event Type: 6
[13:20:46][D][voice_assistant:523]: Event Type: 7
[13:20:46][D][voice_assistant:579]: Response: "Éteint"
[13:20:46][D][voice_assistant:523]: Event Type: 8
[13:20:46][D][voice_assistant:599]: Response URL: "http://192.168.1.32:8123/api/tts_proxy/bd80c857a9f4a7d75384c8df252b26f9f7fde3a3_fr-fr_2c82848529_google_translate.wav"
[13:20:46][D][voice_assistant:416]: State changed from AWAITING_RESPONSE to STREAMING_RESPONSE
[13:20:46][D][voice_assistant:422]: Desired state set to STREAMING_RESPONSE
[13:20:46][D][voice_assistant:523]: Event Type: 2
[13:20:46][D][voice_assistant:613]: Assist Pipeline ended
[13:20:46][D][i2s_audio.speaker:161]: Starting I2S Audio Speaker
[13:20:46][D][switch:016]: 'LedIO13' Turning OFF.
[13:20:46][D][switch:055]: 'LedIO13': Sending state OFF
[13:20:46][D][light:036]: 'My Light' Setting:
[13:20:46][D][light:047]:   State: OFF
[13:20:46][D][light:085]:   Transition length: 1.0s
[13:20:46][D][light:091]:   Effect: 'None'
[13:20:46][D][voice_assistant:523]: Event Type: 98
[13:20:46][D][voice_assistant:664]: TTS stream start
[13:20:46][D][i2s_audio.speaker:164]: Started I2S Audio Speaker
[13:20:47][D][voice_assistant:351]: Speaker buffer full, trying again next loop
[13:20:47][D][voice_assistant:351]: Speaker buffer full, trying again next loop
[13:20:47][D][voice_assistant:351]: Speaker buffer full, trying again next loop
[13:20:47][D][voice_assistant:351]: Speaker buffer full, trying again next loop
[13:20:47][D][voice_assistant:351]: Speaker buffer full, trying again next loop
[13:20:47][D][voice_assistant:351]: Speaker buffer full, trying again next loop

When using speaker, is the file played automatically, or is there a command to add as with media_player?

Bob
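
On the question above: as far as I know, with `speaker:` set in `voice_assistant`, the response audio is streamed to the speaker automatically, and the "TTS stream start" and "Started I2S Audio Speaker" log lines seem consistent with that. The sketch below assumes the `on_tts_stream_start` and `on_tts_stream_end` triggers are available in your ESPHome version. It only logs when the stream starts and ends, which helps tell "audio never arrives" apart from "audio arrives but the amp stays silent".

```yaml
# Sketch only: IDs are the ones from the config above.
voice_assistant:
  id: brunoassist
  microphone: mic_i2s
  speaker: my_speaker        # assumption: no explicit play action is needed
  use_wake_word: true
  on_tts_stream_start:
    - logger.log: "TTS stream started on speaker"
  on_tts_stream_end:
    - logger.log: "TTS stream finished on speaker"
```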

Bob · March 13, 2024, 10:32

Good evening,
Well, I'm telling myself that on the N16R8 there's no way (for me) to get sound to the speaker, whether with media_player or speaker, so I'm trying something: sending to the media_player of another ESP that works:

  on_tts_end:
    - homeassistant.service:
        service: media_player.play_media
        data:
          entity_id:  media_player.esp32_psram_voice_control_2_media_sat2
          media_content_id: !lambda 'return x;'
          media_content_type: music
          announce: "true"
    - delay: 10s
    - homeassistant.service:
        service: tts.clear_cache

No better, even though there's no error. A shame: it responds very quickly to requests, but there's no audio output!
I think I'll regretfully stop here with this ESP and move on to the infrared LEDs for my ESP-mounted cameras, for night vision!

Thanks for everything.
But… if you have an idea

Bob

WarC0zes · April 26, 2024, 12:41

Hi @Bob,
I tried the media player on an ESP32. I was able to play a local mp3 and send TTS.
However, I tried with YouTube and it doesn't work. Have you tried?
It tells me it can't decode; maybe it's the mode: mono.

Krull56 · April 26, 2024, 1:17

Hello @WarC0zes

It may be DRM-related.
With the HACS add-on Music Assistant it works!
My ESP32-S3 N16R8 does both Assist and Media_player

See you

WarC0zes · April 26, 2024, 1:22

Hi @Krull56

Ah yes, good catch.

Well, one more add-on to install.
I'll look into it later then, thanks for the feedback.

Bob · April 26, 2024, 1:42

Hello @WarC0zes,
For now I've only kept a voice assistant on an ESP32 T8 V1.7.1.
On a classic ESP32 Wroom 32 I've left only a media player, which works well for MP3 and TTS. I've never tried YouTube; I go through my mibox and my amp.
@Krull56, I've just installed Music Assistant. I'll reconnect my ESP32-S3 N16R8 to see whether it finally wants to talk.
Thanks for the news
I have to get some work done too
Bob

Krull56 · April 26, 2024, 1:55

On my side, I'm making "slow" progress on my Tablette-Assist, with the goal of having it replace my current clock radio (but with many more features, of course).