I managed to install the whole lot (thanks again @Krull56 for the pointers) and overall it works, except that the assistant ("tournesol") does not give me time to speak.
What works:
- porcupine1 (recognizes the wake word "tournesol")
- Speech to Text: Vosk correctly recognizes what I say
- Text to Speech: Piper correctly tells me through the speaker that it did not understand anything.
One of the problems seems to come from the wav audio, which is cut off as soon as there is a tenth of a second of silence. For example, here is what Vosk transcribes in the logs for the phrase "tournesol allume prise tulipe" (roughly "tournesol, turn on the tulipe outlet"):
DEBUG:root:Unexpected event: type=transcribe, data={'language': 'fr'}
DEBUG:root:Loaded recognizer in 0.00 second(s)
DEBUG:root:Transcript for client 4852291918306: **tournesol aller**
DEBUG:root:Client disconnected: 4852291918306
DEBUG:root:Client connected: 4868515555377
DEBUG:root:Sent info to client: 4868515555377
DEBUG:root:Client disconnected: 4868515555377
DEBUG:root:Client connected: 4886848877064
DEBUG:root:Unexpected event: type=transcribe, data={'language': 'fr'}
DEBUG:root:Loaded recognizer in 0.00 second(s)
DEBUG:root:Transcript for client 4886848877064: tournesol
DEBUG:root:Client disconnected: 4886848877064
DEBUG:root:Client connected: 4900408908370
DEBUG:root:Unexpected event: type=transcribe, data={'language': 'fr'}
DEBUG:root:Loaded recognizer in 0.00 second(s)
DEBUG:root:Client connected: 4900850699370
DEBUG:root:Sent info to client: 4900850699370
DEBUG:root:Transcript for client 4900408908370: tournesol allume
DEBUG:root:Client disconnected: 4900408908370
DEBUG:root:Client disconnected: 4900850699370
Here is what I set for Vosk:
For Microphone Assist:
And the Microphone Assist log:
DEBUG:__main__:Waiting for speech
DEBUG:__main__:Speech detected
DEBUG:homeassistant_satellite.remote:{'type': 'auth_required', 'ha_version': '2023.12.3'}
DEBUG:homeassistant_satellite.remote:{'type': 'auth_ok', 'ha_version': '2023.12.3'}
DEBUG:homeassistant_satellite.remote:{'id': 1, 'type': 'result', 'success': True, 'result': None}
DEBUG:homeassistant_satellite.remote:{'id': 1, 'type': 'event', 'event': {'type': 'run-start', 'data': {'pipeline': '01gzmfs3e6gxksrd5v4vsqseg2', 'language': 'fr', 'runner_data': {'stt_binary_handler_id': 1, 'timeout': 300}}, 'timestamp': '2023-12-18T17:14:28.348212+00:00'}}
DEBUG:homeassistant_satellite.remote:{'id': 1, 'type': 'event', 'event': {'type': 'wake_word-start', 'data': {'entity_id': 'wake_word.porcupine1', 'metadata': {'format': 'wav', 'codec': 'pcm', 'bit_rate': 16, 'sample_rate': 16000, 'channel': 1}, 'timeout': 3}, 'timestamp': '2023-12-18T17:14:28.348317+00:00'}}
DEBUG:__main__:wake_word-start {'entity_id': 'wake_word.porcupine1', 'metadata': {'format': 'wav', 'codec': 'pcm', 'bit_rate': 16, 'sample_rate': 16000, 'channel': 1}, 'timeout': 3}
DEBUG:homeassistant_satellite.remote:{'id': 1, 'type': 'event', 'event': {'type': 'wake_word-end', 'data': {'wake_word_output': {'wake_word_id': 'tournesol', 'timestamp': 1020}}, 'timestamp': '2023-12-18T17:14:29.242428+00:00'}}
DEBUG:__main__:wake_word-end {'wake_word_output': {'wake_word_id': 'tournesol', 'timestamp': 1020}}
DEBUG:root:play ffmpeg: ['ffmpeg', '-i', '/usr/src/sounds/awake.wav', '-f', 'wav', '-ar', '22050', '-ac', '1', '-filter:a', 'volume=0.5', '-']
DEBUG:root:play: ['aplay', '-r', '22050', '-c', '1', '-f', 'S16_LE', '-t', 'raw']
Playing raw data 'stdin' : Signed 16 bit Little Endian, Rate 22050 Hz, Mono
DEBUG:homeassistant_satellite.remote:{'id': 1, 'type': 'event', 'event': {'type': 'stt-start', 'data': {'engine': 'stt.vosk', 'metadata': {'language': 'fr', 'format': 'wav', 'codec': 'pcm', 'bit_rate': 16, 'sample_rate': 16000, 'channel': 1}}, 'timestamp': '2023-12-18T17:14:29.242519+00:00'}}
DEBUG:__main__:stt-start {'engine': 'stt.vosk', 'metadata': {'language': 'fr', 'format': 'wav', 'codec': 'pcm', 'bit_rate': 16, 'sample_rate': 16000, 'channel': 1}}
DEBUG:homeassistant_satellite.remote:{'id': 1, 'type': 'event', 'event': {'type': 'stt-vad-start', 'data': {'timestamp': 725}, 'timestamp': '2023-12-18T17:14:29.251656+00:00'}}
DEBUG:__main__:stt-vad-start {'timestamp': 725}
DEBUG:homeassistant_satellite.remote:{'id': 1, 'type': 'event', 'event': {'type': 'stt-vad-end', 'data': {'timestamp': 1655}, 'timestamp': '2023-12-18T17:14:31.015415+00:00'}}
DEBUG:__main__:stt-vad-end {'timestamp': 1655}
DEBUG:homeassistant_satellite.remote:{'id': 1, 'type': 'event', 'event': {'type': 'stt-end', 'data': {'stt_output': {'text': 'tournesol'}}, 'timestamp': '2023-12-18T17:14:31.286421+00:00'}}
DEBUG:__main__:stt-end {'stt_output': {'text': 'tournesol'}}
DEBUG:root:play ffmpeg: ['ffmpeg', '-i', '/usr/src/sounds/done.wav', '-f', 'wav', '-ar', '22050', '-ac', '1', '-filter:a', 'volume=0.5', '-']
DEBUG:root:play: ['aplay', '-r', '22050', '-c', '1', '-f', 'S16_LE', '-t', 'raw']
Playing raw data 'stdin' : Signed 16 bit Little Endian, Rate 22050 Hz, Mono
DEBUG:homeassistant_satellite.remote:{'id': 1, 'type': 'event', 'event': {'type': 'intent-start', 'data': {'engine': 'homeassistant', 'language': 'fr', 'intent_input': 'tournesol', 'conversation_id': None, 'device_id': None}, 'timestamp': '2023-12-18T17:14:31.286499+00:00'}}
DEBUG:__main__:intent-start {'engine': 'homeassistant', 'language': 'fr', 'intent_input': 'tournesol', 'conversation_id': None, 'device_id': None}
DEBUG:homeassistant_satellite.remote:{'id': 1, 'type': 'event', 'event': {'type': 'intent-end', 'data': {'intent_output': {'response': {'speech': {'plain': {'speech': "Désolé, je n'ai pas compris", 'extra_data': None}}, 'card': {}, 'language': 'fr', 'response_type': 'error', 'data': {'code': 'no_intent_match'}}, 'conversation_id': None}}, 'timestamp': '2023-12-18T17:14:31.480389+00:00'}}
DEBUG:__main__:intent-end {'intent_output': {'response': {'speech': {'plain': {'speech': "Désolé, je n'ai pas compris", 'extra_data': None}}, 'card': {}, 'language': 'fr', 'response_type': 'error', 'data': {'code': 'no_intent_match'}}, 'conversation_id': None}}
DEBUG:homeassistant_satellite.remote:{'id': 1, 'type': 'event', 'event': {'type': 'tts-start', 'data': {'engine': 'tts.piper', 'language': 'fr_FR', 'voice': 'fr_FR-siwis-medium', 'tts_input': "Désolé, je n'ai pas compris"}, 'timestamp': '2023-12-18T17:14:31.480446+00:00'}}
DEBUG:__main__:tts-start {'engine': 'tts.piper', 'language': 'fr_FR', 'voice': 'fr_FR-siwis-medium', 'tts_input': "Désolé, je n'ai pas compris"}
DEBUG:homeassistant_satellite.remote:{'id': 1, 'type': 'event', 'event': {'type': 'tts-end', 'data': {'tts_output': {'media_id': "media-source://tts/tts.piper?message=D%C3%A9sol%C3%A9,+je+n'ai+pas+compris&language=fr_FR&voice=fr_FR-siwis-medium", 'url': '/api/tts_proxy/393247aa3dd2cd24b4ee2f8550489431d86f0c02_fr-fr_1e55c50379_tts.piper.mp3', 'mime_type': 'audio/mpeg'}}, 'timestamp': '2023-12-18T17:14:31.480755+00:00'}}
DEBUG:__main__:tts-end {'tts_output': {'media_id': "media-source://tts/tts.piper?message=D%C3%A9sol%C3%A9,+je+n'ai+pas+compris&language=fr_FR&voice=fr_FR-siwis-medium", 'url': '/api/tts_proxy/393247aa3dd2cd24b4ee2f8550489431d86f0c02_fr-fr_1e55c50379_tts.piper.mp3', 'mime_type': 'audio/mpeg'}}
DEBUG:root:play ffmpeg: ['ffmpeg', '-i', 'http://192.168.100.175:8123/api/tts_proxy/393247aa3dd2cd24b4ee2f8550489431d86f0c02_fr-fr_1e55c50379_tts.piper.mp3', '-f', 'wav', '-ar', '22050', '-ac', '1', '-filter:a', 'volume=0.5', '-']
DEBUG:root:play: ['aplay', '-r', '22050', '-c', '1', '-f', 'S16_LE', '-t', 'raw']
Playing raw data 'stdin' : Signed 16 bit Little Endian, Rate 22050 Hz, Mono
DEBUG:homeassistant_satellite.remote:{'id': 1, 'type': 'event', 'event': {'type': 'run-end', 'data': None, 'timestamp': '2023-12-18T17:14:31.480781+00:00'}}
DEBUG:__main__:run-end None
DEBUG:homeassistant_satellite.remote:Pipeline finished
DEBUG:__main__:Waiting for speech
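To put a number on how little time the pipeline leaves me, here is a minimal sketch that pulls the `stt-vad-start` and `stt-vad-end` timestamps out of the Microphone Assist log above. It assumes those timestamps are milliseconds counted from the start of the audio stream, as the Assist pipeline documentation describes. `PREFIX` and `vad_window_ms` are names made up for this illustration, not part of any library.

```python
import ast

# Logger prefix used by the satellite for raw pipeline events (copied from the log).
PREFIX = "DEBUG:homeassistant_satellite.remote:"


def vad_window_ms(log_lines):
    """Return (start, end, duration) of the detected speech window.

    Assumption: the 'timestamp' inside 'data' is in milliseconds.
    """
    stamps = {}
    for line in log_lines:
        if not line.startswith(PREFIX):
            continue
        try:
            # The satellite logs Python dict reprs, so literal_eval can read them.
            msg = ast.literal_eval(line[len(PREFIX):])
        except (ValueError, SyntaxError):
            continue  # e.g. "Pipeline finished" is plain text, not a dict
        if not isinstance(msg, dict) or msg.get("type") != "event":
            continue
        event = msg.get("event") or {}
        if event.get("type") in ("stt-vad-start", "stt-vad-end"):
            stamps[event["type"]] = event["data"]["timestamp"]
    start, end = stamps["stt-vad-start"], stamps["stt-vad-end"]
    return start, end, end - start


# Two event lines copied verbatim from the log, plus one non-dict line.
sample = [
    "DEBUG:homeassistant_satellite.remote:{'id': 1, 'type': 'event', 'event': {'type': 'stt-vad-start', 'data': {'timestamp': 725}, 'timestamp': '2023-12-18T17:14:29.251656+00:00'}}",
    "DEBUG:homeassistant_satellite.remote:{'id': 1, 'type': 'event', 'event': {'type': 'stt-vad-end', 'data': {'timestamp': 1655}, 'timestamp': '2023-12-18T17:14:31.015415+00:00'}}",
    "DEBUG:homeassistant_satellite.remote:Pipeline finished",
]

print(vad_window_ms(sample))  # (725, 1655, 930)
```

On the log above this gives a speech window of 930 ms, which is consistent with only "tournesol" reaching Vosk before the audio is closed.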
Any leads? Any ideas? A config you would suggest I test?
Thanks, everyone!