Voice assistant and wake word activation

Krzysiek2 · 18 January 2025 19:17

I am “new” to Home Assistant. I run HA Core 2025.1.2, Supervisor 2024.13.3 and Operating System 14.1 on an RPi4B with 4 GB of RAM.

I built a “voice assistant with on-device wake word detection on an ESP32 S3” following the instructions from the page Voice Assistant Timers With On-Device Wake Word Detection On ESP32 S3 | Smart Home Circle.
I installed Whisper (2.4.0),

Piper 1.5.2

and flashed the YAML from the page above through ESPHome. I added a new voice assistant and configured it as shown in the attached screenshot.

The device installed correctly and control from HA (Led strip, mute, On board light) works, but unfortunately it does not react to the wake word “OK nabu”.

I additionally installed openWakeWord (1.10.0)

but unfortunately I still have no wake word option in the voice assistant settings (there is still only what is shown in the previous screenshot). What am I doing wrong?

szopen · 18 January 2025 19:42

Have you enabled PSRAM correctly on your development board?
Depending on the model it comes in a quad or an octal version, and you did not say a word about your hardware beyond it being some S3. Whether the firmware works at all depends on having enough memory: what the MCU has on board is not enough for anything related to voice processing, so you need external RAM, and that RAM requires correct initialization. It is of course already soldered onto the module, but the implementations differ, and they determine which YAML is OK.
https://docs.espressif.com/projects/esp-dev-kits/en/latest/esp32s3/esp32-s3-devkitc-1/index.html
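
As a minimal sketch of what “correct initialization” can look like in ESPHome, assuming the module follows Espressif's usual naming (an R2 suffix for 2 MB quad PSRAM, R8 for 8 MB octal PSRAM), the psram block would be set to match the hardware:

```yaml
# Sketch only: the mode must match the PSRAM actually fitted on the module.
psram:
  mode: octal    # assumption: an N16R8 module; an N8R2 module would need "quad"
  speed: 80MHz
```

With a mode that does not match the chip, the external RAM may fail to initialize, which would leave the voice components with too little memory.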

Do you have any logs from this firmware while it is running? Is the microphone wired correctly, and does it work?

You chose Polish instead of English everywhere, whereas testing is better done in English.

As for the rest, wait for someone who actually uses this, because my remarks mostly concern how ESPHome works in general and are equally general about the voice assistant.

@boskikak @surfek @Download-er @Grzegorz_Stelmach @kniazio @Grzegorz_Murach

boskikak · 18 January 2025 20:16

RPi4B + Whisper with Piper can be a veeery bad combination (you will quickly get irritated by how slow it is and give up on voice control). The only thing I can advise on the spot is to add something like this to the board's configuration:

switch:
  - platform: template
    id: assist
    icon: mdi:account-tie-voice
    name: "Asystent"
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    on_turn_on: 
      - voice_assistant.start
    on_turn_off:
      - voice_assistant.stop

This helps to “kick the assistant in the a*s” and it should start working. Unfortunately nobody has reported problems with assistants hanging on ESP, so I came up with this workaround myself. I am currently finishing my 4th assistant, also on an S3 board, but I use different libraries, so I cannot verify this directly.

Krzysiek2 · 19 January 2025 08:57

@szopen
I have an ESP32 s3 N16R8 (ESP32-S3 module development board Wifi BT for Arduino IDE ESP32-S3-WROOM1 N16R8 N8R2 44Pin Type-C 16MB Flash 8M PSRAM ESP32 S3 - AliExpress 502).
And my asystent-glosowy.yaml looks like this:

esphome:
  name: asystent-glosowy
  friendly_name: Asystent głosowy
  platformio_options:
    board_build.flash_mode: dio
  on_boot:
    - light.turn_on:
        id: led_ww
        blue: 100%
        brightness: 60%
        effect: fast pulse

esp32:
  board: esp32-s3-devkitc-1
  framework:
    type: esp-idf

    sdkconfig_options:
      CONFIG_ESP32S3_DEFAULT_CPU_FREQ_240: "y"
      CONFIG_ESP32S3_DATA_CACHE_64KB: "y"
      CONFIG_ESP32S3_DATA_CACHE_LINE_64B: "y"
      CONFIG_AUDIO_BOARD_CUSTOM: "y"
   
psram:
  mode: octal # Please change this to quad for N8R2 and octal for N16R8
  speed: 80MHz


# Enable logging
logger:

# Enable Home Assistant API
api:
  encryption:
    key: "xxxxxxxxxxxxxxxxxxxxxxx"
  on_client_connected:
    then:
      - delay: 50ms
      - light.turn_off: led_ww
      - micro_wake_word.start:
  on_client_disconnected:
    then:
      - voice_assistant.stop:



ota:
  - platform: esphome
    password: !secret wifi_password

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  manual_ip: 
    static_ip: 10.1.1.90
    gateway: 10.1.1.1
    subnet: 255.255.255.0
  # Enable fallback hotspot (captive portal) in case wifi connection fails
  ap:
    ssid: "Esp32-S3-Asystent-glosowy"
    password: !secret wifi_password

captive_portal:


button:
  - platform: restart
    name: "Restart"
    id: but_rest

switch:
  - platform: template
    id: mute
    name: mute
    optimistic: true
    on_turn_on: 
      - micro_wake_word.stop:
      - voice_assistant.stop:
      - light.turn_on:
          id: led_ww           
          red: 100%
          green: 0%
          blue: 0%
          brightness: 60%
          effect: fast pulse 
      - light.turn_on:
          id: led_strip           
          red: 100%
          green: 0%
          blue: 0%
          brightness: 60%
          effect: fast pulse 
          
      - delay: 2s
      - light.turn_off:
          id: led_ww
      - light.turn_off:
          id: led_strip

      - light.turn_on:
          id: led_ww          
          red: 100%
          green: 0%
          blue: 0%
          brightness: 30%
      - light.turn_on:
          id: led_strip           
          red: 100%
          green: 0%
          blue: 0%
          brightness: 30%

    on_turn_off:
      - micro_wake_word.start:
      - light.turn_on:
          id: led_ww           
          red: 0%
          green: 100%
          blue: 0%
          brightness: 60%
          effect: fast pulse 
      - light.turn_on:
          id: led_strip  
          red: 0%
          green: 100%
          blue: 0%
          brightness: 60%
          effect: fast pulse 
      - delay: 2s
      - light.turn_off:
          id: led_strip
      - light.turn_off:
          id: led_ww
  - platform: template
    id: timer_ringing
    optimistic: true
    internal: False
    name: "Timer Ringing"
    restore_mode: ALWAYS_OFF


# GPIO Mute Button Config
binary_sensor:
  - platform: gpio
    id: button01
    name: "Mute Button" # Physical Mute switch
    pin:
      number: GPIO10
      inverted: True
      mode:
        input: True
        pullup: True
    on_press: 
      if:
        condition:
          switch.is_on: timer_ringing 
        then:
          - switch.turn_off: timer_ringing
        else:
          - switch.toggle: mute



light:
  - platform: esp32_rmt_led_strip
    id: led_ww
    rgb_order: GRB
    pin: GPIO48
    num_leds: 1
    rmt_channel: 0
    chipset: ws2812
    name: "On board light"
    effects:
      - pulse:
      - pulse:
          name: "Fast Pulse"
          transition_length: 0.5s
          update_interval: 0.5s
          min_brightness: 0%
          max_brightness: 100%

  - platform: esp32_rmt_led_strip
    id: led_strip    # LED Strip Config
    rgb_order: GRB
    pin: GPIO09
    num_leds: 15
    rmt_channel: 1
    chipset: ws2812
    name: "Led Strip"
    effects:
      - pulse:
      - pulse:
          name: "Fast Pulse"
          transition_length: 0.5s
          update_interval: 0.5s
          min_brightness: 0%
          max_brightness: 100%
      - addressable_scan:
          name: "Scan Effect With Custom Values"
          move_interval: 5ms
          scan_width: 10

          
          
 # Audio and Voice Assistant Config          
i2s_audio:
  - id: i2s_in # For microphone
    i2s_lrclk_pin: GPIO3  #WS 
    i2s_bclk_pin: GPIO2 #SCK

  - id: i2s_speaker #For Speaker
    i2s_lrclk_pin: GPIO6  #LRC 
    i2s_bclk_pin: GPIO7 #BLCK

microphone:
  - platform: i2s_audio
    id: va_mic
    adc_type: external
    i2s_din_pin: GPIO4 #SD
    channel: left
    pdm: false
    i2s_audio_id: i2s_in
    bits_per_sample: 32bit
    
speaker:
    platform: i2s_audio
    id: va_speaker
    i2s_audio_id: i2s_speaker
    dac_type: external
    i2s_dout_pin: GPIO8   #  DIN Pin of the MAX98357A Audio Amplifier
    channel: mono

    
micro_wake_word:
  on_wake_word_detected:
    
    - voice_assistant.start:
        wake_word: !lambda return wake_word;
        silence_detection: true
    - light.turn_on:
        id: led_ww           
        red: 30%
        green: 30%
        blue: 70%
        brightness: 60%
        effect: fast pulse 
    - light.turn_on:
        id: led_strip
        effect: "Scan Effect With Custom Values"
        red: 80%
        green: 0%
        blue: 80%
        brightness: 80%
  models:
    - model: okay_nabu
    
voice_assistant:
  id: va
  microphone: va_mic
  auto_gain: 31dBFS
  noise_suppression_level: 2
  volume_multiplier: 4.0
  speaker: va_speaker
  on_stt_end:
    then:
      - light.turn_off: led_ww
      - light.turn_off: led_strip
  on_error:
    - micro_wake_word.start:
  on_end:
    then:
      - light.turn_off: led_ww
      - light.turn_off: led_strip
      - wait_until:
          not:
            voice_assistant.is_running:
      - micro_wake_word.start:
  
  
  on_timer_finished:
    - micro_wake_word.stop:
    - voice_assistant.stop:
    - switch.turn_on: timer_ringing
    - wait_until:
        not:
          microphone.is_capturing:
    
    - wait_until:
        not:
          micro_wake_word.is_running:
    - light.turn_on:
        id: led_strip
        effect: "Scan Effect With Custom Values"
        red: 80%
        green: 0%
        blue: 30%
        brightness: 80%
    
    - lambda: id(va_speaker).play(id(timer_finished_wave_file), sizeof(id(timer_finished_wave_file)));
    - micro_wake_word.start:
    - wait_until:
        and:
          - micro_wake_word.is_running:

    - while:
        condition:
          switch.is_on: timer_ringing
        then:
          - lambda: id(va_speaker).play(id(timer_finished_wave_file), sizeof(id(timer_finished_wave_file)));
          - delay: 2s
    - wait_until:
        not:
          speaker.is_playing:
    
    - light.turn_off: led_strip
    - micro_wake_word.start:

external_components:
  - source: github://jesserockz/esphome-components
    components: [file]
    refresh: 0s

file: 
  - id: timer_finished_wave_file
    file: https://github.com/esphome/firmware/raw/main/voice-assistant/sounds/timer_finished.wav

@boskikak

So what do you recommend instead of RPi4B + Whisper with Piper?
Can you share more details about the assistants you are building?

boskikak · 19 January 2025 09:21

You definitely need more powerful hardware. I tested this combination on a Wyse 5070 J5005 and it was still a bit too weak. You can also try the VOSK model: GitHub - rhasspy/hassio-addons: Add-ons for Home Assistant's Hass.IO

A lot depends on whether you also want to play music through this assistant or use it only for voice control. If you already have this board, it would be a shame to replace it. So far, though, I have used the arduino framework, which has no micro wake word, and it works satisfactorily. Here is my 3rd assistant, which ran at my place for half a year and now has a different owner: ESP32- budowa swojego Atom Echo
With this board, stay on the esp-idf framework. Have you checked what happens in the logs after adding the switch I wrote about?

Krzysiek2 · 19 January 2025 10:13

Thanks for the advice.
In your code, the switch: section has name: "Asystent"
(I am a beginner in YAML and HA)
while in mine, the esphome: section has name: asystent-glosowy
Should I enter your code exactly as it is, or change name in yours to name: asystent-glosowy (as in mine)?

boskikak · 19 January 2025 11:03

Use whatever name you like. When you toggle the switch you will see what happens in the logs. I do not have much experience with this board and this framework yet, because I have only been testing the new assistant for a week and I am not fully satisfied with it.
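
To make the distinction concrete: the name under esphome: identifies the device itself, while the name of a switch is only the label of that one entity in Home Assistant, so the two do not have to match. A minimal sketch, assuming the new switch is appended as one more item to the switch: list that already exists in asystent-glosowy.yaml (YAML does not allow the same top-level key twice, so a second switch: section would likely be rejected):

```yaml
esphome:
  name: asystent-glosowy       # device name, left unchanged

switch:
  # ... existing entries such as "mute" and "timer_ringing" stay here ...
  - platform: template
    id: assist                 # internal ID, used only inside this YAML
    name: "Asystent"           # entity label shown in Home Assistant, free to choose
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    on_turn_on:
      - voice_assistant.start:
    on_turn_off:
      - voice_assistant.stop:
```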

szopen · 19 January 2025 11:56

So that part is OK.

You should still analyze the logs, though.

Krzysiek2 · 19 January 2025 14:24

@boskikak89
After adding your code I have these settings for the device in ESPHome:

and the logs with the Asystent switch turned on (i.e. as in the screenshot above) look like this:
Whisper log:

[13:59:44] INFO: Successfully send discovery information to Home Assistant.
s6-rc: info: service discovery successfully started
s6-rc: info: service legacy-services: starting
s6-rc: info: service legacy-services successfully started
DEBUG:wyoming_faster_whisper.handler:Sent info
DEBUG:wyoming_faster_whisper.handler:Sent info
DEBUG:wyoming_faster_whisper.handler:Sent info
DEBUG:wyoming_faster_whisper.handler:Sent info
DEBUG:wyoming_faster_whisper.handler:Sent info
DEBUG:wyoming_faster_whisper.handler:Sent info
DEBUG:wyoming_faster_whisper.handler:Sent info
DEBUG:wyoming_faster_whisper.handler:Sent info
DEBUG:wyoming_faster_whisper.handler:Sent info
DEBUG:wyoming_faster_whisper.handler:Sent info
DEBUG:wyoming_faster_whisper.handler:Sent info
DEBUG:wyoming_faster_whisper.handler:Sent info
DEBUG:wyoming_faster_whisper.handler:Sent info
DEBUG:wyoming_faster_whisper.handler:Sent info
DEBUG:wyoming_faster_whisper.handler:Sent info
DEBUG:wyoming_faster_whisper.handler:Sent info
DEBUG:wyoming_faster_whisper.handler:Sent info
DEBUG:wyoming_faster_whisper.handler:Language set to pl
DEBUG:wyoming_faster_whisper.handler:Audio stopped. Transcribing with initial prompt=null
INFO:faster_whisper:Processing audio with duration 00:15.000
DEBUG:faster_whisper:Processing segment at 00:00.000
DEBUG:faster_whisper:Log probability threshold is not met with temperature 0.0 (-1.965979 < -1.000000)
DEBUG:faster_whisper:Compression ratio threshold is not met with temperature 0.2 (11.754386 > 2.400000)
DEBUG:faster_whisper:Compression ratio threshold is not met with temperature 0.4 (17.882353 > 2.400000)
DEBUG:faster_whisper:Log probability threshold is not met with temperature 0.6 (-1.734122 < -1.000000)

Piper log:

[13:49:02] INFO: Successfully send discovery information to Home Assistant.
s6-rc: info: service discovery successfully started
s6-rc: info: service legacy-services: starting
s6-rc: info: service legacy-services successfully started
s6-rc: info: service s6rc-oneshot-runner: starting
s6-rc: info: service s6rc-oneshot-runner successfully started
s6-rc: info: service fix-attrs: starting
s6-rc: info: service fix-attrs successfully started
s6-rc: info: service legacy-cont-init: starting
s6-rc: info: service legacy-cont-init successfully started
s6-rc: info: service piper: starting
s6-rc: info: service piper successfully started
s6-rc: info: service discovery: starting
INFO:__main__:Ready

openWakeWord log:

[14:03:22] INFO: Successfully sent discovery information to Home Assistant.
s6-rc: info: service discovery successfully started
s6-rc: info: service legacy-services: starting
s6-rc: info: service legacy-services successfully started
DEBUG:wyoming_openwakeword.handler:Client connected: 2842962425326
DEBUG:wyoming_openwakeword.handler:Sent info to client: 2842962425326
DEBUG:wyoming_openwakeword.handler:Client disconnected: 2842962425326
DEBUG:wyoming_openwakeword.handler:Client connected: 2875055529844
DEBUG:wyoming_openwakeword.handler:Sent info to client: 2875055529844
DEBUG:wyoming_openwakeword.handler:Client disconnected: 2875055529844
DEBUG:wyoming_openwakeword.handler:Client connected: 2907136330013
DEBUG:wyoming_openwakeword.handler:Sent info to client: 2907136330013
DEBUG:wyoming_openwakeword.handler:Client disconnected: 2907136330013
DEBUG:wyoming_openwakeword.handler:Client connected: 2939215276741
DEBUG:wyoming_openwakeword.handler:Sent info to client: 2939215276741
DEBUG:wyoming_openwakeword.handler:Client disconnected: 2939215276741
DEBUG:wyoming_openwakeword.handler:Client connected: 2971302583015
DEBUG:wyoming_openwakeword.handler:Sent info to client: 2971302583015
DEBUG:wyoming_openwakeword.handler:Client disconnected: 2971302583015
DEBUG:wyoming_openwakeword.handler:Client connected: 3003386896928
DEBUG:wyoming_openwakeword.handler:Sent info to client: 3003386896928
DEBUG:wyoming_openwakeword.handler:Client disconnected: 3003386896928
DEBUG:wyoming_openwakeword.handler:Client connected: 3035476542263
DEBUG:wyoming_openwakeword.handler:Sent info to client: 3035476542263
DEBUG:wyoming_openwakeword.handler:Client disconnected: 3035476542263
DEBUG:wyoming_openwakeword.handler:Client connected: 3067574634806
DEBUG:wyoming_openwakeword.handler:Sent info to client: 3067574634806
DEBUG:wyoming_openwakeword.handler:Client disconnected: 3067574634806
DEBUG:wyoming_openwakeword.handler:Client connected: 3099672165513
DEBUG:wyoming_openwakeword.handler:Sent info to client: 3099672165513
DEBUG:wyoming_openwakeword.handler:Client disconnected: 3099672165513
DEBUG:wyoming_openwakeword.handler:Client connected: 3131771453846
DEBUG:wyoming_openwakeword.handler:Sent info to client: 3131771453846
DEBUG:wyoming_openwakeword.handler:Client disconnected: 3131771453846
DEBUG:wyoming_openwakeword.handler:Client connected: 3163870640985
DEBUG:wyoming_openwakeword.handler:Sent info to client: 3163870640985

The wake word still does not work. Only control from HA works (i.e. Asystent, Led strip, mute, On board light), and it works in both directions: pressing the mute button on the “speaker” changes the state in HA and lights up the LED strip, differently depending on the Mute state.

boskikak · 19 January 2025 14:50

Only the logs from the ESP will be of interest to you. You also need to be sure that the microphone works, because without that we will be searching endlessly.
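
The device logs can be read from the ESPHome dashboard (the LOGS button of the device), and their level of detail is controlled by the logger: section. A minimal sketch, assuming the default DEBUG level is enough to see wake word and pipeline messages, and that light is the log tag used by the light component:

```yaml
logger:
  level: DEBUG     # ESPHome's default; VERBOSE prints more but slows the device
  logs:
    light: INFO    # assumption: hides LED state chatter so other messages stand out
```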

Krzysiek2 · 19 January 2025 15:23

Sorry, maybe a silly question, but how do I check whether the microphone works? (I have already swapped it for a second one, with no effect.)

boskikak · 19 January 2025 17:05

e.g. like this: