Introduction

This project implements a chat interface with voice input and output, running on a microcontroller with a display. It uses dedicated modules for Wi-Fi connectivity, audio processing, and UI rendering. The script talks to an AI model (Claude, via Anthropic's API) to generate responses and can execute tools based on the model's instructions.

Main Components

  1. Imports and Initializations

    • Imports necessary modules and initializes hardware and UI components.
    • Sets up the Wi-Fi connection using provided credentials (see the sketch after this list).
  2. User Interface

    • Utilizes the LVGL library for creating a graphical user interface.
    • Includes a chat container, status label, and a record button.
  3. Audio Processing

    • Implements recording and playback functionalities.
    • Uses OpenAI's API for speech-to-text conversion.
  4. AI Interaction

    • Communicates with an AI model (Claude) through Anthropic's API.
    • Supports tool execution based on AI responses.
  5. Asynchronous Operations

    • Uses asyncio for managing concurrent tasks.
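
A minimal sketch of the Wi-Fi setup step referenced above, using MicroPython's standard network module (the SSID and password are placeholders):

    import network
    import time

    def connect_wifi(ssid, password):
        # Bring up the station interface and block until connected.
        wlan = network.WLAN(network.STA_IF)
        wlan.active(True)
        if not wlan.isconnected():
            wlan.connect(ssid, password)
            while not wlan.isconnected():
                time.sleep_ms(100)
        return wlan.ifconfig()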

Key Functionalities

1. UI Initialization (init_ui())

2. Message Handling

3. Input Processing

4. AI Interaction

5. Tool Execution

6. Audio Playback

7. Main Loop (main())
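
The asyncio wiring might look roughly like this (the task names are assumptions, not the script's actual identifiers):

    import asyncio

    async def main():
        init_ui()
        asyncio.create_task(ui_refresh_task())     # keep the LVGL display updated
        asyncio.create_task(record_button_task())  # watch the record button
        while True:
            await asyncio.sleep_ms(50)             # yield so other tasks can run

    asyncio.run(main())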

Usage

The script is designed to run on a microcontroller with the necessary hardware components (display, buttons, microphone, speaker). It creates an interactive chat interface where users can have voice conversations with an AI assistant.

Dependencies

LLM and audio APIs (Anthropic for chat responses, OpenAI for speech-to-text and text-to-speech), together with the Wi-Fi, LVGL, asyncio, and I2S support described in the rest of this document.

Overview

The api_utils.py file contains utility functions for interacting with various APIs, primarily focused on AI language models and speech processing. It provides asynchronous functions for tasks such as communicating with AI models, text-to-speech conversion, and speech-to-text transcription.

Functions

1. llm(api_key, messages, max_tokens=8192, temp=0, system_prompt=None, tools=None)

Interacts with Anthropic's Claude language model to generate responses.
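
A hedged usage sketch based on the signature above (the message format follows Anthropic's chat convention; the return shape is an assumption):

    async def ask(api_key, question):
        messages = [{"role": "user", "content": question}]
        # system_prompt and tools are optional; see prompt_utils.py and tools_default.py.
        return await llm(api_key, messages, max_tokens=1024, temp=0)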

2. text_to_speech(api_key, text, acallback=None)

Converts text to speech using OpenAI's API.
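
In practice i2s_utils.play() calls this for you; a direct call might look like the sketch below (what the function returns versus streams through acallback is an assumption):

    async def say(api_key, text):
        # acallback, if supplied, is invoked as audio data arrives.
        return await text_to_speech(api_key, text, acallback=None)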

3. speech_to_text(api_key, bytes_io)

Transcribes speech to text using OpenAI's API.
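
A sketch of transcribing an in-memory WAV recording (the io.BytesIO source is illustrative):

    import io

    async def transcribe(api_key, wav_bytes):
        bytes_io = io.BytesIO(wav_bytes)   # e.g. the buffer produced by i2s_utils.record()
        return await speech_to_text(api_key, bytes_io)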

4. atranscribe(api_key, bytes_io) (Deprecated)

A deprecated function for transcribing speech to text. It's recommended to use speech_to_text instead.

Helper Class

FormData

A utility class for constructing multipart form-data for API requests.
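
The class is internal, but the general shape of a multipart/form-data builder looks roughly like the sketch below (the method names are illustrative, not necessarily the module's own):

    class FormData:
        def __init__(self):
            self.boundary = "----micropython-form-boundary"
            self._parts = []

        def add_field(self, name, value, filename=None, content_type=None):
            # Each field becomes one part: headers, a blank line, then the value.
            headers = 'Content-Disposition: form-data; name="%s"' % name
            if filename:
                headers += '; filename="%s"' % filename
            if content_type:
                headers += "\r\nContent-Type: %s" % content_type
            self._parts.append((headers, value))

        def encode(self):
            body = b""
            for headers, value in self._parts:
                if isinstance(value, str):
                    value = value.encode()
                body += b"--" + self.boundary.encode() + b"\r\n"
                body += headers.encode() + b"\r\n\r\n" + value + b"\r\n"
            return body + b"--" + self.boundary.encode() + b"--\r\n"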

Usage Notes

  1. All main functions (llm, text_to_speech, speech_to_text) are asynchronous and should be used with await in an asynchronous context.

  2. The llm function is designed to work with a specific version of Claude (claude-3-5-sonnet-20240620) and includes support for tool use.

  3. The text_to_speech function is set to generate Spanish speech by default. Modify the language parameter in the payload if a different language is needed.

  4. The FormData class is used internally by speech_to_text to prepare the audio data for the API request.

  5. Error handling is implemented in each function, with errors being printed to the console. In production, you might want to implement more robust error handling and logging.

  6. The deprecated atranscribe function uses a lower-level networking approach. It's kept for reference but should not be used in new code.

Dependencies

Ensure these dependencies are installed and available in your environment when using this module.

LLM Tools

Overview

The tools_default.py file contains utility functions and their descriptions for performing various file operations and running tests. These tools are designed to be used within a larger system, possibly as part of an AI-assisted development environment.

Tool Descriptions and Functions

1. Read File

Reads the contents of a specified file.

2. Write File

Writes data to a specified file.

3. Run File

Executes a specified Python file and returns its output.

4. Run Unit Test

Runs unittest on a specified test file and returns the results.
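
As an illustration of how each tool is paired with a description, a read-file tool could be declared and implemented roughly like this (the exact names and schema fields in tools_default.py may differ):

    read_file_tool_desc = {
        "name": "read_file",
        "description": "Reads the contents of a specified file.",
        "input_schema": {
            "type": "object",
            "properties": {
                "file_path": {"type": "string", "description": "Path of the file to read"},
            },
            "required": ["file_path"],
        },
    }

    def read_file_tool(file_path):
        try:
            with open(file_path, "r") as f:
                return f.read()
        except Exception as e:
            return "Error reading file: %s" % e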

Key Features

  1. Error Handling: All functions include try-except blocks to catch and report errors.
  2. Output Capture: The run_file_tool and run_unittest_tool functions use os.dupterm to capture stdout, allowing them to return the output of executed files.
  3. Output Limitation: The run_file_tool function limits the returned output to the last 1024 characters to reduce token usage.
  4. Dynamic Import: The run_unittest_tool function dynamically imports the test module based on the file path.
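
A rough sketch of the output-capture idea in features 2 and 3, assuming io.StringIO is accepted as a dupterm stream on the target port:

    import io
    import os

    def run_file_tool(file_path):
        buf = io.StringIO()
        try:
            os.dupterm(buf)                # mirror terminal output into the buffer
            with open(file_path) as f:
                exec(f.read(), globals())  # runs in the global namespace (see Usage Notes)
        except Exception as e:
            return "Error running file: %s" % e
        finally:
            os.dupterm(None)               # detach the capture stream
        return buf.getvalue()[-1024:]      # keep only the last 1024 characters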

Usage Notes

  1. These tools are designed to be used programmatically, likely as part of a larger system that can interpret the tool descriptions and call the appropriate functions.
  2. The tool descriptions (*_tool_desc) include an input_schema that defines the expected input format. This can be used for validation or documentation purposes.
  3. File paths are assumed to be relative to the current working directory or absolute paths.
  4. The run_file_tool function executes Python code in the global namespace, which could potentially modify the global state. Use with caution.

Dependencies

The tools rely on the os module (for capturing output via dupterm) and a unittest implementation (for running test files).

Security Considerations

  1. The run_file_tool executes arbitrary Python code, which could be a security risk if used with untrusted input.
  2. File operations are performed without explicit path sanitization, which could allow access to sensitive files if not properly controlled.

Potential Improvements

  1. Add path sanitization to prevent unauthorized file access.
  2. Implement more robust error handling and logging.
  3. Consider adding a configuration option for the output character limit in run_file_tool.
  4. Add support for passing arguments to the executed Python files in run_file_tool.

Prompt contents

Overview

The prompt_utils.py file contains a system prompt (system_prompt) designed to guide an AI assistant in creating or modifying games for the MicroPython 1.23.0 platform. This prompt sets the context and behavior for the AI, ensuring it follows a specific process and considers important factors in MicroPython game development.

Contents

system_prompt

A string variable containing detailed instructions for the AI assistant. The prompt covers the following main areas:

  1. Role Definition: The AI is positioned as an expert game developer specializing in MicroPython 1.23.0.

  2. Process Steps:

    • Determine if the user wants to create a new game or modify an existing one.
    • Guide the user through requirement gathering or modification.
    • Manage requirement documentation in requirements.txt.
    • Read and modify game code (either from game_template.py or existing game.py).
    • Consider game_logs.txt for modifications to existing games.
  3. Development Considerations:

    • Optimize code for speed and limited resources.
    • Utilize MicroPython-specific libraries and functions.
    • Implement sprite usage.
    • For classic games, aim to match NES version features.
  4. Hardware Specifications:

    • Describes the user interface layout, including joystick, display, buttons, speaker, and RGB LED.
  5. Communication Guidelines:

    • Instructs the AI to respond briefly in Spanish, except when thinking or using tools.
    • Advises against showing full requirement lists, preferring brief summaries.

Key Features

  1. Structured Development Process: Provides a clear, step-by-step approach for game development or modification.
  2. MicroPython Optimization: Emphasizes the importance of optimizing for the MicroPython 1.23.0 environment.
  3. User Interaction: Guides the AI to engage with the user for requirement gathering and confirmation.
  4. File Management: Specifies how to handle various files (requirements.txt, game_template.py, game.py, game_logs.txt).
  5. Hardware Awareness: Includes information about the target hardware's interface.

Usage

This system prompt is intended to be used as input for an AI model, setting the context and behavior for game development interactions. It should be provided to the AI at the beginning of a conversation or task related to MicroPython game development.
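
A minimal sketch of wiring the prompt into the llm helper from api_utils.py (the API-key variable and user message are placeholders):

    import asyncio
    from api_utils import llm
    from prompt_utils import system_prompt

    async def start_conversation(api_key, user_text):
        messages = [{"role": "user", "content": user_text}]
        # The system prompt is passed once and governs the whole exchange.
        return await llm(api_key, messages, system_prompt=system_prompt)

    asyncio.run(start_conversation(ANTHROPIC_API_KEY, "Quiero crear un juego nuevo"))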

Important Notes

  1. The prompt is in English, but instructs the AI to respond in Spanish for most interactions.
  2. The AI is instructed not to start coding until requirements are gathered and confirmed.
  3. The prompt assumes the existence of certain files (game_template.py, requirements.txt, game.py, game_logs.txt) in the development environment.

Audio input and output

Overview

The i2s_utils.py file contains utility functions for audio input and output using the I2S (Inter-IC Sound) protocol. It provides asynchronous functions for playing audio (text-to-speech) and recording audio, designed to work with a microcontroller that supports I2S.

Functions

1. play(key, text, cb_complete=None)

Converts text to speech and plays it through an I2S audio output.

2. record(cb_complete=None, timeout=5)

Records audio from an I2S microphone input.
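
A usage sketch, assuming record() returns the in-memory WAV buffer it creates and that OPENAI_API_KEY is a placeholder supplied elsewhere:

    import asyncio
    import i2s_utils

    async def voice_demo(openai_key):
        wav = await i2s_utils.record(timeout=5)           # record up to 5 seconds
        # ... pass wav to api_utils.speech_to_text, build a reply, then speak it:
        await i2s_utils.play(openai_key, "Hola, te escucho.")

    asyncio.run(voice_demo(OPENAI_API_KEY))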

Helper Classes

1. asyncio_Queue

A simple asynchronous queue implementation used in the play function.
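
Such a queue can be as small as the sketch below, which hands audio chunks from the network side to the playback coroutine (an illustration, not necessarily the file's exact implementation):

    import asyncio

    class asyncio_Queue:
        def __init__(self):
            self._items = []
            self._event = asyncio.Event()

        def put_nowait(self, item):
            self._items.append(item)
            self._event.set()

        async def get(self):
            while not self._items:
                await self._event.wait()
                self._event.clear()
            return self._items.pop(0)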

Constants and Configuration

The I2S pin assignments and audio settings are defined as hardcoded module-level constants; see the Usage Notes below before changing them.

Usage Notes

  1. The play function uses the api_utils.text_to_speech function to convert text to speech. Ensure that the necessary API key and network connectivity are available.

  2. Both play and record functions are asynchronous and should be used with await in an asynchronous context.

  3. The cb_complete callback in both functions can be used to implement interrupt logic (e.g., stop playback or recording based on a button press).

  4. The recording function creates a WAV file in memory. Be mindful of memory constraints when recording for extended periods.

  5. The I2S configuration is hardcoded. Modify the constants if different pin assignments or audio settings are needed.

Dependencies

This module expects api_utils (for text-to-speech), the machine.I2S driver, and asyncio to be available in your MicroPython environment.

Error Handling

The functions in this module do not include explicit error handling. In a production environment, you might want to add try-except blocks to handle potential errors, such as I2S initialization failures or API communication issues.
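
For example, playback could be wrapped like this inside the calling coroutine:

    async def safe_play(api_key, text):
        try:
            await i2s_utils.play(api_key, text)
        except OSError as e:
            print("I2S or network error during playback:", e)
        except Exception as e:
            print("Unexpected error during playback:", e)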