We provide system development built around the following software and technologies, tailored to your specific needs.
We may also be able to accommodate software and technologies not listed here. Please feel free to contact us.
Speech Recognition
-
Whisper
A highly accurate speech recognition model developed by OpenAI. Supports multiple languages and various audio environments.
Spoken Dialogue & Avatar Interaction
-
MMDAgent-EX
An avatar-based voice interaction toolkit enabling the construction of dialogue systems with 3D characters. Developed by our director, Akinobu Lee.
LLM & Generative AI
-
ChatGPT / GPT API
OpenAI's large language models. They support a wide range of applications, including conversational AI, text generation, and code generation.
-
Claude / Anthropic API
Anthropic's large language model. Delivers high-quality AI responses with emphasis on long-document understanding, analysis, and safety.
-
Gemini / Google AI
Google's large language model. Excels at multimodal processing and integrates seamlessly with Google services.
-
LLaMA / Open-Source LLMs
Meta's openly available LLM family. Enables on-premises deployment and customization for specific use cases.
Others
-
Google Speech-to-Text API
A cloud-based speech recognition service provided by Google. Supports both real-time recognition and batch processing.
-
Multimodal Device Integration
We build complex interaction systems that combine various sensors and devices to support visual, auditory, and gesture-based interaction.