A React + Express application demonstrating the Azure Speech Services text-to-speech avatar: real-time speech synthesis paired with a synchronized visual avatar streamed over WebRTC.
- Real-time Avatar Synthesis: Visual avatar with synchronized lip movements
- WebRTC Integration: Low-latency streaming of avatar video and audio
- Chat Interface: Interactive conversation with message history
- Azure Speech Services: Powered by Azure Cognitive Services
- TypeScript Support: Full type safety for both frontend and backend
- Node.js (v18+)
- npm or yarn
- Azure Speech Services subscription (Standard S0 tier required for Avatar)
- Modern browser with WebRTC support (Chrome, Edge, Firefox, Safari)
```
test-AZ-speech/
├── backend/              # Express.js server
│   ├── src/
│   │   └── index.ts      # Main server file with API endpoints
│   ├── dist/             # Compiled JavaScript
│   ├── package.json
│   ├── tsconfig.json
│   └── .env.example      # Example environment variables
│
└── frontend/             # React application
    ├── src/
    │   ├── App.tsx       # Main component with avatar logic
    │   ├── App.css       # Styling
    │   └── index.tsx     # Entry point
    ├── public/
    └── package.json
```
```
git clone git@github.com:marcus888-techstack/test-AZ-Speech-Avatar.git
cd test-AZ-Speech-Avatar
```

- Install backend dependencies:

```
cd backend
npm install
```

- Configure environment variables:

```
# Copy the example environment file
cp .env.example .env
```

Edit `.env` with your Azure credentials:

```
AZ_SPEECH_KEY=your_azure_speech_key_here
AZ_SPEECH_REGION=your_azure_region_here
PORT=3000
```

- Install frontend dependencies:

```
cd ../frontend
npm install
```

- Start the backend server:

```
cd backend
npm run dev
```

The server will run on http://localhost:3000.

- Start the frontend (in a new terminal):

```
cd frontend
npm start
```

The application will open in your browser. Since the backend already occupies port 3000, Create React App will prompt you to use the next free port (typically http://localhost:3001).
- Build the backend:

```
cd backend
npm run build
npm start
```

- Build the frontend:

```
cd frontend
npm run build
```

Serve the `build` folder with any static file server.
- Open the application in your browser
- Wait for the avatar to initialize (you'll see "Initializing avatar...")
- Type a message in the chat input
- Press Enter or click Send
- The avatar will appear and speak your message with synchronized lip movements
- Continue the conversation!
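When the avatar speaks a chat message, the text is typically wrapped in SSML before synthesis. Below is a minimal sketch of how the frontend might build that payload — the helper names and the `en-US-JennyNeural` voice are illustrative assumptions, not taken from this repo:

```typescript
// Escape characters that are special in XML so user text can't break the SSML.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Build an SSML document for the synthesizer.
// NOTE: voice name is an assumption; use any neural voice your region supports.
function buildSsml(text: string, voice = "en-US-JennyNeural"): string {
  return (
    `<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">` +
    `<voice name="${voice}">${escapeXml(text)}</voice>` +
    `</speak>`
  );
}
```

The resulting string would then be passed to the Speech SDK's SSML synthesis call; escaping matters because chat input is arbitrary user text.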
| Endpoint | Method | Description |
|---|---|---|
| `/api/health` | GET | Health check endpoint |
| `/api/speech/token` | GET | Get Azure Speech token and region |
| `/api/speech/ice-token` | GET | Get ICE server configuration for WebRTC |
| `/api/speech/synthesize` | POST | Optional: server-side speech synthesis |
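For context, a token endpoint like `/api/speech/token` usually exchanges the subscription key for a short-lived token via Azure's standard token-issuing (STS) endpoint. A hedged sketch — the function names are illustrative, and the global `fetch` assumes Node 18+:

```typescript
// Azure's standard token-issuing endpoint for a given region.
function tokenUrl(region: string): string {
  return `https://${region}.api.cognitive.microsoft.com/sts/v1.0/issueToken`;
}

// Exchange the subscription key for a short-lived access token.
// The backend would call this and return { token, region } to the frontend,
// so the subscription key itself never leaves the server.
async function issueSpeechToken(key: string, region: string): Promise<string> {
  const res = await fetch(tokenUrl(region), {
    method: "POST",
    headers: { "Ocp-Apim-Subscription-Key": key },
  });
  if (!res.ok) {
    throw new Error(`Token request failed with status ${res.status}`);
  }
  return res.text(); // Azure returns the token as plain text
}
```

Keeping the key server-side and handing the browser only a short-lived token is the main reason this endpoint exists.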
- Create an Azure Speech Services resource in the Azure Portal
- Select the Standard S0 pricing tier (required for Avatar)
- Copy your key and region
- Add them to the backend `.env` file
Available avatar characters:

- `lisa` (default)
- `jason`
- More are listed in the Azure documentation

Available avatar styles:

- `casual-sitting` (default)
- `technical-standing`
- `business-standing`
- More are listed in the Azure documentation
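As a sketch of how the defaults above might be applied, here is a small helper that resolves a character/style pair before building the avatar configuration — the type and function names are illustrative assumptions, not taken from this repo:

```typescript
// Hypothetical options shape for selecting an avatar appearance.
interface AvatarOptions {
  character?: string;
  style?: string;
}

// Fill in the documented defaults when the caller omits a field.
function resolveAvatarOptions(opts: AvatarOptions = {}): Required<AvatarOptions> {
  return {
    character: opts.character ?? "lisa",       // default character
    style: opts.style ?? "casual-sitting",     // default style
  };
}
```

The resolved pair would then be passed to the Speech SDK's avatar configuration when the session is created.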
- Check browser console for errors
- Verify the Azure credentials in `.env`
- Ensure you're using the Standard S0 pricing tier
- Check that your region supports the Avatar feature
- Ensure both backend and frontend are running
- Check CORS configuration
- Verify firewall settings allow WebRTC connections
- Check browser permissions for audio/video
- Ensure WebRTC is supported in your browser
- Try using a different browser
- Never commit `.env` files with real credentials
- Use environment variables for production deployments
- Consider implementing authentication for production use
- Use HTTPS in production for WebRTC security
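One way to enforce the environment-variable practice above is to fail fast at startup when required secrets are missing, instead of letting requests fail later with confusing Azure errors. A minimal sketch (the helper name and the placement in `backend/src/index.ts` are illustrative assumptions):

```typescript
// Return the value of a required environment variable, or throw immediately.
function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// e.g. near the top of backend/src/index.ts:
// const speechKey = requireEnv("AZ_SPEECH_KEY");
// const speechRegion = requireEnv("AZ_SPEECH_REGION");
```

Crashing at boot with a clear message is easier to debug than a 500 on the first `/api/speech/token` call.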
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Azure Speech Services team for the Avatar API
- Microsoft Cognitive Services Speech SDK
- React and Express.js communities
Built with ❤️ using React, Express, and Azure Speech Services