
Week 1 #

Project description #


Project name: VoiceDiary

Code repository: link

VoiceDiary is an AI-powered voice journaling tool that analyzes tone, emotion, and key themes in spoken entries. It generates personalized emotional insights and well-being suggestions based on recorded reflections.

Team #


| Team member | Telegram alias | Innopolis Email | Responsibilities |
|---|---|---|---|
| Dziyana Melnikava | @meldilen24 | dz.melnikava@innopolis.university | PM, Frontend |
| Anastasia Kuchumova | @n_rngk | a.kuchumova@innopolis.university | Frontend, UX/UI |
| Dzhamilia Fatkullina | @jam11a | d.fatkullina@innopolis.university | ML |
| Elina Kuzmichyova | @lin_anile | e.kuzmichyova@innopolis.university | ML |
| Olesia Novoselova | @doiwannaknoww8 | o.novoselova@innopolis.university | Backend |
| Danil Davydyan | @chocop | d.davydyan@innopolis.university | Backend |

Brainstorming #


Ideas during brainstorming #

| Rank | Idea |
|---|---|
| 1 | An AI-based voice journaling platform that processes spoken reflections to detect emotional tone, mood trends, and core topics, offering personalized mental well-being feedback. |
| 2 | A self-growth tracker where users build and evolve virtual "selves" representing different aspects of personality. |
| 3 | A sleep-integrated dream and emotion journal that analyzes voice-recorded dreams and detects subconscious emotional trends. |

Problem validation #

1. VoiceDiary

Problem: People often struggle to identify and understand their emotional patterns, lack time or motivation for traditional journaling, and find it hard to reflect consistently.

Validation:

  • Voice is perceived as a more natural and less effort-intensive method of self-expression in self-reflection contexts.

2. PersonaPulse

Problem: Many people feel disconnected from different parts of their personality (e.g., professional vs emotional self) and lack structured tools to develop personal traits in a holistic and engaging way.

Validation:

  • Behavioral patterns suggest that people engage more when self-development is approached playfully or symbolically rather than abstractly.

3. Sleep-Sleep

Problem: Dreams are a rich source of emotional insight, but most people forget them quickly or lack a structured way to reflect on their subconscious thoughts.

Validation:

  • Cognitive psychology suggests that reflecting on dream content can increase awareness of unresolved emotional issues.

Basic Requirements #


Target Users and Their Primary Needs #

Reflective Individuals: People interested in understanding themselves better, journaling, or mental wellness

  • Need: A natural, non-intrusive way to reflect emotionally

Busy Individuals: People with limited time for self-reflection or therapy

  • Need: A low-effort tool to track emotions and stress without interrupting routine

User Stories #

  • As a user, I want to speak freely into the app so I can reflect without needing to write or plan.
  • As a user, I want the experience to require minimal effort so I can use it even on busy or stressful days.
  • As a user, I want the app to identify patterns in my mood and emotions over time so I can understand my mental state (statistics).
  • As a user, I want the system to recognize tone and key themes in what I say so I can gain deeper self-insights.
  • As a user, I want to receive recommendations so I can take small steps toward emotional well-being.
  • As a user, I want the app to feel like a personal, private space so I can be honest and unfiltered in my entries.

Initial Scope (MVP) #


| IN Scope (W1–W3) | OUT (Post-MVP) |
|---|---|
| Basic homepage with embedded calendar | Authentication + enhanced homepage with achievements |
| Integration with 1 emotion recognition model | Voice record analysis + recommendations |
| No statistics | Emotion statistics (weekly/monthly/yearly) |

Tech Stack #


| Component | Choice | Justification |
|---|---|---|
| Frontend | React | For building a responsive, modern web interface with reusable components and efficient state management. |
| Backend | Golang | High performance, concurrency support, and fast API response times; suitable for real-time web services. |
| ML API | Python | Extensive machine learning libraries and seamless integration with pretrained models. |
| ML Models | Whisper (by OpenAI), Emotion_Recognition_From_Speech, xlsr-wav2vec-speech-emotion-recognition | Open-source, speech-focused models with high accuracy; suitable for emotion detection from voice; will be tested and selected based on performance. |
| | LLaMA | To generate contextual emotional feedback and personalized recommendations based on the analysis. |
| Database | PostgreSQL | Reliable relational database with strong consistency guarantees and support for structured data (e.g. logs, users, entries). |
| Auth | OAuth | Standard and secure method for third-party authentication, reduces password handling and improves UX. |
| Infra | Docker + Docker Compose | Ensures consistent development and production environments, simplifies deployment. |
| | GitHub Actions (CI/CD) | Enables continuous integration, testing, and deployment with minimal manual effort. |
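
To make the ML flow above more concrete, here is a minimal sketch of how the transcription, emotion recognition, and LLM prompt-building steps could fit together inside the Python ML API. It assumes the open-source `openai-whisper` package and the Hugging Face `transformers` audio-classification pipeline, and that the listed wav2vec model is compatible with that pipeline; the model size, file name, and prompt wording are illustrative placeholders, and the LLaMA call itself is left out.

```python
# Minimal sketch of the planned analysis step (illustrative, not final code).
# Assumes the openai-whisper package and the Hugging Face transformers
# audio-classification pipeline; names and the prompt format are placeholders.
import whisper
from transformers import pipeline


def analyze_entry(audio_path: str) -> dict:
    """Transcribe a voice entry and estimate its dominant emotion."""
    # 1. Speech-to-text with Whisper (model size chosen only for the example).
    asr_model = whisper.load_model("base")
    transcript = asr_model.transcribe(audio_path)["text"]

    # 2. Emotion recognition with one of the candidate models from the table,
    #    assuming it works with the generic audio-classification pipeline.
    emotion_classifier = pipeline(
        "audio-classification",
        model="harshit345/xlsr-wav2vec-speech-emotion-recognition",
    )
    emotions = emotion_classifier(audio_path)  # list of {"label", "score"}

    # 3. Build a prompt for the LLM feedback step (LLaMA call omitted here).
    top_emotion = max(emotions, key=lambda e: e["score"])
    prompt = (
        f"The user sounded {top_emotion['label']} in this diary entry:\n"
        f"{transcript}\n"
        "Suggest one small, supportive well-being step."
    )
    return {"transcript": transcript, "emotions": emotions, "prompt": prompt}


if __name__ == "__main__":
    print(analyze_entry("entry.wav"))
```

The actual choice between the candidate emotion models is expected to follow the comparative testing mentioned in the table and in the individual commitments below.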

Collective commitments #


Met with the team, discussed our project visions, and finalized the system architecture and technology stack after a thorough discussion of all ideas.

Individual commitments #


| Team member | Telegram alias | Contribution | Link |
|---|---|---|---|
| Dziyana Melnikava (Lead) | @meldilen24 | Developed the frontend project structure and finalized the UX/UI design, agreeing on the layout and visuals, with references sourced from Pinterest | https://github.com/IU-Capstone-Project-2025/VoiceDiary/commit/77cfcc49a1ce1f5ee2a28c466bec73250009cc60, https://pin.it/35So4qndM |
| Anastasia Kuchumova | @n_rngk | Developed the frontend project structure and finalized the UX/UI design, agreeing on the layout and visuals, with references sourced from Pinterest | https://github.com/IU-Capstone-Project-2025/VoiceDiary/commit/d81e7231d7c82c923c435448f21307478cb15807, https://github.com/IU-Capstone-Project-2025/VoiceDiary/commit/2684b933021ed546455311b5e23e13603908d098, https://pin.it/35So4qndM |
| Dzhamilia Fatkullina | @jam11a | Researched emotion recognition models and datasets, selecting the most suitable ones for the project | https://huggingface.co/harshit345/xlsr-wav2vec-speech-emotion-recognition, https://huggingface.co/dmdoy/Emotion_Recognition_From_Speech, https://github.com/openai/whisper/blob/main/model-card.md, https://www.kaggle.com/uwrfkaggler/ravdess-emotional-speech-audio, https://www.kaggle.com/ejlok1/cremad, https://www.kaggle.com/ejlok1/toronto-emotional-speech-set-tess, https://www.kaggle.com/datasets/ejlok1/surrey-audiovisual-expressed-emotion-savee, https://github.com/salute-developers/golos/tree/master/dusha#dusha-dataset, and a few others |
| Elina Kuzmichyova | @lin_anile | Read research papers on the selected models, conducted a comparative analysis, and outlined the machine learning implementation plan | https://arxiv.org/abs/2212.04356, https://arxiv.org/abs/2212.12266, https://huggingface.co/dmdoy/Emotion_Recognition_From_Speech |
| Olesia Novoselova | @doiwannaknoww8 | Wrote the project report and user stories | this commit ; ) |
| Danil Davydyan | @chocop | Set up the backend infrastructure and started developing the project backend | https://github.com/IU-Capstone-Project-2025/VoiceDiary/commit/ab02bd635be5262cbbfa3920b1994cfb2e57dda4 |

Confirmation of the code’s operability #

We confirm that the code in the main branch:

  • [✓] is in working condition;
  • [✓] can be run via docker-compose (or another alternative described in the README.md).