Week #6

Final deliverables #

Project overview #

Our project is one of the services of the InNoHassle ecosystem, a unified digital platform created to simplify the daily routine of Innopolis University students.

Problems of IU students:

  • All information is scattered across different sources
  • There is no single reliable knowledge base

Solution:
The Search service searches all relevant resources available to Innopolis students.

Features #

  • Search engine for IU-related sites

    • User hints in the form of default queries
    • Filtering answers by sources
    • Visual preview of websites
    • Text preview of the answer from the website
    • The answer link leads directly to the desired section of the page
  • AI system answering questions about IU

    • LLM answers (openai gpt-4.1-mini)
    • Links to references
  • AI interface performing actions for the user

    • Booking a music room in response to a free-form user request, or reporting that the booking is not possible
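
The booking step behind the last feature can be illustrated with a minimal sketch. It assumes the free-form request has already been turned into a start and end time; the `Slot` type, the `try_book` helper, and the message wording are hypothetical and are not taken from the project's code.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Slot:
    """A booking interval (hypothetical type, for illustration only)."""
    start: datetime
    end: datetime


def overlaps(a: Slot, b: Slot) -> bool:
    # Two half-open intervals [start, end) overlap when each starts before the other ends.
    return a.start < b.end and b.start < a.end


def try_book(requested: Slot, existing: list[Slot]) -> tuple[bool, str]:
    """Book the slot if it is free; otherwise explain why booking is not possible."""
    if requested.end <= requested.start:
        return False, "The end time must be after the start time."
    for slot in existing:
        if overlaps(requested, slot):
            return False, f"The room is already booked from {slot.start:%H:%M} to {slot.end:%H:%M}."
    existing.append(requested)
    return True, f"Booked from {requested.start:%H:%M} to {requested.end:%H:%M}."
```

Either branch returns a message, so the interface always has something to show the user, whether the booking succeeded or not.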

Sources of information: #

Sites that can be accessed without logging in as an Innopolis student are parsed every hour, so our search engine answers from their up-to-date content.
These sources include:

Sites available only to students are considered confidential. We created, and manually maintain, a database that describes in general terms what can be found on each of them. Users of our service see only this description until they go to the original site.
These sources include:

Tech stack #

Backend:

  • Python 3.11: The main programming language used for backend development.
  • uv (migrated from poetry this week): Dependency and package manager for Python projects; used to manage virtual environments and libraries.
  • FastAPI: High-performance web framework for building RESTful APIs.
  • Pydantic: Data validation and parsing with type hints
  • MongoDB: NoSQL document-oriented database used to store application data.

Frontend:

  • TypeScript, Node.js: Type-safe programming language (TypeScript) and runtime environment (Node.js) used for frontend tooling and development.
  • React: JavaScript library for building user interfaces.
  • Vite: Fast build tool and development server for modern frontend frameworks like React.
  • TailwindCSS: Utility-first CSS framework used for styling components with minimal custom CSS.

ML:

  • LanceDB: Vector database optimized for similarity search and storing embeddings.
  • Bi-encoder, cross-encoder: Models used for semantic search and ranking; bi-encoders for fast retrieval, cross-encoders for accurate re-ranking.
  • Infinity: High-performance inference engine for deploying and running ML models efficiently.
  • langdetect: Language detection library to identify the language of input text.
  • LLM openai/gpt-4.1-mini: Smaller, lower-cost model in OpenAI’s GPT-4.1 family, used for tasks like summarization, generation, or semantic understanding.
  • facebook/m2m100_418M: Multilingual translation model from Facebook (Meta) used for machine translation across multiple languages.
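
The bi-encoder and cross-encoder entries describe a two-stage retrieve-then-rerank pattern. The sketch below illustrates only the shape of that pattern: `embed` and `cross_score` are toy stand-ins for the real models, and all names are hypothetical rather than the project's code.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a bi-encoder: a bag-of-words vector computed per text."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cross_score(query: str, doc: str) -> float:
    """Toy stand-in for a cross-encoder: scores the (query, doc) pair jointly.

    Here it is simply the fraction of query words that occur in the document.
    """
    q_words = query.lower().split()
    d_words = set(doc.lower().split())
    return sum(w in d_words for w in q_words) / len(q_words) if q_words else 0.0


def search(query: str, docs: list[str], retrieve_k: int = 3, final_k: int = 2) -> list[str]:
    # Stage 1 (fast retrieval): compare independently computed vectors
    # and keep the top retrieve_k candidates.
    q_vec = embed(query)
    candidates = sorted(docs, key=lambda d: cosine(q_vec, embed(d)), reverse=True)[:retrieve_k]
    # Stage 2 (re-ranking): score each surviving (query, doc) pair jointly
    # and keep the top final_k results.
    return sorted(candidates, key=lambda d: cross_score(query, d), reverse=True)[:final_k]
```

The split matters because document vectors can be precomputed and stored in a vector database, while pairwise scoring has to run per query and is therefore applied only to the few retrieved candidates.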

Setup instructions #

See the README.md in each repository for detailed instructions:

Presentation draft #

Use a VPN to access our presentation.

Weekly commitments #

Team management #

Anna prioritized the existing tasks (see the Kanban board, search project), assigned the top-priority tasks to team members, reviewed the team's work, supervised the redesign process, and communicated with the customers from one-zero-eight.

ML #

Anna fixed how the system prompt is passed and changed its structure (see commits #1, #2).

Sofia implemented:

  • Encoder change: the bi- and cross-encoders were updated, which increased the accuracy and speed of the search (see commits #1 and #2)

  • Optimization of the LLM response backend: if the LLM reports that there is no information, the backend sends no resources to the client; when information is found, it returns only the filtered, most relevant results (see commit)

  • Implementation of the ACT module: added a service for booking a music room with endpoints /act/availability and /act/book and saving statistics to the database (see commits #1 and #2).
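
The response-filtering rule from the second item above can be sketched as follows. This is a minimal illustration under stated assumptions: the `found` flag, the `score` field, the threshold, and the result limit are all hypothetical, since the report does not say how the backend detects a "no information" answer or how it ranks resources.

```python
from dataclasses import dataclass


@dataclass
class Source:
    url: str
    score: float  # hypothetical relevance score from the ranking stage


@dataclass
class LlmResult:
    answer: str
    found: bool  # hypothetical flag: False when the model reports no information


def build_response(result: LlmResult, sources: list[Source],
                   min_score: float = 0.5, limit: int = 3) -> dict:
    # No information found: return the answer text without any resources.
    if not result.found:
        return {"answer": result.answer, "sources": []}
    # Information found: keep only sources above the threshold, best first.
    relevant = sorted(
        (s for s in sources if s.score >= min_score),
        key=lambda s: s.score,
        reverse=True,
    )
    return {"answer": result.answer, "sources": [s.url for s in relevant[:limit]]}
```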

Backend #

Anna added typing for static resources on the backend (commits #1, #2). She also added handling of the InNoHassle user token so that the act service can make requests on the user's behalf (see commit).
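
Acting on a user's behalf usually means attaching the user's token to the outgoing request. The sketch below shows that idea with the standard library only; the `Bearer` scheme, the function name, and the example URL are assumptions, not details confirmed by the report.

```python
import urllib.request


def build_forwarded_request(url: str, user_token: str) -> urllib.request.Request:
    """Build (without sending) a request that carries the user's token.

    Assumes the downstream service accepts an "Authorization: Bearer ..." header.
    """
    if not user_token:
        raise ValueError("a user token is required to act on the user's behalf")
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {user_token}"},
        method="GET",
    )
```

Rejecting an empty token early keeps the service from sending an anonymous request that the downstream API would refuse anyway.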

Vladimir added ITHelp Wiki to the resources we use: he analysed each page and prepared it for the databases (see commit). He also migrated our project from poetry to uv (see commits #1 and #2).

Azaliia fixed the parsers for eduwiki and campus_life: the redirection issue is solved and section parsing is corrected (see commits #1, #2).
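
Section parsing is what lets an answer link point to a specific part of a page. A minimal sketch of the idea, using only the standard library, is shown below; it assumes that sections start at `h1` to `h3` headings carrying an `id` attribute, which may not match how the real eduwiki and campus_life parsers work.

```python
from html.parser import HTMLParser


class SectionParser(HTMLParser):
    """Collects page text grouped under the nearest preceding heading."""

    HEADINGS = {"h1", "h2", "h3"}

    def __init__(self) -> None:
        super().__init__()
        self.sections: list[dict] = []
        self._in_heading = False

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            # A heading opens a new section; its id (if any) becomes the anchor.
            self._in_heading = True
            self.sections.append({"anchor": dict(attrs).get("id"), "title": "", "text": ""})

    def handle_endtag(self, tag):
        if tag in self.HEADINGS:
            self._in_heading = False

    def handle_data(self, data):
        if not self.sections:
            return  # text before the first heading is ignored in this sketch
        key = "title" if self._in_heading else "text"
        self.sections[-1][key] += data


def split_sections(html: str, base_url: str) -> list[dict]:
    parser = SectionParser()
    parser.feed(html)
    parser.close()
    result = []
    for s in parser.sections:
        # Link straight to the section when the heading has an id.
        link = f"{base_url}#{s['anchor']}" if s["anchor"] else base_url
        result.append({
            "link": link,
            "title": s["title"].strip(),
            "text": " ".join(s["text"].split()),
        })
    return result
```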

Frontend #

Anna added correct filtering of static resources: InNoHassle, My University, ITHelp Wiki (commits #1, #2)

Aliia:

  • Redesigned the interface in accordance with user testing feedback: see Figma
  • Updated the design and made it adaptive for small screens: see commit #1
  • Added link highlighting in the ask response: #2
  • Updated the code according to the new design: #3, #4, #5
  • Added handling of 502 responses for ask: #6
  • Connected the act page to the backend: #7

Overall progress #

Redesign #

One of our most significant advances is the redesign, which is based on the results of user testing sessions. We moved the functionality from buttons to separate sections of the site, because the behaviour of the buttons raised questions among users. We also changed the filter field.

Old design: Image

Updated design: Image

Act functionality #

Another important advance was connecting the act functionality. Users can now book a music room for the time they want with a free-form request.
Request: Image
Result:

Individual contribution of each participant #

Team Member | Contribution
Anna Belyakova (Lead) | See team management, backend, frontend and ML sections
Vladimir Paskal | See backend section
Azaliia Alisheva | See backend section
Aliia Bashirova | See frontend section
Sofia Pushkareva | See ML section

Plan for Next Week #

  • Separate static resources on the frontend
  • Implement the ability to continue the conversation in the ask section
  • Add visual previews to static resources
  • Fix the preview text in the search section
  • Prepare for the presentation

Confirmation of the code’s operability #

Note: the working code is on the main branch of the backend repository and on the capstone branch of the frontend repository (because of difficulties with automatic deployment).

  • In working condition.
  • Run via docker-compose (or another alternative described in the README.md).