Week #5 #

Summary of feedback collection methods and key findings. #

Sessions # Conduct at least three sessions with potential users, describe detailed user reviews here (potential users should NOT be your team members).

Analyze # Describe the important points that you received from the user feedback, what issues and with which priority you created.

Link to updated backlog reflecting feedback. #

Description and demonstration (screenshots/GIFs, links to PRs) of implemented improvements and bug fixes. #

Updated link to the deployed application. #

Frontend Part #

Overview #

This section describes the key improvements made to the frontend interface, focusing on mobile adaptation, user experience enhancements, and API integration optimizations. Major changes include complete mobile responsiveness, redesigned navigation, and improved data loading patterns.

1. Mobile Adaptation Improvements #

Implementation:

Restructured header navigation for narrow mobile screens
Implemented responsive breakpoints for different device sizes
Optimized touch targets for mobile usability

Technical Details:

Used CSS media queries for viewport-specific styling
Implemented hamburger menu pattern for mobile
Commit Reference

Full Mobile Adaptation #

Implementation:

Comprehensive responsive design implementation across all components
Viewport-optimized layouts for all mobile size
Mobile-first approach for critical user flows

Technical Details:

Used CSS media queries for viewport-specific styling
Commit Reference

2. Prediction Interface Enhancements #

Instruction Panel System #

Implementation:

Added interactive tutorial for prediction workflow
Implemented panel transition between instructions and prediction view
Fixed positioning for persistent visibility during scrolling

Technical Details:

React state management for panel visibility
CSS position: sticky for persistent display
Smooth animations for panel transitions
Commit Reference

Forecast Description Window #

Implementation:

Interactive popup explaining prediction methodology
Context-sensitive positioning relative to forecast elements
Accessible design with keyboard navigation support

Technical Details:

Portal-based implementation for proper z-index management
Dynamic positioning calculations
Commit Reference

3. Data Presentation Improvements #

Statistics Panel Enhancements #

Implementation:

Multiple simultaneous stat panel support for comparison
Revised opening/closing logic with improved state management
Conditional rendering optimizations

Technical Details:

Array-based state management for multiple panels
Memoized component rendering
Commit Reference

Team Logos Integration #

Implementation:

Added visual team identifiers across all interfaces
Responsive logo sizing system
Fallback handling for missing assets

Technical Details:

Asset pipeline integration
CDN-based image delivery
Commit Reference

4. API Integration & Data Flow #

Prediction API Optimization #

Implementation:

Restructured response handling for better performance
Percentage calculation fixes
Error state management

Technical Details:

Response normalization layer
Client-side calculation validation
Commit Reference

Team Statistics API Separation #

Implementation:

Decoupled team metadata from statistical data
Parallel loading for improved performance
Independent error handling

Technical Details:

Separate Redux actions/reducers
React Query implementation
Commit Reference

5. Testing & Deployment #

E2E Testing Pipeline Fixes #

Implementation:

Docker container integration for testing
Environment configuration management
Pipeline reliability improvements

Technical Details:

Docker Compose test configuration
CI/CD workflow adjustments
Main Commit | Final Commit

Browser Tab Metadata #

Implementation:

Favicon integration
Title tag standardization
PWA compatibility

Technical Details:

Webpack asset processing
HTML meta tag configuration
Commit Reference

Backend & Data Updates #

Key Improvements #

1. Automated Daily DB Updates #

Implemented cron jobs for scheduled database refreshes
Ensures predictions always use fresh data

2. Enhanced Tournament & Match Data #

Added:
- Country information to tournaments
- Club logos for better visualization
- Precise match timing data
Fixed Predictions API accuracy issues

3. Feature Research for ML Model #

Identified key performance factors for teams
Evaluating data collection methods:
- API integrations
- Web parsing solutions
- Manual data curation

ML part #

Overview #

This part describes considerations for further ML improvements, as well as a description of the implemented parts. Also this week, the 3rd class classifier was replaced with a 2nd class classifier, which allowed us to increase the prediction accuracy from 0.48 to 0.69. This was done based on the considerations that the 2nd class brothel is more in line with the requirements of the product, and it will also increase the accuracy of the prediction right now.

1. API-Compatible Feature Selection #

Objective #

Filter training features to only include those accessible through external APIs to ensure real-world applicability.

Implementation #

Data Audit: Review current feature set and match with API availability
Feature Mapping: Create mapping between internal features and API endpoints
Validation Pipeline: Implement checks to ensure feature consistency between training and prediction

Expected Outcome #

Reduced feature set but improved model reliability in production environment.

2. Feature Categorization #

General Football Features #

Team statistics (goals scored/conceded, possession, shots)
Historical match results
League standings and form
Weather conditions and venue factors

Player-Specific Features #

Individual player statistics (goals, assists, minutes played)
Player ratings and performance metrics
Injury status and availability
Transfer market values

Rationale #

Separating features allows for specialized processing and better model interpretability.

3. Player Embedding Layers (done) #

Implementation #

Overview: #

Implemented player embeddings using the European Soccer Database with 11,060 unique players and 35 numerical attributes per player.

Core Implementation Steps #

Data Processing Pipeline:

Source: European Soccer Database (183,978 player records)
Players: 11,060 unique players with attribute data
Features: 35 numerical attributes including overall_rating, potential, crossing, finishing, dribbling, etc.
Missing Values: Handled via mean imputation (< 1.5% missing for most attributes)

Temporal Weighting System:

# Exponential decay implementation
weights = np.array([0.95 ** (n_records - i - 1) for i in range(n_records)])

Decay Factor: α = 0.95 (configurable)
Logic: Recent player performance weighted higher than historical data
Application: Applied to all 35 numerical attributes per player

Player Profile Creation #

Aggregation: Weighted averages calculated for each attribute per player
Categorical Data: Latest record used for preferred_foot, work_rates
Profile Structure: Each player represented by 35-dimensional feature vector

Embedding Preparation #

Normalization: StandardScaler applied to all numerical features
Player Mapping:
- player_id_to_idx: Maps API IDs to sequential indices
- idx_to_player_id: Reverse mapping for inference
Serialization: Complete embedding data saved as embeddings_data.pkl

Technical Specifications #

Embedding Dimension: 35 features
Data Structure: Normalized profiles ready for embedding layer input
Scalability: Mapping system supports easy addition of new players

Integration is Ready to Integration.

4. Temporal Weight Implementation #

Exponential Decay Formula #

weight(t) = exp(-α * (current_time - match_time))

Where α is the decay parameter.

Implementation Strategy #

Decay Parameter: Start with α = 0.1 and tune based on validation performance
Time Windows: Apply stronger weights to matches within last 6 months
Seasonal Adjustments: Account for transfer windows and season breaks

Benefits #

Recent matches have higher influence on predictions
Adapts to team form and tactical changes
Reduces impact of outdated information

5. Ensemble Architecture #

Proposed Models:

XGBoost Component
LSTM Component
Ensemble Combination
Method: Weighted averaging or stacking
Weights: Determine through cross-validation
Meta-learner: Consider logistic regression for final predictions

References #

Razali et al. (2024) showed XGBoost effectiveness for football prediction
Zhang et al. (2022) demonstrated LSTM success in sports forecasting

6. Evaluation Metrics #

Primary Metrics #

Accuracy: Overall prediction correctness
F1-Score: Balanced precision and recall, especially important for draw predictions
Log-Loss: Measures probability calibration quality
Brier Score: Evaluates probabilistic prediction accuracy

7. Regularization Strategy #

L1 Regularization (Lasso)
L2 Regularization (Ridge)

Monitoring #

Validation Loss: Track to detect overfitting
Learning Curves: Monitor training vs validation performance
Early Stopping: Implement based on validation metrics

Individual Contribution of Each Participant #

Arina Zimina:
- Provide daily database updates | Add cron scheduling | Link to commit in GitHub repository
- Backend improvements | Fix predictions API, add country to Tournament API, add clubs’ logos, add time of matches | Add logos, Add time, Fix API
- Search data for dataset
- Search for features to improve the ML model | Help with finding relevant features for our teams participating in the tournament to determine predictions (parsing/api/manual copying) | Link to doc
Artem Panov:
- PM:
  - Template for individual contribution of each participant part in report.
  - Meeting organization and Task distribution | Link to Kanban board in Weeek
- ML:
  - Exploring approaches and ideas to improve the ML part | part of report
  - Player Embedding Layers implementation | Link to commit in GitHub repository
  - Replacing a 3-class classifier with a binary classifier | Link to commit in GitHub repository
Karina Siniatullina:
- Adapted the navigation in the header for the mobile version. | Adapted the navigation in the header for the mobile version so that it narrows down to a narrower version. | Link to commit
- Added instruction panel for forecast. | Added instructions for predicting the match, which are initially on the page, and then go to the prediction panel and back. | Link to commit
- Fixed instruction panel. | This panel is fixed on top of the screen. | Link to commit
- Prediction panel output. | Changed the order of prediction output. | Link to commit
- Statistic panel clickability. | Fixed the condition for the statistics panel to appear. | Link to commit
- Prediction api. | Changed api responses and fixed percents. | Link to commit
- Teams’ statistics api. | Separated api responses for team and its statistics. | Link to commit
- Opening statistics. | Now user can open more than one stats panel and compare them. | Link to commit
- Window for forecast description. | There is a pop-up window near the forecast that describes how the forecast is calculated. | Link to commit
- Logos for teams. | Added logos for all teams. | Link to commit
- Adaptation. | Added adaptation of the entire design for the mobile version of the site. | Link to commit
Egor Sergeev:
- Created a list of different questions for a survey to find improvements
- Conduct surveys | Link to google disk
- Send to different groups in telegram to collect feedback
- Aggregate survey results and create a list of improvements for the project based on this | Link to google doc
- Create a document of overall feedback from survey | [Link to google doc] ( https://docs.google.com/document/d/1gPtPhgrDNgGuZ1pvSb2B4HGsGZYB8Rv5YHTEdbQJRe0/edit?usp=sharing)
- Write a weekly report.
Egor Agapov:
- Search for features to improve the ML model | Help with finding relevant features for our teams participating in the tournament to determine predictions (parsing/api/manual copying) | Link to google doc
- Website improvement based on feedback requests | Below I described in detail exactly what I did
- Fix the tab in the browser | Changed the tab’s logo instead of the base one and changed the name of the tab itself | Link to commit
- Fix the data loading process on both pages | I made sure that the entire skeleton of the page was loaded immediately and the data was dynamically loaded later | Link to commit
- Fix the e2e test in the pipeline | To Complete the test, it was necessary to run the docker containers of the entire project in the pipeline. | Link to main commit | Link to final commit
- Fixing the navigation menu on the website | Fixed hovering the cursor over a page where the user is already on, so that hovering does not merge with the background. | Link to commit

Weekly Commitments #

Plan for Next Week #

Sprint Goal #

Finalize the project, prepare all deliverables, and craft a compelling presentation.

Frontend #

Ensure design documents (Figma) are up to date and reflect the final product - Karina, Egor A
Add a disclaimer stating that the model may be wrong and the accuracy of the model is 0.69 - Karina, Egor A
Add a representation of list with features of model - Karin, Egor A
Cleaning the code (removing unused parts) - Karina, Egor A
Comments on the code - Karina, Egor A

Backend #

Cleaning the code (removing unused parts) - Arina
Comments on the code - Arina
Filling the DB with new data - Arina
API docs - Arina

ML #

Creating new, more relevant data from the dataset API for training - Arina
Cleaning the code (removing unused parts) - Artem
Comments on the code - Artem
Description of further improvements - Artem
Training a model on new data - Artem
Adding time scales for all model features - Artem

Presentation #

Create a draft presentation - Egor S

Sprint acceptance criteria #

A fully completed project
Comprehensive final documentation (README, API docs, Figma)
Draft of presentation slides
Plan for the live demo

Week #5 #

Summary of feedback collection methods and key findings. #

Link to updated backlog reflecting feedback. #

Description and demonstration (screenshots/GIFs, links to PRs) of implemented improvements and bug fixes. #

Updated link to the deployed application. #

Frontend Part #

Overview #

1. Mobile Adaptation Improvements #

Header Navigation Redesign #

Full Mobile Adaptation #

2. Prediction Interface Enhancements #

Instruction Panel System #

Forecast Description Window #

3. Data Presentation Improvements #

Statistics Panel Enhancements #

Team Logos Integration #

4. API Integration & Data Flow #

Prediction API Optimization #

Team Statistics API Separation #

5. Testing & Deployment #

E2E Testing Pipeline Fixes #

Browser Tab Metadata #

Backend & Data Updates #

Key Improvements #

1. Automated Daily DB Updates #

2. Enhanced Tournament & Match Data #

3. Feature Research for ML Model #

ML part #

Overview #

1. API-Compatible Feature Selection #

Objective #

Implementation #

Expected Outcome #

2. Feature Categorization #

General Football Features #

Player-Specific Features #

Rationale #

3. Player Embedding Layers (done) #

Implementation #

Overview: #

Core Implementation Steps #

Player Profile Creation #

Embedding Preparation #

Technical Specifications #

4. Temporal Weight Implementation #

Exponential Decay Formula #

Implementation Strategy #

Benefits #

5. Ensemble Architecture #

References #

6. Evaluation Metrics #

Primary Metrics #

7. Regularization Strategy #

Monitoring #

Individual Contribution of Each Participant #

Weekly Commitments #

Plan for Next Week #

Sprint Goal #

Frontend #

Backend #

ML #

Presentation #

Sprint acceptance criteria #