Week #4 #
External Feedback, Testing, and Progress Report #
This week we made progress on the project and gathered initial feedback from people outside the team.
External Feedback:
The most crucial feedback came from our professor (and supervisor), Rustam Lukmanov. He was impressed by the quality improvement of the 3D models generated by our main AI pipeline. We also showed the working pipeline to people who work in the architecture field; in short, they were impressed by its ease of use and found the quality sufficient for a “baseline” model.
Testing:
During the testing phase, we found that the system was not compatible with older versions of Python, and we detected a couple of minor bugs affecting both the machine learning pipeline and the API. These issues were thoroughly investigated and fixed, ensuring smoother operation across all supported platforms.
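As a lightweight safeguard against this, a startup version check can fail fast on unsupported interpreters. A minimal sketch, assuming Python 3.10 as the minimum supported version (the exact floor is an assumption):

```python
import sys

# Hypothetical minimum version; the project's actual floor may differ.
MIN_PYTHON = (3, 10)

if sys.version_info < MIN_PYTHON:
    raise RuntimeError(
        f"This project requires Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+, "
        f"found {sys.version_info.major}.{sys.version_info.minor}."
    )
```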
Iteration and Refinement: This week, our team focused on significant enhancements and refinements to the project, detailed below:
- Model Replacement: We replaced the Wonder3D model with MicroDreamer, which offers faster inference and higher generation quality than Wonder3D.
- Pipeline Improvement: While building the pipeline, we decided to replace the combination of an LLM for prompt engineering plus Stable Diffusion 1.5 with Stable Diffusion 3. This decision was based on Stable Diffusion 3’s superior ability to understand prompts containing ‘white background’, which is crucial for our pipeline (see the sketch after this list).
- 3D Model Export Fix: We resolved the issue of incorrect 3D model exports via trimesh (a rough sketch of the export step also follows this list). As a result, model quality is now acceptable for a prototype.
- Quality vs. Inference Time: While the output quality of the 3D models can be slightly improved by increasing the number of iterations, this also leads to increased inference time.
- API Development: Our team developed the first basic API using FastAPI. Currently, the API supports a single request: it takes a prompt as input and returns a 3D model (sketched below).
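To illustrate the Stable Diffusion 3 step, here is a minimal sketch using the Hugging Face diffusers library; the model ID and sampling parameters are illustrative assumptions, not our exact configuration:

```python
import torch
from diffusers import StableDiffusion3Pipeline

# Model ID and settings below are illustrative assumptions.
pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",
    torch_dtype=torch.float16,
).to("cuda")

# SD3 follows the "white background" instruction reliably enough that the
# separate LLM prompt-engineering step became unnecessary.
image = pipe(
    prompt="a wooden armchair, studio lighting, white background",
    num_inference_steps=28,
    guidance_scale=7.0,
).images[0]
image.save("armchair.png")
```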
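The export fix amounted to loading the generated geometry as a single mesh and re-exporting it through trimesh. A rough sketch, where the file names and the exact cleanup steps are assumptions:

```python
import trimesh

# File names are placeholders; the real pipeline passes meshes in memory.
mesh = trimesh.load("generated_raw.glb", force="mesh")

# Recompute consistent face winding and normals before export.
mesh.fix_normals()
mesh.export("model.obj")
```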
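The first version of the API looks roughly like the sketch below; `run_pipeline` is a hypothetical stand-in for the real text-to-3D pipeline, and the route name is an assumption:

```python
from fastapi import FastAPI
from fastapi.responses import FileResponse
from pydantic import BaseModel

app = FastAPI()

class GenerationRequest(BaseModel):
    prompt: str

def run_pipeline(prompt: str) -> str:
    """Hypothetical stand-in: runs the text-to-3D pipeline and
    returns the path to the exported mesh file."""
    return "model.obj"

@app.post("/generate")
def generate(request: GenerationRequest) -> FileResponse:
    output_path = run_pipeline(request.prompt)
    return FileResponse(output_path, filename="model.obj")
```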
Overall, these iterations and refinements have significantly improved the functionality and performance of our project, moving us closer to a robust and efficient prototype.
Here are some generated 3D models using the updated pipeline:
Challenges & Solutions #
The most difficult part was organizing our GitHub repository. Dmitry spent a lot of time restructuring the codebase so that the project can be installed easily on any other machine, and he did a great job.
Conclusions & Next Steps #
- Integrate a job queue into the API (one possible shape is sketched after this list);
- Introduce a new feature: the user chooses one image out of the few generated from a given prompt, and the 3D model is then generated from the selected image. This will let people pick the preferred look of their future 3D model;
- Finally, connect the frontend and backend to make the whole system work.
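For the queue integration, one possible shape is sketched below. This is only a sketch: we may end up using a dedicated task queue rather than FastAPI’s built-in BackgroundTasks, and all names here are assumptions:

```python
import uuid

from fastapi import BackgroundTasks, FastAPI

app = FastAPI()
jobs: dict[str, str] = {}  # job_id -> "pending" or path to the finished model

def run_job(job_id: str, prompt: str) -> None:
    # Hypothetical stand-in for the real text-to-3D pipeline.
    jobs[job_id] = f"outputs/{job_id}.obj"

@app.post("/generate")
def enqueue(prompt: str, background_tasks: BackgroundTasks) -> dict:
    job_id = str(uuid.uuid4())
    jobs[job_id] = "pending"
    background_tasks.add_task(run_job, job_id, prompt)
    return {"job_id": job_id}

@app.get("/status/{job_id}")
def status(job_id: str) -> dict:
    return {"job_id": job_id, "status": jobs.get(job_id, "unknown")}
```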