Intro

At first glance, you may think the thumbnail is a picture of our good friend enjoying a nice nap. Well, you would be half right! That image is actually a frame from a custom video rendered from a 3D reconstruction I made using a relatively new computer vision technique known as Neural Radiance Fields.

Inspiration

A while ago, I (Joshua) discovered a research paper on Apple's Machine Learning Research site. The paper was about Neural Radiance Fields (NeRFs), a fairly new computer vision technique. I found it really interesting and wanted to showcase it to others.

Well, what are Neural Radiance Fields?

NeRFs can create realistic 3D representations (3D scenes and objects) from just 2D images.
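
For intuition, here is a heavily simplified PyTorch sketch (an illustration, not our project code) of the core idea: a small network maps a 3D point and a viewing direction to a color and a density, and a volume renderer integrates those predictions along camera rays to form an image. The real model from the paper also adds positional encoding and hierarchical sampling.

```python
# Simplified sketch of the NeRF idea (not our actual code): an MLP maps a 3D
# position plus a viewing direction to a color and a volume density. A renderer
# would integrate these predictions along each camera ray to produce pixels.
import torch
import torch.nn as nn

class TinyNeRF(nn.Module):
    def __init__(self, hidden: int = 128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 + 3, hidden),  # xyz position + viewing direction
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 4),      # RGB color + density
        )

    def forward(self, xyz: torch.Tensor, view_dir: torch.Tensor) -> torch.Tensor:
        return self.mlp(torch.cat([xyz, view_dir], dim=-1))

# One batch of sample points along camera rays -> predicted color/density.
points = torch.rand(1024, 3)
dirs = torch.rand(1024, 3)
rgb_sigma = TinyNeRF()(points, dirs)  # shape: (1024, 4)
```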

What it does

For our project, we used videos to create an interactive 3D environment. We prefer to use Kiri Engine because LiDAR-based apps work better with NeRF than raw video does.
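
For context, before training, a captured video has to be turned into extracted frames plus camera poses. A minimal sketch of that step, assuming nerfstudio's ns-process-data command and placeholder file paths:

```python
# Sketch: convert a phone video into NeRF-ready data (extracted frames plus
# COLMAP-estimated camera poses) using nerfstudio's ns-process-data command.
import subprocess

def prepare_scene(video_path: str, output_dir: str) -> None:
    subprocess.run(
        ["ns-process-data", "video", "--data", video_path, "--output-dir", output_dir],
        check=True,
    )

# Placeholder paths -- the output directory is what training consumes later.
prepare_scene("capture.mp4", "data/living_room")
```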

How we built it

We used Eraser.io to create a flowchart and Figma for a simple front-end design. Next.js handled the frontend and some server-side operations, and we used FastAPI to manage the Python scripts.
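
To give a rough idea of the glue between the two, here is a minimal FastAPI sketch of how the Next.js frontend could kick off a NeRF training run. The endpoint name, paths, and use of nerfstudio's ns-train CLI are illustrative assumptions, not a copy of our exact code.

```python
# Minimal sketch: a FastAPI endpoint the Next.js frontend can POST to in order
# to start training a NeRF on an already-processed scene directory.
import subprocess
from fastapi import BackgroundTasks, FastAPI

app = FastAPI()

def run_training(scene_dir: str) -> None:
    # Launch nerfstudio's training CLI as a subprocess so the HTTP request
    # returns immediately while the model trains in the background.
    subprocess.run(["ns-train", "nerfacto", "--data", scene_dir], check=True)

@app.post("/train")
async def train(scene_dir: str, background_tasks: BackgroundTasks):
    # e.g. the frontend calls POST /train?scene_dir=data/living_room
    background_tasks.add_task(run_training, scene_dir)
    return {"status": "training started", "scene": scene_dir}
```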

Challenges we ran into

We ran into a lot of challenges:

1) We spent the first hour trying to provision an A100 (or at least a V100) virtual machine on Google Cloud, but no matter what hardware configuration we tried, there was some type of error. A V100 GPU would have allowed us to render NeRFs in real time for a more interactive project.

2) Since NeRF is relatively new (the very first publication came out in 2020), most APIs are also relatively new and not as rigorously tested. We decided to use NeRF Studio, a collection of NeRF implementations ranging from the original NeRF to models built with Stable Diffusion (text-to-image); a rough sketch of rendering a video from a trained model follows this list.

3) This was our first time in over a year connecting a React frontend to a lightweight backend. We had to learn how to use Flask/FastAPI so the frontend could interact with our AI scripts.

4) Training and evaluation took a really long time. Since we couldn't use IaaS GPUs, we had to resort to remotely connecting to my home PC, which has a 9th-gen i7 and a GTX 1660S, far inferior hardware. Most of the time I was at 100% usage on my CPU, RAM, and GPU.

5) NeRF can produce a lot of ghostly artifacts if recordings are not done properly or the model is trained with limited resources (this happened to me on every example; I am not a cinematographer 😢). But hey, on the bright side, they look kind of cool.
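
As mentioned in challenge 2, this is roughly how a video like the one in the thumbnail gets rendered from a trained NeRF Studio model. Treat it as a sketch: the paths are placeholders, and the ns-render flags are based on my reading of the nerfstudio docs and may differ between versions.

```python
# Rough sketch (not our exact command): render a fly-through video from a
# trained NeRF Studio model along a saved camera path.
import subprocess

def render_video(config_path: str, camera_path: str, output_path: str) -> None:
    subprocess.run(
        [
            "ns-render", "camera-path",
            "--load-config", config_path,           # e.g. outputs/.../config.yml
            "--camera-path-filename", camera_path,  # exported from the viewer
            "--output-path", output_path,           # e.g. renders/scene.mp4
        ],
        check=True,
    )

render_video("outputs/scene/nerfacto/config.yml", "camera_path.json", "renders/scene.mp4")
```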

Accomplishments that we're proud of

Despite the many challenges we had to overcome, we still managed to get a working model that does a decent job of generating interactive 3D environments from images, video, and photogrammetry apps like Kiri Engine and Polycam. I'm also really happy to have something to show during the expo.

What we learned

The main thing we learned was how to build our own NeRFs and deploy them on the web instead of in standalone Jupyter notebooks. We also learned more web-dev techniques, such as using API endpoints and Figma efficiently.

What's next for Construct?

I want to integrate it with Blender first, since I wanted to do that for this project but ran out of disk space (2 GB left, LOL, which is also why I don't have a video demo). Then I want to try using it with Stable Diffusion so I can dynamically add meshes.

Interested in the techniques used?

Original NeRF paper: https://arxiv.org/abs/2003.08934
NeRF Studio: https://docs.nerf.studio/
