Students Excel in Computer Science Project Competition
The Computer Science Project is a crucial mandatory course in NYCU's Computer Science Department. Students voluntarily enter the project competition, where they collaborate and observe one another's work, showcasing the department's diverse and substantial research achievements. The experience also fosters greater cohesion among Computer Science students. The award-winning projects are introduced below:
Excellence Award
Project Title: Designing Area-Efficient Ray-Triangle Intersection Hardware Unit in GPU RT-Core
Student: Yu-Lun Ning
Advisor: Dr. Tsung-Tai Yeh
Project Introduction:
This project focuses on the resource-hungry ray-triangle intersection unit within the GPU RT-core. We break the Möller–Trumbore intersection algorithm down into three stages. Through a sieve-like hardware design, each stage eliminates triangles that cannot intersect the ray, reducing the hardware required in subsequent stages and thus the overall area. Our method achieves an area advantage in FPGA synthesis with acceptable performance loss. We have also designed the intersection unit to comply with the AXI4 protocol, partially integrated it into a system that includes Xilinx's AXI4 cache, and verified it on the Genesys ZU FPGA platform.
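For readers unfamiliar with the algorithm, a minimal software sketch of the Möller–Trumbore test split into three rejection stages is shown below. It only illustrates the sieve idea; the project's actual hardware pipeline, stage boundaries, and number formats are not described here and may differ.

```python
import numpy as np

def moller_trumbore_staged(origin, direction, v0, v1, v2, eps=1e-8):
    """Ray-triangle test split into three rejection stages (a sieve).

    Each stage can discard the triangle early, so later stages only run
    for surviving candidates. Stage boundaries here are illustrative.
    """
    # Stage 1: determinant test -- rejects triangles parallel to the ray.
    e1, e2 = v1 - v0, v2 - v0
    pvec = np.cross(direction, e2)
    det = np.dot(e1, pvec)
    if abs(det) < eps:
        return None
    inv_det = 1.0 / det

    # Stage 2: first barycentric coordinate u must lie in [0, 1].
    tvec = origin - v0
    u = np.dot(tvec, pvec) * inv_det
    if u < 0.0 or u > 1.0:
        return None

    # Stage 3: second barycentric coordinate v and the hit distance t.
    qvec = np.cross(tvec, e1)
    v = np.dot(direction, qvec) * inv_det
    if v < 0.0 or u + v > 1.0:
        return None
    t = np.dot(e2, qvec) * inv_det
    return t if t > eps else None
```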
First-Class Award
Project Title: Research on Predictive Networks Based on Multivariate Short Time Series Inputs and Transformer Architecture
Students: Lai Yi-Xuan, Yang Chien-Hua, Wu Yi-Jing
Advisor: Dr. Ching-Chun Huang
Project Introduction:
With advancements in time series forecasting technology, Transformer-based models have demonstrated impressive performance on tasks involving long time sequences. However, their performance in short-term sequence forecasting still falls short. This may be due to the model architecture's limitations in effectively capturing features of short sequences, which restricts prediction accuracy. To enhance the performance of Transformer-based models in short-term sequence forecasting, we optimize the state-of-the-art (SOTA) Crossformer model with the following improvements. First, we introduce ProbSparse self-attention, which refines the original Crossformer Router mechanism: by performing preliminary filtering on the Query matrix before computing the attention scores, we retain only the important queries, enabling the model to make predictions based on the most significant features. Second, we introduce a Mixture of Experts (MoE) to handle the diverse distribution of input data and the varied characteristics of time series. The MoE uses different Feed-Forward Network (FFN) experts to process distinct features within the sequence while accommodating multivariate data, allowing information to be processed by multiple FFNs and improving the model's adaptability. These enhancements enable the model to handle local features in short-term sequences more effectively, thereby improving prediction accuracy on both a small dataset (ILI) and a large dataset (ETTh1).
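As a rough illustration of the query-filtering idea (not the project's actual code), a simplified ProbSparse-style attention step might look like the sketch below: only the top-u queries by a max-minus-mean score receive full attention, and the remaining positions fall back to the mean of V.

```python
import torch

def probsparse_attention(Q, K, V, top_u):
    """Toy ProbSparse-style attention over tensors of shape (batch, length, dim).

    This is a simplified sketch, not the exact Crossformer/Informer code:
    it keeps only the top-u 'important' queries for full attention and
    fills the rest with the mean of V as a cheap fallback.
    """
    d = Q.size(-1)
    scores = Q @ K.transpose(-2, -1) / d ** 0.5               # (B, Lq, Lk)

    # Sparsity measure per query: max score minus mean score.
    sparsity = scores.max(dim=-1).values - scores.mean(dim=-1)  # (B, Lq)
    top_idx = sparsity.topk(top_u, dim=-1).indices               # (B, u)

    # Default output: mean of V for the queries that were filtered out.
    out = V.mean(dim=-2, keepdim=True).expand(-1, Q.size(-2), -1).clone()

    # Full attention only for the selected (important) queries.
    sel_scores = torch.gather(
        scores, 1, top_idx.unsqueeze(-1).expand(-1, -1, K.size(-2)))
    attn = torch.softmax(sel_scores, dim=-1)
    out.scatter_(1, top_idx.unsqueeze(-1).expand(-1, -1, V.size(-1)),
                 attn @ V)
    return out
```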
First-Class Award
Project Title: Learning Diffusion Models with Occlusion Handling for Facial Landmark Detection
Students: Bosyuan Hou, Chiayu Tseng
Advisor: Dr. Yen-Yu Lin
Project Introduction:
Facial landmark detection (FLD) aims to detect specific key points on facial images. Accurate detection of facial landmarks is essential for many applications, including face recognition, expression analysis, and virtual reality.
In our work, we propose a new approach to FLD that conditions diffusion models on the desired facial landmark points. We leverage the strength of diffusion models in recovering correct key points from noisy samples, and train our model to learn the reverse process from erroneous key points to correct facial landmark locations.
Diffusion models perform strongly at recovering noisy samples, so we explore their potential for handling the detection errors that facial occlusion causes in current FLD methods. Our model can refine erroneous facial landmarks and can ultimately serve as a refiner for any FLD method.
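The toy sketch below illustrates the general DDPM-style training recipe implied above: ground-truth landmark coordinates are corrupted with Gaussian noise at a random timestep, and a conditional network regresses that noise. The network, feature dimensions, and conditioning scheme here are placeholders; the project's actual architecture is not described in this article.

```python
import torch
import torch.nn as nn

class LandmarkDenoiser(nn.Module):
    """Toy denoiser: predicts the noise added to 2D landmark coordinates,
    conditioned on an image feature vector and the diffusion timestep.
    Purely illustrative; the real model and its image encoder differ."""
    def __init__(self, n_points=68, feat_dim=256, hidden=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_points * 2 + feat_dim + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_points * 2))

    def forward(self, noisy_pts, img_feat, t):
        x = torch.cat([noisy_pts.flatten(1), img_feat, t[:, None]], dim=1)
        return self.net(x).view_as(noisy_pts)

def training_step(model, clean_pts, img_feat, alphas_cumprod):
    """One DDPM-style step: corrupt ground-truth landmarks with Gaussian
    noise at a random timestep, then regress the noise."""
    B = clean_pts.size(0)
    t = torch.randint(0, len(alphas_cumprod), (B,))
    a_bar = alphas_cumprod[t].view(B, 1, 1)
    noise = torch.randn_like(clean_pts)
    noisy = a_bar.sqrt() * clean_pts + (1 - a_bar).sqrt() * noise
    pred = model(noisy, img_feat, t.float() / len(alphas_cumprod))
    return nn.functional.mse_loss(pred, noise)
```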
Merit Award
Project Title: Video Change Detection via Transformer-Based Architecture
Students: Kai-Siang Ma, Shu-Chieh Chuang, Chieh-Dun Wen
Advisor: Dr. Kuan-Wen Chen
Project Introduction:
Our project explores the use of TransCD, a transformer-based model, to detect changes in videos. Unlike traditional methods that require alignment in both the temporal and spatial dimensions, our approach aligns only the temporal dimension, enhancing performance under challenging conditions such as poor lighting and extreme viewpoint changes. We implemented an adaptive frame search to dynamically align frames, along with fine-tuning techniques that adjust the model weights for better accuracy. Our results show that this method not only improves efficiency but also maintains high detection quality, demonstrating the potential of transformer architectures in advancing video analysis technology.
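A minimal sketch of what a temporal-only alignment step could look like is given below: around an expected index, a window of reference-frame features is scanned for the one most similar to the current query frame. This is only an illustration; the project's actual adaptive frame search strategy and features may differ.

```python
import torch
import torch.nn.functional as F

def adaptive_frame_search(query_feat, ref_feats, center, radius=5):
    """Toy temporal alignment: within a window around the expected index
    `center`, return the reference frame whose feature vector has the
    highest cosine similarity to the query frame's feature vector.

    query_feat: (D,) feature of the current frame
    ref_feats:  (T, D) features of the reference video's frames
    """
    lo = max(0, center - radius)
    hi = min(ref_feats.size(0), center + radius + 1)
    window = ref_feats[lo:hi]                                  # (W, D)
    sims = F.cosine_similarity(window, query_feat[None], dim=1)
    return lo + int(sims.argmax())
```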
Merit Award
Project Title: Designing Fixed-Point Transcendental ISAs for Heterogeneous TinyML Acceleration
Students: Cheng-Han Tsai, Chun-Hong Fan
Advisor: Dr. Tsung-Tai Yeh
Project Introduction:
This project designs fixed-point transcendental ISAs to enhance TinyML on RISC-V CPUs, focusing on the MobileViT model. By decomposing complex operations into basic components, implementing specialized ISAs, and designing a corresponding hardware unit, we achieve significant acceleration. Using the CFU-Playground platform on NEXYS A7-100T FPGA boards, our custom instructions in TensorFlow Lite for Microcontrollers (TFLM) kernels deliver a 1.56× speedup in MobileViT model inference over the baseline CPU. Our approach offers greater flexibility and memory efficiency than traditional lookup-table (LUT) methods, supporting multiple activation functions without hardware modifications.
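To give a flavor of what decomposing a transcendental operation into basic components can mean, the sketch below computes a fixed-point exp(x) for x ≤ 0 using only shifts, multiplies, and a short polynomial, i.e., the kind of primitives a custom instruction can accelerate. The Q16.16 format and constants are illustrative and are not taken from the project's ISA.

```python
# Q16.16 fixed-point exp(x) for x <= 0, sketched in plain Python integers.
# Illustrative only: the project's actual instruction set is not shown here.

FRAC = 16
ONE = 1 << FRAC
LN2 = int(0.6931471805599453 * ONE)

def fx_mul(a, b):
    """Multiply two Q16.16 numbers."""
    return (a * b) >> FRAC

def fx_exp_neg(x):
    """exp(x) for x <= 0 in Q16.16: write x = -n*ln2 + r with r in (-ln2, 0],
    evaluate a short polynomial for exp(r), then shift right by n."""
    assert x <= 0
    n = (-x) // LN2                 # integer number of ln2 chunks
    r = x + n * LN2                 # remainder, -ln2 < r <= 0
    # 4-term Taylor polynomial for exp(r), evaluated with Horner's rule.
    poly = ONE + fx_mul(r, ONE + fx_mul(r // 2, ONE + fx_mul(r // 3, ONE)))
    return poly >> n

# exp(-1) is about 0.3679; the sketch prints a close approximation.
print(fx_exp_neg(-ONE) / ONE)
```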
Merit Award
Project Title: Video Deblurring and Interpolation with Motion-Aware Transformer
Student: Chu Chih Ling
Advisor: Dr. Yen-Yu Lin
Project Introduction:
We propose a novel Motion-Aware Transformer model for the dual task of video deblurring and interpolation. This model addresses motion blur by recovering high-frame-rate, clear videos from multiple blurry images, achieving both video deblurring and frame interpolation simultaneously. Motion blur in images is caused by the continuous movement of objects during exposure. To address this, our Motion-Aware Transformer fully utilizes temporal information through Intra-Motion and Inter-Motion Prompts, shared among multiple consecutive blurry images in videos. The motion prompts store the magnitude and direction of pixel motion. The Intra-Motion Prompt captures pixel motion within a single blurry frame, while the Inter-Motion Prompt captures pixel motion between adjacent blurry frames. By predicting motion prompts with a UNet-like motion extractor and using these prompts as input to the video deblurring and interpolation transformer, we reduce the complexity of these tasks and improve model performance.
Our work demonstrates that the proposed motion extractor significantly enhances the performance of video deblurring and interpolation tasks. We are continuing to improve the blending and utilization of motion prompts in the video deblurring and interpolation transformer model.
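The sketch below illustrates only the interface implied by this description: a motion extractor that maps a pair of consecutive blurry frames to two 2-channel motion prompts (magnitude and direction), one intra-frame and one inter-frame. The single convolution used here is a stand-in; the project uses a UNet-like extractor whose details are not given in this article.

```python
import torch
import torch.nn as nn

class ToyMotionExtractor(nn.Module):
    """Illustrative stand-in for the UNet-like motion extractor: from a pair
    of consecutive blurry RGB frames it predicts two 2-channel maps
    (magnitude and direction of pixel motion) -- one intra-frame and one
    inter-frame -- which then serve as prompts for the deblurring and
    interpolation transformer."""
    def __init__(self):
        super().__init__()
        self.head = nn.Conv2d(6, 4, kernel_size=3, padding=1)

    def forward(self, frame_t, frame_t1):
        # frame_t, frame_t1: (B, 3, H, W) consecutive blurry frames.
        maps = self.head(torch.cat([frame_t, frame_t1], dim=1))
        intra_prompt, inter_prompt = maps[:, :2], maps[:, 2:]
        return intra_prompt, inter_prompt
```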
Merit Award
Project Title: BEVGaussian: Generate Scene-level 3D Gaussian from BEV image
Student: Jie-Ying Lee
Advisor: Dr. Yu-Lun Liu
Project Introduction:
In this work, we aim to generate scene-level 3D Gaussians from bird's-eye view (BEV) images, including satellite images, heightmaps, and semantic maps. Existing methods are limited to object-level generation and cannot produce scene-level 3D Gaussians. To address this, we leverage existing object generation techniques to create high-quality 3D Gaussians from BEV images and then integrate these objects into the scene. Our approach is training-free and allows for easy modification of the 3D scenes by using BEV images as input.
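As a rough, hypothetical illustration of the training-free composition step, the sketch below shifts each pre-generated object's Gaussian centers to its world position read off the BEV layout, using the heightmap for the vertical offset. The function name, coordinate conventions, and data layout are assumptions; the object generation step and the actual BEVGaussian pipeline are not reproduced here.

```python
import numpy as np

def compose_scene(object_gaussians, bev_positions, heightmap, meters_per_pixel):
    """Toy training-free composition: move each object's pre-generated 3D
    Gaussian centers to its world location taken from the BEV layout, using
    the heightmap for the vertical offset.

    object_gaussians: list of (N_i, 3) arrays of Gaussian centers (object frame)
    bev_positions:    list of (row, col) pixel coordinates in the BEV image
    heightmap:        (H, W) array of ground heights in meters
    """
    scene = []
    for centers, (row, col) in zip(object_gaussians, bev_positions):
        x = col * meters_per_pixel
        y = row * meters_per_pixel
        z = heightmap[row, col]
        scene.append(centers + np.array([x, y, z]))
    return np.concatenate(scene, axis=0)
```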