Hsiao-Tzu (Anna) Hung

I am a master's student at National Taiwan University, Taiwan. My main research interest is deep learning for multimedia, especially audio and images. Under the supervision of Dr. Yi-Hsuan Yang (Academia Sinica), I am now doing research on automatic music generation with Transformer-based models. I am currently seeking a full-time ML/AI/data/software engineering position. Please reach out if you have anything to share :)

RESUME

Education

  • M.S. in Department of CSIE,
    National Taiwan University, Taiwan, Feb 2022
  • B.S. in Department of Physics,
    National Tsing Hua University, Taiwan, June 2014

Work experiences

June - Aug. 2021
Acoustic Engineering Intern
Amazon Ring, Taiwan

Feb. 2020 - Feb. 2022
Research Assistant
Institute of Information Science, Academia Sinica, Taiwan
supervised by Dr. Yi-Hsuan Yang: Music and AI Lab

2019 - 2020
Machine Learning Research Intern (full-time)
Taiwan AI Labs, Taiwan

2018 - 2019
Research Assistant
Institute of Information Science, Academia Sinica, Taiwan
supervised by Dr. Hsin-Min Wang: Speech, Language and Music Processing Laboratory

2017 - 2018
Physics teacher
National Lan-Yang Girls’ Senior High School

2014 - 2016
Physics teacher
National Chu-Pei Senior High School

Publications

EMOPIA: A Multi-Modal Pop Piano Dataset For Emotion Recognition and Emotion-based Music Generation

Hsiao-Tzu Hung, Joann Ching, Seungheon Doh, Nabin Kim, Juhan Nam and Yi-Hsuan Yang
Published at ISMIR 2021
Paper, Demo, Code
We collected “EMOPIA”, a shared multi-modal (audio and MIDI) dataset focusing on perceived emotion in pop piano music, to facilitate research on various tasks related to music emotion. Using this dataset, I also made an attempt to control the emotion of piano music generated by a Transformer.
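A minimal sketch of the conditioning idea, assuming a token-based (e.g. REMI-style) event vocabulary: prepend an emotion token for one of the four Russell valence/arousal quadrants to each training sequence, so the Transformer learns to associate that token with the emotion of the music that follows. The function and token names here are illustrative, not the actual EMOPIA code.

```python
def with_emotion_token(events, quadrant):
    """Prepend a 4-quadrant (Russell valence/arousal) emotion token.

    events: list of string event tokens (e.g. REMI-style events).
    quadrant: 1-4, where e.g. Q1 = high valence / high arousal.
    """
    if quadrant not in (1, 2, 3, 4):
        raise ValueError("quadrant must be 1, 2, 3, or 4")
    return ["Emotion_Q%d" % quadrant] + list(events)

# At generation time, seeding the model with the desired emotion
# token steers the emotion of the continuation (illustrative only):
seed = with_emotion_token(["Bar", "Position_1/16", "Note_On_60"], quadrant=2)
```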

Improving automatic Jazz melody generation by transfer learning techniques

Hsiao-Tzu Hung, Chung-Yang Wang, Yi-Hsuan Yang, Hsin-Min Wang
Published at APSIPA 2019.
Paper, Demo, Code
In this paper, I used two transfer learning methods to improve the performance of a VAE-based music generation model when training data is limited. Both objective and subjective tests show that the two methods indeed improve the performance.

MediaEval 2019 Emotion and Theme Recognition task: A VQ-VAE Based Approach

Hsiao-Tzu Hung, Yu-Hua Chen, Maximilian Mayer, Michael Vötter, Eva Zangerle, Yi-Hsuan Yang
Published at MediaEval 2019.
Paper, Code
In this work, we used a VQ-VAE as a feature extractor, together with two kinds of classifiers, to automatically classify the genre, theme, or mood of a given audio song. The dataset provided by the host was quite noisy, so we decided not to pursue this task further, but it was still an interesting experience.

Mini Projects

Implementing MidiNet by PyTorch

Code
At the beginning of my ML journey, I started by translating MidiNet from TensorFlow to PyTorch. MidiNet is a well-known GAN-based framework in the field of automatic music generation. This project currently has 40 stars on GitHub.

covid19-prediction

Code
This is a small project I did while taking the course “Artificial Intelligence” @ NTU. I use chest X-ray images as input to predict COVID-19. In this work, I try to use focal loss to deal with class imbalance.
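As a refresher on the technique: focal loss down-weights well-classified examples so training focuses on hard (often minority-class) ones. A minimal binary version is sketched below; the gamma and alpha values are the common defaults from the original focal-loss paper, not necessarily what this project used.

```python
import math

def binary_focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Focal loss for a single binary prediction.

    p: predicted probability of the positive class (0 < p < 1).
    y: true label, 0 or 1.
    gamma: focusing parameter; gamma=0 recovers weighted cross-entropy.
    alpha: class-balance weight for the positive class.
    """
    pt = p if y == 1 else 1.0 - p          # probability of the true class
    a = alpha if y == 1 else 1.0 - alpha   # per-class weight
    # The (1 - pt)^gamma factor shrinks the loss of easy examples.
    return -a * (1.0 - pt) ** gamma * math.log(pt)

# An easy positive (p=0.9) is damped far more than a hard one (p=0.1):
easy = binary_focal_loss(0.9, 1)
hard = binary_focal_loss(0.1, 1)
```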

Context-Aware Music Recommendation Systems for Driving - a MIR Approach

Report
This is the final project I did for the course “Introduction to Intelligent Vehicles” @ NTU. This project received the “Top 20% (~20 ppl) reports” award from the instructor!
A context-aware music recommendation system (CARS) recommends music based on environmental factors such as the driver’s mood, the weather, or the traffic conditions. Conversely, some research uses music to regulate the driver’s emotions and make driving safer, with the tempo of the music being a common feature for doing so. I therefore tried using the tempo of the music as an additional feature for the CARS, and the result shows that tempo information can improve the accuracy of the CARS.