Hsiao-Tzu (Anna) Hung

I am a master's student at National Taiwan University, Taiwan. My main research interest is deep learning for multimedia, especially audio and images. Under the supervision of Dr. Yi-Hsuan Yang (Academia Sinica), I am now doing interesting research on automatic music generation with Transformer-based models. Currently I am seeking a full-time ML/AI/Data/Software Engineering position. Please reach out if you have anything to share :)



Education

  • M.S. in Computer Science and Information Engineering (CSIE),
    National Taiwan University, Taiwan, present
  • B.S. in Physics,
    National Tsing Hua University, Taiwan, 2014

Work experiences

June - Aug 2021
Acoustic Engineering Intern
Amazon Ring, Taiwan

2020 - current
Research Assistant
Institute of Information Science, Academia Sinica, Taiwan
supervised by Dr. Yi-Hsuan Yang: Music and AI Lab

2019 - 2020
Full-time Machine Learning Research Intern
Taiwan AI Labs, Taiwan

2018 - 2019
Research Assistant
Institute of Information Science, Academia Sinica, Taiwan
supervised by Dr. Hsin-Min Wang: Speech, Language and Music Processing Laboratory

2017 - 2018
Physics Teacher
National Lan-Yang Girls’ Senior High School

2014 - 2016
Physics Teacher
National Chu-Pei Senior High School


Publications

EMOPIA: A Multi-Modal Pop Piano Dataset For Emotion Recognition and Emotion-based Music Generation

Hsiao-Tzu Hung, Joann Ching, Seungheon Doh, Nabin Kim, Juhan Nam and Yi-Hsuan Yang
Published at ISMIR 2021
Paper, Demo, Code
We collected “EMOPIA”, a shared multi-modal (audio and MIDI) dataset focusing on perceived emotion in pop piano music, to facilitate research on various tasks related to music emotion. I also made an attempt to control the emotion of Transformer-generated piano music using this dataset.

Improving Automatic Jazz Melody Generation by Transfer Learning Techniques

Hsiao-Tzu Hung, Chung-Yang Wang, Yi-Hsuan Yang, Hsin-Min Wang
Published at APSIPA 2019.
Paper, Demo, Code
In this paper, I use two transfer learning methods to improve the performance of a VAE-based music generation model given limited training data. Both objective and subjective tests show that the two methods indeed improve performance.

MediaEval 2019 Emotion and Theme Recognition task: A VQ-VAE Based Approach

Hsiao-Tzu Hung, Yu-Hua Chen, Maximilian Mayer, Michael Vötter, Eva Zangerle, Yi-Hsuan Yang
Published at MediaEval 2019.
Paper, Code
In this work, we use a VQ-VAE as a feature extractor, together with two kinds of classifiers, to automatically predict the genre, theme, or mood of a given song. The dataset provided by the organizers was quite noisy, so we decided not to pursue this task further, but it was still an interesting experience.

Mini Projects

Implementing MidiNet by PyTorch

At the beginning of my ML journey, I started by porting MidiNet from TensorFlow to PyTorch. MidiNet is a well-known GAN-based framework in the automatic music generation field. This project currently has 40 stars on GitHub.


COVID-19 Prediction from Chest X-rays

This is a small project I did while taking the course “Artificial Intelligence” at NTU. I use chest X-ray images as input to predict COVID-19. In this work, I use focal loss to deal with class imbalance.
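The focal loss used here down-weights easy, well-classified examples so training focuses on the hard (often minority-class) ones. A plain-Python sketch of the idea for binary classification — the parameter values (alpha=0.25, gamma=2.0) are common defaults, not necessarily what this project used:

```python
import math

def focal_loss(probs, targets, alpha=0.25, gamma=2.0):
    """Binary focal loss over predicted positive-class probabilities.

    probs:   predicted probabilities of the positive class
    targets: ground-truth labels (0 or 1)
    """
    total = 0.0
    for p, y in zip(probs, targets):
        p_t = p if y == 1 else 1.0 - p        # probability of the true class
        bce = -math.log(max(p_t, 1e-12))      # cross-entropy for this example
        # (1 - p_t)^gamma shrinks the loss for confident, correct predictions
        total += alpha * (1.0 - p_t) ** gamma * bce
    return total / len(probs)
```

With gamma = 0 this reduces to ordinary (alpha-weighted) cross-entropy; raising gamma makes the model pay progressively more attention to misclassified samples.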

More about me

Outside of work, I’m a scuba diving lover, a book addict, and a loyal Spotify user!

Diving in the sea off Kenting, Taiwan (a must-visit place!):

My current favorite playlist: