Machine learning (ML) has become a crucial part of our lives, powering a wide range of applications: from image recognition and self-driving cars to natural language processing and knowledge extraction, ML has fundamentally transformed every domain it has touched. Despite these successes, a key problem preventing its wider adoption is non-transparency: ML models appear to us as complex numerical black boxes, and we do not understand how they operate. Despite numerous efforts from the community in recent years, progress towards a complete understanding of ML remains limited. Meanwhile, the implications of non-transparency are imminent and severe. It prevents us from knowing when and where ML systems will fail, which opens the door to adversarial attacks against ML models. More concerning still, it makes it possible to hide malicious behaviors inside ML models, i.e., backdoors. Such backdoors could compromise an ML model and alter its behavior once activated by the attacker.
These threats are imminent and put currently deployed ML models at risk, yet progress towards fully understanding ML models has been limited so far and is unlikely to advance significantly in the near future. This conflict between imminent threats and limited progress calls for immediate solutions to the most pressing issues mentioned above.
In this talk, I will cover our work on improving the security of ML as an opaque system. First, I will present a new adversarial attack against deep learning (DL) models that we propose for a practical setting. We focus specifically on transfer learning, the most promising approach for normal users to train high-quality DL models with limited data. Second, I will discuss backdoor attacks on DL models, where hidden behaviors are injected into a model and cause misclassification of any input containing a certain trigger. I will present our design to detect and mitigate such hidden backdoors.