Creating a filter like the hot dog eating / donut eating filters

Hello all! I'd like some perspective on what it takes to create a filter similar to the hot dog eating filter, where the user opens their mouth, a hot dog flies in, their cheeks get fatter, and their score goes up for every hot dog they “eat”. There is a similar filter where a donut flies in when the user opens their mouth and their cheeks get fatter.

Could anyone explain what the development process for a filter like this might look like, or recommend any tutorials that would help me learn how to build one?

I have some basic ideas: possibly a physics collider patch that triggers a score counter each time a donut model collides with the user’s mouth, plus a face stretch effect driven by a face tracker that warps the user’s face a bit larger each time the score counter changes. I don't know whether this line of thinking is accurate, so I would love to hear anyone's thoughts or suggestions.

Thank you if you read this all the way through! Have a good day, y'all :3 Happy creating!
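
To make the idea above a bit more concrete, here is a minimal sketch of the scoring and cheek-growth logic in plain Python. It is not tied to any AR tool's API: the class name `EatingGame`, the method `on_item_reached_mouth`, the 0..1 mouth-openness value, and all the numbers are assumptions made up for illustration. In a real filter, the "item reached the mouth" event would come from whatever collision or proximity check the tool provides, and the resulting scale would feed the face warp.

```python
from dataclasses import dataclass


@dataclass
class EatingGame:
    """Tracks the score and cheek scale for a mouth-open eating filter (hypothetical)."""
    mouth_open_threshold: float = 0.5  # assumed: openness is normalized to 0..1
    growth_per_item: float = 0.05      # assumed: cheeks grow 5% per item eaten
    max_cheek_scale: float = 2.0       # assumed cap so the warp does not grow forever
    score: int = 0

    def on_item_reached_mouth(self, mouth_openness: float) -> bool:
        """Call when a food item reaches the mouth; returns True if it was eaten."""
        if mouth_openness < self.mouth_open_threshold:
            return False  # mouth closed: the item is missed, score unchanged
        self.score += 1
        return True

    @property
    def cheek_scale(self) -> float:
        """Scale factor for the face warp, derived only from the current score."""
        return min(1.0 + self.growth_per_item * self.score, self.max_cheek_scale)


game = EatingGame()
game.on_item_reached_mouth(0.8)  # mouth open: eaten
game.on_item_reached_mouth(0.2)  # mouth closed: missed
game.on_item_reached_mouth(0.9)  # mouth open: eaten
print(game.score, game.cheek_scale)
```

Deriving the cheek scale from the score, rather than nudging the warp directly on every collision, keeps the two in sync and makes it easy to cap or reset the effect.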