For the assessment, choose and work on only one type of generation, whichever you are most comfortable with in terms of tech stack (Computer Vision, Speech, or NLP).
We are looking to build a tool that can generate a variety of new-age, alt-form content from a research paper as input. You may build just one of the three variants mentioned, but build it with as many nuances and user-level input customizations as you can think of.
A research paper, or multiple papers
An API demo with input and output files, or a workbench-like UI where you can input the research paper(s) and get the output as any of the variants.
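One possible shape for the API-demo option, offered only as a minimal sketch: papers are assumed to be already extracted to plain-text files, and every name here (`Options`, `GENERATORS`, `generate`, the `"summary"` variant) is hypothetical and not part of the brief. The placeholder generator simply truncates the text; a real submission would call its chosen NLP, speech, or vision model at that point.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Dict, List


@dataclass
class Options:
    """Hypothetical user-level customizations; extend as needed."""
    audience: str = "general"  # e.g. "general" or "expert" (unused by the placeholder)
    max_words: int = 150       # rough length budget for the output


def make_text_summary(paper_text: str, opts: Options) -> str:
    """Placeholder generator: keeps only the first `max_words` words.
    A real solution would invoke its model of choice here."""
    words = paper_text.split()
    return " ".join(words[: opts.max_words])


# Registry mapping a variant name to its generator (names are illustrative).
GENERATORS: Dict[str, Callable[[str, Options], str]] = {
    "summary": make_text_summary,
}


def generate(paper_paths: List[Path], variant: str, opts: Options, out_path: Path) -> Path:
    """Read one or more plain-text papers, run the chosen variant, write the output file."""
    if variant not in GENERATORS:
        raise ValueError(f"unknown variant: {variant!r}")
    text = "\n\n".join(p.read_text(encoding="utf-8") for p in paper_paths)
    out_path.write_text(GENERATORS[variant](text, opts), encoding="utf-8")
    return out_path
```

Under these assumptions, a call such as `generate([Path("paper.txt")], "summary", Options(max_words=100), Path("out.txt"))` reads the input file and writes the output file; further variants and customization fields can be added to the registry and to `Options` without changing the entry point.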
Google Illuminate: https://www.youtube.com/watch?v=59bU5zrgPkc
There is an AI for That: https://theresanaiforthat.com/s/graphical+abstract+generator/
Video Generator overview: https://www.synthesia.io/post/best-ai-video-generators
https://paperswithcode.com/task/text-to-video-generation
Fill in the form and register yourself for the challenge. Good luck!