Midjourney’s first-ever AI video generation model, V1, is now official. The new model from the AI startup lets users turn images into five-second video clips. Users can upload their own photos or use images created with other Midjourney models, and the image-to-video model generates a set of four distinct five-second videos from each picture. The launch places Midjourney alongside other companies developing AI video generation models, including OpenAI’s Sora and Google’s Veo 3. While several companies are working on controllable AI video tools for commercial use, Midjourney has taken a different approach by focusing on AI image models aimed at creative users.
In a blog post, David Holz, the company’s CEO, writes: “As you know, our focus for the past few years has been images. What you might not know, is that we believe the inevitable destination of this technology are models capable of real-time open-world simulations.”
Midjourney V1 video-generation AI model: Availability and how to use
Similar to Midjourney’s image generation tools, V1 is accessible exclusively through Discord and is currently only available on the web. To access V1, users have to subscribe to Midjourney’s Basic plan, priced at $10 per month. Users on the $60-per-month Pro plan and the $120-per-month Mega plan can generate unlimited videos using the platform’s slower “Relax” mode. Midjourney has stated that it will review its pricing for video models over the coming month.
V1 includes several custom settings that give users control over the video model’s output. Users can choose an automatic animation mode, which applies random movement to an image, or a manual mode, where they describe a specific animation through text input. The settings also let users adjust the level of camera and subject movement by selecting either “low motion” or “high motion.”
Videos generated with V1 are initially five seconds long, but users can extend them by four seconds at a time, up to four times, for a maximum duration of 21 seconds.
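The 21-second ceiling follows directly from the figures above: a five-second base clip plus up to four extensions of four seconds each. The short Python sketch below works through that arithmetic. The constant and function names are hypothetical and chosen only for illustration; this is not part of any Midjourney tool or API.

```python
# Figures taken from the article; the names are illustrative assumptions.
BASE_SECONDS = 5        # length of the initial clip
EXTENSION_SECONDS = 4   # length added by each extension
MAX_EXTENSIONS = 4      # maximum number of extensions


def clip_duration(extensions: int) -> int:
    """Return the total clip length in seconds after the given number of extensions."""
    if not 0 <= extensions <= MAX_EXTENSIONS:
        raise ValueError(f"extensions must be between 0 and {MAX_EXTENSIONS}")
    return BASE_SECONDS + EXTENSION_SECONDS * extensions


if __name__ == "__main__":
    # List the clip length for every allowed number of extensions.
    for n in range(MAX_EXTENSIONS + 1):
        print(f"{n} extension(s): {clip_duration(n)} seconds")
```

With no extensions the clip stays at five seconds, and with all four it reaches 5 + 4 × 4 = 21 seconds, matching the maximum stated above.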