AI tools for audiovisual production

Generative AI for image production

It is obvious today, and we at Plaine Images have been convinced of it for a while: artificial intelligence will inevitably affect the professions, processes, training and daily lives of the students, entrepreneurs and staff in our ecosystem.

Here is a look back at the first Iapéro at Plaine Images, a monthly event designed by and for professionals in the cultural and creative industries.

For this first edition, we welcomed Julien Frisch, a former incubatee, AI consultant and one of the BPI France referents for the IA Booster France 2030 programme, and Rémi Auguste, who holds a PhD in computer science and founded Weaverize, a company based at Plaine Images for 7 years.

Generative artificial intelligence offers growing opportunities to cut costs in the retail sector, a major producer of commercial visuals.

Historically, retail companies had to organise photo shoots that involved transporting thousands of products to locations rented specially for the occasion, staging them, managing the lighting, taking the photos, doing the post-production work, and so on. Rémi Auguste walked us through the AI tools he uses to rework this substantial production workflow:

In summary, products are now photographed in a conventional studio and virtually integrated into different contexts using AI solutions. But which tools suit each step? A short review is in order, with a preference for open source software:

Background removal with AI

Background removal can be done almost instantly with free online services (a simple search for “remove background” is enough), but only some tools, such as Removebg, reach the expected professional standards.

Segment Anything Model (SAM), a new AI model developed by Meta's R&D teams, can cut out any object in any image in one click, thanks to semantic discrimination. On the open source side, YOLO is a good alternative.
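
Once a tool has produced a mask for the product, placing it in a new context comes down to blending two images. The sketch below is a toy illustration in plain Python, not the workflow presented at the event: the `composite` function, the tiny hand-written image and the mask values are all hypothetical, and a real pipeline would rely on an imaging library and on a mask produced by a segmentation model.

```python
# Toy illustration: composite a cut-out product onto a new background.
# Images are nested lists of (R, G, B) tuples. The mask holds values in
# [0.0, 1.0], where 1.0 means "this pixel belongs to the product".

def composite(product, mask, background):
    """Blend product over background, pixel by pixel, using the mask as alpha."""
    result = []
    for prod_row, mask_row, bg_row in zip(product, mask, background):
        row = []
        for prod_px, alpha, bg_px in zip(prod_row, mask_row, bg_row):
            row.append(tuple(
                round(alpha * p + (1.0 - alpha) * b)
                for p, b in zip(prod_px, bg_px)
            ))
        result.append(row)
    return result


# Hypothetical 1x3 image: a product pixel, a soft edge, and studio backdrop.
product = [[(255, 0, 0), (255, 0, 0), (255, 255, 255)]]
mask = [[1.0, 0.5, 0.0]]
background = [[(0, 0, 255), (0, 0, 255), (0, 0, 255)]]

scene = composite(product, mask, background)
```

Where the mask is 1.0 the product pixel is kept, where it is 0.0 the new background shows through, and intermediate values give a soft edge.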

Scenes produced by AI

To create visuals into which the photographed product fits, use Stable Diffusion or Flux. The commercial alternatives are Midjourney and Dall-E, along with Stable Diffusion WebUI Forge. In practice, each tool has its own particularities in terms of control and output rendering: Flux is a first-rate choice if you want fine control over the specific features of your generation.

These tools also handle upscaling the image (by adding pixels), which is essential for a professional result or for certain uses, such as print.
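
To get a feel for why upscaling matters for print, here is a rough calculation in Python. It assumes a print density of 300 dots per inch, which is a common convention rather than a figure from the talk, and the function names and poster size are made up for the example.

```python
import math


def pixels_needed(width_cm, height_cm, dpi=300):
    """Pixel dimensions required to print at the given size and density."""
    cm_per_inch = 2.54
    return (
        math.ceil(width_cm / cm_per_inch * dpi),
        math.ceil(height_cm / cm_per_inch * dpi),
    )


def upscale_factor(src_w, src_h, target_w, target_h):
    """Smallest integer factor that makes the source cover the target."""
    return max(1, math.ceil(max(target_w / src_w, target_h / src_h)))


# Hypothetical case: a 1024x1024 generation printed as a 30 cm x 30 cm poster.
target = pixels_needed(30, 30)
factor = upscale_factor(1024, 1024, *target)
```

Under these assumptions, the generated image falls well short of the pixel count needed for the poster, which is the gap an upscaler is meant to close.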

Videos with AI

Many solutions exist, but here is a tested and approved selection:

  • Runway, for cinematic renderings
  • CogVideo, which is open source
  • Kling, which lets you animate static images
  • Synthesia, which generates avatars from a voice-over


Knowing how to use LoRAs

In addition to this first stack of tools, Rémi Auguste offered an aside on a way of going further with generative models: LoRA.

LoRA stands for low-rank adaptation and refers to a method for creating lightweight sub-models that are grafted onto existing AI models, such as Stable Diffusion. The benefit? Instead of training a complete model, with the computing requirements that go with it, LoRAs let you layer new styles on top, with only 10 to 20 MB of additional parameters. Training can be done with as few as 10 images, and gives you control over a small collection of objects.
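
The "low-rank" idea can be illustrated with a little arithmetic. Assuming a weight matrix W of size d_out x d_in, a LoRA leaves W untouched and learns two thin matrices, B (d_out x r) and A (r x d_in), whose product is added to W. The sketch below, in plain Python with hypothetical dimensions and helper names, compares the parameter counts and applies such an update to a tiny matrix. It illustrates the principle only and is not the code of any particular training tool.

```python
def full_params(d_out, d_in):
    """Number of weights in a full d_out x d_in matrix."""
    return d_out * d_in


def lora_params(d_out, d_in, rank):
    """Number of weights in B (d_out x rank) plus A (rank x d_in)."""
    return rank * (d_out + d_in)


def matmul(a, b):
    """Multiply two matrices stored as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def apply_lora(W, B, A, scale=1.0):
    """Return W + scale * (B @ A) without modifying W."""
    delta = matmul(B, A)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]


# Hypothetical layer size and rank, chosen only for the comparison.
full = full_params(768, 768)
light = lora_params(768, 768, rank=8)

# Tiny worked example: a 2x2 base matrix with a rank-1 update.
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [2.0]]
A = [[0.5, 0.0]]
adapted = apply_lora(W, B, A)
```

With these illustrative numbers, the adapter holds a small fraction of the weights of the layer it modifies, which is why a LoRA file stays light compared with a full model.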

This approach to fine-tuning is valuable for those who want a well-defined, recognisable style (think of a Lego look, for example!). Libraries such as Hugging Face (among others) offer a selection of ready-made LoRAs.


Generative AI for sound processing

On the speech synthesis side, progress has been exponential in recent years. For a long time, tools took a concatenative approach, that is, phonemes were lined up by modelling syllables on sounds, which was effective but very unnatural. Generative AI now delivers a much more convincing result. ElevenLabs is one of the flagship tools on the market and includes a multitude of features to give voice to your projects.
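
The unnatural sound of the older approach is easier to picture with a toy example. The sketch below, in plain Python, joins stored sound units end to end, which is the basic idea of concatenative synthesis. The unit inventory and its sample values are invented for the illustration and do not come from any real system.

```python
# Hypothetical unit inventory: each unit maps to a few audio sample values.
UNITS = {
    "bon": [0.0, 0.25, 0.5],
    "jour": [-0.75, -0.25, 0.0],
}


def concatenative_synthesis(units, inventory):
    """Join the stored samples of each unit end to end."""
    samples = []
    for unit in units:
        samples.extend(inventory[unit])
    return samples


def largest_jump_at_joins(units, inventory):
    """Size of the biggest discontinuity where two units meet."""
    jumps = [
        abs(inventory[right][0] - inventory[left][-1])
        for left, right in zip(units, units[1:])
    ]
    return max(jumps, default=0.0)


sequence = ["bon", "jour"]
audio = concatenative_synthesis(sequence, UNITS)
jump = largest_jump_at_joins(sequence, UNITS)
```

The abrupt jump where two units meet is one reason output built this way can sound artificial.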

Transform text into sound with AI

ElevenLabs lets you turn a stream of text into high-quality audio, which is known as text-to-speech.

The tool supports many voice-related uses and can be trained on your own voice (this is called cloning) so that the result is as natural as possible. It lets you handle:

  • “Classic” speech synthesis, with the option of translating directly into another language
  • Voice-over and dubbing
  • The rapid creation of audiobooks

Cloning a voice takes 1.5 to 2 hours, because the model has to be trained on an audio set to learn the voice. After that, nothing could be simpler: the tool generates the desired output in a few seconds.

An example of a possible configuration in ElevenLabs

Lip synchronization with AI

VideoReTalking is an open source model that edits the faces in a video so that the lips move in sync with a new audio track.

Animating images from video with AI

Several open source tools speed up part of the animation work:

  • AniPortrait animates the features of a static image from an audio or video source
  • Efficient-Live-Portrait animates the features of an image by copying those of a video
  • LivePortrait specializes in animating paintings

As you can see, there is a profusion of tools, and each should be chosen according to the intended use. Helping with that choice is one of the aims of the Plaine Images Iapéros, but not the only one!

In future sessions, we will focus each time on a specific use, problem, success story or tool demo, so that this monthly meeting becomes a powerful driver of transformation for audiovisual professionals!

Do not miss the next Iapéro!

The next meeting is coming soon.

Our events are announced every month in our newsletter and every week on our social networks. 
