I have been trying to connect Stable Diffusion's API to a speech-to-text API via Python for several days now. The idea is to transcribe spoken text and turn it into a prompt that can then be rendered via image generation. However, I cannot get my code to work; do you have any tips on how to link these two APIs?
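For context, here is roughly the shape of the glue code I have in mind, assuming the AUTOMATIC1111 web UI launched with `--api` (its `/sdapi/v1/txt2img` endpoint is my assumption; the host, port, and transcript source would need adjusting for your setup). The transcript string stands in for whatever your speech-to-text engine returns:

```python
import base64
import json
import urllib.request

# Assumed local endpoint: the AUTOMATIC1111 web UI exposes a REST API at
# /sdapi/v1/txt2img when started with the --api flag. Adjust for your setup.
SD_URL = "http://127.0.0.1:7860/sdapi/v1/txt2img"


def build_payload(prompt: str, steps: int = 20) -> dict:
    """Minimal txt2img request body; the server fills in defaults for the rest."""
    return {"prompt": prompt, "steps": steps}


def generate_image(prompt: str, url: str = SD_URL) -> bytes:
    """POST the transcribed text as a prompt and decode the first result."""
    req = urllib.request.Request(
        url,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        result = json.load(resp)
    # The API returns base64-encoded PNGs in the "images" list.
    return base64.b64decode(result["images"][0])


if __name__ == "__main__":
    # Placeholder: replace with the output of your speech-to-text engine.
    transcript = "a watercolor painting of a lighthouse at dawn"
    with open("out.png", "wb") as f:
        f.write(generate_image(transcript))
```

The key point is that the two systems only meet at a plain string: whatever text the speech-to-text side produces becomes the `prompt` field of the JSON body, so the two APIs never need to know about each other.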
Ideally, I would also want another filter between the speech-to-text and text-to-image stages that only passes useful design terminology into the system. If you happen to know more about this, please let me know as well.
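The simplest version of that middle filter I can picture is an allow-list: keep only the words from the transcript that appear in a hand-made vocabulary of design and style terms, so filler speech never reaches the prompt. The vocabulary below is purely illustrative:

```python
# Hypothetical allow-list of design/style terms; extend to taste.
DESIGN_TERMS = {
    "watercolor", "minimalist", "portrait", "landscape", "isometric",
    "pastel", "photorealistic", "sketch", "vintage", "neon",
}


def filter_prompt(transcript: str, vocabulary: set = DESIGN_TERMS) -> str:
    """Keep only whitespace-separated words found in the vocabulary."""
    kept = []
    for word in transcript.lower().split():
        word = word.strip(".,!?")  # drop trailing punctuation from speech output
        if word in vocabulary:
            kept.append(word)
    return " ".join(kept)
```

In practice you would probably want a fallback to the raw transcript when the filter strips everything, and a larger vocabulary (or an embedding-based similarity check) instead of exact word matching, but the pass-through-a-whitelist idea is the same.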