Bring AI Image Generation to Your Microsoft 365 Copilot Declarative Agents


Introduction

The recent updates from Ignite 2024 are coming through into Microsoft 365 Copilot experiences and I was really excited by the Copilot Studio experiences. Then I started to look at what was new in the Pro Code world with Visual Studio Code, Teams Toolkit and the new announcements.
There was a post by Abram Jackson which hinted at what had just been launched, with more hints in the Microsoft 365 Developer Podcast (https://www.m365devpodcast.com/e/why-build-declarative-agents-with-visual-studio-code-vs-copilot-studio/) and in the Ignite session, Developers guide to building your own agents (https://ignite.microsoft.com/en-US/sessions/BRK167), with Jeremy Thake, Sébastien Levert, Ayca Bas and Matthew Barbour.

So, I started digging into it a bit more.

Recently, I built a Copilot Agent to help with a problem that I have every year when the Elf on the Shelf comes out. If you are not au fait with the Elf on the Shelf, it’s a little elf, and you create a scene with it each night for your kids to come down to in the morning. The scene should be funny and slightly mischievous. This leads up to Christmas Eve, when the Elf goes back with Santa.
Anyway, for the first few days I have some reasonable ideas, but as the nights go on I run out of them. What’s more, I often only remember about the Elf when I get into bed or, worse, early in the morning. This ends with some stressful times and subpar ideas.

My Copilot Agent helps with all that and now gives me some cracking ideas for the Elf and really takes the weight off.
Whether this hits the mark as what Abram would class as a clever use of image generation is a bit debatable, but I think it is pretty cool and hope you do too.

On to the learning!


Image Generation and Code Interpreter Agent Capability

In this post, I wanted to share my investigations and findings. The Microsoft 365 Copilot Extensibility Team have been working away on making more capabilities available to Agents. There are two main ways of building Copilot Agents: Copilot Studio or the Teams Toolkit. You can also bring your own model with the Teams AI library, but that is for another day. Both sets of tools create agents in the same way, but the Teams Toolkit leads the way slightly, with access to the latest capabilities first. I am sure Copilot Studio will be catching up soon.


These tools create a JSON file which describes the Copilot Agent. This file, declarativeAgent.json, has a structure like the following:
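As a rough textual sketch of that structure, the Python below builds an equivalent dictionary and prints it as JSON. The property names follow my reading of the v1.2 Declarative Agent Manifest Schema linked later in this post, and the name, description and instructions values are illustrative placeholders rather than the exact contents of the Elf agent's file, so do check it against the schema page.

```python
import json

# Illustrative manifest, assuming the v1.2 declarative agent schema.
# All values are placeholders; only the capability names come from this post.
manifest = {
    "version": "v1.2",
    "name": "Elf Ideation Agent",
    "description": "Suggests Elf on the Shelf ideas with generated images.",
    "instructions": "You help parents come up with funny ideas for the Elf on the Shelf.",
    "capabilities": [
        {"name": "WebSearch"},   # lets the agent use the web
        {"name": "GraphicArt"},  # lets the agent generate images
    ],
}

# Serialise to the JSON text that would live in the manifest file.
manifest_json = json.dumps(manifest, indent=2)
print(manifest_json)
```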

There are some key parts to the JSON file, including the name, but I want to draw your attention to the capabilities section. Here we define which resources the Agent is able to access. There are quite a few capabilities, which I will discuss shortly, but for this agent we have given it access to the Web and the ability to generate images with the GraphicArt capability.
Additional capabilities include accessing knowledge via content in Microsoft SharePoint and OneDrive libraries, and the CodeInterpreter capability, which allows your users to create diagrams and generate code. Finally, there is the capability to hook into APIs and really set your Agent free. This last capability is probably the most interesting and an area of interest for me.
Actually, you can keep up to date with what is being launched by tracking the Declarative Agent Manifest Schema page.

https://learn.microsoft.com/en-us/microsoft-365-copilot/extensibility/declarative-agent-manifest-1.2

Building the Elf Ideation Agent

Let’s go through the process of building the Copilot Agent with the Teams Toolkit.
I will assume that you already have Visual Studio Code installed.


First, you will need to install the Microsoft Teams Toolkit Visual Studio Code extension, and then you’ll need the Kiota Visual Studio Code extension.
See https://learn.microsoft.com/en-us/microsoftteams/platform/toolkit/install-teams-toolkit?tabs=vscode

Rather than reinvent the wheel, follow the instructions in the following Microsoft Learn article:
https://learn.microsoft.com/en-us/microsoft-365-copilot/extensibility/build-declarative-agents?tabs=ttk&tutorial-step=1
Make sure you don’t miss the step to provision the Agent.

  1. In the new Visual Studio Code window that opens, select Teams Toolkit, then select Provision in the Lifecycle pane.

Now that we have the agent created, let’s set it up.
As mentioned previously, the core of the declarative agent is configured with the following files:

  • /appPackage/declarativeAgent.json
  • /appPackage/instruction.txt
  • /appPackage/manifest.json

To tell the agent how to behave, open instruction.txt and change it to describe how you would like the agent to behave. In our Elf Ideation Agent, we have the following instructions:


You are a virtual assistant that helps parents come up with funny and clever ideas for their kids for the Elf on the Shelf.
When providing ideas, create three ideas.
For each idea use the GraphicArt capability to create an AI generated image to go along with the idea.
  • Help, it’s 10pm and I need a funny and slightly naughty idea for elf on the shelf tonight.
  • Provide parents with creative and humorous ideas for their Elf on the Shelf activities leading up to Christmas.
  • Ensure the ideas are funny and slightly naughty but appropriate for children.
  • Offer a variety of scenarios and setups that can entertain kids.
  • Respond in a friendly and engaging tone.
  • Avoid any ideas that could be harmful or inappropriate.
  • Be mindful of different age groups and sensitivities.
Please provide each idea as a clear paragraph with instructions on how to set up the scene with the elf.
As shown in the screenshot below.

Save your instruction.txt.
You’ll notice that we have mentioned the GraphicArt capability to create an AI-generated image to go along with each idea. This will create an image alongside the text of the idea.

Next, we need to update the declarativeAgent.json and add our capabilities. If you followed the instructions your declarativeAgent.json will be missing the capabilities node.

We have added the WebSearch and GraphicArt capabilities. This allows our Agent to use the web for inspiration when coming up with ideas for our Elf, and gives it access to the Designer capability to create images from text descriptions!
We have also added conversation starters which will appear within our agent when we access it from BizChat as shown below.
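The change to the manifest can be sketched in Python as below. This is only an illustration of the edit, not a Teams Toolkit feature: `add_agent_features` is a hypothetical helper, the scaffolded manifest is a cut-down placeholder, and the `capabilities` and `conversation_starters` property names are my reading of the v1.2 schema. In practice you would simply edit the JSON file by hand.

```python
import copy

def add_agent_features(manifest, capability_names, starters):
    """Return a copy of the manifest with capabilities and starters added."""
    updated = copy.deepcopy(manifest)  # leave the input untouched
    capabilities = updated.setdefault("capabilities", [])
    existing = {cap.get("name") for cap in capabilities}
    for name in capability_names:
        if name not in existing:  # skip capabilities that are already listed
            capabilities.append({"name": name})
            existing.add(name)
    # Replace any existing starters with the supplied (title, text) pairs.
    updated["conversation_starters"] = [
        {"title": title, "text": text} for title, text in starters
    ]
    return updated

# Hypothetical scaffolded manifest with no capabilities node yet.
scaffolded = {"version": "v1.2", "name": "Elf Ideation Agent"}
updated = add_agent_features(
    scaffolded,
    ["WebSearch", "GraphicArt"],
    [("Tonight's idea", "I need a funny idea for the elf tonight.")],
)
print([cap["name"] for cap in updated["capabilities"]])
```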

If you would like to give your Microsoft Teams app an icon then replace the /appPackage/color.png file with a 192px by 192px icon. Also, update the manifest.json with an improved name.
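If you want to double-check the icon size before provisioning, a small script can read the dimensions straight from the PNG header. This is a minimal sketch using only the Python standard library; the helper names are my own and it only checks the size, nothing else about the image.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def png_size(data: bytes) -> tuple[int, int]:
    """Read (width, height) from the IHDR chunk at the start of a PNG."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # Width and height are two big-endian 32-bit integers after the chunk type.
    width, height = struct.unpack(">II", data[16:24])
    return width, height

def is_expected_icon_size(data: bytes, expected: int = 192) -> bool:
    """True when the PNG is expected x expected pixels."""
    return png_size(data) == (expected, expected)

# Usage, with the path from this post:
# with open("appPackage/color.png", "rb") as icon_file:
#     print(is_expected_icon_size(icon_file.read()))
```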

Try it out!


We are now ready to try out our Agent.
Make sure you click on the Provision link under Lifecycle. Then click the play icon next to Dev (provisioned) and choose Teams and the manifest.json file to fire up your Agent and try it out.


Making Changes to Your Agent


One of the things that I was wondering was how easy it would be to make fundamental changes to your Agent.
Well, the Microsoft Copilot Extensibility and Teams Toolkit teams have made it easy. If you want to add some conversation starters or make other changes, make your edits and then click on the Provision button under the Lifecycle section.

Demo video

Please find the agent in action below.

Source code

The source code can be found on GitHub.

https://github.com/SimonDoy/teams-ai-library-samples/tree/main/elf-ideation-agent

Conclusion


In this blog post, I have shown you how you can bring graphics into your Microsoft 365 Copilot Agents through Declarative Agents and Teams Toolkit.

Happy coding!

An illustration of an airplane being flown from the position of the cockpit.

Gotchas discovered building a Custom Engine Copilot with GPT-4o and Copilot Studio


Introduction

This article highlights some gotchas that I have hit when building a Custom Engine Copilot with Copilot Studio and GPT-4o. The aim is to help you solve these problems if you have similar issues.

So, firstly what are we talking about when we talk about Custom Engine Copilots?

Well, Copilot Studio can be configured to use an externally hosted AI model, for example GPT-4o hosted in Azure AI Services. This allows us to use a more powerful or more suitable language model, such as GPT-4o, instead of the out-of-the-box LLM that Microsoft currently provide.

The benefits are better reasoning with better results. Our experience with our customers has shown some great results when using GPT-4o.

A Custom Engine Copilot is used through the Generative Answers capability within Copilot Studio.

However, there are some gotchas when using these more complex models and I wanted to document them here to save you working out what the issue is.

Gotcha 1: Generative Answers returns no knowledge found

So, we have seen that if something goes wrong when you are using Azure OpenAI then you get a “no knowledge found” response.

You can try this out using the Test Your Copilot feature in Copilot Studio.

I will be honest, it took a while to find out what the issue was, but by using Azure OpenAI Studio (https://oai.azure.com/) you can test the model to make sure it is working with your data.

We kept getting issues with Generative Answers saying there was no knowledge found. In the end, it turned out to be due to a missing trailing slash on the Azure AI Search endpoint.

So check your Azure OpenAI connection settings and make sure that you have a trailing slash on the Azure AI Search / Cognitive Search endpoint URL.

i.e. https://azureaisearch.search.windows.net/

and not https://azureaisearch.search.windows.net
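If you are generating or validating the connection settings in a script, the same rule is easy to apply in code. A minimal sketch, where `with_trailing_slash` is a hypothetical helper of my own and not part of any Microsoft SDK:

```python
def with_trailing_slash(endpoint: str) -> str:
    """Return the endpoint URL with exactly one trailing slash."""
    return endpoint.strip().rstrip("/") + "/"

print(with_trailing_slash("https://azureaisearch.search.windows.net"))
# prints https://azureaisearch.search.windows.net/
```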

We have also seen this issue when the model is being throttled: the result is the same “no information was found that could help answer this” message.

When you try the same prompt from Azure OpenAI Studio, you get the error message “Server responded with status 429, the rate limit is exceeded”.

Make sure you have increased the rate limit to cover the number of tokens that need to be processed.

You can do this in Azure OpenAI Studio by going to Deployments, choosing your model, editing the model settings and increasing the Tokens Per Minute Rate Limit. For testing we set this to 100K, but for Production you are likely to need to increase it further.
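To get a feel for what limit you might need, a back-of-the-envelope estimate is peak requests per minute multiplied by tokens per request, plus some headroom. The sketch below is only that rough arithmetic: the function name and all the workload numbers are hypothetical, and the way Azure actually counts tokens against the limit may differ.

```python
import math

def estimate_tokens_per_minute(requests_per_minute, prompt_tokens,
                               completion_tokens, headroom_percent=20):
    """Estimate TPM as peak requests x tokens per request, plus headroom."""
    tokens_per_request = prompt_tokens + completion_tokens
    base = requests_per_minute * tokens_per_request
    return math.ceil(base * (100 + headroom_percent) / 100)

# Hypothetical workload: 20 requests a minute, each with 3,500 prompt tokens
# (the question plus retrieved documents) and 500 completion tokens.
needed = estimate_tokens_per_minute(20, 3500, 500)
print(needed)  # prints 96000
```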

Gotcha 2: Generative Answers returns answers but they are not that great.

This issue is subtle and is unfortunately hidden by Generative Answers. Using Azure OpenAI Studio, we got really good, detailed responses back. However, when we tried the same prompt in Copilot Studio, we got very simple responses which were nowhere near as good.

The issue turned out to be related to Gotcha 1: we were getting no results back from the Azure OpenAI model, and we had the “Allow the AI to use its own general knowledge” option switched on in the Generative Answers action. So Generative Answers would fall back to the knowledge that it has in its own model.

So we would get a response like this one

Which is not bad but not as good as the GPT-4o version which is shown below.

So the fix is to switch off the “Allow the AI to use its own general knowledge” option.

Gotcha 3: Generative Answers sometimes returns great answers and sometimes errors out.

So this issue seems to occur with GPT-4o models but not GPT-4 based models and I suspect that this is down to the amount of detail in the answers coming from the model.

When using Generative Answers and Copilot Studio you can return the information back to the user in two ways:

  • Ask Generative Answers to send a message to the user.
  • Take the response and assign it to a variable.

These options can be found in the Advanced section of the action.

If you ask for generative answers to send a message then you sometimes get errors being reported.

Instead do the following:

  • Assign the response from the model into a variable, use Text Only.
  • Check to see if a response is returned and then if it is write out the message using a Send a message activity.

See the following screenshot:

Once you have assigned the LLM response to the variable then add the condition and do the following:

You will find the responses much more reliable.
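The pattern itself is simple, and can be sketched outside Copilot Studio as plain Python. This is not Copilot Studio code: `deliver_answer`, `generate_answer` and `send_message` are hypothetical stand-ins for the Generative Answers action, the variable assignment and the Send a message activity.

```python
from typing import Callable, Optional

def deliver_answer(
    generate_answer: Callable[[str], Optional[str]],
    send_message: Callable[[str], None],
    question: str,
) -> bool:
    """Store the model response in a variable and send it only if present."""
    answer = generate_answer(question)  # assign the response to a variable
    if answer and answer.strip():       # check that a response was returned
        send_message(answer)            # write it out with a separate send step
        return True
    return False                        # nothing came back, so send nothing

# Usage with stand-in functions.
sent = []
ok = deliver_answer(lambda q: "Here is a detailed answer.", sent.append, "A question")
print(ok, sent)
```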

Conclusion

In this blog post, I have explained some of the issues/gotchas that I have seen when building Custom Engine Copilots using GPT-4o, and provided ways to solve them.

I hope that helps!

If you need a hand then get in touch with us at iThink 365, https://www.ithink365.co.uk.