How to format record data to send it to OpenAi

Let’s say that I’ve got an app that collects timesheets from my employees, and I wish to create a summary of these timesheets for each week using Ai. To accomplish this task, I have to somehow get my timesheet data to the Ai… but how do I structure things so it makes sense? So that the Ai understands that a dateTime goes with a user, etc. etc.

  • How do we keep everything straight?
    • That’s what we’re going to cover in this article.
Example Scenario Details

Tables

  • Timesheets
    • ref connected to Employees
    • collects individual timeclock records each day an employee works
    • ref connected to Weeks
  • Weeks
    • New record created each week
    • Serves as the parent level for a weekly summary system
      • Timesheets serving as the child level

Each day an employee creates a new Timesheet record, recording their clock in and out times - with the total and other data points as well. These are automatically ref connected to the appropriate Weeks record - giving us a [Related Timesheets] on the Weeks table.

We’ve got a bot that runs at the end of the week, creating a summary of the details from all the Timesheet records for that week.

  • Part of this is our desired output - an Ai generated summary

General Overview

In order to accomplish this, there are a few layers that need to be created:

  1. On the individual timesheet level:
    • The specific data you wish to show for that record must be formatted appropriately to work in the markdown table
  2. On the Weeks (or other parent/aggregate) level:
    • All the individual timesheet data must be combined together along with the rest of the table code, creating the formatted text that is our markdown table
  3. On the script execution level:
    • The system and user messages must be created
    • The markdown table must be included, making sure to encode the data, in case of special characters

Sending Data With Tables

By and large the best way to send table-structured data to an Ai… is through a table. (^_^) Thankfully these Ai system understand markdown, so we can use this as a means to format the data for the Ai to ingest easier.

Markdown Table Code Example
image

Things to notice:

  • Columns are encapsulated by vertical bars
  • The header and data rows are separated by a divider

Note: No limit to the number of columns or rows visible, and column width will grow to fit contents.


Row Data Preparation

From the example above, you can see that for each row we need to encapsulate each piece of “column” data with vertical bars " | " - one for each column in the table we’re creating.

  • This means that we need to create this string for each row that we wish to include in the table.

    • Which means that if we want to show record data in the resulting markdown table, we need to take that record data and prepare it in the format required for the markdown table row.
  • This means that we need to create a CONCATENATE() formula that combines everything together for that row:

Concatenate(
  "| ", 
  [Timesheet_Employee_Link].[Employee_Name], 
  " | ", 
  TEXT([Timesheet_Date], "MM/DD/YY"), 
  " | ", 
  TEXT([Timesheet_Start_Time], "HH:MM am/pm"), 
  " | ", 
  [Timesheet_Total_Hours],
  " | ", 
  [Timesheet_Notes], 
  " |"
)

Notice the starting vertical bar DOES NOT have a space before it, and neither does the closing bar have a space after it… but the middle bars have spaces on both sides.

  • This formula would live on the Timesheets table, inside a LongText column type (which could be physical or virtual, physical preferred).
    • For our example, let’s call this column [Timesheet_Row_Code]

This creates the individual entries required to build the rows for the table; now to build the entire table we need to combine all of these together, along with the header row.


Table Creation Concatenate() Formula

To combine all the rows together with the header, you’ll need a list of all the [Timesheet_Row_Code] data from each of the records involved.

  • You could get this through a [Related Whatever] list dereference
  • You might get this from a slice
  • You could even get it from a select() formula - but avoid this, as it has performance impacts.

It doesn’t necessarily matter how you create the list of [Timesheet_Row_Code] values, what matters is that we’ve got a LIST of the [Timesheet_Row_Code] values from the records that matter.

Pulling it all together

In order to assemble the text together and create the complete markdown table, we need to add the header info with the row data. I find it’s best to build this inside a new VC on your parent/aggregate table; let’s call ours [Week_Markdown_Table].

The following is an example of what that might look like if you were using a list dereference and a [Related Timesheets] VC on the Weeks table:

Concatenate(
"| Employee | Date | Time | Hours | Notes | 
| --- | --- | --- | --- | --- | 
", 
Substitute(Concatenate([Related Timesheets][Timesheet_Row_Code]), " , ", "
")
  • The formula starts with a string, establishing the column headers followed by the divider line
  • Next we bring in the row data
    • We start with the list dereference [Related Timesheets][Timesheet_Row_Code] that creates a comma separated list of all the values from the [Timesheet_Row_Code] column from the related records.
Example of what that looks like
| Employee1 | 1/1/24 | 12:30 pm | 8.5 |   | , | Employee2 | 1/1/24 | 12:30 pm | 7.5 |   | , | Employee3 | 1/1/24 | 12:30 pm | 2.3 |   | , etc...

Notice that it’s all one line, with a “space comma space” separating the entries

  • We need to take this list, and replace the separator with a line break (this way each row is on it’s own line
    • We accomplish this through the use of Substitute(Concatenate(...
More Details

Each element in the list consists of a string, which contains everything necessary for that particular row in the markdown table.

  • The formula Concatenate() takes this list and converts the whole thing into a string - which gives us a standard list separator of “space comma space” between each item.
  • From here we can then use Substitute() to replace this standard separator with a line break - thus getting each line element on it’s own line.

The result is a dynamically created markdown table containing your record data. :muscle:

You can then take this (which might live in a virtual column on some table in your app), and the pass this value to your Ai when asking questions.


Presenting the table to Ai

Now that you’ve got your record data formatted into a markdown table, next we need to take this and present it to the Ai in some way that makes sense to it. Since we’re dealing with OpenAi, we’ve got two objects that we need to make in order to get the data to the Ai:

1. System Message

We can use the system message as the place to insert our instructions for what we want the Ai to do. Literally just a small sentence or two that describes the situation, and what we want to happen.

For example:

{“role”: “system”, “content”: “The following is a collection timesheets from our business for a week; your task is to read through everything and create a summary of what happened this week. Include a list of employees with hours that week, a count of how many there were, and some additional KPI stats you feel relevant to include.”}

  • This is incredibly weak, fyi; you’d want to get specific here about what you really want out of your summaries.
  • It’s also better to show an Ai what you want (structure wise), rather than describe it; so if you can include an example of the output you might want to see - the Ai will run with that, populating it with the details from the Timesheet records.

2. User Message

I typically find it better to place the actual data inside a user message (vs. the system message), as it helps isolate the instructions more clearly.


Combining them together

To create the messages object for OpenAi, you’ll combine these together (drawing in the markdown table columnn [Week_Markdown_Table]) to put together the complete package we need to send off to OpenAi.

  • This includes the JSON wrappers for the system and user messages
  • This includes a way to “encode” the text entered, so it doesn’t cause errors
    • This is what the nested Substitute( formulas are doing.

Here is an example of what that might look like:

Concatenate(‘{“role”: “system”, “content”: "’,
Substitute(Substitute(Substitute(Substitute(

“The following is a collection timesheets from our business for a week; your task is to read through everything and create a summary of what happened this week. Include a list of employees with hours that week, a count of how many there were, and some additional KPI stats you feel relevant to include.”

, "", “\”), "
“, “\n”), '”‘, ‘"’), " ", “\t”),
"} , {“role”: “user”, “content”: "’,
Substitute(Substitute(Substitute(Substitute(

Concatenate("The following is the timesheet data for week ", [Week_Number], ":

", [Week_Markdown_Table]
)

, “", “\”), "
“,”\n”), ‘"’, ‘"’), " “, “\t”),
'”}’
)

Using this formula will create the Message array object (in a string format, which can then be parsed through JSON.parse() ), that’s required by OpenAi. It will contain all the information from your records, as well as whatever else you provide, and allow the Ai to create a summary of the data in an intelligent way.


Conclusion

After many months of testing I’ve found this to be the best strategy for structuring your inputs for Ai. Previously I tried inserting the data into the system message, but sometimes things just weren’t quite what I was expecting - switching to inserting the data into the user section seems to have really helped make the Ai processing more robust.

This is just one use-case for presenting record-level data to an Ai like this. You can use the strategies described here as a means to present records of data to your Ai, no matter the scenario.

1 Like