IM8/SNDGO guidelines on the use of Cloud Services for Gen AI applications

Note

The IM8/SNDGO guidelines shown here are those we have assessed to be relevant for Gen AI app development. Your requirements might differ, and other guidelines not specified here might apply. Furthermore, these guidelines are current only as of the time of publication. You are advised to consult the latest version of the IM8/SNDGO guidelines on the intranet to verify that your Gen AI app is compliant.

According to the circular by SNDGO¹, large language model-based applications (such as RAG applications) must only ingest data that will not cause damage to an agency if compromised (i.e. up to OFFICIAL (CLOSED) / NON-SENSITIVE). If the RAG application requires a knowledge base beyond OFFICIAL (CLOSED) / NON-SENSITIVE, approval has to be sought from the AI Policy Group (SNDGG). Approval will take into account whether data in transit for the system is encrypted end-to-end, and whether any data will be logged on overseas servers. In addition, measures must be in place to ensure that failures or disruptions to the RAG system do not cause failures or disruptions to NE-critical, high criticality (CII), or high significance (SII) systems. If connectivity to Government Enterprise Networks (GEN) is required, the system should abide by existing IM8 cybersecurity policies on GEN connectivity.

The system should also include the following categories of risk-mitigating measures to address the accuracy and accountability risks inherent in the use of large language models in government applications:

Education: The system should include visual UX cues to educate users on proper use. This includes informing users that the agency (owning the RAG system) is piloting the use of large language models, that the results are AI-generated, and that users should always double-check and adapt generated output for use.
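As a minimal illustration of such UX cues, every response shown to the user can carry a standing notice covering the three points above. The notice wording and the `present_response` helper below are assumptions for illustration, not prescribed text:

```python
# Standing notice covering the three required disclosures: pilot status,
# AI-generated content, and the need to verify output.
# (Example wording only; agencies should adapt it to their context.)
PILOT_NOTICE = (
    "This assistant is a pilot that uses a large language model. "
    "Responses are AI-generated: always double-check and adapt them before use."
)

def present_response(answer: str) -> str:
    """Attach the standing notice to every response surfaced to the user."""
    return f"{PILOT_NOTICE}\n\n{answer}"
```

In a real UI the notice would more likely be a persistent banner or tooltip rather than prepended text, but the principle is the same: the disclosure travels with every AI-generated answer.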

Minimise hallucination: Steps should be taken to minimise hallucinations in the RAG system. Measures include limiting output to information drawn from a reliable knowledge base, enabling the user to verify the output with relevant citations, and conducting accuracy / robustness tests for a range of commonly asked questions. The model temperature of the large language model used internally should be set to 0, as this also reduces the likelihood of hallucinations.
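A minimal sketch of grounding and citation, assuming a naive keyword-overlap retriever and a hypothetical `complete` call standing in for whichever LLM provider is used (real systems would use embedding-based retrieval, but the shape is the same):

```python
def retrieve(query: str, knowledge_base: list[dict], k: int = 2) -> list[dict]:
    """Naive keyword-overlap retrieval over entries shaped like {'id': ..., 'text': ...}."""
    q_terms = set(query.lower().split())
    scored = [
        (len(q_terms & set(doc["text"].lower().split())), doc)
        for doc in knowledge_base
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]

def build_grounded_prompt(query: str, passages: list[dict]) -> str:
    """Restrict the model to retrieved passages and ask it to cite passage ids."""
    context = "\n".join(f"[{p['id']}] {p['text']}" for p in passages)
    return (
        "Answer ONLY using the passages below. Cite passage ids in square brackets. "
        "If the answer is not in the passages, say you do not know.\n\n"
        f"Passages:\n{context}\n\nQuestion: {query}\nAnswer:"
    )

# The completion call depends on the provider; `complete` is a hypothetical
# stand-in. The key setting is temperature=0 for deterministic output.
# answer = complete(build_grounded_prompt(query, hits), temperature=0)
```

The citation ids let the user trace each claim back to the knowledge base, which is the verification step the guideline asks for.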

Adversarial testing: Tests should be conducted to ensure that the RAG system performs satisfactorily against queries deliberately designed to elicit inappropriate responses. There should be a comprehensive plan to systematically test and measure the system’s performance (read the section on evaluation), ideally carried out as part of a red-teaming regime. The RAG system should also consider rate limiting queries to deter users from “brute force” attempts to defeat the system.


  1. Refer to PMO (SNDGO) CIRCULAR NO. 1/2023, titled USE OF LARGE LANGUAGE MODELS IN THE PUBLIC SECTOR, dated 9 May 2023.